A Day in the Life of an Undergraduate Statistical Consultant
Seth Huiras, Kate Virkler, and Lizzy Zahn
It is out of the ordinary for undergraduate social science students to receive statistical consulting on academic research projects. Furthermore, it is unusual for the consultants to be undergraduate statistics students. This past academic year, however, both opportunities came together at St. Olaf College.
St. Olaf is a small liberal arts college in Northfield, Minnesota, that prides itself on interdisciplinary studies and service to others. One of the programs offered toward that mission is the Center for Interdisciplinary Research (CIR), funded through a grant from the National Science Foundation. Each year, the CIR fellowship supports 18–24 undergraduate, upper-level statistics students working on various research projects with faculty members across campus. These research collaborations vary in academic discipline and range from mapping the tuberculosis genome to assessing critical thinking skills to analyzing ethics in the computing profession.
Additionally, CIR students hold weekly office hours to help faculty and students with their independent research projects. The students also meet as a group every Monday night to develop research and consulting skills and listen to speakers with statistical experience while enjoying food from local establishments.
In addition to CIR, St. Olaf offers a strong curriculum of about a dozen statistics courses across several departments that can lead to a four-course concentration in the subject. Because statistics is not a major at St. Olaf, all three of us on the statistical consulting team have different areas of expertise: Seth’s major is chemistry, Kate’s is mathematics, and Lizzy’s is psychology.
Starting Our Consulting Project
During the 2008–2009 academic year, our CIR team was guided by statistical mentor and professor Paul Roback as we served as statistical consultants for fellow undergraduates in the Foundations in Social Science Research course within the sociology/anthropology major. With the leadership of their course instructor and our research advisor, Ryan Sheppard, the students conducted a cross-sectional study during the fall semester of 2008, investigating social and intimate relationships on campus through a randomized campus survey. Within the study, sociology students in the research methods course formed seven research groups, and each group chose to investigate specific topics ranging from dating and hook-ups to homophobic attitudes.
Each of us was assigned to work individually with two or three of the research groups, from the beginning stages of forming research questions and hypotheses to the final presentation of the findings at the Midwest Sociological Society Conference in Des Moines, Iowa. All three of us had minimal exposure to sociological studies, so we started our consulting experience by learning some of the basics of sociological research from Sheppard, Roback, and previous studies.
We began with a review of the campus survey from last year’s sociology research methods course, which investigated electronic and World Wide Web social networking. We discussed last year’s survey in search of ways to improve this year’s. Our team came to a consensus with Sheppard and Roback that we would administer the campus survey to a randomly generated list of students over the St. Olaf network, rather than issue paper copies as in previous years. Administering the survey electronically produced a response rate of approximately two-thirds, compared to a rate near one-third last year. The electronic survey also saved us time that would otherwise have been spent entering data by hand. We also learned from last year’s survey that it would be important to advise the research groups to ask questions in ways that yielded outcomes relevant to their primary and secondary research questions. To create relevant survey questions, we found ourselves communicating the fundamental differences between quantitative and categorical data, and we soon learned the importance of ordinal data in sociological studies.
While reviewing the literature on published sociological survey studies, we discovered that many of the responses are ordinal in nature, often collected on a Likert scale. The use of Likert scales to measure levels of agreement or disagreement is a beneficial technique in sociology because it more accurately quantifies responses that are inherently difficult to quantify. Examples of such responses from this project include feelings of intimacy and emotions involving loneliness. From these Likert responses, we often created Likert indexes by summing related Likert responses, resulting in a larger range of response values for the index. For example, one of the research groups was interested in how students’ perceived social distance might be associated with perceived loneliness. To produce results that could be analyzed, we summed seven Likert sub-questions relating to students’ perceived social distance. Thus, we combined seven Likert questions, each ranging from 0–4, into a single index ranging from 0–28 that supported more precise and powerful analyses.
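The index construction described above can be sketched in a few lines of code. This is a minimal illustration, not the course's actual procedure; the respondent's answers shown are hypothetical.

```python
def likert_index(responses):
    """Sum a respondent's Likert answers (each scored 0-4) into one index.

    Seven sub-questions scored 0-4 yield an index ranging from 0 to 28.
    """
    if any(not 0 <= r <= 4 for r in responses):
        raise ValueError("each Likert response must be between 0 and 4")
    return sum(responses)

# One hypothetical respondent's answers to the seven social-distance
# sub-questions:
answers = [3, 2, 4, 1, 0, 2, 3]
score = likert_index(answers)  # a single value between 0 and 28
```

Summing related items this way spreads respondents across a wider scale (0–28 instead of 0–4), which is what makes the index more informative for analysis than any single Likert item.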
Early Challenges: Back to the Basics
One of our more definitive challenges involved learning to use SPSS, the statistical software program favored within the sociology discipline. Each of us was familiar with STATA, Minitab, and R, having used them in our statistics courses; however, we were all new to SPSS. While the program is relatively user-friendly, we struggled to find ways to organize and code the data to create graphical representations during our preliminary and exploratory analysis. While two of us stayed within the SPSS program and learned the intricacies of using it as a statistical software tool, one of us brought the data into R to create representations in a more familiar environment.
Another challenge was learning how to communicate across two disciplines. For instance, it was difficult for us to use terms such as “nonparametric,” “quantitative response,” or “logistic regression” when our peers had little experience with statistical analysis prior to this study. To be effective, we had to think of basic ways to explain particular concepts and terms that made sense to the research groups. This challenge took us back to the basics of our statistical education. We had to remember how our first statistics professors explained the difference between categorical and quantitative data. Additionally, we had to define nonparametric as a “less restrictive” technique for understanding associations between response and predictor variables. We also fell back on our basic understanding of odds as “the probability of a success divided by the probability of a failure.” We did have some humorous misunderstandings, such as when Sheppard stared at us in shock as we discussed the “gender discrimination” that occurred during our statistical test. Ultimately, the communication challenges forced us to gain a complete understanding of some of the statistical tools and terms we thought we already knew.
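The plain-language definition of odds above translates directly into a one-line calculation. This is just an illustrative sketch; the probability values are hypothetical, not drawn from the survey.

```python
def odds(p):
    """Odds of an event: probability of success over probability of failure."""
    if not 0 <= p < 1:
        raise ValueError("probability must be in [0, 1)")
    return p / (1 - p)

# A hypothetical event with probability 0.75 has odds of 3 (three to one),
# while a 50-50 event has odds of exactly 1.
three_to_one = odds(0.75)
even_odds = odds(0.5)
```

Framing odds this way also makes logistic regression easier to motivate to non-statisticians, since its coefficients describe changes in the log of exactly this ratio.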
Challenges also arose when we met with the research groups to discuss their primary research questions. When we listened to their hypotheses, we became familiar with the nuanced meanings of some of the terms used within sociology. For instance, we learned the different connotations that terms such as “gender,” “a college hook-up,” and “transgender and transsexual roles” could have within our campus community and how important it was to define them accurately in our study if we wanted to produce relevant data. Overall, our ability to define and learn different terms across the two disciplines was a beneficial and necessary first step in the consulting process.
Defining Our Role: Forming Questions
Upon meeting with the groups, our first task was to help them brainstorm about how they might improve their research questions and hypotheses. With limited space on the survey reserved for each group, it was important that they have specific questions in mind. We also encouraged them to think ahead to the data analysis stage—do the questions they ask provide the outcomes necessary to assess their hypotheses? We worked together to formulate questions that would yield responses addressing their primary and secondary research questions.
As statisticians, we tried to quantify as many variables as possible, which can be challenging when it comes to asking questions about attitudes or emotions. Eventually, we were able to find solutions that satisfied both parties, such as building Likert scale indexes or including quantitative responses for outcomes such as the total number of relationships a subject had during a given period. It was helpful for us to review the electronic questionnaire before it was sent out because we could think ahead and make suggestions for how to collect and code the data. We also were able to start thinking as a consulting group about how we might analyze the data.
Even more important to explain early on was our intended role as undergraduate statistical consultants. After all, the sociology students were our peers, and we did not want to step on any toes. As it turned out, some groups came expecting us to simply guide them through the stages, while others came under the impression that we would do the complete statistical analysis for them. Fortunately, our experience through CIR had somewhat prepared us for this diversity of statistical clients.