MOOCs and Statistics Education: Fad or Opportunity?

1 November 2013
Marie Davidian

In 1983, as a graduate student at The University of North Carolina at Chapel Hill, I was handed the book Statistics—by David Freedman, Robert Pisani, and Roger Purves—and told I would have sole responsibility for teaching a section of STAT 11, the undergraduate introductory course for non-majors. I was to develop lectures for an audience of mostly non-quantitative students for whom the course was an unwelcome requirement. The technology at my disposal would be chalk and a blackboard (they were all black back then). At most, I would reach about 40 students (if they came to class).

In the 30 years since, things have changed dramatically. Tools and platforms for creating and disseminating lecture materials, assignments, and outside resources and advances in classroom technology have transformed how we communicate with students and relegated those black (and white) boards to auxiliary status. Data analysis and visualization software has spearheaded a revolution in how we introduce concepts. Interest in learning statistics has skyrocketed, even among students like those I taught in Chapel Hill.

And the Internet has facilitated reaching students who are not physically present in the classroom.

Although distance education courses have become commonplace over the past few decades, widespread interest in online learning among not only the higher education community, but the general public on a much broader scale, is more recent. Many institutions are launching entire online degree programs; several master’s programs in statistics are already under way.

Perhaps the most radical development has been the massive open online course, or MOOC. A MOOC is meant to reach an enormous audience of students worldwide, including those for whom traditional learning opportunities are infeasible, and provide open access to course content and platforms for interacting with fellow students. Articles about MOOCs abound in the popular media; The New York Times even named 2012 “the year of the MOOC”.

Several providers of MOOCs, both for-profit and not-for-profit, have emerged, including Coursera, Udacity, and edX. They operate through partnerships with consortia of major institutions whose faculty deliver the courses. The business model for MOOCs (financial sustainability, issuing of credit, and so forth) is still evolving, and there is both excitement and skepticism over whether MOOCs will revolutionize higher education.

What do MOOCs mean for statistics education? Many of the first MOOCs on introductory statistics were taught by computer scientists, psychologists, and others. However, several statisticians have since entered the MOOC arena and have a first-hand perspective on the implications for our discipline.

I asked Brian Caffo, Jeff Leek, and Roger Peng of the department of biostatistics at Johns Hopkins to share their experiences. All three are teaching MOOCs through the Hopkins collaboration with Coursera. Jeff teaches Data Analysis and Roger teaches Computing for Data Analysis, both of which are applied courses, and Brian teaches the more technical Mathematical Biostatistics Boot Camp 1 and 2. All are based on in-classroom courses they teach to first-year graduate students, with lectures delivered on video.

They admit that this happened a bit by accident. Jeff and Brian had been discussing incorporating blended learning into their courses, in which some content would be delivered via video, and looked into campus resources for implementing this approach. Simultaneously, the university was finalizing the agreement with Coursera, and, given they were already videoing lectures, they were asked to direct their efforts toward developing MOOCs. They recruited Roger to join them, and all had mere months to prepare their first MOOC offerings.

The courses Roger and Jeff teach attract breathtaking numbers of students. Jeff’s last offering of Data Analysis enrolled more than 100,000 and is #9 in cumulative enrollment among all Coursera offerings. Five days before the most recent offering of Computing for Data Analysis began, Roger had more than 68,000. Both remarked on the “stunning,” “maximal” variation in student backgrounds, from complete novices to experts. Roger’s course introduces R for data analysis, and students range from the completely R-naïve to computer scientists. Brian’s courses require calculus; nonetheless, his Boot Camp 1 attracted 16,000, 20,000, and 25,000 students the first three times it was offered and the current Boot Camp 2 has 9,000 enrolled. Some are mathematics professors who teach statistics; others are computer scientists, engineers, and physicists.

Not all enrollees complete the courses. Jeff reports that about 5%–10% of Data Analysis students earn a statement of completion, while another 30%–40% actively participate. That sounds low until one considers that 5% of the course enrollment is 5,000.

Managing this many students is daunting, and all discourage communication by email. They praise the course-dedicated online message boards, on which they spend half an hour to an hour per day answering questions. The boards facilitate interaction among students. Often, a question is answered by scores of fellow students, many from the other side of the globe, while the instructor sleeps! International study groups form, and all express amazement at the extent to which students who have never met help each other learn. They see the size and diversity of the learning community, which in a seeming contradiction can be more personal than that of a traditional class because of this mechanism for discussion and sharing, as a real value for students.

The biggest challenge? Developing meaningful assessments when individual grading is not feasible. Multiple choice is easiest to implement, but devising questions that reflect knowledge gained is not. Roger uses special tools to automate evaluation of the quality of student programming. Jeff’s course involves numerous data analysis projects. He uses a “peer-grading” system, whereby each student’s work is randomly assigned to four others, who evaluate it using a rubric Jeff has developed. This leads to valuable discussions among students about what constitutes a “good” data analysis, further enhancing learning.
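The column does not say how the peer-grading assignment is actually carried out, so the following is only a minimal sketch of one way "each student's work is randomly assigned to four others" could be implemented. The function name `assign_peer_reviewers`, the parameter `k`, and the circular-shift scheme are all illustrative assumptions, not a description of Jeff's course or of Coursera's platform. The scheme shuffles the class and has each submission graded by the next `k` students around the circle, which guarantees that nobody grades their own work and that every student grades exactly `k` submissions.

```python
import random


def assign_peer_reviewers(students, k=4, seed=None):
    """Return a dict mapping each student to the k peers who grade that student's work.

    Hypothetical helper for illustration only; not taken from any real course platform.
    """
    n = len(students)
    if len(set(students)) != n:
        raise ValueError("student identifiers must be unique")
    if n <= k:
        raise ValueError("need more than k students so nobody grades their own work")

    rng = random.Random(seed)
    order = list(students)
    rng.shuffle(order)  # random seating around a circle

    # The submission at position i is graded by the next k students on the circle.
    # Offsets 1..k are all non-zero and distinct modulo n (because n > k), so the
    # graders are distinct and never include the author.
    return {
        order[i]: [order[(i + offset) % n] for offset in range(1, k + 1)]
        for i in range(n)
    }


if __name__ == "__main__":
    roster = [f"student_{i}" for i in range(8)]
    for author, graders in assign_peer_reviewers(roster, k=4, seed=1).items():
        print(author, "is graded by", graders)
```

A design note: a fully independent random draw per submission would also work, but it can leave some students with far more grading than others. The circular scheme trades a little randomness for an even workload.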

All cite the opportunity to reach a vast audience and the efficiency with which statistical content can be widely disseminated, particularly to those who otherwise would not have access, as immensely satisfying. And approaches learned in a MOOC are brought back to the classroom, improving traditional instruction.

What about the future? All see an enormous potential for enhancing global statistical literacy. A MOOC for the lay public, a “Citizen’s Guide to Statistics,” could draw hundreds of thousands of students. However, they see doing this right as a challenge that would likely require a team effort, major time commitment, and significant dedicated institutional resources.

Roger notes that a model for support of MOOCs has still not been developed, at Hopkins or elsewhere. So far, relying on willing faculty like him and his colleagues has worked, but it is not sustainable. Even so, the benefits make it worth working out how to support MOOCs over the long term. Jeff sees the competition MOOCs can create among universities to offer the highest quality instruction as a positive force for encouraging all institutions to take teaching in all settings more seriously. As Brian points out, competing internationally to deliver quality instruction will inspire innovation and motivate instructors to continually update and refresh their courses.

On the flip side, concerns that a few stellar instructors will dominate teaching of a subject and have undue influence have already surfaced. As institutions seek to integrate provider-offered MOOCs into existing curricula, faculty have decried the resulting loss of in-person instructor-student engagement, as in the recent case (http://bit.ly/1hA8h3N) of philosophy professors at San Jose State refusing to use a famous MOOC developed by a Harvard professor for edX.

All acknowledge that the future is unknown. How MOOCs will affect degree programs remains to be seen. Roger notes that the MOOCs he, Jeff, Brian, and others offer seem to attract many students who would likely not enter a degree program at Hopkins in any case, so the courses may be filling a niche that will not result in increased degree enrollments. But Brian notes that their MOOC involvement has brought extensive exposure to the Hopkins Department of Biostatistics; for many people the world over, Hopkins biostatistics is statistics.

But is our profession sufficiently engaged? Roger, Brian, and Jeff are concerned that we are not, that we are failing to “future proof” ourselves against the threat of other fields co-opting the rich opportunities to disseminate statistical principles to diverse, vast audiences. They would love to see talented, innovative members of our profession become involved.

No one knows what the “mature” version of the MOOC revolution will look like. But Jeff, Roger, and Brian emphasize that they do not see MOOCs as a passing fad, and they are convinced by their experience that the demand for online statistics courses is enormous. In fact, all concur this demand may even extend to advanced courses. They see meeting the demand at all levels as a positive development for our field that will not take over or replace other forms of statistics instruction, but only serve to enhance and expand knowledge of our discipline across the globe.

One Comment »

  • Vincent Granville said:

    What about no course at all? I learned SAS, R, C++, SQL, Perl, XML, HTML, Markov Chain Monte Carlo, Hierarchical Bayesian Models, Constrained Logistic Regression, Time Series (including spectral analysis), Penalized Likelihood, Stochastic Point Processes, Web Crawlers, and more without ever attending a lecture on the subject. Even when I was a university student, I showed up only to pass the exams, which I passed more successfully than those who attended the lectures.

    Sure, not all students are like me, but I would imagine 50% are self-learners: all these students need only a list of resources – web sites, open source tools, list of projects to work on and books – to get started and become a data scientist or statistician or computer scientist. Of course you don’t get an official diploma if you learn all by yourself, but if you want to become a consultant, freelancer or entrepreneur, jobs of the future indeed, who cares?