
New Master’s or Doctoral Data Science/Analytics Programs

1 December 2018
Steve Pierson, ASA Director of Science Policy

The proliferation of master’s and doctoral programs in data science and analytics continues, seemingly due to the insatiable demand of employers for data scientists. Amstat News started reaching out two years ago to those in the statistical community who are involved in such programs to find out more. Given their interdisciplinary nature, we identified programs involving faculty with expertise in different disciplines (including statistics, given its foundational role in data science) to jointly reply to our questions. We have profiled many universities in our April, June, and December 2017 issues and January and April 2018 issues; here are three more.

    COLUMBIA

    Tian Zheng is a professor of statistics and associate director for education for the Data Science Institute at Columbia. She develops novel methods for studying complex data from different application domains and is currently the chair-elect for the ASA’s Statistical Learning and Data Science Section.

     

    Jeannette Wing is Avanessians Director of the Data Science Institute and professor of computer science at Columbia University. Before Columbia, she was corporate vice president of Microsoft Research. She is widely recognized for her intellectual leadership in trustworthy computing.

     


    Cliff Stein is a professor of industrial engineering and operations research and computer science at Columbia and chair of the curriculum subcommittee of the Data Science Institute’s education committee. He has been conducting research in combinatorial optimization, scheduling, and algorithms for large data.

     


    Daniel Hsu is an associate professor in the Computer Science Department and a member of the Data Science Institute, both at Columbia University. His research interests are in algorithmic statistics and machine learning.

     
    Degree name: Master of Science in Data Science
    Year in which first students graduated/expected to graduate: December 2015
    Number of students currently enrolled: 327 (two cohorts)
    Partnering departments: Data Science Institute (lead), Computer Science, Statistics, Industrial Engineering and Operations Research
    Program format: In-person; 30 credit hours required; a capstone project at the end of the program
    Full-time/Part-time: We have both full-time and part-time students from a wide range of backgrounds (e.g., arts, humanities, business, science and engineering). Our students are at different career stages, from recent college graduates to mid-career managers.

      What are the basic elements of your data science/analytics curriculum, and how was the curriculum developed?

      An interdisciplinary education committee has been an important part of Columbia’s Data Science Institute (DSI) since the beginning, with members from computer science (CS), statistics, industrial engineering and operations research (IEOR), and other departments. This education committee discussed and developed the curriculum for the MS in data science program. Twenty-one credits of the program are core required classes, and nine credits are electives. The core required classes include three courses from statistics (two foundational courses in probability and statistics, one course on exploratory data analysis and visualization), three courses from CS (algorithms, machine learning, and computer systems), and one capstone course with a curriculum component in data ethics.

      Prerequisites for admission include mathematical preparation and some familiarity with computing. Prior industry experience is valued during the admission process, but not required. During the program, students with prior coursework in statistics or CS can be granted waivers for some of the core required courses to provide the flexibility to take more electives.

      As the DSI emphasizes interdisciplinary research and collaboration, we provide students with the flexibility to look across campus at domain areas to fulfill elective requirements. In addition to taking advanced coursework in CS, statistics, and math, DSI students have taken technical classes in business, law, journalism, architecture, bioinformatics, and various departments throughout the university.

      Students often take advantage of the many research opportunities across campus to gain additional hands-on experience, which can be used for elective credit. Many students will intern during the summer. DSI offers career support to obtain internships, including hosting a DSI internship fair in the spring.

      MS students are required to complete a capstone project during their final semester. This course provides a unique opportunity for students in the MS in data science program to apply their knowledge of the foundations, theory, and methods of data science to address data science problems in industry, government, and the nonprofit sector. The course activities focus on a semester-length data science project sponsored by a faculty member, nonprofit organization, or industry affiliate of DSI. The project synthesizes the statistical, computational, and engineering challenges and the social issues involved in solving complex real-world problems.

      Data ethics is embedded in our curriculum as discussions in individual courses and a more focused mini-curriculum in the capstone course.

      What was your primary motivation(s) for developing a master’s data science/analytics program? What’s been the reaction from students so far?

      Data science is emerging as a vital intellectual discipline driven by the increasing demand in all sectors for skilled practitioners who can extract value from today’s data. Because the field is highly interdisciplinary, aspiring students need training in computer science, statistics, and optimization algorithms to become data scientists who can solve applied problems around understanding, exploring, and forming predictions from data.

      The Columbia University MS program in data science aims to prepare a workforce of data scientists for careers in this rising field. Graduates of this program will pursue careers as data scientists, analysts, and researchers across all sectors.

      Our program attracts students from a diverse pool. While the majority of applicants have an engineering or technical background, about 21 percent of the fall 2018 applicants earned a degree in math or statistics and 19 percent hold degrees in nontechnical disciplines, including biology, business, economics, law, medicine, philosophy, physics, psychology, religious studies, and urban planning.

      The fall 2018 admissions cycle had 1,624 applications with a 17 percent acceptance rate. Of the 174 MS students who enrolled this fall, 24 percent are US citizens or permanent residents and 34 percent are female. Our international students come from 16 countries, including China, India, France, South Korea, Mexico, and Thailand.

      How do you view the relationship between statistics and data science/analytics?

      Statistics is a foundational area for data science that provides theory and methods for understanding variation and trends in observed data and deriving inferential insights about the data-generating mechanism behind the data.

      It is especially essential for drawing interpretable inferences and predictions based on statistical models and machine learning methods and addressing the biases and uncertainty in a data science application.

      Statistics complements other areas of data science, such as machine learning and optimization, which provide the algorithmic and mathematical tools that enable the statistical methodologies, as well as nonstatistical models, for data science applications.

      Subjects that may not have traditionally been in the purview of classical statistics, such as computational complexity, have become active research areas of statistics, in part due to increased interactions with other data science disciplines.

      What types of jobs are you preparing your graduates for?

      The Data Science Institute programs prepare graduates for roles throughout the data science lifecycle of a company. Our graduates have placed in roles such as data scientist, data engineer, data strategist, software engineer on a machine learning team, machine learning engineer, strategic consultant, and quant analyst.

      The advanced technical and statistical training our students receive prepares them well for companies in need of big data support in every industry and throughout the world.

      DSI grads are contributing to recommendation engines at large tech companies, detecting fraud and inappropriate content at social media companies, mapping the needs of underserved neighborhoods using Twitter data, creating new investing strategies at finance firms, managing algorithms for post-disaster response for large cities, creating fraud detection software, and solving many other corporate and societal challenges.

      What advice do you have for students considering a data science/analytics degree?

      At Columbia, we believe data science should touch all fields, professions, and sectors. We consider applications from students of all academic backgrounds as long as they are motivated to learn data science and well-prepared in math and computing, which can be demonstrated in one’s application through nontraditional preparation such as nondegree courses, work, and/or research experience. Our program’s core ensures rigorous training in data science, while our personalized advising model provides flexibility to support different learning trajectories. Students from fields other than CS, statistics, and IEOR are all welcome to inquire and apply.

      Our program’s core provides students with a set of skills that overlaps with programs such as CS, statistics, or IEOR but has its own distinct flavor. Every year, there are more jobs in a variety of areas that require the distinct blend of skills emphasized in our data science program.

      For future data scientists who are considering a data science degree, our advice is to look for programs that are well grounded in the foundations of data science (including statistics), provide experiences with real-world data science applications, and have data ethics embedded in the curriculum.

      Describe the employer demand for your graduates/students.

      Employer demand for data science graduates is high and critical to the success of evolving businesses. Every industry (including finance, tech, health care, media, government, and nonprofits) is growing its data science talent pool.

      DSI graduates fill roles that fall within the data lifecycle of a company. In the more than four years since our academic programs launched, more than 500 companies have recruited directly from them, and 98 percent of our students have placed in the field, demonstrating the high demand for our graduates.

      Do you have advice for institutions considering the establishment of such a degree?

      Data science is a “team sport.” It takes substantial collaboration to create an interdisciplinary program in data science. Institutions should create incentives for departments and individual faculty and provide resources for the program to support such a collaboration.

      For example, administratively, academic programs may need to be hosted in an academic school/department. Having an interdepartmental program housed in a single-discipline department adds a burden to the host department and creates different “classes” of students within the same department or shared space that compete for limited resources.

      At Columbia, although the MS in data science program is administratively hosted in the CS department, the DSI serves as the primary operating unit. This provides our students undivided support for their academic life on campus, ranging from advising and collaborative space to career development.

       

      UNIVERSITY OF KANSAS

      Mandy Rametta, Matthew Mayo, Shana Palla, and Jo Wick

      The graduate education team in the department of biostatistics at the University of Kansas Medical Center consists of Matthew Mayo, director, professor, and founding department chair; Jo Wick, associate director and associate professor; Shana Palla, assistant director and teaching associate; and Mandy Rametta, education coordinator. Together, they oversee the PhD in biostatistics; MS in biostatistics; MS in applied statistics and analytics; and graduate certificates in biostatistics, applied statistics, data science, and biostatistical applications.

      Degree name: Applied Statistics and Analytics Master’s Program – Data Science Emphasis
      Year in which first students graduated/are expected to graduate: 2019–2020
      Partnering departments: None. The department of biostatistics hired statisticians with computer science backgrounds to assist with curricular development and teaching.
      Program format: 100 percent online. Requires 30 credit hours: 12 in core areas including linear regression, categorical data analysis, professionalism, leadership and ethics, and multivariate analysis; 12 in emphasis areas including programming in R, data visualization and acquisition, data mining, and statistical learning; and six hours of elective courses.

      This program will provide students integrated training that spans the domains of statistics, machine learning, data visualization, and workflow management. The statistics core courses will give graduates exposure to commonly used statistical methodologies for continuous, categorical, and multivariate response data, along with providing students with hands-on experience working with and analyzing large data sets.

      The computing courses will provide graduates with experience working with the most common statistical software, and the data science emphasis courses will provide graduates with hands-on training in acquiring, visually exploring, and analyzing big data and building predictive models.

      In addition, online access to all required coursework provides flexibility to accommodate working professionals seeking advanced training in data science.

      With full-time enrollment, this program can be completed in four semesters plus one summer semester. Assuming part-time enrollment is equivalent to 9–12 credit hours per calendar year, the program can be completed in 2.5–3 years. Students may have any undergraduate background, but are required to have a B average or higher in Calculus I and II, as well as a course with a computer programming requirement.

        What was your primary motivation(s) for developing a master’s data science/analytics program? What’s been the reaction from students so far?

        Students from the MS in applied statistics and analytics program were asked if they would be interested in the new data science emphasis. In the first semester the data science emphasis was offered, 32 of the 112 (29 percent) students enrolled in the MS in applied statistics and analytics program switched to the data science emphasis from the other two emphasis options: statistics and analytics.

        How do you view the relationship between statistics and data science/analytics?

        The field of data science has grown in popularity so quickly that the formal definition of a data scientist is still developing, but it is viewed as an integrated discipline that draws on statistics and computer science (and sometimes other areas of expertise). As a department built on statistics, we think a solid foundation in traditional statistical methodology is paramount to being a successful data scientist.

        What types of jobs are you preparing your graduates for?

        MS in applied statistics and analytics graduates with a data science emphasis can be employed in many sectors, including business (e.g., marketing, economics, engineering), biomedical, finance (e.g., accounting, billing), manufacturing, insurance, professional services, and information technology. According to Forbes, data is growing faster than ever and every human being is creating 1.7 megabytes of data. IBM predicts the demand for data scientists will increase 28 percent by 2020. IBM also states that 39 percent of data scientist and advanced analyst positions will require a minimum of a master’s degree.

        What advice do you have for students considering a data science/analytics degree?

        We would advise prospective students to contact working professionals in the field of data science to get first-hand insight into the responsibilities, challenges, and rewards of being a data scientist. We would also encourage prospective students to peruse a recently published article in Harvard Business Review, titled “What Data Scientists Really Do, According to 35 Data Scientists.”

        Describe the employer demand for your graduates/students.

        There is a rapidly growing demand in the workforce for graduate-trained data scientists with excellent hands-on skills in statistical and computational methods for the acquisition and analysis of big data coupled with strong communication skills. This MS in applied statistics and analytics program with a data science emphasis is designed to produce master’s-level trained data scientists with knowledge, experience, and skills sufficient to make immediate impact within the workforce.

        Do you have any advice for institutions considering the establishment of such a degree?

        Data science is a rapidly developing discipline. As such, it is paramount to keep a data science program flexible, so it can evolve to meet the needs of students who will soon be entering the workforce. Therefore, we recommend forging industry partnerships, as we are doing, that allow the institution to stay abreast of changes in the profession.

         

        UNIVERSITY OF TORONTO

        Arvind Gupta is a professor of computer science at the University of Toronto and co-founder of Palette. He has served as UBC’s president and vice chancellor and is the former CEO and founder of Mitacs, a leading Canadian nonprofit research and training organization.

         

        Matt Medland is an assistant professor, teaching stream, at the University of Toronto’s Department of Computer Science and the managing director of the Master of Science in Applied Computing (MScAC) program.

         

        Nathan Taback is an associate professor in the University of Toronto Department of Statistical Sciences and the department’s director of data science programs. His interests include data science, statistical consulting, biostatistics, and statistical education.

         

        Degree name: Master of Science in Applied Computing, Data Science Concentration (MScAC-DS)
        Year in which first students graduated/expected to graduate: 2012
        Number of students currently enrolled: 115
        Partnering departments: Lead for MScAC is Computer Science; Lead for MScAC-DS is Statistics
        Program format: In-person, full-time, 16 months

        • First eight months: four technical graduate courses (two in statistics, two in computer science)
        • Two professional skills courses over the lifetime of the degree
        • Last eight months: research internship

        What are the basic elements of your data science/analytics curriculum, and how was the curriculum developed?

        A good data scientist requires expertise in statistical reasoning and inference; training in data management, manipulation, computation and analysis; and experience in scientific or industrial collaboration. We designed our curriculum with this ideal set of skills in mind.

        Our students take four graduate courses. Students choose two of the four courses from the department of computer science and the other two from the department of statistical sciences. Coursework should also include a course in data science methods, collaboration, and communication.

        A unique aspect of the program is the opportunity to collaborate on an applied research project in an industrial setting. The program’s eight-month internship with one of our many industrial partners allows students to gain experience in applying their research skills in an industrial setting. These paid internships involve supervision by both an academic and industrial adviser.

        What was your primary motivation(s) for developing a master’s data science/analytics program? What’s been the reaction from students so far?

        Our media constantly reminds us of the emergence of large-scale complex data in nearly every facet of daily life. There has been a massive increase in the amount of data available from new technologies that seem to be emerging on a daily basis. New data sources such as network data, image data, and streaming data are all part of a trend set to intensify. So too will the need and demand for data scientists.

        The significant number of research problems arising from industry, industrial demand for data scientists, and students’ interest in the field spurred us to develop the MScAC-DS.

        Additionally, Ontario is planning a 25 percent increase in the number of STEM graduates over the next five years, which includes boosting the number of graduates in AI-related fields such as data science. The University of Toronto has become an internationally recognized center of excellence in AI-related fields, and Toronto has a vibrant ecosystem of companies in this area.

        The program grew organically with students in computer science and statistics taking courses in both departments to essentially try to complete what we are now calling a data science degree. In 2013, for example, a student who had enrolled in the MSc program in statistics took machine learning in computer science and has been working as a data scientist at Amazon since graduation.

        There has been an overwhelming interest from students in the program.

        How do you view the relationship between statistics and data science/analytics?

        The emergence of large-scale complex data in every facet of academic and daily life has been accompanied by an increasing demand for expertise at the interface of the computational and statistical sciences, particularly machine learning. The importance of this interface has only grown in time—a trend that argues for the continuing integration of the two disciplines.

        No professional statistician can possibly hope to make a meaningful contribution in collaboration or research without serious computational skills. Conversely, computer scientists need far more in-depth statistical training to fully understand the behavior and impact of the tools and algorithms they invent. The MScAC-DS serves to meet the above demand.

        What types of jobs are you preparing your graduates for?

        The main goal of the MScAC-DS program is to teach students how to apply their knowledge of statistics and computer science to real-world problems in an industrial setting. The demand for well-rounded data scientists can’t be overstated.

        For example, the financial technology (FinTech) sector in Toronto has seen a rapid increase in demand for data scientists in recent years. In 2016, a total of 25 data science FinTech internship positions were posted to the MScAC cohort; only 14 were filled due to a lack of available students.

        FinTech is not the only source of demand for data science industry positions. Demand is strong across a wide range of sectors, including the biomedical, mobile and IoT, manufacturing, and IT sectors. Over the last five years, more than half of MScAC internships involved data science.

        What advice do you have for students considering a data science/analytics degree?

        The MScAC-DS requires students to take graduate-level courses in both computer science and statistics, so students should have taken appropriate courses at the undergraduate level in preparation. The majority of students entering our program have some work experience.

        The intersection of statistics and computer science is an intellectually rich, diverse, and growing area. Demand will likely continue to grow for quite some time as we are still at the start of the data revolution.

        Our data science students acquire analytical and problem-solving skills along with the demonstrated ability to apply their knowledge in real-world settings. The MScAC-DS program welcomes students with a wide range of academic backgrounds from statistics, math, computer science, economics, and engineering.

        Do you have any advice for institutions considering the establishment of such a degree?

        The basis for the success of the MScAC-DS is the true partnership between U of T’s department of computer science and department of statistical sciences. Part of our partnership’s strength and success lies in our shared focus on meeting a real-life societal demand for data science research expertise.

        What is also working in our favor is the balancing of responsibilities—with computer science leading the MScAC program and statistics leading the data science concentration. Statistics is responsible for ensuring the right breadth of courses for students, while computer science is responsible for ensuring students have the appropriate computer science courses available to them. Only one team puts together research internships for all students in MScAC, which ensures research standards are met.
