Statistical Learning and Data Science Has Year of Successful Webinars

1 December 2021
Jaime Lynn Speiser, Assistant Professor of Biostatistics and Data Science, Wake Forest School of Medicine

    The Section on Statistical Learning and Data Science launched a monthly webinar series in 2021. The webinars, free for attendees, highlighted topics in data science across industry and academia. Each webinar included both introductory and advanced material.

    Invited speakers were selected based on their expertise in mathematics, statistics, biostatistics, computer science, and industry. They represented a diverse pool of experts at differing levels in their careers, each with a unique perspective to share.

    The webinars took place on Zoom toward the end of each month. Attendees registered for each webinar through a link shared on the section’s discussion board. In total, the section sponsored 11 webinars in 2021. Four main themes emerged: career advice, reproducibility, interpretability, and methods in data science.

    The first webinar theme was career advice for success in data science. Helen Zhang spoke about building data science teams based on her experience with the University of Arizona Transdisciplinary Research in the Principles of Data Science network. Wayne Lee shared his experiences across industry and gave advice about careers in data science in classical industries (e.g., agriculture and manufacturing) versus modern industries (e.g., technology). Sarah Kalicin, the section’s president, gave a webinar about successful career relationships involving mentoring, coaching, and sponsorship. All the speakers gave advice about communicating within data science teams, seeking different types of careers in industry and academia, and leveraging relationships for career advancement.

    Two webinars focused on reproducibility in data science. Brian Lee Yung Rowe provided a framework for reproducibility and automation in data science projects based on computational graphs. Byron Jaeger gave an overview of Git and GitHub, which facilitate version control and collaboration. Given the current reproducibility crisis in our field, these webinars were timely and informative for improving reproducibility in data science projects across industry and academia.

    The next theme emerging from the series was interpretable data science and artificial intelligence. Beth Wolf discussed variable importance measures for common machine learning models with examples in medicine. Polo Chau shared projects involving scalable, secure, and interpretable artificial intelligence with applications in image recognition and cyber security. These webinars highlighted the importance of interpretability in data science and understanding inputs and outputs of data science models.

    The final theme for the series was advancing data science methods. Naomi Brownstein presented a testing framework for clusterability, which helps determine whether unlabeled data have a cluster structure. Andreas Ziegler discussed calibration techniques for binary outcome models that can be used to adjust predictions and increase accuracy for validation with external data. Jean Feng gave an overview of deep learning and shared her work on methods for variable selection with deep learning for small-scale data. Nesime Tatbul spoke about challenges and opportunities for time series analysis with data science and introduced new methods for anomaly detection.

    These four webinars featured novel methods being developed in data science. One takeaway is that many opportunities remain for developing new methods in the field.

    The webinars from the section’s first year were recorded and are available.

    If you have suggestions for webinar topics and/or speakers, email sldswebinar@gmail.com.
