Put the Data in DataFest

1 July 2019

Rob Gould

    An informational meeting will take place Monday, July 29, from 9:00 a.m. to 10:30 a.m. in Mineral Hall A of the Hyatt at the Joint Statistical Meetings in Denver for those interested in having their students compete in the next DataFest or “donating” data.

    The ASA DataFest—held every spring at more than 40 colleges and universities around the world—brings together a community of undergraduate students, faculty, graduate students, and data professionals. But without the data, there is no DataFest, which is why we are asking for help in finding data for future DataFests.

    A data set and challenge contributed by an organization are the heart of ASA DataFest. Previous data donors include the Los Angeles Police Department, Kiva.com, eHarmony, GridPoint, Edmunds.com, Ticketmaster, Expedia.com, and Indeed.com.

    The Canadian National Sports Institute sponsored the 2019 event by providing GPS and accelerometer data for the Canadian National Women’s Rugby 7 team for every game in the previous season. The data also included daily health and training data.

    Ming-Chang Tsai, the researcher who provided the data, challenged students to describe the role of fatigue in the team’s play. Several thousand students worked on this problem from a variety of viewpoints. Some applied factor analysis or other approaches to convert the daily self-reported information about exertion, sleep quality, and mood into a single “fatigue” score. They then used this score to evaluate the effectiveness of training drills. Others provided striking visualizations that examined the multivariate relationships between self-reported measures and training-day outcomes. One team from Pomona College scraped players’ Instagram feeds to discover when and where the team traveled for away games and found fatigue spiked when games were too close to long flights.

    The primary quality we look for in a data set is personality; we want students to know there are humans behind the data who are truly interested in what the students might discover. A good data set will be rich and complex. Ideally, it will have at least 100 relevant and unique variables and provide students of all levels and backgrounds with numerous “pathways” for meeting the challenge.

    Due to logistical constraints, we cannot ask students to sign nondisclosure agreements. We therefore expect to work with the data donor through an iterative process from August through February to provide a data set that meets our criteria while also satisfying the donor’s confidentiality and proprietary interests.

    Donating data provides your organization with the opportunity to reach out to thousands of talented and up-and-coming data professionals from a variety of institutions.

    If you have access to data that might be suitable for DataFest, contact Rob Gould or Donna LaLonde.
