We Love Data Competitions

1 May 2019
Christian Lucero, Virginia Tech Department of Statistics

    Students at Virginia Tech (VT) enjoy competing in the ASA DataFest every spring. This enthusiasm is shared by everyone involved, including the competitors, organizers, judges, and spectators. Chances are, many of you have similar feelings about this event and can share a number of anecdotes about your own experiences. However, I would like to take this opportunity to tell you more about our students and their passion for data science and data competitions in particular.

    Shortly before I joined VT, a new major titled Computational Modeling and Data Analytics (CMDA) had taken root on the Blacksburg campus. Before this program existed, students interested in data science took courses in mathematics, statistics, and computer science, hoping to assemble a complementary set of tools that would prepare them for a role as a data scientist. The new major combines the essential elements of these disciplines into an integrated curriculum, including 10 new courses designed specifically for the major. In just a handful of years, enrollment has reached nearly 500 undergraduates, with many students double-majoring or minoring in statistics. Ultimately, with so many interested in data science, our students decided a single competition like the ASA DataFest was not enough; they simply wanted more. This led us to develop our own local competition, which is held in the fall semester.

    Interested in Data Competitions?
    ASA DataFest is a great place to start! If you would like to attract younger students, try Statsketball, which is aimed at challenging high-school and undergraduate students to predict the outcome of NCAA basketball tournaments using statistics. Finally, the ASA hosts a biannual data exposition in which students submit a poster designed to highlight important aspects of a data set.

    After the 2017 ASA DataFest, the students who completed the event provided valuable feedback. Nearly every sentiment expressed was positive. For example, the participants overwhelmingly said they enjoyed learning a lot in a very short period of time. They were also excited to showcase their skills to their peers, the faculty, and the corporate sponsor representatives who served as judges for the event.

    While there weren’t many criticisms, two frequent complaints stood out, and we thought they possibly contributed to the competition’s high dropout rate (about 40–60% of the initial 40+ teams each year). The first involved the length of the competition: many participants said the 48-hour window can be overwhelming for first-time competitors. The second came from students who knew only a handful of data visualization techniques and classical statistical methods and did not think they could compete with the more advanced students.

    With these issues in mind, we set out to develop our own competition aimed at giving our students another opportunity to practice their skills while also providing a gentler introduction to data competitions.

    On November 6, 2018, we wrapped up the second annual CMDA Fall Data Competition. The primary goal was to minimize the number of participants who dropped out while helping those who stayed build confidence in presenting their work before their peers and judges. The format of the competition is as follows:

    1. The competition lasts a full week. This allows students who have scheduling conflicts to find at least some amount of time to devote to the competition. This also allows students to learn new skills and consult with experts.
    2. There are two competition tracks: beginner and advanced. There is a different data set for each track. The beginners are expected to focus more on visualization methods and elementary statistical methods. The advanced participants typically use statistical learning methods and focus more on building models.
    3. Groups present to each judge individually as the judges walk around a tri-fold poster session. This gives the judges more dynamic interaction with the groups than the presentation format of the ASA DataFest allows.

    The new competition has been a hit with our students. The dropout rate has fallen to around 25–30% (procrastination is now identified as the primary factor). The biggest source of praise is the tri-fold poster session. Students are able to discuss their work in much greater detail with all who want to listen. There is also a sense of camaraderie as groups take turns showcasing what they were able to accomplish during the week. Finally, the judges have expressed a fondness for this format, as they get a bit more time to absorb the thought process that went into the body of work in front of them.
