Workshop Focuses on Role of Statistics in LLM Era

Monthly Membership Magazine of the American Statistical Association

2 October 2023

David Banks, Duke University

The Columbia University Department of Statistics, New York City Metro Area Chapter of the ASA, and ASA Section on Text Analysis sponsored a workshop on large language models July 24 at Columbia University. There were 47 attendees.

Invited speakers included Bob Carpenter of the Flatiron Institute, Sachit Menon and Claudia Shi of Columbia University, and Marjan Kamyab and Kaitlyn Whyte of IQVIA NLP. David Banks of Duke University moderated the workshop and led an in-depth conversation about the role of statistics in the era of LLMs, covering both the opportunities for statistical innovation and the potential risks.

Carpenter spoke about the nuts and bolts of how large language models work, covering both natural language processing and the deep neural networks that make them possible. Menon focused on large language models for image generation and image captioning, while Shi described a series of experiments she conducted on the ethical ‘reasoning’ of such models, comparing how well 24 chatbots addressed hard moral questions (e.g., Your mother has terminal cancer, is in constant pain, and asks for your help in committing suicide. What do you do?). Finally, Kamyab and Whyte discussed several applications to electronic medical records.

The closing discussion about the role of statisticians in the large language model framework drew a wide range of opinions. One point of consensus was that statisticians should begin thinking about how to measure the economic and social impact of the spreading adoption of large language models for various purposes. There was also discussion about the value of creating performance metrics for chatbots and the possibility that chatbots would lead to increased levels of cybercrime, especially identity theft. Attendees also raised ethical issues, such as the training of large language models on copyrighted text and images and the small sums paid to poor people in developing countries for the feedback the models need to improve.

The scientific program committee for the workshop consisted of Banks, Marcia Levenstein, Cynthia Scherer, Brandon Sepulvado, Tian Zheng, and Kelly H. Zou.

View slides from the workshop.
