
Practical Significance Take Two—How Can I Help You Today?

3 June 2024
This interview with Mark Glickman, senior lecturer on statistics at Harvard University; Tian Zheng, statistics professor and statistics department chair at Columbia University; and Hongtu Zhu, biostatistics professor at The University of North Carolina, was conducted by Practical Significance co-hosts Donna LaLonde and Ron Wasserstein during a recent episode in which they discussed the implications of artificial intelligence (AI) for our profession.

Wasserstein: How should the field of statistics evolve to meet the challenges of AI development and application?

Glickman: Recently, AI has been an area statisticians have been paying pretty close attention to. Over the last 10 years or so, there have been some incredible successes in AI. The methods that underlie a lot of these procedures, particularly in neural nets and deep learning, are a little mysterious.

On the one hand, there is a solid foundation for many of these methods, but on the other hand, they don't seem to correspond to things statisticians have been thinking very much about. One of the burgeoning areas in statistics, and one of its connections to AI, is understanding the underpinnings of these AI methods, which just seem to work.

So, there are certainly many statisticians doing amazing work at the forefront of understanding the underpinnings of these mostly deep-learning methods that are driving technology in impressive ways. The other area of AI statisticians are involved with, not so much in a direct developmental way, is helping to evaluate AI methods and working with their output to develop approaches that are more familiar to statisticians.


For example, conformal inference is a big area that's attracting attention among statisticians, mainly because it starts with what could be a black box and then makes predictive inferences without knowing very much about what generates the observations. It's a very powerful tool statisticians are jumping onto and a good example of an area in which statisticians continue to make important progress in parallel with the development of AI.
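
To make Glickman's description concrete, here is a minimal split-conformal sketch in Python. It is an illustration only: the data are synthetic, and the random forest is a hypothetical stand-in for whatever black-box predictor one actually has.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic data; any real regression data set would do.
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=500)

# Split: fit the black box on one part, calibrate on the other.
X_fit, X_cal, y_fit, y_cal = X[:300], X[300:], y[:300], y[300:]
model = RandomForestRegressor(random_state=0).fit(X_fit, y_fit)

# Conformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))

# Finite-sample quantile; gives at least 90% coverage under exchangeability,
# with no assumptions about how the black box works internally.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

x_new = rng.normal(size=(1, 3))
pred = model.predict(x_new)[0]
print(f"90% prediction interval: [{pred - q:.2f}, {pred + q:.2f}]")
```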

Zhu: For the field of statistics to move forward in the AI era, there are many things we need to consider. First, the current curriculum needs to be modernized to meet the evolving demands of modern data science. Our curriculum was mainly developed 30 or 40 years ago and needs some modification, particularly in the areas of engineering capabilities, practical data analysis experience, and proficiency in data mining techniques.

Also, existing evaluation systems need some changes. Based on my observations, we need to speed up the review process in our field and make our papers more accessible to data scientists and practitioners. There needs to be a balance between theory and application. We also need to provide more opportunities to rising young stars and active researchers.

Additionally, the systems need to include all the data science–related journals and conferences, not just focus on statistical journals, because there are so many new journals and conference proceedings. We need to be open-minded. It is very important to encourage and promote greater participation by statisticians in various study sections and train them in effective communication and equitable contribution.

Zheng: I want to build on what Mark and Hongtu said. Mark covered the exciting research trajectories we are observing, and Hongtu shared needed infrastructure changes. So, I'm going to address a 'cultural shift.' As a discipline, and as a community, we have come a long way from 15 years ago, when big data first started drawing global attention. We now recognize and promote computing-intensive research, machine learning, and data-intensive applications more than we used to. Computing, machine learning, and data-intensive applications are three important pillars of AI.

Statistics had a slow start when big data first happened, but many departments came together and took on initiatives to make change happen. However, over the past few years, there has been a feeling that we are falling behind again. We're being left out of some of the AI conversations. This is because AI is moving at a much faster speed. A few technological breakthroughs in recent years enabled many more fields to embrace AI, computing, and data than in the previous big data era.


The wind is blowing stronger, and the statistics community needs to continue doing what we do well, but with a more proactive effort to accelerate and encourage more energetic participation. Big data calls for collaboration, and statisticians can contribute by being effective collaborators. AI is different. It is much more interdisciplinary and transdisciplinary than data science. It calls for end-to-end solutions and team science. If no one in statistics is willing to be a pioneer who goes into AI and leads AI applications, then our community will not be able to keep pace with AI research.

Wasserstein: What are the biggest challenges for statistics in the future of AI?

Zhu: There are two major challenges for statistics in the future of AI, beginning with a diminishing pool of students in statistics. The other is competing for new AI-related funding opportunities.

Zheng: I believe the biggest threats to our discipline are the talent pipeline, faculty development, and research resources. As Hongtu said, if we're not proactively revising our curriculum to enable our students and young researchers, statistics will not be part of today's fast-moving national AI research effort. We have students and young researchers, but our talent pipeline into AI has been drying up. Because of this, we are not competitive for AI resources. If this cycle continues, we need to worry for the next generation: whether we can continue to have generation after generation of bright, excited young statisticians who identify as decision-makers in data-intensive applications, are grounded in the foundational principles of statistics, and at the same time are able to embrace research in AI.

Glickman: One of my concerns has to do with the tension that has been going on between statisticians and computer scientists in the area of machine learning—deep learning in particular, where all this development has been done in a way in which statisticians have struggled to be at the table. It's important to make clear to the computer science community that statisticians add value, so we're not left behind.

Statisticians can be much more vocal about our expertise on uncertainty propagation, which most leaders in the field of AI may not be paying quite as much attention to. But it’s so ingrained in the statistician mindset, particularly when we’re working with enormous data sets being used for very personalized kinds of applications, like personalized medicine or personalized education. You do need to start worrying about the level of uncertainty in your conclusions, and that’s something statisticians can very much help with.
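
As a toy illustration of the kind of uncertainty quantification Glickman is pointing at, here is a minimal nonparametric bootstrap sketch in Python; the "patient response" data are entirely simulated and hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: simulated per-patient responses to a personalized treatment.
responses = rng.normal(loc=1.5, scale=2.0, size=40)

# Nonparametric bootstrap: resample the data to propagate sampling
# uncertainty into the estimated mean treatment effect.
boot_means = np.array([
    rng.choice(responses, size=responses.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"estimated effect: {responses.mean():.2f}, 95% CI: [{lo:.2f}, {hi:.2f}]")
```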

LaLonde: What about AI are you most excited about?

Glickman: Probably like many other people, I woke up one morning in late 2022 with the news that ChatGPT was made available to the public. The reason I paid attention was I was reading an article entirely written by ChatGPT, and then I was told, “Yes, this is something you could play around with.”

Even though I was certainly paying reasonable attention to what was going on in the world of AI, I was stunned at what some of these LLM-based generative AI algorithms could do. I’ll start by just simply saying I’m very excited about the promise of generative AI and a lot of what it can do.


I regularly use generative AI to help me with my writing, at minimum as a way of proofreading it, because the language models are so good you can pretty much guarantee your writing is going to improve. I also use generative AI for coding tasks. So, if I need to implement something quickly, or I just don't want to spend a half hour writing something up, I'll explain what I need to have coded up, and then I'll get my answer immediately. It's a huge, huge time saver.

Zheng: I’m a data person, so I’m generally excited about cool and fun data sets people put out there. It used to be very hard to find a collaborator willing to share data. Now I have collaborators reaching out: “I heard AI and ChatGPT can help us analyze the data.” So, I got very excited about this willingness to collaborate.

In the past, when I collaborated with a new discipline, I needed to learn how they organized their data and the special format their data was in. Over the past decades of big data research, we have done such a good job of educating people—and our collaborators have also been educating themselves in machine learning—that today's data are better formatted and collected using better research designs.

In addition, the kind of tools Mark mentioned, such as ChatGPT, also make collaboration easier. I used to ask a lot of questions to my collaborator about simple definitions and jargon. Now I have ChatGPT as a tutor—a very patient tutor. I can simply ask anything and then really develop an appreciation for the background knowledge quickly. I’m most excited about the richness of the opportunity available to us, available to nearly anyone in statistics who wants to embrace collaboration in AI.

Zhu: We can modify and adapt many AI tools for our projects and expand our reach into various applications such as NLP [natural language processing] and image analysis. AI is revolutionizing these areas. Also, I do many NLP-type projects using OpenAI models, so you don't need to collaborate with NLP researchers, which provides us with new opportunities.

We can integrate AI with many existing statistical methods for further method development. I combine neural networks with quantile regression for many projects, particularly in tech companies, such as estimating treatment effects and distributions or reinforcement learning types of problems. Quantile regression allows you to look at the problems much more clearly.
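
As a rough illustration of the ingredient Zhu mentions, here is a minimal Python sketch of the pinball (quantile) loss, fitted with a plain linear predictor and subgradient descent; in the setting he describes, the same loss would be attached to a neural network instead. The data are synthetic.

```python
import numpy as np

def pinball_loss(y, y_hat, tau):
    """Pinball (quantile) loss: asymmetric penalty that targets the
    tau-th conditional quantile instead of the conditional mean."""
    diff = y - y_hat
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 2))
# Heteroscedastic noise, so quantiles carry information the mean misses.
y = X @ np.array([2.0, -1.0]) + rng.normal(size=1000) * (1 + np.abs(X[:, 0]))

tau = 0.9                      # target the conditional 90th percentile
w, b, lr = np.zeros(2), 0.0, 0.05
for _ in range(2000):
    resid = y - (X @ w + b)
    # Subgradient of the pinball loss with respect to the predictions.
    g = np.where(resid > 0, -tau, 1 - tau)
    w -= lr * (X.T @ g) / y.size
    b -= lr * g.mean()

print(f"loss at tau={tau}: {pinball_loss(y, X @ w + b, tau):.3f}")
# Roughly 90% of observations should fall at or below the fitted quantile.
print(f"empirical coverage: {(y <= X @ w + b).mean():.2f}")
```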

We can also tackle many more complex theoretical problems, such as these pattern recognition problems. There are many challenges behind them. This is an opportunity for us, right? Let's create new models and tools. I'm excited about these new things. I don't feel that threatened. I want to embrace AI tools.
