Statistics: Your Chance for Happiness (or Misery)
Xiao-Li Meng, Whipple V. N. Jones Professor of Statistics and Department Chair, Harvard University
Editor’s Note: The full version of this article was originally published as an op-ed for The Harvard Undergraduate Research Journal in April 2009.
“I keep saying the sexy job in the next 10 years will be statisticians. People think I’m joking, but who would’ve guessed that computer engineers would’ve been the sexy job of the 1990s? The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades.”
— Hal Varian, Google’s chief economist
As a professor of statistics, I, of course, cannot agree more with Hal Varian. But, as a statistician, I am obligated to remind you that a professor of any subject can find quotations—tons of them—to demonstrate the importance of his or her beloved subject.
Wait! Does that reminder have anything to do with being a statistician? Well, let’s label this question as Puzzle One and read on. And while we are at it, let me throw in another quote, this time from a recruiter representing Wall Street—yes, they are still hiring—but read this carefully:
Here, the word randomness is what brings statistics and statisticians into the picture. Statistics, in a nutshell, is a discipline that studies the best ways of dealing with randomness, or more precisely and broadly, variation. As human beings, we tend to love information, but we hate uncertainty—especially when we need to make decisions. Information and uncertainty, however, are actually two sides of the same coin. If I ask you to go to the airport to pick up a student you have never met, my description of her is information only because there are variations; if everyone at the airport looks identical, my description has no value. On the other hand, the same variation causes uncertainty. If all I tell you is to pick up a Chinese female student by the name of Xiao-Li (meaning “little beauty” in Chinese, not “plough at dawn” as my name means), then my description is not informative enough because it still allows too many variations. There may be a substantial number of individuals at the airport who look like a Chinese female student. You then need to do something creative to pick up the right one, such as making a name sign.
Then again, the name sign is useful for her to identify you as the one picking her up only because there is variation among names. Indeed, if it happens that there are two Xiao-Li name signs outside the terminal, she will need to do something creative to find you. This is, of course, trivial, and any of us would recognize and deal with the situation upon encountering it. But, we may or may not recognize the deeper principle behind it: Information is there for the same reason uncertainty is there.
While we are at the airport, let me throw in this almost well-known joke. Mr. Skerry needs to take a flight, but he is terrified by the possibility, however small, that someone will bring a bomb onto his plane. So, he decides to pack a bomb himself, as he reasons that the chance of two individuals bringing bombs onto the same plane is much smaller than that of one individual bringing on a bomb.
You, of course, are chuckling at this. However, which probabilistic/statistical principle is he trying to use, or rather violate? Can you easily explain to your fellow students why Mr. Skerry’s argument is ridiculous? If you cannot, then let’s label this Puzzle Two.
I hope the discussion above has helped you see more clearly, and fundamentally, why Google, Wall Street, and many other entities are increasingly interested in hiring statisticians. We are now squarely in the information age, with almost everything digitized. Each of us is trying to see what all the data (which don’t have to be numerical) are telling us about issues from personal health to the global economic crisis. There is so much variation in almost everything we want to know or study that one has to wonder what constitutes real information and what is just noise.
Mr. Skerry’s reasoning is surely ridiculous, but how many of us have realized that the many small probabilities reported in the media and even scientific publications (such as probabilities of DNA evidence) were based on exactly the same ridiculous reasoning, that is, multiplying probabilities inappropriately?
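One way to read Mr. Skerry's mistake is as a confusion between a joint probability and a conditional one. The sketch below is only an illustration under stated assumptions: the value of `P_OTHER` (one in a million) is made up, the function names are hypothetical, and the other passenger is assumed to act independently of Mr. Skerry. It compares the tiny product he has in mind with the probability that actually matters once his own bomb is a certainty.

```python
from fractions import Fraction

# Hypothetical value for illustration only: the chance that some other
# passenger brings a bomb onto the plane.
P_OTHER = Fraction(1, 1_000_000)


def joint_two_bombs(p_other, p_skerry):
    """P(Skerry brings a bomb AND another passenger does), assuming independence."""
    return p_other * p_skerry


def other_bomb_given_skerry(p_other, p_skerry):
    """P(another passenger brings a bomb | Skerry brings one) = joint / P(Skerry)."""
    return joint_two_bombs(p_other, p_skerry) / p_skerry


# The number Mr. Skerry is thinking of: two passengers who each carry a bomb
# with the same small probability. The product is indeed tiny.
print(joint_two_bombs(P_OTHER, P_OTHER))               # 1/1000000000000

# The number that matters: once he packs his own bomb, P(Skerry) = 1, and the
# chance that someone else also brings one is exactly what it was before.
print(other_bomb_given_skerry(P_OTHER, Fraction(1)))   # 1/1000000
```

Under these assumptions, packing a bomb leaves the risk from other passengers unchanged. The same caution applies to any product of small probabilities: the multiplication is justified only when the events are independent and when the product answers the question actually being asked.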
AP Statistics Was the Most Boring Course I Took in High School
As a statistics professor, I hear this or a close variation almost every time I tell someone I teach statistics. And for nearly every one of you (i.e., undergraduates) I have spoken with, the number one reason you did not consider majoring in statistics is that the AP Statistics course you took convinced you that statistics is the most boring subject. We statisticians, of course, are to be blamed for this unfortunate situation. Statistics is an urgently demanded, but vastly under-appreciated, field: urgently demanded for reasons discussed above, and vastly under-appreciated because too few statisticians, relatively speaking, have effectively conveyed the excitement of statistics as a way of scientific thinking for whatever you do, rather than as a collection of tools you may or may not need one day.
At Harvard, we are fortunate to have several first-class statistical educators teaching introductory statistics courses. For example, my colleague Ken Stanley, who teaches Introduction to Quantitative Methods for Economics, has been so effective that one student wrote in his/her evaluation, “It is like taking a course in Christianity and Jesus himself is teaching.” (If you can come up with more impressive praise than that, email me at firstname.lastname@example.org.) Another colleague, Joe Blitzstein, doubled the enrollment of Introduction to Probability in just three years. He is now an international sensation, so to speak. A student was telling her friend in Germany that she was taking this cool stat course with Joe and her friend responded, “Oh, you mean that YouTube stat professor?” (You can satisfy your curiosity by googling “Stat 110 at Harvard.”)
Last year, we also launched Real Life Statistics: Your Chance for Happiness (or Misery). This course was designed by my “happy team”—which consisted of eight master’s and PhD students from the statistics department—over two years and many happy dinners (not happy meals) at the best restaurants Boston offers. The course aims to introduce students to the wonderland of statistics by showcasing how it is used (and misused) in real-life situations every student should be able to relate to, either happily or miserably.
Unlike many traditional statistics courses that arrange the material by statistical topic in order of complexity, Real Life Statistics arranges the material by “real-life modules.” Last year, the five modules were finance (e.g., stock market), romance (e.g., online dating models), medical sciences (e.g., Viagra trial), law (e.g., O. J. Simpson trial), and wine and chocolate tasting (depending on your age). This semester, we are replacing the law module with an election module, given the historic election we all just witnessed (and now that O. J. is behind bars).
All these efforts are aimed at making statistics not just palatable, but delicious to all of you, who, I am 98% sure (that is the highest assurance any professional statistician would give), will need statistics in your own research, regardless of the subject, and in your life. Our happiness or misery often literally depends on our understanding of statistics. Statistics, or quantitative evidence, is being used in the media, scientific publications, and elsewhere to persuade us to buy products, arguments, theories, etc. Some of the claims are statistically and scientifically sound, but many are not. A good percentage of them are deliberate lies, intended to deceive the public in order to make a profit.