## June *JASA* Gives Advice on Statistical Methods for Observational Studies

**Book Reviews**

*Chaos and Coarse Graining in Statistical Mechanics*

Patrizia Castiglione, Massimo Falcioni, Annick Lesne, and Angelo Vulpiani

*A First Course in Bayesian Statistical Methods*

Peter D. Hoff

*Longitudinal Data Analysis*

Garrett Fitzmaurice, Marie Davidian, Geert Verbeke, and Geert Molenberghs (Eds.)

*Markov Processes and Applications: Algorithms, Networks, Genome, and Finance*

Etienne Pardoux

*Meta-Analysis of Binary Data Using Profile Likelihood*

Dankmar Böhning, Ronny Kuhnert, and Sasivimol Rattanasiri

*Monte Carlo and Quasi-Monte Carlo Sampling*

Christiane Lemieux

*Random Effect and Latent Variable Model Selection*

David B. Dunson (Ed.)

*The Science of Bradley Efron: Selected Papers*

Carl N. Morris and Robert Tibshirani (Eds.)

*The EM Algorithm and Extensions* (2nd ed.)

Geoffrey J. McLachlan and Thriyambakam Krishnan

*Response Surface Methodology: Process and Product Optimization Using Designed Experiments* (3rd ed.)

Raymond H. Myers, Douglas C. Montgomery, and Christine M. Anderson-Cook

*Statistical Methods for Categorical Data Analysis* (2nd ed.)

Daniel A. Powers and Yu Xie

A recent article in *Science News* (“Odds Are, It’s Wrong”) implicates the field of statistics in the rash of scientific results that fail to hold up under scrutiny. The ASA and International Statistical Institute responded with a letter to the editor pointing out that the misuse of statistical methods is the culprit.

Observational data are particularly susceptible to misinterpretation and frequently result in findings that time reveals to have been false positives. Thus, **Paul Rosenbaum’s** paper in this issue’s Theory and Methods (T&M) section, “Design Sensitivity and Efficiency in Observational Studies,” is particularly timely.

Rosenbaum notes that an observational study draws inferences about the effects caused by a treatment when subjects are not randomly assigned to treatment or control, as they would be in a randomized trial. After adjusting for imbalances in measured covariates, the key source of uncertainty in an observational study is the possibility that subjects were not comparable prior to treatment in terms of some unmeasured covariate. If an unmeasured covariate differs between treatment groups, then differing outcomes in treated and control groups are not necessarily due to the treatment. A sensitivity analysis sheds light on the magnitude of the departure from random assignment needed to alter the qualitative conclusions of the study. Two quantitative measures, the power of a sensitivity analysis and the design sensitivity, anticipate the outcome of a sensitivity analysis under an assumed model for treatment effect.
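To make the idea of a sensitivity analysis concrete, here is a minimal Python sketch for one simple case that is not taken from Rosenbaum’s paper: a one-sided sign test on matched pairs. It assumes the standard sensitivity model in which hidden bias can change the odds of treatment within a pair by at most a factor Γ (`gamma` below), so that under the null hypothesis the chance that the treated subject has the higher outcome is at most Γ/(1 + Γ). The function names and the pair counts are hypothetical.

```python
from math import comb


def sign_test_upper_pvalue(n_pairs, n_treated_higher, gamma):
    """Largest one-sided sign-test p-value compatible with hidden bias <= gamma.

    n_pairs counts matched pairs without ties; n_treated_higher counts the
    pairs in which the treated subject had the higher outcome.
    """
    if gamma < 1.0:
        raise ValueError("gamma must be at least 1")
    # Worst case under the null: the treated subject "wins" with this probability.
    p_max = gamma / (1.0 + gamma)
    # Upper tail of a Binomial(n_pairs, p_max) distribution.
    return sum(
        comb(n_pairs, k) * p_max**k * (1.0 - p_max) ** (n_pairs - k)
        for k in range(n_treated_higher, n_pairs + 1)
    )


def largest_gamma_still_significant(n_pairs, n_treated_higher, alpha=0.05,
                                    step=0.01, max_gamma=50.0):
    """Scan gamma upward from 1 and return the last value whose worst-case
    p-value is still <= alpha (None if even gamma = 1 is not significant)."""
    last_ok = None
    i = 0
    while 1.0 + i * step <= max_gamma:
        gamma = 1.0 + i * step
        if sign_test_upper_pvalue(n_pairs, n_treated_higher, gamma) > alpha:
            break
        last_ok = gamma
        i += 1
    return last_ok


# Hypothetical study: 100 matched pairs, treated subject higher in 70 of them.
for gamma in (1.0, 1.5, 2.0, 3.0):
    p = sign_test_upper_pvalue(100, 70, gamma)
    print(f"gamma = {gamma:.1f}: worst-case p-value = {p:.4g}")
print("largest gamma still significant:", largest_gamma_still_significant(100, 70))
```

Here `gamma = 1` corresponds to random assignment within pairs. The larger the value of `gamma` at which the conclusion survives, the less sensitive the finding is to an unmeasured covariate.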

In practice, statisticians often choose methods for observational studies based on what would be a powerful technique in a randomized experiment. Rosenbaum argues that this common practice is a mistake. The design sensitivity for such methods can be great, meaning that these procedures, though optimal for randomized experiments, need not be the best choice under more realistic assumptions.

Rosenbaum’s article will be available for free download for a limited time.

Another T&M article addresses DNA minicircles. What is a DNA minicircle? Why are they of interest? You can find answers to these questions by reading “Second-Order Comparison of Gaussian Random Functions and the Geometry of DNA Minicircles,” by **Victor M. Panaretos**, **David Kraus**, and **John H. Maddocks**.

The authors’ research is motivated by the problem of determining whether the mechanical properties of short strands of DNA are influenced by their base-pair sequences. Although such influences are anticipated, this phenomenon has not yet been observed in 3D electron microscopy data. It transpires that insight into the relationship between sequence structure and mechanical properties of DNA can be gained by testing whether two samples of continuous, zero-mean, iid Gaussian processes on the interval [0, 1] have the same covariance structure. In this paper, the authors show that testing whether DNA shape is determined by base-pair sequence composition involves aspects of ill-posed inverse problems, and they develop an approach based on a Karhunen–Loève approximation of the Hilbert–Schmidt distance of certain empirical covariance operators. They apply their method to a data set of DNA minicircles, and the test suggests that base-pair sequence makeup is related to DNA shape.
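The two-sample covariance question can be illustrated with a much cruder tool than the one the authors develop. The Python sketch below is not their Karhunen–Loève procedure. It assumes the curves are observed on a common grid, uses the sum of squared differences between the two empirical covariance matrices as a discretized stand-in for the squared Hilbert–Schmidt distance, and calibrates it with a plain permutation test. All function names and the simulated data are hypothetical.

```python
import random


def empirical_covariance(curves):
    """Covariance matrix of zero-mean curves sampled on a common grid."""
    n = len(curves)
    m = len(curves[0])
    return [[sum(c[i] * c[j] for c in curves) / n for j in range(m)]
            for i in range(m)]


def hs_distance_sq(cov_a, cov_b):
    """Sum of squared entrywise differences (discretized Hilbert-Schmidt distance)."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(cov_a, cov_b)
               for a, b in zip(row_a, row_b))


def permutation_pvalue(sample_x, sample_y, n_perm=500, seed=0):
    """Permutation p-value for H0: both samples share one covariance structure."""
    rng = random.Random(seed)
    observed = hs_distance_sq(empirical_covariance(sample_x),
                              empirical_covariance(sample_y))
    pooled = list(sample_x) + list(sample_y)
    n_x = len(sample_x)
    exceed = 0
    for _ in range(n_perm):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        stat = hs_distance_sq(empirical_covariance(pooled[:n_x]),
                              empirical_covariance(pooled[n_x:]))
        if stat >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Simulated example: zero-mean Gaussian random walks with different step sizes.
_rng = random.Random(1)


def random_walk(scale, n_points=5):
    value, path = 0.0, []
    for _ in range(n_points):
        value += _rng.gauss(0.0, scale)
        path.append(value)
    return path


sample_a = [random_walk(1.0) for _ in range(20)]
sample_b = [random_walk(2.0) for _ in range(20)]
print("permutation p-value:", permutation_pvalue(sample_a, sample_b))
```

A small p-value suggests the two groups of curves do not share a covariance structure. This brute-force version ignores the ill-posedness that the authors address, so it is only a way to see what is being tested.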

**Applications and Case Studies**

The Applications and Case Studies (ACS) section includes articles with statistical applications to weather forecasting, analysis of voting blocs in England, musical theory, and assessing the impact of a major anti-tobacco intervention.

The feature article, “Probabilistic Weather Forecasting for Winter Road Maintenance,” by **Veronica Berrocal**, **Adrian Raftery**, **Tilmann Gneiting**, and **Richard Steed**, demonstrates how probabilistic forecasting allows for better decision making. The authors use forecasts of temperature and precipitation on a stretch of U.S. I-90 in Washington to decide whether to apply anti-icing treatments. The weather forecasts are good, and traditional practice has been to use the mean of the forecast distribution to make the decision. The authors note, however, that a good decision rule must balance the large cost of a road closing (an error that occurs if anti-icing is not applied when it should be) against the relatively smaller cost of an inappropriate anti-icing treatment. Having a probability distribution on the forecasts allows one to optimize this decision. They find that, over a typical winter season, optimal decision making would reduce expected loss relative to a naïve strategy from $23 million to $11.5 million. And that’s just one slice of highway in one state.
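The trade-off described above is a classic cost-loss decision problem, and a toy version fits in a few lines of Python. The sketch below is not the authors’ model. It assumes that treating always prevents a closure, and the cost figures, ice probabilities, and the “more likely than not” rule standing in for a single point forecast are all hypothetical.

```python
def should_treat(prob_ice, treatment_cost, closure_loss):
    """Treat when the expected loss from not treating exceeds the treatment cost."""
    return prob_ice * closure_loss > treatment_cost


def expected_loss(prob_ice, treat, treatment_cost, closure_loss):
    """Expected loss of a decision, assuming treatment fully prevents closure."""
    return treatment_cost if treat else prob_ice * closure_loss


# Hypothetical costs in arbitrary units: a closure is ten times as costly
# as one anti-icing treatment.
TREATMENT_COST = 1.0
CLOSURE_LOSS = 10.0

for prob_ice in (0.05, 0.20, 0.45, 0.80):
    # Stand-in for acting on a single best-guess forecast.
    point_rule = prob_ice > 0.5
    prob_rule = should_treat(prob_ice, TREATMENT_COST, CLOSURE_LOSS)
    print(
        f"P(ice) = {prob_ice:.2f}: "
        f"point rule loss = {expected_loss(prob_ice, point_rule, TREATMENT_COST, CLOSURE_LOSS):.2f}, "
        f"probabilistic rule loss = {expected_loss(prob_ice, prob_rule, TREATMENT_COST, CLOSURE_LOSS):.2f}"
    )
```

Because a closure costs far more than a treatment, the probabilistic rule treats even when ice is fairly unlikely, which is the kind of decision a single best-guess forecast cannot support.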

This article is also available for free download for a limited time.

Another ACS article of interest is “Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program,” by **Alberto Abadie**, **Alexis Diamond**, and **Jens Hainmueller**. The program referenced in the title was a voter initiative, Proposition 99, passed in California in 1988. It called for a tax increase on cigarettes and required the revenue from the tax to be used in consumer education programs and anti-smoking advertisements.

Per capita cigarette consumption dropped following implementation of the program. However, consumption was already dropping before 1988, and it also dropped in other states before and after 1988. Because of this, it is not obvious what the impact of the proposition was.

To assess the impact, the authors created a “synthetic control” to which the post-1988 California experience can be compared. The synthetic control is a weighted average of other states, with the weights chosen so the composite matches California’s per capita cigarette consumption in each year up to 1988 and also matches California on a range of other pre-1988 characteristics. By comparing the synthetic California’s post-1988 outcomes to the real California’s post-1988 outcomes, the authors concluded that the tobacco control program led to a drop of 25 packs per person relative to what would have been observed without the program.
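The weighting idea can be sketched in Python under strong simplifications. The version below matches only pre-period outcomes, not the other characteristics the authors also use. It constrains the weights to be nonnegative and sum to one, and finds them with a simple pairwise weight-exchange scheme chosen here for brevity, which is not the authors’ estimation procedure. The function name and every number are hypothetical.

```python
def synthetic_control_weights(treated_pre, donors_pre, max_iter=10_000, tol=1e-12):
    """Approximate the nonnegative, sum-to-one donor weights that minimize the
    squared pre-period gap between the treated unit and the weighted donors."""
    n_donors = len(donors_pre)
    weights = [1.0 / n_donors] * n_donors
    for _ in range(max_iter):
        fitted = [sum(w * d[t] for w, d in zip(weights, donors_pre))
                  for t in range(len(treated_pre))]
        resid = [y - f for y, f in zip(treated_pre, fitted)]
        # A donor with a high score is one whose extra weight would shrink the gap.
        scores = [sum(x * r for x, r in zip(d, resid)) for d in donors_pre]
        best = max(range(n_donors), key=lambda j: scores[j])
        active = [j for j in range(n_donors) if weights[j] > 0.0]
        worst = min(active, key=lambda j: scores[j])
        diff = [b - a for b, a in zip(donors_pre[best], donors_pre[worst])]
        gain = sum(x * r for x, r in zip(diff, resid))
        norm_sq = sum(x * x for x in diff)
        if gain <= tol or norm_sq == 0.0:
            break  # no exchange of weight improves the fit
        # Exact line search, capped so the donating weight never goes negative.
        step = min(gain / norm_sq, weights[worst])
        weights[best] += step
        weights[worst] -= step
    return weights


# Hypothetical per capita consumption (packs) for four pre-program years.
treated_pre = [128.0, 126.0, 123.0, 120.0]
donors_pre = [
    [130.0, 129.0, 127.0, 125.0],
    [125.0, 122.0, 118.0, 114.0],
    [140.0, 138.0, 137.0, 135.0],
]
# Hypothetical values for two post-program years.
treated_post = [110.0, 100.0]
donors_post = [[123.0, 121.0], [110.0, 106.0], [133.0, 131.0]]

weights = synthetic_control_weights(treated_pre, donors_pre)
synthetic_post = [sum(w * d[t] for w, d in zip(weights, donors_post))
                  for t in range(len(treated_post))]
gaps = [actual - synth for actual, synth in zip(treated_post, synthetic_post)]
print("weights:", [round(w, 3) for w in weights])
print("post-period gaps (actual minus synthetic):", [round(g, 2) for g in gaps])
```

The post-period gap between the actual series and its synthetic counterpart plays the role of the estimated program effect in this toy setting.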

There are many other interesting articles in the June issue of *JASA*, not to mention the usual array of informative book reviews.

Stan Young said: Over the last 20-30 years wonderful statistical technology has been published and made available in commercial software: resampling-based adjusted p-values to deal with multiple testing and propensity score methods to deal with bias, to name just two things. Most of these statistical technology advances are not used by people analyzing observational studies. Even simple strategies like using a training set and a hold-out set are not used. Data from observational studies is not generally available. We may need to examine the process to see what might be done to make claims coming from observational studies more reproducible.