New Approach to Analysis of Cancer Clinical Trials in June JASA
The June 2012 issue of the Journal of the American Statistical Association features an article describing novel strategies for the analysis of sequentially randomized clinical trials in the Applications and Case Studies (A&CS) section, along with comments from experts in that research area. Other A&CS articles include contributions in ecology, economics, and brain imaging. The Theory and Methods section contains important contributions on model selection in high-dimensional problems and the theory behind portfolio selection in finance, as well as many other contributions.
Applications and Case Studies
Cancer therapy is usually conducted in stages. A physician chooses an initial treatment based on the patient’s disease severity and perhaps other patient characteristics. That treatment is continued if successful, but may be altered if it does not lead to a favorable response or if it has intolerable side effects. Such multistage strategies are known as dynamic treatment regimes, or adaptive treatment strategies. In “Evaluation of Viable Dynamic Treatment Regimes in a Sequentially Randomized Trial of Advanced Prostate Cancer,” authors Lu Wang, Andrea Rotnitzky, Xihong Lin, Randall Millikan, and Peter Thall present a new statistical analysis of data from a two-stage clinical trial of advanced prostate cancer treatments. The initial analysis of the data in 2007 identified the best treatments in stage one and in stage two separately, but did not directly address the question of the best two-stage dynamic treatment regimes. The reanalysis carefully defines the full set of viable treatment regimes, constructs a new compound endpoint incorporating both treatment effectiveness and treatment toxicity, and accounts for the patients who did not comply with their assigned viable treatment regime.
Estimation is carried out using an inverse-probability-of-censoring weighted estimator. Two comments provide additional insight into the analysis of sequentially randomized trials. Daniel Almirall, Daniel Lizotte, and Susan Murphy contribute important points regarding the design of multistage trials and alternative approaches to accommodating multiple outcomes in the data analysis. Paul Chaffee and Mark van der Laan describe an alternative inference approach for such studies: targeted minimum-loss-based estimation. The article and comments shed light on an important class of trials for evaluating cancer treatments.
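The idea behind inverse-probability weighting can be seen in a minimal sketch. This is not the authors’ estimator (their setting involves sequential regimes and a compound endpoint); it is a simplified stand-in with made-up data, in which patients who stay on the regime of interest are upweighted by the inverse of their probability of doing so:

```python
# A minimal sketch of inverse-probability weighting, not the authors'
# estimator: patients who deviate from the regime are treated as censored,
# and patients who stay on-regime are upweighted by the inverse of their
# (here, known; in practice, estimated) probability of doing so.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data: a baseline covariate, adherence, and an outcome.
x = rng.normal(size=n)                          # baseline severity score
p_adhere = 1 / (1 + np.exp(-(0.5 + 0.8 * x)))   # adherence probability
adhere = rng.random(n) < p_adhere               # True = followed the regime
y = 2.0 + 1.0 * x + rng.normal(size=n)          # outcome (depends on x)

# Naive estimate: average the outcome over adherent patients only.
naive = y[adhere].mean()

# Weighted estimate: weight each adherent patient by 1 / P(adhere | x).
w = 1.0 / p_adhere[adhere]
ipcw = np.sum(w * y[adhere]) / np.sum(w)

# The target is E[y] = 2.0; because adherence depends on x, the naive
# average is biased upward while the weighted average is approximately unbiased.
print(naive, ipcw)
```

In practice the adherence (non-censoring) probabilities are unknown and must themselves be modeled, for example by logistic regression.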
Mid-study changes of a different type are considered in a second JASA A&CS paper. Lane Burgette and Jerome Reiter consider the case in which the technique used to measure important study variables changes in the middle of the study. The Healthy Pregnancy Healthy Baby Study is an observational study focused on identifying causes of adverse birth outcomes. Blood samples from expectant mothers were sent to a laboratory to measure exposure to several pollutants, including lead. Midway through the study, the investigators changed from one lab to another because the second could provide more finely resolved exposure measurements. When the investigators began to analyze the data, they noticed that the distributions of measurements from the two labs differed by more than chance could explain. Unfortunately, no samples were analyzed at both labs, which makes calibrating the two sets of measurements challenging.
In “Nonparametric Bayesian Multiple Imputation for Missing Data Due to Mid-Study Switching of Measurement Methods,” Burgette and Reiter develop methods for imputing later-protocol measurements for samples measured only under the earlier protocol, treating the later measurements as “missing” data. Their imputation approach is based on the reasonable assumption that the two labs rank samples accurately, although they score them on different scales. This rank preservation assumption (i.e., the same ranking would be obtained in each lab) leads to three imputation strategies. The strategies work well in simulation and are then used to estimate quantile regressions relating the distribution of infant birth weights to the level of maternal exposure to environmental contaminants.
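A crude version of the rank preservation idea is quantile matching: map each early-protocol measurement to the later-protocol value occupying the same empirical quantile. The sketch below is a simplified stand-in for Burgette and Reiter’s nonparametric Bayesian approach, using hypothetical exposure data:

```python
# A minimal quantile-matching sketch of rank-preserving imputation, a
# simplified stand-in for the paper's nonparametric Bayesian method.
# The data below are hypothetical: lab 2 reports the same underlying
# exposures on a shifted, rescaled measurement scale.
import numpy as np

rng = np.random.default_rng(1)

true_exposure = rng.lognormal(mean=0.0, sigma=0.5, size=2_000)
lab1 = true_exposure[:1_000]                  # early samples, lab-1 scale
lab2 = 1.5 * true_exposure[1_000:] + 0.3      # later samples, lab-2 scale

# Impute a lab-2 value for each lab-1 sample via its empirical quantile.
ranks = np.argsort(np.argsort(lab1))          # 0-based ranks within lab 1
quantiles = (ranks + 0.5) / len(lab1)
imputed = np.quantile(lab2, quantiles)

# Rank preservation: the imputed values keep the lab-1 ordering exactly.
assert (np.argsort(imputed) == np.argsort(lab1)).all()
```

The full method goes further, propagating the uncertainty in this mapping through multiple imputation rather than producing a single deterministic fill-in.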
Theory and Methods
Model selection in linear regression is a classical problem with an extensive literature, and it has become increasingly important in modern high-dimensional settings in which there are many predictor variables from which to choose. Many Bayesian and frequentist approaches have been proposed, but the standard prior assumptions incorporated in Bayesian model selection procedures have left them uncompetitive with commonly used penalized likelihood methods.
In “Bayesian Model Selection in High-Dimensional Settings,” Valen Johnson and David Rossell improve on these Bayesian methods by placing nonlocal prior densities on model parameters. The authors show that model selection procedures based on nonlocal prior densities reliably find the correct model in linear model settings when the number of possible covariates is bounded by the number of observations. The key to the improved performance is that Bayesian approaches based on standard local prior densities (which remain strictly positive at the null value zero) continue to assign appreciable posterior probability to incorrect models that include coefficients whose true values are zero; nonlocal prior densities, which vanish at zero, avoid this behavior. In addition to consistently identifying the true model, the proposed procedures provide accurate estimates of the posterior probability that each identified model is correct. Simulation studies demonstrate that the new model selection procedures perform as well as or better than commonly used penalized likelihood methods in a range of settings.
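The local/nonlocal distinction is easy to see numerically. The sketch below contrasts a standard normal (local) prior with a first-order product-moment (pMOM) nonlocal prior; the scale parameter and the comparison itself are illustrative, not the paper’s settings:

```python
# A minimal sketch contrasting a local prior with a nonlocal (pMOM) prior.
# The scale tau = 1 is an illustrative choice, not the paper's.
import numpy as np

def local_prior(theta, tau=1.0):
    """Standard local prior: a normal density, strictly positive at theta = 0."""
    return np.exp(-theta**2 / (2 * tau)) / np.sqrt(2 * np.pi * tau)

def pmom_prior(theta, tau=1.0):
    """First-order product-moment (pMOM) nonlocal prior: vanishes at theta = 0.

    Multiplying the normal density by theta^2 / tau keeps it integrating
    to one (since E[theta^2] = tau under the normal) but forces zero
    density at the null value.
    """
    return (theta**2 / tau) * local_prior(theta, tau)

# The local prior puts appreciable density at the null value 0, so evidence
# against models that wrongly include a zero coefficient accrues slowly;
# the nonlocal prior is exactly zero there.
print(local_prior(0.0))   # about 0.399
print(pmom_prior(0.0))    # exactly 0.0
```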
A different type of “selection” problem is the focus of another Theory and Methods paper, “Vast Portfolio Selection with Gross-Exposure Constraints,” by Jianqing Fan, Jingjin Zhang, and Ke Yu. Portfolio selection and optimization have been fundamental problems in finance ever since the development of Markowitz’s portfolio theory, which proposes choosing a portfolio (the amount of each asset to buy or sell) that maximizes expected return subject to a constraint on the variability of returns. Though Markowitz’s proposal was a major theoretical breakthrough, applying it is difficult in practice because the optimal Markowitz portfolio is sensitive to errors in estimating or forecasting the expected returns and the covariance matrix of the returns. These problems are especially acute when there are a large number of assets from which to choose.
Fan, Zhang, and Yu extend the usual Markowitz formulation by constraining the portfolio’s gross exposure: the total amount of investment at risk, counting both money used to buy assets and money at risk in short sales. This “gross-exposure constraint” allows the authors to investigate the circumstances under which constrained portfolio selection has good properties relative to the unconstrained solution. They show that, for a range of values of the constraint parameter, the constrained approach leads to portfolios whose actual risk is smaller than that of the empirically obtained global optimum, while at the same time allowing the risk to be estimated more accurately.
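For portfolio weights w, the constrained problem can be written as minimizing w′Σw subject to Σᵢwᵢ = 1 and Σᵢ|wᵢ| ≤ c, where c is the gross-exposure bound (c = 1 forbids short sales). The sketch below solves a three-asset toy instance; the solver choice and the covariance matrix are illustrative assumptions, not the paper’s algorithm:

```python
# A minimal sketch of minimum-variance portfolio selection under a
# gross-exposure constraint, in the spirit of the paper; the SLSQP solver
# and the toy covariance matrix are illustrative assumptions.
# Writing w = u - v with u, v >= 0 makes the L1 constraint linear.
import numpy as np
from scipy.optimize import minimize

def min_variance(sigma, c):
    """Minimum-variance weights subject to sum(w) = 1 and sum(|w|) <= c."""
    p = sigma.shape[0]

    def risk(z):
        w = z[:p] - z[p:]
        return w @ sigma @ w

    cons = [
        # Weights sum to one (fully invested).
        {"type": "eq", "fun": lambda z: z[:p].sum() - z[p:].sum() - 1.0},
        # Gross exposure sum(u + v) = sum(|w|) at the optimum is at most c.
        {"type": "ineq", "fun": lambda z: c - z.sum()},
    ]
    z0 = np.concatenate([np.full(p, 1.0 / p), np.zeros(p)])  # equal weights
    res = minimize(risk, z0, bounds=[(0.0, None)] * (2 * p), constraints=cons)
    w = res.x[:p] - res.x[p:]
    return w, w @ sigma @ w

# Toy covariance: assets 1 and 2 are highly correlated with unequal
# variances, which tempts the optimizer into offsetting long/short bets.
sigma = np.array([[0.040, 0.055, 0.010],
                  [0.055, 0.100, 0.010],
                  [0.010, 0.010, 0.090]])

w_tight, risk_tight = min_variance(sigma, c=1.0)   # no short sales
w_loose, risk_loose = min_variance(sigma, c=3.0)   # shorting allowed

# Relaxing c can only lower the in-sample risk; the paper's point is that
# a moderate c often wins out of sample, where sigma must be estimated.
print(risk_tight, risk_loose)
```

Here the looser bound gives a lower in-sample variance by shorting asset 2; with an estimated covariance matrix, that extra leverage is exactly what amplifies estimation error, which is what the gross-exposure constraint controls.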
There are many other informative articles in both sections of the June issue, as well as a set of book reviews.