Special Technometrics Issue: Conference on Data Analysis (CoDA)
Exploring data-focused research across the Department of Energy
Kary Myers, Earl Lawrence, and Hugh A. Chipman
The November 2013 issue of Technometrics is a special issue featuring some of the challenging, application-driven work presented at the inaugural Conference on Data Analysis (CoDA) in 2012 in Santa Fe, New Mexico. CoDA 2012 brought together statisticians and other data-focused scientists from across the Department of Energy national laboratories and their academic and industrial collaborators. In addition to 28 invited talks covering varied topics such as renewable energy, dark energy, and high energy physics, CoDA also highlighted the data-intensive, collaborative work of 65 poster presenters, including 20 graduate students. CoDA 2012 had 130 participants from nine national labs, 25 universities, and seven private and public companies. CoDA sponsors included Technometrics’ two sponsoring societies (the American Statistical Association’s Section on Physical and Engineering Sciences and the American Society for Quality).
Technometrics provides the perfect forum for these CoDA papers, given the journal’s long-standing connection with the applied statistics research at Department of Energy labs. Since its inception in 1959, Technometrics has published hundreds of articles by statisticians and scientists at the national laboratories, including McKay, Beckman, and Conover’s 1979 paper introducing Latin hypercube designs and the 1979 paper by Golub, Heath, and Wahba introducing generalized cross-validation. Papers written by researchers at Department of Energy labs have won both the Jack Youden Prize (Best Expository Paper) and the Frank Wilcoxon Prize (Best Practical Application Paper) on multiple occasions, and two Technometrics editors (Robert Easterling and Max Morris) have worked at national labs.
CoDA Guest Editorial Board
Christine Anderson-Cook, Los Alamos National Laboratory
David Banks, Duke University
Derek Bingham, Simon Fraser University
David Higdon, Los Alamos National Laboratory
Gardar Johannesson, Lawrence Livermore National Laboratory
V. Roshan Joseph, Georgia Institute of Technology
Earl Lawrence, Los Alamos National Laboratory
Jason Loeppky, University of British Columbia
Max Morris, Iowa State University
Kary Myers, Los Alamos National Laboratory
George Ostrouchov, Oak Ridge National Laboratory
Shane Reese, Brigham Young University
Curtis Storlie, Los Alamos National Laboratory
Matt Taddy, The University of Chicago Booth School of Business
Alyson Wilson, North Carolina State University
The articles in this special CoDA issue continue this tradition by highlighting the intent of Technometrics “to contribute to the development and use of statistical methods in physical, chemical, and engineering sciences as well as information sciences and technology.”
The issue begins with several papers on a variety of networks: The article “Bayesian Nonparametric Models for Community Detection” by Alyson Wilson, Jiqiang Guo, and Daniel Nordman considers networks defined by social or other interactions, while “Scan Statistics for the Online Detection of Locally Anomalous Subgraphs” by Joshua Neil, Curtis Storlie, Curtis Hash, Alexander Brugh, and Michael Fisk considers detection of small anomalies in a large, time-dependent computer network. Social media networks are considered by Matt Taddy in “Measuring Political Sentiment on Twitter: Factor Optimal Design for Multinomial Inverse Regression,” while power distribution networks are considered by Earl Lawrence, Scott Vander Wiel, and Russell Bent in “Model Bank State Estimation for Power Grids Using Importance Sampling.”
These papers are followed by two articles about the analysis of functional data arising from interesting applications. In “Methods for Characterizing and Comparing Populations of Shock Wave Curves,” Curtis Storlie, Michael Fugate, David Higdon, Aparna Huzurbazar, Elizabeth Francois, and Douglas McHugh consider evaluation of detonators and high-energy explosives. In “A Bayesian Measurement Error Model for Misaligned Radiographic Data,” Kristin Lennox and Lee Glascoe consider micro-computed tomography data.
At the national labs, reliability is an important area of research, as the next two papers indicate: “Bayesian Methods for Estimating the Reliability of Complex Systems Using Heterogeneous Multilevel Information” by Jiqiang Guo and Alyson Wilson and “A Case Study on Selecting a Best Allocation of New Data for Improving the Estimation Precision of System and Sub-System Reliability Using Pareto Fronts” by Christine Anderson-Cook, Lu Lu, and Jessica Chapman.
Computer experiments and high-performance computing are recurring themes at the national labs. In “Computer Model Calibration Using the Ensemble Kalman Filter,” David Higdon, Matthew Pratola, James Gattiker, Earl Lawrence, Charles Jackson, Michael Tobis, Salman Habib, Katrin Heitmann, and Stephen Price adapt the ensemble Kalman filter to Bayesian computer model calibration. Computer model calibration is also the topic of “Prediction and Computer Model Calibration Using Outputs from Multi-fidelity Simulators” by Derek Bingham, Joslin Goh, James Holloway, Michael Grosskopf, Carolyn Kuranz, and Erica Rutter. In that article, field data are combined with outputs from multi-fidelity simulators, with an application in radiative shock hydrodynamics. In “A Parallel EM Algorithm for Model-Based Clustering Applied to the Exploration of Large Spatio-Temporal Data,” Wei-Chen Chen, George Ostrouchov, David Pugmire, Prabhat, and Michael Wehner develop parallel computation for EM estimation of mixture distribution models.
The issue concludes with three papers considering different and interesting data formats: rotational data in “Point Estimation of the Central Orientation of Random Rotations” by Bryan Stanfill, Ulrike Genschel, and Heike Hofmann; pairs of hyperspectral satellite images before and after release of a chemical plume in “Matched-Pair Machine Learning” by James Theiler; and forensic analysis of profilometry data in “Significance of Angle in the Statistical Comparison of Forensic Tool Marks” by Amy Lock and Max Morris.
The clear theme of this issue is hard applications and collaborative research: most papers were written jointly by statisticians and scientists tackling challenging, real-world problems.
As this special issue of Technometrics goes to press, plans are under way for CoDA 2014, to be held March 5–7. The invited program will explore data-focused research across the Department of Energy, featuring sessions on energy and the environment, signature discovery, data-intensive applied science, uncertainty quantification, national security, and Big Data and exascale computing. See the CoDA website for more information, and consider joining us to present your work in Santa Fe. In the meantime, we thank the members of the CoDA guest editorial board whose hard work made this special issue possible, and we hope you enjoy this Technometrics collection of collaborative applied statistical work.