Design and Analysis of Computer Experiments Featured in February Issue
Hugh A. Chipman, Technometrics Editor
With increased use of large and complex computer models and growing interest in the field of uncertainty quantification, the design and analysis of computer experiments is seeing considerable research activity. This issue reflects this activity, with five articles and one note concerning recent extensions, generalizations, and applications of computer experiments.
Gaussian process models have become a popular statistical technique for emulating expensive computer experiments and can be used to optimize a computer model efficiently. When the computer model is subject to noise, as with Monte Carlo simulators, the design and analysis of computer experiments takes a few twists. In the lead article, “Optimization of Noisy Computer Experiments with Tunable Precision,” Victor Picheny, David Ginsbourger, Yann Richet, and Gregory Caplin address kriging-based optimization of stochastic simulators. Many of these simulators depend on factors that tune the precision of the response, with gains in accuracy coming at the price of computational time.
The contribution of this work is twofold. First, it proposes a quantile-based criterion for the sequential design of experiments, in the fashion of the classic expected improvement criterion, enabling an elegant treatment of heterogeneous response precisions. Second, it augments the sequential design strategy with a procedure for allocating the computational time given to each measurement, allowing a better distribution of the computational effort and increased efficiency. The optimization method is applied to an original application in nuclear criticality safety. This article features discussion by Alexander I. J. Forrester, Robert B. Gramacy, Jack P. C. Kleijnen, Peter Z. G. Qian, Pritam Ranjan, Rui Tuo, and C. F. Jeff Wu, as well as a rejoinder by the authors.
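To fix ideas, the following is a minimal sketch of the classic expected improvement (EI) loop that the article generalizes, applied to a hypothetical noisy one-dimensional simulator. It is not the authors' quantile-based criterion: here the "best value so far" is simply the noisy observed minimum, which is exactly the quantity that becomes unreliable under noise and motivates their approach. The kernel, lengthscale, noise level, and test function are all illustrative assumptions.

```python
import numpy as np
from math import erf, sqrt

def rbf(a, b, ls=0.3):
    # squared-exponential kernel on 1-D inputs (assumed lengthscale 0.3)
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(x_tr, y_tr, x_te, noise_var=0.0025):
    # standard GP regression equations with homogeneous noise variance
    K = rbf(x_tr, x_tr) + noise_var * np.eye(len(x_tr))
    Ks = rbf(x_te, x_tr)
    mu = Ks @ np.linalg.solve(K, y_tr)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    # classic EI for minimization: E[max(best - Y, 0)]
    s = np.sqrt(var)
    z = (best - mu) / s
    Phi = np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z])
    phi = np.exp(-0.5 * z ** 2) / sqrt(2.0 * np.pi)
    return s * (z * Phi + phi)

rng = np.random.default_rng(0)
f = lambda x: (x - 0.6) ** 2                       # hypothetical simulator, minimum at 0.6
noisy = lambda x: f(x) + 0.05 * rng.standard_normal(x.shape)

x_tr = np.array([0.05, 0.35, 0.65, 0.95])          # small initial design
y_tr = noisy(x_tr)
grid = np.linspace(0.0, 1.0, 201)

for _ in range(15):                                # sequential design loop
    mu, var = gp_posterior(x_tr, y_tr, grid)
    ei = expected_improvement(mu, var, y_tr.min()) # noisy "best" -- the weak point
    x_new = grid[np.argmax(ei)]
    x_tr = np.append(x_tr, x_new)
    y_tr = np.append(y_tr, noisy(np.array([x_new]))[0])

mu, _ = gp_posterior(x_tr, y_tr, grid)
x_best = grid[np.argmin(mu)]                       # recommend the posterior-mean minimizer
```

Recommending the posterior-mean minimizer, rather than the noisy best observation, is one common safeguard; the article's quantile-based criterion and precision allocation go considerably further.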
Various aspects of computer experiments are explored in several other papers in this issue. In “Sequential Design and Analysis of High-Accuracy and Low-Accuracy Computer Codes,” Shifeng Xiong, Peter Z. G. Qian, and C. F. Jeff Wu develop a methodology for multiple deterministic computer codes with different levels of accuracy. Using evaluations of the two models at a pair of nested Latin hypercube designs, an initial prediction model is estimated. Depending on the accuracy of the fitted model, the two codes are evaluated iteratively with input values chosen in an elaborate fashion so their expanded scenario sets still form a pair of nested Latin hypercube designs. The nested relationship between the two scenario sets makes it easier to model and calibrate the difference between the two sources, resulting in more accurate emulation.
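As a rough illustration of the key design ingredient, the sketch below builds a pair of nested Latin hypercube designs: a large design whose first few rows themselves form a smaller Latin hypercube, so the low-accuracy code can be run at all points and the high-accuracy code at the nested subset. This is a simple randomized construction in the spirit of nested designs, not the authors' specific sequential scheme; the sizes and dimension are arbitrary.

```python
import numpy as np

def nested_lhd(n_small, n_large, dim, rng):
    # Build an n_large-point Latin hypercube in [0,1)^dim whose first
    # n_small rows also form an n_small-point Latin hypercube.
    # Requires n_large to be a multiple of n_small.
    assert n_large % n_small == 0
    m = n_large // n_small
    X = np.empty((n_large, dim))
    for d in range(dim):
        # one fine level drawn from each coarse stratum -> small design
        small_levels = np.array([rng.integers(i * m, (i + 1) * m)
                                 for i in range(n_small)])
        rng.shuffle(small_levels)                  # randomize pairing across dims
        rest = np.setdiff1d(np.arange(n_large), small_levels)
        rng.shuffle(rest)
        levels = np.concatenate([small_levels, rest])
        # jitter within each fine cell to spread points over [0, 1)
        X[:, d] = (levels + rng.random(n_large)) / n_large
    return X

rng = np.random.default_rng(1)
X = nested_lhd(5, 20, 3, rng)   # X[:5] is the nested high-accuracy design
```

Because each coarse stratum contributes exactly one point to the subset, both designs achieve one-dimensional uniformity at their own resolution, which is what makes the difference between the two codes easier to model.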
In computer experiments in which simulators produce multivariate output, the common practice of specifying a Gaussian process with a separable covariance structure can lead to poor performance of the emulator, particularly when the simulator outputs represent different physical quantities. In “Multivariate Gaussian Process Emulators with Nonseparable Covariance Structures,” Thomas E. Fricker, Jeremy E. Oakley, and Nathan M. Urban develop nonseparable covariance structures based on the linear model of coregionalization and convolution methods. Using two case studies, they find that only emulators with nonseparable covariance structures have sufficient flexibility both to give good predictions and to represent joint uncertainty about the simulator outputs appropriately.
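A small sketch may help convey what the linear model of coregionalization (LMC) buys: the joint covariance is a sum of coregionalization matrices times latent-process correlation functions, and giving each latent process its own lengthscale breaks separability. The kernel choice, coefficient matrix, and lengthscales below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def rbf(x, ls):
    # squared-exponential correlation on 1-D inputs
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def lmc_covariance(x, A, lengthscales):
    # Linear model of coregionalization: C = sum_j (a_j a_j^T) (x) k_j,
    # where (x) is the Kronecker product. Distinct lengthscales per latent
    # process make the joint covariance over the q outputs nonseparable.
    q, J = A.shape                              # q outputs, J latent processes
    n = len(x)
    C = np.zeros((q * n, q * n))
    for j in range(J):
        Bj = np.outer(A[:, j], A[:, j])         # rank-1 coregionalization matrix
        C += np.kron(Bj, rbf(x, lengthscales[j]))
    return C

x = np.linspace(0.0, 1.0, 10)
A = np.array([[1.0, 0.5],                       # hypothetical mixing coefficients
              [0.3, 1.2]])
C = lmc_covariance(x, A, lengthscales=[0.1, 0.5])
```

A separable model would force every output, and every cross-covariance, to share a single input correlation function; the sum over latent processes above relaxes exactly that restriction while keeping the covariance positive semi-definite by construction.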
In the paper, “Gaussian Process Modeling of Derivative Curves,” Tracy Holsclaw, Bruno Sansó, Herbert K. H. Lee, Katrin Heitmann, Salman Habib, David Higdon, and Ujjaini Alam develop a Gaussian process-based inverse method that allows for the direct estimation of the derivative of a one-dimensional curve. The resultant fit is computationally efficient and more accurate than fitting a curve to the data and then differentiating. An important cosmological application is used to demonstrate the method.
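The basic mechanism behind such approaches can be sketched briefly: because differentiation is linear, the derivative of a GP posterior mean is obtained by differentiating the kernel, rather than by fitting a curve and differencing it. The sketch below illustrates that general idea only; it is not the authors' inverse method, and the kernel, lengthscale, and nugget are assumed values.

```python
import numpy as np

def gp_mean_and_derivative(x_tr, y_tr, x_te, ls=1.0, noise_var=1e-6):
    # Posterior mean of a GP with an RBF kernel, plus its exact derivative,
    # computed by differentiating the cross-covariance k(x*, x_i) in x*.
    K = np.exp(-0.5 * ((x_tr[:, None] - x_tr[None, :]) / ls) ** 2)
    alpha = np.linalg.solve(K + noise_var * np.eye(len(x_tr)), y_tr)
    Ks = np.exp(-0.5 * ((x_te[:, None] - x_tr[None, :]) / ls) ** 2)
    mu = Ks @ alpha                                  # posterior mean
    dKs = -(x_te[:, None] - x_tr[None, :]) / ls ** 2 * Ks
    dmu = dKs @ alpha                                # derivative of the mean
    return mu, dmu

# toy data: observations of sin(x), whose derivative is cos(x)
x_tr = np.linspace(0.0, 2.0 * np.pi, 30)
y_tr = np.sin(x_tr)
x_te = np.array([np.pi / 2.0, np.pi])
mu, dmu = gp_mean_and_derivative(x_tr, y_tr, x_te)   # dmu approximates cos(x_te)
```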
In mixture experiments, two or more inputs represent a percentage contribution and are constrained to sum to 100%. The resultant correlation between input variables implies that additional care must be taken when fitting statistical models or visualizing the effect of one or more inputs on the response. In “Global Sensitivity Analysis for Mixture Experiments,” Jason L. Loeppky, Brian J. Williams, and Leslie M. Moore consider the use of a Gaussian process to model the output from a computer simulator taking a mixture input. They introduce a procedure to perform global sensitivity analysis, providing main effects and revealing interactions. The resulting methodology is illustrated using a function with analytically tractable results for comparison, a chemical compositional simulator, and a physical experiment.
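To see why the simplex constraint complicates sensitivity analysis, consider the crude Monte Carlo sketch below: mixture inputs are sampled from a flat Dirichlet distribution and a variance-based main-effect index is estimated by binning on one component. This is not the authors' GP-based procedure; the function, sample sizes, and binning are illustrative. Note that a component on which the function does not depend directly can still show a sizable main effect, induced purely by the sum-to-one constraint.

```python
import numpy as np

def main_effect_index(f, comp, n=200_000, bins=25, seed=0):
    # Crude estimate of Var(E[f(X) | X_comp]) / Var(f(X)) for a 3-component
    # mixture input sampled from a flat Dirichlet, via quantile binning.
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(np.ones(3), size=n)            # points on the simplex
    y = f(X)
    edges = np.quantile(X[:, comp], np.linspace(0.0, 1.0, bins + 1))
    idx = np.clip(np.searchsorted(edges, X[:, comp], side="right") - 1,
                  0, bins - 1)
    cond_means = np.array([y[idx == b].mean() for b in range(bins)])
    weights = np.array([(idx == b).mean() for b in range(bins)])
    var_cond = np.sum(weights * (cond_means - y.mean()) ** 2)
    return var_cond / y.var()

i_direct = main_effect_index(lambda X: X[:, 0], comp=0)    # f depends only on x0
i_induced = main_effect_index(lambda X: X[:, 1], comp=0)   # f = x1, yet x0 "matters"
```

For f(x) = x1, the constraint gives E[x1 | x0] = (1 - x0)/2, so the main-effect index of x0 is 0.25 even though f never touches x0; disentangling such induced effects from genuine ones is part of what the paper's methodology addresses.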
The three remaining papers in the issue consider problems other than computer experiments. To provide consistently high-quality service in computer networks, several aspects of the network must be monitored, including traffic volumes on its links. As network sizes expand, such monitoring becomes increasingly resource-hungry. The paper “Network-Wide Statistical Modeling, Prediction, and Monitoring of Computer Traffic,” by Joel Vaughan, Stilian Stoev, and George Michailidis, considers monitoring only a small subset of links and using these data to predict the traffic on other, unobserved links. Auxiliary data are used to represent important structure in the network and can significantly improve the prediction results. An adjusted control chart methodology is also introduced, indicating a possible application of the prediction results in situations where all links may be observed.
Importance sampling aids in establishing alarm thresholds for instrumentation used worldwide to deter and detect nuclear threats. In “Quantile Estimation for Radiation Portal Monitoring,” Rick Picard, Tom Burr, and Michael S. Hamada review the statistical aspects of threshold determination, discuss the intuition behind the methodology, and show when simple techniques work well and when they do not. Computational efficiency relative to ordinary simulation is improved by orders of magnitude in many cases, and the approach is easily implemented by non-experts.
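The intuition can be conveyed with a toy sketch: to set a threshold with a one-in-a-million false-alarm rate, ordinary simulation would need far more than a few hundred thousand draws, but sampling from a distribution shifted into the tail and reweighting by the likelihood ratio makes the same sample size ample. The standard-normal alarm statistic and the choice of shift below are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def tail_quantile_is(p, shift, n=400_000, seed=0):
    # Importance-sampling estimate of the upper p-quantile of a standard
    # normal "alarm statistic": draw from the shifted proposal N(shift, 1)
    # and reweight by the likelihood ratio exp(-shift*x + shift**2 / 2).
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n) + shift
    w = np.exp(-shift * x + 0.5 * shift ** 2) / n
    order = np.argsort(x)[::-1]            # largest samples first
    csum = np.cumsum(w[order])             # estimated tail probability P(X > x)
    k = np.searchsorted(csum, p)           # first sample where tail prob >= p
    return x[order][k]

# threshold giving a 1e-6 false-alarm rate (true value is about 4.753)
q = tail_quantile_is(1e-6, shift=4.75)
```

With 400,000 ordinary draws one would expect fewer than one exceedance of the true threshold, so the crude estimator fails outright; the reweighted estimator above places roughly half its samples beyond the threshold and recovers it accurately.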
The issue concludes with an extension of the lead article from the August 2012 issue. The original paper proposed a new deterministic approximation method for Bayesian computation, known as design of experiments-based interpolation technique (DoIt). A major weakness of this method is that the approximated posterior density can become negative. In the technical note “A Note on Non-Negative DoIt Approximation,” V. Roshan Joseph modifies his DoIt approximation, guaranteeing non-negativity of the approximated density. The new approximation is much simpler and faster to compute.