Bayesian Approaches for Modeling Computer Experiments Featured in May Issue
Peihua Qiu, Technometrics Editor
Modeling data from blocked and split-plot response surface experiments requires the use of generalized least squares and the estimation of two variance components. Existing literature on the optimal design of blocked and split-plot response surface experiments focuses entirely on the estimation of the fixed factor effects. In the paper titled “Optimal Design of Blocked and Split-Plot Experiments for Fixed Effects and Variance Component Estimation,” Kalliopi Mylona, Peter Goos, and Bradley Jones introduce a new Bayesian optimal design criterion that focuses on both the fixed effects and the variance components. By incorporating prior information about the variance components through a log-normal or beta prior distribution, the resulting designs allow for more powerful statistical inference than traditional optimal designs.
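To make the role of generalized least squares and the two variance components concrete, here is a minimal Python sketch for a split-plot layout. It is an illustration only, not the authors' design criterion: the function name gls_split_plot, the toy design, and the variance values are all assumptions, and the two variance components are treated as known, whereas in practice they would have to be estimated from the data.

```python
import numpy as np

def gls_split_plot(X, y, whole_plot, sigma2_eps, sigma2_wp):
    """GLS fixed-effect estimates for a split-plot experiment.

    Assumes V = sigma2_eps * I + sigma2_wp * Z Z', where Z assigns each
    run to its whole plot. Both variance components are taken as known
    here (an assumption made for illustration).
    """
    whole_plot = np.asarray(whole_plot)
    # Indicator matrix: one column per whole plot.
    Z = (whole_plot[:, None] == np.unique(whole_plot)[None, :]).astype(float)
    V = sigma2_eps * np.eye(len(y)) + sigma2_wp * (Z @ Z.T)
    V_inv_X = np.linalg.solve(V, X)
    V_inv_y = np.linalg.solve(V, y)
    info = X.T @ V_inv_X                      # X' V^{-1} X
    beta_hat = np.linalg.solve(info, X.T @ V_inv_y)
    cov_beta = np.linalg.inv(info)            # covariance of beta_hat
    return beta_hat, cov_beta

# Hypothetical toy design: 4 whole plots with 3 runs each.
whole_plot = np.repeat([0, 1, 2, 3], 3)
w = np.repeat([-1.0, 1.0, -1.0, 1.0], 3)      # hard-to-change factor
s = np.tile([-1.0, 0.0, 1.0], 4)              # easy-to-change factor
X = np.column_stack([np.ones(12), w, s])
beta_true = np.array([10.0, 2.0, -1.0])
y = X @ beta_true                             # noise-free response for checking

beta_hat, cov_beta = gls_split_plot(X, y, whole_plot, sigma2_eps=1.0, sigma2_wp=2.0)
print(beta_hat)
print(np.sqrt(np.diag(cov_beta)))
```

With the noise-free response used here, the estimates reproduce beta_true, and the printed standard errors show how the whole-plot variance inflates the uncertainty of the whole-plot factor effect relative to the subplot factor effect.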
Using numerical simulations to model the behavior of large-scale complex systems is common in many fields of science and technology. Although flexible, high-resolution simulations can be time-consuming and expensive. One example is computational fluid dynamics (CFD)–based simulations of a post-combustion carbon capture unit. For such large-scale systems, each simulation may take several days, or even weeks, to run. Thus, time-efficient surrogate models derived from a finite number of simulations need to be developed. In the paper titled “Bayesian Treed Multivariate Gaussian Process with Adaptive Design: Application to a Carbon Capture Unit,” Bledar Konomi, Georgios Karagiannis, Avik Sarkar, Xin Sun, and Guang Lin develop a novel Bayesian treed multivariate Gaussian process (BTMGP) to model the uncertainty of multivariate and nonstationary computer experiment output and implement the computation using Markov chain Monte Carlo (MCMC) techniques. They also apply the proposed method to model the multiphase flow in a full-scale regenerator of a carbon capture unit.
Complex process models have been widely used in science and engineering to understand underlying processes in various systems and make predictions about their future behavior. Such a mathematical model, implemented in computer code, is referred to as a simulator. In many cases, the simulator inputs are not easily observable, and thus there is uncertainty about the values of the process model inputs. Describing and quantifying the uncertainty induced in the simulator output by uncertainty in its inputs is known as sensitivity analysis, which is a valuable tool for identifying the places in a model that can be improved by obtaining better input information. In the paper titled “Bayesian Inference for Sensitivity Analysis of Computer Simulators, with an Application to Radiative Transfer Models,” Marian Farah and Athanasios Kottas consider global sensitivity analysis, which quantifies output uncertainty as all inputs vary continuously over the input space. The influence of each input and its uncertainty on the output is determined by calculating the main effects and sensitivity indices of the computer simulator inputs. The proposed approach is demonstrated in the sensitivity analysis of a radiative transfer model that simulates the interaction of sunlight with vegetation.
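As a rough illustration of what a first-order sensitivity index measures, the following Python sketch estimates the share of output variance attributable to each input by plain Monte Carlo sampling. It is not the Bayesian approach of the paper: the "pick-freeze" estimator, the independent uniform inputs on [0, 1], the function names, and the toy simulator are all assumptions chosen to keep the example small.

```python
import numpy as np

def first_order_indices(simulator, n_inputs, n_samples=100_000, seed=0):
    """Monte Carlo estimate of first-order sensitivity indices.

    Assumes independent inputs, each uniform on [0, 1]. The index for
    input i is Var(E[Y | X_i]) / Var(Y), estimated by swapping column i
    between two independent sample matrices (a pick-freeze scheme).
    """
    rng = np.random.default_rng(seed)
    a = rng.uniform(size=(n_samples, n_inputs))
    b = rng.uniform(size=(n_samples, n_inputs))
    y_a = simulator(a)
    y_b = simulator(b)
    total_var = np.var(np.concatenate([y_a, y_b]))
    # Centering y_b leaves the estimator's target unchanged and reduces noise.
    y_b_centered = y_b - y_b.mean()
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        ab_i = a.copy()
        ab_i[:, i] = b[:, i]          # input i comes from b, the rest from a
        y_ab = simulator(ab_i)
        indices[i] = np.mean(y_b_centered * (y_ab - y_a)) / total_var
    return indices

def toy_simulator(x):
    # Hypothetical simulator: strong effect of input 0, weak effect of
    # input 1, no effect of input 2.
    return 4.0 * x[:, 0] + 1.0 * x[:, 1] + 0.0 * x[:, 2]

s = first_order_indices(toy_simulator, n_inputs=3)
print(s)
```

For this additive toy simulator the exact indices are 16/17, 1/17, and 0, so the estimates can be checked directly. A real simulator would usually be too expensive to run this many times, which is one reason surrogate-based approaches are of interest.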
The next paper is about climate forecasting, which has become an important research topic because of its implications for political, social, and scientific decision making. One area of active research is to develop models for describing the distribution of carbon dioxide (CO2) mole fraction near the Earth’s surface. In the paper titled “Spatio-Temporal Data Fusion for Very Large Remote Sensing Datasets” by Hai Nguyen, Matthias Katzfuss, Noel Cressie, and Amy Braverman, the authors are concerned with the spatio-temporal prediction of lower-atmospheric CO2 mole fraction over the United States. To this end, they propose a spatio-temporal data fusion (STDF) method for optimal prediction of CO2 mole fraction from noisy and incomplete spatio-temporal data.
The next three papers are related to image analysis in three materials science applications. The first is about metal matrix nanocomposites (MMNCs), which are high-strength, lightweight materials with great potential in automotive, aerospace, and many other industries. A uniform distribution of nanoparticles in the metal matrix is critical for achieving high-quality MMNCs; hence, non-uniformity of the particle distribution in MMNCs needs to be detected for quality improvement. Most existing studies quantify and characterize the particle distribution based on a single 2D image. In the paper titled “Detecting 3D Spatial Clustering of Particles in Nanocomposites Based on Cross-Sectional Images” by Qiang Zhou, Junyi Zhou, Michael De Cicco, Shiyu Zhou, and Xiaochun Li, the authors assess the three-dimensional uniformity of the particle distribution based on a sequence of 2D images and determine the number of such images needed to reach a certain confidence level in statistical inferences.
High-resolution spectral information in images can be used to detect, identify, and characterize features of materials. One approach to material identification is the so-called temperature-emissivity separation (TES), which separates or deconvolves the material spectra from the temperature curve. In the paper titled “A Bayesian Nonparametric Model for Temperature-Emissivity Separation of Long-Wave Hyperspectral Images,” Candace Berrett, Gustavious Paul Williams, Todd Moon, and Jacob Gunther develop a Bayesian approach that uses measured spectra to characterize and identify clusters of materials within an image and determine the associated material emissivity and temperature.
Motivated by an analysis of nanocrystal growing processes, the paper titled “Estimating Multiple Pathways of Object Growth Using Non-Longitudinal Image Data” by Chiwoo Park proposes a new Bayesian monotonic regression model to infer a growth pathway of star-shaped objects.
Static origin-destination (OD) matrix estimation has been studied for many decades in the transportation engineering literature. The paper titled “A Bayesian Statistical Approach for Inference on Static Origin-Destination Matrices in Transportation Studies” by Luis Carvalho proposes a novel Bayesian statistical methodology that incorporates certain sources of data that are common in transportation studies, including seed matrices and trip cost distributions, for estimating OD pairwise trip counts in a transportation system. A hierarchical model and a Markov chain Monte Carlo sampler are developed to explore the posterior space of OD pairwise flows.
In industrial hygiene, a worker’s exposure to chemical, physical, and biological agents is increasingly being modeled using deterministic physical models that study exposures near and farther away from a contaminant source. A complication is that data from the workplace are usually misaligned; that is, concentrations near and far from the source are not both measured at every time point. In the paper titled “Bayesian Modeling for Physical Processes in Industrial Hygiene Using Misaligned Workplace Data,” João V. D. Monteiro, Sudipto Banerjee, and Gurumurthy Ramachandran propose a rich class of multivariate Gaussian processes to model the discrepancies between the physical model and observed data.
In the paper titled “Univariate Dynamic Screening System: An Approach for Identifying Individuals with Irregular Longitudinal Behavior,” Peihua Qiu and Dongdong Xiang develop a dynamic screening system (DySS) for sequentially identifying subjects with irregular longitudinal patterns. The new method combines statistical process control (SPC) techniques with longitudinal data analysis methods. It makes use of all historical data of subjects under monitoring and also takes into account the within-subject correlation.
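One way such a combination of SPC and longitudinal modeling might look is sketched below in Python. This is a simplified illustration under stated assumptions, not the DySS procedure itself: the regular mean and standard deviation at each time point are taken as known (in practice they would be estimated from data on well-behaved subjects), within-subject correlation is ignored, and the upward CUSUM chart with allowance k and control limit h is a choice made here for concreteness. The name first_signal and the data are hypothetical.

```python
import numpy as np

def first_signal(y, mu, sigma, k=0.5, h=2.5):
    """Return the index of the first CUSUM signal for one subject, or None.

    y     : the subject's observations over time
    mu    : assumed regular mean at each observation time
    sigma : assumed regular standard deviation at each observation time
    k, h  : CUSUM allowance and control limit (illustrative values)
    """
    # Standardize against the regular longitudinal pattern.
    z = (np.asarray(y, float) - np.asarray(mu, float)) / np.asarray(sigma, float)
    c = 0.0
    for j, z_j in enumerate(z):
        # Upward CUSUM: accumulate evidence of values above the regular pattern.
        c = max(0.0, c + z_j - k)
        if c > h:
            return j
    return None

# Hypothetical regular pattern and two subjects.
mu = np.full(6, 100.0)
sigma = np.full(6, 10.0)
regular_subject = [100, 100, 100, 100, 100, 100]
drifting_subject = [100, 100, 120, 120, 120, 120]

print(first_signal(regular_subject, mu, sigma))
print(first_signal(drifting_subject, mu, sigma))
```

Because the chart statistic accumulates over a subject's whole history, a sustained moderate shift can trigger a signal even when no single observation looks extreme on its own.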