
Analysis of Shape Experiments Featured in February Issue

1 February 2011
Hugh A. Chipman, Technometrics Editor

Increasingly sophisticated measurement technology is yielding rich and complex data, demanding the development of novel statistical methods. For instance, in statistical process control, profiles or functional data are increasingly monitored as quality characteristics. In our featured article, Enrique del Castillo and Bianca M. Colosimo consider a different kind of complex data: the geometric shape of an object. Shape can be considered as all information that is invariant with respect to rotations, translations, and dilations. Statistical shape analysis (SSA), a combination of statistics and geometry, seeks to analyze the shapes of objects in the presence of random error.

In “Statistical Shape Analysis of Experiments for Manufacturing Processes,” the authors develop new statistical techniques for the analysis of designed experiments in which the response is the geometric shape of a manufactured part. A significant challenge is the development of techniques that can connect conventional methods such as ANOVA with rich shape data. One might consider developing parametric models to summarize the shape, but in many industrial experiments, the response of interest has a complicated shape for which finding a parametric model is a challenge. By working with the shape directly, SSA techniques avoid the parametric model definition step, allow complicated shapes to be studied, and simplify the presentation of results.

The paper develops both an F ANOVA test and a permutation ANOVA test for shapes. The permutation test relaxes the usual error assumptions made by ANOVA, which may be restrictive in practice. The fitted model can be used to optimize the response, choosing factor levels that give the desired shape. Simulations are presented, demonstrating the efficacy of the methods on “free-form” objects of complex geometry and giving power comparisons. The power study focuses on circular shapes, a data type that enables comparison with existing techniques. The permutation test provides higher power than traditional methods used in manufacturing practice.
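
To give a concrete picture of the permutation idea, the sketch below runs a one-way permutation ANOVA on scalar responses. It illustrates only the general mechanism, not the authors' shape-specific statistic: the function name, the use of a plain F statistic, and the scalar summaries are assumptions made for this example.

```python
import numpy as np

def permutation_anova(groups, n_perm=5000, rng=None):
    """One-way permutation ANOVA on scalar responses.

    groups : list of 1-D arrays, one per factor level.
    Returns the observed F statistic and its permutation p-value.
    """
    rng = np.random.default_rng(rng)
    y = np.concatenate(groups)
    labels = np.concatenate([np.full(len(g), i) for i, g in enumerate(groups)])

    def f_stat(y, labels):
        levels = np.unique(labels)
        grand = y.mean()
        ss_between = sum(len(y[labels == k]) * (y[labels == k].mean() - grand) ** 2
                         for k in levels)
        ss_within = sum(((y[labels == k] - y[labels == k].mean()) ** 2).sum()
                        for k in levels)
        return (ss_between / (len(levels) - 1)) / (ss_within / (len(y) - len(levels)))

    f_obs = f_stat(y, labels)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(labels)      # break any factor/response association
        if f_stat(y, perm) >= f_obs:
            count += 1
    return f_obs, (count + 1) / (n_perm + 1)
```

Because the reference distribution is built by permuting factor labels rather than assumed in advance, the test does not rely on the usual normal-errors assumption, which is the point emphasized in the paper.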

New visualization tools, including main effect and interaction plots for shapes and deviation from nominal plots, are presented to aid in the interpretation of the experimental results. These plots give immediate visual indications of the effect of factors on shape and will be of special interest to practitioners. A machining experiment with titanium lathe turning is used to illustrate the techniques, with replicated data from a full-factorial 3² experiment.

The next two articles consider approximations and efficient computation for large and expensive models. The first, by Mark Fielding, David J. Nott, and Shie-Yui Liong, is titled “Efficient MCMC Schemes for Computationally Expensive Posterior Distributions.” In this case, the “expensive model” is the posterior distribution of a Bayesian model. The authors consider Markov chain Monte Carlo (MCMC) computational schemes intended to minimize the number of evaluations of the posterior distribution. The posterior is not necessarily well behaved: in their illustrative application, calibration of a rainfall-runoff model, it is multimodal and computationally expensive to evaluate. An algorithm suggested previously in the literature, based on hybrid Monte Carlo and a Gaussian process approximation to the target distribution, is extended in several ways. Parallel tempering, an MCMC technique involving multiple chains, is shown to sample effectively, and the Gaussian process approximation is used more aggressively in the sampling algorithm, reducing the number of evaluations of the true posterior.
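
As a rough illustration of the parallel-tempering ingredient only (the Gaussian process approximation is omitted here), the toy sampler below runs several Metropolis chains at different temperatures on a bimodal one-dimensional target and occasionally swaps states between adjacent chains. The function, its parameters, and the example target are hypothetical and are not the authors' algorithm.

```python
import numpy as np

def parallel_tempering(log_post, temps, n_iter=5000, step=0.5, rng=None):
    """Toy parallel-tempering Metropolis sampler for a 1-D target.

    log_post : callable returning the log posterior at a scalar x.
    temps    : increasing temperatures; temps[0] = 1 is the cold chain.
    Returns draws from the cold (temperature-1) chain.
    """
    rng = np.random.default_rng(rng)
    x = np.zeros(len(temps))              # one current state per tempered chain
    cold = []
    for _ in range(n_iter):
        # Within-chain Metropolis updates on the tempered targets pi^(1/T).
        for j, T in enumerate(temps):
            prop = x[j] + step * rng.standard_normal()
            if np.log(rng.uniform()) < (log_post(prop) - log_post(x[j])) / T:
                x[j] = prop
        # Propose swapping states between a random adjacent pair of chains.
        j = rng.integers(len(temps) - 1)
        log_acc = (1 / temps[j] - 1 / temps[j + 1]) * (log_post(x[j + 1]) - log_post(x[j]))
        if np.log(rng.uniform()) < log_acc:
            x[j], x[j + 1] = x[j + 1], x[j]
        cold.append(x[0])
    return np.array(cold)

# Example: a bimodal target that a single Metropolis chain mixes over poorly.
log_post = lambda x: np.logaddexp(-0.5 * (x + 3) ** 2, -0.5 * (x - 3) ** 2)
draws = parallel_tempering(log_post, temps=[1.0, 2.0, 4.0, 8.0])
```

The hot chains move freely between modes, and the swap moves let those excursions propagate down to the cold chain, which is what makes the scheme attractive for multimodal posteriors.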

In “Statistical Emulation of Large Dynamic Models,” Peter C. Young and Marco Ratto consider high-order dynamic simulation models as the “expensive model.” Such a problem could be considered a kind of computer experiment, which typically involves emulation of a scalar output as a function of a set of inputs. In the case of dynamic models, however, the output is a sequence of samples over time, depending not only on the usual inputs but also on dynamic sequences of forcing inputs. Their approach exploits the technique of dominant mode analysis to identify a reduced-order, linear transfer function model that closely reproduces the linearized dynamic behavior of the large model. Based on a set of such reduced-order models identified over a specified region of the large model’s parameter space, the emulation (or meta-) model can replace the large model in applications such as sensitivity analysis, forecasting, or control system design. The methods are demonstrated with two applications: a dynamic model used in hydrology and a more computationally intensive dynamic macro-economic model with 10 inputs and seven outputs.
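
A minimal sketch of the reduced-order idea is given below, under the simplifying assumption that a low-order discrete-time ARX model fitted by ordinary least squares stands in for the transfer function identified via dominant mode analysis; the function names, model orders, and zero initial conditions are illustrative choices, not the authors' method.

```python
import numpy as np

def fit_arx(u, y, na=2, nb=2):
    """Least-squares fit of a low-order ARX model
        y[t] = a1*y[t-1] + ... + a_na*y[t-na] + b1*u[t-1] + ... + b_nb*u[t-nb].

    u, y : forcing-input and output sequences from runs of the large model.
    Returns the coefficient vector [a1..a_na, b1..b_nb].
    """
    u, y = np.asarray(u, float), np.asarray(y, float)
    rows, targets = [], []
    for t in range(max(na, nb), len(y)):
        rows.append(np.concatenate([y[t - na:t][::-1], u[t - nb:t][::-1]]))
        targets.append(y[t])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta

def simulate_arx(theta, u, na=2, nb=2):
    """Run the fitted reduced-order model as a cheap emulator on new inputs
    (initial conditions are taken to be zero for simplicity)."""
    u = np.asarray(u, float)
    y = np.zeros(len(u))
    for t in range(max(na, nb), len(u)):
        past = np.concatenate([y[t - na:t][::-1], u[t - nb:t][::-1]])
        y[t] = past @ theta
    return y
```

Once a family of such low-order fits has been obtained across the parameter region of interest, evaluating the emulator costs almost nothing compared with rerunning the full simulation model.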

The next two papers consider two rather different regression problems. In “Robust Ridge Regression for High-Dimensional Data,” Ricardo A. Maronna addresses several shortcomings of previous approaches to robust ridge regression, including sensitivity to “bad leverage observations,” requirements that sample size n be greater than the number of predictors p, and low robustness when p/n is large. A penalized MM estimate is proposed in which the quadratic loss is replaced by a more robust loss function, enabling computation for p>n and robustness for large p/n. A fast algorithm is demonstrated using data from an electron-probe X-ray microanalysis.
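
To convey the flavor of robustified ridge regression, though not Maronna's penalized MM estimator itself, the sketch below fits a ridge model with a Huber-type loss by iteratively reweighted least squares; the weighting scheme, tuning constants, and function name are illustrative assumptions.

```python
import numpy as np

def huber_ridge(X, y, lam=1.0, c=1.345, n_iter=50):
    """Ridge regression with a Huber-type loss, fit by iteratively
    reweighted least squares (IRLS).

    Observations with large residuals are downweighted, so a few bad
    points do not dominate the penalized fit.  The ridge penalty keeps
    the linear system well-posed even when p > n.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12      # robust residual scale (MAD)
        w = np.clip(c * s / np.maximum(np.abs(r), 1e-12), None, 1.0)  # Huber weights in [0, 1]
        beta = np.linalg.solve(X.T @ (w[:, None] * X) + lam * np.eye(p),
                               X.T @ (w * y))
    return beta
```

The key contrast with ordinary ridge is the weight vector: bad leverage points receive weights well below one instead of pulling the coefficient vector toward themselves.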

In “Nearly Isotonic Regression,” Ryan J. Tibshirani, Holger Hoefling, and Robert Tibshirani borrow ideas from lasso penalization to generalize isotonic regression. In isotonic regression, one seeks to summarize a sequence of data points (e.g., indexed by time) with a monotone sequence of constants, building a monotone, piecewise constant approximation. The paper implements monotonicity via a lasso-style penalty, enabling the estimation of nearly isotonic fits, with some non-monotone segments. In conjunction with model-selection tools such as cross-validation, the assumption of monotonicity can be assessed. Extensions to nearly convex fits and connections with an unbiased estimate of degrees of freedom are made. An illustration using global warming data is presented, suggesting that global temperatures have not had a strictly monotone increase since 1856.
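
The nearly isotonic criterion is simple to state: penalize only the downward steps of the fitted sequence. A small sketch of that objective, solved here with the generic convex solver cvxpy rather than the authors' path algorithm, might look like the following (the function name and example data are made up for illustration).

```python
import numpy as np
import cvxpy as cp

def nearly_isotonic(y, lam):
    """Nearly isotonic fit:
        minimize  0.5 * ||y - beta||^2  +  lam * sum_i max(beta_i - beta_{i+1}, 0)

    lam = 0 returns y itself; as lam grows, the fit approaches the
    usual isotonic (monotone non-decreasing) regression.
    """
    beta = cp.Variable(len(y))
    violations = cp.pos(beta[:-1] - beta[1:])        # penalize downward steps only
    objective = 0.5 * cp.sum_squares(y - beta) + lam * cp.sum(violations)
    cp.Problem(cp.Minimize(objective)).solve()
    return beta.value

# Example: a noisy upward trend with a small dip that a strict
# isotonic fit would be forced to flatten out.
rng = np.random.default_rng(0)
y = np.concatenate([np.linspace(0, 1, 30), np.linspace(1, 0.8, 10)])
y += 0.05 * rng.standard_normal(len(y))
fit = nearly_isotonic(y, lam=0.5)
```

Sweeping over the penalty weight and choosing it by cross-validation is what allows the monotonicity assumption itself to be assessed rather than imposed.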

In “Efficient Designs with Minimal Aliasing,” Bradley Jones and Christopher J. Nachtsheim develop a Bayesian design criterion to deal with lurking model terms that are potentially important, but not included in the model. For example, when constructing an optimal design for a first-order model, aliasing of main effects and interactions is not considered. This can lead to designs that are optimal for estimation of the primary effects of interest, yet have undesirable aliasing structures. Using a Bayesian formulation of the design problem, they construct exact designs that minimize expected squared bias subject to constraints on design efficiency. The method is illustrated with the construction of screening and response surface designs.
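
A useful building block for thinking about this criterion is the alias matrix, which quantifies how omitted interaction terms bias the estimates of the primary effects. The short sketch below computes it for a two-level design; it illustrates the bias mechanism only and is not the authors' design-construction algorithm.

```python
import numpy as np
from itertools import combinations

def alias_matrix(design):
    """Alias matrix A = (X1'X1)^{-1} X1'X2 for a two-level design.

    X1 holds the primary model terms (intercept + main effects),
    X2 the potentially lurking two-factor interactions.  Row i of A
    shows how the interaction effects bias the i-th primary estimate.
    """
    n, k = design.shape
    X1 = np.column_stack([np.ones(n), design])
    X2 = np.column_stack([design[:, i] * design[:, j]
                          for i, j in combinations(range(k), 2)])
    return np.linalg.solve(X1.T @ X1, X1.T @ X2)

# Example: a 4-run, 3-factor regular fraction (C = AB), in which every
# main effect is completely aliased with a two-factor interaction.
d = np.array([[-1, -1,  1],
              [ 1, -1, -1],
              [-1,  1, -1],
              [ 1,  1,  1]])
print(alias_matrix(d))
```

A design that is optimal for estimating the main effects alone can still have rows of this matrix full of ones; the Bayesian criterion trades a little estimation efficiency for much smaller entries.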

Censoring is a problem that routinely occurs in many applications, with right-censoring being the most common. An interesting example of left-censoring is that of a sample containing values below a single detection limit. In “Inference for the Lognormal Mean and Quantiles Based on Samples with Left and Right Type I Censoring,” K. Krishnamoorthy, Avishek Mallick, and Thomas Mathew focus on this problem, considering interval estimation of the mean and quantiles of a lognormal distribution. They develop inferential techniques that do not require large samples and are quite accurate. Methods using MLE-based approximate pivotal quantities, and some likelihood-based methods, are proposed and shown to be effective, even when 70% of data values are left-censored. Examples with lower detection limits and censored reliability measurements illustrate the methods.
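
For orientation, the sketch below writes down the left-censored lognormal likelihood and maximizes it numerically, returning the MLE of the lognormal mean. It is a plain MLE baseline, not the authors' pivotal-quantity intervals (which are designed precisely to work well without large samples); the function name and interface are assumptions.

```python
import numpy as np
from scipy import stats, optimize

def censored_lognormal_mle(observed, n_cens, dl):
    """MLE of (mu, sigma) for lognormal data with left Type I censoring.

    observed : values measured above the detection limit dl.
    n_cens   : number of values reported only as "< dl"; they enter the
               likelihood through the normal CDF evaluated at log(dl).
    Returns (mu_hat, sigma_hat, estimate of the lognormal mean).
    """
    z = np.log(np.asarray(observed, float))

    def neg_log_lik(theta):
        mu, log_sigma = theta
        sigma = np.exp(log_sigma)                       # keep sigma positive
        ll_obs = stats.norm.logpdf(z, mu, sigma).sum()  # fully observed values
        ll_cens = n_cens * stats.norm.logcdf(np.log(dl), mu, sigma)
        return -(ll_obs + ll_cens)

    res = optimize.minimize(neg_log_lik, x0=[z.mean(), np.log(z.std() + 1e-6)])
    mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
    return mu_hat, sigma_hat, np.exp(mu_hat + 0.5 * sigma_hat ** 2)
```

The censored observations still carry information through the probability mass below the detection limit, which is why heavy censoring (even 70% in the paper's studies) need not cripple inference.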

The issue concludes with an article by Changliang Zou and Fugee Tsung, titled “A Multivariate Sign EWMA Control Chart.” Nonparametric control charts are useful in statistical process control (SPC) when knowledge about the underlying process distribution is limited or lacking, especially when the process measurement is multivariate. This paper develops a new multivariate SPC methodology for monitoring location parameters. It is based on adapting a powerful multivariate sign test to online sequential monitoring. A weighted version of the sign test is used to formulate the charting statistic by incorporating the exponentially weighted moving average (EWMA) control scheme, which results in a nonparametric counterpart of the classical multivariate EWMA (MEWMA). The chart is affine-invariant and has a strictly distribution-free property over a broad class of population models. It possesses other favorable features, including fast computation, efficient detection of process shifts, and easy implementation with minimal inputs from historical data. Two real-data examples from manufacturing show it performs well in applications.
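
A much-simplified sketch of the charting idea follows: smooth the spatial signs of the (centered) observations with an EWMA and monitor the squared length of the result. The scaling used here and the omission of the affine-invariant transformation estimated from historical data are simplifications for illustration, not the authors' exact statistic; the control limit would in practice be chosen, e.g. by simulation, to give the desired in-control run length.

```python
import numpy as np

def sign_ewma_chart(X, lam=0.1):
    """Simplified spatial-sign EWMA statistic for multivariate monitoring.

    X   : (T, p) array of process readings, assumed already centered
          (and, ideally, transformed so the in-control signs are
          roughly uniform on the sphere).
    lam : EWMA smoothing constant.
    Returns the sequence of charting statistics; an alarm is raised
    when the statistic exceeds the chosen control limit.
    """
    T, p = X.shape
    v = np.zeros(p)
    chart = np.empty(T)
    for t in range(T):
        norm = np.linalg.norm(X[t])
        u = X[t] / norm if norm > 0 else np.zeros(p)   # spatial sign of the reading
        v = (1 - lam) * v + lam * u                    # EWMA of the signs
        chart[t] = (2 - lam) / lam * p * (v @ v)       # scaled squared length
    return chart
```

Because only the directions of the observations are used, the statistic is insensitive to heavy tails and other departures from normality, which is the source of the chart's distribution-free behavior.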

