
Special Issue Features Anomaly Detection

1 February 2010

In “Incorporating Time-Dependent Source Profiles Using the Dirichlet Distribution in Multivariate Receptor Models,” Matthew J. Heaton, C. Shane Reese, and William F. Christensen use models to estimate profiles and contributions of pollution sources from concentrations of pollutants such as particulate matter in the air. The majority of previous approaches to multivariate receptor modeling assume pollution source profiles are constant through time. In an effort to relax this assumption, this article uses the Dirichlet distribution in a dynamic linear receptor model for pollution source profiles. The model is evaluated using simulated data sets and then applied to a physical data set of chemical species concentrations measured at the U.S. Environmental Protection Agency’s St. Louis-Midwest supersite.
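The key idea, letting source profiles vary over time while remaining valid compositions, can be illustrated with a minimal simulation sketch. This is not the authors' model; all numbers (two sources, five chemical species, the base compositions, and the concentration parameter) are hypothetical, chosen only to show how Dirichlet draws keep each profile nonnegative and summing to one while drifting across time points:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_species, n_times = 2, 5, 100

# Hypothetical base composition of each pollution source.
base = np.array([[8., 1., 1., 3., 2.],
                 [1., 6., 2., 1., 5.]])

# Dirichlet-distributed profiles: each row is nonnegative and sums to 1.
# A larger concentration parameter keeps draws near the base composition
# while still allowing time-to-time variation.
concentration = 50.0
profiles = np.array([[rng.dirichlet(concentration * base[k] / base[k].sum())
                      for k in range(n_sources)]
                     for _ in range(n_times)])            # shape (T, K, P)

# Source contributions and resulting ambient species concentrations.
contributions = rng.gamma(shape=2.0, scale=1.0, size=(n_times, n_sources))
observed = np.einsum('tk,tkp->tp', contributions, profiles)
observed += rng.normal(scale=0.05, size=observed.shape)   # measurement noise
```

The sum-to-one constraint is exactly what makes the Dirichlet a natural prior for compositional profiles.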

The next two papers look at measurement systems. The first—by Tirthankar Dasgupta, Arden Miller, and C. F. Jeff Wu—is titled “Robust Design, Modeling, and Optimization of Measurement Systems.” The authors present an integrated approach for estimation and reduction of measurement variation in systems with a linear signal-response relationship. Noise factors are classified into a few distinct categories based on their impact on the measurement system. A random coefficients model that accounts for the effect of control factors and each category of noise factors on the signal-response relationship is proposed. A suitable performance measure is developed using this general model, and conditions under which it reduces to the usual dynamic signal-to-noise ratio are discussed. Two data analysis strategies for modeling and optimization are proposed and compared. The effectiveness of the proposed method is demonstrated with a simulation study and by application to data from an industrial experiment.
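For readers unfamiliar with the dynamic signal-to-noise (SN) ratio mentioned above, here is a minimal sketch of the usual Taguchi form for a linear signal-response system, computed on simulated data. The signal levels, true slope, and noise scale are all hypothetical, and this is a simplification rather than the authors' random coefficients model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical measurement-system data: signal levels M (true values of
# the measurand), each replicated five times, and noisy responses y.
M = np.repeat(np.array([1.0, 2.0, 3.0, 4.0]), 5)
beta_true = 1.5
y = beta_true * M + rng.normal(scale=0.2, size=M.size)

# Least-squares slope through the origin, and the dynamic
# signal-to-noise ratio SN = 10 * log10(beta^2 / MSE).
beta_hat = (M @ y) / (M @ M)
mse = np.mean((y - beta_hat * M) ** 2)
sn_ratio = 10 * np.log10(beta_hat ** 2 / mse)
```

Maximizing this ratio over control-factor settings amounts to amplifying the system's sensitivity to the signal relative to its variability, which is the optimization target the paper generalizes.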

The next article, by Jeroen de Mast and Wessel N. van Wieringen, is titled “Modeling and Evaluating Repeatability and Reproducibility of Ordinal Classifications.” The authors criticize existing methods for studying ordinal measurement processes. They then propose a new approach, rooted in item response theory (IRT), a well-established reliability method in psychometrics and education. Fitted IRT models can be presented graphically, but also allow the calculation of probabilities of correct ordering and consistent classification. In addition, the model-based approach allows refined diagnostics, giving the user insight into the workings of a classification procedure. The approach is illustrated by a real-life industrial example, and the proposed analysis is contrasted with two popular alternatives.
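To make the IRT connection concrete, the following sketch computes category probabilities under a graded response model, a standard IRT model for ordinal data. The discrimination parameter, thresholds, and latent value are hypothetical illustrations, not quantities from the paper:

```python
import numpy as np

def graded_response_probs(theta, a, thresholds):
    """Category probabilities under a graded response model:
    P(Y >= c) = logistic(a * (theta - b_c)) for ordered thresholds b_c,
    and category probabilities are differences of adjacent cumulatives."""
    b = np.asarray(thresholds, dtype=float)
    p_ge = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # P(Y >= c), c = 1..C-1
    p_ge = np.concatenate(([1.0], p_ge, [0.0]))
    return -np.diff(p_ge)                           # P(Y == c), c = 0..C-1

# Hypothetical appraiser with discrimination a = 2 classifying an item of
# latent quality theta = 0.3 on a 4-category ordinal scale.
probs = graded_response_probs(theta=0.3, a=2.0, thresholds=[-1.0, 0.0, 1.0])
```

Quantities such as the probability of consistent classification can then be built up from these per-category probabilities.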

Grzegorz Wylupek proposes a new solution for the general nonparametric k-sample problem. His paper, “Data-Driven k-Sample Tests,” introduces a net of semiparametric models, solves the testing problem for members of this approximating net, and then combines the resulting statistics via model selection rules. This approach leads to a flexible and powerful class of tests that are sensitive to a wide range of potential differences among the groups. Simulations show that an omnibus version of the test has power comparable to existing k-sample tests for detecting changes of location or scale and is more powerful for more complex changes. The author briefly discusses a variant of the new solution focused on detecting high-frequency alternatives.
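The "combine component statistics via a model selection rule" idea has a well-known one-sample analogue, the data-driven smooth test with a Schwarz (BIC-type) selection rule, which the following sketch implements for testing uniformity. This is a simplified illustration of the general construction, not Wylupek's k-sample procedure:

```python
import numpy as np
from numpy.polynomial import legendre

def data_driven_smooth_stat(u, max_dim=5):
    """Simplified one-sample data-driven smooth test of uniformity on [0, 1]:
    component statistics built from orthonormal Legendre polynomials, with
    the number of components chosen by a Schwarz (BIC-type) rule."""
    n = len(u)
    x = 2 * np.asarray(u) - 1                 # map U(0,1) data to [-1, 1]
    comps = []
    for j in range(1, max_dim + 1):
        coef = np.zeros(j + 1)
        coef[j] = 1.0
        # P_j rescaled to be orthonormal with respect to U(0, 1)
        phi = legendre.legval(x, coef) * np.sqrt(2 * j + 1)
        comps.append(np.sum(phi) / np.sqrt(n))
    cums = np.cumsum(np.array(comps) ** 2)
    # Schwarz rule: penalize each additional dimension by log(n)
    scores = cums - np.arange(1, max_dim + 1) * np.log(n)
    d = int(np.argmax(scores)) + 1
    return cums[d - 1], d

u = np.random.default_rng(3).uniform(size=200)
stat, dim = data_driven_smooth_stat(u)
```

The selected dimension adapts to the data: smooth alternatives are caught by low-order components, while richer alternatives recruit more components at a log(n) penalty each.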

The final article, by Yue Cui, James S. Hodges, Xiaoxiao Kong, and Bradley P. Carlin, is titled “Partitioning Degrees of Freedom in Hierarchical and Other Richly Parameterized Models.” A measure of a hierarchical model’s complexity—degrees of freedom (DF)—has already been developed; it is consistent with definitions for scatterplot smoothers, can be interpreted in terms of simple models, and enables control of a fit’s complexity by means of a prior distribution on complexity. DF describes the complexity of the whole fitted model, but it is generally unclear how to allocate DF to individual effects. This article gives a new definition of DF for arbitrary normal-error linear hierarchical models that naturally partitions the n observations into DF for individual effects and error. The new conception of an effect’s DF is the ratio of the effect’s modeled variance matrix to the total variance matrix. This gives a way to describe the sizes of different parts of a model (e.g., spatial clustering vs. heterogeneity), to place DF-based priors on smoothing parameters, and to describe how a smoothed effect competes with other effects. It also avoids difficulties with the most common definition of DF for residuals.
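A minimal numerical sketch of the partitioning idea, taking an effect's DF as the trace of the "ratio" of its modeled variance matrix to the total variance matrix. The model here (one smooth effect with a Gaussian kernel covariance plus white noise, with made-up length scale and error variance) is a hypothetical stand-in, not an example from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30

# Hypothetical two-component model: a smooth effect plus white noise.
t = np.linspace(0, 1, n)
V_effect = np.exp(-((t[:, None] - t[None, :]) / 0.1) ** 2)  # smooth kernel
sigma2 = 0.5
V_total = V_effect + sigma2 * np.eye(n)                     # effect + error

# DF of the effect: trace of V_effect "divided by" V_total; the error
# takes the remainder, so the two pieces partition the n observations.
df_effect = np.trace(V_effect @ np.linalg.inv(V_total))
df_error = n - df_effect
```

Shrinking the error variance pushes the effect's DF toward n (the fit interpolates), while inflating it pushes the DF toward 0 (the effect is smoothed away), which is the sense in which components compete for the n observations.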


2 Comments

  • Galit Shmueli said:

    I believe that the special issue will come out in May 2010 (the Feb 2010 Technometrics issue does not seem to include these papers).

  • Eric Sampson said:

    Dear Dr. Shmueli: With my apologies, this issue ran into unexpected production problems.

    The issue is now online at http://pubs.amstat.org/toc/tech/52/1

    Please don’t hesitate to let me know if you have any questions!