Seriation Article Leads Off Volume 3

1 April 2010
Joseph S. Verducci, Editor, Statistical Analysis and Data Mining

    Table of Contents

    • Seriation and Matrix Reordering Methods: A Historical Overview

      Innar Liiv
    • Bayesian Adaptive Nearest Neighbor

      Ruixin Guo and Sounak Chakraborty
    • Mining and Tracking Evolving Web User Trends from Large Web Server Logs

      Basheer Hawwash and Olfa Nasraoui
    • Modeling User Reputation in Wikis

      Sara Javanmardi, Cristina Videira Lopes, and Pierre Baldi

    The four papers in Volume 3, Number 2 of the journal Statistical Analysis and Data Mining span a wide range of topics, from a very general method for discovering patterns in data to very specific models for the reputation of those who update wikis.

    In the first paper, Innar Liiv reviews how seriation, or reordering of observations, has revealed the hidden structure of data in many disciplines. Across these examples, seriation is typically achieved by permuting the rows and/or columns of a matrix to optimize an objective function of interest.
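To make the idea of permuting rows to optimize an objective concrete, here is a minimal sketch in Python. It is not taken from Liiv's paper: the objective (total Hamming distance between adjacent rows), the exhaustive search, and the names `adjacent_dissimilarity` and `seriate_rows` are all assumptions chosen for illustration, and the search is only feasible for very small matrices.

```python
from itertools import permutations


def adjacent_dissimilarity(matrix, order):
    """Sum of Hamming distances between consecutive rows, taken in `order`.

    This objective is an illustrative assumption, not one prescribed
    by the article.
    """
    return sum(
        sum(a != b for a, b in zip(matrix[i], matrix[j]))
        for i, j in zip(order, order[1:])
    )


def seriate_rows(matrix):
    """Try every row permutation and return the one with the lowest cost.

    Exhaustive search grows factorially, so this is a toy approach.
    """
    return min(
        permutations(range(len(matrix))),
        key=lambda order: adjacent_dissimilarity(matrix, order),
    )


# Hypothetical binary data: rows are observations, columns are attributes.
data = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
]

best_order = seriate_rows(data)
for row_index in best_order:
    print(row_index, data[row_index])
```

After reordering, similar rows sit next to each other, so a band of 1s shifting gradually from left to right becomes visible. Practical seriation methods replace the exhaustive search with heuristics and often permute columns as well.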

    In the second paper, Ruixin Guo and Sounak Chakraborty present a general method, Bayesian adaptive nearest neighbor (BANN), for classification in high dimensions. BANN uses a Bayesian framework to combine ideas for adapting the shape (discriminative adaptive nearest neighbor [DANN]) as well as the size (probabilistic nearest neighbor [PNN]) of neighborhoods, based on extended local patterns. BANN performs better than DANN or PNN on nine benchmark data sets.

    In the third paper, Basheer Hawwash and Olfa Nasraoui demonstrate an efficient method for mining evolving profiles, which is particularly sensitive to changes in profile patterns. They apply their method to track changes in the profiles of those accessing a library web site.

    Finally, Sara Javanmardi, Cristina Videira Lopes, and Pierre Baldi propose three nested models for dynamically estimating the reputation of contributors to a wiki site, where “reputation” ranges from 0 (vandals) to 1 (administrators). The first model simply updates the fraction of “good” contributions, the second adjusts each contribution by the length of time it has endured, and the third takes into account the reputation of the user who deleted it.
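The first of these models, a running fraction of "good" contributions, can be sketched in a few lines of Python. This is an illustration under assumptions rather than the authors' formulation: the class name `ContributorReputation`, the rule for deciding whether a contribution is good, and the neutral default score for a contributor with no history are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContributorReputation:
    """Running fraction of good contributions for one contributor."""

    good: int = 0
    total: int = 0

    def record(self, is_good: bool) -> None:
        """Update the counts after one contribution has been judged.

        How a contribution is judged good is left to the caller; that
        rule is an assumption outside this sketch.
        """
        self.total += 1
        if is_good:
            self.good += 1

    def score(self, default: float = 0.5) -> float:
        """Return a reputation in [0, 1].

        The `default` for contributors with no history is a
        hypothetical choice, not part of the described model.
        """
        return self.good / self.total if self.total else default


# Hypothetical history: three contributions kept, one reverted.
editor = ContributorReputation()
for kept in (True, True, False, True):
    editor.record(kept)
print(editor.score())
```

The second and third models described above would weight each contribution, by how long it survived and by the reputation of whoever removed it, instead of counting every contribution equally.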

    All in all, the papers span the range from thought-provoking to immediately useful.
