Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables

Statistics – Machine Learning

Scientific paper


Details

Published in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) at http://dx.doi.org/10.1214/08-EJS194 by t


10.1214/08-EJS194

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask the underlying clustering structure, so removing noise variables via variable selection is necessary. For simultaneous variable selection and parameter estimation, existing penalized likelihood approaches in model-based clustering analysis all assume a common diagonal covariance matrix across clusters, which, however, may not hold in practice. To analyze high-dimensional data, particularly data with relatively small sample sizes, this article introduces a novel approach that shrinks the variances together with the means, in the more general situation of cluster-specific (diagonal) covariance matrices. Furthermore, a specific form of penalty permits selection of grouped variables, that is, inclusion or exclusion of a whole group of variables at once. This makes it easier to incorporate subject-matter knowledge, such as gene functions when clustering microarray samples for disease subtype discovery. For implementation, EM algorithms are derived for parameter estimation, in which the M-steps clearly demonstrate the effects of shrinkage and thresholding. Numerical examples, including an application to acute leukemia subtype discovery with microarray gene expression data, are provided to demonstrate the utility and advantage of the proposed method.
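To make the "shrinkage and thresholding" in the M-step concrete, the following is a minimal sketch of how an L1 penalty on the cluster means leads to a soft-thresholded update when each cluster has its own diagonal variances. The abstract does not give the paper's exact updates, so this is an illustration under stated assumptions rather than the authors' algorithm: it covers only the means (not the variance shrinkage or the grouped penalty), the data are assumed standardized so that a zero mean marks an uninformative variable, and the names `soft_threshold`, `penalized_mean_update`, and `lam` are hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink z toward zero by t and set it to zero when |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def penalized_mean_update(X, tau, sigma2, lam):
    """One M-step update of the cluster means under an L1 penalty (sketch).

    X      : (n, p) data, assumed standardized to mean zero per variable
    tau    : (n, K) posterior cluster probabilities from the E-step
    sigma2 : (K, p) current cluster-specific diagonal variances
    lam    : hypothetical penalty weight on the means
    """
    nk = tau.sum(axis=0)                    # effective cluster sizes, shape (K,)
    mu_tilde = (tau.T @ X) / nk[:, None]    # unpenalized weighted means, (K, p)
    threshold = lam * sigma2 / nk[:, None]  # larger variance or smaller cluster => more shrinkage
    return soft_threshold(mu_tilde, threshold)
```

A variable whose penalized mean is zero in every cluster no longer separates the clusters through its mean, which is how this kind of update can perform variable selection. Because the threshold here depends on the cluster-specific variance, the same variable can be shrunk by different amounts in different clusters.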
