Robust model-based clustering with gene ranking

Statistics – Methodology

Scientific paper

Details

18 pages, 4 figures

Cluster analysis of biological samples using gene expression measurements is a common task that aids the discovery of heterogeneous biological sub-populations with distinct mRNA profiles. Several model-based clustering algorithms have been proposed in which the distribution of gene expression values within each sub-group is assumed to be Gaussian. In the presence of noise and extreme observations, a mixture of Gaussian densities may over-fit and overestimate the true number of clusters. Moreover, commonly used model-based clustering algorithms do not generally provide a mechanism to quantify the relative contribution of each gene to the final partitioning of the data. We propose a penalised mixture of Student's t distributions for model-based clustering and gene ranking. Together with a bootstrap procedure, the proposed approach provides a means of ranking genes according to their contributions to the clustering process. Experimental results show that the algorithm performs well compared with traditional Gaussian mixtures in the presence of outliers and longer-tailed distributions. The algorithm also identifies the true informative genes with high sensitivity and achieves improved model selection. An illustrative application to breast cancer data, which confirms established tumor subclasses, is also presented.
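
The robustness argument in the abstract can be illustrated with a small sketch. This is not the paper's penalised mixture or its bootstrap gene ranking, neither of which is specified here. It is only the standard EM update for a single univariate Student's t component with fixed degrees of freedom, showing how extreme observations are down-weighted. The function name `t_location_scale_em`, the choice `nu=3.0`, and the toy data are assumptions made for illustration.

```python
import statistics


def t_location_scale_em(xs, nu=3.0, n_iter=50):
    """EM for one univariate Student's t component with fixed nu (illustrative).

    Returns the location, the squared scale, and the per-observation weights
    computed in the final E-step.
    """
    n = len(xs)
    mu = statistics.median(xs)
    # Guard against a zero starting variance (all observations identical).
    sigma2 = max(statistics.pvariance(xs), 1e-12)
    u = [1.0] * n
    for _ in range(n_iter):
        # E-step: the weight u_i shrinks towards 0 as x_i moves away from mu.
        u = [(nu + 1.0) / (nu + (x - mu) ** 2 / sigma2) for x in xs]
        # M-step: weighted location, then weighted squared scale.
        mu = sum(ui * x for ui, x in zip(u, xs)) / sum(u)
        sigma2 = max(sum(ui * (x - mu) ** 2 for ui, x in zip(u, xs)) / n, 1e-12)
    return mu, sigma2, u


# Toy data (assumed): eight values near 10 and one extreme observation.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0]

gaussian_mean = statistics.fmean(data)  # Gaussian ML estimate of the centre
t_mean, t_scale2, weights = t_location_scale_em(data)

print("Gaussian centre:", round(gaussian_mean, 3))
print("Student's t centre:", round(t_mean, 3))
print("weight of the extreme observation:", round(weights[-1], 4))
```

Under a Gaussian model every observation contributes equally to the estimated centre, so the single extreme value pulls it away from the bulk of the data. Under the t model that observation receives a small weight and the estimate stays close to the remaining values. In a mixture setting the same weights would multiply the cluster responsibilities, which is one reason a t mixture is less inclined to open an extra component just to absorb outliers.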
