Algorithms for Internal Validation Clustering Measures in the Post Genomic Era

Computer Science – Data Structures and Algorithms

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Scientific paper

Inferring cluster structure in microarray datasets is a fundamental task for the -omic sciences. A fundamental question in Statistics, Data Analysis and Classification, is the prediction of the number of clusters in a dataset, usually established via internal validation measures. Despite the wealth of internal measures available in the literature, new ones have been recently proposed, some of them specifically for microarray data. In this dissertation, a study of internal validation measures is given, paying particular attention to the stability based ones. Indeed, this class of measures is particularly prominent and promising in order to have a reliable estimate the number of clusters in a dataset. For those measures, a new general algorithmic paradigm is proposed here that highlights the richness of measures in this class and accounts for the ones already available in the literature. Moreover, some of the most representative validation measures are also considered. Experiments on 12 benchmark datasets are performed in order to assess both the intrinsic ability of a measure to predict the correct number of clusters in a dataset and its merit relative to the other measures. The main result is a hierarchy of internal validation measures in terms of precision and speed, highlighting some of their merits and limitations not reported before in the literature. This hierarchy shows that the faster the measure, the less accurate it is. In order to reduce the time performance gap between the fastest and the most precise measures, the technique of designing fast approximation algorithms is systematically applied. The end result is a speed-up of many of the measures studied here that brings the gap between the fastest and the most precise within one order of magnitude in time, with no degradation in their prediction power. Prior to this work, the time gap was at least two orders of magnitude.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Algorithms for Internal Validation Clustering Measures in the Post Genomic Era does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Algorithms for Internal Validation Clustering Measures in the Post Genomic Era, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Algorithms for Internal Validation Clustering Measures in the Post Genomic Era will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-377643

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.