Data spectroscopy: Eigenspaces of convolution operators and clustering

Statistics – Machine Learning

Scientific paper

Details

Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/09-AOS700 by the Institute of

10.1214/09-AOS700

This paper focuses on obtaining clustering information about a distribution from its i.i.d. samples. We develop theoretical results to understand and use the clustering information contained in the eigenvectors of data adjacency matrices based on a radial kernel function with a sufficiently fast tail decay. In particular, we provide population analyses to gain insight into which eigenvectors should be used and when the clustering information for the distribution can be recovered from the sample. We learn that a fixed number of top eigenvectors may simultaneously contain redundant clustering information and miss relevant clustering information. We use this insight to design the data spectroscopic clustering (DaSpec) algorithm, which uses properly selected eigenvectors to determine the number of clusters automatically and to group the data accordingly. Our findings extend the intuitions underlying existing spectral techniques such as spectral clustering and Kernel Principal Components Analysis, and provide new insight into their usability and modes of failure. Simulation studies and experiments on real-world data are conducted to show the potential of our algorithm. In particular, DaSpec is found to handle unbalanced groups and recover clusters of different shapes better than the competing methods.
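
The abstract does not spell out the steps of DaSpec, so the following is only a rough illustrative sketch of the general idea it describes: build a kernel adjacency matrix with a fast-decaying radial (here Gaussian) kernel, pick eigenvectors by a property of the eigenvector itself instead of taking a fixed number of top ones, and group points by the selected eigenvectors. The selection rule used here (keep eigenvectors with no sign change up to a small tolerance), the tolerance `max|v|/n`, the `eig_floor` cutoff, and the function name `daspec_sketch` are all assumptions made for illustration and should not be read as the authors' exact procedure.

```python
import numpy as np

def daspec_sketch(X, bandwidth, eig_floor=1e-8):
    """Illustrative eigenvector-selection clustering (hypothetical helper).

    Returns (number of selected eigenvectors, label per sample).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n = X.shape[0]

    # Gaussian kernel matrix: a radial kernel with fast tail decay.
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-sq_dist / (2.0 * bandwidth ** 2))

    # Symmetric eigendecomposition, sorted by decreasing eigenvalue.
    vals, vecs = np.linalg.eigh(K)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]

    selected = []
    for j in range(n):
        # Assumption: skip numerically negligible eigenvalues.
        if vals[j] <= eig_floor * vals[0]:
            break
        v = vecs[:, j]
        eps = np.max(np.abs(v)) / n  # assumed tolerance
        # Assumed rule: keep eigenvectors with no sign change up to eps.
        if np.all(v > -eps) or np.all(v < eps):
            selected.append(j)

    # Assign each point to the selected eigenvector that is largest there.
    labels = np.argmax(np.abs(vecs[:, selected]), axis=1)
    return len(selected), labels

# Two well-separated, unbalanced 1-D groups (30 and 10 points).
X = np.concatenate([np.linspace(-1.0, 1.0, 30), np.linspace(19.0, 21.0, 10)])
k, labels = daspec_sketch(X, bandwidth=1.0)
```

In this toy setting the second-largest eigenvector of the kernel matrix belongs to the larger group and changes sign, so taking a fixed "top two" eigenvectors would pick a redundant one and miss the smaller group, which is the failure mode the abstract alludes to. The bandwidth is chosen by hand here; how to choose it in practice is not addressed by this sketch.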
