Clustering Based on Pairwise Distances When the Data is of Mixed Dimensions

Statistics – Machine Learning

Scientific paper


In the context of clustering, we consider a generative model in a Euclidean ambient space with clusters of different shapes, dimensions, sizes and densities. In an asymptotic setting where the number of points becomes large, we obtain theoretical guarantees for a few emblematic methods based on pairwise distances: a simple algorithm based on the extraction of connected components in a neighborhood graph; the spectral clustering method of Ng, Jordan and Weiss; and hierarchical clustering with single linkage. The methods are shown to enjoy some near-optimal properties in terms of separation between clusters and robustness to outliers. The local scaling method of Zelnik-Manor and Perona is shown to lead to a near-optimal choice for the scale in the first two methods. We also provide a lower bound on the spectral gap to consistently choose the correct number of clusters in the spectral method.
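
To make the first of these methods concrete, here is a minimal sketch of clustering by connected components in a neighborhood graph. It is an illustration only, not the paper's procedure: the function name `neighborhood_graph_clusters`, the fixed radius `eps`, and the toy points are all assumptions, and the abstract does not say how the scale is chosen beyond pointing to local scaling. Two points are joined when their Euclidean distance is at most `eps`, and each connected component of the resulting graph is reported as one cluster.

```python
import math


def neighborhood_graph_clusters(points, eps):
    """Label each point by its connected component in the eps-neighborhood graph.

    `eps` is a hypothetical, user-chosen scale: two points are linked when
    their Euclidean distance is at most `eps`.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Only pairwise distances are used, as in the methods named in the abstract.
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= eps:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Relabel roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy data of mixed dimensions: three points along a line segment
# and three points forming a small two-dimensional patch.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.0, 5.1), (5.1, 5.0)]
labels = neighborhood_graph_clusters(points, eps=0.5)
print(labels)
```

The quadratic loop over pairs keeps the sketch short; a spatial index would be the usual choice for large samples. Cutting a single-linkage dendrogram at height `eps` yields the same partition, which is one way to see why the two methods are treated together.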

