Pruning nearest neighbor cluster trees

Statistics – Machine Learning

Scientific paper

Nearest neighbor (k-NN) graphs are widely used in machine learning and data mining applications. Our aim is to better understand what they reveal about the cluster structure of the unknown underlying distribution of points, and whether spurious structures that arise from sampling variability can be identified. Our first contribution is a statistical analysis showing that certain subgraphs of a k-NN graph form a consistent estimator of the cluster tree of the underlying distribution. Our second and perhaps most important contribution is a finite sample guarantee: we carefully work out the tradeoff between aggressive and conservative pruning, and guarantee the removal of all spurious cluster structures at all levels of the tree while also guaranteeing the recovery of salient clusters. This is the first such finite sample result in the context of clustering.
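
The abstract does not spell out the construction, so the following is only a minimal sketch of one plausible reading, not the paper's algorithm. It assumes that one level of the tree is obtained by keeping the points whose distance to their k-th nearest neighbor is at most a radius r (a proxy for high density) and taking connected components of the k-NN graph restricted to those points. The pruning rule shown, merging components that become connected at the slightly larger radius r * (1 + slack), is likewise an illustrative assumption. All function names and the `slack` parameter are hypothetical.

```python
import numpy as np


def knn_indices_and_radii(X, k):
    """Return the k nearest neighbors of each point and its k-NN radius.

    Brute-force pairwise distances: fine for a sketch, O(n^2) memory.
    """
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]
    # Distance to the k-th nearest neighbor (last column of idx).
    radii = np.take_along_axis(dist, idx[:, -1:], axis=1).ravel()
    return idx, radii


def level_components(idx, radii, r):
    """Label connected components of the k-NN subgraph at radius level r.

    Assumption: a point belongs to the level if its k-NN radius is <= r.
    Points outside the level get label -1.
    """
    n = len(radii)
    keep = radii <= r
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        if not keep[i]:
            continue
        for j in idx[i]:
            j = int(j)
            if keep[j]:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    labels = np.full(n, -1)
    roots = {}
    for i in range(n):
        if keep[i]:
            labels[i] = roots.setdefault(find(i), len(roots))
    return labels


def pruned_components(idx, radii, r, slack):
    """Hypothetical pruning rule: merge components of level r that are
    connected at the coarser level r * (1 + slack)."""
    fine = level_components(idx, radii, r)
    coarse = level_components(idx, radii, r * (1.0 + slack))
    labels = np.full(len(radii), -1)
    remap = {}
    # Every point kept at level r is also kept at the coarser level,
    # so its coarse label is always defined.
    for i in np.flatnonzero(fine >= 0):
        labels[i] = remap.setdefault(int(coarse[i]), len(remap))
    return labels
```

Calling `level_components` over an increasing sequence of radii yields nested components, which is what gives the tree structure; a larger `slack` corresponds to more aggressive pruning in this sketch, a smaller one to more conservative pruning.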
