Stochastic Dimensionality Reduction for K-means Clustering

Computer Science – Data Structures and Algorithms

Scientific paper

19 pages

We study dimensionality reduction methods for k-means clustering. Dimensionality reduction encompasses two approaches: feature selection and feature extraction. Feature selection picks a small subset of the actual features of the data and then runs the clustering algorithm only on the selected features. Feature extraction constructs a small set of new artificial features and then runs the clustering algorithm only on the constructed features. Despite the significance of the problem and the wealth of heuristic methods addressing it, no provably accurate feature selection methods exist. In contrast, two provably accurate feature extraction methods for k-means are known: the first is randomized and based on Random Projections; the second is deterministic and based on the Singular Value Decomposition. This paper addresses the lack of provable guarantees for feature selection by presenting the first provably accurate feature selection method for k-means clustering. We also present two novel feature extraction methods: the first is based on Random Projections and improves the existing result in terms of both speed and the number of features that need to be extracted; the second is based on fast approximate SVD factorizations and improves the existing result in terms of speed. All three of our methods are randomized and, with constant probability, provide constant-factor approximation guarantees with respect to the optimal k-means objective value.
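To make the feature extraction idea concrete, the following is a minimal sketch of clustering after a random projection. It is not the paper's algorithm: the rescaled random sign matrix, the plain Lloyd iterations, and every name and parameter here (`random_sign_projection`, `lloyd_kmeans`, `kmeans_cost`, the synthetic data, the target dimension `r = 20`) are assumptions chosen for illustration. The point it shows is the workflow described above: cluster on the constructed features, then evaluate the resulting partition with the k-means objective in the original feature space.

```python
import numpy as np


def random_sign_projection(X, r, rng):
    """Map the rows of X (n x d) to r dimensions with a rescaled random +/-1 matrix.

    Illustrative choice of projection; the paper's construction may differ.
    """
    d = X.shape[1]
    R = rng.choice([-1.0, 1.0], size=(d, r)) / np.sqrt(r)
    return X @ R


def lloyd_kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations (a heuristic); returns one cluster label per row of X."""
    # Fancy indexing returns a copy, so updating centers does not modify X.
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels


def kmeans_cost(X, labels, k):
    """k-means objective of a partition: squared distances to cluster means in X's space."""
    cost = 0.0
    for j in range(k):
        members = X[labels == j]
        if len(members) > 0:
            cost += ((members - members.mean(axis=0)) ** 2).sum()
    return cost


# Synthetic data (an assumption for the demo): 3 Gaussian clusters in 200 dimensions.
rng = np.random.default_rng(0)
k, d, r = 3, 200, 20
means = rng.normal(scale=10.0, size=(k, d))
X = np.vstack([m + rng.normal(size=(100, d)) for m in means])

# Cluster once on all features and once on the r constructed features.
labels_full = lloyd_kmeans(X, k, rng)
X_low = random_sign_projection(X, r, rng)
labels_low = lloyd_kmeans(X_low, k, rng)

# Both partitions are scored in the ORIGINAL space, so the costs are comparable.
cost_full = kmeans_cost(X, labels_full, k)
cost_low = kmeans_cost(X, labels_low, k)
print(f"cost on full features: {cost_full:.1f}, cost via projection: {cost_low:.1f}")
```

Scoring both partitions on the original data is the design choice that matters here: an approximation guarantee of the kind stated in the abstract compares the objective value of the partition found in the reduced space against the optimal objective in the original space. Lloyd's iterations carry no such guarantee by themselves, so the printed ratio is only indicative.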
