Indexing the Earth Mover's Distance Using Normal Distributions

Computer Science – Databases

Scientific paper

Details

VLDB2012

Querying uncertain data sets (represented as probability distributions) presents many challenges due to the large amount of data involved and the difficulty of comparing uncertainty between distributions. The Earth Mover's Distance (EMD) has increasingly been employed to compare uncertain data because it effectively captures the differences between two distributions. Computing the EMD entails solving the transportation problem, which is computationally intensive. In this paper, we propose a new lower bound to the EMD and an index structure that significantly improve the performance of EMD-based K-nearest neighbor (K-NN) queries on uncertain databases. The lower bound approximates the EMD on a projection vector: each distribution is projected onto a vector and approximated by a normal distribution together with an accompanying error term. We then represent each normal as a point in a Hough-transformed space and use the concept of stochastic dominance to implement an efficient index structure in that space. We show that our method significantly decreases K-NN query time on uncertain databases. The index structure also scales well with database cardinality and is well suited to heterogeneous data sets, helping to keep EMD-based queries tractable as uncertain data sets become larger and more complex.
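The projection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's lower bound: it relies only on two standard facts, that projecting onto a unit vector cannot increase the EMD when the ground distance is Euclidean, and that the gap between two means never exceeds the one-dimensional EMD. The function names (`project`, `normal_fit`, `emd_1d`) and the sample data are hypothetical, and the paper's error term, Hough transform, and dominance-based index are not reproduced here.

```python
import math


def project(points, weights, direction):
    """Project weighted points onto `direction` (normalised to unit length)."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    total = sum(weights)
    values = [sum(p_i * u_i for p_i, u_i in zip(p, unit)) for p in points]
    probs = [w / total for w in weights]  # normalise mass to 1
    return values, probs


def normal_fit(values, probs):
    """Mean and standard deviation of the projected distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, math.sqrt(var)


def emd_1d(values_a, probs_a, values_b, probs_b):
    """Exact 1-D EMD: the integral of |CDF_a(x) - CDF_b(x)| over x."""
    events = sorted(
        [(v, p) for v, p in zip(values_a, probs_a)]
        + [(v, -p) for v, p in zip(values_b, probs_b)]
    )
    total, cdf_gap, prev = 0.0, 0.0, events[0][0]
    for x, delta in events:
        total += abs(cdf_gap) * (x - prev)  # area accumulated since last event
        cdf_gap += delta                    # update CDF_a - CDF_b at x
        prev = x
    return total


# Two hypothetical 2-D distributions, each with two equally weighted points.
a_vals, a_probs = project([(0.0, 0.0), (1.0, 0.0)], [0.5, 0.5], (1.0, 0.0))
b_vals, b_probs = project([(2.0, 0.0), (3.0, 0.0)], [0.5, 0.5], (1.0, 0.0))

mean_a, std_a = normal_fit(a_vals, a_probs)
mean_b, std_b = normal_fit(b_vals, b_probs)

projected_emd = emd_1d(a_vals, a_probs, b_vals, b_probs)
mean_gap = abs(mean_a - mean_b)  # cheap bound: mean_gap <= projected_emd
```

In a filter-and-refine K-NN search, a cheap quantity such as `mean_gap` could be used to discard candidates before the full transportation problem is solved; how the paper tightens this with the normal approximation and its error term is not specified in the abstract.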
