On the geometry of similarity search: dimensionality curse and concentration of measure

Computer Science – Information Retrieval

Scientific paper

Details

7 pages, LaTeX 2e

We suggest that the curse of dimensionality affecting the similarity-based search in large datasets is a manifestation of the phenomenon of concentration of measure on high-dimensional structures. We prove that, under certain geometric assumptions on the query domain $\Omega$ and the dataset $X$, if $\Omega$ satisfies the so-called concentration property, then for most query points $x^\ast$ the ball of radius $(1+\varepsilon)d_X(x^\ast)$ centred at $x^\ast$ contains either all points of $X$ or else at least $C_1\exp(-C_2\varepsilon^2n)$ of them. Here $d_X(x^\ast)$ is the distance from $x^\ast$ to the nearest neighbour in $X$ and $n$ is the dimension of $\Omega$.
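
The effect described in the abstract can be observed numerically. The following is a minimal sketch, not the paper's construction: it assumes a hypothetical setting in which both the dataset $X$ and the queries are drawn uniformly from the unit cube $[0,1]^n$ with the Euclidean distance, and it estimates the average fraction of $X$ that falls inside the ball of radius $(1+\varepsilon)d_X(x^\ast)$. The function name `fraction_within` and all parameter values are illustrative choices.

```python
import math
import random


def fraction_within(n_dim, n_points=500, eps=0.5, n_queries=20, seed=0):
    """Average fraction of dataset points inside the ball of radius
    (1 + eps) * d_X(q), where d_X(q) is the nearest-neighbour distance
    from the query q to the dataset."""
    rng = random.Random(seed)
    # Assumption: the dataset X is sampled uniformly from the unit cube.
    data = [[rng.random() for _ in range(n_dim)] for _ in range(n_points)]
    total = 0.0
    for _ in range(n_queries):
        # Assumption: queries follow the same uniform distribution.
        query = [rng.random() for _ in range(n_dim)]
        dists = [math.dist(query, x) for x in data]
        radius = (1.0 + eps) * min(dists)
        # Count the points captured by the enlarged ball.
        total += sum(d <= radius for d in dists) / n_points
    return total / n_queries


if __name__ == "__main__":
    for n_dim in (2, 10, 100, 500):
        print(n_dim, round(fraction_within(n_dim), 3))
```

Under these assumptions, the enlarged ball should capture only a few points in low dimension and nearly the whole dataset in high dimension, because the distances from a query to the data points cluster tightly around a common value as $n$ grows.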
