Computer Science – Information Retrieval
Scientific paper
1999-01-12
Information Processing Letters 73 (2000), 47-51.
Computer Science
Information Retrieval
7 pages, LaTeX 2e
Scientific paper
We suggest that the curse of dimensionality affecting the similarity-based search in large datasets is a manifestation of the phenomenon of concentration of measure on high-dimensional structures. We prove that, under certain geometric assumptions on the query domain $\Omega$ and the dataset $X$, if $\Omega$ satisfies the so-called concentration property, then for most query points $x^\ast$ the ball of radius $(1+\e)d_X(x^\ast)$ centred at $x^\ast$ contains either all points of $X$ or else at least $C_1\exp(-C_2\e^2n)$ of them. Here $d_X(x^\ast)$ is the distance from $x^\ast$ to the nearest neighbour in $X$ and $n$ is the dimension of $\Omega$.
No associations
LandOfFree
On the geometry of similarity search: dimensionality curse and concentration of measure does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with On the geometry of similarity search: dimensionality curse and concentration of measure, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and On the geometry of similarity search: dimensionality curse and concentration of measure will most certainly appreciate the feedback.
Profile ID: LFWR-SCP-O-559638