Computer Science – Data Structures and Algorithms
Scientific paper
2009-12-01
Computer Science
Data Structures and Algorithms
9 pages + abstract and references
Scientific paper
We study lower bounds for Locality Sensitive Hashing (LSH) in the strongest setting: point sets in {0,1}^d under the Hamming distance. Recall that here H is said to be an (r, cr, p, q)-sensitive hash family if all pairs x, y in {0,1}^d with dist(x,y) at most r have probability at least p of collision under a randomly chosen h in H, whereas all pairs x, y in {0,1}^d with dist(x,y) at least cr have probability at most q of collision. Typically, one considers d tending to infinity, with c fixed and q bounded away from 0. For its applications to approximate nearest neighbor search in high dimensions, the quality of an LSH family H is governed by how small its "rho parameter" rho = ln(1/p)/ln(1/q) is as a function of the parameter c. The seminal paper of Indyk and Motwani showed that for each c, the extremely simple family H = {x -> x_i : i in d} achieves rho at most 1/c. The only known lower bound, due to Motwani, Naor, and Panigrahy, is that rho must be at least .46/c (minus o_d(1)). In this paper we show an optimal lower bound: rho must be at least 1/c (minus o_d(1)). This lower bound for Hamming space yields a lower bound of 1/c^2 for Euclidean space (or the unit sphere) and 1/c for the Jaccard distance on sets; both of these match known upper bounds. Our proof is simple; the essence is that the noise stability of a boolean function at e^{-t} is a log-convex function of t.
O'Donnell Ryan
Wu Yi
Zhou Yuan
No associations
LandOfFree
Optimal lower bounds for locality sensitive hashing (except when q is tiny) does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with Optimal lower bounds for locality sensitive hashing (except when q is tiny), we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Optimal lower bounds for locality sensitive hashing (except when q is tiny) will most certainly appreciate the feedback.
Profile ID: LFWR-SCP-O-507919