Fast Pseudo-Random Fingerprints

Computer Science – Data Structures and Algorithms

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

Scientific paper

We propose a method to exponentially speed up computation of various fingerprints, such as the ones used to compute similarity and rarity in massive data sets. Rather then maintaining the full stream of $b$ items of a universe $[u]$, such methods only maintain a concise fingerprint of the stream, and perform computations using the fingerprints. The computations are done approximately, and the required fingerprint size $k$ depends on the desired accuracy $\epsilon$ and confidence $\delta$. Our technique maintains a single bit per hash function, rather than a single integer, thus requiring a fingerprint of length $k = O(\frac{\ln \frac{1}{\delta}}{\epsilon^2})$ bits, rather than $O(\log u \cdot \frac{\ln \frac{1}{\delta}}{\epsilon^2})$ bits required by previous approaches. The main advantage of the fingerprints we propose is that rather than computing the fingerprint of a stream of $b$ items in time of $O(b \cdot k)$, we can compute it in time $O(b \log k)$. Thus this allows an exponential speedup for the fingerprint construction, or alternatively allows achieving a much higher accuracy while preserving computation time. Our methods rely on a specific family of pseudo-random hashes for which we can quickly locate hashes resulting in small values.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Fast Pseudo-Random Fingerprints does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Fast Pseudo-Random Fingerprints, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Fast Pseudo-Random Fingerprints will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-522939

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.