Hashing Algorithms for Large-Scale Learning

Statistics – Machine Learning

Scientific paper


In this paper, we first demonstrate that b-bit minwise hashing, whose estimators are positive definite kernels, can be naturally integrated with learning algorithms such as SVM and logistic regression. We adopt a simple scheme to transform the nonlinear (resemblance) kernel into a linear (inner product) kernel, so that large-scale problems can be solved extremely efficiently. Our method provides a simple, effective solution to large-scale learning on massive and extremely high-dimensional datasets, especially when the data do not fit in memory. We then compare b-bit minwise hashing with the Vowpal Wabbit (VW) algorithm (which is related to the Count-Min (CM) sketch). Interestingly, VW has the same variances as random projections. Our theoretical and empirical comparisons illustrate that b-bit minwise hashing is usually significantly more accurate (at the same storage) than VW (and random projections) on binary data. Furthermore, b-bit minwise hashing can be combined with VW to achieve further improvements in training speed, especially when b is large.
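The scheme described above can be sketched in a few lines: hash each input set with several random hash functions, keep only the lowest b bits of each minimum, and one-hot encode those bits so that an ordinary inner product approximates the resemblance kernel. The sketch below is a simplified illustration, not the paper's implementation; the function names are hypothetical, and simple random linear hashes stand in for the random permutations analyzed in the paper.

```python
import random

def bbit_minwise_features(s, num_hashes=64, b=2, seed=0):
    """Map a set of integer tokens to a b-bit minwise feature vector.

    Each of `num_hashes` hash functions yields a minimum hash value over
    the set; keeping only its lowest b bits and one-hot encoding the
    result into a 2^b-dimensional block turns the (nonlinear) resemblance
    kernel into a plain inner product usable by linear SVM / logistic
    regression solvers.
    """
    rng = random.Random(seed)
    p = (1 << 31) - 1  # a Mersenne prime for the hash modulus
    # Random linear hashes (a*x + c) mod p: a simplified stand-in for
    # the random permutations used in minwise hashing.
    params = [(rng.randrange(1, p), rng.randrange(p))
              for _ in range(num_hashes)]
    dim = 1 << b                       # 2^b slots per hash function
    features = [0] * (num_hashes * dim)
    for j, (a, c) in enumerate(params):
        m = min((a * x + c) % p for x in s)
        features[j * dim + (m & (dim - 1))] = 1  # lowest b bits, one-hot
    return features

def linear_kernel(u, v):
    """Inner product of two feature vectors."""
    return sum(x * y for x, y in zip(u, v))
```

With this encoding, more similar sets collide in more blocks, so their inner product is larger; two identical sets agree in all `num_hashes` blocks, while disjoint sets agree only by chance (with probability about 2^-b per block).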

