How to do Statistics and Machine Learning on Very Large Survey Datasets

Statistics – Computation

Scientific paper

Details

Date: Jan 2009
Source: adsabs.harvard.edu/cgi-bin/nph-data_query?bibcode=2009aas...21321105g&link_type=abstract
Published in: American Astronomical Society, AAS Meeting #213, #211.05; Bulletin of the American Astronomical Society, Vol. 41, p.283
Category: Statistics – Computation
Type: Scientific paper
Abstract: I will describe algorithms and data structures that allow the most powerful machine learning methods, which often scale quadratically or even cubically with the number of data points, to run many orders of magnitude faster than naive implementations. Such techniques can make previously impossible statistical analyses tractable on the scale of entire sky surveys. I will discuss scalable algorithms we have developed for n-point correlations, friends-of-friends, nearest-neighbors, kernel density estimation, nonparametric Bayes classification, principal component analysis, local linear regression, isometric non-negative matrix factorization, hidden Markov models, k-means, support vector machine-like classifiers, Gaussian process regression, and Gaussian graphical model inference, among others. In addition to techniques inspired by computational geometry, fast multipole methods, and Monte Carlo integration, we employ a distributed framework which can be thought of as a higher-order version of Google's MapReduce. Our algorithms have enabled several first-of-a-kind large-scale analyses by our collaborators in astrophysics as well as other fields.
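
To make the idea of replacing a quadratic computation with a tree-based one concrete, the following is a minimal sketch of pair counting within a radius r, the basic building block of a two-point correlation estimate. It is an illustration only, not the author's implementation: it uses a simplified single-tree kd-tree traversal, and all names (Node, count_within, pair_count_tree, pair_count_naive) and the leaf_size parameter are assumptions introduced here. The speed-up comes from two bounding-box tests: a subtree whose box lies entirely outside the radius is skipped, and a subtree whose box lies entirely inside it is counted in one step without visiting its points.

```python
import random


class Node:
    """kd-tree node holding a bounding box and the number of points beneath it."""

    def __init__(self, points, leaf_size=8):
        dim = len(points[0])
        self.count = len(points)
        self.lo = [min(p[d] for p in points) for d in range(dim)]
        self.hi = [max(p[d] for p in points) for d in range(dim)]
        self.points = None      # set only on leaves
        self.children = ()
        widths = [h - l for l, h in zip(self.lo, self.hi)]
        axis = widths.index(max(widths))   # split along the widest dimension
        if len(points) <= leaf_size or widths[axis] == 0.0:
            self.points = points
        else:
            ordered = sorted(points, key=lambda p: p[axis])
            mid = len(ordered) // 2
            self.children = (Node(ordered[:mid], leaf_size),
                             Node(ordered[mid:], leaf_size))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def min_sq_dist(q, node):
    """Squared distance from q to the closest location in the node's box."""
    return sum(max(l - x, 0.0, x - h) ** 2
               for x, l, h in zip(q, node.lo, node.hi))


def max_sq_dist(q, node):
    """Squared distance from q to the farthest corner of the node's box."""
    return sum(max(abs(x - l), abs(x - h)) ** 2
               for x, l, h in zip(q, node.lo, node.hi))


def count_within(q, node, r2):
    """Count points under node whose squared distance to q is at most r2."""
    if min_sq_dist(q, node) > r2:
        return 0                # exclusion: the whole box is too far away
    if max_sq_dist(q, node) <= r2:
        return node.count       # inclusion: the whole box is close enough
    if node.points is not None:
        return sum(1 for p in node.points if sq_dist(q, p) <= r2)
    return sum(count_within(q, child, r2) for child in node.children)


def pair_count_tree(points, r):
    """Number of unordered pairs at distance <= r, using the kd-tree."""
    root = Node(points)
    total = sum(count_within(q, root, r * r) for q in points)
    # Each point matched itself once and each pair was seen from both ends.
    return (total - len(points)) // 2


def pair_count_naive(points, r):
    """Reference O(n^2) pair count, used here only as a cross-check."""
    r2 = r * r
    n = len(points)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if sq_dist(points[i], points[j]) <= r2)


if __name__ == "__main__":
    rng = random.Random(0)
    cloud = [(rng.random(), rng.random()) for _ in range(500)]
    print(pair_count_tree(cloud, 0.1), pair_count_naive(cloud, 0.1))
```

Both functions return the same count; the tree version simply avoids most of the individual distance evaluations when r is small or large relative to the extent of the data. A dual-tree variant, which traverses a tree over the query points as well, extends the same pruning idea but is not shown here.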

Affiliated with

Gray Alexander

Astronomy and Astrophysics – Astrophysics

Scientist
