Computer Science – Numerical Analysis
Scientific paper
2011-02-28
NIPS Workshop on Low-Rank Methods for Large-Scale Machine Learning, 2010
With the explosion in the size of digital datasets, the limiting factor for decomposition algorithms is the number of passes over the input, as the input is often stored out-of-core or even off-site. Moreover, we are only interested in algorithms that operate in constant memory with respect to the input size, so that arbitrarily large inputs can be processed. In this paper, we present a practical comparison of two such algorithms: a distributed method that operates in a single pass over the input vs. a streamed two-pass stochastic algorithm. The experiments track the effect of distributed computing, oversampling and memory trade-offs on the accuracy and performance of the two algorithms. To ensure meaningful results, we choose the input to be a real dataset, namely the whole of the English Wikipedia, in the application setting of Latent Semantic Analysis.
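To make the idea of a streamed two-pass stochastic decomposition concrete, here is a minimal sketch of a generic randomized range-finder approach in NumPy. It is not the implementation compared in the paper. The function name `two_pass_randomized_svd`, the `chunks` callable, and the default `oversample` value are assumptions chosen for illustration. The input is assumed to arrive as blocks of columns (for example, batches of documents in a term-document matrix), and memory use depends on the number of rows and the target rank, not on the number of columns streamed.

```python
import numpy as np


def two_pass_randomized_svd(chunks, n_rows, rank, oversample=10, seed=0):
    """Approximate the top `rank` left singular vectors and singular values.

    `chunks` is a hypothetical zero-argument callable that returns a fresh
    iterator over column blocks of the input matrix A, each of shape
    (n_rows, block_width). It is called exactly twice, once per pass.
    Assumes n_rows >= rank + oversample.
    """
    rng = np.random.default_rng(seed)
    k = rank + oversample  # oversampling widens the random sketch

    # Pass 1: accumulate Y = A @ Omega without ever holding A in memory.
    # The rows of the Gaussian matrix Omega are drawn block by block.
    Y = np.zeros((n_rows, k))
    for block in chunks():
        omega = rng.standard_normal((block.shape[1], k))
        Y += block @ omega

    # Orthonormal basis Q (n_rows x k) for the sketched range of A.
    Q, _ = np.linalg.qr(Y)

    # Pass 2: accumulate the small k x k matrix X = B @ B.T, with B = Q.T @ A.
    X = np.zeros((k, k))
    for block in chunks():
        b = Q.T @ block
        X += b @ b.T

    # Eigenvalues of X are the squared singular values of B.
    evals, evecs = np.linalg.eigh(X)
    order = np.argsort(evals)[::-1][:rank]
    s = np.sqrt(np.clip(evals[order], 0.0, None))
    U = Q @ evecs[:, order]
    return U, s
```

Only `Y`, `Q` and `X` are kept between blocks, so the sketch can process an arbitrarily long stream of columns. Larger `oversample` values trade extra memory and time for accuracy, which is the kind of trade-off the experiments examine.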
Fast and Faster: A Comparison of Two Streamed Matrix Decomposition Algorithms