Using Variational Inference and MapReduce to Scale Topic Modeling

Computer Science – Artificial Intelligence

Scientific paper

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique for exploring document collections. As large datasets become more prevalent, inference for LDA needs to scale better. In this paper, we propose a technique called MapReduce LDA (Mr. LDA) that accommodates very large corpora in the MapReduce framework. In contrast to other techniques for scaling LDA inference, which use Gibbs sampling, we use variational inference. Our solution distributes computation efficiently and is relatively simple to implement. More importantly, this variational implementation, unlike highly tuned and specialized implementations, is easily extensible. We demonstrate two extensions of the model made possible by this scalable framework: informed priors to guide topic discovery, and modeling topics from a multilingual corpus.
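To make the idea of variational inference under MapReduce concrete, here is a minimal single-machine sketch in Python. It is not the paper's implementation: it assumes the standard mean-field updates for LDA, a symmetric scalar prior `alpha`, and a smoothing constant `eta`, and the function names (`map_document`, `reduce_counts`, `update_beta`) are hypothetical. Each mapper fits the variational parameters of one document and emits expected topic-word counts; the reducer sums them; a driver step renormalizes them into new topic distributions.

```python
import math
from collections import defaultdict


def digamma(x):
    """Digamma for x > 0: recurrence up to x >= 6, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 * inv - series


def map_document(doc, beta, alpha, iters=20):
    """Mapper (assumed form): per-document variational E-step.

    doc   -- dict mapping word id -> count
    beta  -- list of K rows, each a list of V positive word probabilities
    alpha -- symmetric Dirichlet prior on the document's topic proportions
    Yields ((topic, word), expected count) pairs.
    """
    K = len(beta)
    total = sum(doc.values())
    gamma = [alpha + total / K] * K
    phi = {}
    for _ in range(iters):
        exp_elog = [math.exp(digamma(g)) for g in gamma]
        new_gamma = [alpha] * K
        for w, n in doc.items():
            # phi[w][k] is proportional to beta[k][w] * exp(digamma(gamma[k]))
            weights = [beta[k][w] * exp_elog[k] for k in range(K)]
            norm = sum(weights)
            phi[w] = [x / norm for x in weights]
            for k in range(K):
                new_gamma[k] += n * phi[w][k]
        gamma = new_gamma
    for w, n in doc.items():
        for k in range(K):
            yield (k, w), n * phi[w][k]


def reduce_counts(pairs):
    """Reducer: sum the expected counts emitted for each (topic, word) key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return totals


def update_beta(totals, K, V, eta=0.01):
    """Driver step: smooth with eta and renormalize each topic's word counts."""
    rows = [[totals.get((k, v), 0.0) + eta for v in range(V)] for k in range(K)]
    return [[x / sum(row) for x in row] for row in rows]


# Toy usage: two documents, two topics, a four-word vocabulary.
docs = [{0: 2, 1: 1}, {2: 3, 3: 1}]
beta = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
pairs = [p for d in docs for p in map_document(d, beta, alpha=0.1)]
totals = reduce_counts(pairs)
new_beta = update_beta(totals, K=2, V=4)
```

In this sketch the per-document work is independent, which is what allows it to run in parallel mappers, while only the summed topic-word statistics need to pass through the reducer. Repeating the map, reduce, and update steps until `new_beta` stabilizes would give one plausible outer loop.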
