A New Approach to Speeding Up Topic Modeling

Computer Science – Learning

Scientific paper

Details

12 pages, 11 figures

Latent Dirichlet allocation (LDA) is a widely used probabilistic topic modeling paradigm that has recently found many applications in computer vision and computational biology. This paper proposes a fast and accurate algorithm, active belief propagation (ABP), for training LDA. Training LDA usually requires repeatedly scanning the entire corpus and searching the complete topic space. For a massive corpus with a large number of topics, each such training iteration is inefficient and time-consuming. To speed up training, ABP actively scans only part of the corpus and searches only part of the topic space in each iteration, which saves a large amount of training time per iteration. To preserve accuracy, ABP selects only those documents and topics that contribute the largest residuals within the residual belief propagation (RBP) framework. On four real-world corpora, ABP runs around 10 to 100 times faster than some of the major state-of-the-art algorithms for training LDA, while retaining comparable topic modeling accuracy.
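
The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_active_subset`, the `residuals` table, and the two fraction parameters are hypothetical, and the sketch assumes that a nonnegative residual is already available for every document-topic pair. It only shows how documents and topics with the largest accumulated residuals might be picked for the next iteration.

```python
from typing import List, Tuple


def select_active_subset(
    residuals: List[List[float]],
    doc_fraction: float,
    topic_fraction: float,
) -> Tuple[List[int], List[int]]:
    """Return indices of the documents and topics with the largest residuals.

    residuals[d][k] is an assumed nonnegative residual for document d and
    topic k (hypothetical input; how residuals are computed is not shown).
    """
    n_docs = len(residuals)
    n_topics = len(residuals[0])

    # Accumulate residuals per document (row sums) and per topic (column sums).
    doc_scores = [sum(row) for row in residuals]
    topic_scores = [
        sum(residuals[d][k] for d in range(n_docs)) for k in range(n_topics)
    ]

    # Keep at least one document and one topic active.
    n_active_docs = max(1, int(doc_fraction * n_docs))
    n_active_topics = max(1, int(topic_fraction * n_topics))

    # Rank by accumulated residual, largest first, and keep the top entries.
    top_docs = sorted(range(n_docs), key=lambda d: doc_scores[d], reverse=True)
    top_topics = sorted(
        range(n_topics), key=lambda k: topic_scores[k], reverse=True
    )
    return sorted(top_docs[:n_active_docs]), sorted(top_topics[:n_active_topics])


if __name__ == "__main__":
    # Toy residual table: 4 documents by 4 topics (made-up numbers).
    toy_residuals = [
        [0.9, 0.1, 0.0, 0.0],
        [0.0, 0.05, 0.0, 0.05],
        [0.4, 0.3, 0.1, 0.0],
        [0.0, 0.0, 0.1, 0.0],
    ]
    docs, topics = select_active_subset(toy_residuals, 0.5, 0.5)
    print("active documents:", docs)
    print("active topics:", topics)
```

In this sketch, only the returned documents and topics would be updated in the next training iteration, so the per-iteration cost scales with the chosen fractions rather than with the full corpus and topic space.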

Profile ID: LFWR-SCP-O-552357
