Text Modeling using Unsupervised Topic Models and Concept Hierarchies

Computer Science – Artificial Intelligence

Scientific paper

Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can potentially discover a broad range of themes in a data set, the interpretability of the learned topics is not always ideal. Human-defined concepts, on the other hand, tend to be semantically richer because the words that define them are carefully selected, but they tend not to cover the themes in a data set exhaustively. In this paper, we propose a probabilistic framework that combines a hierarchy of human-defined semantic concepts with statistical topic models to seek the best of both worlds. Experimental results using two different sources of concept hierarchies and two collections of text documents indicate that this combination leads to systematic improvements in the quality of the associated language models and enables new techniques for inferring and visualizing the semantics of a document.
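
The general idea of mixing data-driven topics with human-defined concepts can be illustrated with a toy example. The sketch below is not the paper's model: the vocabulary, the single topic, the "finance" concept, the uniform concept distribution, and the mixture weights are all hypothetical values chosen for illustration, and the concept hierarchy is flattened to a single word set. It only shows why adding a concept component can lower the perplexity of a document whose words the learned topic covers poorly.

```python
import math

# Hypothetical toy vocabulary; not taken from the paper.
VOCAB = ["gene", "protein", "cell", "market", "stock", "price"]

# A data-driven topic may place probability mass on any vocabulary word.
topics = {
    "topic_0": {"gene": 0.3, "protein": 0.3, "cell": 0.3,
                "market": 0.05, "stock": 0.03, "price": 0.02},
}

# A human-defined concept is a fixed set of words (assumed here, for simplicity,
# to be uniformly distributed over that set and zero elsewhere).
concepts = {
    "finance": {"market", "stock", "price"},
}


def concept_distribution(words):
    """Turn a concept's word set into a distribution over the vocabulary."""
    p = 1.0 / len(words)
    return {w: (p if w in words else 0.0) for w in VOCAB}


def word_probability(word, mixture, components):
    """p(word | doc) as a weighted sum over the document's components."""
    return sum(weight * components[name][word] for name, weight in mixture.items())


def perplexity(doc, mixture, components):
    """Per-word perplexity of a document under a given component mixture."""
    log_lik = sum(math.log(word_probability(w, mixture, components)) for w in doc)
    return math.exp(-log_lik / len(doc))


# Topics and concepts are treated alike once both are word distributions.
components = dict(topics)
components.update({name: concept_distribution(ws) for name, ws in concepts.items()})

doc = ["gene", "protein", "market", "price"]
topic_only = {"topic_0": 1.0}                  # assumed mixture weights
combined = {"topic_0": 0.5, "finance": 0.5}    # assumed mixture weights

print("topic only:", perplexity(doc, topic_only, components))
print("topic + concept:", perplexity(doc, combined, components))
```

In this toy setting the combined mixture yields the lower perplexity because the concept assigns substantial probability to "market" and "price", which the topic alone treats as rare. In a real system the mixture weights and topic distributions would be inferred from data rather than fixed by hand.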
