Statistics – Machine Learning
Scientific paper
2012-04-11
In information retrieval, a fundamental goal is to transform a document into concepts that are representative of its content. The term "representative" is in itself challenging to define, and various tasks require different granularities of concepts. In this paper, we aim to model concepts that are sparse over the vocabulary, and that flexibly adapt their content based on other relevant semantic information such as textual structure or associated image features. We explore a Bayesian nonparametric model based on nested beta processes that allows for inferring an unknown number of strictly sparse concepts. The resulting model provides an inherently different representation of concepts than a standard LDA (or HDP) based topic model, and allows for direct incorporation of semantic features. We demonstrate the utility of this representation on multilingual blog data and the Congressional Record.
El-Arini, Khalid
Fox, Emily B.
Guestrin, Carlos
Title: Concept Modeling with Superwords