Exploring term-document matrices from matrix models in text mining

Computer Science – Information Retrieval

Scientific paper

Details

SIAM Text Mining Workshop, SIAM Conference Data Mining, 2006

We explore a matrix-space model that is a natural extension of the vector space model (VSM) for Information Retrieval. Each document can be represented by a matrix that is based on document extracts (e.g. sentences, paragraphs, sections). We focus on the performance of this model for the specific case in which documents are originally represented as term-by-sentence matrices. We use the singular value decomposition to approximate the term-by-sentence matrices and assemble these results to form the pseudo-"term-document" matrix that forms the basis of a text mining method alternative to traditional VSM and LSI. We investigate the singular values of this matrix and provide experimental evidence suggesting that the method can be particularly effective in terms of accuracy for text collections with multi-topic documents, such as web pages with news.
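The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the approximated term-by-sentence matrices are assembled, so the rule used here (summing the columns of each rank-k approximation to obtain one column per document) is an assumption made only for illustration, not the authors' method. The function names `rank_k_column` and `pseudo_term_document`, the parameter `k`, and the toy data are likewise hypothetical.

```python
import numpy as np

def rank_k_column(term_by_sentence, k=1):
    """Collapse one term-by-sentence matrix into a single document column.

    The matrix is replaced by its rank-k SVD approximation, and the
    approximated sentence columns are summed. The summation step is an
    assumption of this sketch; the abstract does not specify it.
    """
    U, s, Vt = np.linalg.svd(term_by_sentence, full_matrices=False)
    k = min(k, len(s))  # cannot keep more components than exist
    approx = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return approx.sum(axis=1)

def pseudo_term_document(doc_matrices, k=1):
    """Stack one collapsed column per document (shared term vocabulary assumed)."""
    return np.column_stack([rank_k_column(A, k) for A in doc_matrices])

# Toy data: two documents over the same 4 terms, with 3 and 2 sentences.
docs = [
    np.array([[2., 0., 1.],
              [1., 1., 0.],
              [0., 3., 0.],
              [0., 0., 1.]]),
    np.array([[0., 1.],
              [1., 0.],
              [2., 2.],
              [1., 0.]]),
]

X = pseudo_term_document(docs, k=1)

# Singular values of the assembled matrix, the quantity the abstract
# says is investigated.
singular_values = np.linalg.svd(X, compute_uv=False)
```

With `k` equal to the full rank of each term-by-sentence matrix, every column reduces to the plain sum of sentence vectors, which is an ordinary term-frequency column. Smaller `k` keeps only the dominant directions of each document, which is one plausible way the approximation could separate topics inside multi-topic documents.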
