Iterative Residual Rescaling: An Analysis and Generalization of LSI

Computer Science – Computation and Language

Scientific paper

Details

To appear in the proceedings of SIGIR 2001. 11 pages

We consider the problem of creating document representations in which inter-document similarity measurements correspond to semantic similarity. We first present a novel subspace-based framework for formalizing this task. Using this framework, we derive a new analysis of Latent Semantic Indexing (LSI), showing a precise relationship between its performance and the uniformity of the underlying distribution of documents over topics. This analysis helps explain the improvements gained by Ando's (2000) Iterative Residual Rescaling (IRR) algorithm: IRR can compensate for distributional non-uniformity. A further benefit of our framework is that it provides a well-motivated, effective method for automatically determining the rescaling factor IRR depends on, leading to further improvements. A series of experiments over various settings and with several evaluation metrics validates our claims.
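
The abstract names the rescaling idea without giving the procedure, so the following is only a minimal sketch of how an iterative residual rescaling loop might look, not the authors' implementation. It assumes that each document is a term-weight vector, that every residual is scaled by its own length raised to a power q before the dominant direction is extracted, and that the extracted direction is then projected out of all residuals. The names irr_basis, leading_direction, k and q are hypothetical, and power iteration stands in for a proper eigensolver. Under these assumptions, q = 0 applies no rescaling and yields the leading left singular vectors used by LSI.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length of a vector."""
    return math.sqrt(dot(u, u))


def leading_direction(vectors, iters=500):
    """Power iteration for the top eigenvector of sum_j v_j v_j^T.

    Assumption: the uniform start vector is not orthogonal to the
    dominant direction; a robust version would use a real eigensolver.
    """
    dim = len(vectors[0])
    b = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Apply sum_j v_j v_j^T to b without forming the matrix.
        nxt = [0.0] * dim
        for v in vectors:
            c = dot(v, b)
            for i in range(dim):
                nxt[i] += c * v[i]
        n = norm(nxt)
        if n == 0.0:
            break  # all vectors are zero; keep the current b
        b = [x / n for x in nxt]
    return b


def irr_basis(docs, k, q):
    """Return k orthonormal basis vectors (hypothetical IRR sketch).

    docs: list of document vectors, one entry per term.
    k:    number of basis vectors to extract.
    q:    rescaling exponent (assumed non-negative); q = 0 means
          no rescaling.
    """
    residuals = [list(d) for d in docs]
    basis = []
    for _ in range(k):
        # Rescale each residual by its own length to the power q, so
        # poorly represented documents weigh more in the next step.
        scaled = [[(norm(r) ** q) * x for x in r] for r in residuals]
        b = leading_direction(scaled)
        basis.append(b)
        # Remove the component along b from every (unscaled) residual.
        residuals = [
            [x - dot(r, b) * bi for x, bi in zip(r, b)] for r in residuals
        ]
    return basis


# Toy collection: four documents on one topic, one on another.
docs = [[2.0, 0.0, 0.0]] * 4 + [[0.0, 1.0, 0.0]]
lsi_like = irr_basis(docs, k=2, q=0)   # no rescaling
rescaled = irr_basis(docs, k=2, q=2)   # residuals weighted by length squared
```

Documents can then be represented by their inner products with the returned basis vectors. How q should be chosen is the question the paper addresses; the sketch simply takes it as an input.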
