TopSig: Topology Preserving Document Signatures

Computer Science – Information Retrieval

Scientific paper

Details

12 pages, 8 figures, CIKM 2011

Performance comparisons between file signatures and inverted files for text retrieval have previously shown several significant shortcomings of file signatures relative to inverted files. The inverted file approach underpins most state-of-the-art search engine algorithms, such as language models and probabilistic models. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures. Many advances in semantic hashing and dimensionality reduction have been made in recent times, but these have not so far been linked to general-purpose, signature-file-based search engines. This paper introduces a different signature file approach that builds upon and extends these recent advances. We are able to demonstrate significant improvements in the performance of signature-file-based indexing and retrieval, performance that is comparable to that of state-of-the-art inverted-file-based systems, including language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings, and from a theoretical perspective they position the file signatures model in the class of vector space retrieval models.
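To make the idea of a similarity-preserving document signature concrete, the following is a minimal sketch, not the paper's implementation. It assumes one common semantic-hashing construction: each term is mapped to a pseudo-random +1/-1 vector, the vectors of a document's terms are summed, and only the sign of each component is kept as a bit. Signatures are then compared by Hamming distance. The names `SIG_BITS`, `term_vector`, `signature`, and `hamming`, the 64-bit width, and the whitespace tokenizer are all illustrative assumptions; the abstract does not specify any of them.

```python
import hashlib
import random

SIG_BITS = 64  # hypothetical signature width, chosen only for illustration


def term_vector(term, bits=SIG_BITS):
    """Return a deterministic pseudo-random +1/-1 vector for a term."""
    # Seed a private generator from a hash of the term so that the same
    # term always maps to the same vector.
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [1 if rng.random() < 0.5 else -1 for _ in range(bits)]


def signature(text, bits=SIG_BITS):
    """Sum the term vectors of all tokens and keep the sign of each component."""
    totals = [0] * bits
    for term in text.lower().split():  # assumption: naive whitespace tokenizer
        for i, value in enumerate(term_vector(term, bits)):
            totals[i] += value
    sig = 0
    for i, total in enumerate(totals):
        if total > 0:  # positive component -> bit set; zero or negative -> bit clear
            sig |= 1 << i
    return sig


def hamming(a, b):
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    query = signature("signature file retrieval")
    doc = signature("retrieval with file signatures and inverted files")
    print(hamming(query, doc))
```

Under these assumptions, documents that share many terms tend to agree on more signature bits, so ranking by ascending Hamming distance to the query signature acts as an approximate vector space ranking. A real system would add term weighting and a much wider signature.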
