Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy

Computer Science – Computation and Language

Scientific paper

Details

15 pages, Postscript only

This paper presents a new approach for measuring semantic similarity/distance between words and concepts. It combines a lexical taxonomy structure with corpus statistical information so that the semantic distance between nodes in the semantic space constructed by the taxonomy can be better quantified with the computational evidence derived from a distributional analysis of corpus data. Specifically, the proposed measure is a combined approach that inherits the edge-based approach of the edge counting scheme, which is then enhanced by the node-based approach of the information content calculation. When tested on a common data set of word pair similarity ratings, the proposed approach outperforms other computational models. It gives the highest correlation value (r = 0.828) with a benchmark based on human similarity judgements, whereas an upper bound (r = 0.885) is observed when human subjects replicate the same task.
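
The abstract does not state the formula itself. As a rough illustration of how an edge-counting scheme can be weighted by information content, the sketch below uses the simplified form commonly attributed to this measure: each edge is weighted by the difference in information content between child and parent, so the distance between two concepts reduces to IC(c1) + IC(c2) - 2 * IC(lcs), where lcs is their lowest common subsumer. The taxonomy, the corpus counts, and all function names are hypothetical and chosen only for the example; they are not taken from the paper.

```python
import math

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "animal": "entity",
    "dog": "animal",
}

# Hypothetical corpus counts of words mapped directly to each concept.
COUNTS = {"entity": 1, "vehicle": 4, "car": 20, "bicycle": 5, "animal": 5, "dog": 10}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def cumulative_counts():
    """A concept's frequency includes the counts of everything it subsumes."""
    totals = {c: 0 for c in PARENT}
    for concept, n in COUNTS.items():
        for a in ancestors(concept):
            totals[a] += n
    return totals


TOTALS = cumulative_counts()
N = TOTALS["entity"]  # the root subsumes every counted occurrence


def information_content(concept):
    """Node-based part: IC(c) = -log p(c), with p(c) estimated from counts."""
    return -math.log(TOTALS[concept] / N)


def lowest_common_subsumer(c1, c2):
    """Edge-based part: the first shared ancestor on the way to the root."""
    seen = set(ancestors(c1))
    for a in ancestors(c2):
        if a in seen:
            return a
    raise ValueError("concepts do not share a root")


def jc_distance(c1, c2):
    """Simplified combined distance (an assumption, see lead-in)."""
    lcs = lowest_common_subsumer(c1, c2)
    return (information_content(c1) + information_content(c2)
            - 2 * information_content(lcs))


if __name__ == "__main__":
    for a, b in [("car", "bicycle"), ("car", "dog"), ("car", "car")]:
        print(a, b, round(jc_distance(a, b), 3))
```

With these toy counts, "car" and "bicycle" come out closer than "car" and "dog", because their common subsumer "vehicle" carries more information than the root. The full measure described in the paper may include further weighting factors that this sketch leaves out.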
