Word Sense Disambiguation by Web Mining for Word Co-occurrence Probabilities

Computer Science – Computation and Language

Scientific paper

Details

Related work is available at http://purl.org/peter.turney/

This paper describes the National Research Council (NRC) Word Sense Disambiguation (WSD) system, as applied to the English Lexical Sample (ELS) task in Senseval-3. The NRC system approaches WSD as a classical supervised machine learning problem, using familiar tools such as the Weka machine learning software and Brill's rule-based part-of-speech tagger. Head words are represented as feature vectors with several hundred features. Approximately half of the features are syntactic and the other half are semantic. The main novelty in the system is the method for generating the semantic features, based on word co-occurrence probabilities. The probabilities are estimated using the Waterloo MultiText System with a corpus of about one terabyte of unlabeled text, collected by a web crawler.
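The abstract does not say how co-occurrence probabilities become semantic features, so the following is only an illustrative sketch under stated assumptions. It assumes pointwise mutual information (PMI) as the association score, and it assumes that each feature is the highest PMI between a hypothetical sense cue word and any word in the head word's context. The counts are invented toy numbers, not data from the paper, and the names `pmi`, `semantic_features`, `COUNTS`, and `PAIR_COUNTS` are hypothetical.

```python
import math


def pmi(count_xy: float, count_x: float, count_y: float, total: float) -> float:
    """Pointwise mutual information: log2( p(x, y) / (p(x) * p(y)) )."""
    if min(count_xy, count_x, count_y) <= 0 or total <= 0:
        raise ValueError("all counts must be positive")
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log2(p_xy / (p_x * p_y))


def semantic_features(context_words, sense_cues, counts, pair_counts, total):
    """Build one feature per cue word: the best PMI against any context word.

    Pairs that never co-occur are skipped; a cue with no co-occurring
    context word gets a feature value of 0.0 (an assumption of this sketch).
    """
    features = {}
    for cue in sense_cues:
        scores = []
        for word in context_words:
            joint = pair_counts.get((word, cue), 0)
            if joint > 0 and counts.get(word, 0) > 0 and counts.get(cue, 0) > 0:
                scores.append(pmi(joint, counts[word], counts[cue], total))
        features[cue] = max(scores) if scores else 0.0
    return features


# Toy counts (invented for illustration): occurrences in a corpus of
# TOTAL text windows, and joint occurrences of word pairs in one window.
TOTAL = 1000
COUNTS = {"deposit": 20, "money": 50, "river": 40}
PAIR_COUNTS = {("deposit", "money"): 10, ("deposit", "river"): 1}

FEATURES = semantic_features(
    context_words=["deposit"],
    sense_cues=["money", "river"],
    counts=COUNTS,
    pair_counts=PAIR_COUNTS,
    total=TOTAL,
)
```

With these toy counts, the context word "deposit" is more strongly associated with the cue "money" than with "river", so a supervised learner given `FEATURES` would see evidence for the financial sense of a head word such as "bank". In a setting like the one the abstract describes, the counts would come from queries against a large unlabeled corpus rather than from a hand-written dictionary.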
