Named Entity Recognition Using Web Document Corpus

Computer Science – Information Retrieval

Scientific paper

Details

11 pages, 4 figures, 2 tables

10.5121/ijmit.2011.3104

This paper introduces an approach to named entity recognition in a textual corpus. A Named Entity (NE) can be a named location, person, organization, date, time, etc., characterized by its instances. An NE appears in text accompanied by contexts: the words to its left or right. The work mainly aims to identify the contexts that indicate the nature of the NE. For example, the occurrence of the word "President" in a text suggests that this word, or context, may be followed by the name of a president, as in "President Obama". Likewise, a word preceded by the string "footballer" is likely to be the name of a footballer. NE recognition may be viewed as a classification task in which every word is assigned to an NE class according to its context. The aim of this study is therefore to identify and classify the contexts most relevant to recognizing an NE, namely those that frequently occur with it. A learning approach is then proposed that uses a training corpus of web documents constructed from learning examples. Frequency representations and modified tf-idf representations are used to calculate context weights from context frequency, learning example frequency, and document frequency in the corpus.
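To make the three quantities named in the abstract concrete, the following is a minimal sketch, not the paper's method. It counts, for each word found immediately to the left of a known NE instance, its context frequency, the number of distinct learning examples it occurs with, and its document frequency. The weighting uses the textbook tf-idf formula as a stand-in, since the abstract does not state the paper's modified formula. The function names, the one-word left window, and the sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import defaultdict


def context_stats(documents, instances):
    """Count left-context statistics for known NE instances.

    Returns {context: (context_freq, example_freq, doc_freq)}.
    Assumption: a context is the single word immediately left of an instance.
    """
    freq = defaultdict(int)      # how often the context precedes any instance
    examples = defaultdict(set)  # distinct instances seen after the context
    docs = defaultdict(set)      # documents in which the context was seen
    for doc_id, text in enumerate(documents):
        tokens = re.findall(r"\w+", text)
        for i in range(1, len(tokens)):
            if tokens[i] in instances:
                ctx = tokens[i - 1].lower()
                freq[ctx] += 1
                examples[ctx].add(tokens[i])
                docs[ctx].add(doc_id)
    return {c: (freq[c], len(examples[c]), len(docs[c])) for c in freq}


def tfidf_weights(stats, n_docs):
    """Textbook tf-idf (an assumed stand-in for the paper's modified variant)."""
    return {c: f * math.log(n_docs / df) for c, (f, _, df) in stats.items()}


# Hypothetical toy corpus and learning examples.
documents = [
    "President Obama met President Sarkozy in Paris.",
    "The footballer Zidane greeted President Obama.",
    "Yesterday Obama spoke.",
]
instances = {"Obama", "Sarkozy", "Zidane"}

stats = context_stats(documents, instances)
weights = tfidf_weights(stats, len(documents))
for ctx in sorted(weights, key=weights.get, reverse=True):
    print(ctx, stats[ctx], round(weights[ctx], 3))
```

On this toy corpus, "president" ranks above "footballer" and "yesterday" because it precedes more instances. Note that textbook idf gives a weight of zero to a context that appears in every document, which is one plausible reason a modified weighting would be preferred for this task.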
