Thematic Annotation: extracting concepts out of documents

Computer Science – Computation and Language

Scientific paper


Date: 2004-12-30
arXiv: arxiv.org/abs/cs/0412117v1
Field: Computer Science
Subject: Computation and Language

Comments: Technical report EPFL/LIA. 81 pages, 16 figures
Type: Scientific paper
Abstract: Contrary to standard approaches to topic annotation, the technique used in this work does not centrally rely on some sort of -- possibly statistical -- keyword extraction. Instead, the proposed annotation algorithm uses a large-scale semantic database -- the EDR Electronic Dictionary -- that provides a concept hierarchy based on hyponym and hypernym relations. This concept hierarchy is used to generate a synthetic representation of the document by aggregating the words present in topically homogeneous document segments into a set of concepts that best preserves the document's content. This new extraction technique uses a previously unexplored approach to topic selection. Instead of using semantic similarity measures based on a semantic resource, the latter is processed to extract the part of the conceptual hierarchy relevant to the document content. This conceptual hierarchy is then searched to extract the most relevant set of concepts to represent the topics discussed in the document. Note that this algorithm is able to extract generic concepts that are not directly present in the document.
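
The abstract gives no algorithmic detail, so the following Python sketch is only a toy illustration of the general idea, not the authors' method. It assumes a tiny hand-written hypernym table (`HYPERNYM`, hypothetical and tree-shaped, unlike the EDR dictionary), projects the words of one segment onto it to obtain the relevant part of the hierarchy, and then keeps the most specific concepts that cover at least `min_cover` word occurrences. The coverage threshold and all function names are assumptions introduced here for illustration.

```python
from collections import Counter

# Hypothetical toy hypernym table (child -> parent).
# This is NOT the EDR Electronic Dictionary, only a stand-in.
HYPERNYM = {
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
    "mammal": "animal",
    "bird": "animal",
    "animal": "entity",
    "car": "vehicle",
    "vehicle": "entity",
}


def ancestors(word):
    """Return the chain from `word` up to the root, `word` included."""
    chain = [word]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def relevant_subhierarchy(words):
    """Count how many word occurrences each concept above the segment covers."""
    coverage = Counter()
    for w in words:
        if w in HYPERNYM:  # skip words unknown to the toy table
            for concept in ancestors(w):
                coverage[concept] += 1
    return coverage


def select_concepts(words, min_cover=2):
    """Keep the most specific concepts covering at least `min_cover` occurrences.

    Assumed selection rule, chosen for illustration only.
    """
    coverage = relevant_subhierarchy(words)
    candidates = {c for c, n in coverage.items() if n >= min_cover}
    # A candidate is dropped when one of its children is also a candidate.
    parents = {HYPERNYM[c] for c in candidates if c in HYPERNYM}
    return sorted(candidates - parents)


segment = ["dog", "cat", "sparrow", "car"]
print(select_concepts(segment))               # ['mammal']
print(select_concepts(segment, min_cover=3))  # ['animal']
```

In this toy run the selected concepts ("mammal", "animal") never occur in the segment itself, which mirrors the abstract's remark that generic concepts absent from the document can be extracted.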

Affiliated with

Andrews Pierre

Computer Science – Computation and Language

Scientist


Rajman Martin

Computer Science – Computation and Language

Scientist

