Learning Algorithms for Keyphrase Extraction

Computer Science – Learning

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

46 pages

Scientific paper

Many academic journals ask their authors to provide a list of about five to fifteen keywords, to appear on the first page of each article. Since these key words are often phrases of two or more words, we prefer to call them keyphrases. There is a wide variety of tasks for which keyphrases are useful, as we discuss in this paper. We approach the problem of automatically extracting keyphrases from text as a supervised learning task. We treat a document as a set of phrases, which the learning algorithm must learn to classify as positive or negative examples of keyphrases. Our first set of experiments applies the C4.5 decision tree induction algorithm to this learning task. We evaluate the performance of nine different configurations of C4.5. The second set of experiments applies the GenEx algorithm to the task. We developed the GenEx algorithm specifically for automatically extracting keyphrases from text. The experimental results support the claim that a custom-designed algorithm (GenEx), incorporating specialized procedural domain knowledge, can generate better keyphrases than a generalpurpose algorithm (C4.5). Subjective human evaluation of the keyphrases generated by Extractor suggests that about 80% of the keyphrases are acceptable to human readers. This level of performance should be satisfactory for a wide variety of applications.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Learning Algorithms for Keyphrase Extraction does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Learning Algorithms for Keyphrase Extraction, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Learning Algorithms for Keyphrase Extraction will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-74816

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.