Text Segmentation Using Exponential Models

Computer Science – Computation and Language

Scientific paper


Details

12 pages, LaTeX source and postscript figures for EMNLP-2 paper


This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists both short-range and long-range language models to help it sniff out likely sites of topic changes in text. To aid its search, the system consults a set of simple lexical hints it has learned to associate with the presence of boundaries through inspection of a large corpus of annotated data. We also propose a new probabilistically motivated error metric for use by the natural language processing and information retrieval communities, intended to supersede precision and recall for appraising segmentation algorithms. Qualitative assessment of our algorithm as well as evaluation using this new metric demonstrate the effectiveness of our approach in two very different domains, Wall Street Journal articles and the TDT Corpus, a collection of newswire articles and broadcast news transcripts.
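
The abstract does not define the proposed error metric, so the following is only an illustrative sketch of the general idea of a probabilistic segmentation measure: take pairs of positions a fixed distance apart and ask how often a reference segmentation and a hypothesized one disagree about whether the pair falls in the same segment. The function names, the fixed window `k`, and the convention that a boundary at index `i` starts a new segment at position `i` are all assumptions made for illustration; the paper's own definition may differ.

```python
def boundaries_to_segment_ids(boundaries, n):
    """Label each of n positions with a segment id.

    Assumed convention (hypothetical): a boundary at index i means that
    a new segment starts at position i. A boundary at 0 is ignored.
    """
    cuts = set(boundaries)
    ids = []
    segment = 0
    for i in range(n):
        if i in cuts and i > 0:
            segment += 1
        ids.append(segment)
    return ids


def window_disagreement(ref_boundaries, hyp_boundaries, n, k):
    """Fraction of pairs (i, i + k) on which the reference and the
    hypothesis disagree about same-segment membership.

    Lower is better; 0.0 means the two segmentations agree on every pair.
    """
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    ref = boundaries_to_segment_ids(ref_boundaries, n)
    hyp = boundaries_to_segment_ids(hyp_boundaries, n)
    errors = 0
    for i in range(n - k):
        same_in_ref = ref[i] == ref[i + k]
        same_in_hyp = hyp[i] == hyp[i + k]
        if same_in_ref != same_in_hyp:
            errors += 1
    return errors / (n - k)


if __name__ == "__main__":
    # Ten positions with one true boundary at position 5.
    print(window_disagreement([5], [5], n=10, k=3))  # exact match
    print(window_disagreement([5], [6], n=10, k=3))  # near miss
    print(window_disagreement([5], [], n=10, k=3))   # missed boundary
```

Unlike precision and recall on exact boundary positions, a pairwise measure of this kind gives partial credit to a boundary placed close to the true one: in the usage example, the near miss scores a lower disagreement than the missed boundary.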
