A Maximum Entropy Approach to Identifying Sentence Boundaries

Computer Science – Computation and Language

Scientific paper

Details

4 pages, uses aclap.sty and covingtn.sty

We present a trainable model for identifying sentence boundaries in raw text. Given a corpus annotated with sentence boundaries, our model learns to classify each occurrence of ., ?, and ! as either a valid or invalid sentence boundary. The training procedure requires no hand-crafted rules, lexica, part-of-speech tags, or domain-specific information. The model can therefore be trained easily on any genre of English, and should be trainable on any other Roman-alphabet language. Performance is comparable to or better than the performance of similar systems, but we emphasize the simplicity of retraining for new domains.
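The classification setup described above can be illustrated with a minimal sketch. The abstract does not specify the features or the estimation procedure, so everything below is an assumption made for illustration: the feature names, the toy corpus `SENTENCES`, the helper functions, and the plain stochastic gradient training loop are all hypothetical. The sketch relies only on the fact that a two-class maximum entropy model over indicator features has the same form as logistic regression.

```python
import math
import re
from collections import defaultdict

CANDIDATE = re.compile(r"[.?!]")

def candidates(text):
    """Offsets of every '.', '?' and '!' in text (the potential boundaries)."""
    return [m.start() for m in CANDIDATE.finditer(text)]

def features(text, i):
    """Hypothetical indicator features for the candidate at offset i."""
    start = i
    while start > 0 and not text[start - 1].isspace():
        start -= 1
    end = i + 1
    while end < len(text) and not text[end].isspace():
        end += 1
    prefix, suffix = text[start:i], text[i + 1:end]  # same token, around i
    rest = text[end:].split()
    nxt = rest[0] if rest else "<END>"               # following token
    return [
        "bias",
        "punct=" + text[i],
        "prefix=" + prefix.lower(),
        "prefix_short=" + str(len(prefix) <= 3),
        "suffix_empty=" + str(suffix == ""),
        "next_cap=" + str(nxt[:1].isupper()),
        "next_end=" + str(nxt == "<END>"),
    ]

def predict_proba(weights, feats):
    """P(boundary | context) under the binary log-linear model."""
    z = sum(weights.get(f, 0.0) for f in feats)
    z = max(-30.0, min(30.0, z))                     # guard against overflow
    return 1.0 / (1.0 + math.exp(-z))

def build_examples(sentences):
    """Join annotated sentences and label each candidate from the annotation."""
    text = " ".join(sentences)
    gold, pos = set(), 0
    for s in sentences:
        pos += len(s)
        gold.add(pos - 1)                            # last char of a sentence
        pos += 1                                     # the joining space
    examples = [(features(text, i), 1 if i in gold else 0)
                for i in candidates(text)]
    return text, examples

def train(examples, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:
            error = label - predict_proba(weights, feats)
            for f in feats:
                weights[f] += lr * error
    return weights

def split_sentences(weights, text):
    """Cut text after every candidate the model classifies as a boundary."""
    sentences, start = [], 0
    for i in candidates(text):
        if predict_proba(weights, features(text, i)) > 0.5:
            sentences.append(text[start:i + 1].strip())
            start = i + 1
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

# Toy annotated corpus (invented for this sketch): one sentence per entry.
SENTENCES = [
    "Dr. Smith arrived at 3 p.m. on Monday.",
    "Was Mr. Jones late?",
    "He paid $3.50 for it!",
    "Mrs. Lee met Dr. Brown.",
    "They left.",
]

if __name__ == "__main__":
    text, examples = build_examples(SENTENCES)
    model = train(examples)
    for sentence in split_sentences(model, text):
        print(sentence)
```

Nothing in the sketch uses a rule list, a lexicon, or part-of-speech tags: the only input is text with marked sentence ends, so retraining for a new genre would amount to calling `train` on a different annotated corpus. A corpus this small only shows the mechanics and says nothing about accuracy on real text.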
