Estimating Lexical Priors for Low-Frequency Syncretic Forms

Computer Science – Computation and Language

Scientific paper

Details

Submitted to Computational Linguistics

Given a previously unseen form that is morphologically n-way ambiguous, what is the best estimator for the lexical prior probabilities of the form's various functions? We argue that the best estimator is the relative frequency of each function among the hapax legomena, the forms that occur exactly once in a corpus. This result has important implications for the development of stochastic morphological taggers, especially when some initial hand-tagging of a corpus is required: to predict lexical priors for very low-frequency morphologically ambiguous types (most of which would not occur in any given corpus), one should concentrate on tagging a good representative sample of the hapax legomena rather than extensively tagging words of all frequency ranges.
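
The estimator described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `hapax_priors` and `overall_priors`, the `(form, function)` token format, the `NOUN`/`VERB` labels, and the toy data are all assumptions made for illustration. The sketch counts how often each form occurs, keeps only the forms that occur exactly once, and returns the relative frequency of each function among them; the all-token estimate is included only for contrast.

```python
from collections import Counter


def hapax_priors(tagged_tokens):
    """Relative frequency of each function among the hapax legomena.

    tagged_tokens: iterable of (form, function) pairs, one per corpus token.
    Returns a dict mapping function -> estimated prior (empty if no hapaxes).
    """
    tokens = list(tagged_tokens)
    # Corpus frequency of each form, ignoring its function.
    form_counts = Counter(form for form, _ in tokens)
    # Functions of the forms that occur exactly once.
    hapax_functions = Counter(
        func for form, func in tokens if form_counts[form] == 1
    )
    total = sum(hapax_functions.values())
    if total == 0:
        return {}
    return {func: n / total for func, n in hapax_functions.items()}


def overall_priors(tagged_tokens):
    """Baseline for contrast: relative frequency over all tokens."""
    counts = Counter(func for _, func in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {func: n / total for func, n in counts.items()}


# Invented toy corpus (hypothetical forms and labels, not data from the paper).
toy_tokens = [
    ("walks", "VERB"), ("walks", "VERB"), ("walks", "NOUN"),
    ("talks", "VERB"), ("talks", "VERB"),
    ("hikes", "NOUN"), ("strolls", "NOUN"), ("jogs", "VERB"),
]

priors_from_hapaxes = hapax_priors(toy_tokens)
priors_from_all = overall_priors(toy_tokens)
print(priors_from_hapaxes)
print(priors_from_all)
```

In this toy corpus the three hapax legomena give NOUN 2/3 and VERB 1/3, whereas counting all eight tokens gives VERB 5/8 and NOUN 3/8. The two estimators can therefore disagree, which is why the choice of which words to hand-tag matters.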
