Tagging and Morphological Disambiguation of Turkish Text

Computer Science – Computation and Language

Scientific paper

Details

To appear in Proceedings of 4th ACL-ANLP Conf. uuencoded gzip'ed postscript file, 6 pages

Automatic text tagging is an important component in higher-level analysis of text corpora, and its output can be used in many natural language processing applications. In languages with agglutinative morphology, such as Turkish or Finnish, morphological disambiguation is a crucial step in tagging, because many lexical forms are morphologically ambiguous. This paper describes a POS tagger for Turkish text built on a full-scale two-level specification of Turkish morphology with a lexicon of about 24,000 root words. This is augmented with a multi-word and idiomatic construct recognizer and, most importantly, a morphological disambiguator based on local neighborhood constraints, heuristics, and a limited amount of statistical information. The tagger also has functionality for compiling statistics and fine-tuning the morphological analyzer, such as logging erroneous morphological parses, commonly used roots, etc. Preliminary results indicate that the tagger can tag about 98-99% of the texts accurately with minimal user intervention. Furthermore, for sentences morphologically disambiguated with the tagger, an LFG parser developed for Turkish generates, on average, 50% fewer ambiguous parses and parses almost 2.5 times faster. The tagging functionality is not specific to Turkish and can be applied to any language with a proper morphological analysis interface.
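
To make the idea of disambiguation by local neighborhood constraints concrete, here is a minimal sketch in Python. It is not the tagger described in the paper: the tag strings (`Gen`, `P2sg`, `P3sg`), the single constraint, the frequency table, and the function names are all illustrative assumptions. Each word comes with a list of candidate parses; a constraint may discard a candidate based on the parses of the adjacent word, and any ambiguity that remains is resolved by a simple frequency preference.

```python
from typing import Callable, Dict, List

Parse = str  # hypothetical tag string, e.g. "ev+Noun+A3sg+Pnon+Gen"
# A rule returns True when the given parse of word i should be discarded.
Rule = Callable[[List[List[Parse]], int, Parse], bool]


def drop_p2sg_before_p3sg(cands: List[List[Parse]], i: int, parse: Parse) -> bool:
    """Toy constraint (an assumption, not taken from the paper): discard a
    2sg-possessive reading when every reading of the next word carries
    3sg-possessive marking, as in a genitive-possessive construction."""
    if "+P2sg" not in parse or i + 1 >= len(cands):
        return False
    nxt = cands[i + 1]
    return bool(nxt) and all("+P3sg" in p for p in nxt)


def disambiguate(cands: List[List[Parse]],
                 rules: List[Rule],
                 freq: Dict[Parse, int]) -> List[Parse]:
    """Apply each constraint to each word, then pick the most frequent
    surviving parse. Assumes every word has at least one candidate."""
    result = [list(c) for c in cands]
    for i, parses in enumerate(result):
        for rule in rules:
            kept = [p for p in parses if not rule(result, i, p)]
            if kept:  # never delete the last remaining reading
                parses = kept
        result[i] = parses
    # Statistical fallback: highest count wins; ties keep the first parse.
    return [max(ps, key=lambda p: freq.get(p, 0)) for ps in result]


# "evin" is ambiguous between genitive ("of the house") and
# 2sg possessive ("your house"); "kapısı" is 3sg-possessive-marked.
GEN = "ev+Noun+A3sg+Pnon+Gen"
P2SG = "ev+Noun+A3sg+P2sg+Nom"
DOOR = "kapı+Noun+A3sg+P3sg+Nom"

tagged = disambiguate([[GEN, P2SG], [DOOR]], [drop_p2sg_before_p3sg], {})
```

In this toy run the constraint removes the possessive reading of "evin" because of the word that follows it. When no constraint applies, for example when "evin" stands alone, the choice falls through to the frequency table, which mirrors in a very simplified way the combination of constraints and limited statistics mentioned in the abstract.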
