Computer Science – Computation and Language
Scientific paper
2003-12-27
10 pages text; 1 figure. To appear in "Current Issues in Linguistic Theory: Recent Advances in Natural Language Processing";Jo
We use a Dynamic Bayesian Network to represent compactly a variety of sublexical and contextual features relevant to Part-of-Speech (PoS) tagging. The outcome is a flexible tagger (LegoTag) with state-of-the-art performance (3.6% error on a benchmark corpus). We explore the effect of eliminating redundancy and radically reducing the size of feature vocabularies. We find that a small but linguistically motivated set of suffixes results in improved cross-corpora generalization. We also show that a minimal lexicon limited to function words is sufficient to ensure reasonable performance.
Peshkin Leonid
Savova Virginia
Part-of-Speech Tagging with Minimal Lexicalization