An Efficient Inductive Unsupervised Semantic Tagger

Computer Science – Computation and Language

Scientific paper

Details

uuencoded postscript file. email: cmp-lg/9606012

We report the development of a simple, fast, and efficient inductive unsupervised semantic tagger for Chinese words. A hand-tagged POS corpus of 348,000 words is used. The corpus is tagged in two steps. First, candidate semantic tags are selected using a semantic dictionary (Tong Yi Ci Ci Lin), the word's POS tag, and the conditional probability of a semantic tag given the POS, i.e., P(S|P). The final semantic tag is then assigned by considering the semantic tags before and after the current word together with the semantic-word conditional probability P(S|W) derived from the first step. Semantic bigram probabilities P(S|S) are used in this second step. Final manual checking shows that the algorithm has a hit rate of 91%. The tagger tags 142 words per second on a 120 MHz Pentium running FOXPRO, about 2.3 times faster than a Viterbi tagger.
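The two-step procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact scoring formula, so the sketch assumes that step 1 normalises P(S|P) over the dictionary candidates to obtain P(S|W), and that step 2 multiplies P(S|W) by the bigram probabilities with the neighbouring tags. The toy dictionary, the tag names, every probability value, the smoothing floor, and the function names `step1_candidates` and `step2_tag` are all hypothetical.

```python
# Hypothetical toy data; none of these values come from the paper.
# word -> candidate semantic tags (a stand-in for a semantic dictionary).
SEM_DICT = {
    "bank": ["INSTITUTION", "RIVERSIDE"],
    "money": ["ASSET"],
    "river": ["WATERBODY"],
}

# P(S|P): probability of a semantic tag given a POS tag, keyed (tag, pos).
P_S_GIVEN_POS = {
    ("INSTITUTION", "N"): 0.4,
    ("RIVERSIDE", "N"): 0.1,
    ("ASSET", "N"): 0.3,
    ("WATERBODY", "N"): 0.2,
}

# P(S|S): semantic bigram probability, keyed (previous tag, current tag).
P_BIGRAM = {
    ("ASSET", "INSTITUTION"): 0.5,
    ("WATERBODY", "RIVERSIDE"): 0.6,
    ("WATERBODY", "INSTITUTION"): 0.05,
}

FLOOR = 1e-6  # assumed smoothing value for unseen events


def step1_candidates(word, pos):
    """Step 1: weight dictionary candidates by P(S|P), normalised as P(S|W)."""
    candidates = SEM_DICT.get(word, [])
    scores = {s: P_S_GIVEN_POS.get((s, pos), FLOOR) for s in candidates}
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()} if total else {}


def step2_tag(sentence):
    """Step 2: choose each tag from P(S|W) and bigrams with neighbouring tags."""
    p_s_w = [step1_candidates(word, pos) for word, pos in sentence]
    # Assumption of this sketch: neighbours are represented by their
    # best step-1 tag, so each word is decided locally.
    provisional = [max(d, key=d.get) if d else None for d in p_s_w]

    tags = []
    for i, dist in enumerate(p_s_w):
        if not dist:
            tags.append(None)  # word not covered by the dictionary
            continue
        prev_tag = provisional[i - 1] if i > 0 else None
        next_tag = provisional[i + 1] if i + 1 < len(p_s_w) else None

        def score(s):
            value = dist[s]
            if prev_tag is not None:
                value *= P_BIGRAM.get((prev_tag, s), FLOOR)
            if next_tag is not None:
                value *= P_BIGRAM.get((s, next_tag), FLOOR)
            return value

        tags.append(max(dist, key=score))
    return tags


if __name__ == "__main__":
    print(step2_tag([("river", "N"), ("bank", "N")]))
    print(step2_tag([("money", "N"), ("bank", "N")]))
```

In this toy setup, step 1 alone would prefer INSTITUTION for "bank" in both sentences; the bigram context in step 2 is what lets the preceding word change the choice. Each word is scored once against fixed neighbour tags rather than searched jointly over the whole sentence, which is one plausible reading of why such a tagger could be cheaper than a Viterbi search, though the abstract does not state the reason.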
