Computer Science – Computation and Language
Scientific paper
2002-05-08
Computer Science
Computation and Language
10 pages, to appear in the Proceedings of the Morphology/Phonology Learning Workshop of ACL-02
Scientific paper
We present an algorithm that takes an unannotated corpus as its input, and returns a ranked list of probable morphologically related pairs as its output. The algorithm tries to discover morphologically related pairs by looking for pairs that are both orthographically and semantically similar, where orthographic similarity is measured in terms of minimum edit distance, and semantic similarity is measured in terms of mutual information. The procedure does not rely on a morpheme concatenation model, nor on distributional properties of word substrings (such as affix frequency). Experiments with German and English input give encouraging results, both in terms of precision (proportion of good pairs found at various cutoff points of the ranked list), and in terms of a qualitative analysis of the types of morphological patterns discovered by the algorithm.
Baroni Marco
Matiasek Johannes
Trost Harald
No associations
LandOfFree
Unsupervised discovery of morphologically related words based on orthographic and semantic similarity does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.
If you have personal experience with Unsupervised discovery of morphologically related words based on orthographic and semantic similarity, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Unsupervised discovery of morphologically related words based on orthographic and semantic similarity will most certainly appreciate the feedback.
Profile ID: LFWR-SCP-O-505003