Aligning Noisy Parallel Corpora Across Language Groups : Word Pair Feature Matching by Dynamic Time Warping

Computer Science – Computation and Language

Scientific paper

Rate now

  [ 0.00 ] – not rated yet Voters 0   Comments 0

Details

8 pages, uuencoded, compressed PostScript

Scientific paper

We propose a new algorithm called DK-vec for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries. DK-vec improves on previous alignment algorithms in that it handles better the non-linear nature of noisy corpora. The algorithm uses frequency, position and recency information as features for pattern matching. Dynamic Time Warping is used as the matching technique between word pairs. This algorithm produces a small bilingual lexicon which provides anchor points for alignment.

No associations

LandOfFree

Say what you really think

Search LandOfFree.com for scientists and scientific papers. Rate them and share your experience with other people.

Rating

Aligning Noisy Parallel Corpora Across Language Groups : Word Pair Feature Matching by Dynamic Time Warping does not yet have a rating. At this time, there are no reviews or comments for this scientific paper.

If you have personal experience with Aligning Noisy Parallel Corpora Across Language Groups : Word Pair Feature Matching by Dynamic Time Warping, we encourage you to share that experience with our LandOfFree.com community. Your opinion is very important and Aligning Noisy Parallel Corpora Across Language Groups : Word Pair Feature Matching by Dynamic Time Warping will most certainly appreciate the feedback.

Rate now

     

Profile ID: LFWR-SCP-O-674497

  Search
All data on this website is collected from public sources. Our data reflects the most accurate information available at the time of publication.