Computer Science – Computation and Language
Scientific paper
2002-05-25
8 pages; to appear in the proceedings of EMNLP-2002
An important component of any generation system is the mapping dictionary, a lexicon of elementary semantic expressions and corresponding natural language realizations. Typically, labor-intensive knowledge-based methods are used to construct the dictionary. We instead propose to acquire it automatically via a novel multiple-pass algorithm employing multiple-sequence alignment, a technique commonly used in bioinformatics. Crucially, our method leverages latent information contained in multi-parallel corpora -- datasets that supply several verbalizations of the corresponding semantics rather than just one. We used our techniques to generate natural language versions of computer-generated mathematical proofs, with good results on both a per-component and overall-output basis. For example, in evaluations involving a dozen human judges, our system produced output whose readability and faithfulness to the semantic input rivaled that of a traditional generation system.
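The alignment idea in the abstract can be illustrated with a minimal sketch. It is not the authors' multiple-pass algorithm: it aligns only two verbalizations using standard Needleman-Wunsch global alignment over word tokens, and the function names (`align`, `variable_slots`), the scoring values, and the example sentences are all hypothetical choices made for illustration. Positions where the two aligned verbalizations disagree are candidate spots where different wordings express the same semantics.

```python
from typing import List, Optional, Tuple

GAP = None  # marks an inserted or deleted word in the alignment

Pair = Tuple[Optional[str], Optional[str]]


def align(a: List[str], b: List[str],
          match: int = 1, mismatch: int = -1, gap: int = -1) -> List[Pair]:
    """Globally align two token sequences (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs: List[Pair] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                pairs.append((a[i - 1], b[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def variable_slots(pairs: List[Pair]) -> List[Pair]:
    """Return aligned positions where the two verbalizations differ."""
    return [(x, y) for x, y in pairs if x != y]


# Two hypothetical verbalizations of the same proof step.
s1 = "we assume that a equals b".split()
s2 = "we suppose that a equals b".split()
print(variable_slots(align(s1, s2)))
```

In this toy case the only differing position pairs "assume" with "suppose", which is the kind of evidence a lexicon-induction method could use. Extending the sketch to more than two sequences, or to the multiple passes the abstract describes, would need additional machinery that is not shown here.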
Barzilay Regina
Lee Lillian
Title: Bootstrapping Lexical Choice via Multiple-Sequence Alignment