Unsupervised Language Acquisition
Computer Science – Computation and Language
Scientific paper
1996-11-12
PhD thesis, 133 pages
This thesis presents a computational theory of unsupervised language acquisition, precisely defining procedures for learning language from ordinary spoken or written utterances, with no explicit help from a teacher. The theory is based heavily on concepts borrowed from machine learning and statistical estimation. In particular, learning takes place by fitting a stochastic, generative model of language to the evidence. Much of the thesis is devoted to explaining conditions that must hold for this general learning strategy to arrive at linguistically desirable grammars. The thesis introduces a variety of technical innovations, among them a common representation for evidence and grammars, and a learning strategy that separates the "content" of linguistic parameters from their representation. Algorithms based on this strategy suffer from few of the search problems that have plagued other computational approaches to language acquisition. The theory has been tested on problems of learning vocabularies and grammars from unsegmented text and continuous speech, and of learning mappings between sound and representations of meaning. It performs extremely well on various objective criteria, acquiring knowledge that causes it to assign almost exactly the same structure to utterances as humans do. This work has application to data compression, language modeling, speech recognition, machine translation, information retrieval, and other tasks that rely on either structural or stochastic descriptions of language.
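As a rough illustration of how a stochastic, generative model can assign structure to unsegmented text, the sketch below finds the most probable split of a character string into words under a simple unigram lexicon. It is not the thesis's algorithm: the function `segment`, the parameter `max_len`, and the toy `lexicon` with its probabilities are all assumptions made for this example, and a real learner would also have to estimate the lexicon from the evidence rather than receive it.

```python
import math

def segment(text, lexicon, max_len=10):
    """Return the most probable split of `text` into lexicon words.

    Hypothetical helper: `lexicon` maps each word to an assumed probability,
    and the model treats words as independent draws (a unigram model).
    Returns None when no segmentation into lexicon words exists.
    """
    # best[i] = (best log-probability of text[:i], start index of its last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in lexicon and best[start][0] > -math.inf:
                score = best[start][0] + math.log(lexicon[word])
                if score > best[end][0]:
                    best[end] = (score, start)
    if best[-1][0] == -math.inf:
        return None
    # Walk the back-pointers from the end of the string to recover the words.
    words, end = [], len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]

# Toy lexicon with made-up probabilities (an assumption for illustration only).
lexicon = {"the": 0.3, "dog": 0.2, "do": 0.1, "g": 0.05,
           "t": 0.05, "he": 0.1, "barks": 0.2}
print(segment("thedogbarks", lexicon))
```

With this toy lexicon, the split into "the", "dog", "barks" scores higher than competing splits such as "t", "he", "dog", "barks", because each extra word multiplies in another probability below one.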