Estimation of Stochastic Attribute-Value Grammars using an Informative Sample

Computer Science – Computation and Language

Scientific paper

Details

6 pages, 2 figures. Coling 2000, Saarbrücken, Germany, pp. 586–592

We argue that some of the computational complexity associated with estimating stochastic attribute-value grammars can be reduced by training on an informative subset of the full training set. Results using the parsed Wall Street Journal corpus show that, in some circumstances, an informative sample yields better estimation results than training on all the available material. Further experiments demonstrate that, with unlexicalised models, a Gaussian prior can reduce overfitting. However, when models are lexicalised and contain overlapping features, overfitting does not seem to be a problem, and a Gaussian prior makes minimal difference to performance. Our approach applies when the training set contains an infeasibly large number of parses, or when recovering these parses from a packed representation is itself computationally expensive.
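
The abstract does not spell out the estimation objective. As a rough illustration only, the sketch below assumes the usual log-linear (maximum-entropy) formulation of a stochastic attribute-value grammar, in which each parse is scored by a weighted sum of feature counts, and a zero-mean Gaussian prior subtracts a squared-weight penalty from the conditional log-likelihood. The function name, the data layout, and the `sigma` parameter are hypothetical and are not taken from the paper. Under this reading, an informative sample would correspond to passing only a chosen subset of each sentence's parses instead of the full set.

```python
import math


def penalized_log_likelihood(sentences, weights, sigma=None):
    """Conditional log-likelihood of the correct parses under a log-linear model.

    sentences: list of (parses, gold_index) pairs, where each parse is a dict
               mapping feature names to counts (hypothetical layout).
    weights:   dict mapping feature names to real-valued weights.
    sigma:     standard deviation of an assumed zero-mean Gaussian prior;
               None means no prior (plain maximum likelihood).
    """
    total = 0.0
    for parses, gold_index in sentences:
        # Score of a parse = sum of weight * feature count.
        scores = [
            sum(weights.get(name, 0.0) * count for name, count in parse.items())
            for parse in parses
        ]
        # Log of the normalising constant, computed stably (log-sum-exp).
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[gold_index] - log_z
    if sigma is not None:
        # Gaussian prior: penalise large weights to discourage overfitting.
        total -= sum(w * w for w in weights.values()) / (2.0 * sigma ** 2)
    return total


# Toy data: one sentence with two candidate parses; parse 0 is correct.
toy_sentences = [([{"rule_a": 1}, {"rule_b": 1}], 0)]
toy_weights = {"rule_a": 1.0}

unpenalized = penalized_log_likelihood(toy_sentences, toy_weights)
penalized = penalized_log_likelihood(toy_sentences, toy_weights, sigma=1.0)
```

A smaller `sigma` pulls the weights more strongly towards zero. The cost of the normalising term grows with the number of parses per sentence, which is why training on fewer parses reduces the estimation cost in this formulation.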
