Fishing for Exactness

Computer Science – Computation and Language

Scientific paper

Details

13 pages - postscript

Statistical methods for automatically identifying dependent word pairs (i.e., dependent bigrams) in a corpus of natural language text have traditionally relied on asymptotic tests of significance. This paper suggests that Fisher's exact test is more appropriate because the data samples typical of this problem are skewed and sparse. Both theoretical and experimental comparisons between Fisher's exact test and a variety of asymptotic tests (the t-test, Pearson's chi-square test, and the likelihood-ratio chi-square test) are presented. These comparisons show that Fisher's exact test is more reliable in identifying dependent word pairs. The usefulness of Fisher's exact test extends to other problems in statistical natural language processing, as skewed and sparse data appear to be the rule in natural language. The experiment presented in this paper was performed using PROC FREQ of the SAS System.
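
The contrast the abstract draws between an exact test and an asymptotic one can be illustrated on a single 2x2 bigram contingency table. The sketch below is not the paper's procedure (the paper reports using PROC FREQ of the SAS System); it is a minimal standard-library Python illustration. The function names, the one-sided (right-tailed) alternative, and the toy counts are all assumptions made for this example, and the table is assumed to have non-zero row and column totals.

```python
from math import comb, erfc, sqrt

# Cell layout for a bigram (w1, w2), assumed for this sketch:
#   n11 = count(w1 w2)        n12 = count(w1, not w2)
#   n21 = count(not w1, w2)   n22 = count(not w1, not w2)

def fisher_exact_right(n11, n12, n21, n22):
    """Right-tailed Fisher's exact p-value for a 2x2 table.

    Sums the hypergeometric probabilities of every table with the same
    margins whose top-left cell is at least as large as the observed n11.
    """
    row1 = n11 + n12
    col1 = n11 + n21
    n = n11 + n12 + n21 + n22
    largest = min(row1, col1)  # largest feasible top-left count
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(n11, largest + 1))
    return tail / comb(n, col1)

def pearson_chi2(n11, n12, n21, n22):
    """Pearson's chi-square statistic (no continuity correction)."""
    obs = ((n11, n12), (n21, n22))
    rows = (n11 + n12, n21 + n22)
    cols = (n11 + n21, n12 + n22)
    n = rows[0] + rows[1]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # assumes non-zero margins
            stat += (obs[i][j] - expected) ** 2 / expected
    return stat

def chi2_pvalue_1df(stat):
    """Asymptotic p-value for a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))

if __name__ == "__main__":
    # Hypothetical, deliberately sparse table: the bigram occurs once.
    table = (1, 0, 0, 3)
    print("Fisher exact (right-tailed):", fisher_exact_right(*table))
    print("Pearson chi-square statistic:", pearson_chi2(*table))
    print("Asymptotic p-value (1 df):", chi2_pvalue_1df(pearson_chi2(*table)))
```

On this toy table the exact p-value is 0.25, while the chi-square approximation yields a p-value just under 0.05, so the two tests would disagree at the conventional 5% level. This is one small instance of the sparse-data behavior the abstract describes, not a reproduction of the paper's experiment.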
