Mathematics – Statistics Theory
Scientific paper
2012-04-06
47 pages, 5 figures, submitted to The Annals of Statistics
We investigate the sparse sample goodness-of-fit problem, where the number of samples $n$ is smaller than the size of the alphabet $m$. The goal of this work is to find an appropriate criterion for analyzing statistical tests in this setting. A suitable model for analysis is the high-dimensional model in which both $n$ and $m$ tend to infinity, and $n=o(m)$. We propose a new performance criterion based on large deviation analysis, which generalizes the classical error exponent applicable to large sample problems (in which $m=O(n)$). This new criterion provides insights that are not available from asymptotic consistency or central limit theorem (CLT) analysis. The main results are: (i) The best achievable probability of error $P_e$ decays as $-\log(P_e)=(n^2/m)(1+o(1))J$ for some $J>0$. (ii) A well-known coincidence-based test attains the optimal generalized error exponent. (iii) The widely used Pearson's chi-square test has $J=0$. (iv) The contributions (i)-(iii) are established under the assumption that the distribution under the null hypothesis is uniform. For the non-uniform case, a new test is proposed, with a non-zero generalized error exponent.
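To make the two tests named in the abstract concrete, here is a minimal Python sketch that computes both statistics on sparse samples under a uniform null. It rests on several assumptions not stated in the abstract: the coincidence statistic is taken to be the number of symbols observed exactly once (one common form of such a test), the function names are illustrative, and the alphabet size, sample size, and alternative distribution are hypothetical choices. It does not reproduce the paper's analysis or its thresholds.

```python
import numpy as np


def symbol_counts(samples, m):
    """Occurrences of each symbol 0..m-1 in the sample."""
    return np.bincount(samples, minlength=m)


def singleton_count(samples, m):
    """Assumed coincidence statistic: number of symbols seen exactly once.

    With n much smaller than m, uniform data produces few repeats, so an
    unusually small singleton count is evidence against uniformity.
    """
    return int(np.sum(symbol_counts(samples, m) == 1))


def pearson_chi_square_uniform(samples, m):
    """Pearson's chi-square statistic against the uniform null on m symbols."""
    counts = symbol_counts(samples, m)
    expected = len(samples) / m  # expected count per symbol under the null
    return float(np.sum((counts - expected) ** 2) / expected)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n = 10_000, 500  # hypothetical sparse regime: n is much smaller than m

    uniform = rng.integers(0, m, size=n)
    # Hypothetical alternative: all mass on the first half of the alphabet.
    skewed = rng.integers(0, m // 2, size=n)

    print("n^2/m scale:", n**2 / m)
    for name, data in [("uniform", uniform), ("skewed", skewed)]:
        print(name,
              "singletons:", singleton_count(data, m),
              "chi-square:", round(pearson_chi_square_uniform(data, m), 1))
```

The printed $n^2/m$ value is the scale at which, according to result (i), $-\log(P_e)$ grows for the best test; the sketch only displays the statistics and does not estimate error probabilities.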
Huang Dayu
Meyn Sean
Generalized Error Exponents for Sparse Sample Goodness of Fit Tests