Classification with High-Dimensional Sparse Samples

Scientific paper – Computer Science, Information Theory
2012-02-08
The task of the binary classification problem is to determine which of two distributions has generated a length-$n$ test sequence. The two distributions are unknown; however, two training sequences of length $N$, one from each distribution, are observed. The distributions share an alphabet of size $m$, which is significantly larger than $n$ and $N$. How do $N$, $n$, and $m$ affect the probability of classification error? We characterize the achievable error rate in a high-dimensional setting in which $N$, $n$, and $m$ all tend to infinity and $\max\{n,N\}=o(m)$. The results are as follows (a toy sketch of the two classifiers appears after the list):

* There exists an asymptotically consistent classifier if and only if $m=o(\min\{N^2,Nn\})$.
* The best achievable probability of classification error decays as $-\log(P_e)=J\min\{N^2,Nn\}(1+o(1))/m$ with $J>0$ (shown by achievability and converse results).
* A weighted coincidence-based classifier achieves a non-zero generalized error exponent $J$.
* The $\ell_2$-norm based classifier has a zero generalized error exponent.
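To make the two test statistics concrete, here is a minimal Python sketch. It uses an *unweighted* coincidence count (the paper's classifier applies symbol-dependent weights, which are omitted here) and synthetic Dirichlet-generated distributions purely for demonstration; this is an illustration of the general idea under those assumptions, not the authors' exact construction.

```python
import numpy as np

def symbol_counts(seq, m):
    """Empirical count of each of the m alphabet symbols."""
    return np.bincount(seq, minlength=m)

def cross_coincidences(a_counts, b_counts):
    """Number of index pairs (i, j) with a[i] == b[j].
    An unweighted coincidence statistic; larger = more similar."""
    return int(np.sum(a_counts * b_counts))

def l2_distance(a_counts, b_counts, len_a, len_b):
    """Squared l2 distance between empirical distributions;
    smaller = more similar. This is the statistic shown to have
    a zero generalized error exponent in the sparse regime."""
    return float(np.sum((a_counts / len_a - b_counts / len_b) ** 2))

def classify(train1, train2, test, m, use_coincidence=True):
    """Assign the test sequence to distribution 1 or 2 by comparing
    its similarity with each training sequence."""
    c1 = symbol_counts(train1, m)
    c2 = symbol_counts(train2, m)
    ct = symbol_counts(test, m)
    if use_coincidence:
        s1 = cross_coincidences(c1, ct)
        s2 = cross_coincidences(c2, ct)
        return 1 if s1 >= s2 else 2
    d1 = l2_distance(c1, ct, len(train1), len(test))
    d2 = l2_distance(c2, ct, len(train2), len(test))
    return 1 if d1 <= d2 else 2

# Demo in the sparse regime: alphabet size m far exceeds both
# sample lengths, and min{N^2, Nn} = 1e5 dominates m = 1e4,
# so the setup sits inside the consistent regime described above.
rng = np.random.default_rng(0)
m, N, n = 10_000, 500, 200
p = rng.dirichlet(np.ones(m))   # hypothetical source distributions
q = rng.dirichlet(np.ones(m))
train1 = rng.choice(m, size=N, p=p)
train2 = rng.choice(m, size=N, p=q)
test = rng.choice(m, size=n, p=p)   # ground truth: distribution 1
print(classify(train1, train2, test, m, use_coincidence=True))
print(classify(train1, train2, test, m, use_coincidence=False))
```

The design choice mirrors the abstract's contrast: the coincidence statistic aggregates pairwise symbol matches, which remain informative even when most counts are 0 or 1, whereas the plain $\ell_2$ statistic is dominated by sampling noise in that regime.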
Dayu Huang
Sean Meyn