Mathematics – Statistics Theory
Scientific paper
2007-01-03
Annals of Statistics 2008, Vol. 36, No. 6, 2605-2637
Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/07-AOS504 by the Institute of
10.1214/07-AOS504
Classification using high-dimensional features arises frequently in many contemporary statistical studies, such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. In a seminal paper, Bickel and Levina [Bernoulli 10 (2004) 989--1010] show that the Fisher discriminant performs poorly due to diverging spectra, and they propose using the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as poor as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as poorly as random guessing. Thus, it is important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample $t$-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistic, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
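The two steps named in the abstract, ranking features by the two-sample $t$-statistic and then applying the independence rule to the retained features, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`two_sample_t`, `fit_fair`, `predict_fair`, `demo`), the use of a pooled per-feature variance, and the simulated data are all assumptions made for illustration. In particular, the number of retained features `m` is supplied by hand here, whereas the paper chooses it from an upper bound on the classification error, which this sketch does not attempt.

```python
import numpy as np


def two_sample_t(X1, X2):
    """Per-feature two-sample t-statistics (unequal-variance form)."""
    n1, n2 = X1.shape[0], X2.shape[0]
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    v1, v2 = X1.var(axis=0, ddof=1), X2.var(axis=0, ddof=1)
    return (m1 - m2) / np.sqrt(v1 / n1 + v2 / n2)


def fit_fair(X1, X2, m):
    """Keep the m features with the largest |t| and fit an independence rule.

    ASSUMPTION: m is given by the caller; the paper selects it via an
    error bound that is not reproduced in this sketch.
    """
    t = two_sample_t(X1, X2)
    keep = np.argsort(-np.abs(t))[:m]
    n1, n2 = X1.shape[0], X2.shape[0]
    A, B = X1[:, keep], X2[:, keep]
    m1, m2 = A.mean(axis=0), B.mean(axis=0)
    # ASSUMPTION: pooled per-feature variance; correlations are ignored,
    # which is what makes this an "independence" (diagonal) rule.
    pooled = ((n1 - 1) * A.var(axis=0, ddof=1)
              + (n2 - 1) * B.var(axis=0, ddof=1)) / (n1 + n2 - 2)
    return {"keep": keep, "w": (m1 - m2) / pooled, "mid": (m1 + m2) / 2}


def predict_fair(model, X):
    """Assign class 1 when the diagonal linear score is positive, else class 2."""
    score = (X[:, model["keep"]] - model["mid"]) @ model["w"]
    return np.where(score > 0, 1, 2)


def demo(seed=0):
    """Simulated example: 2000 features, only the first 10 carry signal."""
    rng = np.random.default_rng(seed)
    p, s, n = 2000, 10, 30
    mu = np.zeros(p)
    mu[:s] = 2.0
    X1 = rng.normal(size=(n, p)) + mu
    X2 = rng.normal(size=(n, p))
    model = fit_fair(X1, X2, m=s)
    test = np.vstack([rng.normal(size=(200, p)) + mu,
                      rng.normal(size=(200, p))])
    truth = np.repeat([1, 2], 200)
    accuracy = float(np.mean(predict_fair(model, test) == truth))
    return model, accuracy
```

Calling `demo()` returns the fitted model and its test accuracy on the simulated data. Setting `m` equal to the total number of features in `fit_fair` gives the independence rule on all features, which is the setting the abstract warns can suffer from noise accumulation.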
Fan Jianqing
Fan Yingying
High-dimensional classification using features annealed independence rules