Mathematics – Statistics Theory
Scientific paper
2006-06-30
Annals of Statistics 2006, Vol. 34, No. 2, 559-583
Published at http://dx.doi.org/10.1214/009053606000000092 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics.
10.1214/009053606000000092
We prove that boosting with the squared error loss, $L_2$Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as $O(\exp(\text{sample size}))$, assuming that the true underlying regression function is sparse in terms of the $\ell_1$-norm of the regression coefficients. In the language of signal processing, this means consistency for de-noising using a strongly overcomplete dictionary if the underlying signal is sparse in terms of the $\ell_1$-norm. We also propose an $\mathit{AIC}$-based method for tuning, namely for choosing the number of boosting iterations. This makes $L_2$Boosting computationally attractive, since it removes the need to run the algorithm multiple times for cross-validation, as has been common practice so far. We demonstrate $L_2$Boosting on simulated data, in particular where the predictor dimension is large in comparison to the sample size, and on a difficult tumor-classification problem with gene expression microarray data.
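The two ingredients named in the abstract, an $L_2$Boosting fit and an AIC-based choice of the number of iterations, are easy to sketch in code. Below is a minimal illustrative Python sketch, assuming a componentwise linear least squares base learner and one common corrected-AIC form computed from the trace of the boosting hat matrix; the function name l2boost, the step size nu, the defaults, and the exact AIC formula are assumptions for this sketch, not the paper's definitive implementation.

    import numpy as np

    def l2boost(X, y, n_iter=200, nu=0.1):
        """L2Boosting sketch: componentwise linear least squares base
        learner, stopped by a corrected-AIC criterion (assumed form).
        Fits are on mean-centered data; the intercept is the mean of y."""
        n, p = X.shape
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        norms = (Xc ** 2).sum(axis=0)
        F = np.zeros(n)                   # current fit of centered response
        B = np.zeros((n, n))              # boosting hat matrix, F = B @ yc
        I = np.eye(n)
        coefs = np.zeros(p)
        best = (np.inf, 0, coefs.copy())  # (AIC, iteration, coefficients)
        for m in range(1, n_iter + 1):
            r = yc - F
            # Componentwise base learner: least squares fit of each single
            # predictor to the current residuals; keep the best one.
            b = Xc.T @ r / norms
            sse = ((r[:, None] - Xc * b) ** 2).sum(axis=0)
            j = int(np.argmin(sse))
            coefs[j] += nu * b[j]
            F += nu * b[j] * Xc[:, j]
            # Hat-matrix recursion B_m = B_{m-1} + nu * H_j (I - B_{m-1}),
            # H_j = x_j x_j^T / ||x_j||^2, treating selections as fixed.
            Hj = np.outer(Xc[:, j], Xc[:, j]) / norms[j]
            B += nu * Hj @ (I - B)
            df = np.trace(B)              # effective degrees of freedom
            sigma2 = np.mean((yc - F) ** 2)
            # One common corrected-AIC form (an assumption in this sketch).
            aic = np.log(sigma2) + (1 + df / n) / (1 - (df + 2) / n)
            if aic < best[0]:
                best = (aic, m, coefs.copy())
        return best

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n, p = 50, 200                          # p >> n, the paper's regime
        X = rng.standard_normal((n, p))
        beta = np.zeros(p)
        beta[:5] = [2.0, -1.5, 1.0, 0.8, -0.6]  # l1-sparse truth
        y = X @ beta + 0.5 * rng.standard_normal(n)
        aic, m_stop, coef = l2boost(X, y)
        print("AIC-selected stopping iteration:", m_stop)

The point of tracking the hat-matrix trace as a degrees-of-freedom measure is that the whole AIC path falls out of a single boosting run, whereas cross-validation would require rerunning the algorithm for every fold; this is the computational advantage claimed in the abstract.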