Boosting for high-dimensional linear models

Mathematics – Statistics Theory

Scientific paper

Details

Published at http://dx.doi.org/10.1214/009053606000000092 in the Annals of Statistics (http://www.imstat.org/aos/) by the Inst

10.1214/009053606000000092

We prove that boosting with the squared error loss, $L_2$Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as $O(\exp(\text{sample size}))$, assuming that the true underlying regression function is sparse in terms of the $\ell_1$-norm of the regression coefficients. In the language of signal processing, this means consistency for de-noising using a strongly overcomplete dictionary if the underlying signal is sparse in terms of the $\ell_1$-norm. We also propose an $\mathit{AIC}$-based method for tuning, namely for choosing the number of boosting iterations. This makes $L_2$Boosting computationally attractive, since the algorithm need not be run multiple times for cross-validation, as has been common practice so far. We demonstrate $L_2$Boosting on simulated data, in particular where the predictor dimension is large in comparison to the sample size, and on a difficult tumor-classification problem with gene expression microarray data.
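To make the tuning idea concrete, here is a minimal Python sketch of boosting with squared error loss that picks the number of iterations from a single run. The abstract does not spell out the details, so several choices below are assumptions: the base learner is componentwise linear least squares with a shrinkage factor `nu`, the stopping criterion is a corrected AIC of the form $\log(\hat\sigma^2) + (1 + \mathrm{df}/n)/(1 - (\mathrm{df}+2)/n)$, and the degrees of freedom are taken as the trace of the boosting hat matrix. The function name `l2boost_aicc` and its parameters are hypothetical.

```python
import numpy as np

def l2boost_aicc(X, y, n_steps=200, nu=0.1):
    """Componentwise L2 boosting with an AIC-type stopping rule (sketch).

    Assumptions (not taken from the abstract): componentwise linear least
    squares base learner, shrinkage nu, corrected AIC with df = trace of
    the boosting hat matrix. Columns of X must not be constant.
    """
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                      # centred predictors
    y_mean = y.mean()
    col_ss = (Xc ** 2).sum(axis=0)       # squared norm of each column
    coef = np.zeros(p)
    resid = y - y_mean                   # residuals of the intercept-only fit
    B = np.full((n, n), 1.0 / n)         # hat matrix of the intercept-only fit
    eye = np.eye(n)
    best_aicc, best_m, best_coef = np.inf, 0, coef.copy()

    for m in range(1, n_steps + 1):
        # Least-squares coefficient of the residuals on each single column.
        b = Xc.T @ resid / col_ss
        # Pick the column that reduces the residual sum of squares the most.
        j = int(np.argmax(b ** 2 * col_ss))
        xj = Xc[:, j]
        coef[j] += nu * b[j]
        resid = resid - nu * b[j] * xj
        # Hat matrix update B_m = B_{m-1} + nu * H_j (I - B_{m-1}),
        # with H_j = xj xj^T / ||xj||^2, written as a rank-one update.
        B += nu * np.outer(xj, xj @ (eye - B)) / col_ss[j]
        df = np.trace(B)
        if df + 2 >= n:                  # criterion undefined beyond this point
            break
        sigma2 = np.mean(resid ** 2)
        aicc = np.log(sigma2) + (1 + df / n) / (1 - (df + 2) / n)
        if aicc < best_aicc:
            best_aicc, best_m, best_coef = aicc, m, coef.copy()

    intercept = y_mean - x_mean @ best_coef
    return best_coef, intercept, best_m
```

Calling `l2boost_aicc(X, y)` on an `n × p` design with `p` much larger than `n` returns the coefficient vector at the iteration minimizing the criterion, the intercept, and that iteration count. The point of the design is that the whole path and its degrees of freedom come from one pass, so no repeated cross-validation fits are needed. Keeping the `n × n` hat matrix costs memory quadratic in the sample size, which is acceptable here because the setting of interest has small `n` and large `p`.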
