Concentration inequalities of the cross-validation estimate for stable predictors
Statistics – Machine Learning
Scientific paper
2010-11-23
In this article, we derive concentration inequalities for the cross-validation estimate of the generalization error for stable predictors in the context of risk assessment. The notion of stability was first introduced by \cite{DEWA79} and extended by \cite{KEA95}, \cite{BE01} and \cite{KUNIY02} to characterize classes of predictors with infinite VC dimension; in particular, it covers $k$-nearest-neighbor rules, Bayesian algorithms (\cite{KEA95}), and boosting. General loss functions and classes of predictors are considered. We use the formalism introduced by \cite{DUD03} to cover a large variety of cross-validation procedures, including leave-one-out cross-validation, $k$-fold cross-validation, hold-out cross-validation (or split sample), and leave-$\upsilon$-out cross-validation. In particular, we give a simple rule for choosing the cross-validation procedure according to the stability of the class of predictors. In the special case of uniform stability, an interesting consequence is that the number of elements in the test set is not required to grow to infinity for the cross-validation procedure to be consistent. In this special case, the particular interest of leave-one-out cross-validation is emphasized.
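For orientation (a sketch not taken from the abstract; the paper's exact notation may differ), the cross-validation estimate in the split-vector formalism of \cite{DUD03} can be written with a random split vector $S_n \in \{0,1\}^n$, where $S_{n,i}=1$ marks observation $Z_i$ as a test point, $n_t = \sum_i S_{n,i}$ is the test-set size, $A$ is the learning rule, and $\ell$ the loss:
$$\widehat{R}_{\mathrm{CV}} \;=\; \mathbb{E}_{S_n}\!\left[\frac{1}{n_t}\sum_{i\,:\,S_{n,i}=1} \ell\big(A(\{Z_j : S_{n,j}=0\}),\, Z_i\big)\right].$$
Choosing the distribution of $S_n$ recovers the procedures listed above: a uniform draw over $k$ predetermined folds gives $k$-fold cross-validation, a single fixed split gives hold-out, and splits with exactly one (respectively $\upsilon$) test point give leave-one-out (respectively leave-$\upsilon$-out) cross-validation.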
Cornec Matthieu