The Loss Rank Principle for Model Selection

Mathematics – Statistics Theory

Scientific paper

Details

16 pages

10.1007/978-3-540-72927-3_42

We introduce a new principle for model selection in regression and classification. Many regression models are controlled by a smoothness, flexibility, or complexity parameter c, e.g. the number of neighbors averaged over in k-nearest-neighbor (kNN) regression, or the polynomial degree in polynomial regression. Let f_D^c be the (best) regressor of complexity c on data D. A more flexible regressor can fit more data sets D' well than a more rigid one. If something (here, small loss) is easy to achieve, it is typically worth less. We define the loss rank of f_D^c as the number of other (fictitious) data sets D' that are fitted better by f_{D'}^c than D is fitted by f_D^c. We suggest selecting the model complexity c that has minimal loss rank (LoRP). Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP depends only on the regression function and the loss function. It works without a stochastic noise model and is directly applicable to any non-parametric regressor, such as kNN. In this paper we formalize, discuss, and motivate LoRP, study it for specific regression problems, in particular linear ones, and compare it to other model selection schemes.
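The counting definition in the abstract can be illustrated with a brute-force sketch for kNN on a tiny data set. This is only a toy illustration under several assumptions that are not taken from the abstract: targets are restricted to a finite label set (here {0, 1}) so the fictitious data sets D' can be enumerated, inputs are one-dimensional, each point counts as its own nearest neighbor, the loss is the summed squared error on the training points, and data sets that are fitted equally well are counted along with those fitted better (without counting ties, a regressor that fits everything perfectly would trivially get rank 0). The function names knn_fit_loss, loss_rank, and select_k are hypothetical.

```python
from itertools import product


def knn_fit_loss(x, y, k):
    """Summed squared error of kNN regression evaluated on its own training data.

    Assumption: each point is included among its own k nearest neighbors,
    and distance ties are broken by index order (sorted() is stable).
    """
    n = len(x)
    loss = 0.0
    for i in range(n):
        # Indices of the k points closest to x[i], including i itself.
        nbrs = sorted(range(n), key=lambda j: abs(x[j] - x[i]))[:k]
        pred = sum(y[j] for j in nbrs) / k
        loss += (y[i] - pred) ** 2
    return loss


def loss_rank(x, y, k, labels=(0, 1)):
    """Count the label vectors y' that kNN with this k fits at least as well as y.

    Assumption: ties count, and the observed y itself is included in the count.
    The enumeration has len(labels) ** len(x) terms, so this only works for tiny n.
    """
    observed = knn_fit_loss(x, y, k)
    return sum(
        1
        for y_alt in product(labels, repeat=len(x))
        if knn_fit_loss(x, y_alt, k) <= observed + 1e-12
    )


def select_k(x, y, ks):
    """Return the candidate k with the smallest loss rank (first one wins on ties)."""
    return min(ks, key=lambda k: loss_rank(x, y, k))


if __name__ == "__main__":
    x = [0, 1, 2, 3, 4, 5]
    y = [0, 0, 0, 1, 1, 1]
    for k in (1, 2, 3, 6):
        print(k, loss_rank(x, y, k))
    print("selected k:", select_k(x, y, (1, 2, 3, 6)))
```

With k = 1 every label vector is fitted with zero loss, so the rank is the maximal value len(labels) ** n; the most flexible regressor is penalized exactly because a small loss is easy for it to achieve. For continuous targets the count has to be replaced by some other measure of "how many" data sets are fitted as well, which this sketch does not attempt.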
