Neural Computation, Vol 9, 1143-1161, Copyright © 1997 by The MIT Press


LETTERS

A Bound on the Error of Cross Validation Using the Approximation and Estimation Rates, with Consequences for the Training-test Split

Michael Kearns

We give a theoretical and experimental analysis of the generalization error of cross validation using two natural measures of the problem under consideration. The approximation rate measures the accuracy to which the target function can be ideally approximated as a function of the number of parameters, and thus captures the complexity of the target function with respect to the hypothesis model. The estimation rate measures the deviation between the training and generalization errors as a function of the number of parameters, and thus captures the extent to which the hypothesis model suffers from overfitting. Using these two measures, we give a rigorous and general bound on the error of the simplest form of cross validation. The bound clearly shows the dangers of making γ, the fraction of data saved for testing, too large or too small. By optimizing the bound with respect to γ, we then argue that the following qualitative properties of cross-validation behavior should be quite robust to significant changes in the underlying model selection problem:

  • When the target function complexity is small compared to the sample size, the performance of cross validation is relatively insensitive to the choice of γ.
  • The importance of choosing γ optimally increases, and the optimal value for γ decreases, as the target function becomes more complex relative to the sample size.
  • There is nevertheless a single fixed value for γ that works nearly optimally for a wide range of target function complexity.
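
The simplest form of cross validation discussed above can be sketched as a single training-test split: fit one hypothesis per complexity level on the training portion, then keep the one with the lowest error on the held-out fraction. The Python sketch below is only an illustration under stated assumptions. The piecewise-constant classifier, the alternating-interval target, the noise rate, and every function name (`fit_bins`, `error`, `holdout_select`, `make_sample`) are hypothetical choices made for this example and are not taken from the article.

```python
import random

def fit_bins(train, d):
    """Fit a d-bin piecewise-constant classifier on [0, 1] by majority vote per bin."""
    votes = [0] * d
    for x, y in train:
        votes[min(int(x * d), d - 1)] += 1 if y == 1 else -1
    return [1 if v > 0 else 0 for v in votes]

def error(model, data):
    """Fraction of points in data that the model misclassifies."""
    d = len(model)
    wrong = sum(1 for x, y in data if model[min(int(x * d), d - 1)] != y)
    return wrong / len(data)

def holdout_select(sample, gamma, d_max):
    """Train on the first (1 - gamma) fraction, choose the number of bins on the rest."""
    n_test = int(round(gamma * len(sample)))
    if not 0 < n_test < len(sample):
        raise ValueError("gamma must leave data for both training and testing")
    train, test = sample[:-n_test], sample[-n_test:]
    best = None
    for d in range(1, d_max + 1):  # d plays the role of the number of parameters
        model = fit_bins(train, d)
        test_err = error(model, test)
        if best is None or test_err < best[0]:
            best = (test_err, d, model)
    return best[1], best[2], best[0]

def make_sample(m, s, noise, rng):
    """Assumed target: labels alternate over s equal intervals, flipped with probability noise."""
    sample = []
    for _ in range(m):
        x = rng.random()
        y = min(int(x * s), s - 1) % 2
        if rng.random() < noise:
            y = 1 - y
        sample.append((x, y))
    return sample

if __name__ == "__main__":
    rng = random.Random(0)
    sample = make_sample(500, 4, 0.1, rng)
    for gamma in (0.1, 0.3, 0.5):
        d, _, test_err = holdout_select(sample, gamma, d_max=20)
        print(f"gamma={gamma}: selected d={d}, held-out error={test_err:.3f}")
```

Varying `gamma` and the target complexity `s` in this sketch is one informal way to probe the qualitative behavior listed above; it does not reproduce the article's bound or its experiments.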





