Neural Computation, Vol 10, 1895-1923, Copyright © 1998 by The MIT Press


LETTERS

Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms

Thomas G. Dietterich

This article reviews five approximate statistical tests for determining whether one learning algorithm outperforms another on a particular learning task. These tests are compared experimentally to determine their probability of incorrectly detecting a difference when no difference exists (type I error). Two widely used statistical tests are shown to have a high probability of type I error in certain situations and should never be used: a test for the difference of two proportions and a paired-differences t test based on taking several random train-test splits. A third test, a paired-differences t test based on 10-fold cross-validation, exhibits a somewhat elevated probability of type I error. A fourth test, McNemar's test, is shown to have low type I error. The fifth test is a new test, 5x2 cv, based on five iterations of twofold cross-validation. Experiments show that this test also has acceptable type I error. The article also measures the power (the ability to detect algorithm differences when they do exist) of these tests. The cross-validated t test is the most powerful. The 5x2 cv test is shown to be slightly more powerful than McNemar's test. The choice of the best test is determined by the computational cost of running the learning algorithm. For algorithms that can be executed only once, McNemar's test is the only test with acceptable type I error. For algorithms that can be executed 10 times, the 5x2 cv test is recommended, because it is slightly more powerful and because it directly measures variation due to the choice of training set.
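
The abstract names McNemar's test and the 5x2 cv test without giving their formulas. The following is a minimal sketch, assuming the forms commonly given for these two tests: a continuity-corrected chi-square statistic with 1 degree of freedom for McNemar's test, and a t statistic with 5 degrees of freedom for 5x2 cv. The function names and the input layout are illustrative assumptions and do not come from this page.

```python
import math


def mcnemar_statistic(n01, n10):
    """Continuity-corrected McNemar statistic (assumed form, 1 d.o.f.).

    n01: test examples misclassified by algorithm A but not by B
    n10: test examples misclassified by algorithm B but not by A
    """
    if n01 + n10 == 0:
        # The two classifiers never disagree, so there is no evidence
        # of a difference between them.
        return 0.0
    return (abs(n01 - n10) - 1) ** 2 / (n01 + n10)


def five_by_two_cv_t(diffs):
    """5x2 cv t statistic (assumed form, 5 d.o.f.).

    diffs: five (p1, p2) pairs, one per replication of twofold
    cross-validation. p1 and p2 are the differences in error rate
    between algorithms A and B on the first and second fold.
    """
    if len(diffs) != 5:
        raise ValueError("expected exactly five replications")
    variance_sum = 0.0
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        # Variance estimate for this replication from its two folds.
        variance_sum += (p1 - mean) ** 2 + (p2 - mean) ** 2
    if variance_sum == 0:
        raise ValueError("zero estimated variance; statistic is undefined")
    # The numerator uses only the first difference of the first replication.
    return diffs[0][0] / math.sqrt(variance_sum / 5)


if __name__ == "__main__":
    # Hypothetical counts and error-rate differences, for illustration only.
    print(mcnemar_statistic(10, 25))
    print(five_by_two_cv_t(
        [(0.02, 0.04), (0.03, 0.01), (0.02, 0.02), (0.05, 0.03), (0.01, 0.03)]
    ))
```

Under these assumed forms, a two-sided test at the 0.05 level would reject the hypothesis of equal performance when the McNemar statistic exceeds 3.841 (chi-square, 1 degree of freedom) or when the absolute value of the 5x2 cv statistic exceeds 2.571 (t, 5 degrees of freedom).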


This article has been cited by other articles:

H. Shin and S. Cho
Neighborhood property-based pattern selection for support vector machines.
Neural Comput., March 1, 2007; 19(3): 816 - 855.

D. P. Enot, M. Beckmann, D. Overy, and J. Draper
Predicting interpretability of metabolome models based on behavior, putative identity, and biological relevance of explanatory signals
PNAS, October 3, 2006; 103(40): 14865 - 14870.

S. Hochreiter and K. Obermayer
Support vector machines for dyadic data.
Neural Comput., June 1, 2006; 18(6): 1472 - 1510.

D. Berrar, I. Bradbury, and W. Dubitzky
Avoiding model selection bias in small-sample genomic datasets
Bioinformatics, May 15, 2006; 22(10): 1245 - 1250.

P. Larranaga, B. Calvo, R. Santana, C. Bielza, J. Galdiano, I. Inza, J. A. Lozano, R. Armananzas, G. Santafe, A. Perez, et al.
Machine learning in bioinformatics
Brief Bioinform, March 1, 2006; 7(1): 86 - 112.

A.M. Cohen, W.R. Hersh, K. Peterson, and P.-Y. Yen
Reducing Workload in Systematic Review Preparation Using Automated Citation Classification
J. Am. Med. Inform. Assoc., March 1, 2006; 13(2): 206 - 219.

G. S. Catchpole, M. Beckmann, D. P. Enot, M. Mondhe, B. Zywicki, J. Taylor, N. Hardy, A. Smith, R. D. King, D. B. Kell, et al.
Hierarchical metabolomics demonstrates substantial compositional similarity between genetically modified and conventional potato crops
PNAS, October 4, 2005; 102(40): 14458 - 14462.

S. A. Vinterbo, E.-Y. Kim, and L. Ohno-Machado
Small, fuzzy and interpretable gene expression based classifiers
Bioinformatics, May 1, 2005; 21(9): 1964 - 1970.

A. Statnikov, C. F. Aliferis, I. Tsamardinos, D. Hardin, and S. Levy
A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis
Bioinformatics, March 1, 2005; 21(5): 631 - 643.

R. J. Baddeley, H. A. Ingram, and R. C. Miall
System Identification Applied to a Visuomotor Task: Near-Optimal Human Performance in a Noisy Changing Task
J. Neurosci., April 1, 2003; 23(7): 3066 - 3075.

A. Vehtari and J. Lampinen
Bayesian Model Assessment and Comparison Using Cross-Validation Predictive Densities
Neural Comput., October 1, 2002; 14(10): 2439 - 2468.

H. Schwenk and Y. Bengio
Boosting Neural Networks
Neural Comput., August 1, 2000; 12(8): 1869 - 1887.

E. Alpaydin
Combined 5x2 cv F Test for Comparing Supervised Classification Learning Algorithms
Neural Comput., November 15, 1999; 11(8): 1885 - 1892.

M. Brand
Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction
Neural Comput., July 1, 1999; 11(5): 1155 - 1182.