Neural Computation, Vol 9, 349-368, Copyright © 1997 by The MIT Press
LETTERS
Vijay Balasubramanian
The task of parametric model selection is cast in terms of a statistical mechanics on the space of probability distributions. Using the techniques of low-temperature expansions, I arrive at a systematic series for the Bayesian posterior probability of a model family that significantly extends known results in the literature. In particular, I arrive at a precise understanding of how Occam's razor, the principle that simpler models should be preferred until the data justify more complex models, is automatically embodied by probability theory. These results require a measure on the space of model parameters and I derive and discuss an interpretation of Jeffreys' prior distribution as a uniform prior over the distributions indexed by a family. Finally, I derive a theoretical index of the complexity of a parametric family relative to some true distribution that I call the razor of the model. The form of the razor immediately suggests several interesting questions in the theory of learning that can be studied using the techniques of statistical mechanics.
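As a concrete illustration of the Jeffreys' prior mentioned above, the following minimal sketch works it out for a one-parameter Bernoulli family. The choice of family and the function names are assumptions made for illustration only; they are not taken from the article. For this family the Fisher information is 1 / (p (1 - p)), so the prior density is proportional to its square root, and the normalizing constant is pi. The prior puts more mass near the ends of the parameter range, where a small change in p yields a more distinguishable distribution, which is one way to read "uniform over distributions" rather than uniform over parameters.

```python
import math


def fisher_information_bernoulli(p: float) -> float:
    """Fisher information of one Bernoulli(p) observation: 1 / (p (1 - p))."""
    return 1.0 / (p * (1.0 - p))


def jeffreys_density_bernoulli(p: float) -> float:
    """Jeffreys' prior density: sqrt(Fisher information), normalized by pi."""
    return math.sqrt(fisher_information_bernoulli(p)) / math.pi


def jeffreys_mass_bernoulli(a: float, b: float) -> float:
    """Prior mass of the interval [a, b], using the closed-form integral
    (2 / pi) * (asin(sqrt(b)) - asin(sqrt(a)))."""
    return (2.0 / math.pi) * (math.asin(math.sqrt(b)) - math.asin(math.sqrt(a)))


if __name__ == "__main__":
    # Two parameter intervals of equal width 0.1 (hypothetical choices).
    edge = jeffreys_mass_bernoulli(0.0, 0.1)
    middle = jeffreys_mass_bernoulli(0.45, 0.55)
    print(f"mass of [0.00, 0.10]: {edge:.4f}")
    print(f"mass of [0.45, 0.55]: {middle:.4f}")
    print(f"total mass on [0, 1]: {jeffreys_mass_bernoulli(0.0, 1.0):.4f}")
```

A prior that is flat in p would give both intervals the same mass; under the Jeffreys' prior the interval at the edge receives more than the one in the middle.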
This article has been cited by other articles:
M. B. Kennel, J. Shlens, H. D. I. Abarbanel, and E. J. Chichilnisky. Estimating Entropy Rates with Bayesian Confidence Intervals. Neural Comput., July 1, 2005; 17(7): 1531-1576.
S. Still and W. Bialek. How Many Clusters? An Information-Theoretic Perspective. Neural Comput., December 1, 2004; 16(12): 2483-2506.
D. J. Navarro. A Note on the Applied Use of MDL Approximations. Neural Comput., September 1, 2004; 16(9): 1763-1768.
W. Bialek, I. Nemenman, and N. Tishby. Predictability, Complexity, and Learning. Neural Comput., November 1, 2001; 13(11): 2409-2463.
I. J. Myung, V. Balasubramanian, and M. A. Pitt. Counting probability distributions: Differential geometry and model selection. PNAS, September 22, 2000; (2000) 170283897.
M. Brand. Structure Learning in Conditional Probability Models via an Entropic Prior and Parameter Extinction. Neural Comput., July 1, 1999; 11(5): 1155-1182.
I. J. Myung, V. Balasubramanian, and M. A. Pitt. Counting probability distributions: Differential geometry and model selection. PNAS, October 10, 2000; 97(21): 11170-11175.