Neural Comp. Sign up for ETOCS
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


This Article
Right arrow Full Text
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Miller, D. J.
Right arrow Articles by Yan, L.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Miller, D. J.
Right arrow Articles by Yan, L.
(Neural Computation. 2000;12:2175-2207.)
© 2000 The MIT Press


Letter

Approximate Maximum Entropy Joint Feature Inference Consistent with Arbitrary Lower-Order Probability Constraints: Application to Statistical Classification

David J. Miller

Department of Electrical Engineering, The Pennsylvania State University, University Park, PA 16802, U.S.A.

Lian Yan

Department of Electrical Engineering, The Pennsylvania State University, University Park, PA 16802, U.S.A.

We propose a new learning method for discrete space statistical classifiers. Similar to Chow and Liu (1968) and Cheeseman (1983), we cast classification/inference within the more general framework of estimating the joint probability mass function (p.m.f.) for the (feature vector, class label) pair. Cheeseman's proposal to build the maximum entropy (ME) joint p.m.f. consistent with general lower-order probability constraints is in principle powerful, allowing general dependencies between features. However, enormous learning complexity has severely limited the use of this approach. Alternative models such as Bayesian networks (BNs) require explicit determination of conditional independencies. These may be difficult to assess given limited data. Here we propose an approximate ME method, which, like previous methods, incorporates general constraints while retaining quite tractable learning. The new method restricts joint p.m.f. support during learning to a small subset of the full feature space. Classification gains are realized over dependence trees, tree-augmented naive Bayes networks, BNs trained by the Kutato algorithm, and multilayer perceptrons. Extensions to more general inference problems are indicated. We also propose a novel exact inference method when there are several missing features.




This article has been cited by other articles:


Home page
Neural Comput.Home page
D. J. Miller and S. Pal
Transductive methods for the distributed ensemble classification problem.
Neural Comput., March 1, 2007; 19(3): 856 - 884.
[Abstract] [Full Text] [PDF]




HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
J COGNITIVE NEUROSCIENCE NEURAL COMPUTATION MIT PRESS JOURNALS
Copyright © 2000 by The MIT Press.