(Neural Computation. 2007;19:1528-1567.)
© 2007 The MIT Press


Letter

Penalized Probabilistic Clustering

Zhengdong Lu

zhengdon{at}csee.ogi.edu Department of Computer Science and Engineering, OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, OR 97006, U.S.A.

Todd K. Leen

tleen{at}csee.ogi.edu Department of Computer Science and Engineering, OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, OR 97006, U.S.A.

While clustering is usually an unsupervised operation, there are circumstances in which we believe (with varying degrees of certainty) that items A and B should be assigned to the same cluster, while items A and C should not. We would like such pairwise relations to influence cluster assignments of out-of-sample data in a manner consistent with the prior knowledge expressed in the training set. Our starting point is probabilistic clustering based on gaussian mixture models (GMM) of the data distribution. We express clustering preferences in a prior distribution over assignments of data points to clusters. This prior penalizes cluster assignments according to the degree to which they violate the preferences. The model parameters are fit with the expectation-maximization (EM) algorithm. Our model provides a flexible framework that encompasses several other semisupervised clustering models as special cases. Experiments on artificial and real-world problems show that our model consistently improves clustering results when pairwise relations are incorporated. The experiments also demonstrate the superiority of our model over other semisupervised clustering methods in handling noisy pairwise relations.
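
The idea of a prior that penalizes assignments violating pairwise preferences can be illustrated with a small sketch. The abstract does not give the functional form of the prior, so the code below assumes one plausible choice: each pair (i, j) carries a hypothetical weight w, and a joint assignment gains a factor exp(w) whenever the two points share a cluster (w > 0 acts as a soft must-link, w < 0 as a soft cannot-link). The function names, the one-dimensional data, and the brute-force enumeration over assignments are all illustrative assumptions, not the authors' implementation, and the EM parameter fitting is not shown.

```python
import itertools
import math


def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def assignment_posterior(xs, means, stds, mix, weights):
    """Posterior over joint cluster assignments, by brute-force enumeration.

    weights maps a pair (i, j) to a preference weight (an assumed form):
    the unnormalized score of an assignment is multiplied by exp(w)
    whenever points i and j are placed in the same cluster.
    Only feasible for a handful of points; used here for illustration.
    """
    n, k = len(xs), len(means)
    scores = {}
    for z in itertools.product(range(k), repeat=n):
        log_score = 0.0
        # Standard mixture-model term for each point.
        for i, zi in enumerate(z):
            log_score += math.log(mix[zi] * gaussian_pdf(xs[i], means[zi], stds[zi]))
        # Pairwise preference term (the assumed penalty/reward).
        for (i, j), w in weights.items():
            if z[i] == z[j]:
                log_score += w
        scores[z] = math.exp(log_score)
    total = sum(scores.values())
    return {z: s / total for z, s in scores.items()}


def same_cluster_prob(posterior, i, j):
    """Posterior probability that points i and j share a cluster."""
    return sum(p for z, p in posterior.items() if z[i] == z[j])


xs = [-1.0, 0.2, 1.0]
means, stds, mix = [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]

neutral = same_cluster_prob(assignment_posterior(xs, means, stds, mix, {}), 0, 1)
linked = same_cluster_prob(assignment_posterior(xs, means, stds, mix, {(0, 1): 2.0}), 0, 1)
split = same_cluster_prob(assignment_posterior(xs, means, stds, mix, {(0, 1): -2.0}), 0, 1)
print(linked, neutral, split)
```

Under this assumed form, the magnitude of the weight plays the role of the "degree of certainty" mentioned above: a weight of zero recovers the ordinary mixture model, and larger magnitudes pull the posterior further toward respecting the stated relation without forbidding violations outright.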






