(Neural Computation. 2004;16:115-137.)
© 2004 The MIT Press


Letter

Asymptotic Properties of the Fisher Kernel

Koji Tsuda

koji.tsuda{at}tuebingen.mpg.de, Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany, and AIST Computational Biology Research Center, Koto-ku, Tokyo, 135-0064, Japan

Shotaro Akaho

s.akaho{at}aist.go.jp, AIST Neuroscience Research Institute, Tsukuba, 305-8568, Japan

Motoaki Kawanabe

nabe{at}first.fhg.de, Fraunhofer FIRST, 12489 Berlin, Germany

Klaus-Robert Müller

klaus{at}first.fhg.de, Fraunhofer FIRST, 12489 Berlin, Germany, and University of Potsdam, 14482 Potsdam, Germany

This letter analyzes the Fisher kernel from a statistical point of view. The Fisher kernel is a particularly interesting method for constructing a model of the posterior probability that makes intelligent use of unlabeled data (i.e., of the underlying data density). It is important to analyze and ultimately understand the statistical properties of the Fisher kernel. To this end, we first establish sufficient conditions under which the constructed posterior model is realizable (i.e., it contains the true distribution). Realizability immediately leads to consistency results. Subsequently, we focus on an asymptotic analysis of the generalization error, which elucidates the learning curves of the Fisher kernel and how unlabeled data contribute to learning. We also point out that, when a linear classifier is used together with the Fisher kernel, the squared or log loss is theoretically preferable to other losses such as the exponential loss, because both yield consistent estimators. Therefore, this letter underlines that the Fisher kernel should be viewed not as a heuristic but as a powerful statistical tool with well-controlled statistical properties.
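The abstract does not spell out how a Fisher kernel is computed, so the following is only a minimal sketch under stated assumptions. It uses the standard definition K(x, x') = s(x)^T I^{-1} s(x'), where s(x) is the gradient of the log-likelihood with respect to the model parameters (the Fisher score) and I is the Fisher information matrix. The one-dimensional Gaussian density model and all function names (`fit_gaussian`, `fisher_score`, `fisher_information`, `fisher_kernel`) are hypothetical choices made for illustration, not the model analyzed in the letter.

```python
# Illustrative sketch only: a toy 1-D Gaussian density model (an assumption,
# not the model studied in the letter).

def fit_gaussian(samples):
    """Maximum-likelihood mean and variance from (unlabeled) samples."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

def fisher_score(x, mu, var):
    """Gradient of log N(x | mu, var) with respect to (mu, var)."""
    d = x - mu
    return (d / var, -0.5 / var + 0.5 * d * d / (var * var))

def fisher_information(var):
    """Diagonal of the Fisher information matrix for parameters (mu, var)."""
    return (1.0 / var, 0.5 / (var * var))

def fisher_kernel(x1, x2, mu, var):
    """K(x1, x2) = s(x1)^T I^{-1} s(x2); I is diagonal for this model."""
    s1 = fisher_score(x1, mu, var)
    s2 = fisher_score(x2, mu, var)
    info = fisher_information(var)
    return sum(a * b / i for a, b, i in zip(s1, s2, info))

# Usage: the density is estimated from unlabeled data, then the kernel is
# evaluated on any pair of points.
unlabeled = [0.2, -1.1, 0.7, 1.9, -0.4, 0.1]
mu_hat, var_hat = fit_gaussian(unlabeled)
k = fisher_kernel(0.5, 1.5, mu_hat, var_hat)
```

In this sketch the unlabeled samples enter only through the estimated density parameters, which is one simple way to picture how the kernel draws on the underlying data density. A linear classifier trained on the resulting kernel values with a squared or log loss would correspond to the setting the abstract describes.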






