(Neural Computation. 2008;20:813-843.)
© 2008 The MIT Press


Letter

Dynamics of Learning Near Singularities in Layered Networks

Haikun Wei

weihaikun{at}brain.riken.jp RIKEN Brain Science Institute, Saitama, 3510198, Japan, Southeast University, Nanjing, 210096, China, and Kyushu Institute of Technology, Kitakyushu 8080196, Japan

Jun Zhang

junz{at}umich.edu RIKEN Brain Science Institute, Saitama, 3510198, Japan, and University of Michigan, Ann Arbor, MI 48109, U.S.A.

Florent Cousseau

florent{at}mns.k.u-tokyo.ac.jp RIKEN Brain Science Institute, Saitama, 3510198, Japan, and University of Tokyo, Chiba, 2778561, Japan

Tomoko Ozeki

tozeki{at}tokai.ac.jp RIKEN Brain Science Institute, Saitama, 3510198, Japan, and Tokai University, Kanagawa, 2591292, Japan

Shun-ichi Amari

amari{at}brain.riken.jp RIKEN Brain Science Institute, Saitama, 3510198, Japan

We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, and show their general properties. Such symmetry induces singularities in their parameter space, where the Fisher information matrix degenerates and odd learning behaviors, especially the existence of plateaus in gradient descent learning, arise due to the geometric structure of singularity. We plot dynamic vector fields to demonstrate the universal trajectories of learning near singularities. The singularity induces two types of plateaus, the on-singularity plateau and the near-singularity plateau, depending on the stability of the singularity and the initial parameters of learning. The results presented in this letter are universally applicable to a wide class of hierarchical models. Detailed stability analysis of the dynamics of learning in radial basis function networks and multilayer perceptrons will be presented in separate work.
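The degeneracy of the Fisher information matrix described above can be checked numerically on a toy model. The following is a minimal sketch under assumptions that are not taken from the letter: a one-dimensional network f(x) = v1 tanh(w1 x) + v2 tanh(w2 x) with two hidden units, standard Gaussian inputs, and unit-variance Gaussian output noise, so that the Fisher matrix is the expected outer product of the gradient of f. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def grad_f(x, w1, w2, v1, v2):
    """Gradient of f(x) = v1*tanh(w1*x) + v2*tanh(w2*x)
    with respect to (w1, w2, v1, v2), one row per input sample."""
    t1, t2 = np.tanh(w1 * x), np.tanh(w2 * x)
    return np.stack(
        [v1 * x * (1 - t1**2), v2 * x * (1 - t2**2), t1, t2], axis=1
    )

def fisher(theta, xs):
    """Monte Carlo estimate of the Fisher matrix E[grad f grad f^T],
    assuming unit-variance Gaussian output noise."""
    g = grad_f(xs, *theta)
    return g.T @ g / len(xs)

# Assumed input distribution: standard normal samples.
xs = np.random.default_rng(0).normal(size=2000)

# A generic point: the two hidden units differ.
regular = fisher((0.5, 1.5, 1.0, -0.7), xs)

# A point on the singularity w1 == w2: the two hidden units coincide,
# so swapping them (or shifting weight between v1 and v2) leaves f unchanged.
singular = fisher((1.0, 1.0, 0.6, 0.4), xs)

rank_regular = np.linalg.matrix_rank(regular, tol=1e-10)
rank_singular = np.linalg.matrix_rank(singular, tol=1e-10)
print("rank at generic point:", rank_regular)
print("rank on singularity:  ", rank_singular)
```

In this toy model the gradient columns for v1 and v2 coincide when w1 = w2, and the columns for w1 and w2 become proportional, so the estimated Fisher matrix loses rank on the singularity while remaining full rank at a generic point.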






