Dr Bernd Porr

 

Isotropic sequence order learning

Bernd Porr and Florentin Wörgötter

Neural Comp., 15, 831-864.


In this article we present an isotropic, unsupervised algorithm for temporal sequence learning. No special reward signal is used, so all inputs are completely isotropic. All input signals are bandpass filtered before converging onto a linear output neuron. All synaptic weights change according to the correlation of the bandpass-filtered inputs with the derivative of the output. We investigate the algorithm in an open-loop and a closed-loop condition, the latter being defined by embedding the learning system into a behavioural feedback loop. In the open-loop condition we find that the linear structure of the algorithm allows the shape of the weight change to be calculated analytically; it is strictly hetero-synaptic and follows the shape of the weight change curves found in spike-timing-dependent plasticity. Furthermore, we show that synaptic weights stabilise automatically, without additional normalising measures, when no more temporal differences exist between the inputs. In the second part of this study, the algorithm is placed into an environment which leads to a closed sensor-motor loop. To this end a robot is programmed with a pre-wired retraction reflex in response to collisions. Through ISO-learning the robot achieves collision avoidance by learning the correlation between its early range-finder signals and the later occurring collision signal. Synaptic weights stabilise at the end of learning as theoretically predicted. Finally we discuss the relation of ISO-learning to other drive reinforcement models and to the commonly used temporal difference (TD) learning algorithm. This study is followed up by a mathematical analysis of the closed-loop situation in the accompanying article.
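The open-loop learning rule described above can be illustrated with a minimal discrete-time sketch. This is not the authors' code. It assumes a damped-sine resonator as the bandpass filter, a backward difference as the derivative of the output, and single-sample pulses as inputs. The names `bandpass_kernel`, `convolve` and `iso_learn` and the parameters `f`, `q` and `mu` are hypothetical and chosen for illustration only.

```python
import math

def bandpass_kernel(f, q, n):
    """Impulse response of a damped-sine resonator (assumed filter form)."""
    a = -math.pi * f / q                            # decay rate (negative)
    b = math.sqrt((2 * math.pi * f) ** 2 - a ** 2)  # oscillation rate, needs q > 0.5
    return [math.exp(a * t) * math.sin(b * t) / b for t in range(n)]

def convolve(x, h):
    """Causal convolution of the signal x with the kernel h."""
    return [sum(x[t - k] * h[k] for k in range(min(t + 1, len(h))))
            for t in range(len(x))]

def iso_learn(inputs, weights, mu=1e-4, f=0.01, q=0.6):
    """Change every weight by mu * (filtered input) * (derivative of output)."""
    n = len(inputs[0])
    h = bandpass_kernel(f, q, n)
    u = [convolve(x, h) for x in inputs]            # bandpass-filtered inputs
    w = list(weights)
    v_prev = 0.0
    for t in range(n):
        v = sum(wj * uj[t] for wj, uj in zip(w, u)) # linear output neuron
        dv = v - v_prev                             # backward-difference derivative
        for j in range(len(w)):
            w[j] += mu * u[j][t] * dv               # same rule for all inputs
        v_prev = v
    return w

def pulse(n, at):
    """Signal of length n that is 1 at sample `at` and 0 elsewhere."""
    return [1.0 if t == at else 0.0 for t in range(n)]

# Input 0 plays the role of the late reflex signal (initial weight 1),
# input 1 the role of the early predictive signal (initial weight 0).
n = 300
w_early = iso_learn([pulse(n, 70), pulse(n, 50)], [1.0, 0.0])
w_late = iso_learn([pulse(n, 50), pulse(n, 70)], [1.0, 0.0])
print("input 1 precedes input 0:", w_early[1])
print("input 1 follows input 0: ", w_late[1])
```

With these assumed settings the weight of input 1 grows when its pulse precedes the reflex pulse and shrinks when it follows it, which is the sign pattern the abstract relates to spike-timing-dependent plasticity. No normalisation step is applied in the sketch.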
