On one hand, the visual system has the ability to differentiate between very similar
objects. On the other hand, we can also recognize the same object in images that vary
drastically, due to differences in viewing angle, distance, or illumination. The ability to
recognize the same object under different viewing conditions is called invariant object
recognition. Such object recognition capabilities are not immediately available after
birth, but are acquired through learning by experience in the visual world.
In many viewing situations, different views of the same object are seen in a temporal
sequence, e.g., when we are moving an object in our hands while watching it.
This creates temporal correlations between successive retinal projections that can be
used to associate different views of the same object. Theorists have therefore proposed
a synaptic plasticity rule with a built-in memory trace (trace rule).
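As an illustration of the trace-rule idea mentioned above, the following is a minimal rate-based sketch in the style of the classic trace rule from the literature. It is not the spiking model developed in this dissertation; the function name `trace_rule_step` and the parameters `eta` (trace decay) and `alpha` (learning rate) are assumptions chosen for the example.

```python
def trace_rule_step(w, x, y, y_trace, eta=0.8, alpha=0.1):
    """One update of a trace-style Hebbian rule (illustrative, rate-based).

    w       : list of synaptic weights onto one output unit
    x       : list of presynaptic activities (the current view)
    y       : instantaneous postsynaptic activity
    y_trace : running memory trace of past postsynaptic activity
    """
    # The trace is a decaying average of recent postsynaptic activity.
    y_trace = eta * y_trace + (1.0 - eta) * y
    # The weight change is gated by the trace, not by the instantaneous
    # activity, so a view shown shortly after an effective view is also learned.
    w = [wi + alpha * y_trace * (xi - wi) for wi, xi in zip(w, x)]
    return w, y_trace


# Two successive "views" of one object: the unit responds to the first view
# only, yet the trace lets the second view strengthen its synapse as well.
w, tr = [0.0, 0.0], 0.0
w, tr = trace_rule_step(w, [1.0, 0.0], 1.0, tr)
w, tr = trace_rule_step(w, [0.0, 1.0], 0.0, tr)
```

In this toy run the second weight grows even though the unit is silent during the second view, which is the association of temporally adjacent views that the trace is meant to provide.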
In this dissertation I present spiking neural network models that offer possible
explanations for learning of invariant object representations. These models are based
on the following hypotheses:
1. Instead of a synaptic trace rule, persistent firing of recurrently connected groups
of neurons can serve as a memory trace for invariance learning.
2. Short-range excitatory lateral connections enable learning of self-organizing
topographic maps that represent temporal as well as spatial correlations.
3. When trained with sequences of object views, such a network can learn representations
that enable invariant object recognition by clustering different views
of the same object within a local neighborhood.
4. Learning of representations for very similar stimuli can be enabled by adaptive
inhibitory feedback connections.
The study presented in chapter 3.1 details an implementation of a spiking neural
network to test the first three hypotheses. This network was tested with stimulus
sets that were designed in two feature dimensions to separate the impact of temporal
and spatial correlations on learned topographic maps. The emerging topographic
maps showed patterns that were dependent on the temporal order of object views
during training. Our results show that pooling over local neighborhoods of the
topographic map enables invariant recognition.
Chapter 3.2 focuses on the fourth hypothesis. There we examine how adaptive
feedback inhibition (AFI) can improve the ability of a network to discriminate between
very similar patterns. The results show that with AFI, learning is faster, and the
network learns selective representations for stimuli with higher levels of overlap
than without AFI.
Results of chapter 3.1 suggest a functional role for topographic object representations
that are known to exist in the inferotemporal cortex, and suggest a mechanism
for the development of such representations. The AFI model implements one aspect
of predictive coding: subtraction of a prediction from the actual input of a system. The
successful implementation in a biologically plausible network of spiking neurons
shows that predictive coding can play a role in cortical circuits.
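The subtraction step named above can be illustrated with a small rate-based sketch. It is only a toy under stated assumptions, not the spiking AFI network: the helper names `subtract_prediction` and `shared_active`, the example patterns, and the assumption that the feedback has learned stimulus A exactly are all hypothetical.

```python
def subtract_prediction(inp, prediction):
    """Subtract a prediction from the input, clipped at zero (rates are non-negative)."""
    return [max(0.0, i - p) for i, p in zip(inp, prediction)]


def shared_active(a, b):
    """Count the units that are active in both patterns."""
    return sum(1 for va, vb in zip(a, b) if va > 0 and vb > 0)


# Two highly overlapping stimuli (three of four active units are shared).
stimulus_a = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
stimulus_b = [0.0, 1.0, 1.0, 1.0, 1.0, 0.0]

# Hypothetical assumption: inhibitory feedback has learned to predict stimulus A.
prediction_a = list(stimulus_a)

# When stimulus B arrives, only the part not explained by the prediction remains.
residual_b = subtract_prediction(stimulus_b, prediction_a)

overlap_before = shared_active(stimulus_a, stimulus_b)
overlap_after = shared_active(stimulus_a, residual_b)
```

In this toy case the residual contains only the unit that is unique to stimulus B, which gives an intuition for why removing the predicted part of the input could make very similar patterns easier to tell apart.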