
BioSystems 50 (1999) 173–188

Memories in context
Andrés Pomi Brea, Eduardo Mizraji *
Sección Biofísica, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay. Received 21 May 1998; received in revised form 11 July 1998; accepted 9 November 1998

Abstract

Context-dependent associative memories are models that allow the retrieval of different vectorial responses to the same vectorial stimulus, depending on the context presented to the memory. The contextualization is obtained by performing the Kronecker product between the two vectorial entries to the associative memory: the key stimulus and the context. These memories are able to display a wide variety of behaviors that range from all the basic operations of the logical calculus (including fuzzy logics) to the selective extraction of features from complex vectorial patterns. In the present contribution, we show that a context-dependent memory matrix stores a large number of possible virtual associative memories that awaken in the presence of a context. We show how the vectorial context allows a memory matrix to be represented in terms of its singular-value decomposition. We describe a neural interpretation of the model in which the Kronecker product is performed on the same neurons that sustain the memory. We explored, with numerical experiments, the reliability of chains of contextualized associations. In some cases, random disconnection produces the emergence of oscillatory behaviors of the system. Our results show that associative chains retain their performance for relatively large dimensions. Finally, we analyze the properties of some modules of context-dependent autoassociative memories inserted in recursive nets: the perceptual autoorganization in the presence of ambiguous inputs (e.g. the disambiguation of the Necker cube figure), the construction of intersection filters, and the feature extraction capabilities. © 1999 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Associative memories; Necker cube; Neural networks

1. Introduction

The associative processes established by a biological memory are submerged in contexts. These contexts are usually relevant to define the biological value of the associations. It is obvious that the image of a hungry tiger does not produce in a human observer the same associative process when it is seen on the screen of a TV set or in the middle of a jungle. These associative processes are sustained by the activity of large neural networks.

* Corresponding author. Present address: Casilla de Correos 6695, 11000 Montevideo, Uruguay. Fax: +5982-525-8629. E-mail address: mizraj@fcien.edu.uy (E. Mizraji)

0303-2647/99/$ - see front matter © 1999 Elsevier Science Ireland Ltd. All rights reserved. PII: S0303-2647(99)00005-2


In the forthcoming years it is expected that new technologies (e.g. PET) will provide a very complete map of the spatio-temporal patterns linked to memory events. The richness and complexity of mnesic processes exceed the power of almost all present theoretical approaches (for descriptions of the complexity of human memories, see Schacter, 1996). Nevertheless, neural models have been extremely important for the incorporation of many new insights and mathematical techniques into this research field. A large diversity of biologically inspired models of memory has been investigated (see Anderson, 1995). In some of these models, the activity of a large group of neurons is represented by high-dimensional vectors. These vectors are processed by associative memories and, in some cases, the core of these memories can be represented by a class of correlation matrices. The information stored by these memories resides in the matrix coefficients. Around this basic layout, a number of nonlinear pre- or post-processing stages are added. It has been demonstrated that the preprocessing of inputs using multiplicative vector contexts drastically enlarges the abilities of linear distributed memories (Humphreys et al., 1989; Mizraji, 1989). In particular, when the input and the context vectors are composed using the Kronecker product, the memories are able to execute all the logical operations of propositional calculus (including the exclusive-or), to execute modal operations by means of recursive processes, and to exhibit an adaptive searching of goals (Mizraji et al., 1994). A variety of biophysical mechanisms for the multiplicative processing of neural signals has been reported (see, for example, Koch and Poggio, 1992). In a previous article (Mizraji et al., 1994) we centered the attention on some requirements for the biological plausibility of these context-dependent models.
In that sense, the mathematical model analyzed there was anatomically interpreted as having a multiplicative net previous to the associative memory. Furthermore, we studied the reliability of a single context-dependent association for the case in which the multiplicative contextualization is based on a net with local imperfections, as plausibly occurs in the central nervous systems of mammals. We also showed in that article that recursive heteroassociative context-dependent memories were able to execute the operations of modal logic and to perform the adaptive searching of goals.

In the present work we expand the theory of context-dependent distributed associative memories in three main directions: the mechanisms of the context action, the dialog with neurobiological facts, and the properties of recursive modules with autoassociative memories. In Section 2 we show that when the distributed memory stores the input composed with the context via the Kronecker product, the resulting memory implies the superposition of a variety of associative memories over the same structural support. The vectorial contexts, which act as semantic thresholds for the access to a particular kind of associative memory, lead to the singular-value decomposition of the matrix memory. In that section we also revisit the neurobiological basis of these models, and we present an alternative anatomical interpretation of the multiplicative model. In Section 3 we study, via numerical experiments, the reliability of those cognitive-like properties of the model that depend on recurrent processes, in the presence of partial destruction or statistical irregularities of the physical support. From the results emerges the concept that the distribution of the associations over all the elements of the memory is not uniform. Finally, in Section 4, we show some properties of context-dependent autoassociative memories in recursive models, and how they can be used as feature extractors.

2. Virtual memories

In this section we describe a model where the input-output associations of a matrix memory are dependent on multiplicative vectorial contexts. In this framework, the presence of a context promotes a remarkable operation that transforms the real matrix, over which we are truly computing, into what we shall call a virtual matrix. We show that this transformation promoted by the context leads to the singular-value decomposition of the matrix memory (the singular-value decomposition of rectangular matrices is a mathematical method with an important place in the field of image processing; see, for instance, Jain, 1989).

In distributed associative memory models (in the following we will refer to them simply as memories or associative memories), the meaningful event is the activity of a large group of neurons, naturally represented by high-dimensional vectors. These vectors are processed by associative memories which store the information distributed and superposed over matrix coefficients. Let M be a memory that associates m-dimensional inputs f with n-dimensional outputs g. In general, a given memory M^(k) can be characterized by the fact that each particular association (f_i, g_i) is weighted with a coefficient v_i^(k) that measures the intensity of the presence of this association in the memory. In the simplest case, when the input set {f} is orthonormal, this class of hetero-associative memory can be represented by a correlation matrix (Kohonen, 1977) with the following structure:

M^(k) = Σ_i v_i^(k) g_i f_i^T.
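This structure can be checked numerically. The sketch below (Python/NumPy; the dimensions and weight values are illustrative choices, not values from the paper) builds M^(k) from orthonormal sets and verifies that the coefficients v_i^(k) appear as its nonzero singular values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Orthonormal input and output sets: columns of random orthogonal matrices.
m, k = 8, 3
F = np.linalg.qr(rng.standard_normal((m, m)))[0][:, :k]   # f_1 .. f_k
G = np.linalg.qr(rng.standard_normal((m, m)))[0][:, :k]   # g_1 .. g_k
v = np.array([3.0, 2.0, 0.5])                             # weights v_i^(k)

# M^(k) = sum_i v_i^(k) g_i f_i^T
M = sum(v[i] * np.outer(G[:, i], F[:, i]) for i in range(k))

# With orthonormal {f} and {g}, the v_i^(k) are the singular values of M^(k),
# and M^(k) f_i = v_i^(k) g_i,  [M^(k)]^T g_i = v_i^(k) f_i.
s = np.linalg.svd(M, compute_uv=False)
assert np.allclose(s[:k], v)                     # nonzero singular values
assert np.allclose(M @ F[:, 0], v[0] * G[:, 0])
assert np.allclose(M.T @ G[:, 1], v[1] * F[:, 1])
```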

As usual, the superscript T indicates matrix transposition. Note also that if both sets {f} and {g} are orthonormal, the coefficients v_i^(k) are the singular values of the rectangular matrix M^(k), being

M^(k) f_i = v_i^(k) g_i,   [M^(k)]^T g_i = v_i^(k) f_i.

We would like to mention that orthogonality is an expected condition for large random vectors with an equiprobable distribution of signs in their components (Mizraji et al., 1994).

The models of context-dependent associative memories, like all associative memory models, are based on a matrix over which a multiplicity of vectorial associations are superimposed. In this case the associations are not simply input-output associations, but associations of outputs with a tensorial array of two vectors: an entry and a context (Mizraji, 1989). In context-dependent models the input f_i and the context p_i vectors are multiplied via the Kronecker product before entering memory M, represented by a correlation matrix with the following structure:

M = Σ_i g_i (f_i ⊗ p_i)^T.

In general, u ⊗ v represents the Kronecker product between u and v. In this expression each term, an association, can also be weighted by a scalar coefficient representing the frequency of this association. Nevertheless, in what follows, and in order to simplify the expressions, we assume that this coefficient equals unity (Scheme 1).

Given a global context-dependent memory M, a particular r-dimensional vectorial context p_k selects a particular virtual hetero-associative matrix M^(k). Each of these virtual matrices, a kind of ghost evoked by the context, represents a particular associative memory. The context-dependent memory is a kind of hardware in which a large set of virtual matrices is superimposed. We now show how this performance is achieved. An input f embedded in the context p_k can be factorized as follows:

f ⊗ p_k = (I ⊗ p_k) f,

where I is the identity matrix of dimension m. Consequently, the way in which the memory M recognizes this pattern can be described in the following way:

M (f ⊗ p_k) = M^(k) f,  where  M^(k) = Σ_i ⟨p_i, p_k⟩ g_i f_i^T

is a virtual memory, dissected from the structural memory M by means of a particular context p_k. In this representation, the singular values of matrix M^(k) correspond to the scalar products ⟨p_i, p_k⟩.
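The virtual-memory mechanism can be made concrete in a few lines. In this sketch (Python/NumPy; the random orthonormal vectors and small dimensions are our illustrative choices) the same stimulus f1 stored under two contexts yields two different responses, and the factorization f ⊗ p_k = (I ⊗ p_k) f is verified.

```python
import numpy as np

def orthonormal_set(dim, count, seed):
    # Columns of a random orthogonal matrix form an orthonormal set.
    q = np.linalg.qr(np.random.default_rng(seed).standard_normal((dim, dim)))[0]
    return [q[:, i] for i in range(count)]

m = r = 4
f1, = orthonormal_set(m, 1, 1)       # one stimulus
p1, p2 = orthonormal_set(r, 2, 2)    # two contexts
g1, g2 = orthonormal_set(m, 2, 3)    # two responses

# Same stimulus under two contexts, two different associations:
# M = g1 (f1 ⊗ p1)^T + g2 (f1 ⊗ p2)^T
M = np.outer(g1, np.kron(f1, p1)) + np.outer(g2, np.kron(f1, p2))

# The context selects which virtual memory answers the stimulus.
assert np.allclose(M @ np.kron(f1, p1), g1)
assert np.allclose(M @ np.kron(f1, p2), g2)

# Factorization used in the text: f ⊗ p_k = (I ⊗ p_k) f
assert np.allclose(np.kron(np.eye(m), p1.reshape(-1, 1)) @ f1, np.kron(f1, p1))
```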

Scheme 1.


Remark that the vectors g_i, f_i and p_i represent a kind of structural information, preinstalled in the memory. On the other hand, the vectors f and p_k represent a kind of on-line information that interacts with the preestablished memory. Hence, the potential flexibility of this class of memories emerges from a compromise between structural rigidity and the selectivity capabilities of the on-line information. Of course, structural rigidity implies a time scale in which the memory can be considered a quasi-static device and the self-organizing abilities (due to learning) are not considered.

An auto-associative memory associates a given m-dimensional input f_i with itself. In this case the memory is represented by a square matrix with the general form:

M^(k) = Σ_i v_i^(k) f_i f_i^T.
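A small sketch of contextual selection for the autoassociative case (Python/NumPy; the 4-dimensional Walsh vectors and the blend weights 0.9/0.4 are illustrative assumptions, not values from the paper): iterating the recall under a blended context amplifies the pattern whose context component is heaviest.

```python
import numpy as np

# Orthonormal 4-dimensional Walsh vectors
f1 = np.array([1, 1, 1, 1]) / 2.0
f2 = np.array([1, -1, 1, -1]) / 2.0
p1 = np.array([1, 1, -1, -1]) / 2.0
p2 = np.array([1, -1, -1, 1]) / 2.0

# Context-dependent autoassociative memory: A = sum_i f_i (f_i ⊗ p_i)^T
A = np.outer(f1, np.kron(f1, p1)) + np.outer(f2, np.kron(f2, p2))

# A blended context closer to p1 than to p2 (hypothetical weights)
p_mix = 0.9 * p1 + 0.4 * p2
p_mix /= np.linalg.norm(p_mix)

# Under p_mix the virtual matrix has proper values <p1,p_mix> and <p2,p_mix>;
# iterating the recall and renormalizing lets the largest one win.
x = (f1 + f2) / np.sqrt(2)          # ambiguous mixed input
for _ in range(30):
    x = A @ np.kron(x, p_mix)
    x /= np.linalg.norm(x)

assert abs(x @ f1) > 0.999 and abs(x @ f2) < 1e-6   # f1 dominates the recall
```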

When {f} is an orthonormal basis, the coefficients v_i^(k) are proper values (eigenvalues) of the matrix M^(k). During iterative processes operating over mixed patterns introduced into the memory, these proper values determine the evolution of the patterns expressed during the recall. High proper values reinforce the associated pattern; instead, very low proper values can produce the extinction of the corresponding pattern. In this way, the spectrum of proper values determines the selectivity capabilities of an auto-associative memory facing mixed inputs (Anderson et al., 1977). In classical models, this spectrum is fixed. The context-dependent autoassociative memory is a rectangular matrix of dimension m × (mr). It differs from the corresponding hetero-associative matrix only in the fact that the output g is the same input f. In these context-dependent memories, the preexisting set of contexts promotes a variety of potentially different spectra of proper values that allow an adaptive pattern analysis (Mizraji, 1989). The selection of a given spectrum of proper values operates with the same mathematical scheme shown for hetero-associative memories.

The models of distributed associative memories have a known general neurobiological basis (see, for instance, Anderson and Hinton, 1989). The

memory is supported by a group of neurons capable of recreating a certain activity pattern in response to a given activity pattern of their afferences. The instantaneous activity of a large group of neurons is naturally represented by vectors. Each mnemonic trace, scattered among the neurons of the memory, is also superimposed on other traces over these same neurons. Memories are represented by matrices with as many rows as mnemonic neurons. The columns of the matrix contain the synaptic weights of each afference to every neuron of the memory. A matrix coefficient thus accumulates imprints corresponding to various associated traces.

The multiplicative model of context-dependent associations in distributed memories requires that each component of the entry is multiplied by the different components of the context. This implies both the possibility of the combinatorial encounter between the entry and the context components and the existence of certain computing abilities (multiplication) in the neural systems. A straightforward interpretation of the model may assume the existence of a second neural layer previous to the associative one. In this previous layer, each neuron computes an element of the Kronecker product. The multiplicative net as a whole multiplies two vectorial afferences of the memory, one playing as the context of the other (Mizraji, 1989; Mizraji et al., 1994). But as soon as one notices that each neuron belonging to the memory receives the whole Kronecker product, a simpler design appears that accounts for the model. A neural configuration that facilitates the required combinatorics of contacts between the entry and the context components would permit the operation to be performed on the dendritic tree of the same neurons that support the associative memory. This configuration is represented in Fig. 1.
Multiplicative effects in neuronal processing have been increasingly looked for by neuroscientists (see especially Koch and Poggio, 1987; Poggio, 1990; Koch and Poggio, 1992; Tal and Schwartz, 1997). Multiplication as a coincidence detector was also recently explored with a variety of approaches, going from signal analysis (Bialek and Zee, 1990) to integrate-and-fire neuron models (Bugmann, 1992). The support of the multiplicative capacities on the properties of the NMDA receptor has been postulated (Poggio, 1990; Mel, 1992, 1993; Montague and Sejnowski, 1994).

Fig. 1. Mnemonic neurons with multiplicative properties. The same group of neurons that sustains the distributed associations of vectorial patterns performs the Kronecker product between two different systems of afferences.

In the representation of Fig. 1, the multiplicative activity is carried out by subcellular mechanisms acting within the neighborhood of the convergence of a pair of afferences. This surrounding may be identified with the synapses. In this anatomic version of the model, without a multiplicative net, the existence of a special neural type, functionally differentiated to perform the multiplication of inputs, is no longer necessary, as it occurred with the neurons in the case of a previous multiplicative net. As a corollary, by dispensing with the nets of multiplicative neurons, the dimensional growth generated by the multiplication undergoes a scaling reduction, since it is transformed

from a number of neurons into a number of synapses. Therefore, this kind of contextualization capability could be a universal and ubiquitous phenomenon within the nervous system, provided that neurochemical devices able to implement the multiplication of signals exist.

3. Dynamic reliability

As far as we have seen, this model requires a wide and accurate connectivity of afferences converging toward a given neuronal group. However, neurobiological facts suggest that in most of the central association areas incoming activity and firing rates are usually low and, moreover, that neurons of different cortices are sparsely connected (as stated, for example, in Bibbig et al., 1995). In addition, the information in the nervous system is usually accompanied by a certain level


of noise. From the above considerations, the biological plausibility of this kind of model depends on the survival of the context-dependent associative performances when the connectivity is fuzzy, the matrix memory is sparse and the patterns are presented with a certain level of noise. Considering noisy inputs to an intact memory and feeding damaged memories with intact inputs are equivalent operations in associative memories. As a general consideration, we must remember that each row of the matrix memory represents a neuron of the memory, and that each column represents a particular input-context combination: an element of the Kronecker product. Hence, the destruction of an average percentage of elements of each row of a matrix memory (synapses of a neuron) and the contextualization of an entry using only a statistical sample of context signals are mathematically equivalent (Mizraji et al., 1994). We have previously shown (Mizraji et al., 1994) that in these matrix memories a simple context-dependent association remains reliable when each element of the entry is influenced by a fixed random fraction of the elements of the context vector (the destruction of a neuron in the multiplicative net implies that the same element of the Kronecker product will be lacking for each associative neuron). In that work it also became clear that the reliability grows with the dimensionality of the vectors. The anatomic version of the model described in Fig. 1, with the Kronecker product composed on the same group of associative neurons, restates the problem of reliability in a somewhat different way. The expected natural imprecision in the connectivity, corresponding to a local fuzzy design of neural systems, is no longer due to the absence of a fixed component of the Kronecker product for each of the neurons of the memory, but to the lack of a certain average percentage of their connections.
At the matrix level, this implies the deletion of a certain average percentage of different elements in each row. We illustrate this fact with a 4 × 9 matrix memory in which each neuron, represented by a row of the matrix, has exactly the same percentage of damaged (zero) elements (33%); the empty blocks correspond to undamaged coefficients. We tested whether this different way of performing the same percentage of destruction over the net has different consequences on the reliability of a single association.
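The deletion methodology can be sketched as follows (Python/NumPy; dimension 16 and a single stored association are our simplifying assumptions). Each matrix coefficient is zeroed independently with a fixed probability, and the correlation between the ideal and the damaged response is averaged over 100 realizations.

```python
import numpy as np

rng = np.random.default_rng(7)
dim = 16

# Random ±1 vectors, normalized (approximately orthogonal at high dimension)
f = rng.choice([-1.0, 1.0], dim) / np.sqrt(dim)
p = rng.choice([-1.0, 1.0], dim) / np.sqrt(dim)
g = rng.choice([-1.0, 1.0], dim) / np.sqrt(dim)

M = np.outer(g, np.kron(f, p))      # a single association: (f ⊗ p) -> g

def avg_correlation(p_delete, trials=100):
    """Zero each coefficient with probability p_delete and measure the
    correlation between the ideal response g and the actual response."""
    corr = []
    for _ in range(trials):
        mask = rng.random(M.shape) >= p_delete   # True = surviving synapse
        out = (M * mask) @ np.kron(f, p)
        corr.append(out @ g / (np.linalg.norm(out) * np.linalg.norm(g) + 1e-12))
    return float(np.mean(corr))

# Reliability degrades gracefully with the deleted fraction
print(avg_correlation(0.1), avg_correlation(0.5))
```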

Using vectors of low dimensions (4, 8 and 16), numerical experiments similar to those performed in Mizraji et al. (1994) showed equivalent robustness for a single context-dependent association (see Fig. 2).

Fig. 2. Reliability of single context-dependent associations. Reliability of a single context-dependent association for an increasing average percentage of deletions over the afferences of each neuron of the memory. Each point represents 100 random realizations of a given average percentage of destruction (the methodology is described later in the text). For each random realization we calculated, given the same vector stimulus, the correlation between the ideal associated vector and the actual response vector. Thereafter, we calculated the average correlation for that average percentage of destruction.

Fig. 3. A reentrant loop in an associative memory generates lateral connections within the mnemonic neurons. (a) Architecture of an associative memory model with three mnemonic neurons. (b) A model with a reentrant loop. (c) Its equivalent diagram.

We now study the reliability of chains of context-dependent associations. We explore, via numerical experiments, the effect of distributed lesions on the dynamic properties of context-dependent associative memories with a reentrant loop. The recursion in a net of this kind implies a modification in the connectivity that gives rise to lateral connectivity between the neurons of the memory (Fig. 3). In this context, the terms destruction and disconnection become equivalent. Therefore, we will be exploring the reliability of the dynamical cognitive properties in the presence of a reduction of the connectivity.

We use an adaptive searching device previously described (Mizraji et al., 1994) to show the influence of memory destruction on the generation of goal-oriented trajectories. The capacity of these context-dependent memories to structure teleological sequences was illustrated by a model of cardinal-point wandering in a circular planar space. The vectors for each cardinal point (w, e, n, s) were arbitrarily chosen from members of an orthogonal basis of an n-dimensional vector space. Memories were instructed according to the two different databases shown in Table 1. Given any position, corresponding to a cardinal point, and one objective to reach (another cardinal point), these transition tables provide the next cardinal-point position. Only steps through contiguous cardinal points were instructed. As an example, if the objective to reach is the cardinal point north and the initial position is the cardinal point east, a system with a memory instructed according to the first transition table reaches the north in one step; on the other hand, another system, with a memory instructed according to the second transition table, will reach the north after three steps, passing through the south and the west. Versions of these two different memories were constructed using orthonormal Walsh vectors of 4, 8 and 16 dimensions. Both kinds of memories, inserted in a recursive loop where each output is reinjected to the memory after being multiplied by a fixed context, allow the system to perform the set of instructed goal-oriented trajectories. In the presence of an objective position to reach (represented by a fixed context), a given initial position can be actualized by a recurrent entrance and processing of the successive associated steps.
Table 1
Structure of the databases (rows: present position; columns: objective to reach; each entry is the next position)

First database
            West   East   North   South
  West       w      n      n       s
  East       s      e      n       s
  North      w      e      n       e
  South      w      e      w       s

Second database
            West   East   North   South
  West       w      n      n       n
  East       s      e      s       s
  North      e      e      n       e
  South      w      w      w       s
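A minimal working version of this navigation device can be written with 4-dimensional Walsh vectors (Python/NumPy). The transition rules below follow our reading of the first database in Table 1 and should be taken as an illustrative reconstruction, not as the exact memory used in the paper.

```python
import numpy as np

# One orthonormal 4-dim Walsh vector per cardinal point (w, e, n, s)
basis = np.array([[1, 1, 1, 1],
                  [1, -1, 1, -1],
                  [1, 1, -1, -1],
                  [1, -1, -1, 1]]) / 2.0
card = dict(zip("wens", basis))

# First-database transitions: (position, objective) -> next position
table = {
    ("w", "w"): "w", ("w", "e"): "n", ("w", "n"): "n", ("w", "s"): "s",
    ("e", "w"): "s", ("e", "e"): "e", ("e", "n"): "n", ("e", "s"): "s",
    ("n", "w"): "w", ("n", "e"): "e", ("n", "n"): "n", ("n", "s"): "e",
    ("s", "w"): "w", ("s", "e"): "e", ("s", "n"): "w", ("s", "s"): "s",
}

# Memory: next position <- (present position ⊗ objective context)
M = sum(np.outer(card[nxt], np.kron(card[pos], card[obj]))
        for (pos, obj), nxt in table.items())

def navigate(start, goal, max_steps=10):
    """Reinject each output with the fixed goal context until the goal."""
    x, path = card[start], [start]
    for _ in range(max_steps):
        x = M @ np.kron(x, card[goal])
        path.append("wens"[int(np.argmax([abs(x @ card[c]) for c in "wens"]))])
        if path[-1] == goal:
            break
    return path

print(navigate("e", "n"))   # one step, as in the text: ['e', 'n']
```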


Instructed combinations of positions and objectives give rise to a sequence of steps that ends exactly when the goal is reached. Moreover, this device shows interpolation abilities when confronted with non-instructed intermediate positions (a linear combination of the learnt orthogonal basis). This interpolation ability depends on the strategy with which the learnt trajectories were designed (Mizraji et al., 1994). Whereas the memory instructed with the first transition diagram can interpolate exactly (and reach non-instructed objectives), the second one can only reach a point near the intermediate target proposed. Chains of context-dependent associations in these systems with uninjured memories always result in a fixed vector.

To evaluate the persistence of these cognitive-like behaviors in the presence of destruction (or disconnection of the net) we performed numerical experiments. In these experiments (carried out for memories instructed with the two different databases and different dimensionalities) we provoked an increasing average percentage of destruction of the afferences of each neuron of the memory (components of the Kronecker product). We proposed an objective position to reach several steps away from the initial position. We evaluated the capacity of reaching the same end-point for memories with different degrees of incompleteness. For each average percentage of destruction (disconnection), we explored the effect of 100 different random arrays of deletions on each memory matrix. The positions of the deletions were established as follows: we decided whether an element of the matrix is zero (i.e. whether it is deleted) by pre-establishing a fixed probability and using a random number generator.
The fixed probability defines the average percentage of disconnection: for instance, in order to do one of the 100 experiments for an average disconnection percentage of 30% we fixed a probability of 0.3, and using a random number generator we decided for each element of the memory matrix whether it became deleted. Each of the 100 random realizations was confronted with the same initial position and objective to reach (fixed vector context). We evaluated the correlation between the context vector (proposed position to reach) and the actual end-vector (reached position), and we finally computed the average correlation over the 100 realizations.

We now describe the results of our numerical experiments. For some patterns of random disconnection we found that some vectorial series of context-dependent associations do not reach a fixed point in the n-vector space. Instead, they can reach different oscillatory modes with different periods (Pomi, 1995). The emergence of oscillatory behavior in the net as a consequence of disconnection is related to the dimensionality of the space, as we show in Fig. 4a,b. The accuracy of the points reached after the chain of vectorial associations for a growing fraction of disconnection (destruction) is shown in Fig. 4c,d. These reliability curves were calculated in terms of the average correlation between the actual reached position and the position reached by the fully connected net, over 100 random destruction patterns for each point. We assigned zero correlation to those realizations that do not reach a fixed point in the n-vector space. This is justified by the fact that in the majority of the cases the oscillations stay around zero, and that for the rest, their dispersions with respect to zero are approximately counterbalanced.

From the results above, we conclude that cognitive properties dependent on the dynamic relaxation of the system are acceptably maintained in the presence of high percentages of destruction or disconnection. We have shown that high dimensions allow for a large percentage of disconnection, and that the behavior is similar for different kinds of memories with the same number of instructed associations. The two anatomical interpretations of the model, which imply two different kinds of expected imprecisions in their design, do not present critical differences with respect to their reliability (Pomi, 1995).
The reliability of those dynamic cognitive properties that emerge from recursive context-dependent autoassociative memories gives the same qualitative results (Pomi and Mizraji, 1997b). From the study of different individual cases we realized that, although distributed, the memories do not respond uniformly to the deterioration of their different individual matrix components. The


Fig. 4. Reliability of chains of context-dependent associations. (a) The fraction of realizations with oscillatory behavior, for different dimensionalities and increasing disconnection of the memory instructed according to the first database; the same initial position and objective to reach were proposed. (b) Emergence of oscillatory behavior with disconnection, for memories of the same dimensionality (dim = 16) but instructed according to the two different databases. (c) Reliability curves for different dimensionalities when an objective to reach is proposed that implies a chain of context-dependent associations. For each average percentage of destruction of the memory instructed according to the first database, the accuracy of the reached positions is evaluated as the average correlation with respect to the ideal objective vector. (d) Reliability curves for two memories instructed with the two different databases.

damage of certain coefficients is critical for the emergence of oscillatory behavior in these systems. These results highlight the fact that the distribution and superposition of the memory traces generate heterogeneities in the structure of these matrix memories that are strongly dependent on the vector code. In the Appendix it is shown how this fact influences some statistical properties of the coefficients of the memory matrices; the link between the variance of the matrix and a noise-to-signal expression for context-dependent associative memories is also shown.

4. Feature analysis

Autoassociative context-dependent memories embedded in recursive nets can perform different operations related to feature analysis and pattern recognition. As mentioned above, autoassociative memories are those that associate each input f with itself. Context-dependent autoassociative memories (A) take into account the fact that to each autoassociated input f_i there corresponds a group of possible characteristic contexts p_ij. Using orthonormal vectors they are expressed as follows:


A = Σ_{i,j} f_i (f_i ⊗ p_ij)^T

In recursive nets, context-dependent associations return to the memory in order to be reprocessed. A dynamic system is then generated, in which different contexts give rise to different temporal sequences. Given an initial input, and after a chain of associations, different attractors can be reached, determined by different on-line contexts. The behavior of the system will depend on the kind of memory installed (auto- or hetero-associative), on the path through which the output reenters (the structure of the feedback) and also on the integrity of the connections (Scheme 2). We now show how instructing the memories with autoassociations endows the system with the capability of selecting a pure instructed autoassociated feature f_k ∈ {f} from a complex combination of them presented as the initial input. Each initial input will have the structure Σ_i k_i f_i, with k_i a numerical weight. For simplification we will consider here that to each single feature f_i there corresponds only one context p_i. By means of this property the context takes a role in perceptual autoorganization and in the disambiguation of ambiguous figures.

4.1. Disambiguation and perceptual autoorganization


Confronted with a complex initial stimulus, a single context p_k (one of the instructed context vectors, a member of the orthogonal basis) marks its own corresponding feature f_k, which will be extracted in one step.

Fig. 5. Modeling the static properties of Necker cube perception. The stimulus entry and the output are considered to be perceptions. The other entry, the context, is considered an internal cue. Each of the two visions of the cube is represented by the orthogonal vectors a and b, and the Necker cube, the ambiguous figure, is represented by an equally weighted combination of a and b. The memory was instructed with the autoassociations M = a(a ⊗ c_a)^T + b(b ⊗ c_b)^T + a(a ⊗ a)^T + b(b ⊗ b)^T, where c_a and c_b represent two opposite perspectives, e.g. fixing a particular dihedral angle before or behind the plane. When the Necker cube figure (0.5a + 0.5b) is presented to the memory accompanied by one of the two pure visions of the cube (a or b), or by one of the two perspectives (c_a or c_b), acting as context, the perception will be disambiguated in one step.
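The memory of Fig. 5 can be written out explicitly (Python/NumPy; the particular orthonormal Walsh vectors chosen for a, b, c_a and c_b are an illustrative assumption). Either a perspective cue or a pure vision, acting as context, disambiguates the ambiguous figure in one step.

```python
import numpy as np

# Orthonormal 4-dim vectors: two visions a, b and two perspective cues
a  = np.array([1, 1, 1, 1]) / 2.0
b  = np.array([1, -1, 1, -1]) / 2.0
ca = np.array([1, 1, -1, -1]) / 2.0
cb = np.array([1, -1, -1, 1]) / 2.0

# M = a(a ⊗ ca)^T + b(b ⊗ cb)^T + a(a ⊗ a)^T + b(b ⊗ b)^T  (Fig. 5)
M = (np.outer(a, np.kron(a, ca)) + np.outer(b, np.kron(b, cb))
     + np.outer(a, np.kron(a, a)) + np.outer(b, np.kron(b, b)))

necker = 0.5 * a + 0.5 * b          # the ambiguous figure

for ctx in (ca, a):                  # a perspective cue or a pure vision
    out = M @ np.kron(necker, ctx)
    out /= np.linalg.norm(out)
    assert np.allclose(out, a)       # disambiguated toward vision a in one step
```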

A [ (Σi κi fi) ⊗ pk ] → fk

Scheme 2.
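The extraction shown in Scheme 2 can be sketched numerically. In the toy example below the dimension, the number of features and the random orthonormal bases are arbitrary choices for illustration, not taken from the paper:

```python
import numpy as np

dim = 6
rng = np.random.default_rng(0)
# Random orthonormal features f_i and contexts p_i (illustrative choice).
F = np.linalg.qr(rng.standard_normal((dim, dim)))[0][:, :3].T  # f1, f2, f3
P = np.linalg.qr(rng.standard_normal((dim, dim)))[0][:, :3].T  # p1, p2, p3

# Context-dependent memory: A = Σ_i f_i (f_i ⊗ p_i)^T
A = sum(np.outer(f, np.kron(f, p)) for f, p in zip(F, P))

mixture = 0.2 * F[0] + 0.5 * F[1] + 0.3 * F[2]   # complex input Σ κ_i f_i
out = A @ np.kron(mixture, P[0])                 # context p1 marks f1
out /= np.linalg.norm(out)
print(np.allclose(out, F[0]))                    # True: f1 extracted in one step
```

By orthonormality, A[(Σ κi fi) ⊗ p1] = κ1 f1, so after rescaling the pure feature f1 is recovered whatever its weight in the mixture.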

This power of the context as a semantic switch can be highlighted in a common experience: the management of ambiguity. We are frequently confronted with perceptions compatible with more than one semantic decoding. Examples of these ambiguous perceptions are words or phrases with multiple meanings. In this kind of situation the resolution of the ambiguity usually comes from the context. A paradigmatic case of ambiguity in visual perception is Necker's cube: this well-known two-dimensional drawing is consistent with two different three-dimensional perceptions (see Fig. 5). As William James keenly realized, the perception of Necker's cube becomes immediately disambiguated if the observer, at the same time, internally presents himself with an imaginary view via another internal input way. In his (1890) Principles of Psychology, William James mentioned that some planar diagrams can remind us of two different objects depending on the different perspective with which they are observed: "Whichever of these objects we conceive clearly at the moment of looking at the figure, we seem to see in all its solidity before us... We need only attend to one of the angles represented, and imagine it either solid or hollow -pulled towards us out of the plane of the paper, or pushed back behind the same- and the whole figure obeys the cue and is instantaneously transformed beneath our gaze... The peculiarity of all these cases is the ambiguity of the perception to which the fixed retinal impression gives rise." A context-dependent model with fixed context and an autoassociative memory satisfactorily models this perceptual experience (Pomi et al., 1993; Pomi, 1995; Pomi and Mizraji, 1997b; see Fig. 5). This property also recalls the way in which the nervous system recognizes a face within a group of persons. As in the previous situation, if you have in mind the face you are searching for, you can easily find it in a crowd. We now consider another situation with the same recursive net as the former diagram. In the general case, the fixed context accompanying the initial input experienced by the system may also be a complex context, that is, a combination Σi υi pi of context vectors pi. Confronted with a complex context, this system will evolve towards the extraction of the single feature corresponding to the heaviest component of the context:

Using memories with two autoassociated features (a and b), each one with a proper context (ca and cb), these phenomenological properties can be described in terms of analytical recursive formulas. We are considering complex vectors that are linear combinations of two features or two pure contexts; before each iteration we impose a probabilistic rescaling such that the sum of the coefficients of each component in a complex vector (stimulus or context) is unity. Then, we will have a perceptual and also a contextual space belonging to the segment [0, 1], on which the dynamics will occur. With an intermediate initial position in the perceptual space [η0 a + (1 − η0) b], and a fixed context [λ ca + (1 − λ) cb], the position in the perceptual space at a given time (ηt) will be

ηt+1 = ηt [Δ + (1 − Δ) ηt]^(−1), with Δ = (1 − λ) λ^(−1).

Depending on the relative weight of the two components of the context (Δ), the system will stay at η = 0.5 (for λ = 0.5) or reach either η = 1 or η = 0, the three equilibrium points. The graphical representation corresponding to this situation is seen in Fig. 6a.
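The recursion can be checked with a few lines of code; the λ values below are arbitrary illustrations:

```python
# Iterate η(t+1) = η(t) [Δ + (1 − Δ) η(t)]^(−1), with Δ = (1 − λ)/λ.
def evolve(eta0, lam, steps=100):
    delta = (1 - lam) / lam
    eta = eta0
    for _ in range(steps):
        eta = eta / (delta + (1 - delta) * eta)
    return eta

print(evolve(0.5, 0.7))   # λ > 0.5: converges to η = 1
print(evolve(0.5, 0.3))   # λ < 0.5: converges to η = 0
print(evolve(0.5, 0.5))   # λ = 0.5: Δ = 1, the map is the identity; η stays 0.5
```

For λ ≠ 0.5 the intermediate point is unstable and the perception is driven to the pure feature favored by the context, as described for Fig. 6a.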

4.2. Self-contextual autoassociative memories as intersection filters and entry-analyzers
During the instruction of these memories, besides the fact that inputs are autoassociated, each input also acts as its own context. Then, the instructed entries are vectors multiplied by themselves, a kind of second-degree polynomial in the Kronecker product:

A = Σi fi (fi ⊗ fi)^T

A [ (Σi κi fi) ⊗ (Σi υi pi) ] → fq

with υq = max(υi). Then, the structure of the context drives the complex initial input to the perception of the final selected component. Observe that if the pure components of the context vector are equally weighted, this system cannot make a decision, and the complex perception remains unchanged.

Placed in a net with a recurrent loop, these very simple memories can promote an analysis of their inputs. With the initial input acting also as a fixed context (Scheme 3), this recursive net will extract the heaviest component of the initial complex input experienced by the system:


A [ (Σi κi fi) ⊗ (Σi κi fi) ] → fq
with κq = max(κi). The same results can be reached with another structure of the recursion, as will be seen below. Let us now return to the example with only two features. In this case the memory will be A = a(a ⊗ a)^T + b(b ⊗ b)^T. Consider now a recursive loop like the following configuration, where the context is variable (Scheme 4).

Scheme 3.

Here, the same output is reinjected and acts as input and also as context at the next step. Now the context is variable, and at each time step it updates with the perception. If we begin with an initial perception and context [η0 a + (1 − η0) b], and use the same probabilistic rescaling as above, the general recursive expression for the evolution of the perception will be given by

ηt+1 = ηt² [ηt² + (1 − ηt)²]^(−1),
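This self-context recursion can be iterated directly; a brief sketch:

```python
# Iterate η(t+1) = η(t)² [η(t)² + (1 − η(t))²]^(−1): the heavier
# component of the initial perception is progressively amplified.
def self_context(eta0, steps=60):
    eta = eta0
    for _ in range(steps):
        eta = eta**2 / (eta**2 + (1 - eta)**2)
    return eta

print(self_context(0.6))   # initial weight > 0.5 → η → 1
print(self_context(0.4))   # initial weight < 0.5 → η → 0
print(self_context(0.5))   # balanced entry: no decision, η stays 0.5
```

Note that here the convergence is quadratic near the pure attractors, so a handful of iterations already decides the percept.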

whose plot can be seen in Fig. 6b. The initial perception evolves step by step, and the final perception will be the heaviest component of this entry. The capacities of self-contextual autoassociative memories as entry-analyzers are related to their ability to act as intersection filters. This is a neural net that gives, as output, the intersection between its entries (Scheme 5). A conventional second-degree autoassociative memory of the kind we have just seen, A = Σi fi (fi ⊗ fi)^T,

performs the intersection in the first step. For instance, if the entries to the memory are the linear combinations (f1 + f2) and (f2 + f3),

A [(f1 + f2) ⊗ (f2 + f3)] = A [(f1 ⊗ f2) + (f1 ⊗ f3) + (f2 ⊗ f2) + (f2 ⊗ f3)] = f2,

it performs the intersection, and their common component f2 becomes extracted. The following equation shows another example of the intersection of two entries:

A [(f1 + f2 + f3) ⊗ (f1 + f3 + f4 + f5)] = f1 + f3.

Using these second-degree memories as basic components, a classical situation of natural and artificial intelligence can be easily modeled: the successive refinements that occur in the selection of items during the reception of new information (Pomi and Mizraji, 1997a).

Fig. 6. Perceptual autoorganization for two features and two corresponding contexts. The successive positions (ηt) in the perceptual space (a dynamics over the segment [0, 1]) provide the perceptual evolution, once an initial perception and a context are given. (a) Depending on the fixed context λ, the system will reach one of the equilibrium points, according to the recursive formula of Section 4.1. (b) When the successive perceptions act also as their own contexts, the heaviest component of the entry becomes extracted (see text and formula of Section 4.2).

Scheme 4.

Scheme 5.
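The intersection filter can be verified numerically. A minimal sketch with five orthonormal features f1…f5, taken as standard basis vectors purely for illustration:

```python
import numpy as np

f = np.eye(5)   # f1..f5 as rows (orthonormal, illustrative)

# Self-contextual second-degree memory: A = Σ_i f_i (f_i ⊗ f_i)^T
A = sum(np.outer(fi, np.kron(fi, fi)) for fi in f)

x = f[0] + f[1]           # entry f1 + f2
y = f[1] + f[2]           # entry f2 + f3
print(A @ np.kron(x, y))  # → f2, the common component

x2 = f[0] + f[1] + f[2]
y2 = f[0] + f[2] + f[3] + f[4]
print(A @ np.kron(x2, y2))  # → f1 + f3
```

Only the instructed "diagonal" terms fi ⊗ fi survive the matrix product, so exactly the features shared by both entries are retrieved.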

5. Discussion

In this work we especially emphasize the multiplicative performances of a neuron. Plausibly, the simplest yet computationally powerful cortical neuron activity could be described using the following equation:
gh(τ + 1) = Σi M(1)hi fi(τ) + Σk M(2)hk pk(τ) + Σi,k M(3)h(ik) fi(τ) pk(τ)

where M(1)hi and M(2)hk are matrix coefficients corresponding to classical linear pattern associations; M(3)h(ik) represents a matrix of the kind we use in our Kronecker product models, and gh(τ), fi(τ) and pk(τ) represent activities of individual neurons. Almost pure M(3)h(ik) memories could be the natural outcome of learning procedures oriented towards the prevention of responses gh(τ + 1) in the presence of isolated inputs fi(τ) or pk(τ). Remark that slight modifications of the integrate-and-fire neurochemical model used by Nass and Cooper (1975) can lead to the previous equation. Context-dependent distributed associative memories could be supported by neurons that present themselves as classical linear associators

of their afferences to the soma, but capable of incorporating, at a more distal level, local modulation of their afferences and the spatial variability of dendritic processing (Hounsgaard and Midtgaard, 1989). An entry, widely spread through a neuron of the net, could be modulated by different afferent environments at different regions of the dendritic tree. This modulation, expressed by the multiplication, may occur directly at the synaptic level or in a more indirect manner, through the participation of the ionic channels and chemical mediators interacting in the amplification and propagation of the signals at the dendritic level. The multiplicative abilities of these neural devices are currently being investigated (Mel, 1992, 1993). Some researchers are also investigating how this kind of multiplicative ability can be sustained, not by individual neurons but by populations of neurons (Salinas and Abbott, 1996). We have explored some dynamic properties of context-dependent autoassociative memories. Autoassociative recursive memories have been utilized as dynamic models in which pure features act as attractors. Given a stimulus belonging to the feature space, the system relaxes until an attractor is reached. The brain-state-in-a-box (BSB) model and also Hopfield models are classical models of relaxation memories (Anderson et al., 1977; Hopfield, 1982). But these one entry-one associated response models are challenged by the necessity of implementing, in a simple way, visits to the different possible attractors. This is a natural expectation in processes admitting various interpretations for the same given stimulus, such as perceptual or linguistic disambiguation. For the case of the BSB model, Kawamoto and Anderson (1983) have communicated a strategy based on the reinstruction of the memory by a given function, with which they were able to obtain a switch in the response between two different associations.
This requires changing the structure of the memory in order to model short-time biological processes. By contrast, the models of context-dependent associative memories can do it using another strategy, given that their canonical property is the possibility of retrieving different associations from the same stimulus, depending on different on-line contextual environments. Hence, the simple variation of an on-line context, without any other modification of the memory or re-learning, allows a change of the reached attractor beginning from the same initial position. The variety of behaviors allowed by different combinations of the structure of the recursive loop and the kind of memory installed provides a large set of possibilities, as we have shown for context-dependent autoassociative memories. The modeling of higher cognitive performances will probably be grounded on the interaction of modules with specific functions. Hence, context-dependent memories are devices with an important potential role in these modular constructions.
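The three-term activity equation of this Discussion can be illustrated with a toy computation. The dimensions and random matrices below are arbitrary assumptions chosen only to show the shapes involved:

```python
import numpy as np

rng = np.random.default_rng(1)
nf, npc, ng = 4, 3, 2                     # sizes of the f, p and g populations (arbitrary)
M1 = rng.standard_normal((ng, nf))        # classical linear association from f
M2 = rng.standard_normal((ng, npc))       # classical linear association from p
M3 = rng.standard_normal((ng, nf * npc))  # Kronecker-product (multiplicative) term

def step(f, p):
    # g_h(τ+1) = Σ_i M1_hi f_i + Σ_k M2_hk p_k + Σ_{i,k} M3_h(ik) f_i p_k
    return M1 @ f + M2 @ p + M3 @ np.kron(f, p)

f = rng.standard_normal(nf)
p = rng.standard_normal(npc)
g = step(f, p)
print(g.shape)   # (2,)
```

Setting M1 = M2 = 0 leaves an "almost pure" M3 memory, the case emphasized in the text, in which the output depends only on the products fi(τ) pk(τ).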

Acknowledgements

We would like to thank our colleagues Luis Acerenza, Fabián Alvarez and Julio Hernández for their comments. We acknowledge PEDECIBA (Uruguay) for financial support.

Appendix A. Structural heterogeneity of distributed memories

The information stored in a distributed memory is scattered within the matrix coefficients in a complex way. In this appendix, we show how the statistical properties of these coefficients depend on the structure of the stored vectors. Given a rectangular matrix A = [aij] of dimension m × n, we define the mean and the variance of matrix A as follows:

⟨A⟩ ≡ ⟨aij⟩ = (1/mn) Σi,j aij,   var(A) ≡ var(aij) = ⟨aij²⟩ − ⟨aij⟩².

In what follows, we are going to illustrate these statistics using two different low-dimensional memories. In our example the memories are selected among the logical operators of vector logic. This logic emerges from the installation of the classical logical operations into context-dependent distributed memories. In this formalism the classical truth-values true and false map onto two q-dimensional vectors s and n, respectively; the resulting logical gates are rectangular matrices of dimension q × q² (Mizraji, 1992, 1996). We chose for our example the matrices that execute the implication L and the exclusive-or X:

L = s(s ⊗ s)^T + n(s ⊗ n)^T + s(n ⊗ s)^T + s(n ⊗ n)^T,
X = n(s ⊗ s)^T + s(s ⊗ n)^T + s(n ⊗ s)^T + n(n ⊗ n)^T.

We are going to build up these matrices using two different orthonormal bases {s, n}.

Case 1: s = [1 0]^T, n = [0 1]^T.

L = | 1 0 1 1 |      X = | 0 1 1 0 |
    | 0 1 0 0 |          | 1 0 0 1 |

⟨L⟩ = 1/2, var(L) = 1/4;   ⟨X⟩ = 1/2, var(X) = 1/4.

Case 2: s = (1/√2) [1 1]^T, n = (1/√2) [1 −1]^T.

L = (1/√2) | 2 0  0 0 |      X = (1/√2) | 2 0 0  0 |
           | 1 1 −1 1 |                 | 0 0 0 −2 |

⟨L⟩ = 1/(2√2), var(L) = 3/8;   ⟨X⟩ = 0, var(X) = 1/2.

This very simple example shows that simple changes in the vector code are strongly reflected in the matrix structure. This fact suggests that the spectrum of synaptic reinforcement in a real neural situation (potentially representable using a matrix structure) can be strongly influenced by both the memory patterns and the particularities of the vector codes. We now define a q-dimensional one-vector 1q as a column vector filled with 1s in its q positions. The following results can be directly proved:

⟨A⟩ = (1/mn) (1m^T A 1n),
var(A) = (1/mn) tr(A^T A) − [(1/mn)(1m^T A 1n)]²,   with tr A = Σi aii.

Using these results we can immediately obtain the mean value and the variance for a context-dependent associative memory defined over orthonormal vectors:

⟨M⟩ = (1/mnr) Σi (1m^T gi)(1n^T fi)(1r^T pi),
var(M) = (1/mnr) Σi,j ⟨gi, gj⟩⟨fi, fj⟩⟨pi, pj⟩ − ⟨M⟩².

In these equations, the way in which the code used to represent the patterns determines the statistical properties displayed by these matrix memories is implicit. Remark that for large vectors with their components symmetrically distributed around 0, we have ⟨M⟩ ≈ 0. Moreover, if the patterns g, f and p belong to orthogonal sets, the cross scalar products vanish, and

var(M) ≈ (1/mnr) Σi ‖gi‖² ‖fi‖² ‖pi‖².

Finally, if the vectors have been normalized, we have var(M) = K/mnr, where K is the number of stored patterns. Notice that the typical noise/signal relation of this kind of memory is K/nr (see, for instance, Anderson, 1972). Hence, this last expression of the variance can be expressed as var(M) ≈ (noise/signal)/(output dimension).
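These closed-form statistics can be checked numerically; a short sketch that rebuilds the L and X gates of the appendix and verifies both cases:

```python
import numpy as np

def build(gates, s, n):
    # Assemble a gate from its list of associations: out (c1 ⊗ c2)^T
    v = {'s': s, 'n': n}
    return sum(np.outer(v[o], np.kron(v[a], v[b])) for o, a, b in gates)

# L = s(s⊗s)^T + n(s⊗n)^T + s(n⊗s)^T + s(n⊗n)^T
Lg = [('s', 's', 's'), ('n', 's', 'n'), ('s', 'n', 's'), ('s', 'n', 'n')]
# X = n(s⊗s)^T + s(s⊗n)^T + s(n⊗s)^T + n(n⊗n)^T
Xg = [('n', 's', 's'), ('s', 's', 'n'), ('s', 'n', 's'), ('n', 'n', 'n')]

def stats(A):
    # ndarray.mean/var implement exactly ⟨a_ij⟩ and ⟨a_ij²⟩ − ⟨a_ij⟩²
    return A.mean(), A.var()

# Case 1
s1, n1 = np.array([1., 0.]), np.array([0., 1.])
print(stats(build(Lg, s1, n1)))   # (0.5, 0.25)
print(stats(build(Xg, s1, n1)))   # (0.5, 0.25)

# Case 2
s2 = np.array([1., 1.]) / np.sqrt(2)
n2 = np.array([1., -1.]) / np.sqrt(2)
print(stats(build(Lg, s2, n2)))   # (1/(2√2) ≈ 0.3536, 0.375)
print(stats(build(Xg, s2, n2)))   # (≈ 0.0, 0.5)
```

The numerical means and variances reproduce the analytical values of Cases 1 and 2, confirming how strongly the vector code shapes the matrix statistics.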

References

Anderson, J.A., 1972. A simple neural network generating an interactive memory. Math. Biosci. 14, 197–220.
Anderson, J.A., 1995. An Introduction to Neural Networks. MIT Press, Cambridge, MA.
Anderson, J.A., Hinton, G.E., 1989. Models of information processing in the brain. In: Hinton, G.E., Anderson, J.A. (Eds.), Parallel Models of Associative Memory. Erlbaum, New Jersey, pp. 23–62.
Anderson, J.A., Silverstein, J.W., Ritz, S.A., Jones, R.S., 1977. Distinctive features, categorical perception, and probability learning: some applications of a neural model. Psychol. Rev. 84, 413–451.
Bialek, W., Zee, A., 1990. Coding and computation with neural spike trains. J. Stat. Phys. 59, 103–115.
Bibbig, A., Wenneckers, T., Palm, G., 1995. A neural network model of the cortico-hippocampal interplay and the representation of contexts. Behav. Brain Res. 66, 169–175.
Bugmann, G., 1992. Multiplying with neurons: compensation for irregular input spike trains using time-dependent synaptic efficiencies. Biol. Cybern. 68, 87–92.
Hopfield, J.J., 1982. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. 79, 2554–2558.
Hounsgaard, J., Midtgaard, J., 1989. Dendrite processing in more ways than one. TINS 12, 313–315.
Humphreys, M.S., Bain, J.D., Pike, R., 1989. Different ways to cue a coherent memory system: a theory for episodic, semantic, and procedural tasks. Psychol. Rev. 96, 208–233.
Jain, A.K., 1989. Fundamentals of Digital Image Processing. Prentice-Hall, New Jersey.
James, W., 1890. Principles of Psychology. The Great Books of the Western World 53. The University of Chicago, p. 618.
Kawamoto, A.H., Anderson, J.A., 1983. A neural network model of multistable perception. Acta Psychol. 59, 35–65.
Koch, C., Poggio, T., 1987. Biophysics of computation: neurons, synapses and membranes. In: Edelman, G.M., Gall, W.E., Cowan, W.M. (Eds.), Synaptic Function. Wiley, New York, pp. 637–697.
Koch, C., Poggio, T., 1992. Multiplying with synapses and neurons. In: McKenna, T., Davis, J., Zornetzer, S.F. (Eds.), Single Neuron Computation. Academic Press, pp. 315–345.
Kohonen, T., 1977. Associative Memory. A System-Theoretical Approach. Springer, New York, p. 98.
Mel, B.W., 1992. NMDA-based pattern discrimination in a modeled cortical neuron. Neural Comp. 4, 502–517.
Mel, B.W., 1993. Synaptic integration in an excitable dendritic tree. J. Neurophysiol. 70, 1086–1101.
Mizraji, E., 1989. Context-dependent associations in linear distributed memories. Bull. Math. Biol. 51, 195–205.
Mizraji, E., 1992. Vector logics: the matrix-vector representation of logical calculus. Fuzzy Sets Syst. 50, 179–185.
Mizraji, E., 1996. The operators of vector logic. Math. Log. Q. 42, 27–40.
Mizraji, E., Pomi, A., Alvarez, F., 1994. Multiplicative contexts in associative memories. BioSystems 32, 145–161.
Montague, P.R., Sejnowski, T.J., 1994. The predictive brain: temporal coincidence and temporal order in synaptic learning mechanisms. Learn. Mem. 1, 1–33.
Nass, M.M., Cooper, L.N., 1975. A theory for the development of feature detecting cells in visual cortex. Biol. Cybern. 19, 1–18.
Poggio, T., 1990. A theory of how the brain might work. In: The Brain, Cold Spring Harbor Symposia on Quantitative Biology, vol. LV. Cold Spring Harbor Laboratory Press, New York, pp. 390–431.
Pomi, A., 1995. Estudio de algunas propiedades de un modelo de memoria asociativa sensible a contextos. M.Sc. Thesis, PEDECIBA, Montevideo.
Pomi, A., Alvarez, F., Mizraji, E., 1993. Actividades cognitivas en un modelo de red neural reverberante sensible a contextos. XXII Reunión Científica de la Sociedad Argentina de Biofísica, Maciel, Santa Fe.
Pomi, A., Mizraji, E., 1997a. Context-dependent associative memories as integrative nodes in modular neural networks. III Congreso Iberoamericano de Biofísica, Buenos Aires.
Pomi, A., Mizraji, E., 1997b. Disambiguation with context-dependent associative memories. In: Mizraji, E., Acerenza, L., Alvarez, F., Pomi, A. (Eds.), Biological Complexity: A Symposium. D.I.R.A.C., Facultad de Ciencias, Montevideo.
Salinas, E., Abbott, L.F., 1996. A model of multiplicative neural responses in parietal cortex. Proc. Natl. Acad. Sci. 93, 11956–11961.
Schacter, D.L., 1996. Searching for Memory. The Brain, the Mind, and the Past. BasicBooks, New York.
Tal, D., Schwartz, E.L., 1997. Computing with the leaky integrate-and-fire neuron: logarithmic computation and multiplication. Neural Comput. 9, 305–318.
