J Comput Neurosci
DOI 10.1007/s10827-009-0154-6
Principal component analysis of ensemble recordings reveals cell assemblies at high temporal resolution
Adrien Peyrache · Karim Benchenane ·
Mehdi Khamassi · Sidney I. Wiener ·
Francesco P. Battaglia
Received: 22 December 2008 / Revised: 10 March 2009 / Accepted: 8 April 2009

© The Author(s) 2009. This article is published with open access at Springerlink.com
Abstract Simultaneous recordings of many single neurons reveal unique insights into network processing spanning the timescale from single spikes to global oscillations. Neurons dynamically self-organize in subgroups of coactivated elements referred to as cell assemblies. Furthermore, these cell assemblies are reactivated, or replayed, preferentially during subsequent rest or sleep episodes, a proposed mechanism for memory trace consolidation. Here we employ Principal Component Analysis to isolate such patterns of neural activity. In addition, a measure is developed to quantify the similarity of instantaneous activity with a template pattern, and we derive theoretical distributions for the null hypothesis of no correlation between spike trains, allowing one to evaluate the statistical significance of instantaneous coactivations. Hence, when applied in an epoch different from the one where the patterns were identified (e.g. subsequent sleep), this measure allows one to identify times and intensities of reactivation. The distribution of this measure provides information on the dynamics of reactivation events: in sleep these occur as transients rather than as a continuous process.

Keywords PCA · Reactivation · Sleep · Memory consolidation

Action Editor: P. Dayan

A. Peyrache (✉) · K. Benchenane · M. Khamassi · S. I. Wiener · F. P. Battaglia
Laboratoire de Physiologie de la Perception et de l’Action, Collège de France, CNRS, 11, place Marcelin Berthelot, 75231, Paris Cedex 05, France
e-mail: adrien.peyrache@college-de-france.fr

S. I. Wiener
e-mail: sidney.wiener@college-de-france.fr

M. Khamassi
Institut des Systèmes Intelligents et de Robotique, Université Pierre et Marie Curie – Paris 6, CNRS FRE 2507, 4 Place Jussieu, 75252 Paris Cedex, France

F. P. Battaglia
Center for Neuroscience, Swammerdam Institute for Life Sciences, Faculty of Science, Universiteit van Amsterdam, P.O. Box 94084, Kruislaan 320, 1090GB Amsterdam, The Netherlands
e-mail: F.P.Battaglia@uva.nl

1 Introduction

Ensemble recordings, or the simultaneous recordings of groups of tens to hundreds of cells from one or multiple brain areas in behaving animals, offer a valuable window into the network mechanisms and information processing in the brain which ultimately lead to behavior. In the last two decades, the dramatic increase in yield of such techniques with the use of tetrodes, silicon probes and other devices (McNaughton et al. 1983; Buzsáki 2004) has posed extremely challenging problems to the data analyst trying to represent and interpret such high-dimensional data and uncover the organization of network activity.

Starting with Donald Hebb’s seminal work (Hebb 1949), theorists have posited cell assemblies, or subgroups of coactivated cells, as the main unit of information representation. In this theory, information is represented by patterns of cell activity, which create a coherent, powerful input to downstream areas. Cell assemblies would result from modifications of local synapses, e.g. according to Hebb’s rule (Hebb
1949). Their expression and dynamics are likely driven to a large extent by specific interactions between principal cells and interneurons (Geisler et al. 2007; Wilson and Laurent 2005; Benchenane et al. 2008).

From an experimental point of view, cell assemblies can be characterized in terms of the coordinated firing of several neurons in a given temporal window, either simultaneously (Harris et al. 2003), or in ordered sequences of action potentials from different cells, as has been shown in both hippocampus (Lee and Wilson 2002) and neocortex (Ikegaya et al. 2004). Ensemble recording provides the opportunity to measure these co-activations in the brain of behaving animals.

To date, only a few methods for rigorous, statistically based quantification of cell assemblies have been proposed (e.g. Pipa et al. 2008). This problem is all the more delicate when the temporal ordering of cells’ discharges is taken into consideration (Mokeichev et al. 2007), requiring immense data sets in order to attain the necessary statistical power (Ji and Wilson 2006). On the other hand, cell assemblies’ zero-lag co-activations already provide a rich picture of network function (Nicolelis et al. 1995; Riehle et al. 1997), and may represent an easier target for statistical pattern recognition methods. Moreover, the effective connectivity between cells is a dynamical, rapidly changing parameter. Because of this, it is important to follow cell assemblies at a rapid time scale. This would improve our understanding of the temporal evolution of the interaction between cells, and of their link to brain rhythms, to the activity in other brain areas and to ongoing behavior.

Principal Component Analysis (PCA) has previously been used to find groups of neurons that tend to fire together in a given time window (Nicolelis et al. 1995; Chapin and Nicolelis 1999). PCA (see e.g. Bishop 1995) can be applied to the correlation matrix of binned multi-unit spike trains, and provides a reduced dimensionality representation of ensemble activity in terms of PC scores, i.e., the projections over the eigenvectors of the correlation matrix associated with the largest eigenvalues. This representation accounts for the largest fraction of the variance of the original data for a fixed dimensionality. Also, PCA is intimately related to Hebb’s plasticity rule: it has been shown to be an emergent property of Hebbian learning in artificial neural networks (Oja 1982; Bourlard and Kamp 1988).

A remarkable success of ensemble recordings was the demonstration that, during sleep, neural patterns of activity appearing in the immediately previous awake experience are replayed (Wilson and McNaughton 1994; Nádasdy et al. 1999; Lee and Wilson 2002; Ji and Wilson 2006). This is proposed to be important for memory consolidation, i.e. turning transient, labile synaptic modifications induced during experience into stable long-term memory traces. Replay appears to take place chiefly during Slow Wave Sleep (SWS). In the hippocampus, a brain structure strongly implicated in facilitating long-term memory (Scoville and Milner 1957; Marr 1971; Squire and Zola-Morgan 1991; Nadel and Moscovitch 1997), cell assemblies observed during wakefulness are replayed in subsequent SWS episodes (Wilson and McNaughton 1994) in the form of cell firing sequences (Nádasdy et al. 1999; Lee and Wilson 2002). This occurs during coordinated bursts of activity known as sharp waves (Kudrimoti et al. 1999).

To detect replay, we first need to characterize the activity during active experience, and to generate representative templates from it. Then, templates are compared with the activity during sleep to assess their repetitions. Previous methods have only provided a measure of the overall amount of replay occurring during a whole sleep episode (Wilson and McNaughton 1994; Kudrimoti et al. 1999), by using the session-wide correlation matrix as a template. Alternatively, templates have been generated from the neural activity during a fixed, repetitive behavioral sequence. This is possible, for example, for hippocampal place cells, which fire as the animal follows a trajectory through the firing fields of the respective neurons (Lee and Wilson 2002), if the animal runs through the same trajectory multiple times (Louie and Wilson 2001; Ji and Wilson 2006; Euston et al. 2007), or when a new and transient experience occurs (Ribeiro et al. 2004). One can then search for the repetition of this template during the sleep phase. However, such a template construction technique is not applicable when the behavior is not repeated, or if the behavioral correlates of the recorded cells are not known.

Recently, we used PCA to identify patterns in prefrontal cortical neuron ensembles (Peyrache et al. 2009), without making reference to behavioral sequences, and we devised a novel, simple measure using the PCA-extracted patterns to assess the instantaneous similarity of the activity during sleep. During sleep, this similarity increased in strong transients, demonstrating that neuron ensembles appearing in the AWAKE phase reactivate during SWS. Further, the fine temporal resolution of this approach uncovered for the first time a link between assembly replay in the prefrontal cortex and hippocampal sharp waves, as well as the relationship between this replay and slow cortical oscillations. It was also possible to determine the precise
behavioral events corresponding to the origin of replayed patterns. Moreover, we were able to determine that the formation of these cell assemblies involves specific interactions between interneurons and pyramidal cells (Benchenane et al. 2008).

The present paper presents this methodology in detail with mathematical and statistical support, and provides new results on how the statistical significance of the replay can be assessed. We show how random matrix theory can be used to provide analytical bounds for quantities of interest in the analysis using a multivariate normal distribution as a null hypothesis, and we show how deviations from normality can be dealt with in a consistent manner.

2 Methods

2.1 Experimental setting

Four rats were implanted with 6 tetrodes (McNaughton et al. 1983) in the prelimbic and infralimbic areas in the medial PreFrontal Cortex (mPFC). A total of 1692 cells were recorded in the mPFC from the four rats, during 63 recording sessions (Rat 15: 16; Rat 18: 11; Rat 19: 12; Rat 20: 24). Rats performed a rule-shift task on a Y maze, where, in order to earn a food reward, they had to select one of the two target arms, on the basis of four rules presented successively. The rules concerned either the arm location, or whether the arm was illuminated (changing randomly at each trial). As soon as the rat learned a rule, the rule was changed and had to be inferred by trial and error. Recordings were also made in sleep periods prior to (PRE) and after (POST) training sessions. For an extensive description of the behavioral and experimental methods, see Peyrache et al. (2009).

2.2 Analysis framework

The inspiration for developing this method was to quantitatively and precisely compare the correlation matrix of the binned multi-unit spike trains recorded during active behavior with the instantaneous (co)activations of the same neurons recorded at each moment during the ensuing sleep. Throughout the article, the bin width is fixed at 100 ms. The “awake” correlation matrix can be seen as the superimposition of several modes of patterns of activity. The PCA procedure makes it possible to separate such patterns. The precise mathematical definition of the algorithm is given in the following sections, but schematically, our procedure is divided into five steps:

1. Spike trains from multiple, simultaneously recorded cells from the awake epoch are binned and z-transformed.
2. The correlation matrix of the binned spike trains is computed, and diagonalized.
3. The eigenvectors associated with the largest eigenvalues are retained; a threshold value can be computed from the upper bound for eigenvalues of correlation matrices of independent, normally distributed spike trains.
4. Spike trains from the sleep epoch are binned and z-transformed.
5. A measure of the instantaneous similarity (termed reactivation strength) of the sleep multi-unit activity at each time (the population vector) with each retained eigenvector is computed.

Reactivation strength is a time series describing how much the sleep ensemble activity resembles the awake activity at any given time. To make the claim that replay of experience-related patterns is taking place during sleep, we need to test the computed reactivation strength against an appropriate null hypothesis. The simplest null hypothesis is that the sleep activity is completely independent from the awake data, and, for example, is drawn from a multivariate normal distribution. Clearly enough, disproving this hypothesis does not provide sufficient evidence for replay: certain structural activity correlations may have existed prior to the experience, perhaps because of already present synaptic connections. Moreover, activity distributions may be non-normal, if for no other reason, because binned spike trains for small enough bin sizes will tend to be very sparse and thus much of the mass of the distribution will concentrate around zero, causing the distribution to be strongly asymmetrical. Nevertheless, from the conceptual point of view, this null hypothesis is interesting as it allows one to better characterize deviations from normality in the activity distribution and their consequences.

To control for structural correlations, the sleep data must be compared with another sleep session recorded prior to the experience: if reactivation strengths in sleep after experience (the POST epoch) tend to be larger than the values for the same neuron ensemble measured for the sleep before (the PRE epoch), one may conclude that the experience epoch contributed to increasing sleep activity correlations, and that replay has taken place. If no difference between PRE and POST
reactivation strength is measured, one may conclude that experience had no effect on correlations in spontaneous activity during sleep.

An important feature of this technique is that it permits instantaneous assessments of replay. Formally, reactivation strength measures the similarity between the correlation matrices for the awake and sleep data (the approach followed in works such as Kudrimoti et al. 1999), decomposed into the contributions coming from each eigenvector and each time during sleep. As discussed below, this considerably increases analysis power, as replay time series can be correlated with other physiological time series of relevance.

Furthermore, it is also useful to apply the analysis in the opposite sense: extracting templates from replay events in the sleep epoch, and matching them to the awake data. In this way, one can identify those behavioral events with activity most similar to sleep activity, and therefore, which behavioral events may contribute the most to replay. For this reason, we will describe the procedure in terms of generic TEMPLATE and MATCH epochs.

2.3 Isolation of neural patterns

Let us consider $N$ neurons recorded simultaneously over the time interval $[0, T]$. The neurons’ activity can be represented by a series of spike times $\{t^i_j\}$, where $t^i_j$ is the time of the $j$-th spike of the $i$-th neuron ($i = 1 \ldots N$).

This activity is then binned, yielding a time series of spike counts $s_i(t_k)$, where $i = 1 \ldots N$, $k = 1 \ldots B$, $B$ is the total number of bins and $t_k$ represents the center time of the $k$-th bin:

$$s_i(t_k) = \mathrm{card}\left\{ t^i_j : t_k + b/2 > t^i_j > t_k - b/2 \right\} \tag{1}$$

Here, $b$ is the bin width ($b = T/B$). Hereafter, for the sake of clarity, the indices of the discrete times $t_k$ will be omitted. Then, these binned activities are z-transformed, obtaining the $Q$ matrix:

$$Q_{it} = \frac{s_i(t) - \langle s_i \rangle}{\sigma_{s_i}} \tag{2}$$

where $\langle s_i \rangle = \frac{1}{B} \sum_{t=1}^{B} s_i(t)$ and $\sigma_{s_i} = \sqrt{\frac{1}{B-1} \sum_{t=1}^{B} \left( s_i(t) - \langle s_i \rangle \right)^2}$.

The pairwise cell activity correlation matrix is then obtained as

$$C = \frac{1}{B} Q Q^T \tag{3}$$

The elements of the correlation matrix, $C_{ij}$, are the Pearson correlation coefficients between the spike count series for neurons $i$ and $j$. To disambiguate the contribution of each pattern in the resulting correlation coefficients, we perform a PCA on the $Q$ matrix, that is, an eigenvector decomposition of the correlation matrix. This yields a set of $N$ eigenvectors $p^{(l)}$, $l = 1 \ldots N$, each associated with an eigenvalue $\lambda_l$. The patterns will be associated with the projectors of the correlation matrix, denoted $P^{(l)}$, which are the outer products of the eigenvectors with themselves, providing the following representation of the correlation matrix:

$$C = \sum_l \lambda_l \, p^{(l)} (p^{(l)})^T = \sum_l \lambda_l P^{(l)} \tag{4}$$

This form highlights the fact that the ensemble correlation matrix can be seen as the superimposition of several co-activation patterns, whose importance is measured by the eigenvalue $\lambda_l$. PCA allows one to distinguish these patterns, which can, in turn, be compared with the instantaneous cell activity during different epochs. In order to do that, though, we also need to establish which patterns are likely to reflect underlying information encoding processes and which are the result of noise fluctuations. This problem is addressed below.

2.4 Time course of template matching

Let us consider two epochs, TEMPLATE and MATCH. The general idea is to compare the instantaneous co-activations of neurons during the MATCH epoch with the patterns identified during TEMPLATE, following the method proposed above.

To begin, we could just compare the epoch-wide correlation matrices for the two epochs. One such measure of similarity, computed from the two epochs, would be:

$$M^{MA-TE} = \sum_{i,j:\, i<j} C^{TEMPLATE}_{ij} C^{MATCH}_{ij} \tag{5}$$

$$= \frac{1}{2} \mathrm{Tr}\left[ \left( C^{MATCH} - I \right)^T \left( C^{TEMPLATE} - I \right) \right] \tag{6}$$

where the superscript $MA-TE$ stands for MATCH-TEMPLATE. This measure is strongly positive in the case of high similarity and is strongly negative in the case where correlations change sign (from positive to negative and vice versa) from the TEMPLATE to the MATCH epoch.

In substance, this is the approach used in studies such as Wilson and McNaughton (1994) and Kudrimoti et al. (1999), which gave an overall assessment of the amount of replay in the whole MATCH epoch (in their case, the sleep epoch). Further mathematical manipulation yields a prescription to measure the exact time course of the replay: $M^{MA-TE}$ can be expressed as a sum over
a time series defined for each time bin during the POST (or PRE) epoch (by using Eq. (3)):

$$M^{MA-TE} = \frac{1}{2} \sum_{i,j:\, i \neq j} C^{MATCH}_{ij} C^{TEMPLATE}_{ij} \tag{7}$$

$$= \frac{1}{2 B^{MATCH}} \sum_{t=1}^{B^{MATCH}} \sum_{i,j:\, i \neq j} Q^{MATCH}_{it} C^{TEMPLATE}_{ij} Q^{MATCH}_{jt} \tag{8}$$

$$= \frac{1}{2 B^{MATCH}} \sum_{t=1}^{B^{MATCH}} R_0^{MA-TE}(t) \tag{9}$$

where $B^{MATCH}$ is the number of bins in the MATCH epoch, and

$$R_0^{MA-TE}(t) = \sum_{i,j:\, i \neq j} Q^{MATCH}_{it} C^{TEMPLATE}_{ij} Q^{MATCH}_{jt}.$$

Thus, $C^{TEMPLATE}$ can be seen as a quadratic form, applied to the vector of multi-cell spike counts $Q^{MATCH}_{t}$ at each time $t$ during the rest epochs, to produce the time series $R_0^{MA-TE}(t)$. $R_0^{MA-TE}(t)$ represents a decomposition of the epoch-wide correlation similarity into its instantaneous contributions, i.e. the similarity between the current population vector at time $t$ and the general pattern of co-activation during the TEMPLATE epoch. Therefore, it contains information on exactly when, during the MATCH epoch, patterns of co-activation similar to those that occurred in TEMPLATE appear. However, this measure can still combine factors from several different patterns which may co-activate independently. The obtained time course may therefore be the result of averaging over these distinct patterns, which may in fact behave quite differently from one another. The neural patterns are extracted from $C^{TEMPLATE}$:

$$C^{TEMPLATE} = \sum_l \lambda_l P^{(l)} \tag{10}$$

From Eqs. (8) and (9), $R_0^{MA-TE}(t)$ can now be expressed as:

$$R_0^{MA-TE}(t) = \sum_l \lambda_l \sum_{i,j:\, i \neq j} Q^{MATCH}_{it} P^{(l)}_{ij} Q^{MATCH}_{jt} \tag{11}$$

$$= \sum_l \lambda_l R_l^{MA-TE}(t) \tag{12}$$

where

$$R_l^{MA-TE}(t) = \sum_{i,j:\, i \neq j} Q^{MATCH}_{it} P^{(l)}_{ij} Q^{MATCH}_{jt} \tag{13}$$

The time series $R_l^{MA-TE}(t)$ measures the instantaneous match between the $l$-th co-activation template and the ongoing activity. The exclusion of the diagonal terms in Eq. (13) reduces the sensitivity of the reactivation strength measure to fluctuations in the instantaneous firing rates. The mean reactivation measure, $M^{MA-TE}$, is therefore a weighted sum of the time-averaged values of pattern similarity:

$$M^{MA-TE} = \frac{1}{2} \sum_{l=1}^{N} \lambda_l \left\langle R_l^{MA-TE} \right\rangle_t \tag{14}$$

where $\langle \cdot \rangle_t$ denotes the average over time.

3 Results

3.1 Significance of principal components

To determine the significance of the patterns extracted by PCA, we need to consider, for comparison, the null hypothesis in which the spike trains are independent random variables. Following the seminal work of Wigner (1955) on the spectra of random matrices, the distribution of singular values (square roots of the eigenvalues of the correlation matrix) of random $N$-dimensional data sets has been shown to follow the so-called Marčenko-Pastur distribution (Marčenko and Pastur 1967; Sengupta and Mitra 1999). In the limit $B \to \infty$ and $N \to \infty$, with $q = B/N \geq 1$ fixed,

$$\rho(\lambda) = \frac{q}{2 \pi \sigma^2} \frac{\sqrt{(\lambda_{max} - \lambda)(\lambda - \lambda_{min})}}{\lambda} \tag{15}$$

where $\lambda_{min} < \lambda < \lambda_{max}$ and

$$\lambda^{max}_{min} = \sigma^2 \left( 1 \pm \sqrt{1/q} \right)^2$$

$\sigma^2$ is the variance of the elements of the random matrix, which in our case is 1, because the $Q$ matrix is z-transformed. Equation (15) shows that the distribution vanishes for $\lambda$ greater than an upper limit $\lambda_{max}$. Under the null hypothesis of a random matrix $Q$, the correlations between spike trains are determined only by random fluctuations, and the eigenvalues of $C$ must lie between $\lambda_{min}$ and $\lambda_{max}$. Eigenvalues greater than $\lambda_{max}$ are therefore a sign of non-random effects in the matrix, and for this reason we call principal components associated with those eigenvalues signal components, while those associated with eigenvalues between $\lambda_{min}$ and $\lambda_{max}$ are defined as non-signal components. However, the finite size of data sets implies that eigenvalue distribution borders are not as sharp as the theoretical bounds described by Eq. (15). The highest eigenvalue of any
correlation matrix is drawn from the Tracy-Widom distribution (Tracy and Widom 1994) in the case of normal, or close to normal, variables. Thus, the highest eigenvalue lies around $\lambda_{max}$ with a standard deviation of order $N^{-2/3}$ (which assumes a value of $\sim 0.1$ for $N = 30$, typical for our recordings). We also use the value $\lambda_{max}$ to normalize eigenvalues uniformly across different sessions, defining the normalized encoding strength as

$$\Phi = \frac{\lambda}{\lambda_{max}}. \tag{16}$$

Figure 1(a) shows the distribution of the eigenvalues of three different ensembles. The Marčenko-Pastur upper bound is marked as a black dotted line and the expected distribution in the case of random fluctuations is depicted in the right plots of the figure. The upper bound (i.e. $\lambda_{tw} \sim \lambda_{max} + N^{-2/3}$) of the expected fluctuation for the highest eigenvalue given by Tracy-Widom is shown as a red line. It can be observed that the first eigenvalues are largely above the expected upper bound (at least 4 or 5 times the width of the Tracy-Widom distribution above $\lambda_{max}$), hence allowing rejection of the null hypothesis of independent spike count series. To check whether any other irregularities (i.e. normality violation) in the distribution of binned spike trains could affect the eigenvalues of the correlation matrix (see for example Biroli et al. 2007), each row of $Q$ was randomly permuted. The resulting shuffled
Fig. 1 Evidence for signal components in the data sets. (a) Distribution of the eigenvalues for three example sessions (left) to be compared with the Marčenko-Pastur theoretical distribution (right). The upper bound of this theoretical distribution is indicated in the left panel with black dotted lines. The red dotted line indicates the upper bound derived from the Tracy-Widom distribution for the highest eigenvalues in the case of finite data sets (see text). (b) Histograms of the spectra of all the eigenvalues for each of the same 3 data sets as for the panels in (a) after shuffling. The empirical distribution was in good agreement with the Marčenko-Pastur distribution, even without taking the Tracy-Widom correction into account. In particular, all computed eigenvalues remained within the theoretical bounds
[Figure axes: (a) Eigenvector # vs. Eigenvalues (λ), with Probability marginal; (b) Eigenvalues (λ) vs. Proportion (%)]
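The comparison summarized in Fig. 1 can be illustrated with a short numerical sketch. It is not the authors' code: the toy spike counts (independent Poisson cells plus a shared drive on the first five cells), the function names and the random seed are all assumptions made for illustration. The sketch computes the Marčenko-Pastur bounds for $\sigma^2 = 1$, the finite-size bound $\lambda_{tw} \sim \lambda_{max} + N^{-2/3}$, and the spectrum after permuting each row of $Q$ independently.

```python
import numpy as np

def zscore_rows(counts):
    """z-transform each neuron's binned counts (rows: neurons, columns: bins)."""
    mean = counts.mean(axis=1, keepdims=True)
    std = counts.std(axis=1, ddof=1, keepdims=True)
    return (counts - mean) / std

def marchenko_pastur_bounds(n_neurons, n_bins, sigma2=1.0):
    """Bounds sigma^2 * (1 -/+ sqrt(1/q))^2 with q = B / N."""
    q = n_bins / n_neurons
    lam_min = sigma2 * (1 - np.sqrt(1 / q)) ** 2
    lam_max = sigma2 * (1 + np.sqrt(1 / q)) ** 2
    return lam_min, lam_max

def shuffled_spectrum(Q, rng):
    """Eigenvalues after permuting each row independently in time."""
    shuffled = np.stack([rng.permutation(row) for row in Q])
    return np.linalg.eigvalsh(shuffled @ shuffled.T / Q.shape[1])

# Hypothetical toy data: 30 cells, 6000 bins, first 5 cells share a common drive.
rng = np.random.default_rng(0)
N, B = 30, 6000
counts = rng.poisson(0.5, size=(N, B)).astype(float)
counts[:5] += rng.poisson(0.5, size=B)

Q = zscore_rows(counts)
eigvals = np.linalg.eigvalsh(Q @ Q.T / B)          # ascending order
lam_min, lam_max = marchenko_pastur_bounds(N, B)
lam_tw = lam_max + N ** (-2 / 3)                   # finite-size upper bound
shuffled = shuffled_spectrum(Q, rng)

print("lambda_max:", lam_max, "lambda_tw:", lam_tw)
print("largest eigenvalue:", eigvals[-1], "after shuffling:", shuffled[-1])
```

In this toy setting the leading eigenvalue of the unshuffled data lies well above $\lambda_{tw}$, whereas the shuffled spectrum stays below it; real recordings would of course require the actual binned spike trains.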
Fig. 2 Example of reactivation strength time course (bins of 100 ms) for one principal component extracted from awake activity during two sleep sessions. Shaded areas denote periods of identified Slow Wave Sleep (SWS). POST SWS is dominated by brief, sharp increases in the reactivation strength, indicating strong similarity between instantaneous coactivations and the correlation pattern of the awake principal component
[Figure panels: Sleep PRE, Sleep POST; axes: Time (s) vs. Reactivation Strength]
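A time course of the kind shown in Fig. 2 follows from Eq. (13), since $\sum_{i \neq j} Q_{it} P^{(l)}_{ij} Q_{jt} = \left( \sum_i p^{(l)}_i Q_{it} \right)^2 - \sum_i (p^{(l)}_i)^2 Q_{it}^2$. The following is a minimal sketch under stated assumptions, not the authors' implementation: the toy Poisson data, the helper names (`toy_counts`, `signal_components`, `reactivation_strength`) and the use of $\lambda_{max}$ alone as the retention threshold are illustrative choices.

```python
import numpy as np

def zscore_rows(counts):
    """z-transform each neuron's binned counts (rows: neurons, columns: bins), Eq. (2)."""
    mean = counts.mean(axis=1, keepdims=True)
    std = counts.std(axis=1, ddof=1, keepdims=True)
    return (counts - mean) / std

def signal_components(Q_template):
    """Diagonalize the TEMPLATE correlation matrix; keep eigenvalues above lambda_max."""
    n_neurons, n_bins = Q_template.shape
    C = Q_template @ Q_template.T / n_bins
    eigvals, eigvecs = np.linalg.eigh(C)               # ascending order
    lam_max = (1 + np.sqrt(n_neurons / n_bins)) ** 2   # Marcenko-Pastur bound, sigma^2 = 1
    keep = eigvals > lam_max
    return eigvals[keep][::-1], eigvecs[:, keep][:, ::-1]

def reactivation_strength(Q_match, p):
    """R_l(t) of Eq. (13) for every bin: squared projection minus diagonal terms."""
    projection = p @ Q_match
    diagonal = (p ** 2) @ (Q_match ** 2)
    return projection ** 2 - diagonal

def toy_counts(rng, n_neurons, n_bins, n_coupled, drive_rate):
    """Hypothetical data: independent Poisson cells plus a shared drive on the first cells."""
    counts = rng.poisson(0.5, size=(n_neurons, n_bins)).astype(float)
    counts[:n_coupled] += rng.poisson(drive_rate, size=n_bins)
    return counts

rng = np.random.default_rng(0)
Q_awake = zscore_rows(toy_counts(rng, 30, 6000, 5, 0.5))   # TEMPLATE epoch
Q_pre = zscore_rows(toy_counts(rng, 30, 6000, 5, 0.0))     # MATCH epoch without the assembly
Q_post = zscore_rows(toy_counts(rng, 30, 6000, 5, 0.5))    # MATCH epoch with the assembly

eigvals, patterns = signal_components(Q_awake)
R_pre = reactivation_strength(Q_pre, patterns[:, 0])
R_post = reactivation_strength(Q_post, patterns[:, 0])
print(eigvals[0], R_pre.mean(), R_post.mean())
```

Here `R_pre` and `R_post` are time series with one value per bin, so they can be plotted against time or correlated with other signals; in the toy data only the "POST" series has a clearly positive mean.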
matrix is composed of rows whose individual distributions are preserved but whose temporal interactions are lost. The spectra of their correlation matrix are shown in Fig. 1(b). All eigenvalues remain within the bounds of the Marčenko-Pastur distribution (red curve, equivalent to the ones presented in Fig. 1(a), right). Thus, we argue that the observed signal eigenvalues are an effect of the correlation between spike trains, and not simply an effect of the non-normality of each cell’s binned spike count.

3.2 Average reactivation

For the sake of simplicity, let us first compute reactivation strengths using the TEMPLATE epoch as the MATCH epoch as well. In this case, at a time $t$,
Fig. 3 Eigenvectors from AWAKE better match activity in sleep POST than in PRE. (a) Eigenvalues from the AWAKE correlation matrix (x-axis) plotted against the average reactivation strength represented by the very same vectors during sleep PRE (left) and POST (right) for signal components only ($\Phi > 1$). Each dot represents one of the 323 signal components identified from 63 datasets (four rats). Correlation values (r) and slopes (s) are indicated for the two distributions. The two measures were more strongly correlated during POST, and the slope of the linear regression line was steeper too ($p < 10^{-6}$). (b) Average reactivation strength from POST versus PRE. Encoding strength is color coded. The points tend to lie above the line representing the identity function, showing that mean reactivation was stronger during POST. This effect was stronger for components with higher encoding strength
[Figure panels: (a) PRE and POST, AWAKE Eigenvalue (λ) vs. γ PRE and γ POST, annotated r = 0.3, s = 0.19 and r = 0.61, s = 0.43; (b) γ PRE vs. γ POST with the line y = x, color scale Φ]
Fig. 4 Distribution of the R measure during the TEMPLATE epochs ($R^{TE-TE}$). Data are from the same three sessions as in Fig. 1 (AWAKE). (a) Distribution of R across all time bins for the first principal component of each of the three sessions (representative of the dataset; the respective encoding strengths $\Phi = \lambda/\lambda_{max}$ are displayed on top of the distributions). Real data (black), theoretical expectation (red) derived from a Monte-Carlo sampling of Eq. (32) ($n = 10^5$), and a numerical simulation using normal multivariate data with the same correlation matrix as the actual data (blue). (b) Same plots as (a) but in log-log scale. (c) Distribution of the α and β terms from Eq. (32). High encoding strength eigenvectors (e.g. the one at the bottom) tend to exhibit a clear power-law distribution of their R measure

the standardized population vector is written $Q_t = [Q_{1t}, \ldots, Q_{it}, \ldots, Q_{Nt}]^T$. Let $W_t$ be defined as $W_t = Q_t Q_t^T$. $W_t$ can be decomposed into a diagonal matrix $W_t^D$ and the remaining matrix $W_t^R$, therefore:

$$R_l^{TE-TE}(t) = (p^{(l)})^T W_t^R \, p^{(l)} \tag{17}$$

whence,

$$\left\langle R_l^{TE-TE}(t) \right\rangle_t = (p^{(l)})^T \left\langle W_t^R \right\rangle_t p^{(l)} \tag{18}$$

By definition, $\langle W_t \rangle_t = C$ and thus $\langle W_t^R \rangle_t = C - I$, which gives:

$$\left\langle R_l^{TE-TE} \right\rangle_t = \lambda_l - 1 \tag{19}$$

In the case where the MATCH epoch is different from the TEMPLATE, we have:

$$\left\langle R_l^{MA-TE} \right\rangle_t = \gamma_l^{MATCH} - 1 \quad \text{where} \quad \gamma_l^{MATCH} = (p^{(l)})^T C^{MATCH} p^{(l)} \tag{20}$$

As mentioned above, memory trace reactivation studies aim at comparing awake activity (the TEMPLATE epoch here is the AWAKE epoch) with the subsequent sleep epoch (POST epoch, taking the role of the MATCH epoch). In Fig. 2 the epoch-wide time course of R for an example principal component from a recording of an ensemble of mPFC neurons is displayed for the PRE and POST epochs. Transient peaks can be observed that are much stronger in POST, and concentrated in the periods of identified slow wave sleep (SWS) (Peyrache et al. 2009). However, the baseline level is comparable between PRE and POST epochs. For this reason, from this point on, sleep epochs will always refer to SWS only. The SWS preceding the AWAKE epoch is taken as a control (PRE epoch).

The variable $\gamma_l^{PRE}$ (resp. $\gamma_l^{POST}$) quantifies the amount of variance that a given eigenvector from AWAKE could explain during the PRE epoch (resp. POST). The empirical distribution of $\gamma_l^{PRE}$ (resp. $\gamma_l^{POST}$) as a function of $\lambda_l$ is shown in Fig. 3(a). During POST, $\gamma_l$ was more correlated to $\lambda_l$ than in PRE, indicating that the correlation structure is more similar to that measured during AWAKE than it is in PRE. Note that, if it held that $C^{POST} = C^{AWAKE}$, then $\gamma_l^{POST} = \lambda_l$ for all $l$. In general, this is not the case, for example, because the sleep correlation structure includes patterns that are characteristic of that behavioral phase, and these do not appear during the AWAKE epoch. In any event, during POST the regression line between $\lambda_l$ and the corresponding values of $\gamma_l$ has a steeper (and closer to 1) slope, indicating that the POST correlation matrix is more similar to its AWAKE analog than the one computed for PRE. Figure 3(b) shows an eigenvector-by-eigenvector (combined across sessions) comparison of the reactivation strengths during PRE and POST. While it is apparent that some reactivation strengths appear during PRE as well, most eigenvectors showed a larger value for POST, especially for eigenvectors associated with large eigenvalues. During PRE, reactivation strengths were nevertheless still substantial. This could be due, as mentioned above, to structural correlations, as well as to neural processes reflecting anticipation of the upcoming task (or perhaps lingering reactivations of yet earlier experiences).

3.3 Distribution of R

In exploring the time course of the reactivation measure R, one interesting question that emerges is the nature of its variability. One possibility is that R fluctuates steadily around an average value (possibly different for each epoch), as would be the case, for example, if the underlying spike trains were a Gaussian process. Alternatively, power-law behavior for the distribution of R values would indicate that the temporal evolution of R is dominated by strong transients, as would result, for example, from “avalanche” dynamics (Beggs and Plenz 2003, 2004; Levina et al. 2007). In fact, if spike trains are multivariate normally distributed variables, the distribution of R can be computed and compared with experimental data. Let us consider the case in which the TEMPLATE activity is considered fixed, and we shall compute the distribution of R when the columns of the $Q^{MATCH}$ matrix are drawn from a multivariate normal distribution with covariance matrix $C$, $Q_t^{MATCH} \sim \mathcal{N}(0, C)$.

In this case, for $m$ different time bins, $Q$ is an $m \times N$ matrix and $W = Q^T Q$ is an $N \times N$ matrix drawn from the so-called Wishart distribution with $m$ degrees of
[Figure: panels (a) to (c), one row per encoding strength (φ = 1.3, 1.6, 1.9). Panels (a) and (b): log probability vs. reactivation strength for actual data, the theoretical distribution, and simulated data. Panel (c): the same comparison for the individual terms α and β.]
freedom, $W \sim \mathcal{W}_N(C, m)$. It can be shown that, for any given N-dimensional vector $z$:

$$\forall z \in \mathbb{R}^N, \quad z^T W z \sim \sigma_z^2 \chi_m^2 \tag{21}$$

where $\sigma_z^2 = z^T C z$. In particular, if $z = p^l$ and $C = C^{TEMPLATE}$, it leads to:

$$p^{lT} W p^l \sim \lambda_l \chi_m^2 \tag{22}$$
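The scaling stated in Eq. (21) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the sizes `N`, `m`, `n_draws` and the covariance matrix `C` are arbitrary illustrative choices. It draws many Wishart-distributed matrices $W = Q^T Q$ and compares the first two moments of $z^T W z$ with those of $\sigma_z^2 \chi_m^2$ (mean $m\sigma_z^2$, variance $2m\sigma_z^4$).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): cells, time bins, Monte-Carlo repetitions
N, m, n_draws = 5, 20, 20000

# Hypothetical covariance matrix C; any symmetric positive-definite matrix would do
A = rng.standard_normal((N, N))
C = A @ A.T / N + np.eye(N)

z = rng.standard_normal(N)   # arbitrary fixed vector z
sigma2 = z @ C @ z           # sigma_z^2 = z^T C z

# Each draw is an (m x N) matrix Q whose rows are N(0, C) samples.
# With W = Q^T Q, the quadratic form z^T W z equals ||Q z||^2.
Q = rng.multivariate_normal(np.zeros(N), C, size=(n_draws, m))  # shape (n_draws, m, N)
quad = np.sum((Q @ z) ** 2, axis=1)

# Compare with the moments of sigma_z^2 * chi^2_m
mean_ratio = quad.mean() / (m * sigma2)
var_ratio = quad.var() / (2 * m * sigma2 ** 2)
print(mean_ratio, var_ratio)  # both ratios should be close to 1
```

Matching two moments does not prove the full distributional identity; it is only a quick sanity check of the scaling by $\sigma_z^2$ and of the $m$ degrees of freedom.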
Let us assume that for the population vector $Q_t = [Q_{1t}, \ldots, Q_{it}, \ldots, Q_{Nt}]^T$ the $Q_{it}$ are drawn from a multivariate normal distribution, $Q_t \sim \mathcal{N}(0, C)$, where $C$ is the covariance matrix of the multivariate distribution (as the columns of $Q$ are by definition z-transformed, $C$ is also the correlation matrix). From Eq. (13), $R_l$ could be written as:

$$R_l(t) = Q_t^T P^{(l)} Q_t - Q_t^T D^{(l)} Q_t \tag{23}$$

where $D^{(l)}$ is a diagonal matrix whose elements are $\{(p_i^l)^2\}_{i=1..N}$. The two terms on the right side of Eq. (23) should be considered separately: $\alpha(t) = Q_t^T P^{(l)} Q_t$ and $\beta(t) = Q_t^T D^{(l)} Q_t$. First, $\alpha(t)$ is easily deduced from Eq. (22):

$$\alpha(t) = Q_t^T P^{(l)} Q_t \tag{24}$$

$$= p^{lT} Q_t Q_t^T p^l \tag{25}$$

$$= p^{lT} W_t p^l \tag{26}$$

where $W_t = Q_t Q_t^T$ follows a Wishart distribution with one degree of freedom, such that $\alpha \sim \lambda_l \chi_1^2$ in the case $C = C^{TEMPLATE}$.
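The decomposition of Eq. (23) into the terms α and β can be illustrated on synthetic data. This is a sketch under stated assumptions rather than the authors' code: the correlation matrix `C` is hypothetical, the data are drawn from $\mathcal{N}(0, C)$, and the array `Q` is stored as (time bins × cells), so that each row is one population vector $Q_t^T$. It checks that $\alpha - \beta$ equals the sum over off-diagonal terms and that the mean of α is close to $\lambda_l$, as expected for $\lambda_l \chi_1^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, B = 8, 50000   # cells, time bins (illustrative values)

# Hypothetical correlation matrix C (unit diagonal, positive definite)
A = rng.standard_normal((N, N))
cov = A @ A.T + N * np.eye(N)
d = np.sqrt(np.diag(cov))
C = cov / np.outer(d, d)

# Leading principal component p^l and its eigenvalue lambda_l
lam, vecs = np.linalg.eigh(C)          # eigenvalues in ascending order
p, lam_l = vecs[:, -1], lam[-1]

# Synthetic activity with C as both TEMPLATE and MATCH correlation matrix
Q = rng.multivariate_normal(np.zeros(N), C, size=B)   # shape (B, N)

alpha = (Q @ p) ** 2            # Q_t^T P Q_t with P = p p^T
beta = (Q ** 2) @ (p ** 2)      # Q_t^T D Q_t with D = diag(p_i^2)
R = alpha - beta                # Eq. (23)

# Direct evaluation: sum over i != j of Q_it P_ij Q_jt
P_off = np.outer(p, p)
np.fill_diagonal(P_off, 0.0)
R_direct = np.einsum('ti,ij,tj->t', Q, P_off, Q)

print(np.allclose(R, R_direct), alpha.mean() / lam_l, beta.mean())
```

Since each $Q_{it}$ has unit variance and $\sum_i p_i^2 = 1$, the mean of β is close to 1, so the mean of R is close to $\lambda_l - 1$ in this setting.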
[Figure 5: panels (a) to (c), one row per encoding strength (φ = 1.3, 1.6, 1.9). Panels (a) and (b): log probability vs. reactivation strength for PRE and POST, actual data and random normal data. Panel (c): average reactivation strength for PRE and POST.]

Fig. 5 Distribution of the R measure during the MATCH epochs (PRE and POST). Data are from the same three principal components as in Fig. 4 (AWAKE). (a) Distribution of R for PRE (left) and POST (right) sleep of the real data (black) and the theoretical expectation (red) derived from a Monte-Carlo sampling of Eq. (33) ($n = 10^5$). (b) Same plots as (a) but in log-log scale showing a clear power law decay in sleep POST for high encoding strength components (second and third rows). (c) Bar plot of average of the distributions of actual data shown in (a) and (b) for PRE (left) and POST (right). Note that the average reactivation strength is equal to γ − 1. The difference in the means seemed to be related to the difference in the tails of the distributions
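The Monte-Carlo sampling mentioned in the caption can be sketched as follows. Under the multivariate normal hypothesis, Eq. (33) describes R as the difference between $\gamma \chi_1^2$ and a gamma variable of shape $m$ and scale $1/m$, with $m = \frac{1}{2}(\sum_i p_i^4)^{-1}$. The function name `sample_R_null`, the weight vector and the value of γ below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sample_R_null(p, gamma, n=100_000, seed=0):
    """Draw n samples of gamma * chi2_1 - Gamma(shape=m, scale=1/m),
    where m = 1 / (2 * sum_i p_i^4) and p is normalized to unit length."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)
    p = p / np.linalg.norm(p)              # enforce sum_i p_i^2 = 1
    m = 1.0 / (2.0 * np.sum(p ** 4))       # gamma shape parameter
    alpha = gamma * rng.chisquare(df=1, size=n)
    beta = rng.gamma(shape=m, scale=1.0 / m, size=n)
    return alpha - beta                    # alpha and beta drawn independently

# Hypothetical example: 40 cells with random weights and gamma = 1.6
p = np.random.default_rng(1).standard_normal(40)
R = sample_R_null(p, gamma=1.6)
print(R.mean())   # close to gamma - 1, as noted in the caption
```

A histogram of `R` on a logarithmic probability axis gives the kind of exponential-tailed reference curve against which the empirical distributions are compared.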
The "auto correlation" term $\beta(t)$ is a weighted sum of $\chi^2$ distributed variables whose number of degrees of freedom is not known a priori:

$$\beta(t) = Q_t^T D^{(l)} Q_t \tag{27}$$

$$= \sum_i (p_i^l)^2 Q_{it}^2 \tag{28}$$

A common approximation (Imhof 1961) of a weighted sum of chi-squares is a gamma distribution whose first two moments are the same as those of the sum. For a gamma distribution $\Gamma_{k,\theta}$ of shape parameter $k$ and scale parameter $\theta$, this gives (with the superscript of $p^l$ omitted):

$$k\theta = \sum_i p_i^2 \quad \text{and} \quad k\frac{\theta^2}{2} = \sum_i p_i^4 \tag{29}$$
[Figure 6: panel (a), reactivation strength vs. averaged z-score for PRE and POST, with inset showing the probability distribution of the averaged z-score; panel (b), percentage of POST bins over the 99th percentile of the actual PRE distribution, by encoding strength group; panel (c), average reactivation for real and shuffled data in sleep PRE and POST, by encoding strength group (1 < Φ < 1.4, 1.4 < Φ < 1.8, 1.8 < Φ).]

Fig. 6 Effect of instantaneous global fluctuations of firing rate on the reactivation strength measure. (a) For one principal component recorded during one session (third example of Figs. 1, 4 and 5), the scatterplots show the dependence between the instantaneous activation (expressed as the instantaneous z-score averaged over all recorded cells) and the corresponding reactivation strength. In black, the actual data are shown, vs. the 99th percentile of the shuffled control. The data pertain to the PRE epoch (left) and the POST epoch (right). In POST, but not in PRE, a larger number (3.4%) of points than expected by chance is above the 99th percentile, showing that reactivation effects are not likely to be the product of activity fluctuations alone. Meanwhile, 4.5% of POST bins were above the 99th percentile of the PRE distribution. Right inset: Distribution of the averaged z-score, measuring the degree of instantaneous population activity, for all time bins in PRE (blue) and POST (red) sleep. The two distributions are not different. (b) Pooled data of the number of POST bins exceeding the 99th percentile of PRE distributions. Signal components were grouped according to their encoding strength. The percentages were significantly over 1% for the three groups (p < 0.05, t-test) and, individually, percentages were correlated with encoding strengths (r = 0.39, $p < 10^{-12}$). (c) Pooled data from all principal components computed from all available recording sessions comparing the reactivation measure average for sleep PRE and POST with shuffled measures for the two same epochs. The difference between PRE and POST sleep epochs was significant for the three signal groups (p < 0.05, paired t-test), but not for shuffled measures. Furthermore, the averages for shuffled data were one order of magnitude less than actual measures
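The shuffled control referred to in the caption can be sketched as below. This is an illustrative reconstruction under assumptions, not the authors' implementation: the function names are hypothetical, `Q` is stored as (time bins × cells) with z-scored columns, and the control reassigns the component weights to random cells, which leaves the population-averaged z-score of every bin unchanged.

```python
import numpy as np

def reactivation(Q, p):
    """R(t) = (p . Q_t)^2 - sum_i p_i^2 Q_it^2 for Q of shape (time bins, cells)."""
    return (Q @ p) ** 2 - (Q ** 2) @ (p ** 2)

def shuffled_percentile(Q, p, n_shuffles=1000, q=99, seed=0):
    """Per-bin q-th percentile of R(t) over random reassignments of weights to cells."""
    rng = np.random.default_rng(seed)
    R_shuf = np.empty((n_shuffles, Q.shape[0]))
    for s in range(n_shuffles):
        R_shuf[s] = reactivation(Q, rng.permutation(p))
    return np.percentile(R_shuf, q, axis=0)

# Synthetic example with independent cells (no assemblies present)
rng = np.random.default_rng(1)
Q = rng.standard_normal((500, 20))
Q = (Q - Q.mean(axis=0)) / Q.std(axis=0)     # z-score each cell
p = rng.standard_normal(20)
p /= np.linalg.norm(p)

threshold = shuffled_percentile(Q, p, n_shuffles=200)
frac_above = np.mean(reactivation(Q, p) > threshold)
print(frac_above)   # expected to stay near the nominal 1% when cells are independent
```

With real data, a fraction of bins well above the nominal 1% would point to co-activation patterns that global rate fluctuations alone do not explain.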
which leads to (recalling that $\sum_i p_i^2 = 1$):

$$k = \theta^{-1} = \frac{1}{2}\Big(\sum_i p_i^4\Big)^{-1} \tag{30}$$

hence, β is equivalent to¹

$$\beta \sim \Gamma_{m,m^{-1}} \quad \text{where} \quad m = \frac{1}{2}\Big(\sum_i p_i^4\Big)^{-1} \tag{31}$$

¹ Note that $\beta \sim (2m)^{-1} \chi_{2m}^2$, such that β follows a normalized $\chi^2$ distribution whose degree of freedom is $(\sum_i p_i^4)^{-1}$.

Finally, if α and β are assumed independent, the theoretical distributions of $R_l^{TE-TE}$ and $R_l^{MA-TE}$ are:

$$R_l^{TE-TE} \sim \lambda_l \chi_1^2 - \Gamma_{m,m^{-1}} \tag{32}$$

$$R_l^{MA-TE} \sim \gamma_l^{MATCH} \chi_1^2 - \Gamma_{m,m^{-1}} \tag{33}$$

This result leads to an important conclusion: even if α and β were correlated, the tail of the distribution could not be "heavier" or more skewed than an exponential distribution. Nevertheless, as we shall see in the following section, experimental evidence shows that those distributions are actually power-laws. The distributions of $R^{TE-TE}$ for the first eigenvectors of the 3 data sets presented in Fig. 1 are plotted in Fig. 4(a), against the theoretical curve (red) and the result from multivariate normal data simulations (blue). The very same distributions are shown in log-log scales in Fig. 4(b) to highlight the power-law tails of the distributions. The higher the encoding strength ($\lambda/\lambda_{max}$), the better the tail is fitted with a power-law (in other words the tail is linear in log-log plots). Figure 4(c) shows the theoretical (under the multivariate normal hypothesis) and empirical distribution for the individual terms α and β.

3.4 Reactivation is a rare event

The significant increase of the average of reactivation measures from PRE sleep to POST sleep (Fig. 3(c), see also Peyrache et al. 2009) might not be the most relevant parameter which changes with learning. Indeed, as shown in Fig. 2, the reactivation measure shows prominent transient 'spikes' during POST sleep associated with a simultaneous increase in firing of the cells associated with the highest weights in the principal component. During POST, reactivation strength distributions deviate strongly from the multivariate normal case, and their tail can be well fit with a power law (Fig. 5). Such deviation from the theoretical distribution is less marked during PRE, despite some hints of power law behavior.

In principle, the heavier tail of the reactivation strength distribution during POST observed in Fig. 5 could result from an increase in variability over the global population instantaneous firing rate. The standardization of the binned spike trains for each cell (corresponding to the rows of the Q matrix) does not prevent the instantaneous firing rate from varying considerably, for example because of the UP/DOWN state bistability dominating cortical activity during sleep (Steriade 2006). In order to control for this possibility, we computed the reactivation strength from shuffled data where, for each time bin, the identity of the cells was randomly permuted. This shuffling procedure preserves the instantaneous global firing rate (and its fluctuations), but it destroys the patterns of co-activation. In Fig. 6(a), from the same session presented in Fig. 2, the reactivation measure was computed for one principal component while the eigenvector weights (or equivalently, the identities of the cells in the multi-unit spike train) were shuffled 1000 times. The 99th percentile of the resulting distribution, for each time bin, is shown as the grey curve superimposed upon actual reactivation measure data (black dots). Those points are plotted as a function of the average firing rate (in z-score), which represents the global activation of the cell population. There is a relation between instantaneous global activation and the upper bound of the distribution of shuffled measures (the 99th percentile) which is similar in PRE and POST sleep. Nevertheless, while the actual reactivation measure remained within the expected bounds in PRE sleep (only 1% of the bins exceeded the shuffled measure), the actual reactivation measure largely exceeded this confidence interval in POST sleep (3.4% of the bins were above the 99th percentile). To check whether this could be due to a difference in global population activation, the PRE and POST sleep distributions of average z-score were compared (right inset) and, indeed, showed no difference (Kolmogorov-Smirnov test, p = 0.11).
This difference in the tail of the distribution is very important for the excess reactivation strength in POST with respect to PRE which we take as evidence for memory replay. In the example of Fig. 6(a), 4.5% of the bins from POST exceeded the 99th percentile of the PRE distribution. Hence, the difference in tails of PRE and POST distributions (as in the examples in Fig. 5) resulted in a higher probability for reactivation strength values from POST sleep to exceed the 99th percentile of the PRE reactivation strength distribution than the expected 1% (Fig. 6(b)) in an encoding strength dependent manner: the percentage of "outliers" is significantly above chance for all groups of components and it increases with encoding strength. Whereas average
actual reactivation strength differences between PRE and POST (Fig. 6(c), significant for all groups, p < 0.05, t-test) show the same profile as the increased number of outliers in POST sleep (Fig. 6(b)), there was no difference in mean of the shuffled reactivation strengths. Furthermore, reactivation strengths for shuffled data were on average one order of magnitude smaller than reactivation strengths computed from actual data.

[Figure 7: cumulative epoch reactivation vs. reactivation strength (logarithmic axis), one curve per encoding strength group (Φ < 1, 1 < Φ < 1.4, 1.4 < Φ < 1.8, 1.8 < Φ).]

Fig. 7 Cumulative average computed with Eq. (34) for components of the whole data sets separated in the same four groups as in Fig. 6. Black diamonds display the 99th percentile of the POST distribution. This shows that for the highest encoding strength, half of the difference between PRE and POST (represented by the asymptotic value of each curve) is explained by only 1% of the bins of POST sleep

[Figure 8: panel (a), reactivation strength of the 1st and 2nd PC (top) and raster of cells (bottom) vs. time (s); panel (b), correlation vs. time lag (s).]

Fig. 8 Interaction between two simultaneous reactivation strengths. (a) Example of reactivation strength timecourse for two signal components from the same session during SWS (top) with the simultaneous cell activity (raster plot, bottom; each row represents the spike train from one cell). The red and green rasters (respective to the colored reactivation strength traces) show the spike activity of the six cells associated with the highest weights in each component (i.e. with the highest contribution, see below). Other spike trains (in black) are displayed in random order. Each peak of the reactivation strengths is associated with a transient increase of firing rates of the cells with the highest weights in the respective component. (b) Cross-correlogram of the reactivation strengths for the two components during SWS showing marked negativity at 0 lag

These brief, sharp increases in the reactivation strength time course (Fig. 2, or similarly the outliers in the distribution from POST) accounted for a large part of the difference between the average reactivation strengths. This can be seen in the cumulative contributions:

$$R_{-\infty}^{r} = \int_{-\infty}^{r} u P(u)\, du \tag{34}$$

whose difference between POST sleep and PRE sleep is shown in Fig. 7. The patterns were grouped according
[Figure 9: panel (a), individual contribution (%) vs. cells' weights in p1 (1st PC) and in p2 (2nd PC); panel (b), individual contribution on R2 (%) vs. individual contribution on R1 (%).]

Fig. 9 Contribution of individual spike trains to the overall reactivation strength. (a) Contribution of all the cells recorded during one session to the average of $R_1$ (or $R_2$), the reactivation strength of the first (or second) component, as a function of PC weights for the first (resp. the second) principal components $p^1$ (or $p^2$). (b) Scatter plot of the contributions to $R_1$ (x-axis) and $R_2$ shows that the sets of high-contribution cells for the two components are virtually disjoint
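The per-cell contribution plotted in Fig. 9 is defined in Eqs. (35) to (37) as $I_l^k = \frac{1}{2}\left(1 - \langle R_l^{-k} \rangle_t / \langle R_l \rangle_t\right)$, where $R_l^{-k}$ is the reactivation strength computed with cell $k$ silenced. The sketch below illustrates that definition on synthetic data; the function name `contributions`, the (time bins × cells) layout of `Q`, and the artificial assembly formed by cells 0 to 3 are assumptions made for the example.

```python
import numpy as np

def contributions(Q, p):
    """I^k = 0.5 * (1 - <R^{-k}>_t / <R>_t) for each cell k; Q is (time bins, cells)."""
    def mean_R(Qm):
        # time average of (p . Q_t)^2 - sum_i p_i^2 Q_it^2
        return np.mean((Qm @ p) ** 2 - (Qm ** 2) @ (p ** 2))
    R_full = mean_R(Q)
    I = np.empty(Q.shape[1])
    for k in range(Q.shape[1]):
        Qk = Q.copy()
        Qk[:, k] = 0.0                      # silence cell k in every time bin
        I[k] = 0.5 * (1.0 - mean_R(Qk) / R_full)
    return I

# Synthetic data: cells 0-3 share a common input (hypothetical assembly)
rng = np.random.default_rng(2)
B, N = 5000, 12
Q = rng.standard_normal((B, N))
Q[:, :4] += rng.standard_normal((B, 1))
Q = (Q - Q.mean(axis=0)) / Q.std(axis=0)    # z-score each cell

# Leading principal component of the correlation matrix
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Q, rowvar=False))
p = eigvecs[:, -1]

I = contributions(Q, p)
print(I.sum())   # equals 1 up to floating-point error
```

The factor 1/2 appears because each off-diagonal pair $(i, j)$ is removed twice when summing over silenced cells, once for $k = i$ and once for $k = j$.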
to encoding strength ($\Phi = \lambda/\lambda_{max}$). For distributions $P(u)$ with an exponential tail, this function will reach an asymptotic value, indicating that large values contribute little to the overall average. Diverging values of $R_{-\infty}^{r}$ (e.g. $\propto \log(r)$) are indicative of a $P(u)$ with a tail decaying with a power law. This function converges asymptotically to a value equal to the difference of the average reactivation strength between POST and PRE (also equal to $\gamma^{POST} - \gamma^{PRE}$). The black diamonds indicate the 99th percentile of the POST sleep reactivation distribution. Hence, up to half of the difference (for the highest encoding strength) between POST and PRE sleep average of reactivation strength is due to one percent of the time bins from POST sleep, that is, the bins in which the transient reactivation events took place.

3.5 Interactions between different cell assemblies

Different principal components, referring to the same data, tend to activate at different times, and their activation is concomitant with the firing of independent cell groups (Fig. 8(a)). Interestingly, as shown in Fig. 8(b), the time courses of R for the two principal components show a trough for zero-lag in the cross-correlation, showing that the simultaneous activation of the two components was less likely than in the case of uncorrelated time course. This effect was observed for all pairs of principal components computed from the same sessions (Peyrache et al. 2009).
Peaks of R correspond to transient synchronization, or co-firing, of cells with same-signed weights in the principal vector. Nevertheless, each spike train contributes differently to any particular reactivation strength, likely depending on its associated weight in the principal component. To quantify the contribution of the kth cell, the reactivation strength $R_l^{-k}$ was computed with $\forall t, Q_{kt} = 0$, or by removing the terms depending on $Q_{kt}$ in Eq. (13):

$$R_l^{-k}(t) = \sum_{\substack{i,j:\, i \neq j \\ i \neq k,\, j \neq k}} Q_{it} P_{ij}^{(l)} Q_{jt} \tag{35}$$

then the contribution was defined as:

$$I_l^k = \frac{1}{2}\left(1 - \frac{\langle R_l^{-k} \rangle_t}{\langle R_l \rangle_t}\right) \tag{36}$$

The normalization factor 1/2 has been derived from simple calculation so that

$$\sum_k I_l^k = 1 \tag{37}$$

Figure 9(a) shows an example of the distribution of the contribution for two signal components in a single day. The joint distribution of individual cells' contributions to those two patterns (Fig. 9(b)) indicates no overlap between identities of highly influential cells.

4 Discussion

This study shows that a simple and linear pattern separation method such as PCA can be powerful in the identification and characterization of cell assemblies
in brain recordings. This is an important part of the study of the replay phenomenon, where two epochs must be compared, one in which assemblies would be encoded, and another one, in which the same assemblies might be replayed again. By construction, our method is a simple extension of the seminal work of Bruce McNaughton and co-workers (Wilson and McNaughton 1994; Kudrimoti et al. 1999), offering two important new features: first, a detailed time course for replay is obtained, at the scale of the chosen temporal bin (100 ms in the present study). The resulting resolution is much finer than what can be achieved if replay is only measured from the similarity between the epoch-wide correlation matrices. This has important consequences for the study of the physiology of replay. In particular, we have found that replay takes place for the most part in discrete, transient events (see e.g. Figs. 2 and 5), which correspond to the coordinated activation of subgroups of cells. In fact, such transients mostly take place during UP states characteristic of slow-wave sleep. These are periods of elevated, relatively steady activation, when measured at the level of the global neuronal population. However, a very different time course is uncovered when we consider the dynamics of subgroups of cells, defined by co-activations measured during wakefulness: an avalanche-like dynamics (Beggs and Plenz 2003, 2004; Levina et al. 2007), which is embedded in a generally more regular population dynamics. Moreover, a detailed view of the temporal evolution of replay has allowed us to explore the links between this phenomenon on one hand and hippocampal sharp waves (crucial for hippocampal replay (Kudrimoti et al. 1999)) and UP-DOWN state transitions on the other, showing how replay is an integral part of hippocampal-cortical interactions and sleep physiology (Peyrache et al. 2009).
Second, PCA allows us to tease apart the dynamics of different cell assemblies, corresponding to different principal components. Interestingly, distinct subgroups seldom reactivate at the same instant, suggesting that some sort of pattern separation mechanism may take place during sleep. Because the time courses of the different principal components are un- (or anti-)correlated (Fig. 8), separating them reveals details of the temporal evolution which would be otherwise averaged out, for example, the transients discussed above.
This measure also lends itself to rigorous mathematical analysis, making some inroads towards precisely defined null hypotheses to be tested against the experimental results. The known eigenvalue spectrum of correlation matrices from purely random data (Marčenko and Pastur 1967; Sengupta and Mitra 1999) allows determination of which of the principal components in a given data set are likely to carry meaningful information; in the data considered here, up to 5 or 6 PCs can be found in a simultaneous recording of 30–50 neurons (Fig. 1). This could be seen as a generalization in N dimensions of the classical Pearson test for pairwise correlations. The boundaries of the support of the theoretical distribution for the eigenvalues, $\lambda_{min}$ and $\lambda_{max}$, can be taken as the critical value for the rejection of the null hypothesis. In the range of parameters corresponding to our practical experimental situation (number of variables, $N \sim 50$, number of time bins, $b \sim 10^4$ for a bin size of 100 ms) these boundaries are sharp, as demonstrated by the Tracy-Widom estimate of the variance of the distribution of the largest eigenvalue (Tracy and Widom 1994). Thus, an analysis procedure that considers as 'signal' the principal components associated with an eigenvalue larger than $\lambda_{max}$ is justified from the theoretical point of view. This allows us to identify certain principal components as signal-carrying cell assemblies.
In the next stage of the analysis, the R measure is computed, measuring the time course of replay during the PRE and the POST epoch. In principle, replay could be the result of a continuous process, for example one that modified the probability of co-activation of cell pairs, as a consequence of synaptic modification. In this case, one would expect an exponentially tailed distribution for the R values. This was indeed verified analytically, under the hypothesis of multivariate, normally distributed data.
Reactivation strengths are greater than chance levels both in PRE and POST sleep. This could be due to structural correlations, pre-encoded in the synaptic matrix. Such correlations would be present in ensemble activity at all times, both in spontaneous and in behaviorally evoked activity, and would not have to encode any task-relevant information. It is also possible that, during PRE sleep, the prefrontal cortex is already engaged in processes anticipatory of the task. This could explain the similarities between the activity in the PRE and AWAKE epochs.
Nevertheless, POST sleep shows a significantly greater degree of replay. This can be observed by empirically comparing the R distributions for PRE and POST. Interestingly, most of the difference between PRE and POST is accounted for by the very large data points in the tail of the distribution, so that, for the principal components associated with the largest eigenvalues, up to 50% of the difference is accounted for by only the largest 1% of the points. It seems therefore likely that the large transients in the replay measure are at least in part a consequence of replay. Moreover,
it is possible that during experience, synaptic plasticity operates by modifying and strengthening existing cell assemblies (during gradual learning for example), as opposed to creating new ones from scratch. This would also help to explain why reactivation strength for the same eigenvectors may be high both in PRE and POST (albeit stronger in POST).
Our method, in its current version, has some limitations. For one, it does not provide a way to directly discount structural correlation patterns present in the PRE epoch from the templates extracted from the AWAKE epoch, which would obviate the need for a comparison of the empirical PRE and POST distributions. Also, it would be important to compute analytical bounds for quantities under null hypotheses less stringent than that of multivariate normal spike trains. Still, this technique has already led to scientific results of relevance (Peyrache et al. 2009). As another example of use of this technique, as mentioned above, the sleep epoch can serve as a template for detecting matches in the awake epoch: we extracted patterns from PCA applied to the POST epoch, and matched them on the activity during the AWAKE epoch. This allowed us to assess which behavioral phases of the task were represented the most in the sleep activity (and possibly preferentially consolidated). We concluded that this coincided with activity at the "choice point" of the maze, i.e. the fork of the Y-maze where the animal had to commit to a potentially costly choice. Also, the effect depended upon learning: sleep-derived activity patterns were more concentrated at the decision point after the rat acquired the rule governing reward (Peyrache et al. 2009). These initial results provide hope that this method, for its relative simplicity and ease of approach with mathematical tools, may spur further experimental and analytical work.

Acknowledgements We thank Dr. A. Aubry for interesting discussions. Supported by Fondation Fyssen (FPB), Fondation pour la Recherche Medicale (AP), EC contracts FP6-IST 027819 (ICEA), FP6-IST-027140 (BACS), and FP6-IST-027017 (NeuroProbes).

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

Beggs, J. M., & Plenz, D. (2003). Neuronal avalanches in neocortical circuits. Journal of Neuroscience, 23(35), 11167–11177.
Beggs, J. M., & Plenz, D. (2004). Neuronal avalanches are diverse and precise activity patterns that are stable for many hours in cortical slice cultures. Journal of Neuroscience, 24(22), 5216–5229.
Benchenane, K., Peyrache, A., Khamassi, M., Battaglia, F. P., & Wiener, S. I. (2008). Coherence of theta rhythm between hippocampus and medial prefrontal cortex modulates prefrontal network activity during learning in rats. Soc Neurosci Abstr, 690.15.
Biroli, G., Bouchaud, J. P., & Potters, M. (2007). On the top eigenvalue of heavy-tailed random matrices. EPL (Europhysics Letters), 78(1), 10001+.
Bishop, C. M. (1995). Neural networks for pattern recognition. New York: Oxford University Press.
Bourlard, H., & Kamp, Y. (1988). Auto-association by multilayer perceptrons and singular value decomposition. Biological Cybernetics, 59(4), 291–294.
Buzsáki, G. (2004). Large-scale recording of neuronal ensembles. Nature Neuroscience, 7(5), 446–451.
Chapin, J. K., & Nicolelis, M. A. (1999). Principal component analysis of neuronal ensemble activity reveals multidimensional somatosensory representations. Journal of Neuroscience Methods, 94(1), 121–140.
Euston, D. R., Tatsuno, M., & McNaughton, B. L. (2007). Fast-forward playback of recent memory sequences in prefrontal cortex during sleep. Science, 318(5853), 1147–1150.
Fujisawa, S., Amarasingham, A., Harrison, M. T. T., & Buzsáki, G. (2008). Behavior-dependent short-term assembly dynamics in the medial prefrontal cortex. Nature Neuroscience, 11, 823–833.
Geisler, C., Robbe, D., Zugaro, M., Sirota, A., & Buzsáki, G. (2007). Hippocampal place cell assemblies are speed-controlled oscillators. Proceedings of the National Academy of Sciences of the United States of America, 104(19), 8149–8154.
Harris, K. D., Csicsvari, J., Hirase, H., Dragoi, G., & Buzsáki, G. (2003). Organization of cell assemblies in the hippocampus. Nature, 424(6948), 552–556.
Hebb, D. O. (1949). The organization of behavior: A neuropsychological theory. New York: Wiley.
Ikegaya, Y., Aaron, G., Cossart, R., Aronov, D., Lampl, I., Ferster, D., et al. (2004). Synfire chains and cortical songs: Temporal modules of cortical activity. Science, 304(5670), 559–564.
Imhof, J. P. (1961). Computing the distribution of quadratic forms in normal variables. Biometrika, 48(3–4), 419–426.
Ji, D., & Wilson, M. A. (2006). Coordinated memory replay in the visual cortex and hippocampus during sleep. Nature Neuroscience, 10(1), 100–107.
Kudrimoti, H. S., Barnes, C. A., & McNaughton, B. L. (1999). Reactivation of hippocampal cell assemblies: Effects of behavioral state, experience, and EEG dynamics. Journal of Neuroscience, 19(10), 4090–4101.
Lee, A. K., & Wilson, M. A. (2002). Memory of sequential experience in the hippocampus during slow wave sleep. Neuron, 36(6), 1183–1194.
Levina, A., Herrmann, J. M., & Geisel, T. (2007). Dynamical synapses causing self-organized criticality in neural networks. Nature Physics, 3(12), 857–860.
Louie, K., & Wilson, M. A. (2001). Temporally structured replay of awake hippocampal ensemble activity during rapid eye movement sleep. Neuron, 29(1), 145–156.
Marr, D. (1971). Simple memory: A theory for archicortex. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 262(841), 23–81.
Marčenko, V. A., & Pastur, L. A. (1967). Distribution of eigenvalues for some sets of random matrices. Mathematics of the USSR-Sbornik, 1(4), 457–483.
McNaughton, B. L., O'Keefe, J., & Barnes, C. A. (1983). The stereotrode: A new technique for simultaneous isolation of several single units in the central nervous system from multiple unit records. Journal of Neuroscience Methods, 8(4), 391–397.
Mokeichev, A., Okun, M., Barak, O., Katz, Y., Ben-Shahar, O., & Lampl, I. (2007). Stochastic emergence of repeating cortical motifs in spontaneous membrane potential fluctuations in vivo. Neuron, 53(3), 413–425.
Nádasdy, Z., Hirase, H., Czurkó, A., Csicsvari, J., & Buzsáki, G. (1999). Replay and time compression of recurring spike sequences in the hippocampus. Journal of Neuroscience, 19(21), 9497–9507.
Nadel, L., & Moscovitch, M. (1997). Memory consolidation, retrograde amnesia and the hippocampal complex. Current Opinion in Neurobiology, 7(2), 217–227.
Nicolelis, M. A., Baccala, L. A., Lin, R. C., & Chapin, J. K. (1995). Sensorimotor encoding by synchronous neural ensemble activity at multiple levels of the somatosensory system. Science, 268(5215), 1353–1358.
Oja, E. (1982). A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology, 15(3), 267–273.
Pipa, G., Wheeler, D. W., Singer, W., & Nikolić, D. (2008). Neuroxidence: Reliable and efficient analysis of an excess or deficiency of joint-spike events. Journal of Computational Neuroscience, 25(1), 64–88.
Peyrache, A., Khamassi, M., Benchenane, K., Wiener, S. I., & Battaglia, F. P. (2009). Replay of rule learning-related patterns in the prefrontal cortex during sleep. Nature Neuroscience (Advanced Online Publication). doi:10.1038/nn.2337.
Ribeiro, S., Gervasoni, D., Soares, E. S., Zhou, Y., Lin, S. C., Pantoja, J., et al. (2004). Long-lasting novelty-induced neuronal reverberation during slow-wave sleep in multiple forebrain areas. PLoS Biology, 2(1), e24.
Riehle, A., Grün, S., Diesmann, M., & Aertsen, A. (1997). Spike synchronization and rate modulation differentially involved in motor cortical function. Science, 278(5345), 1950–1953.
Scoville, W. B., & Milner, B. (1957). Loss of recent memory after bilateral hippocampal lesions. Journal of Neurology, Neurosurgery and Psychiatry, 20(11), 11–21.
Sengupta, A. M., & Mitra, P. P. (1999). Distributions of singular values for some random matrices. Physical Review E, 60(3), 3389+.
Squire, L. R., & Zola-Morgan, S. (1991). The medial temporal lobe memory system. Science, 253(5026), 1380–1386.
Steriade, M. (2006). Grouping of brain rhythms in corticothalamic systems. Neuroscience, 137, 1087–1106.
Tracy, C., & Widom, H. (1994). Level-spacing distributions and the Airy kernel. Communications in Mathematical Physics, 159(1), 151–174.
Wigner, E. P. (1955). Characteristic vectors of bordered matrices with infinite dimensions. The Annals of Mathematics, 62(3), 548–564.
Wilson, M. A., & McNaughton, B. L. (1994). Reactivation of hippocampal ensemble memories during sleep. Science, 265(5172), 676–679.
Wilson, R. I., & Laurent, G. (2005). Role of GABAergic inhibition in shaping odor-evoked spatiotemporal patterns in the Drosophila antennal lobe. The Journal of Neuroscience, 25(40), 9069–9079.
