

Gaussian process
Gaussian process $X_t$: for any choice of $t_1, \dots, t_k$, $X = (X_{t_1}, \dots, X_{t_k})^T$ is a Gaussian random vector.
A Gaussian random process is fully characterized by its 1st and 2nd moments, i.e., by $m_X(t)$ and $R_X(t, s)$.
Any linear or affine transformation of a Gaussian random process is Gaussian, e.g., integration, differentiation, and stable linear filtering.
If samples of a Gaussian random process are uncorrelated, they are independent.
If a Gaussian random process is wss, it is sss.
A stationary Gaussian process with an arbitrary acf or psd can be obtained by filtering white Gaussian noise.
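As a quick numerical illustration of the last point, here is a minimal sketch (not from the lecture): white Gaussian noise is shaped by an FIR filter, and the estimated psd of the output is compared with $|H(f)|^2$. The filter taps, sample sizes, and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter, welch, freqz

# Minimal sketch: a stationary Gaussian process with a prescribed psd is
# obtained by passing white Gaussian noise through a stable filter.
# The taps `b` and the lengths below are illustrative choices.
rng = np.random.default_rng(1)
b = [1.0, 0.5, 0.25]                      # FIR shaping filter
w = rng.standard_normal(500_000)          # white Gaussian noise, S_W(f) = 1
x = lfilter(b, [1.0], w)                  # Gaussian, S_X(f) = |B(e^{j2pi f})|^2

f, s_est = welch(x, fs=1.0, nperseg=4096)  # one-sided psd estimate
_, h = freqz(b, worN=f, fs=1.0)            # shaping-filter response at f
# The one-sided density doubles the two-sided psd away from f = 0 and 1/2.
print(round(np.mean(s_est[1:-1] / (2 * np.abs(h[1:-1])**2)), 2))  # ~1.0
```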

Jointly Gaussian processes $X_t$ and $Y_t$: for any choice of $t_1, \dots, t_k$ and $s_1, \dots, s_l$, $(X_{t_1}, \dots, X_{t_k}, Y_{s_1}, \dots, Y_{s_l})^T$ is a Gaussian random vector.
Jointly Gaussian random processes are fully characterized by their 1st and 2nd moments, i.e., by $m_X(t)$, $m_Y(t)$, $R_X(t, s)$, $R_Y(t, s)$, and $R_{XY}(t, s)$.
Any linear or affine transformation of jointly Gaussian random processes is Gaussian.
If two jointly Gaussian random processes are uncorrelated, they are independent.
If jointly Gaussian random processes are jwss, they are jsss.

White noise
white noise $X_t$: a wss process with $S_X(f) = N_0/2$ watts/Hz
$m_X(t) = 0$, $R_X(\tau) = \frac{N_0}{2}\,\delta(\tau)$
For every $\tau > 0$, $X_t$ and $X_{t+\tau}$ are uncorrelated.
$P_X = E X_t^2 = \int \frac{N_0}{2}\,df = \infty$: infinite average power
If $X_t$ is Gaussian, then for every $\tau > 0$, $X_t$ and $X_{t+\tau}$ are independent.
Thermal noise: Gaussian, $S_X(f) = \frac{N_0}{2}\,\frac{|f|/f_0}{\exp(|f|/f_0) - 1}$,
$N_0 = kT = 3.77 \times 10^{-21}$, $k$: Boltzmann constant,
$f_0 = kT/h = 5.69$ THz, $h$: Planck constant.
$X_t$ and $X_{t+\tau}$ are effectively independent if $\tau$ is larger than a few picoseconds.
discrete-time white noise: uncorrelated identical rvs $\neq$ samples of continuous-time white noise

Matched filter
We discuss continuous-time cases; discrete-time cases are similar.
Consider detecting a deterministic signal $v(t)$ in wss random noise $X_t$ with psd $S_X(f)$:
$v(t) + X_t \;\to\; h(t) \;\to\; w(t) + Y_t$, sampled at $t_0$ and compared with a threshold to decide $H_1$ or $H_0$.
$w(t) + Y_t = \int h(t - \tau)\,(v(\tau) + X_\tau)\,d\tau$, where $w(t) = \int h(t - \tau)v(\tau)\,d\tau$ and $Y_t = \int h(t - \tau)X_\tau\,d\tau$.
goal: Find $h(t)$ that maximizes the signal-to-noise ratio (SNR) at $t_0$: $R = |w(t_0)|^2 / E Y_{t_0}^2$.

$H(f)$, $V(f)$, $W(f)$: Fourier transforms of $h(t)$, $v(t)$, $w(t)$
$|w(t_0)|^2 = \left|\int W(f)\,e^{j2\pi f t_0}\,df\right|^2 = \left|\int H(f)V(f)\,e^{j2\pi f t_0}\,df\right|^2$
$= \left|\int H(f)\sqrt{S_X(f)}\;\frac{V(f)\,e^{j2\pi f t_0}}{\sqrt{S_X(f)}}\,df\right|^2$
$\le \int |H(f)|^2 S_X(f)\,df \;\int \frac{|V(f)|^2}{S_X(f)}\,df$
[Schwarz ineq: $\left|\int x_t y_t\,dt\right|^2 \le \int |x_t|^2\,dt \int |y_t|^2\,dt$]
$= E Y_{t_0}^2 \int \frac{|V(f)|^2}{S_X(f)}\,df$, since $E Y_{t_0}^2 = \int S_Y(f)\,df = \int |H(f)|^2 S_X(f)\,df$.

$R = \frac{|w(t_0)|^2}{E Y_{t_0}^2} \le \int \frac{|V(f)|^2}{S_X(f)}\,df$
Equality holds if and only if $H(f)\sqrt{S_X(f)} = a\,\frac{V^*(f)\,e^{-j2\pi f t_0}}{\sqrt{S_X(f)}}$, or $H(f) = a\,\frac{V^*(f)\,e^{-j2\pi f t_0}}{S_X(f)}$ for some constant $a$: the matched filter, matched to the input signal.
It emphasizes the frequency band where the signal exceeds the noise, while it suppresses the band where the noise exceeds the signal.
$S_Y(f) = |H(f)|^2 S_X(f) = \frac{|a\,V(f)|^2}{S_X(f)}$
$R_Y(\tau) = \int \frac{|a\,V(f)|^2}{S_X(f)}\,e^{j2\pi f \tau}\,df$
With the matched filter,
$w(t) = \int W(f)\,e^{j2\pi f t}\,df = \int H(f)V(f)\,e^{j2\pi f t}\,df = \int a\,\frac{V^*(f)\,e^{-j2\pi f t_0}}{S_X(f)}\,V(f)\,e^{j2\pi f t}\,df$
$= a\int \frac{|V(f)|^2}{S_X(f)}\,e^{j2\pi f (t - t_0)}\,df = a^{-1} R_Y(t - t_0)$
$w(t_0) = a^{-1} R_Y(0) = a^{-1} E Y_t^2$: maximum at $t_0$
$R = \frac{E Y_t^2}{|a|^2}$
When $X_t$ is Gaussian and the decision is made by thresholding the output, the matched filter also minimizes the probability of decision error with a proper threshold level.

If $X_t$ is white with psd $S_X(f) = N_0/2$,
$H(f) = a\,\frac{V^*(f)\,e^{-j2\pi f t_0}}{S_X(f)} = \frac{2a}{N_0}\,V^*(f)\,e^{-j2\pi f t_0}$
$h(t) = \frac{2a}{N_0}\,v(t_0 - t)$: a time-reversed and shifted copy of the signal.
[figure: the signal $v(t)$ and the matched-filter impulse response $h(t)$]
$w(t_0) + Y_{t_0} = \int h(t_0 - \tau)\,(v(\tau) + X_\tau)\,d\tau = \frac{2a}{N_0}\int v(\tau)\,(v(\tau) + X_\tau)\,d\tau$: time correlation
$= \frac{2a}{N_0}\left(\int v^2(\tau)\,d\tau + \int v(\tau)X_\tau\,d\tau\right)$
This shows that the signal energy is fully utilized.
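A minimal discrete-time sketch of this white-noise case (not from the lecture): the matched filter reduces to correlating the received waveform with the known pulse, and the output peaks where the pulse sits. The pulse shape, noise level, and names are assumptions made for the example.

```python
import numpy as np

# Minimal discrete-time sketch of a matched filter in white Gaussian noise:
# h[n] = v[n0 - n], so filtering reduces to correlating with the known pulse.
# The pulse shape, noise level, and lengths are illustrative choices.
rng = np.random.default_rng(2)
v = np.hanning(64)                       # known deterministic pulse
n0 = 200                                 # pulse start in the observation
r = rng.standard_normal(512)             # white Gaussian noise, variance 1
r[n0:n0 + v.size] += v                   # received signal: v plus noise

y = np.correlate(r, v, mode="valid")     # matched-filter output w + Y
print(int(np.argmax(y)), n0)             # the output peaks near the pulse start
```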

Wiener filter
Consider estimating a random signal $V_t$ from a random observation signal $U_t$. They are assumed jwss with zero mean and known psds and cross psd. We discuss continuous-time cases.
$U_t \;\to\; h(t) \;\to\; \hat V_t$
goal: Find $h(t)$ that minimizes the mean-squared error (mse) $E|V_t - \hat V_t|^2$.
$\hat V_t = \int h(t - \tau)\,U_\tau\,d\tau = \int h(\theta)\,U_{t-\theta}\,d\theta$: optimal
$\tilde V_t = \int \tilde h(t - \tau)\,U_\tau\,d\tau = \int \tilde h(\theta)\,U_{t-\theta}\,d\theta$: another linear estimate
$E|V_t - \hat V_t|^2 \le E|V_t - \tilde V_t|^2$ for any $\tilde h(t)$.

[figure: $V_t - \hat V_t$ is orthogonal to the subspace of all processes obtained by linear filtering of $U_t$, in which $\hat V_t$ and $\tilde V_t$ lie]
orthogonality principle: $E\,\tilde V_t\,(V_t - \hat V_t) = 0$ for all $\tilde V_t = \int \tilde h(\theta)\,U_{t-\theta}\,d\theta$, i.e., for all $\tilde h(t)$
$\Longleftrightarrow$ $\hat V_t$ is an mmse estimate.

proof: $E|V_t - \tilde V_t|^2 = E|(V_t - \hat V_t) + (\hat V_t - \tilde V_t)|^2$
$= E|V_t - \hat V_t|^2 + E|\hat V_t - \tilde V_t|^2 + 2E(V_t - \hat V_t)(\hat V_t - \tilde V_t)$
$\ge E|V_t - \hat V_t|^2$ if the last term vanishes.
Since $\hat V_t - \tilde V_t = \int h(\theta)\,U_{t-\theta}\,d\theta - \int \tilde h(\theta)\,U_{t-\theta}\,d\theta = \int (h(\theta) - \tilde h(\theta))\,U_{t-\theta}\,d\theta$,
it is in the subspace, so $E(V_t - \hat V_t)(\hat V_t - \tilde V_t) = 0$ by orthogonality.
optimal filter, Wiener filter:
$0 = E\,\tilde V_t\,(V_t - \hat V_t) = E(V_t - \hat V_t)\int \tilde h(\theta)\,U_{t-\theta}\,d\theta$ [orthogonality]
$= E\int \tilde h(\theta)\,(V_t - \hat V_t)\,U_{t-\theta}\,d\theta$
$= \int \tilde h(\theta)\,(E V_t U_{t-\theta} - E \hat V_t U_{t-\theta})\,d\theta$
$= \int \tilde h(\theta)\,(R_{VU}(\theta) - R_{\hat V U}(\theta))\,d\theta$

For this to hold for any $\tilde h(\theta)$, we must have $R_{VU}(\theta) = R_{\hat V U}(\theta)$.
$R_{VU}(\tau) = R_{\hat V U}(\tau) = R_U(\tau) * h(\tau)$
$S_{VU}(f) = S_U(f)\,H(f)$
$H(f) = \frac{S_{VU}(f)}{S_U(f)}$
If $U_t = V_t + X_t$, where $X_t$ is a noise uncorrelated with $V_t$,
$S_{VU}(f) = S_V(f) + S_{VX}(f) = S_V(f)$
$S_U(f) = S_V(f) + S_X(f) + S_{VX}(f) + S_{XV}(f) = S_V(f) + S_X(f)$
$H(f) = \frac{S_V(f)}{S_V(f) + S_X(f)}$
The derivation goes parallel for discrete-time cases.
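A minimal discrete-time sketch of this result (not from the lecture): the Wiener filter $H = S_V/(S_V + S_X)$ is formed on an FFT grid from known psds and applied to a noisy observation, and the mse drops below that of the raw observation. The shaping filter, noise variance, and names are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter, freqz

# Minimal discrete-time sketch of the Wiener filter H = S_V / (S_V + S_X)
# for an observation U_n = V_n + X_n with X_n white and uncorrelated with V_n.
# The shaping filter `b`, noise variance, and length are illustrative choices.
rng = np.random.default_rng(3)
n = 2**16
b = [1.0, 0.9, 0.7, 0.4]                    # V is white noise shaped by this FIR
v = lfilter(b, [1.0], rng.standard_normal(n))
x = np.sqrt(2.0) * rng.standard_normal(n)   # white noise, S_X(f) = 2
u = v + x

freqs = np.fft.fftfreq(n)                   # FFT grid (cycles/sample)
_, bf = freqz(b, worN=2 * np.pi * freqs)    # B(e^{j2pi f}) on that grid
s_v, s_x = np.abs(bf)**2, 2.0               # known psds
h = s_v / (s_v + s_x)                       # Wiener filter (zero-phase here)
v_hat = np.real(np.fft.ifft(h * np.fft.fft(u)))

print(round(np.mean((u - v)**2), 2), round(np.mean((v_hat - v)**2), 2))
```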

Binomial counting process and random walk


counting process $N_t$: (1) $N_t \ge 0$ for $t \ge 0$; (2) integer-valued; (3) non-decreasing
binomial counting process $N_n$: discrete-time process; $N_n = \sum_{i=1}^{n} X_i$, where $X_i$ is a Bernoulli process.
[figure: sample paths of a binomial counting process and a random walk]
random walk $N_n$: discrete-time process; $N_n = \sum_{i=1}^{n} X_i$, where $X_i$ is a modified ($\pm 1$) Bernoulli process.

Poisson process
Poisson process $N_t$: continuous-time counting process with
$P(N_t = k) = p_{N_t}(k) = \frac{(\lambda t)^k}{k!}\,e^{-\lambda t}$ for some $\lambda > 0$, $t \ge 0$.
$\lambda$: arrival rate, the number of arrivals per unit time
In $n$ repeated Bernoulli trials with success probability $p = \lambda t / n$, for a fixed constant $\lambda$, let the random variable $K_n$ be the number of successes. Then $E K_n = \lambda t$, and as $n \to \infty$, $p_{K_n}(k)$ converges to the pmf of $\mathrm{Poi}(\lambda t)$.
Consider dividing $[0, t]$ into $n$ subintervals and executing a Bernoulli trial on each subinterval. Then $K_n$ is a binomial counting process, and $K_n \xrightarrow{\text{dist}} N_t$ as $n \to \infty$.

We have seen a proof before.
alternate proof: $G_{K_n}(z) = \left(1 - \frac{\lambda t}{n} + \frac{\lambda t}{n} z\right)^n = \left(1 + \frac{\lambda t (z - 1)}{n}\right)^n \to e^{\lambda t (z - 1)}$ [$\mathrm{Poi}(\lambda t)$] as $n \to \infty$.
A Poisson process is also defined for $t \ge 0$ using increments:
1. $N_0 = 0$.
2. For $s < t$, $N_t - N_s$ is a Poisson rv with mean $\lambda(t - s)$.
3. For $t_1 < t_2 < \cdots < t_n$, the increments $N_{t_2} - N_{t_1}, N_{t_3} - N_{t_2}, \dots, N_{t_n} - N_{t_{n-1}}$ are independent: independent increment process.

[figure: sample path of a Poisson process]

For an independent increment process with a known initial value, its joint probabilities can readily be computed.
$p_{N_{t_1} \cdots N_{t_n}}(k_1, \dots, k_n)$
$= P(N_{t_1} - N_0 = k_1 - 0,\; N_{t_2} - N_{t_1} = k_2 - k_1,\; \dots,\; N_{t_n} - N_{t_{n-1}} = k_n - k_{n-1})$
$= P(N_{t_1} - N_0 = k_1 - 0)\,P(N_{t_2} - N_{t_1} = k_2 - k_1) \cdots P(N_{t_n} - N_{t_{n-1}} = k_n - k_{n-1})$
$= \frac{(\lambda(t_1 - 0))^{k_1 - 0}}{(k_1 - 0)!}e^{-\lambda(t_1 - 0)}\;\frac{(\lambda(t_2 - t_1))^{k_2 - k_1}}{(k_2 - k_1)!}e^{-\lambda(t_2 - t_1)} \cdots \frac{(\lambda(t_n - t_{n-1}))^{k_n - k_{n-1}}}{(k_n - k_{n-1})!}e^{-\lambda(t_n - t_{n-1})}$

moments: $E N_t = \lambda t$, $\mathrm{var}(N_t) = \lambda t$: non-stationary
For $t > s$,
$E N_t N_s = E(N_t - N_s + N_s)N_s = E(N_t - N_s)N_s + E N_s^2 = E(N_t - N_s)\,E N_s + E N_s^2$
$= \lambda(t - s)\,\lambda s + (\lambda s)^2 + \lambda s = (\lambda t)(\lambda s) + \lambda s$
$\mathrm{cov}(N_t, N_s) = E N_t N_s - E N_t\,E N_s = \lambda s$
$R_N(t, s) = E N_t N_s = (\lambda t)(\lambda s) + \lambda \min(t, s)$
$C_N(t, s) = \mathrm{cov}(N_t, N_s) = \lambda \min(t, s)$

arrival time: $T_k := \min\{t > 0 : N_t \ge k\}$
$1 - F_{T_k}(t) = P(T_k > t) = P(N_t < k) = \sum_{i=0}^{k-1} \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}$
[figure: a sample path of $N_t$ with the $k$th arrival at $T_k$]
$f_{T_k}(t) = -\frac{d}{dt}\,(1 - F_{T_k}(t)) = -\frac{d}{dt} \sum_{i=0}^{k-1} \frac{(\lambda t)^i}{i!}\,e^{-\lambda t}$
$= \sum_{i=0}^{k-1} \left[\frac{\lambda(\lambda t)^i}{i!} - \frac{i\lambda(\lambda t)^{i-1}}{i!}\right] e^{-\lambda t}$ (the sum telescopes)
$= \frac{\lambda(\lambda t)^{k-1}}{(k-1)!}\,e^{-\lambda t}$, $t \ge 0$: $\mathrm{Erl}(k, \lambda)$

inter-arrival time: $X_1, X_2, X_3, \dots$, where $X_k = T_k - T_{k-1}$
[figure: a sample path of $N_t$ with the $(k-1)$th arrival at $T_{k-1} = t$ and the next arrival after $t + x$]
$P(X_k \le x) = 1 - P(X_k > x)$
$= 1 - P(X_k > x \mid X_1 = x_1, \dots, X_{k-1} = x_{k-1})$
$= 1 - P(N_{t+x} - N_t = 0 \mid X_1 = x_1, \dots, X_{k-1} = x_{k-1})$, where $t = T_{k-1} = x_1 + \cdots + x_{k-1}$: $(k-1)$th arrival time
$= 1 - P(N_{t+x} - N_t = 0) = 1 - \frac{(\lambda x)^0}{0!}\,e^{-\lambda x}$, $x > 0$
$= \begin{cases} 1 - e^{-\lambda x}, & x > 0 \\ 0, & x \le 0 \end{cases} \;\Longrightarrow\; X_k \sim \exp(\lambda)$
A counting process with iid $\exp(\lambda)$ interarrival times is a Poisson process of rate $\lambda$.
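A minimal simulation sketch of this construction (not from the lecture): a Poisson process is built from iid exponential interarrival times, and $E N_t = \lambda t$ and $\mathrm{var}(N_t) = \lambda t$ are checked empirically. The rate, horizon, and trial count are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: build a Poisson process from iid exp(lam) interarrival
# times and check E N_t = lam * t and var(N_t) = lam * t.
rng = np.random.default_rng(4)
lam, t, trials = 3.0, 10.0, 20_000

counts = np.empty(trials)
for i in range(trials):
    arrivals = np.cumsum(rng.exponential(1.0 / lam, size=int(3 * lam * t)))
    counts[i] = np.searchsorted(arrivals, t)   # N_t = number of arrivals <= t

print(round(counts.mean(), 2), lam * t)         # ~30.0 vs 30.0
print(round(counts.var(), 2))                   # variance also ~ lam * t
```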

                     discrete time    continuous time
counting process     binomial         Poisson
arrival time         Pascal           Erlang
inter-arrival time   geometric        exponential

(superposition of Poisson processes): $L_t$ and $M_t$ are Poisson processes, independent of each other, with respective rates $\lambda$ and $\mu$. Then $N_t = L_t + M_t$ is a Poisson process with rate $\lambda + \mu$.
(decomposition of a Poisson process): $N_t$ is a Poisson process with rate $\lambda$; $X_k$ is a Bernoulli process, independent of $N_t$, with success probability $p$; $L_t$ and $M_t$ are counting processes defined such that $N_t = L_t + M_t$, where the $k$th arrival of $N_t$ induces either an arrival of $L_t$ if $X_k = 0$ or an arrival of $M_t$ if $X_k = 1$. Then $L_t$ and $M_t$ are Poisson processes with respective rates $\lambda(1 - p)$ and $\lambda p$, and are independent of each other.
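A minimal sketch of the decomposition (thinning) statement (not from the lecture): each arrival of a rate-$\lambda$ Poisson process is routed by an independent Bernoulli($p$) mark, and the two resulting streams have rates close to $\lambda p$ and $\lambda(1-p)$. The parameter values are illustrative.

```python
import numpy as np

# Minimal sketch of Poisson decomposition (thinning): mark each arrival of a
# rate-lam Poisson process with an independent Bernoulli(p) flag and split it
# into two streams. Rate, p, and horizon are illustrative choices.
rng = np.random.default_rng(5)
lam, p, t = 5.0, 0.3, 2_000.0

arrivals = np.cumsum(rng.exponential(1.0 / lam, size=int(2 * lam * t)))
arrivals = arrivals[arrivals <= t]
marks = rng.random(arrivals.size) < p      # X_k = 1 with probability p

rate_m = np.count_nonzero(marks) / t       # arrivals routed to M_t
rate_l = np.count_nonzero(~marks) / t      # arrivals routed to L_t
print(round(rate_m, 2), round(lam * p, 2))        # ~1.5 vs 1.5
print(round(rate_l, 2), round(lam * (1 - p), 2))  # ~3.5 vs 3.5
```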

[figure: superposition and decomposition of Poisson processes — sample paths of $L_t$, $M_t$, $N_t$, and the marks $X_k$]

Wiener process
A Wiener process $W_t$, also called Brownian motion, describes the motion of a highly excited particle in a fluid, viewed in one coordinate, that does not drift off in one direction.
Wiener process for $t \ge 0$:
1. $W_0 = 0$.
2. For $s < t$, $W_t - W_s$ is a Gaussian random variable with mean zero and variance $\sigma^2(t - s)$.
3. For $t_1 < t_2 < \cdots < t_k$, the increments $W_{t_2} - W_{t_1}, W_{t_3} - W_{t_2}, \dots, W_{t_k} - W_{t_{k-1}}$ are independent: independent increment process.
4. Each sample path is a continuous function of $t$.
$W_t = \int_0^t X_\tau\,d\tau$, $t \ge 0$, where $X_t$ is a white Gaussian process.

The sample paths of $W_t$ are continuous everywhere but differentiable nowhere.
For any $c > 0$, $V_t = \frac{1}{\sqrt{c}} W_{ct}$ is a Wiener process identical (in distribution) to $W_t$: self-similar, fractal.
moments: $E W_t = 0$, $\mathrm{var}(W_t) = \sigma^2 t$
For $t > s$,
$E W_t W_s = E(W_t - W_s + W_s)W_s = E(W_t - W_s)W_s + E W_s^2 = E(W_t - W_s)\,E W_s + E W_s^2 = \sigma^2 s$
$\mathrm{cov}(W_t, W_s) = E W_t W_s - E W_t\,E W_s = \sigma^2 s$
$R_W(t, s) = E W_t W_s = \sigma^2 \min(t, s)$
$C_W(t, s) = \mathrm{cov}(W_t, W_s) = \sigma^2 \min(t, s)$
Wiener processes are nonstationary.

random walk approximation:
$X_1, X_2, \dots$ is an equiprobable modified Bernoulli process with values $+1$ and $-1$.
$S_n := \sum_{i=1}^{n} X_i$: symmetric random walk
$W_t^{(n)} := \frac{1}{\sqrt{n}}\,S_{\lfloor nt \rfloor}$, where $\lfloor \cdot \rfloor$ is the greatest integer no greater than the argument.
[figure: a sample path of $W_t^{(n)}$, a staircase with steps of height $1/\sqrt{n}$ and width $1/n$]

As $n \to \infty$,
1. The power of the process is maintained.
2. By the central limit theorem, $W_t^{(n)}$ converges in distribution to a Gaussian process.
3. As the random walk is an independent increment process, so is its limit process.
4. $W_t^{(n)}$ eventually becomes a Wiener process.
If the random walk is replaced by a binomial counting process, the limit process is a drifting Wiener process.
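A minimal simulation sketch of the random-walk approximation (not from the lecture): sample paths of $W_t^{(n)}$ are generated and two Wiener-process properties ($\mathrm{var}(W_1) \approx 1$, nearly uncorrelated increments) are checked. The value of $n$ and the trial count are illustrative.

```python
import numpy as np

# Minimal sketch of the random-walk approximation to a Wiener process:
# W_t^{(n)} = S_{floor(n t)} / sqrt(n) with +/-1 equiprobable steps.
rng = np.random.default_rng(6)
n, t_max, trials = 1_000, 1.0, 5_000

steps = rng.choice([-1.0, 1.0], size=(trials, int(n * t_max)))
w = np.cumsum(steps, axis=1) / np.sqrt(n)   # W_t^{(n)} sampled on the grid k/n

# At t = 1, W_1 should be close to N(0, 1): check mean and variance.
print(round(w[:, -1].mean(), 2), round(w[:, -1].var(), 2))
# Independent increments: W_1 - W_{1/2} and W_{1/2} are nearly uncorrelated.
print(round(np.corrcoef(w[:, -1] - w[:, n // 2 - 1], w[:, n // 2 - 1])[0, 1], 2))
```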

Markov process
We discuss jointly continuous cases; jointly discrete cases are similar.
Markov property: for any $t_1 < t_2 < \cdots < t_n$ and $x_1, \dots, x_n$,
$f_{X_{t_n} \mid X_{t_1} \cdots X_{t_{n-1}}}(x_n \mid x_1, \dots, x_{n-1}) = f_{X_{t_n} \mid X_{t_{n-1}}}(x_n \mid x_{n-1})$
Markov process: a process with the Markov property

$f_{X_{t_1} \cdots X_{t_n}}(x_1, \dots, x_n)$
$= f_{X_{t_1}}(x_1)\,f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,f_{X_{t_3} \mid X_{t_1} X_{t_2}}(x_3 \mid x_1, x_2) \cdots f_{X_{t_n} \mid X_{t_1} \cdots X_{t_{n-1}}}(x_n \mid x_1, \dots, x_{n-1})$
$= f_{X_{t_1}}(x_1)\,f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,f_{X_{t_3} \mid X_{t_2}}(x_3 \mid x_2) \cdots f_{X_{t_n} \mid X_{t_{n-1}}}(x_n \mid x_{n-1})$
examples: binomial counting process, random walk, Poisson process, Wiener process
Discrete-valued Markov processes are called Markov chains.
equivalence: the Markov property is equivalent to the conditional independence of the past and the future given the present.
conditional independence: $f_{XY \mid Z}(x, y \mid z) = f_{X \mid Z}(x \mid z)\,f_{Y \mid Z}(y \mid z)$

conditional independence of the past and the future given the present: for $t_1 < t_2 < \cdots < t_n < \cdots < t_{n+k}$,
$f_{X_{t_1} \cdots X_{t_{n-1}} X_{t_{n+1}} \cdots X_{t_{n+k}} \mid X_{t_n}}(x_1, \dots, x_{n-1}, x_{n+1}, \dots, x_{n+k} \mid x_n)$
$= f_{X_{t_1} \cdots X_{t_{n-1}} \mid X_{t_n}}(x_1, \dots, x_{n-1} \mid x_n)\; f_{X_{t_{n+1}} \cdots X_{t_{n+k}} \mid X_{t_n}}(x_{n+1}, \dots, x_{n+k} \mid x_n)$
The equivalence implies that the time-reversed Markov process is also Markov.

proof of the theorem (arguments are suppressed):
Markov property $\Rightarrow$ conditional independence:
$f_{X_{t_1} \cdots X_{t_{n-1}} X_{t_{n+1}} \cdots X_{t_{n+k}} \mid X_{t_n}}$
$= f_{X_{t_1} \cdots X_{t_{n-1}} \mid X_{t_n}}\; f_{X_{t_{n+1}} \cdots X_{t_{n+k}} \mid X_{t_1} \cdots X_{t_n}}$ [ch: chain rule]
$= f_{X_{t_1} \cdots X_{t_{n-1}} \mid X_{t_n}}\,\big(f_{X_{t_{n+1}} \mid X_{t_1} \cdots X_{t_n}} \cdots f_{X_{t_{n+k}} \mid X_{t_1} \cdots X_{t_{n+k-1}}}\big)$ [ch]
$= f_{X_{t_1} \cdots X_{t_{n-1}} \mid X_{t_n}}\,\big(f_{X_{t_{n+1}} \mid X_{t_n}} \cdots f_{X_{t_{n+k}} \mid X_{t_{n+k-1}}}\big)$ [Mp]
$= f_{X_{t_1} \cdots X_{t_{n-1}} \mid X_{t_n}}\; f_{X_{t_{n+1}} \mid X_{t_n}}\,f_{X_{t_{n+2}} \mid X_{t_n} X_{t_{n+1}}} \cdots f_{X_{t_{n+k}} \mid X_{t_n} X_{t_{n+1}} \cdots X_{t_{n+k-1}}}$ [Mp]
$= f_{X_{t_1} \cdots X_{t_{n-1}} \mid X_{t_n}}\; f_{X_{t_{n+1}} \cdots X_{t_{n+k}} \mid X_{t_n}}$ [ch]

conditional independence $\Rightarrow$ Markov property:
$f_{X_{t_1} \cdots X_{t_{n-2}} X_{t_n} \mid X_{t_{n-1}}} = f_{X_{t_1} \cdots X_{t_{n-2}} \mid X_{t_{n-1}}}\; f_{X_{t_n} \mid X_{t_1} \cdots X_{t_{n-2}} X_{t_{n-1}}}$ [ch]
$f_{X_{t_1} \cdots X_{t_{n-2}} X_{t_n} \mid X_{t_{n-1}}} = f_{X_{t_1} \cdots X_{t_{n-2}} \mid X_{t_{n-1}}}\; f_{X_{t_n} \mid X_{t_{n-1}}}$ [ci]
Comparing the two factorizations gives $f_{X_{t_n} \mid X_{t_1} \cdots X_{t_{n-1}}} = f_{X_{t_n} \mid X_{t_{n-1}}}$.
An independent increment process with known initial state $X_0$ is Markov.
proof: $f_{X_{t_n} \mid X_{t_1} \cdots X_{t_{n-1}}}(x_n \mid x_1, \dots, x_{n-1})$
$= f_{X_{t_n} - X_{t_{n-1}} \mid X_{t_1} \cdots X_{t_{n-1}}}(x_n - x_{n-1} \mid x_1, \dots, x_{n-1})$
$= f_{X_{t_n} - X_{t_{n-1}} \mid X_{t_{n-1}}}(x_n - x_{n-1} \mid x_{n-1}) = f_{X_{t_n} \mid X_{t_{n-1}}}(x_n \mid x_{n-1})$
examples: binomial counting process, random walk, Poisson process, Wiener process

Chapman-Kolmogorov equation for a Markov process: for $t_1 \le t_2 \le t_3$,
$f_{X_{t_3} \mid X_{t_1}}(x_3 \mid x_1) = \int f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,f_{X_{t_3} \mid X_{t_2}}(x_3 \mid x_2)\,dx_2$
proof: $f_{X_{t_3} \mid X_{t_1}}(x_3 \mid x_1) = \int f_{X_{t_2} X_{t_3} \mid X_{t_1}}(x_2, x_3 \mid x_1)\,dx_2$ [marginal]
$= \int f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,f_{X_{t_3} \mid X_{t_1} X_{t_2}}(x_3 \mid x_1, x_2)\,dx_2$ [ch]
$= \int f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,f_{X_{t_3} \mid X_{t_2}}(x_3 \mid x_2)\,dx_2$ [Mp]

$f_{X_{t_k} \mid X_{t_1}}(x_k \mid x_1) = \int \cdots \int f_{X_{t_k} \mid X_{t_{k-1}}}(x_k \mid x_{k-1}) \cdots f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,dx_2 \cdots dx_{k-1}$
For Markov $X_t$ and $t_1 < t_2 < t_3$, $E(E(X_{t_3} \mid X_{t_2}) \mid X_{t_1}) = E(X_{t_3} \mid X_{t_1})$.
proof: $E(E(X_{t_3} \mid X_{t_2}) \mid X_{t_1} = x_1)$
$= \int E(X_{t_3} \mid X_{t_2} = x_2)\,f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,dx_2$
$= \int \left[\int x_3\,f_{X_{t_3} \mid X_{t_2}}(x_3 \mid x_2)\,dx_3\right] f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,dx_2$
$= \int x_3 \left[\int f_{X_{t_3} \mid X_{t_2}}(x_3 \mid x_2)\,f_{X_{t_2} \mid X_{t_1}}(x_2 \mid x_1)\,dx_2\right] dx_3$
$= \int x_3\,f_{X_{t_3} \mid X_{t_1}}(x_3 \mid x_1)\,dx_3$ [C-K eqn]
$= E(X_{t_3} \mid X_{t_1} = x_1)$

homogeneous Markov process: $f_{X_t \mid X_s}(u \mid v)$ is shift invariant, i.e., $\forall \tau$, $f_{X_{t+\tau} \mid X_{s+\tau}}(u \mid v) = f_{X_t \mid X_s}(u \mid v)$.
homogeneous $\neq$ stationary
A discrete-time homogeneous Markov chain (discrete-valued Markov process) is characterized by a state transition diagram, which includes states and transition probabilities.
state: the value of the random process, often an integer
transition probability: $p_{ij} = P(X_{n+1} = j \mid X_n = i)$

example: random walk
[figure: state transition diagram of the random walk — from each integer state, a transition to the right with probability $p$ and to the left with probability $1 - p$]
example:
[figure: a four-state transition diagram with states blu/1, gry/2, blk/3, brn/4; its self-loop and transition probabilities appear in the transition matrix below]

transition matrix $P$: the matrix whose $(i, j)$th entry is $p_{ij}$
example: random walk
$P = \begin{pmatrix} \ddots & \ddots & \ddots & & & \\ & 1-p & 0 & p & 0 & 0 \\ & 0 & 1-p & 0 & p & 0 \\ & 0 & 0 & 1-p & 0 & p \\ & & & \ddots & \ddots & \ddots \end{pmatrix}$
example (the four-state chain above):
$P = \begin{pmatrix} 0 & 0.1 & 0.4 & 0.5 \\ 0.2 & 0.8 & 0 & 0 \\ 0 & 0 & 0.9 & 0.1 \\ 0.2 & 0.3 & 0.5 & 0 \end{pmatrix}$

$\sum_j p_{ij} = 1$
$p_{ij}^{(2)} := [P^2]_{ij} = \sum_k p_{ik}\,p_{kj} = \sum_k P(X_{n+1} = k \mid X_n = i)\,P(X_{n+2} = j \mid X_{n+1} = k) = P(X_{n+2} = j \mid X_n = i)$: Chapman-Kolmogorov equation
$p_{ij}^{(m)} := [P^m]_{ij} = P(X_{n+m} = j \mid X_n = i)$
$p_{ij}^{(n+m)} = \sum_k p_{ik}^{(n)}\,p_{kj}^{(m)}$: Chapman-Kolmogorov equation
$P(X_{n+m} = j) = \sum_i p_{ij}^{(m)}\,P(X_n = i)$
$p^{(n+m)} = p^{(n)} P^m$, where $p^{(n)} := (P(X_n = 1), P(X_n = 2), P(X_n = 3), \dots)$ is the marginal pmf of $X_n$ expressed as a row vector.
$p^{(n)} = p^{(0)} P^n$

stationary distribution for $P$: $\pi = (\pi_1, \pi_2, \pi_3, \dots)$ that satisfies $\pi = \pi P$ and $\sum_i \pi_i = 1$.
If $\pi$ exists for the $P$ of a Markov chain,
$\lim_{n \to \infty} P^n = \begin{pmatrix} \pi_1 & \pi_2 & \pi_3 & \cdots \\ \pi_1 & \pi_2 & \pi_3 & \cdots \\ \pi_1 & \pi_2 & \pi_3 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}$, i.e., every row equals $\pi$, and
$\lim_{n \to \infty} p^{(n)} = \lim_{n \to \infty} p^{(0)} P^n = \pi$ regardless of $p^{(0)}$.
The Markov chain is asymptotically stationary if $\pi$ exists.
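A minimal numerical sketch (not from the lecture): iterating $P^n$ for a small chain shows every row converging to the stationary distribution $\pi$, and $p^{(n)} = p^{(0)}P^n \to \pi$ regardless of $p^{(0)}$. The 4-state matrix follows the example reconstructed above; treat the specific numbers as illustrative.

```python
import numpy as np

# Minimal sketch: iterate P^n and recover the stationary distribution pi.
# The 4-state matrix is an illustrative chain (rows sum to 1).
P = np.array([[0.0, 0.1, 0.4, 0.5],
              [0.2, 0.8, 0.0, 0.0],
              [0.0, 0.0, 0.9, 0.1],
              [0.2, 0.3, 0.5, 0.0]])

Pn = np.linalg.matrix_power(P, 200)     # every row converges to pi
pi = Pn[0]
print(np.round(pi, 4))
print(np.allclose(pi @ P, pi), np.isclose(pi.sum(), 1.0))  # pi = pi P, sums to 1

p0 = np.array([1.0, 0.0, 0.0, 0.0])     # any initial pmf p(0)
print(np.allclose(p0 @ Pn, pi))         # p(n) = p(0) P^n -> pi
```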

Gauss-Markov process
A Gaussian process $X_t$ is Markov if and only if for any $t_1 < t_2 < t_3$,
$C_X(t_1, t_3) = \frac{C_X(t_1, t_2)\,C_X(t_2, t_3)}{C_X(t_2, t_2)}$.
Note that $C_X(t_2, t_2) = \mathrm{var}(X_{t_2})$ and that this includes both discrete- and continuous-time processes.
proof of the only if part:
Recall that if $X$ and $Y$ are jointly Gaussian, $E(X \mid Y) = m_X + \rho\,\frac{\sigma_X}{\sigma_Y}(Y - m_Y)$.
For Gaussian $X_t$,
$E(X_{t_2} \mid X_{t_1}) = m_X(t_2) + \frac{C_X(t_1, t_2)}{C_X(t_1, t_1)}\,[X_{t_1} - m_X(t_1)]$.

For Markov $X_t$, $E(E(X_{t_3} \mid X_{t_2}) \mid X_{t_1}) = E(X_{t_3} \mid X_{t_1})$.
$E(E(X_{t_3} \mid X_{t_2}) \mid X_{t_1})$
$= E\!\left( m_X(t_3) + \frac{C_X(t_2, t_3)}{C_X(t_2, t_2)}\,[X_{t_2} - m_X(t_2)] \;\Big|\; X_{t_1} \right)$
$= m_X(t_3) + \frac{C_X(t_2, t_3)}{C_X(t_2, t_2)}\,[E(X_{t_2} \mid X_{t_1}) - m_X(t_2)]$
$= m_X(t_3) + \frac{C_X(t_2, t_3)}{C_X(t_2, t_2)}\left[ m_X(t_2) + \frac{C_X(t_1, t_2)}{C_X(t_1, t_1)}\,[X_{t_1} - m_X(t_1)] - m_X(t_2) \right]$
$= m_X(t_3) + \frac{C_X(t_2, t_3)}{C_X(t_2, t_2)}\,\frac{C_X(t_1, t_2)}{C_X(t_1, t_1)}\,[X_{t_1} - m_X(t_1)]$
Since $E(X_{t_3} \mid X_{t_1}) = m_X(t_3) + \frac{C_X(t_1, t_3)}{C_X(t_1, t_1)}\,[X_{t_1} - m_X(t_1)]$,
equating the two expressions gives the stated relation, and the proof is complete. The if part is not given.

example: Wiener process: $C_X(t, s) = \sigma^2 \min(t, s)$
example: Ornstein-Uhlenbeck process: $m_X(t) = 0$, $C_X(t, s) = \sigma^2 e^{-\alpha|t - s|}$, i.e., $C_X(\tau) = \sigma^2 e^{-\alpha|\tau|}$
Compare with the random telegraph process.
The Ornstein-Uhlenbeck process is the only continuous-time zero-mean stationary Gauss-Markov process.
If $X_t$ is stationary, the equation of the theorem implies an exponential $C_X(\tau)$, thereby suggesting the proof of the if part:
1. $C(\tau_1 + \tau_2) = \frac{1}{\sigma^2}\,C(\tau_1)\,C(\tau_2)$ for $\tau_1, \tau_2 > 0$
2. $C(\tau)$ is even
$\Rightarrow\; C(\tau) = \sigma^2 e^{-\alpha|\tau|}$, where $\alpha = \ln \sigma^2 - \ln C(1)$

Autoregressive and moving average process
Autoregressive (AR) and moving average (MA) processes are discrete-time processes generated by filtering iid processes.
1st order AR process: $X_n = aX_{n-1} + W_n$, where $W_n$ is iid.
It is the 1st order all-pole filter output when $W_n$ is the input.
[figure: $W_n \to$ first-order all-pole filter $\to X_n$; the process is Markov]

If stationary and zero mean,
$\sigma_X^2 = a^2\sigma_X^2 + \sigma_W^2 \;\Rightarrow\; \sigma_X^2 = \frac{\sigma_W^2}{1 - a^2}$
$R_X(1) = E X_n X_{n-1} = E(aX_{n-1} + W_n)X_{n-1} = a\sigma_X^2$
$R_X(2) = E X_n X_{n-2} = E(aX_{n-1} + W_n)X_{n-2} = a^2\sigma_X^2$
$R_X(\tau) = E X_n X_{n-\tau} = E(aX_{n-1} + W_n)X_{n-\tau} = a^{|\tau|}\sigma_X^2$
covariance matrix of $(X_1, X_2, \dots, X_n)^T$:
$\sigma_X^2 \begin{pmatrix} 1 & a & a^2 & \cdots & a^{n-1} \\ a & 1 & a & \cdots & a^{n-2} \\ a^2 & a & 1 & \cdots & a^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a^{n-1} & a^{n-2} & a^{n-3} & \cdots & 1 \end{pmatrix}$
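A minimal simulation sketch (not from the lecture): a stationary AR(1) process is generated by all-pole filtering of iid Gaussian noise and its sample acf is compared with $R_X(\tau) = a^{|\tau|}\sigma_W^2/(1 - a^2)$. The coefficient and lengths are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter

# Minimal sketch: simulate a zero-mean AR(1) process and compare the sample
# acf with R_X(tau) = a^|tau| * sigma_W^2 / (1 - a^2).
rng = np.random.default_rng(8)
a, sw2, n = 0.8, 1.0, 500_000
w = np.sqrt(sw2) * rng.standard_normal(n)
x = lfilter([1.0], [1.0, -a], w)            # X_n = a X_{n-1} + W_n

sx2 = sw2 / (1 - a**2)
for tau in range(4):
    r_hat = np.mean(x[tau:] * x[:n - tau])  # sample R_X(tau)
    print(tau, round(r_hat, 3), round(a**tau * sx2, 3))
```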

$k$th order AR process: $X_n = a_1 X_{n-1} + \cdots + a_k X_{n-k} + W_n$, where $W_n$ is iid.
It is the $k$th order all-pole filter output when $W_n$ is the input.
[figure: a 4th-order all-pole filter (feedback taps $a_1, \dots, a_4$) and a 4th-order all-zero filter (feedforward taps $b_0, \dots, b_4$ on a delay line)]

A $k$th order AR process is $k$th order Markov.
$k$th order Markov: $f_{X_n \mid X_1 \cdots X_{n-1}}(x_n \mid x_1, \dots, x_{n-1}) = f_{X_n \mid X_{n-k} \cdots X_{n-1}}(x_n \mid x_{n-k}, \dots, x_{n-1})$
Equivalently, the past and the future are conditionally independent given $k$ consecutive samples.
The psd of an AR process is rational in $e^{j2\pi f}$:
$S_X(f) = \frac{\sigma_W^2}{\left|1 - a\,e^{-j2\pi f}\right|^2}$: first order
$S_X(f) = \frac{\sigma_W^2}{\left|1 - a_1 e^{-j2\pi f} - \cdots - a_k e^{-j2\pi f k}\right|^2}$: $k$th order
If $W_n$ is Gaussian, so is the AR process.
AR modeling of speech is widely used in a variety of applications such as speech coding and speech synthesis.

$k$th order MA process: $X_n = b_0 W_n + b_1 W_{n-1} + \cdots + b_k W_{n-k}$, where $W_n$ is iid.
It is the $k$th order all-zero filter output when $W_n$ is the input; it is not Markov.
The psd of an MA process is rational in $e^{j2\pi f}$:
$S_X(f) = \sigma_W^2 \left| b_0 + b_1 e^{-j2\pi f} + \cdots + b_k e^{-j2\pi f k} \right|^2$: $k$th order
$(k, l)$th order ARMA process: $X_n = a_1 X_{n-1} + \cdots + a_k X_{n-k} + b_0 W_n + \cdots + b_l W_{n-l}$, where $W_n$ is iid.
It is the $(k, l)$th order pole-zero filter output when $W_n$ is the input.

[figure: two equivalent realizations of a $(4,4)$th-order pole-zero filter, with feedback taps $a_1, \dots, a_4$ and feedforward taps $b_0, \dots, b_4$ on a delay line]
The psd of an ARMA process is rational in $e^{j2\pi f}$:
$S_X(f) = \sigma_W^2\,\frac{\left| b_0 + b_1 e^{-j2\pi f} + \cdots + b_l e^{-j2\pi f l} \right|^2}{\left| 1 - a_1 e^{-j2\pi f} - \cdots - a_k e^{-j2\pi f k} \right|^2}$: $(k, l)$th order
Any rational psd can be synthesized by an ARMA process.
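A minimal sketch of ARMA synthesis (not from the lecture): iid Gaussian noise is passed through a pole-zero filter, and the estimated psd is compared with the rational expression above. The coefficients are an illustrative stable choice.

```python
import numpy as np
from scipy.signal import lfilter, welch, freqz

# Minimal sketch: synthesize an ARMA process by pole-zero filtering iid
# Gaussian noise and compare its estimated psd with
# sigma_W^2 |B(e^{j2pi f})|^2 / |A(e^{j2pi f})|^2.
rng = np.random.default_rng(9)
b = [1.0, 0.5]                    # feedforward (zeros)
a = [1.0, -0.6, 0.2]              # feedback (poles), stable
w = rng.standard_normal(400_000)
x = lfilter(b, a, w)              # ARMA sample path

f, s_est = welch(x, fs=1.0, nperseg=4096, return_onesided=False)
_, h = freqz(b, a, worN=2 * np.pi * f)
print(round(np.mean(s_est / np.abs(h)**2), 2))   # ratio ~ 1 (sigma_W^2 = 1)
```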

Ergodicity
Ergodicity means equality between time averages and statistical averages.
statistical average: $E X_t = m_X$, $E g(X_t)$
time average:
$\mathcal{E}_T X_t = \frac{1}{T}\sum_{t=1}^{T} X_t$ or $\frac{1}{2T+1}\sum_{t=-T}^{T} X_t$, discrete-time
$\mathcal{E}_T X_t = \frac{1}{T}\int_0^T X_t\,dt$ or $\frac{1}{2T}\int_{-T}^{T} X_t\,dt$, continuous-time
$\mathcal{E}_T g(X_t)$ is defined likewise.
$\overline{\mathcal{E}}\,X_t = \lim_{T \to \infty} \mathcal{E}_T X_t$, and similarly $\overline{\mathcal{E}}\,g(X_t)$.
Assume $X_t$ is wss for ergodicity of the first and second moments and sss for higher moments.
We will discuss continuous-time cases.

$X_t$ is ergodic
in the mean $\iff \overline{\mathcal{E}}\,X_t = E X_t$
in the 2nd moment $\iff \overline{\mathcal{E}}\,X_t^2 = E X_t^2$
in the acf $\iff \forall \tau$, $\overline{\mathcal{E}}\,X_{t+\tau} X_t = E X_{t+\tau} X_t$
and so on, up to equality in all moments.
We can compute the corresponding moment or joint moment by time averaging the appropriate function of a sample path.
Ergodic theorems provide (necessary and) sufficient conditions for certain ergodicities.
Since the time average is a limit of a random sequence, the senses of the above equalities need to be defined.

For an iid random sequence $X_t$, $t = \dots, -1, 0, 1, 2, \dots$:
The WLLN states that $\mathcal{E}_T X_t \xrightarrow{\text{pr}} E X_t$ as $T \to \infty$.
The SLLN states that $\mathcal{E}_T X_t \xrightarrow{\text{a.s.}} E X_t$ as $T \to \infty$.
We also know that $\mathcal{E}_T X_t \xrightarrow{\text{ms}} E X_t$ as $T \to \infty$. [How?]
These imply ergodicity of the mean in three different senses.
$X_t$ is ms ergodic in the mean: $\mathcal{E}_T X_t \xrightarrow{\text{ms}} m_X$ as $T \to \infty$.
A (wss) process $X_t$ is ms ergodic in the mean if and only if
$\lim_{T \to \infty} \frac{1}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) C_X(\tau)\,d\tau = 0$.
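A minimal numerical sketch of this behavior (not from the lecture): for an AR(1) process, whose covariance decays geometrically so the averaged-covariance condition above is satisfied, the mean-squared deviation of the time average from $m_X$ shrinks as $T$ grows. The mean, coefficient, and sizes are illustrative assumptions.

```python
import numpy as np
from scipy.signal import lfilter

# Minimal sketch: for an AR(1) process the time average over [0, T]
# approaches the ensemble mean in mean square as T grows.
rng = np.random.default_rng(10)
m, a, trials = 2.0, 0.9, 2_000

for T in (100, 1_000, 10_000):
    w = rng.standard_normal((trials, T))
    x = m + lfilter([1.0], [1.0, -a], w, axis=1)   # wss AR(1) with mean m
    et = x.mean(axis=1)                            # time average per sample path
    print(T, round(np.mean((et - m)**2), 4))       # E|E_T X_t - m|^2 -> 0
```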

proof: $E\,\mathcal{E}_T X_t = \mathcal{E}_T\,E X_t = m_X$
$\mathrm{var}(\mathcal{E}_T X_t) = E\left|\frac{1}{T}\int_0^T X_t\,dt - m_X\right|^2 = E\left|\frac{1}{T}\int_0^T (X_t - m_X)\,dt\right|^2$
$= \frac{1}{T^2}\int_0^T\!\!\int_0^T E(X_t - m_X)(X_s - m_X)\,dt\,ds$
$= \frac{1}{T^2}\int_0^T\!\!\int_0^T C_X(t - s)\,dt\,ds \qquad (1)$
$= \frac{1}{T^2}\int_0^T\!\!\int_{-s}^{T-s} C_X(\tau)\,d\tau\,ds \qquad [(t, s) \to (\tau, s),\ \tau = t - s]$
$= \frac{1}{T^2}\left(\int_{-T}^{0}\!\!\int_{-\tau}^{T} C_X(\tau)\,ds\,d\tau + \int_{0}^{T}\!\!\int_{0}^{T-\tau} C_X(\tau)\,ds\,d\tau\right)$
$= \frac{1}{T^2}\left(\int_{-T}^{0} (T + \tau)\,C_X(\tau)\,d\tau + \int_{0}^{T} (T - \tau)\,C_X(\tau)\,d\tau\right)$
$= \frac{1}{T^2}\int_{-T}^{T} (T - |\tau|)\,C_X(\tau)\,d\tau \qquad [\text{symmetry}]$
$= \frac{2}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) C_X(\tau)\,d\tau \qquad (2)$, since $C_X(\tau)$ is even.

A (wss) process $X_t$ is ms ergodic in the mean if $\int_{-\infty}^{\infty} |C_X(\tau)|\,d\tau < \infty$.
proof: $\int_{-\infty}^{\infty} |C_X(\tau)|\,d\tau < \infty$
$\Rightarrow\; \lim_{T \to \infty} \frac{1}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) C_X(\tau)\,d\tau = 0$ [dominated conv thm]
$\Rightarrow\; \mathrm{var}(\mathcal{E}_T X_t) \to 0$ as $T \to \infty$ [prev thm]
A (wss) process $X_t$ is ms ergodic in the mean if and only if $\lim_{T \to \infty} \frac{1}{T}\int_0^T C_X(\tau)\,d\tau = 0$.
proof: only if part: Assume ergodicity.

$\left|\frac{1}{T}\int_0^T C_X(\tau)\,d\tau\right|^2 = \left|\frac{1}{T}\int_0^T E(X_\tau - m_X)(X_0 - m_X)\,d\tau\right|^2$
$= \left|E(X_0 - m_X)\,\frac{1}{T}\int_0^T (X_\tau - m_X)\,d\tau\right|^2$
$\le E(X_0 - m_X)^2\; E\left|\frac{1}{T}\int_0^T (X_\tau - m_X)\,d\tau\right|^2$ [Schwarz ineq]
$= \sigma_X^2\,\frac{2}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) C_X(\tau)\,d\tau$ [(2)]
$\to 0$ as $T \to \infty$ by the assumed ergodicity.
if part: Assume $\lim_{T \to \infty} \frac{1}{T}\int_0^T C_X(\tau)\,d\tau = 0$ and consider
$|\mathrm{var}(\mathcal{E}_T X_t)| = \left|\frac{1}{T^2}\int_0^T\!\!\int_0^T C_X(t - s)\,dt\,ds\right|$ [(1)]
$= \frac{1}{T^2}\left|\int_0^T\!\!\int_0^t C_X(t - s)\,ds\,dt + \int_0^T\!\!\int_0^s C_X(s - t)\,dt\,ds\right|$
$= \frac{2}{T^2}\left|\int_0^T\!\!\int_0^t C_X(t - s)\,ds\,dt\right|$ [symmetry]
$= \frac{2}{T^2}\left|\int_0^T\!\!\int_0^t C_X(\tau)\,d\tau\,dt\right|$, where $\tau = t - s$
$\le \frac{2}{T^2}\left|\int_0^{\tilde T}\!\!\int_0^t C_X(\tau)\,d\tau\,dt\right| + \frac{2}{T^2}\left|\int_{\tilde T}^{T}\!\!\int_0^t C_X(\tau)\,d\tau\,dt\right|$,
where $\tilde T$ is a constant given by this: since $\lim_{T \to \infty} \frac{1}{T}\int_0^T C_X(\tau)\,d\tau = 0$, for every $\epsilon > 0$ there is a $\tilde T$ such that $t \ge \tilde T \Rightarrow \left|\frac{1}{t}\int_0^t C_X(\tau)\,d\tau\right| < \epsilon$.
Assuming that $T$ is very large and $T \gg \tilde T$,
$\le \frac{2}{T^2}\int_0^{\tilde T}\left|\int_0^t C_X(\tau)\,d\tau\right| dt + \frac{2}{T^2}\int_{\tilde T}^{T} t\left|\frac{1}{t}\int_0^t C_X(\tau)\,d\tau\right| dt$ [triangle ineq]
$\le \frac{2}{T^2}\int_0^{\tilde T}\left|\int_0^t C_X(\tau)\,d\tau\right| dt + \frac{2\epsilon}{T^2}\int_{\tilde T}^{T} t\,dt = \epsilon_1 + \frac{\epsilon\,(T^2 - \tilde T^2)}{T^2} \le \epsilon_1 + \epsilon$,
where $\epsilon_1 \to 0$ as $T \to \infty$ since the first integral does not depend on $T$.

$X_t$ is ms ergodic in the acf: $\mathcal{E}_T\,X_{t+\tau} X_t \xrightarrow{\text{ms}} R_X(\tau)$ for each $\tau$ as $T \to \infty$.
Define for each $\tau$, $Y_t^{(\tau)} := X_{t+\tau} X_t$. Then $E Y_t^{(\tau)} = R_X(\tau)$, and $X_t$ is ms ergodic in the acf if and only if $Y_t^{(\tau)}$ is ms ergodic in the mean for every $\tau$.
$C_{Y^{(\tau)}}(\theta) = E Y_{t+\theta}^{(\tau)} Y_t^{(\tau)} - R_X^2(\tau) = E X_{t+\theta+\tau} X_{t+\theta} X_{t+\tau} X_t - R_X^2(\tau)$
A (wss) process $X_t$ is ms ergodic in the acf if and only if $\lim_{T \to \infty} \frac{1}{T}\int_0^T C_{Y^{(\tau)}}(\theta)\,d\theta = 0$ for every $\tau$.

Isserlis theorem: For jointly Gaussian zero-mean random variables $X_1, X_2, X_3$, and $X_4$,
$E X_1 X_2 X_3 X_4 = E X_1 X_2\,E X_3 X_4 + E X_1 X_3\,E X_2 X_4 + E X_1 X_4\,E X_2 X_3$.
In fact, for any even number $k$,
$E X_1 X_2 \cdots X_k = \sum E X_{i_1} X_{i_2}\,E X_{i_3} X_{i_4} \cdots E X_{i_{k-1}} X_{i_k}$,
where the sum is over all possible ways of partitioning $\{1, 2, \dots, k\}$ into $k/2$ pairs.
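A minimal Monte Carlo sketch of the four-variable identity (not from the lecture): sample a zero-mean Gaussian vector and compare the empirical fourth moment with the pairwise-covariance expression. The covariance matrix is an illustrative positive-definite choice.

```python
import numpy as np

# Minimal sketch: Monte Carlo check of the Isserlis theorem for a zero-mean
# Gaussian vector (X1, X2, X3, X4).
rng = np.random.default_rng(11)
c = np.array([[1.0, 0.5, 0.3, 0.2],
              [0.5, 1.0, 0.4, 0.1],
              [0.3, 0.4, 1.0, 0.6],
              [0.2, 0.1, 0.6, 1.0]])
x = rng.multivariate_normal(np.zeros(4), c, size=2_000_000)

lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 2] * x[:, 3])
rhs = c[0, 1] * c[2, 3] + c[0, 2] * c[1, 3] + c[0, 3] * c[1, 2]
print(round(lhs, 3), round(rhs, 3))   # the two agree up to sampling error
```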

A stationary Gaussian process $X_t$ with zero mean is ms ergodic in the acf if and only if
$\lim_{T \to \infty} \frac{1}{T}\int_0^T C_X^2(\theta)\,d\theta = 0$.
proof: Let $Y_t^{(\tau)} := X_{t+\tau} X_t$, and consider
$C_{Y^{(\tau)}}(\theta) = E Y_{t+\theta}^{(\tau)} Y_t^{(\tau)} - C_X^2(\tau)$ [zero mean]
$= E X_{t+\theta+\tau} X_{t+\theta} X_{t+\tau} X_t - C_X^2(\tau)$
$= E X_{t+\theta+\tau} X_{t+\theta}\,E X_{t+\tau} X_t + E X_{t+\theta+\tau} X_{t+\tau}\,E X_{t+\theta} X_t + E X_{t+\theta+\tau} X_t\,E X_{t+\theta} X_{t+\tau} - C_X^2(\tau)$ [Isserlis thm]
$= C_X^2(\tau) + C_X^2(\theta) + C_X(\theta + \tau)\,C_X(\theta - \tau) - C_X^2(\tau)$
$= C_X^2(\theta) + C_X(\theta + \tau)\,C_X(\theta - \tau)$

only if: Assumed is, by the previous theorem, that $\lim_{T \to \infty} \frac{1}{T}\int_0^T C_{Y^{(\tau)}}(\theta)\,d\theta = 0$ for every $\tau$.
$\Rightarrow\; \lim_{T \to \infty} \frac{1}{T}\int_0^T \left[C_X^2(\theta) + C_X(\theta + \tau)\,C_X(\theta - \tau)\right] d\theta = 0$
Setting $\tau = 0$: $\lim_{T \to \infty} \frac{2}{T}\int_0^T C_X^2(\theta)\,d\theta = 0$.
if: Assume that $\lim_{T \to \infty} \frac{1}{T}\int_0^T C_X^2(\theta)\,d\theta = 0$.
$\lim_{T \to \infty} \frac{1}{T}\int_0^T C_{Y^{(\tau)}}(\theta)\,d\theta$
$\le \lim_{T \to \infty} \frac{1}{T}\int_0^T C_X^2(\theta)\,d\theta + \lim_{T \to \infty} \frac{1}{T}\int_0^T C_X(\theta + \tau)\,C_X(\theta - \tau)\,d\theta$
$\le \lim_{T \to \infty} \left(\frac{1}{T}\int_0^T C_X^2(\theta + \tau)\,d\theta\right)^{1/2} \left(\frac{1}{T}\int_0^T C_X^2(\theta - \tau)\,d\theta\right)^{1/2}$ [Schwarz]
$= 0$ for any finite $\tau$.
By the previous theorem, the proof is complete.