Minimum Mean-Square Error (MMSE) and Linear MMSE (LMMSE) Estimation

Outline:
• MMSE estimation,
• Linear MMSE (LMMSE) estimation,
• Geometric formulation of LMMSE estimation and the orthogonality principle.
Reading: Chapter 12 in Kay-I.
MMSE Estimation

Consider the following problem:

A signal $\Theta = \theta$ is transmitted through a noisy channel, modeled using the conditional pdf $f_{X|\Theta}(x\,|\,\theta)$, which is the likelihood function of $\theta$. We observe $X = x$. The signal has known prior (marginal) pdf

$f_\Theta(\theta) = \pi(\theta)$

which summarizes our knowledge about $\Theta$ before (i.e. prior to) collecting $X = x$. We wish to estimate $\Theta$ using the observation $X = x$:

$\hat\theta = \hat\theta(x) = g(x).$

We choose $g(X)$ to minimize the Bayesian (preposterior) mean-square error:

$\mathrm{BMSE} = E_{\Theta,X}\{[\hat\theta(X) - \Theta]^2\} = E_{\Theta,X}\{[g(X) - \Theta]^2\}.$

Here, the estimates $\hat\theta(X)$ that achieve the minimum BMSE are called minimum MSE (MMSE) estimates of $\Theta$; $\hat\theta(X)$ may not be unique.
A Reminder: MMSE Estimation

Theorem 1. The MMSE estimate of $\Theta$ (based on the observation $X = x$) is given by

$\hat\theta_{\rm MMSE} = g(x) = E_{\Theta|X}(\Theta\,|\,x). \qquad (1)$

The minimum BMSE (i.e. the BMSE of $\hat\theta_{\rm MMSE}(x) = E_{\Theta|X}[\Theta\,|\,x]$) is

$\mathrm{MBMSE} = E_X[\mathrm{var}_{\Theta|X}(\Theta\,|\,X)] = E_\Theta(\Theta^2) - E_X\{[E_{\Theta|X}(\Theta\,|\,X)]^2\}. \qquad (2)$

Lemma 1. We first show that

$\min_b \, E_\Theta[(b - \Theta)^2] = \mathrm{var}_\Theta(\Theta)$

is achieved for

$b = E_\Theta(\Theta).$

Therefore, in the absence of any observations, the MMSE estimate
of $\Theta$ is equal to the mean of the (prior, marginal) pdf of $\Theta$:

$E_\Theta[(\Theta - b)^2] = E_\Theta\{[\Theta - E_\Theta(\Theta) + E_\Theta(\Theta) - b]^2\}$
$\quad = E_\Theta\big\{[\Theta - E_\Theta(\Theta)]^2 + [E_\Theta(\Theta) - b]^2 + 2\,[E_\Theta(\Theta) - b]\,[\Theta - E_\Theta(\Theta)]\big\}$
$\quad = E_\Theta\{[\Theta - E_\Theta(\Theta)]^2\} + (E_\Theta[\Theta] - b)^2 + 2\,[E_\Theta(\Theta) - b]\,\underbrace{E_\Theta[\Theta - E_\Theta(\Theta)]}_{0}$
$\quad \ge E_\Theta\{[\Theta - E_\Theta(\Theta)]^2\}$

with equality if and only if $b = E_\Theta(\Theta)$.

Proof. (Theorem 1) We now consider our MMSE estimation problem, write the BMSE of an estimator $g(X)$ as

$\mathrm{BMSE} = E_{\Theta,X}\{[\Theta - g(X)]^2\} \overset{\text{iter. exp.}}{=} E_X\big[\underbrace{E_{\Theta|X}\{[\Theta - g(X)]^2\,|\,X\}}_{\text{posterior expected squared loss, see handout \# 4}}\big]$

and use Lemma 1 to conclude that, for each $X = x$, the posterior expected squared loss

$E_{\Theta|X}\{[\Theta - g(X)]^2\,|\,x\}$
is minimized for

$g(x) = E_{\Theta|X}(\Theta\,|\,x).$

Thus, the BMSE is minimized for

$g(X) = E_{\Theta|X}(\Theta\,|\,X).$

We now find the minimum BMSE:

$\mathrm{MBMSE} = E_{\Theta,X}\{[\Theta - E_{\Theta|X}(\Theta\,|\,X)]^2\} \overset{\text{iter. exp.}}{=} E_X\big[E_{\Theta|X}\{[\Theta - E_{\Theta|X}(\Theta\,|\,X)]^2\,|\,X\}\big] = E_X[\mathrm{var}_{\Theta|X}(\Theta\,|\,X)]. \qquad (3)$

$\Box$

Comments:

• $E_X[\hat\theta_{\rm MMSE}(X)] = E_\Theta(\Theta)$ — unbiased on average. $\qquad (4)$

• However, $\hat\theta_{\rm MMSE}(X)$ is practically never unbiased in the classical sense:

$E_{X|\Theta}[\hat\theta_{\rm MMSE}(X)\,|\,\theta] \ne \theta \quad \text{in general.} \qquad (5)$

You will show (5) in a HW assignment.
• For independent $\Theta$ and $X$, the MMSE estimate of $\Theta$ is $\hat\theta_{\rm MMSE}(X) = E_\Theta(\Theta)$.

• The estimation error

$\mathcal{E} = \Theta - \hat\theta_{\rm MMSE}(X) \qquad (6)$

and the MMSE estimate $\hat\theta_{\rm MMSE}(X)$ are orthogonal:

$E_{\Theta,X}[\mathcal{E}\,\hat\theta_{\rm MMSE}(X)] = E_{\Theta,X}\{[\Theta - \hat\theta_{\rm MMSE}(X)]\,\hat\theta_{\rm MMSE}(X)\}$
$\quad \overset{\text{iter. exp.}}{=} E_X\{E_{\Theta|X}([\Theta - \hat\theta_{\rm MMSE}(X)]\,\hat\theta_{\rm MMSE}(X)\,|\,X)\}$
$\quad = E_X\{\hat\theta_{\rm MMSE}(X)\,E_{\Theta|X}[\Theta - \hat\theta_{\rm MMSE}(X)\,|\,X]\} = 0$

since $\hat\theta_{\rm MMSE}(X) = E_{\Theta|X}[\Theta\,|\,X]$. It is clear from this derivation that the estimation error $\mathcal{E}$ in (6) is orthogonal to any function $g(X)$ of $X$:

$E_{\Theta,X}\{[\Theta - \hat\theta_{\rm MMSE}(X)]\,g(X)\} = E_X\{E_{\Theta|X}([\Theta - \hat\theta_{\rm MMSE}(X)]\,g(X)\,|\,X)\} = E_X\{g(X)\,E_{\Theta|X}[\Theta - \hat\theta_{\rm MMSE}(X)\,|\,X]\} = 0.$

• The law of conditional variances [(5) in handout # 0b]
implies

$\mathrm{var}_\Theta(\Theta) = \underbrace{E_X[\mathrm{var}_{\Theta|X}(\Theta\,|\,X)]}_{\text{MBMSE, see (3)}} + \mathrm{var}_X\big(\underbrace{E_{\Theta|X}[\Theta\,|\,X]}_{\hat\theta_{\rm MMSE}(X),\ \text{see (1)}}\big)$

i.e. the sum of

• the minimum BMSE for estimating $\Theta$ and
• the variance of the MMSE estimate of $\Theta$

is equal to the (marginal, prior) variance of $\Theta$.
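As a quick numerical illustration of this decomposition (a sketch that is not part of the original notes; the toy model $\Theta \sim N(0,1)$, $X = \Theta + W$, $W \sim N(0,1)$ and the Python/numpy setting are assumptions made here for illustration only):

```python
import numpy as np

# Monte Carlo check of var(Theta) = E[var(Theta|X)] + var(E[Theta|X])
# for the assumed toy model Theta ~ N(0,1), X = Theta + W, W ~ N(0,1),
# where E[Theta|X] = X/2 and var(Theta|X) = 1/2.
rng = np.random.default_rng(0)
n = 200_000
theta = rng.standard_normal(n)
x = theta + rng.standard_normal(n)

post_mean = x / 2.0          # MMSE estimate E[Theta|X]
post_var = 0.5               # var(Theta|X), constant for this model

lhs = theta.var()                          # var_Theta(Theta), approx. 1
rhs = post_var + post_mean.var()           # E[var(.|X)] + var(E[.|X])
bmse = np.mean((post_mean - theta) ** 2)   # empirical BMSE of the MMSE estimate
print(lhs, rhs, bmse)
```

The first two printed values should agree (both near 1), and the empirical BMSE should be close to $E_X[\mathrm{var}_{\Theta|X}(\Theta\,|\,X)] = 0.5$, as in (3).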
Additive Gaussian Noise Channel

Consider a communication channel with input

$\Theta \sim N(\mu_\Theta, \sigma_\Theta^2)$

and noise

$W \sim N(0, \sigma^2)$

where $\Theta$ and $W$ are independent and the measurement $X$ is modeled as

$X = \Theta + W. \qquad (7)$

Find the MMSE estimate of $\Theta$ based on $X$ and the resulting minimum BMSE (MBMSE), i.e. $E_{\Theta|X}(\Theta\,|\,X)$ and $E_X[\mathrm{var}_{\Theta|X}(\Theta\,|\,X)]$, see (1) and (2).

Note: We have already considered this problem in handout # 4. We revisit it here with the focus on MMSE estimation and finding the MBMSE.

Solution: From (7), we have:

$f_{X|\Theta}(x\,|\,\theta) = N(x\,|\,\theta, \sigma^2).$
We now find $f_{\Theta|X}(\theta\,|\,x)$ using the Bayes rule:

$f_{\Theta|X}(\theta\,|\,x) \propto f_\Theta(\theta)\, f_{X|\Theta}(x\,|\,\theta)$
$\quad \propto \exp\Big[-\tfrac{1}{2\sigma_\Theta^2}\,(\theta - \mu_\Theta)^2\Big]\,\exp\Big[-\tfrac{1}{2\sigma^2}\,(x - \theta)^2\Big]$
$\quad \propto \exp\Big[-\tfrac{1}{2}\Big(\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}\Big)\theta^2 + \Big(\tfrac{1}{\sigma^2}\,x + \tfrac{1}{\sigma_\Theta^2}\,\mu_\Theta\Big)\theta\Big]$
$\quad = N\Big(\theta \,\Big|\, \dfrac{\tfrac{1}{\sigma^2}\,x + \tfrac{1}{\sigma_\Theta^2}\,\mu_\Theta}{\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}},\ \Big(\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}\Big)^{-1}\Big)$

implying that

$\hat\theta_{\rm MMSE}(X) = E_{\Theta|X}(\Theta\,|\,X) = \dfrac{\tfrac{1}{\sigma^2}\,X + \tfrac{1}{\sigma_\Theta^2}\,\mu_\Theta}{\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}} \qquad (8)$

$\mathrm{var}_{\Theta|X}(\Theta\,|\,X) = \Big(\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}\Big)^{-1} \qquad (9)$

and, consequently,

$\mathrm{MBMSE} = E_X[\mathrm{var}_{\Theta|X}(\Theta\,|\,X)] = \Big(\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}\Big)^{-1}. \qquad (10)$
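A minimal Monte Carlo check of (8)-(10) (not part of the original notes; the parameter values and the Python/numpy setting below are illustrative assumptions):

```python
import numpy as np

# Theta ~ N(mu, sigma_theta^2), X = Theta + W, W ~ N(0, sigma^2).
rng = np.random.default_rng(1)
mu, sigma_theta, sigma = 2.0, 1.5, 1.0
n = 200_000

theta = mu + sigma_theta * rng.standard_normal(n)
x = theta + sigma * rng.standard_normal(n)

# MMSE estimate (8): precision-weighted combination of the data and the prior mean.
theta_hat = (x / sigma**2 + mu / sigma_theta**2) / (1 / sigma**2 + 1 / sigma_theta**2)

empirical_bmse = np.mean((theta_hat - theta) ** 2)
mbmse = 1.0 / (1 / sigma**2 + 1 / sigma_theta**2)   # (10)
print(empirical_bmse, mbmse)    # the two values should nearly agree
```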
Note: In the above example, the MMSE estimate is a linear (more precisely, constant + linear = affine) function of the observation $X$. This is not always the case, e.g. for

$f_{\Theta|X}(\theta\,|\,x) = \begin{cases} x\,e^{-x\theta}, & \theta > 0,\ x > 0 \\ 0, & \text{otherwise} \end{cases}$

we obtain $E_{\Theta|X}(\Theta\,|\,X) = 1/X$. Here is another example.

Computing the MMSE estimator: Another example.
Gaussian Linear Model (Theorem 10.3 in Kay-I)

Theorem 2. Consider the linear model

$X = H\,\Theta + W$

where $H$ is a known matrix, and

$W \sim N(0, C_W), \qquad \Theta \sim N(\mu_\Theta, C_\Theta)$

where $W$ and $\Theta$ are independent and $C_W$, $\mu_\Theta$, and $C_\Theta$ are known hyperparameters. Then, the posterior pdf $f_{\Theta|X}(\theta\,|\,x)$ is Gaussian:

$f_{\Theta|X}(\theta\,|\,x) = N\big(\theta \,\big|\, (H^T C_W^{-1} H + C_\Theta^{-1})^{-1}(H^T C_W^{-1} x + C_\Theta^{-1}\mu_\Theta),\ (H^T C_W^{-1} H + C_\Theta^{-1})^{-1}\big). \qquad (11)$
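A short numpy sketch of evaluating the posterior in (11) (illustration only; the dimensions, covariances, and variable names below are assumptions, not values from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 20, 3
H = rng.standard_normal((N, p))       # known observation matrix
C_w = 0.5 * np.eye(N)                 # noise covariance C_W
mu_theta = np.zeros(p)                # prior mean
C_theta = 4.0 * np.eye(p)             # prior covariance C_Theta

# Simulate one realization of Theta and X = H Theta + W.
theta = mu_theta + np.linalg.cholesky(C_theta) @ rng.standard_normal(p)
x = H @ theta + np.linalg.cholesky(C_w) @ rng.standard_normal(N)

# Posterior covariance and mean from (11).
precision = H.T @ np.linalg.inv(C_w) @ H + np.linalg.inv(C_theta)
C_post = np.linalg.inv(precision)
theta_post = C_post @ (H.T @ np.linalg.inv(C_w) @ x + np.linalg.inv(C_theta) @ mu_theta)
print(theta_post, np.diag(C_post))
```

(In practice one would solve the linear systems rather than form explicit inverses; the inverses are kept here only to mirror (11).)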
Proof.

$f_{\Theta|X}(\theta\,|\,x) \propto f_{X|\Theta}(x\,|\,\theta)\,\pi(\theta)$
$\quad \propto \exp[-\tfrac{1}{2}(x - H\theta)^T C_W^{-1}(x - H\theta)]\,\exp[-\tfrac{1}{2}(\theta - \mu_\Theta)^T C_\Theta^{-1}(\theta - \mu_\Theta)]$
$\quad \propto \exp(-\tfrac{1}{2}\theta^T H^T C_W^{-1} H\,\theta + x^T C_W^{-1} H\,\theta)\,\exp(-\tfrac{1}{2}\theta^T C_\Theta^{-1}\theta + \mu_\Theta^T C_\Theta^{-1}\theta)$
$\quad = \exp[-\tfrac{1}{2}\theta^T (H^T C_W^{-1} H + C_\Theta^{-1})\theta + (x^T C_W^{-1} H + \mu_\Theta^T C_\Theta^{-1})\theta]$
$\quad \propto N\big(\theta \,\big|\, (H^T C_W^{-1} H + C_\Theta^{-1})^{-1}(H^T C_W^{-1} x + C_\Theta^{-1}\mu_\Theta),\ (H^T C_W^{-1} H + C_\Theta^{-1})^{-1}\big). \qquad \Box$

Comments:

• DC-level estimation in AWGN with known variance, introduced on p. 17 of handout # 4, is a special case of this result; see also Example 10.2 in Kay-I.
• Examine the posterior mean:

$E_{\Theta|X}(\Theta\,|\,x) = \big(\underbrace{H^T C_W^{-1} H}_{\text{likelihood precision}} + \underbrace{C_\Theta^{-1}}_{\text{prior precision}}\big)^{-1}\big(\underbrace{H^T C_W^{-1} x}_{\text{data-dependent term}} + \underbrace{C_\Theta^{-1}\mu_\Theta}_{\text{prior-dependent term}}\big).$

• Noninformative (flat) prior on $\Theta$ and white noise. Consider the Jeffreys noninformative (flat) prior pdf for $\Theta$:

$\pi(\theta) \propto 1 \qquad (C_\Theta^{-1} = 0)$

and white noise:

$C_W = \sigma^2\,\underbrace{I}_{\text{identity matrix}}.$

Then, $f_{\Theta|X}(\theta\,|\,x)$ in (11) simplifies to

$f_{\Theta|X}(\theta\,|\,x) = N\big(\theta \,\big|\, \underbrace{\hat\theta_{\rm LS}(x)}_{(H^T H)^{-1} H^T x},\ \sigma^2 (H^T H)^{-1}\big).$

• Prediction: We now practice prediction for this model. Say we wish to predict an $X_\star$ coming from the following model:

$X_\star = h_\star^T\,\Theta + W_\star$


where $W_\star \sim N(0, \sigma^2)$ is independent of $W$, implying that $X_\star$ and $X$ are conditionally independent given $\Theta = \theta$ and, therefore,

$f_{X_\star|\Theta,X}(x_\star\,|\,\theta, x) = f_{X_\star|\Theta}(x_\star\,|\,\theta) = N(x_\star\,|\,h_\star^T\theta,\ \sigma^2).$

Then, our posterior predictive pdf is [along the lines of (10)]

$f_{X_\star|X}(x_\star\,|\,x) = \int \underbrace{f_{X_\star|\Theta}(x_\star\,|\,\theta)}_{N(x_\star|\,h_\star^T\theta,\ \sigma^2)}\,\underbrace{f_{\Theta|X}(\theta\,|\,x)}_{N(\theta|\,\hat\theta(x),\ C_{\rm post})}\,d\theta$

where

$\hat\theta(x) = (H^T C_W^{-1} H + C_\Theta^{-1})^{-1}(H^T C_W^{-1} x + C_\Theta^{-1}\mu_\Theta)$
$C_{\rm post} = (H^T C_W^{-1} H + C_\Theta^{-1})^{-1}$

implying

$f_{X_\star|X}(x_\star\,|\,x) = N\big(h_\star^T\,\hat\theta(x),\ h_\star^T C_{\rm post}\,h_\star + \sigma^2\big).$
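A sketch of this posterior predictive computation (illustrative values and names only, in the same assumed numpy setting as the earlier snippet; here $C_W = \sigma^2 I$ for simplicity):

```python
import numpy as np

rng = np.random.default_rng(3)
N, p, sigma2 = 20, 3, 0.5
H = rng.standard_normal((N, p))
mu_theta, C_theta = np.zeros(p), 4.0 * np.eye(p)
x = H @ rng.standard_normal(p) + np.sqrt(sigma2) * rng.standard_normal(N)

# Posterior mean and covariance from (11), with C_W = sigma2 * I.
C_post = np.linalg.inv(H.T @ H / sigma2 + np.linalg.inv(C_theta))
theta_post = C_post @ (H.T @ x / sigma2 + np.linalg.inv(C_theta) @ mu_theta)

# Posterior predictive mean and variance for a new regressor h_star.
h_star = rng.standard_normal(p)
pred_mean = h_star @ theta_post                  # h_star^T theta_hat(x)
pred_var = h_star @ C_post @ h_star + sigma2     # h_star^T C_post h_star + sigma^2
print(pred_mean, pred_var)
```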
Linear MMSE (LMMSE) Estimation

For exact MMSE estimation, we need to know the joint pdf (or joint pmf) $f_{\Theta,X}(\theta, x)$, typically specified through the prior (marginal) pdf/pmf $f_\Theta(\theta)$ and the conditional pdf/pmf $f_{X|\Theta}(x\,|\,\theta)$, which together yield the joint pdf (or pmf, or combined pdf/pmf)

$f_{\Theta,X}(\theta, x) = f_{X|\Theta}(x\,|\,\theta)\, f_\Theta(\theta).$

This information may not be available.

We typically have estimates of the first and second moments of the signal and the observation, i.e. of the means, variances, and the covariance between $\Theta$ and $X$.

This information is generally not sufficient for MMSE estimation of $\Theta$, but is sufficient for linear MMSE (LMMSE) estimation of $\Theta$, i.e. for finding estimates of the form

$\hat\theta = \hat\theta(X) = a\,X + b \qquad (12)$

that minimize the BMSE:

$\mathrm{BMSE} = E_{\Theta,X}\{[\Theta - \hat\theta(X)]^2\}.$

The minimization is with respect to $a$ and $b$.
Note: Even though it is more appropriate to refer to this estimator as the affine MMSE estimator, "linear MMSE estimator" is the most widely used name for it. In most applications, we consider zero-mean $X$ and $\Theta$; then, our estimator is indeed linear, see Theorem 3 below.

Theorem 3. The LMMSE estimate of $\Theta$ is

$\hat\theta(X) = \underbrace{\dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_X^2}}_{a_{\rm opt}}\,[X - E_X(X)] + E_\Theta(\Theta) = \rho_{\Theta,X}\,\sigma_\Theta\,\dfrac{X - E_X(X)}{\sigma_X} + E_\Theta(\Theta) \qquad (13)$

and its BMSE is given by

$\mathrm{MBMSE}_{\rm linear} = \mathrm{cov}_{\Theta,X}(\Theta - \hat\theta(X),\ \Theta) \qquad (14)$
$\quad = \sigma_\Theta^2 - \dfrac{\mathrm{cov}^2_{\Theta,X}(\Theta, X)}{\sigma_X^2} = (1 - \rho^2_{\Theta,X})\,\sigma_\Theta^2. \qquad (15)$

Here,

$\mathrm{cov}_{\Theta,X}(\Theta, X) = E_{\Theta,X}[(\Theta - \mu_\Theta)(X - \mu_X)] = E_{\Theta,X}(\Theta X) - E_\Theta(\Theta)\,E_X(X)$
is introduced on p. 4 of handout # 0b,

$\mathrm{var}_\Theta(\Theta) = \mathrm{cov}_\Theta(\Theta, \Theta) = \sigma_\Theta^2, \qquad \mathrm{var}_X(X) = \sigma_X^2$

and $\rho_{\Theta,X}$ is the correlation coefficient between $\Theta$ and $X$, defined as

$\rho_{\Theta,X} = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sqrt{\mathrm{var}_\Theta(\Theta)}\,\sqrt{\mathrm{var}_X(X)}} = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_\Theta\,\sigma_X}$

where $\sigma_\Theta = \sqrt{\sigma_\Theta^2}$ and $\sigma_X = \sqrt{\sigma_X^2}$ are the (marginal) standard deviations of $\Theta$ and $X$.

Proof. Suppose first that the constant $a$ has already been chosen. Then, choosing the constant $b$ to minimize the BMSE

$E_{\Theta,X}[(\Theta - a\,X - b)^2]$

is equivalent to the problem of Lemma 1 with $\Theta - a\,X$ in place of $\Theta$; the optimal $b$ is therefore the mean of $\Theta - a\,X$, i.e.

$b = E_{\Theta,X}(\Theta - a\,X) = E_\Theta(\Theta) - a\,E_X(X). \qquad (16)$
Substituting (16) into $E_{\Theta,X}[(\Theta - a\,X - b)^2]$ yields

$\mathrm{var}_{\Theta,X}(\Theta - a\,X) \qquad (17)$
$\quad = \sigma_\Theta^2 + a^2\,\sigma_X^2 - 2\,a\,\mathrm{cov}_{\Theta,X}(\Theta, X) \qquad (18)$

which is easy to minimize with respect to $a$. In particular, differentiating (18) with respect to $a$ and setting the result to zero yields

$2\,a\,\sigma_X^2 - 2\,\mathrm{cov}_{\Theta,X}(\Theta, X) = 0$

i.e.

$a\,\mathrm{cov}_X(X, X) - \mathrm{cov}_{\Theta,X}(\Theta, X) = 0$

and, finally,

$\mathrm{cov}_{\Theta,X}(\Theta - a\,X,\ X) = 0$

which is the famous orthogonality principle. Clearly, the optimal $a$ is

$a_{\rm opt} = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_X^2} \qquad (19)$

and (13) follows. We summarize the orthogonality principle:
$\mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}\,X,\ X) = 0 \qquad (20)$

or, equivalently,

$\mathrm{cov}_{\Theta,X}\big(\Theta - \underbrace{\hat\theta(X)}_{\text{LMMSE est. of } \Theta \text{ based on } X},\ X\big) = 0. \qquad (21)$

Substituting (19) into (17) yields

$\mathrm{MBMSE}_{\rm linear} = \underbrace{\mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}\,X,\ \Theta - a_{\rm opt}\,X)}_{\mathrm{var}_{\Theta,X}(\Theta - a_{\rm opt}\,X)}$
$\quad = \mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}\,X,\ \Theta) - a_{\rm opt}\,\underbrace{\mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}\,X,\ X)}_{0,\ \text{by (20)}}$
$\quad = \sigma_\Theta^2 - \dfrac{\mathrm{cov}^2_{\Theta,X}(\Theta, X)}{\sigma_X^2}$

and (15) follows. By completing the squares, it is easy to check
that, for any $a \in \mathbb{R}$,

$\mathrm{var}_{\Theta,X}(\Theta - a\,X) = \mathrm{var}_{\Theta,X}(\Theta - a\,X + a_{\rm opt}\,X - a_{\rm opt}\,X)$
$\quad = \mathrm{var}_{\Theta,X}\big[(\Theta - a_{\rm opt}\,X) - (a - a_{\rm opt})\,X\big]$
$\quad = \underbrace{\mathrm{var}_{\Theta,X}(\Theta - a_{\rm opt}\,X)}_{\mathrm{MBMSE}_{\rm linear}} + (a - a_{\rm opt})^2\,\mathrm{var}_X(X) - 2\,(a - a_{\rm opt})\,\underbrace{\mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}\,X,\ X)}_{0,\ \text{by (20)}}$
$\quad = \underbrace{\sigma_\Theta^2 - \dfrac{[\mathrm{cov}_{\Theta,X}(\Theta, X)]^2}{\sigma_X^2}}_{\text{see (15)}} + \sigma_X^2\,(a - a_{\rm opt})^2 \qquad (22)$

which proves the MMSE optimality of (19). $\Box$

Comments:

• $E_X[\hat\theta(X)] = E_\Theta(\Theta)$ — also true for the MMSE estimate, see (4).

• If $\rho_{\Theta,X} = 0$, i.e. $\Theta$ and $X$ are uncorrelated, then

$\hat\theta(X) = E_\Theta(\Theta) = \text{constant}$

i.e. LMMSE estimation ignores the observation $X$.

• If $|\rho_{\Theta,X}| = 1$, i.e. $\Theta - E_\Theta(\Theta)$ and $X - E_X(X)$ are linearly dependent with probability one, then the LMMSE estimate is perfect.
LMMSE vs. MMSE

In general, the LMMSE estimate is not as good as the MMSE estimate.

Example: Suppose that

$X \sim U(-1, 1)$ — uniform pdf, see the table of distributions

and

$\Theta = X^2.$

The MMSE estimate of $\Theta$ based on $X$ is

$\hat\theta(X) = E_{\Theta|X}(\Theta\,|\,X) = X^2$

which is perfect. To find the LMMSE estimate of $\Theta$ based on $X$, we need

$E_X(X) = 0$
$E_\Theta(\Theta) = \int_{-1}^{1} x^2\,\tfrac{1}{2}\,dx = \tfrac{1}{3}$
$\mathrm{cov}_{\Theta,X}(\Theta, X) = E_{\Theta,X}(\Theta X) - 0 = 0$ — $\Theta$ and $X$ uncorrelated
yielding the LMMSE estimate

$\hat\theta(X) = E_\Theta(\Theta) = \tfrac{1}{3}$

i.e. the observation $X$ is totally ignored even though it completely determines $\Theta$.
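The gap between the two estimators is easy to see numerically; a Monte Carlo sketch (not from the notes; the Python/numpy setting and sample size are assumptions):

```python
import numpy as np

# X ~ U(-1, 1), Theta = X^2.
rng = np.random.default_rng(4)
x = rng.uniform(-1.0, 1.0, size=200_000)
theta = x**2

mmse_est = x**2          # posterior mean: zero error
lmmse_est = 1.0 / 3.0    # uncorrelated case: LMMSE falls back to the prior mean

print(np.mean((mmse_est - theta) ** 2))    # 0
print(np.mean((lmmse_est - theta) ** 2))   # approx. var(Theta) = 4/45 ~ 0.089
```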
A class of random signals for which the MMSE estimate is linear is the class of jointly Gaussian random signals, e.g. $\Theta$ and $X$ in the additive Gaussian noise channel example on p. 8.
Linear MMSE Estimation: A Geometric Formulation

We first introduce some background:

A vector space $\mathcal{V}$ (e.g. the common Euclidean space) consists of a set of vectors that is closed under two operations:

• vector addition: if $v_1, v_2 \in \mathcal{V}$, then $v_1 + v_2 \in \mathcal{V}$ and
• scalar multiplication: if $a \in \mathbb{R}$ and $v \in \mathcal{V}$, then $a\,v \in \mathcal{V}$.

An inner product (e.g. the scalar product in Euclidean spaces) is an operation $u \cdot v$ satisfying

• commutativity: $u \cdot v = v \cdot u$,
• linearity: $(a\,u + b\,v) \cdot w = a\,u \cdot w + b\,v \cdot w$, and
• non-negativity of the inner product of any vector with itself: $u \cdot u \ge 0$, with $u \cdot u = 0$ if and only if $u = 0$.

The norm of $u$ is defined as $\|u\| = \sqrt{u \cdot u}$.

$u$ and $v$ are orthogonal (written $u \perp v$) if and only if $u \cdot v = 0$.
A vector space with an inner product is called an inner-product space. Example: Euclidean space with the scalar product.

How about a vector space for random variables?

Consider random variables $X$ and $Y$ as vectors in an inner-product space $\mathcal{V}$ that contains all RVs defined over the same probability space, with

• vector addition: $V_1 + V_2 \in \mathcal{V}$,
• scalar multiplication: $a\,V \in \mathcal{V}$,
• inner product: $V_1 \cdot V_2 = \mathrm{cov}_{V_1,V_2}(V_1, V_2)$ (check that it is a legitimate inner product),
• the norm of $V$: $\|V\| = \sqrt{\mathrm{var}_V(V)} = \sigma_V$.

Hence,
$\underbrace{\mathrm{cov}_{X,Y}(X, Y)}_{\text{inner product}} = \underbrace{\sigma_X}_{\text{norm of } X}\;\underbrace{\sigma_Y}_{\text{norm of } Y}\;\underbrace{\rho_{X,Y}}_{\cos}$

i.e. the correlation coefficient $\rho_{X,Y}$ plays the role of the cosine of the angle between $X$ and $Y$.
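A tiny numerical illustration of this correspondence (an assumed example, not from the notes): the sample correlation coefficient equals the cosine of the angle between the centered sample vectors.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(10_000)
y = 0.7 * x + rng.standard_normal(10_000)

xc, yc = x - x.mean(), y - y.mean()
cos_angle = (xc @ yc) / (np.linalg.norm(xc) * np.linalg.norm(yc))
rho = np.corrcoef(x, y)[0, 1]
print(cos_angle, rho)    # the two values coincide
```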
The linear MMSE estimation problem can be recast in the above geometric framework after substituting the optimal $b$ from (16) into $E_{\Theta,X}\{[\Theta - a\,X - b]^2\}$, yielding

$\mathrm{var}_{\Theta,X}(\Theta - a\,X) = \|\Theta - a\,X\|^2.$

We wish to minimize this variance with respect to $a$.

Clearly, $\|\Theta - a\,X\|^2$ is minimized if

$(\Theta - a\,X) \perp X$
i.e. if

$\mathrm{cov}_{\Theta,X}(\Theta - a\,X,\ X) = 0$

and, consequently, the MMSE-optimal linear term $a$ is

$a_{\rm opt} = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\mathrm{var}_X(X)} = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_X^2}.$
To summarize:

Orthogonality principle:

$(\Theta - a_{\rm opt}\,X) \perp X$

i.e.

$\mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}\,X,\ X) = 0 \qquad (23)$

see (20).
Additive White Noise Channel

Consider again the communication channel example on p. 8, with input $\Theta$ having mean $\mu_\Theta$ and variance $\sigma_\Theta^2$ and noise $W$ having mean zero and variance $\sigma^2$, where $\Theta$ and $W$ are independent and the measurement $X$ is

$X = \Theta + W.$

Find the LMMSE estimate of $\Theta$ based on $X$ and the resulting BMSE ($\mathrm{MBMSE}_{\rm linear}$). We need

$E_\Theta(\Theta) = \mu_\Theta$
$E_X(X) = E_{\Theta,W}(\Theta + W) = E_\Theta(\Theta) + E_W(W) = \mu_\Theta$

and

$\mathrm{cov}_{\Theta,X}(\Theta, X) = \mathrm{cov}_{\Theta,W}(\Theta,\ \Theta + W) \overset{\Theta,\,W\ \text{uncorr.}}{=} \sigma_\Theta^2$
$\mathrm{var}_X(X) = \mathrm{cov}_{\Theta,W}(\Theta + W,\ \Theta + W) \overset{\Theta,\,W\ \text{uncorr.}}{=} \sigma_\Theta^2 + \sigma^2.$
The LMMSE estimate of $\Theta$ is

$\hat\theta(X) = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_X^2}\,[X - E_X(X)] + E_\Theta(\Theta)$
$\quad = \dfrac{\sigma_\Theta^2}{\sigma_\Theta^2 + \sigma^2}\,(X - \mu_\Theta) + \mu_\Theta$
$\quad = \dfrac{\sigma_\Theta^2}{\sigma_\Theta^2 + \sigma^2}\,X + \dfrac{\sigma^2}{\sigma_\Theta^2 + \sigma^2}\,\mu_\Theta$
$\quad = \dfrac{\tfrac{1}{\sigma^2}\,X + \tfrac{1}{\sigma_\Theta^2}\,\mu_\Theta}{\tfrac{1}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}}$

which is the same as the MMSE estimate in (8).
Example: Estimating the Bias of a Coin

Suppose that the (prior) pdf of the probability of heads $\Theta$ of a coin is

$f_\Theta(\theta) = U(\theta\,|\,0, 1) = i_{(0,1)}(\theta).$

We flip this coin $N$ times and record the number of heads $X$. Then, if the coin flips are independent, identically distributed (i.i.d.), the conditional pdf of $X$ given $\Theta = \theta$ is

$f_{X|\Theta}(x\,|\,\theta) = \binom{N}{x}\,\theta^x\,(1 - \theta)^{N - x} = \mathrm{Bin}(x\,|\,N, \theta)$ — binomial pdf. $\qquad (24)$

Find the MMSE and LMMSE estimates of $\Theta$ based on $X$.

MMSE:

$f_{\Theta|X}(\theta\,|\,x) \propto f_\Theta(\theta)\,f_{X|\Theta}(x\,|\,\theta) \propto i_{(0,1)}(\theta)\,\theta^x\,(1 - \theta)^{N - x} = \mathrm{Beta}(\theta\,|\,x + 1,\ N - x + 1)$

see the table of distributions. Now, the MMSE estimate of $\Theta$ is

$\hat\theta_{\rm MMSE}(x) = E_{\Theta|X}(\Theta\,|\,X = x) = \dfrac{x + 1}{N + 2}.$
LMMSE: We need

$\mu_\Theta = E_\Theta(\Theta) = \tfrac{1}{2}$ — mean of the uniform$(0, 1)$ pdf

$\mu_X = E_{\Theta,X}(X) \overset{\text{iter. exp.}}{=} E_\Theta[E_{X|\Theta}(X\,|\,\Theta)] = E_\Theta(\underbrace{N\,\Theta}_{\text{mean of the binomial pdf in (24)}}) = \tfrac{1}{2}\,N$

and

$\sigma_X^2 \overset{\text{cond. var.}}{=} E_\Theta\{\underbrace{\mathrm{var}_{X|\Theta}(X\,|\,\Theta)}_{\text{var of the binomial in (24)}}\} + \mathrm{var}_\Theta\{\underbrace{E_{X|\Theta}[X\,|\,\Theta]}_{\text{mean of the binomial in (24)}}\}$
$\quad = E_\Theta[N\,\Theta\,(1 - \Theta)] + \mathrm{var}_\Theta(N\,\Theta)$
$\quad = N\,E_\Theta[\Theta\,(1 - \Theta)] + N^2\,\mathrm{var}_\Theta(\Theta)$
$\quad = N\,\big(\tfrac{1}{2} - \tfrac{1}{3}\big) + N^2\,\tfrac{1}{12} = \dfrac{N\,(N + 2)}{12}$

$\mathrm{cov}_{\Theta,X}(\Theta, X) = E_{\Theta,X}(\Theta X) - \mu_\Theta\,\mu_X \overset{\text{iterated exp.}}{=} E_\Theta\{\Theta\,E_{X|\Theta}[X\,|\,\Theta]\} - \tfrac{N}{4}$
$\quad = E_\Theta(\Theta\cdot\underbrace{N\,\Theta}_{\text{mean of the binomial in (24)}}) - \tfrac{N}{4} = \tfrac{N}{3} - \tfrac{N}{4} = \tfrac{N}{12}.$
Now,

$\hat\theta(X) = \dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_X^2}\,(X - \mu_X) + \mu_\Theta = \dfrac{N/12}{N\,(N + 2)/12}\,\Big(X - \tfrac{1}{2}\,N\Big) + \tfrac{1}{2} = \dfrac{X + 1}{N + 2}.$

In this example, the MMSE and LMMSE estimates of $\Theta$ are the same.
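A Monte Carlo sketch of this coincidence (illustration only; the Python/numpy setting and sample size are assumptions): computing the LMMSE coefficients from the joint moments of $(\Theta, X)$ reproduces the coefficients of $(X + 1)/(N + 2)$.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 10
theta = rng.uniform(0.0, 1.0, size=500_000)
x = rng.binomial(N, theta)

a = np.cov(theta, x, bias=True)[0, 1] / x.var()   # cov(Theta, X) / sigma_X^2
b = theta.mean() - a * x.mean()                   # E(Theta) - a E(X)
print(a, b)    # both approx. 1/(N + 2) = 1/12, i.e. theta_hat = (X + 1)/(N + 2)
```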
Linear MMSE Estimation: the Vector Case (FIR Wiener Filter)

Consider the signal of interest $\Theta$, with prior knowledge described by the pdf

$f_\Theta(\theta)$

and an $N$-dimensional random vector $X$ representing the observations.

The MMSE estimate of $\Theta$ based on $X$ is the conditional expectation

$E_{\Theta|X}(\Theta\,|\,X)$

which may be difficult to find in practice, since it requires knowledge of the joint distribution of $\Theta$ and $X$.

The linear MMSE estimate of $\Theta$ based on $X$ is easier to find, since it depends only on the means, variances, and covariances of the random variables and vectors involved.
Linear MMSE Estimation via the Orthogonality Principle

We wish to find an $N \times 1$ vector $a$ and a constant $b$ such that

$\hat\theta(X) = a^T X + b = \sum_{n=0}^{N-1} a_n\,X[n] + b$

minimizes the BMSE

$\mathrm{BMSE} = E_{\Theta,X}\{[\Theta - \hat\theta(X)]^2\}$

where

$X = [X[0], \ldots, X[N - 1]]^T.$

Suppose first that the constant vector $a$ has already been chosen. Then, choosing the constant $b$ to minimize the BMSE

$\mathrm{BMSE} = E_{\Theta,X}[(\Theta - a^T X - b)^2]$

is equivalent to the problem of Lemma 1 with $\Theta - a^T X$ in place of $\Theta$; the optimal $b$ is therefore the mean of $\Theta - a^T X$, i.e.

$b = E_{\Theta,X}(\Theta - a^T X) = E_\Theta(\Theta) - a^T E_X(X). \qquad (25)$
We view $\Theta, X[0], \ldots, X[N - 1]$ as vectors in an inner-product space. The linear MMSE estimation problem can be cast into our geometric framework after substituting the optimal $b$ in (25) into $\mathrm{BMSE} = E_{\Theta,X}\{[\Theta - \hat\theta(X)]^2\}$, yielding

$\mathrm{var}_{\Theta,X}(\Theta - a^T X) = \|\Theta - a^T X\|^2. \qquad (26)$

We minimize this variance with respect to $a$.

Clearly, $\|\Theta - a^T X\|^2$ is minimized if $a$ is chosen to satisfy the orthogonality principle:

$(\Theta - a^T X) \perp \text{subspace } \mathcal{V}_N \text{ spanned by } X[0], X[1], \ldots, X[N - 1]$
or, equivalently,

$\mathrm{cov}_{\Theta,X}(\Theta - a^T X,\ X[n]) = 0, \qquad n = 0, 1, \ldots, N - 1 \qquad (27)$

which gives the following set of equations:

$\mathrm{cov}_{\Theta,X[n]}(\Theta, X[n]) - \mathrm{cov}_X\Big(\sum_{l=0}^{N-1} a_l\,X[l],\ X[n]\Big) = 0$

or

$\sum_{l=0}^{N-1} \mathrm{cov}_{X[n],X[l]}(X[n], X[l])\,a_l = \mathrm{cov}_{X[n],\Theta}(X[n], \Theta). \qquad (28)$

Define the crosscovariance vector between $X$ and $\Theta$ and the covariance matrix of $X$ as

$\Sigma_{X,\Theta} = \mathrm{cov}_{X,\Theta}(X, \Theta) = \big[\mathrm{cov}_{X[0],\Theta}(X[0], \Theta),\ \mathrm{cov}_{X[1],\Theta}(X[1], \Theta),\ \ldots,\ \mathrm{cov}_{X[N-1],\Theta}(X[N - 1], \Theta)\big]^T$

and

$\Sigma_X = \mathrm{cov}_X(X)$

and use these definitions to compactly write (28):

$\Sigma_X\,a = \Sigma_{X,\Theta}.$
If $\Sigma_X$ is a positive definite matrix, we can solve for $a$:

$a_{\rm opt} = \Sigma_X^{-1}\,\Sigma_{X,\Theta} \qquad (29)$

and, finally, the LMMSE estimate of $\Theta$ is [using (25)]

$\hat\theta(X) = a_{\rm opt}^T X + E_\Theta(\Theta) - a_{\rm opt}^T E_X(X) = \underbrace{\Sigma_{X,\Theta}^T\,\Sigma_X^{-1}}_{a_{\rm opt}^T}\,[X - E_X(X)] + E_\Theta(\Theta). \qquad (30)$

Compare this result to the scalar case in (13):

$\hat\theta(X) = \underbrace{\dfrac{\mathrm{cov}_{\Theta,X}(\Theta, X)}{\sigma_X^2}}_{a_{\rm opt}}\,[X - E_X(X)] + E_\Theta(\Theta).$

We now find the minimum BMSE of our LMMSE estimator: substitute (29) into (26), yielding

$\mathrm{MBMSE}_{\rm linear} = \mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}^T X,\ \Theta - a_{\rm opt}^T X)$
$\quad = \mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}^T X,\ \Theta) - \mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}^T X,\ a_{\rm opt}^T X)$
$\quad = \mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}^T X,\ \Theta) - \underbrace{\mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}^T X,\ X)}_{=\,0,\ \text{see (27)}}\,a_{\rm opt}$
$\quad = \mathrm{cov}_{\Theta,X}(\Theta - a_{\rm opt}^T X,\ \Theta) \qquad (31)$
which can also be written as

$\mathrm{MBMSE}_{\rm linear} = \mathrm{cov}_{\Theta,X}(\Theta - \hat\theta(X),\ \Theta) \qquad (32)$

and further simplified:

$\mathrm{MBMSE}_{\rm linear} = \sigma_\Theta^2 - a_{\rm opt}^T\,\underbrace{\mathrm{cov}_{X,\Theta}(X, \Theta)}_{\Sigma_{X,\Theta},\ \text{see (29)}} = \sigma_\Theta^2 - \Sigma_{X,\Theta}^T\,\Sigma_X^{-1}\,\Sigma_{X,\Theta}. \qquad (33)$

Compare this result to the scalar case in (15):

$\mathrm{MBMSE}_{\rm linear} = \sigma_\Theta^2 - \dfrac{\mathrm{cov}^2_{\Theta,X}(\Theta, X)}{\sigma_X^2}.$
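A generic numpy sketch of (29), (30), and (33) (the moment values below are made up for illustration; none of the numbers come from the notes):

```python
import numpy as np

def lmmse(x, mean_x, mean_theta, Sigma_x, Sigma_x_theta, var_theta):
    """LMMSE estimate of a scalar Theta from the vector x, plus the minimum BMSE."""
    a_opt = np.linalg.solve(Sigma_x, Sigma_x_theta)     # (29), without forming the inverse
    theta_hat = mean_theta + a_opt @ (x - mean_x)       # (30)
    mbmse = var_theta - Sigma_x_theta @ a_opt           # (33)
    return theta_hat, mbmse

# Example with arbitrary (assumed) moments:
Sigma_x = np.array([[2.0, 0.5], [0.5, 1.0]])
Sigma_x_theta = np.array([0.8, 0.3])
est, mbmse = lmmse(np.array([1.0, -0.5]), np.zeros(2), 0.0,
                   Sigma_x, Sigma_x_theta, 1.0)
print(est, mbmse)
```

Note that only first and second moments enter the computation, which is the whole point of LMMSE estimation.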
Example: Additive Noise Channel

Again:

$\Theta \sim N(\mu_\Theta, \sigma_\Theta^2)$

where $\mu_\Theta$ and $\sigma_\Theta^2$ are known hyperparameters. We collect multiple observations $X[n]$, modeled as

$X[n] = \Theta + W[n], \qquad n = 0, 1, \ldots, N - 1$

where the $W[n]$ are zero-mean uncorrelated RVs with known variance $\sigma^2$. We also know that $\Theta$ and $W[n]$ are uncorrelated for all $n$. Find the LMMSE estimate of $\Theta$ based on

$X = [X[0], \ldots, X[N - 1]]^T.$

Find also the minimum BMSE.

By the orthogonality principle (27), we have:

$\mathrm{cov}_{\Theta,X[n]}(\Theta, X[n]) - \mathrm{cov}_X\Big(\sum_{l=0}^{N-1} a_l\,X[l],\ X[n]\Big) = 0$
for $n = 0, 1, \ldots, N - 1$. Here,

$\mathrm{cov}_{\Theta,X[n]}(\Theta, X[n]) = \mathrm{cov}_{\Theta,W[n]}(\Theta,\ \Theta + W[n]) = \mathrm{cov}_\Theta(\Theta, \Theta) + \mathrm{cov}_{\Theta,W[n]}(\Theta, W[n]) = \mathrm{var}_\Theta(\Theta) = \sigma_\Theta^2 \qquad (34)$

$\mathrm{cov}_{X[l],X[n]}(X[l], X[n]) = \mathrm{cov}_{\Theta,W}(\Theta + W[l],\ \Theta + W[n]) = \begin{cases} \sigma_\Theta^2, & l \ne n \\ \sigma_\Theta^2 + \sigma^2, & l = n \end{cases} \qquad (35)$

and, therefore,

$\sigma_\Theta^2 = (\sigma_\Theta^2 + \sigma^2)\,a_0 + \sigma_\Theta^2\,a_1 + \ldots + \sigma_\Theta^2\,a_{N-1}$
$\sigma_\Theta^2 = \sigma_\Theta^2\,a_0 + (\sigma_\Theta^2 + \sigma^2)\,a_1 + \ldots + \sigma_\Theta^2\,a_{N-1}$
$\quad\vdots$
$\sigma_\Theta^2 = \sigma_\Theta^2\,a_0 + \sigma_\Theta^2\,a_1 + \ldots + (\sigma_\Theta^2 + \sigma^2)\,a_{N-1}.$

Now, by symmetry,

$a_{{\rm opt},0} = a_{{\rm opt},1} = \ldots = a_{{\rm opt},N-1} = \dfrac{\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2}$
yielding

$\hat\theta(X) = \dfrac{\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2}\,\sum_{n=0}^{N-1}(X[n] - \mu_\Theta) + \mu_\Theta$
$\quad = \dfrac{\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2}\,\Big(\sum_{n=0}^{N-1} X[n]\Big) + \dfrac{\sigma^2}{N\,\sigma_\Theta^2 + \sigma^2}\,\mu_\Theta \qquad (36)$
$\quad = \dfrac{\tfrac{N}{\sigma^2}\,\bar{X} + \tfrac{1}{\sigma_\Theta^2}\,\mu_\Theta}{\tfrac{N}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}}$

where

$\bar{X} = \dfrac{1}{N}\,\sum_{n=0}^{N-1} X[n].$

The minimum average MSE of our LMMSE estimator follows
by using (31):

$\mathrm{MBMSE}_{\rm linear} = \mathrm{cov}_{\Theta,X}(\Theta - \hat\theta(X),\ \Theta)$
$\quad = \sigma_\Theta^2 - \mathrm{cov}_{\Theta,X}(\hat\theta(X),\ \Theta)$
$\quad \overset{\text{see (36)}}{=} \sigma_\Theta^2 - \mathrm{cov}_{\Theta,X}\Big(\dfrac{\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2}\,\Big(\sum_{n=0}^{N-1} X[n]\Big),\ \Theta\Big)$
$\quad \overset{\text{see (34)}}{=} \sigma_\Theta^2 - \dfrac{\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2}\,\sum_{n=0}^{N-1}\mathrm{cov}_{X[n],\Theta}(X[n], \Theta)$
$\quad = \sigma_\Theta^2 - \dfrac{\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2}\,N\,\sigma_\Theta^2 = \dfrac{\sigma^2\,\sigma_\Theta^2}{N\,\sigma_\Theta^2 + \sigma^2} = \Big(\tfrac{N}{\sigma^2} + \tfrac{1}{\sigma_\Theta^2}\Big)^{-1}$
which is the same as the posterior variance in (15) of handout # 4.
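A quick numerical confirmation of this example (illustration only; the parameter values are arbitrary assumptions), built on the general solution (29):

```python
import numpy as np

var_theta, sigma2, N = 2.0, 0.5, 8

Sigma_x = var_theta * np.ones((N, N)) + sigma2 * np.eye(N)   # (35)
Sigma_x_theta = var_theta * np.ones(N)                        # (34)
a_opt = np.linalg.solve(Sigma_x, Sigma_x_theta)

print(a_opt)                                  # every entry = var_theta / (N var_theta + sigma2)
print(var_theta - Sigma_x_theta @ a_opt)      # MBMSE_linear
print(1.0 / (N / sigma2 + 1.0 / var_theta))   # (N/sigma^2 + 1/sigma_theta^2)^(-1), same value
```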