
Parameter Estimation

Saravanan Vijayakumaran
sarva@ee.iitb.ac.in

Department of Electrical Engineering


Indian Institute of Technology Bombay

October 25, 2012

Motivation
System Model used to Derive Optimal Receivers

s(t) → Channel → y(t)

y(t) = s(t) + n(t)

s(t): Transmitted Signal
y(t): Received Signal
n(t): Noise
Simplified System Model. Does Not Account For
Propagation Delay
Carrier Frequency Mismatch Between Transmitter and Receiver
Clock Frequency Mismatch Between Transmitter and Receiver
In short, Lies! Why?
You want answers?

I want the truth!

You can't handle the truth!

... right at the beginning of the course. Now you can.
Why Study the Simplified System Model?

s(t) → Channel → y(t)

y(t) = s(t) + n(t)

Receivers estimate propagation delay, carrier frequency and clock frequency before demodulation
Once these unknown parameters are estimated, the simplified system model is valid
Then why not study parameter estimation first?
Hypothesis testing is easier to learn than parameter estimation
Historical reasons
Unsimplifying the System Model
Effect of Propagation Delay
Consider a complex baseband signal

    s(t) = ∑_{n=-∞}^{∞} b_n p(t - nT)

and the corresponding passband signal

    s_p(t) = Re[√2 s(t) e^{j2πf_c t}].

After passing through a noisy channel which causes amplitude scaling and delay, we have

    y_p(t) = A s_p(t - τ) + n_p(t)

where A is an unknown amplitude, τ is an unknown delay and n_p(t) is passband noise
Unsimplifying the System Model
Effect of Propagation Delay
The delayed passband signal is

    s_p(t - τ) = Re[√2 s(t - τ) e^{j2πf_c(t - τ)}]
               = Re[√2 s(t - τ) e^{jθ} e^{j2πf_c t}]

where θ = -2πf_c τ mod 2π. For large f_c, θ is modeled as uniformly distributed over [0, 2π].
The complex baseband representation of the received signal is then

    y(t) = A e^{jθ} s(t - τ) + n(t)

where n(t) is complex Gaussian noise.

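As a concrete illustration (my own sketch, not from the slides; the rectangular pulse, QPSK symbols, sampling rate and parameter values are all arbitrary choices), the following Python snippet generates discrete-time samples of y(t) = A e^{jθ} s(t - τ) + n(t):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (all values below are arbitrary choices, not from the slides)
T = 1.0                                # symbol period
fs = 8 / T                             # sampling rate: 8 samples per symbol
N_sym = 16                             # number of symbols
A, theta, tau = 0.7, 1.2, 0.3 * T      # "unknown" amplitude, phase and delay
sigma = 0.1                            # noise standard deviation per component

t = np.arange(0, (N_sym + 2) * T, 1 / fs)
b = rng.choice(np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2), size=N_sym)

def s(t):
    """Complex baseband signal s(t) = sum_n b_n p(t - nT) with a rectangular pulse p."""
    out = np.zeros_like(t, dtype=complex)
    for n, bn in enumerate(b):
        out += bn * ((t >= n * T) & (t < (n + 1) * T))
    return out

# Received complex baseband signal y(t) = A e^{j theta} s(t - tau) + n(t)
noise = sigma * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))
y = A * np.exp(1j * theta) * s(t - tau) + noise
```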
Unsimplifying the System Model
Effect of Carrier Offset
Frequency of the local oscillator (LO) at the receiver differs from that of the transmitter
Suppose the LO frequency at the transmitter is f_c

    s_p(t) = Re[√2 s(t) e^{j2πf_c t}].

Suppose that the LO frequency at the receiver is f_c - Δf
The received passband signal is

    y_p(t) = A s_p(t - τ) + n_p(t)

The complex baseband representation of the received signal is then

    y(t) = A e^{j(2πΔf t + θ)} s(t - τ) + n(t)
Unsimplifying the System Model
Effect of Clock Offset
Frequency of the clock at the receiver differs from that of the transmitter
The clock frequency determines the sampling instants at the matched filter output
Suppose the symbol rate at the transmitter is 1/T symbols per second
Suppose the receiver sampling rate is (1+δ)/T symbols per second, where |δ| ≪ 1 and δ may be positive or negative
The actual sampling instants and ideal sampling instants will drift apart over time (after k symbols the accumulated drift is roughly kδT, so even a tiny δ eventually shifts the sampling instant by a full symbol period)

The Solution
Estimate the unknown parameters τ, θ, Δf and δ
Timing Synchronization: Estimation of τ
Carrier Synchronization: Estimation of θ and Δf
Clock Synchronization: Estimation of δ
Perform demodulation after synchronization

Parameter Estimation
Parameter Estimation
Hypothesis testing was about making a choice between discrete states of nature
Parameter or point estimation is about choosing from a continuum of possible states

Example
Consider the complex baseband signal below

    y(t) = A e^{jθ} s(t - τ) + n(t)

The phase θ can take any real value in the interval [0, 2π)
The amplitude A can be any real number
The delay τ can be any real number

System Model for Parameter Estimation
Consider a family of distributions

    Y ∼ P_θ,   θ ∈ Λ

where the observation vector Y ∈ R^n for n ∈ ℕ, and Λ ⊆ R^m is the parameter space
Example:

    Y = A + N

where A is an unknown parameter and N is a standard Gaussian RV
The goal of parameter estimation is to find θ given Y
An estimator is a function from the observation space to the parameter space

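A minimal sketch of this setup (my own illustration, not from the slides; the value of A is arbitrary): the observation Y = A + N is generated once, and an estimator is literally just a function from the observation space to the parameter space.

```python
import numpy as np

rng = np.random.default_rng(1)

A_true = 2.5                          # the unknown parameter (value chosen for illustration)
Y = A_true + rng.standard_normal()    # observation Y = A + N with N ~ N(0, 1)

def estimator(y):
    """An estimator: a function from the observation space to the parameter space."""
    return y                          # with a single observation, simply report Y itself

print(estimator(Y))                   # an estimate of A
```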
Which is the Optimal Estimator?
Assume there is a cost function C which quantifies the estimation error

    C : Λ × Λ → R

such that C[a, θ] is the cost of estimating the true value of θ as a
Examples of cost functions
Squared Error: C[a, θ] = (a - θ)^2
Absolute Error: C[a, θ] = |a - θ|
Threshold Error: C[a, θ] = 0 if |a - θ| ≤ Δ,  1 if |a - θ| > Δ

Which is the Optimal Estimator?
With an estimator θ̂ we associate a conditional cost or risk conditioned on θ

    R_θ(θ̂) = E { C[θ̂(Y), θ] }

Suppose that the parameter θ is the realization of a random variable Θ
The average risk or Bayes risk is given by

    r(θ̂) = E { R_Θ(θ̂) }

The optimal estimator is the one which minimizes the Bayes risk

Which is the Optimal Estimator?
Given that

    R_θ(θ̂) = E { C[θ̂(Y), θ] } = E { C[θ̂(Y), Θ] | Θ = θ }

the average risk or Bayes risk is given by

    r(θ̂) = E { C[θ̂(Y), Θ] }
          = E { E [ C[θ̂(Y), Θ] | Y ] }

The optimal estimate for θ can be found by minimizing for each Y = y the posterior cost

    E [ C[θ̂(y), Θ] | Y = y ]

Minimum-Mean-Squared-Error (MMSE) Estimation
C[a, θ] = (a - θ)^2
The posterior cost is given by

    E [ (θ̂(y) - Θ)^2 | Y = y ] = θ̂^2(y) - 2 θ̂(y) E[Θ | Y = y] + E[Θ^2 | Y = y]

The Bayes estimate is given by

    θ̂_MMSE(y) = E[Θ | Y = y]

Example 1: MMSE Estimation
Suppose X and Y are jointly Gaussian random variables
Let the joint pdf be given by

    p_XY(x, y) = (1 / (2π |Σ|^{1/2})) exp( -(1/2) (s - μ)^T Σ^{-1} (s - μ) )

where s = [x  y]^T, μ = [μ_x  μ_y]^T and Σ = [[σ_x^2, ρσ_xσ_y], [ρσ_xσ_y, σ_y^2]]
Suppose Y is observed and we want to estimate X
The MMSE estimate of X is

    X̂_MMSE(y) = E[ X | Y = y ]
Example 1: MMSE Estimation
The conditional distribution of X given Y = y is a Gaussian RV with mean

    μ_{X|y} = μ_x + ρ (σ_x / σ_y) (y - μ_y)

and variance

    σ^2_{X|y} = (1 - ρ^2) σ_x^2

Thus the MMSE estimate of X given Y = y is

    X̂_MMSE(y) = μ_x + ρ (σ_x / σ_y) (y - μ_y)

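A quick numerical sanity check of this formula (my own sketch; the means, variances and correlation below are arbitrary choices): draw many jointly Gaussian (X, Y) pairs and compare the closed-form estimate at a fixed y with the empirical mean of X over samples whose Y falls near that y.

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary illustrative parameters
mu_x, mu_y = 1.0, -2.0
sig_x, sig_y, rho = 2.0, 1.5, 0.6

cov = np.array([[sig_x**2,            rho * sig_x * sig_y],
                [rho * sig_x * sig_y, sig_y**2           ]])
X, Y = rng.multivariate_normal([mu_x, mu_y], cov, size=500_000).T

y0 = 0.0
mmse_closed_form = mu_x + rho * (sig_x / sig_y) * (y0 - mu_y)
mmse_empirical = X[np.abs(Y - y0) < 0.05].mean()    # empirical E[X | Y close to y0]

print(mmse_closed_form, mmse_empirical)             # the two values should be close
```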
Example 2: MMSE Estimation
Suppose A is a Gaussian RV with mean μ and known variance v^2
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i = A + N_i

where the N_i's are independent Gaussian RVs with mean 0 and known variance σ^2
Suppose A is independent of the N_i's
The MMSE estimate is given by

    Â_MMSE(y) = ( (Mv^2/σ^2) Â_1(y) + μ ) / ( Mv^2/σ^2 + 1 )

where Â_1(y) = (1/M) ∑_{i=1}^{M} y_i

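The following sketch (my own check; the values of μ, v², σ² and M are arbitrary) verifies this formula by Monte Carlo: the MMSE estimate shrinks the sample mean Â_1(y) toward the prior mean μ and attains a smaller mean squared error than the sample mean alone.

```python
import numpy as np

rng = np.random.default_rng(3)

mu, v, sigma, M, trials = 1.0, 0.5, 2.0, 10, 200_000   # arbitrary illustrative values

A = mu + v * rng.standard_normal(trials)                # A ~ N(mu, v^2)
Y = A[:, None] + sigma * rng.standard_normal((trials, M))

A1 = Y.mean(axis=1)                                     # sample mean A_1(y)
k = M * v**2 / sigma**2
A_mmse = (k * A1 + mu) / (k + 1)                        # MMSE estimate from the slide

print(np.mean((A_mmse - A)**2), np.mean((A1 - A)**2))   # MMSE error is the smaller one
```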
Minimum-Mean-Absolute-Error (MMAE) Estimation
C[a, θ] = |a - θ|
The Bayes estimate θ̂_ABS is given by the median of the posterior density p(θ | Y = y):

    Pr[Θ < t | Y = y] ≤ Pr[Θ > t | Y = y],   t < θ̂_ABS(y)
    Pr[Θ < t | Y = y] ≥ Pr[Θ > t | Y = y],   t > θ̂_ABS(y)

[Figure: posterior density p(θ | Y = y) with the areas Pr[Θ < t | Y = y] and Pr[Θ > t | Y = y] shown on either side of a point t, and θ̂_ABS(y) marked.]

Minimum-Mean-Absolute-Error (MMAE) Estimation
For Pr[X ≥ 0] = 1, E[X] = ∫_0^∞ Pr[X > x] dx
Since |θ̂(y) - Θ| ≥ 0,

    E[ |θ̂(y) - Θ| | Y = y ]
      = ∫_0^∞ Pr[ |θ̂(y) - Θ| > x | Y = y ] dx
      = ∫_0^∞ Pr[ Θ > x + θ̂(y) | Y = y ] dx + ∫_0^∞ Pr[ Θ < θ̂(y) - x | Y = y ] dx
      = ∫_{θ̂(y)}^∞ Pr[ Θ > t | Y = y ] dt + ∫_{-∞}^{θ̂(y)} Pr[ Θ < t | Y = y ] dt
Minimum-Mean-Absolute-Error (MMAE) Estimation
Differentiating E[ |θ̂(y) - Θ| | Y = y ] with respect to θ̂(y),

    ∂/∂θ̂(y) E[ |θ̂(y) - Θ| | Y = y ]
      = ∂/∂θ̂(y) [ ∫_{θ̂(y)}^∞ Pr[ Θ > t | Y = y ] dt + ∫_{-∞}^{θ̂(y)} Pr[ Θ < t | Y = y ] dt ]
      = Pr[ Θ < θ̂(y) | Y = y ] - Pr[ Θ > θ̂(y) | Y = y ]

The derivative is nondecreasing, tending to -1 as θ̂(y) → -∞ and to +1 as θ̂(y) → +∞
The minimum risk is achieved at the point where the derivative changes sign
Minimum-Mean-Absolute-Error (MMAE) Estimation
Thus the MMAE estimate θ̂_ABS is given by any value such that

    Pr[Θ < t | Y = y] ≤ Pr[Θ > t | Y = y],   t < θ̂_ABS(y)
    Pr[Θ < t | Y = y] ≥ Pr[Θ > t | Y = y],   t > θ̂_ABS(y)

Why not the following expression?

    Pr[Θ < θ̂_ABS(y) | Y = y] = Pr[Θ ≥ θ̂_ABS(y) | Y = y]

Why not the following expression?

    Pr[Θ < θ̂_ABS(y) | Y = y] = Pr[Θ > θ̂_ABS(y) | Y = y]

MMAE estimation for discrete distributions requires the more general expression above
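As an illustration of the median characterization (my own sketch; the exponential prior and all numbers are arbitrary), the posterior of a scalar Θ can be evaluated on a grid and the MMAE estimate read off as the point where the posterior CDF crosses 1/2.

```python
import numpy as np

# Arbitrary illustrative model: Theta ~ Exp(1) prior, Y | Theta = theta ~ N(theta, 1)
y = 0.8
theta = np.linspace(0.0, 10.0, 20_001)
d = theta[1] - theta[0]                                  # grid spacing

post = np.exp(-theta) * np.exp(-0.5 * (y - theta) ** 2)  # unnormalized posterior
post /= post.sum() * d                                   # normalize on the grid

cdf = np.cumsum(post) * d
theta_mmae = theta[np.searchsorted(cdf, 0.5)]            # posterior median
theta_mmse = (theta * post).sum() * d                    # posterior mean, for comparison

print(theta_mmae, theta_mmse)   # they differ because the posterior is asymmetric
```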
Maximum A Posteriori (MAP) Estimation
The MAP estimator is given by

    θ̂_MAP(y) = argmax_θ p(θ | Y = y)

It can be obtained as the optimal estimator for the threshold cost function

    C[a, θ] = 0 if |a - θ| ≤ Δ,  1 if |a - θ| > Δ

for small Δ > 0

Maximum A Posteriori (MAP) Estimation
For the threshold cost function, we have¹

    E[ C[θ̂(y), Θ] | Y = y ]
      = ∫_{-∞}^{∞} C[θ̂(y), θ] p(θ | Y = y) dθ
      = ∫_{-∞}^{θ̂(y)-Δ} p(θ | Y = y) dθ + ∫_{θ̂(y)+Δ}^{∞} p(θ | Y = y) dθ
      = ∫_{-∞}^{∞} p(θ | Y = y) dθ - ∫_{θ̂(y)-Δ}^{θ̂(y)+Δ} p(θ | Y = y) dθ
      = 1 - ∫_{θ̂(y)-Δ}^{θ̂(y)+Δ} p(θ | Y = y) dθ

The Bayes estimate is obtained by maximizing the integral in the last equality

¹ Assume a scalar parameter θ for illustration
Maximum A Posteriori (MAP) Estimation
[Figure: posterior density p(θ | Y = y) with the area over the interval [θ̂(y) - Δ, θ̂(y) + Δ] shaded.]

The shaded area is the integral ∫_{θ̂(y)-Δ}^{θ̂(y)+Δ} p(θ | Y = y) dθ
To maximize this integral, the location of θ̂(y) should be chosen to be the value of θ which maximizes p(θ | Y = y)

Maximum A Posteriori (MAP) Estimation
[Figure: posterior density p(θ | Y = y) with the shaded interval of width 2Δ centered at θ̂_MAP(y), the location of the maximum.]

This argument is not airtight as p(θ | Y = y) may not be symmetric at the maximum
But the MAP estimator is widely used as it is easier to compute than the MMSE or MMAE estimators

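Continuing the grid-based sketch from the MMAE section (same arbitrary exponential-prior model, my own illustration), the MAP estimate is just the grid point where the posterior is largest; the normalizing constant is not needed, which is one reason MAP is often the cheapest of the three to compute.

```python
import numpy as np

# Same arbitrary model as before: Theta ~ Exp(1), Y | Theta = theta ~ N(theta, 1)
y = 0.8
theta = np.linspace(0.0, 10.0, 20_001)
post = np.exp(-theta) * np.exp(-0.5 * (y - theta) ** 2)   # unnormalized posterior

theta_map = theta[np.argmax(post)]                         # argmax of p(theta | Y = y)
print(theta_map)   # normalization is not needed to locate the maximum
```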
Maximum Likelihood (ML) Estimation
The ML estimator is given by

    θ̂_ML(y) = argmax_θ p(y | θ)

It is the same as the MAP estimator when the prior probability distribution of Θ is uniform
It is also used when the prior distribution is not known

Example 1: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ N(μ, σ^2)

where the Y_i's are independent, μ is unknown and σ^2 is known
The ML estimate is given by

    μ̂_ML(y) = (1/M) ∑_{i=1}^{M} y_i

Assignment 5

Example 2: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ N(μ, σ^2)

where the Y_i's are independent, and both μ and σ^2 are unknown
The ML estimates are given by

    μ̂_ML(y) = (1/M) ∑_{i=1}^{M} y_i
    σ̂^2_ML(y) = (1/M) ∑_{i=1}^{M} (y_i - μ̂_ML(y))^2

Assignment 5

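A quick numerical check of Examples 1 and 2 (a sketch with arbitrary μ, σ² and M; it is not the assignment solution): the sample mean and the 1/M-normalized sample variance of a batch of Gaussian observations approach the true parameters as M grows.

```python
import numpy as np

rng = np.random.default_rng(4)

mu, sigma, M = 3.0, 2.0, 100_000          # arbitrary illustrative values
y = mu + sigma * rng.standard_normal(M)   # Y_i ~ N(mu, sigma^2), independent

mu_ml = y.mean()                          # ML estimate of the mean
var_ml = np.mean((y - mu_ml)**2)          # ML estimate of the variance (1/M, not 1/(M-1))

print(mu_ml, var_ml)                      # close to 3.0 and 4.0 for large M
```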
Example 3: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ Bernoulli(p)

where the Y_i's are independent and p is unknown
The ML estimate of p is given by

    p̂_ML(y) = (1/M) ∑_{i=1}^{M} y_i

Assignment 5

Example 4: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ Uniform[0, θ]

where the Y_i's are independent and θ is unknown
The ML estimate of θ is given by

    θ̂_ML(y) = max(y_1, y_2, ..., y_{M-1}, y_M)

Assignment 5

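Similarly, a short sketch for Examples 3 and 4 (the values of p, θ and M are arbitrary): the ML estimates are the sample mean of the Bernoulli observations and the sample maximum of the uniform observations.

```python
import numpy as np

rng = np.random.default_rng(5)
M = 10_000                                              # arbitrary sample size

p_true = 0.3
p_ml = rng.binomial(1, p_true, size=M).mean()           # Example 3: sample mean of 0/1 data

theta_true = 4.0
theta_ml = rng.uniform(0.0, theta_true, size=M).max()   # Example 4: sample maximum

print(p_ml, theta_ml)                                   # close to 0.3 and just below 4.0
```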
Reference
Chapter 4, An Introduction to Signal Detection and Estimation, H. V. Poor, Second Edition, Springer-Verlag, 1994.

Parameter Estimation of Random Processes
ML Estimation Requires Conditional Densities
ML estimation involves maximizing the conditional density with respect to the unknown parameters
Example: Y ∼ N(μ, σ^2) where μ is known and σ^2 is unknown

    p(y | σ^2) = (1 / √(2πσ^2)) e^{-(y - μ)^2 / (2σ^2)}

Suppose the observation is the realization of a random process

    y(t) = A e^{jθ} s(t - τ) + n(t)

What is the conditional density of y(t) given A, θ and τ?

Maximizing Likelihood Ratio for ML Estimation
Consider Y ∼ N(μ, σ^2) where μ is unknown and σ^2 is known

    p(y | μ) = (1 / √(2πσ^2)) e^{-(y - μ)^2 / (2σ^2)}

Let q(y) be the density of a Gaussian with distribution N(0, σ^2)

    q(y) = (1 / √(2πσ^2)) e^{-y^2 / (2σ^2)}

The ML estimate of μ is obtained as

    μ̂_ML(y) = argmax_μ p(y | μ) = argmax_μ p(y | μ) / q(y) = argmax_μ L(y | μ)

where L(y | μ) is called the likelihood ratio


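A tiny check of this claim (my own sketch, arbitrary numbers): since q(y) does not depend on μ, maximizing p(y | μ) and maximizing L(y | μ) = p(y | μ)/q(y) over a grid of candidate means pick out the same estimate.

```python
import numpy as np

y, sigma = 1.7, 1.0                      # arbitrary observation and known noise std
mu = np.linspace(-5.0, 5.0, 10_001)      # grid of candidate means

p = np.exp(-0.5 * (y - mu)**2 / sigma**2) / np.sqrt(2 * np.pi * sigma**2)
q = np.exp(-0.5 * y**2 / sigma**2) / np.sqrt(2 * np.pi * sigma**2)   # N(0, sigma^2) density at y
L = p / q                                # likelihood ratio (q does not depend on mu)

print(mu[np.argmax(p)], mu[np.argmax(L)])   # both are (approximately) y
```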
Likelihood Ratio and Hypothesis Testing
The likelihood ratio L(y | μ) is the ML decision statistic for the following binary hypothesis testing problem

    H_1 : Y ∼ N(μ, σ^2)
    H_0 : Y ∼ N(0, σ^2)

where μ is assumed to be known
H_0 is a dummy hypothesis which makes calculation of the ML estimator easy for random processes

Likelihood Ratio of a Signal in AWGN
Let H_s(θ) be the hypothesis corresponding to the following received signal

    H_s(θ) : y(t) = s_θ(t) + n(t)

where θ can be a vector parameter
Define a noise-only dummy hypothesis H_0

    H_0 : y(t) = n(t)

Define Z and y^⊥(t) as follows

    Z = ⟨y, s_θ⟩
    y^⊥(t) = y(t) - ⟨y, s_θ⟩ s_θ(t) / ‖s_θ‖^2

Z and y^⊥(t) completely characterize y(t)

Likelihood Ratio of a Signal in AWGN
Under both hypotheses y^⊥(t) is equal to n^⊥(t) where

    n^⊥(t) = n(t) - ⟨n, s_θ⟩ s_θ(t) / ‖s_θ‖^2

n^⊥(t) is independent of the noise component in Z and has the same distribution under both hypotheses
n^⊥(t) is irrelevant for this binary hypothesis testing problem
The likelihood ratio of y(t) equals the likelihood ratio of Z under the following hypothesis testing problem

    H_s(θ) : Z ∼ N(‖s_θ‖^2, σ^2 ‖s_θ‖^2)
    H_0 : Z ∼ N(0, σ^2 ‖s_θ‖^2)

Likelihood Ratio of Signals in AWGN
The likelihood ratio of signals in real AWGN is

    L(y | s_θ) = exp( (1/σ^2) ( ⟨y, s_θ⟩ - ‖s_θ‖^2 / 2 ) )

The likelihood ratio of signals in complex AWGN is

    L(y | s_θ) = exp( (1/σ^2) ( Re(⟨y, s_θ⟩) - ‖s_θ‖^2 / 2 ) )

Maximizing these likelihood ratios as functions of θ results in the ML estimator

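For instance, in delay estimation with a known pulse, ‖s_τ‖² does not depend on τ, so maximizing the likelihood ratio over τ reduces to maximizing the correlation ⟨y, s_τ⟩. A discrete-time sketch of this (my own illustration; the pulse shape, delay grid and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)

fs = 100.0                                # samples per second (arbitrary)
t = np.arange(0, 2.0, 1 / fs)

def s(t, tau):
    """Known unit pulse delayed by tau (rectangular, 0.2 s long)."""
    return ((t >= tau) & (t < tau + 0.2)).astype(float)

tau_true, sigma = 0.63, 0.5
y = s(t, tau_true) + sigma * rng.standard_normal(t.size)   # y(t) = s(t - tau) + n(t)

taus = np.arange(0.0, 1.5, 1 / fs)                          # candidate delays
corr = np.array([np.dot(y, s(t, tau)) for tau in taus])     # <y, s_tau> for each candidate
tau_ml = taus[np.argmax(corr)]                              # ML delay estimate

print(tau_ml)   # close to 0.63
```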
Thanks for your attention
