
Parameter Estimation

Saravanan Vijayakumaran
sarva@ee.iitb.ac.in

Department of Electrical Engineering


Indian Institute of Technology Bombay

October 25, 2012

Motivation
System Model used to Derive Optimal Receivers

s(t) → Channel → y(t)

y(t) = s(t) + n(t)

s(t): Transmitted Signal
y(t): Received Signal
n(t): Noise
Simplified System Model. Does Not Account For
Propagation Delay
Carrier Frequency Mismatch Between Transmitter and Receiver
Clock Frequency Mismatch Between Transmitter and Receiver
In short, Lies! Why?
You want answers?

I want the truth!

You can't handle the truth!

... right at the beginning of the course. Now you can.
Why Study the Simplified System Model?

s(t) → Channel → y(t)

y(t) = s(t) + n(t)

Receivers estimate propagation delay, carrier frequency and clock frequency before demodulation
Once these unknown parameters are estimated, the simplified system model is valid
Then why not study parameter estimation first?
Hypothesis testing is easier to learn than parameter estimation
Historical reasons
Unsimplifying the System Model
Effect of Propagation Delay
Consider a complex baseband signal

    s(t) = ∑_{n=-∞}^{∞} b_n p(t - nT)

and the corresponding passband signal

    s_p(t) = Re[√2 s(t) e^{j2πf_c t}].

After passing through a noisy channel which causes amplitude scaling and delay, we have

    y_p(t) = A s_p(t - τ) + n_p(t)

where A is an unknown amplitude, τ is an unknown delay and n_p(t) is passband noise
Unsimplifying the System Model
Effect of Propagation Delay
The delayed passband signal is

    s_p(t - τ) = Re[√2 s(t - τ) e^{j2πf_c(t - τ)}]
               = Re[√2 s(t - τ) e^{jθ} e^{j2πf_c t}]

where θ = -2πf_c τ mod 2π. For large f_c, θ is modeled as uniformly distributed over [0, 2π].
The complex baseband representation of the received signal is then

    y(t) = A e^{jθ} s(t - τ) + n(t)

where n(t) is complex Gaussian noise.

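As a concrete illustration (my own sketch, not from the slides; the rectangular pulse, QPSK symbols, sampling rate and parameter values are all arbitrary choices), the following Python snippet generates discrete-time samples of y(t) = A e^{jθ} s(t - τ) + n(t):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (all values below are arbitrary choices, not from the slides)
T = 1.0                                # symbol period
fs = 8 / T                             # sampling rate: 8 samples per symbol
N_sym = 16                             # number of symbols
A, theta, tau = 0.7, 1.2, 0.3 * T      # "unknown" amplitude, phase and delay
sigma = 0.1                            # noise standard deviation per component

t = np.arange(0, (N_sym + 2) * T, 1 / fs)
b = rng.choice(np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2), size=N_sym)

def s(t):
    """Complex baseband signal s(t) = sum_n b_n p(t - nT) with a rectangular pulse p."""
    out = np.zeros_like(t, dtype=complex)
    for n, bn in enumerate(b):
        out += bn * ((t >= n * T) & (t < (n + 1) * T))
    return out

# Received complex baseband signal y(t) = A e^{j theta} s(t - tau) + n(t)
noise = sigma * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))
y = A * np.exp(1j * theta) * s(t - tau) + noise
```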
Unsimplifying the System Model
Effect of Carrier Offset
Frequency of the local oscillator (LO) at the receiver differs from that of the transmitter
Suppose the LO frequency at the transmitter is f_c

    s_p(t) = Re[√2 s(t) e^{j2πf_c t}].

Suppose that the LO frequency at the receiver is f_c - Δf
The received passband signal is

    y_p(t) = A s_p(t - τ) + n_p(t)

The complex baseband representation of the received signal is then

    y(t) = A e^{j(2πΔf t + θ)} s(t - τ) + n(t)
Unsimplifying the System Model
Effect of Clock Offset
Frequency of the clock at the receiver differs from that of the transmitter
The clock frequency determines the sampling instants at the matched filter output
Suppose the symbol rate at the transmitter is 1/T symbols per second
Suppose the receiver sampling rate is (1+δ)/T symbols per second, where |δ| ≪ 1 and δ may be positive or negative
The actual sampling instants and ideal sampling instants will drift apart over time (after k symbols the accumulated drift is roughly kδT, so even a tiny δ eventually shifts the sampling instant by a full symbol period)

The Solution
Estimate the unknown parameters τ, θ, Δf and δ
Timing Synchronization: Estimation of τ
Carrier Synchronization: Estimation of θ and Δf
Clock Synchronization: Estimation of δ
Perform demodulation after synchronization

Parameter Estimation
Parameter Estimation
Hypothesis testing was about making a choice between discrete states of nature
Parameter or point estimation is about choosing from a continuum of possible states

Example
Consider the complex baseband signal below

    y(t) = A e^{jθ} s(t - τ) + n(t)

The phase θ can take any real value in the interval [0, 2π)
The amplitude A can be any real number
The delay τ can be any real number

System Model for Parameter Estimation
Consider a family of distributions

    Y ∼ P_θ,   θ ∈ Λ

where the observation vector Y ∈ R^n for n ∈ ℕ, and Λ ⊆ R^m is the parameter space
Example:

    Y = A + N

where A is an unknown parameter and N is a standard Gaussian RV
The goal of parameter estimation is to find θ given Y
An estimator is a function from the observation space to the parameter space

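A minimal sketch of this setup (my own illustration, not from the slides; the value of A is arbitrary): the observation Y = A + N is generated once, and an estimator is literally just a function from the observation space to the parameter space.

```python
import numpy as np

rng = np.random.default_rng(1)

A_true = 2.5                          # the unknown parameter (value chosen for illustration)
Y = A_true + rng.standard_normal()    # observation Y = A + N with N ~ N(0, 1)

def estimator(y):
    """An estimator: a function from the observation space to the parameter space."""
    return y                          # with a single observation, simply report Y itself

print(estimator(Y))                   # an estimate of A
```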
Which is the Optimal Estimator?
Assume there is a cost function C which quantifies the estimation error

    C : Λ × Λ → R

such that C[a, θ] is the cost of estimating the true value of θ as a
Examples of cost functions
Squared Error: C[a, θ] = (a - θ)^2
Absolute Error: C[a, θ] = |a - θ|
Threshold Error: C[a, θ] = 0 if |a - θ| ≤ Δ,  1 if |a - θ| > Δ

Which is the Optimal Estimator?
With an estimator θ̂ we associate a conditional cost or risk conditioned on θ

    R_θ(θ̂) = E { C[θ̂(Y), θ] }

Suppose that the parameter θ is the realization of a random variable Θ
The average risk or Bayes risk is given by

    r(θ̂) = E { R_Θ(θ̂) }

The optimal estimator is the one which minimizes the Bayes risk

Which is the Optimal Estimator?
Given that

    R_θ(θ̂) = E { C[θ̂(Y), θ] } = E { C[θ̂(Y), Θ] | Θ = θ }

the average risk or Bayes risk is given by

    r(θ̂) = E { C[θ̂(Y), Θ] }
          = E { E [ C[θ̂(Y), Θ] | Y ] }

The optimal estimate for θ can be found by minimizing for each Y = y the posterior cost

    E [ C[θ̂(y), Θ] | Y = y ]

Minimum-Mean-Squared-Error (MMSE) Estimation
C[a, θ] = (a - θ)^2
The posterior cost is given by

    E [ (θ̂(y) - Θ)^2 | Y = y ] = θ̂^2(y) - 2 θ̂(y) E[Θ | Y = y] + E[Θ^2 | Y = y]

The Bayes estimate is given by

    θ̂_MMSE(y) = E[Θ | Y = y]

Example 1: MMSE Estimation
Suppose X and Y are jointly Gaussian random variables
Let the joint pdf be given by

    p_XY(x, y) = (1 / (2π |Σ|^{1/2})) exp( -(1/2) (s - μ)^T Σ^{-1} (s - μ) )

where s = [x  y]^T, μ = [μ_x  μ_y]^T and Σ = [[σ_x^2, ρσ_xσ_y], [ρσ_xσ_y, σ_y^2]]
Suppose Y is observed and we want to estimate X
The MMSE estimate of X is

    X̂_MMSE(y) = E[ X | Y = y ]
Example 1: MMSE Estimation
The conditional distribution of X given Y = y is a Gaussian RV with mean

    μ_{X|y} = μ_x + ρ (σ_x / σ_y) (y - μ_y)

and variance

    σ^2_{X|y} = (1 - ρ^2) σ_x^2

Thus the MMSE estimate of X given Y = y is

    X̂_MMSE(y) = μ_x + ρ (σ_x / σ_y) (y - μ_y)

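A quick numerical sanity check of this formula (my own sketch; the means, variances and correlation below are arbitrary choices): draw many jointly Gaussian (X, Y) pairs and compare the closed-form estimate at a fixed y with the empirical mean of X over samples whose Y falls near that y.

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary illustrative parameters
mu_x, mu_y = 1.0, -2.0
sig_x, sig_y, rho = 2.0, 1.5, 0.6

cov = np.array([[sig_x**2,            rho * sig_x * sig_y],
                [rho * sig_x * sig_y, sig_y**2           ]])
X, Y = rng.multivariate_normal([mu_x, mu_y], cov, size=500_000).T

y0 = 0.0
mmse_closed_form = mu_x + rho * (sig_x / sig_y) * (y0 - mu_y)
mmse_empirical = X[np.abs(Y - y0) < 0.05].mean()    # empirical E[X | Y close to y0]

print(mmse_closed_form, mmse_empirical)             # the two values should be close
```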
Example 2: MMSE Estimation
Suppose A is a Gaussian RV with mean μ and known variance v^2
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i = A + N_i

where the N_i's are independent Gaussian RVs with mean 0 and known variance σ^2
Suppose A is independent of the N_i's
The MMSE estimate is given by

    Â_MMSE(y) = ( (Mv^2/σ^2) Â_1(y) + μ ) / ( Mv^2/σ^2 + 1 )

where Â_1(y) = (1/M) ∑_{i=1}^{M} y_i

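The following sketch (my own check; the values of μ, v², σ² and M are arbitrary) verifies this formula by Monte Carlo: the MMSE estimate shrinks the sample mean Â_1(y) toward the prior mean μ and attains a smaller mean squared error than the sample mean alone.

```python
import numpy as np

rng = np.random.default_rng(3)

mu, v, sigma, M, trials = 1.0, 0.5, 2.0, 10, 200_000   # arbitrary illustrative values

A = mu + v * rng.standard_normal(trials)                # A ~ N(mu, v^2)
Y = A[:, None] + sigma * rng.standard_normal((trials, M))

A1 = Y.mean(axis=1)                                     # sample mean A_1(y)
k = M * v**2 / sigma**2
A_mmse = (k * A1 + mu) / (k + 1)                        # MMSE estimate from the slide

print(np.mean((A_mmse - A)**2), np.mean((A1 - A)**2))   # MMSE error is the smaller one
```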
Minimum-Mean-Absolute-Error (MMAE) Estimation
C[a, θ] = |a - θ|
The Bayes estimate θ̂_ABS is given by the median of the posterior density p(θ | Y = y):

    Pr[Θ < t | Y = y] ≤ Pr[Θ > t | Y = y],   t < θ̂_ABS(y)
    Pr[Θ < t | Y = y] ≥ Pr[Θ > t | Y = y],   t > θ̂_ABS(y)

[Figure: posterior density p(θ | Y = y) with the areas Pr[Θ < t | Y = y] and Pr[Θ > t | Y = y] shown on either side of a point t, and θ̂_ABS(y) marked.]

Minimum-Mean-Absolute-Error (MMAE) Estimation
For Pr[X ≥ 0] = 1, E[X] = ∫_0^∞ Pr[X > x] dx
Since |θ̂(y) - Θ| ≥ 0,

    E[ |θ̂(y) - Θ| | Y = y ]
      = ∫_0^∞ Pr[ |θ̂(y) - Θ| > x | Y = y ] dx
      = ∫_0^∞ Pr[ Θ > x + θ̂(y) | Y = y ] dx + ∫_0^∞ Pr[ Θ < θ̂(y) - x | Y = y ] dx
      = ∫_{θ̂(y)}^∞ Pr[ Θ > t | Y = y ] dt + ∫_{-∞}^{θ̂(y)} Pr[ Θ < t | Y = y ] dt
Minimum-Mean-Absolute-Error (MMAE) Estimation
Differentiating E[ |θ̂(y) - Θ| | Y = y ] with respect to θ̂(y),

    ∂/∂θ̂(y) E[ |θ̂(y) - Θ| | Y = y ]
      = ∂/∂θ̂(y) [ ∫_{θ̂(y)}^∞ Pr[ Θ > t | Y = y ] dt + ∫_{-∞}^{θ̂(y)} Pr[ Θ < t | Y = y ] dt ]
      = Pr[ Θ < θ̂(y) | Y = y ] - Pr[ Θ > θ̂(y) | Y = y ]

The derivative is nondecreasing, tending to -1 as θ̂(y) → -∞ and to +1 as θ̂(y) → +∞
The minimum risk is achieved at the point where the derivative changes sign
Minimum-Mean-Absolute-Error (MMAE) Estimation
Thus the MMAE estimate θ̂_ABS is given by any value such that

    Pr[Θ < t | Y = y] ≤ Pr[Θ > t | Y = y],   t < θ̂_ABS(y)
    Pr[Θ < t | Y = y] ≥ Pr[Θ > t | Y = y],   t > θ̂_ABS(y)

Why not the following expression?

    Pr[Θ < θ̂_ABS(y) | Y = y] = Pr[Θ ≥ θ̂_ABS(y) | Y = y]

Why not the following expression?

    Pr[Θ < θ̂_ABS(y) | Y = y] = Pr[Θ > θ̂_ABS(y) | Y = y]

MMAE estimation for discrete distributions requires the more general expression above
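As an illustration of the median characterization (my own sketch; the exponential prior and all numbers are arbitrary), the posterior of a scalar Θ can be evaluated on a grid and the MMAE estimate read off as the point where the posterior CDF crosses 1/2.

```python
import numpy as np

# Arbitrary illustrative model: Theta ~ Exp(1) prior, Y | Theta = theta ~ N(theta, 1)
y = 0.8
theta = np.linspace(0.0, 10.0, 20_001)
d = theta[1] - theta[0]                                  # grid spacing

post = np.exp(-theta) * np.exp(-0.5 * (y - theta) ** 2)  # unnormalized posterior
post /= post.sum() * d                                   # normalize on the grid

cdf = np.cumsum(post) * d
theta_mmae = theta[np.searchsorted(cdf, 0.5)]            # posterior median
theta_mmse = (theta * post).sum() * d                    # posterior mean, for comparison

print(theta_mmae, theta_mmse)   # they differ because the posterior is asymmetric
```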
Maximum A Posteriori (MAP) Estimation
The MAP estimator is given by

    θ̂_MAP(y) = argmax_θ p(θ | Y = y)

It can be obtained as the optimal estimator for the threshold cost function

    C[a, θ] = 0 if |a - θ| ≤ Δ,  1 if |a - θ| > Δ

for small Δ > 0

Maximum A Posteriori (MAP) Estimation
For the threshold cost function, we have¹

    E[ C[θ̂(y), Θ] | Y = y ]
      = ∫_{-∞}^{∞} C[θ̂(y), θ] p(θ | Y = y) dθ
      = ∫_{-∞}^{θ̂(y)-Δ} p(θ | Y = y) dθ + ∫_{θ̂(y)+Δ}^{∞} p(θ | Y = y) dθ
      = ∫_{-∞}^{∞} p(θ | Y = y) dθ - ∫_{θ̂(y)-Δ}^{θ̂(y)+Δ} p(θ | Y = y) dθ
      = 1 - ∫_{θ̂(y)-Δ}^{θ̂(y)+Δ} p(θ | Y = y) dθ

The Bayes estimate is obtained by maximizing the integral in the last equality

¹ Assume a scalar parameter θ for illustration
Maximum A Posteriori (MAP) Estimation
[Figure: posterior density p(θ | Y = y) with the area over the interval [θ̂(y) - Δ, θ̂(y) + Δ] shaded.]

The shaded area is the integral ∫_{θ̂(y)-Δ}^{θ̂(y)+Δ} p(θ | Y = y) dθ
To maximize this integral, the location of θ̂(y) should be chosen to be the value of θ which maximizes p(θ | Y = y)

Maximum A Posteriori (MAP) Estimation
[Figure: posterior density p(θ | Y = y) with the shaded interval of width 2Δ centered at θ̂_MAP(y), the location of the maximum.]

This argument is not airtight as p(θ | Y = y) may not be symmetric at the maximum
But the MAP estimator is widely used as it is easier to compute than the MMSE or MMAE estimators

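Continuing the grid-based sketch from the MMAE section (same arbitrary exponential-prior model, my own illustration), the MAP estimate is just the grid point where the posterior is largest; the normalizing constant is not needed, which is one reason MAP is often the cheapest of the three to compute.

```python
import numpy as np

# Same arbitrary model as before: Theta ~ Exp(1), Y | Theta = theta ~ N(theta, 1)
y = 0.8
theta = np.linspace(0.0, 10.0, 20_001)
post = np.exp(-theta) * np.exp(-0.5 * (y - theta) ** 2)   # unnormalized posterior

theta_map = theta[np.argmax(post)]                         # argmax of p(theta | Y = y)
print(theta_map)   # normalization is not needed to locate the maximum
```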
Maximum Likelihood (ML) Estimation
The ML estimator is given by

    θ̂_ML(y) = argmax_θ p(y | θ)

It is the same as the MAP estimator when the prior probability distribution of Θ is uniform
It is also used when the prior distribution is not known

Example 1: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ N(μ, σ^2)

where the Y_i's are independent, μ is unknown and σ^2 is known
The ML estimate is given by

    μ̂_ML(y) = (1/M) ∑_{i=1}^{M} y_i

Assignment 5

Example 2: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ N(μ, σ^2)

where the Y_i's are independent, and both μ and σ^2 are unknown
The ML estimates are given by

    μ̂_ML(y) = (1/M) ∑_{i=1}^{M} y_i
    σ̂^2_ML(y) = (1/M) ∑_{i=1}^{M} (y_i - μ̂_ML(y))^2

Assignment 5

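A quick numerical check of Examples 1 and 2 (a sketch with arbitrary μ, σ² and M; it is not the assignment solution): the sample mean and the 1/M-normalized sample variance of a batch of Gaussian observations approach the true parameters as M grows.

```python
import numpy as np

rng = np.random.default_rng(4)

mu, sigma, M = 3.0, 2.0, 100_000          # arbitrary illustrative values
y = mu + sigma * rng.standard_normal(M)   # Y_i ~ N(mu, sigma^2), independent

mu_ml = y.mean()                          # ML estimate of the mean
var_ml = np.mean((y - mu_ml)**2)          # ML estimate of the variance (1/M, not 1/(M-1))

print(mu_ml, var_ml)                      # close to 3.0 and 4.0 for large M
```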
Example 3: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ Bernoulli(p)

where the Y_i's are independent and p is unknown
The ML estimate of p is given by

    p̂_ML(y) = (1/M) ∑_{i=1}^{M} y_i

Assignment 5

Example 4: ML Estimation
Suppose we observe Y_i, i = 1, 2, ..., M such that

    Y_i ∼ Uniform[0, θ]

where the Y_i's are independent and θ is unknown
The ML estimate of θ is given by

    θ̂_ML(y) = max(y_1, y_2, ..., y_{M-1}, y_M)

Assignment 5

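Similarly, a short sketch for Examples 3 and 4 (the values of p, θ and M are arbitrary): the ML estimates are the sample mean of the Bernoulli observations and the sample maximum of the uniform observations.

```python
import numpy as np

rng = np.random.default_rng(5)
M = 10_000                                              # arbitrary sample size

p_true = 0.3
p_ml = rng.binomial(1, p_true, size=M).mean()           # Example 3: sample mean of 0/1 data

theta_true = 4.0
theta_ml = rng.uniform(0.0, theta_true, size=M).max()   # Example 4: sample maximum

print(p_ml, theta_ml)                                   # close to 0.3 and just below 4.0
```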
Reference
Chapter 4, An Introduction to Signal Detection and Estimation, H. V. Poor, Second Edition, Springer-Verlag, 1994.

Parameter Estimation of Random Processes
ML Estimation Requires Conditional Densities
ML estimation involves maximizing the conditional density with respect to the unknown parameters
Example: Y ∼ N(μ, σ^2) where μ is known and σ^2 is unknown

    p(y | σ^2) = (1 / √(2πσ^2)) e^{-(y - μ)^2 / (2σ^2)}

Suppose the observation is the realization of a random process

    y(t) = A e^{jθ} s(t - τ) + n(t)

What is the conditional density of y(t) given A, θ and τ?

Maximizing Likelihood Ratio for ML Estimation
Consider Y ∼ N(μ, σ^2) where μ is unknown and σ^2 is known

    p(y | μ) = (1 / √(2πσ^2)) e^{-(y - μ)^2 / (2σ^2)}

Let q(y) be the density of a Gaussian with distribution N(0, σ^2)

    q(y) = (1 / √(2πσ^2)) e^{-y^2 / (2σ^2)}

The ML estimate of μ is obtained as

    μ̂_ML(y) = argmax_μ p(y | μ) = argmax_μ p(y | μ) / q(y) = argmax_μ L(y | μ)

where L(y | μ) is called the likelihood ratio


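A tiny check of this claim (my own sketch, arbitrary numbers): since q(y) does not depend on μ, maximizing p(y | μ) and maximizing L(y | μ) = p(y | μ)/q(y) over a grid of candidate means pick out the same estimate.

```python
import numpy as np

y, sigma = 1.7, 1.0                      # arbitrary observation and known noise std
mu = np.linspace(-5.0, 5.0, 10_001)      # grid of candidate means

p = np.exp(-0.5 * (y - mu)**2 / sigma**2) / np.sqrt(2 * np.pi * sigma**2)
q = np.exp(-0.5 * y**2 / sigma**2) / np.sqrt(2 * np.pi * sigma**2)   # N(0, sigma^2) density at y
L = p / q                                # likelihood ratio (q does not depend on mu)

print(mu[np.argmax(p)], mu[np.argmax(L)])   # both are (approximately) y
```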
Likelihood Ratio and Hypothesis Testing
The likelihood ratio L(y | μ) is the ML decision statistic for the following binary hypothesis testing problem

    H_1 : Y ∼ N(μ, σ^2)
    H_0 : Y ∼ N(0, σ^2)

where μ is assumed to be known
H_0 is a dummy hypothesis which makes calculation of the ML estimator easy for random processes

Likelihood Ratio of a Signal in AWGN
Let H_s(θ) be the hypothesis corresponding to the following received signal

    H_s(θ) : y(t) = s_θ(t) + n(t)

where θ can be a vector parameter
Define a noise-only dummy hypothesis H_0

    H_0 : y(t) = n(t)

Define Z and y^⊥(t) as follows

    Z = ⟨y, s_θ⟩
    y^⊥(t) = y(t) - ⟨y, s_θ⟩ s_θ(t) / ‖s_θ‖^2

Z and y^⊥(t) completely characterize y(t)

Likelihood Ratio of a Signal in AWGN
Under both hypotheses y^⊥(t) is equal to n^⊥(t) where

    n^⊥(t) = n(t) - ⟨n, s_θ⟩ s_θ(t) / ‖s_θ‖^2

n^⊥(t) is independent of the noise component in Z and has the same distribution under both hypotheses
n^⊥(t) is irrelevant for this binary hypothesis testing problem
The likelihood ratio of y(t) equals the likelihood ratio of Z under the following hypothesis testing problem

    H_s(θ) : Z ∼ N(‖s_θ‖^2, σ^2 ‖s_θ‖^2)
    H_0 : Z ∼ N(0, σ^2 ‖s_θ‖^2)

Likelihood Ratio of Signals in AWGN
The likelihood ratio of signals in real AWGN is

    L(y | s_θ) = exp( (1/σ^2) ( ⟨y, s_θ⟩ - ‖s_θ‖^2 / 2 ) )

The likelihood ratio of signals in complex AWGN is

    L(y | s_θ) = exp( (1/σ^2) ( Re(⟨y, s_θ⟩) - ‖s_θ‖^2 / 2 ) )

Maximizing these likelihood ratios as functions of θ results in the ML estimator

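For instance, in delay estimation with a known pulse, ‖s_τ‖² does not depend on τ, so maximizing the likelihood ratio over τ reduces to maximizing the correlation ⟨y, s_τ⟩. A discrete-time sketch of this (my own illustration; the pulse shape, delay grid and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)

fs = 100.0                                # samples per second (arbitrary)
t = np.arange(0, 2.0, 1 / fs)

def s(t, tau):
    """Known unit pulse delayed by tau (rectangular, 0.2 s long)."""
    return ((t >= tau) & (t < tau + 0.2)).astype(float)

tau_true, sigma = 0.63, 0.5
y = s(t, tau_true) + sigma * rng.standard_normal(t.size)   # y(t) = s(t - tau) + n(t)

taus = np.arange(0.0, 1.5, 1 / fs)                          # candidate delays
corr = np.array([np.dot(y, s(t, tau)) for tau in taus])     # <y, s_tau> for each candidate
tau_ml = taus[np.argmax(corr)]                              # ML delay estimate

print(tau_ml)   # close to 0.63
```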
Thanks for your attention
