
ESTIMATION THEORY

Outline

1. Random Variables
2. Introduction
3. Estimation Techniques
4. Extensions to Complex Vector Parameters
5. Application to Communication Systems

[Kay93] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice-Hall, New Jersey, 1993.
[Cover-Thomas91] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley, New York, 1991.

Random Variables
Definitions

A random variable $X$ is a function that assigns a number to every outcome of an experiment.

A random variable $X$ is completely characterized by:
- its cumulative distribution function (cdf): $F_X(x) = P(X \le x)$
- its probability density function (pdf): $p_X(x) = \frac{dF_X(x)}{dx}$

Properties

The probability that $X$ lies between $x_1$ and $x_2$ then is
$$P(x_1 < X \le x_2) = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} p_X(x)\,dx.$$

The mean of $X$ is given by $m_X = E\{X\} = \int_{-\infty}^{\infty} x\, p_X(x)\, dx$.

The variance of $X$ is given by $\mathrm{var}\{X\} = E\{(X - m_X)^2\} = \int_{-\infty}^{\infty} (x - m_X)^2\, p_X(x)\, dx$.

Random Variables
Examples

Uniform random variable:
- pdf: $p_X(x) = \frac{1}{b-a}$ for $a \le x \le b$, and $0$ otherwise
- mean and variance: $m_X = \frac{a+b}{2}$, $\mathrm{var}\{X\} = \frac{(b-a)^2}{12}$

Gaussian random variable:
- pdf: $p_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-m)^2}{2\sigma^2}\right)$
- mean and variance: $m_X = m$, $\mathrm{var}\{X\} = \sigma^2$
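As a quick numerical sanity check, the sketch below (assuming NumPy is available) draws samples from both distributions and compares the empirical moments with the formulas above; all constants are arbitrary choices, not from the slides.

```python
# Monte Carlo check of the uniform and Gaussian mean/variance formulas.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Uniform on [a, b]: mean (a+b)/2, variance (b-a)^2/12
a, b = 2.0, 5.0
u = rng.uniform(a, b, n)
print(u.mean(), (a + b) / 2)        # both ~3.5
print(u.var(), (b - a) ** 2 / 12)   # both ~0.75

# Gaussian with mean m and variance sigma^2
m, sigma = 1.0, 2.0
g = rng.normal(m, sigma, n)
print(g.mean(), m)                  # both ~1.0
print(g.var(), sigma ** 2)          # both ~4.0
```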

Random Variables
Two random variables

Two random variables $X$ and $Y$ are characterized by:
- the joint cdf: $F_{X,Y}(x,y) = P(X \le x, Y \le y)$
- the joint pdf: $p_{X,Y}(x,y) = \frac{\partial^2 F_{X,Y}(x,y)}{\partial x\, \partial y}$

The marginal pdfs can then be determined by
$$p_X(x) = \int_{-\infty}^{\infty} p_{X,Y}(x,y)\, dy \quad\text{and}\quad p_Y(y) = \int_{-\infty}^{\infty} p_{X,Y}(x,y)\, dx.$$

For two random variables $X$ and $Y$, we can define the conditional pdfs
$$p_{X|Y}(x|y) = \frac{p_{X,Y}(x,y)}{p_Y(y)} \quad\text{and}\quad p_{Y|X}(y|x) = \frac{p_{X,Y}(x,y)}{p_X(x)}.$$

From this follows the popular Bayes rule:
$$p_{X|Y}(x|y) = \frac{p_{Y|X}(y|x)\, p_X(x)}{p_Y(y)}.$$

For independent random variables $X$ and $Y$ we have $p_{X,Y}(x,y) = p_X(x)\, p_Y(y)$.

Random Variables
Function of random variables

Suppose $Z$ is a function of the random variables $X$ and $Y$, e.g., $Z = g(X,Y)$.

Corresponding increments in the cdf of $Z$ and the joint cdf of $X$ and $Y$ are the same. Hence, the expectation over $Z$ equals the joint expectation over $X$ and $Y$:
$$E\{Z\} = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x,y)\, p_{X,Y}(x,y)\, dx\, dy.$$

The mean of $Z$ is given by $m_Z = E\{Z\}$.

The variance of $Z$ is given by $\mathrm{var}\{Z\} = E\{(Z - m_Z)^2\}$.

Random Variables
Vector random variables

A vector random variable $\mathbf{x}$ is a vector of random variables: $\mathbf{x} = [X_1, X_2, \dots, X_N]^T$.

Its cdf/pdf is the joint cdf/pdf of all these random variables.

The mean of $\mathbf{x}$ is given by $\mathbf{m}_{\mathbf{x}} = E\{\mathbf{x}\} = [E\{X_1\}, \dots, E\{X_N\}]^T$.

The covariance matrix of $\mathbf{x}$ is given by
$$\mathbf{C}_{\mathbf{x}} = E\{(\mathbf{x} - \mathbf{m}_{\mathbf{x}})(\mathbf{x} - \mathbf{m}_{\mathbf{x}})^T\}, \quad [\mathbf{C}_{\mathbf{x}}]_{i,j} = \mathrm{cov}\{X_i, X_j\}.$$

Introduction
Problem Statement

Suppose we have an unknown scalar parameter $\theta$ that we want to estimate from an observed vector $\mathbf{x}$, which is related to $\theta$ through the following relationship:
$$\mathbf{x} = \mathbf{g}(\theta) + \mathbf{n},$$
where $\mathbf{n}$ is a random noise vector with probability density function (pdf) $p_{\mathbf{n}}(\mathbf{n})$.

The estimator is of the form $\hat{\theta} = f(\mathbf{x})$.

Note that $\hat{\theta}$ itself is a random variable. Hence, the performance of the estimator should be described statistically.

Introduction
Special Models

To solve any estimation problem, we need a model. Here, we will look deeper into two specific models.

The linear model: The relationship between $\mathbf{x}$ and $\theta$ is then given by
$$\mathbf{x} = \mathbf{h}\theta + \mathbf{n},$$
where $\mathbf{h}$ is the model vector and $\mathbf{n}$ is the noise vector, which is assumed to have mean $\mathbf{0}$, $\mathbf{m}_{\mathbf{n}} = E\{\mathbf{n}\} = \mathbf{0}$, and covariance matrix $\mathbf{C} = \mathrm{cov}\{\mathbf{n}\} = E\{\mathbf{n}\mathbf{n}^T\}$.

The linear Gaussian model: This model is a special case of the linear model, where the noise vector is assumed to be Gaussian (or normal) distributed:
$$p_{\mathbf{n}}(\mathbf{n}) = \frac{1}{(2\pi)^{N/2} \det^{1/2}(\mathbf{C})} \exp\left(-\frac{1}{2}\mathbf{n}^T \mathbf{C}^{-1} \mathbf{n}\right).$$
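The sketch below (NumPy assumed, with arbitrary sizes and constants) shows how one realization of the linear Gaussian model can be simulated; it is only an illustration of the model, not part of the slides.

```python
# Draw one observation x = h*theta + n from the linear Gaussian model.
import numpy as np

rng = np.random.default_rng(1)
N = 8
theta = 0.7                                  # the (here: chosen) scalar parameter
h = rng.standard_normal(N)                   # known model vector

# Build a valid (symmetric positive definite) noise covariance C
A = rng.standard_normal((N, N))
C = A @ A.T + N * np.eye(N)

n = rng.multivariate_normal(np.zeros(N), C)  # zero-mean Gaussian noise, cov C
x = h * theta + n                            # observed vector
```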

Estimation Techniques

We can view the unknown parameter $\theta$ as a deterministic variable:
- Minimum Variance Unbiased (MVU) Estimator
- Best Linear Unbiased Estimator (BLUE)
- Maximum Likelihood Estimator (MLE)
- Least Squares Estimator (LSE)

The Bayesian philosophy: $\theta$ is viewed as a random variable:
- Minimum Mean Square Error (MMSE) Estimator
- Linear Minimum Mean Square Error (LMMSE) Estimator

Minimum Variance Unbiased Estimation


A natural criterion that comes to mind is the Mean Square Error (MSE):
$$\mathrm{mse}(\hat{\theta}) = E\{(\hat{\theta} - \theta)^2\} = \mathrm{var}\{\hat{\theta}\} + \left(E\{\hat{\theta}\} - \theta\right)^2.$$

The MSE does not only depend on the variance but also on the bias. This means that an estimator that tries to minimize the MSE will often depend on the parameter $\theta$, and is therefore unrealizable.

Solution: constrain the bias to zero and minimize the variance, which leads to the so-called Minimum Variance Unbiased (MVU) estimator:
- unbiased: $m_{\hat{\theta}} = E\{\hat{\theta}\} = \theta$ for all $\theta$
- minimum variance: $\mathrm{var}\{\hat{\theta}\}$ is minimal for all $\theta$

Remark: The MVU does not always exist and is generally difficult to find.

Minimum Variance Unbiased Estimation (Linear Gaussian Model)

For the linear Gaussian model the MVU exists and its solution can be found by means of the Cramer-Rao lower bound (see notes, [Kay93], [Cover-Thomas91]):
$$\hat{\theta} = (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1} \mathbf{h}^T \mathbf{C}^{-1} \mathbf{x}.$$

Properties:
- $m_{\hat{\theta}} = \theta$
- $\mathrm{var}\{\hat{\theta}\} = (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1}$
- $\hat{\theta}$ is Gaussian distributed, i.e., $\hat{\theta} \sim \mathcal{N}\left(\theta, (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1}\right)$
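A minimal Monte Carlo sketch of this estimator follows (the same closed form reappears below as the BLUE and the MLE); the data generation mirrors the earlier sketch, and the empirical mean and variance are compared with the stated properties.

```python
# MVU estimator for the linear Gaussian model:
# theta_hat = (h^T C^{-1} h)^{-1} h^T C^{-1} x.
import numpy as np

rng = np.random.default_rng(2)
N, theta = 8, 0.7
h = rng.standard_normal(N)
A = rng.standard_normal((N, N))
C = A @ A.T + N * np.eye(N)

Ci_h = np.linalg.solve(C, h)        # C^{-1} h without forming the inverse
X = theta * h + rng.multivariate_normal(np.zeros(N), C, size=20_000)
est = X @ Ci_h / (h @ Ci_h)         # one estimate per simulated observation

print(est.mean(), theta)            # empirical mean ~ theta (unbiased)
print(est.var(), 1 / (h @ Ci_h))    # empirical var ~ (h^T C^{-1} h)^{-1}
```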

Best Linear Unbiased Estimation

In this case we constrain the estimator to have the form $\hat{\theta} = \mathbf{a}^T \mathbf{x}$.

- Unbiased: $m_{\hat{\theta}} = E\{\hat{\theta}\} = \mathbf{a}^T E\{\mathbf{x}\} = \theta$ for all $\theta$
- Minimum variance: $\mathrm{var}\{\hat{\theta}\} = \mathbf{a}^T \mathrm{cov}\{\mathbf{x}\}\, \mathbf{a} = \mathbf{a}^T \mathbf{C} \mathbf{a}$ is minimal for all $\theta$

The first condition can only be satisfied if we assume a linear model for $\mathbf{m}_{\mathbf{x}} = E\{\mathbf{x}\}$: $E\{\mathbf{x}\} = \mathbf{h}\theta$, so that the condition becomes $\mathbf{a}^T \mathbf{h} = 1$.

Hence, we have to solve
$$\min_{\mathbf{a}} \mathbf{a}^T \mathbf{C} \mathbf{a} \quad\text{subject to}\quad \mathbf{a}^T \mathbf{h} = 1.$$

Best Linear Unbiased Estimation


Problem:
$$\min_{\mathbf{a}} \mathbf{a}^T \mathbf{C} \mathbf{a} \quad\text{subject to}\quad \mathbf{a}^T \mathbf{h} = 1$$

Solution:
$$\mathbf{a} = \frac{\mathbf{C}^{-1}\mathbf{h}}{\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h}}, \quad\text{i.e.,}\quad \hat{\theta} = \frac{\mathbf{h}^T \mathbf{C}^{-1} \mathbf{x}}{\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h}}$$

Proof: Using the method of the Lagrange multipliers, we obtain
$$J(\mathbf{a}) = \mathbf{a}^T \mathbf{C} \mathbf{a} + \lambda\left(1 - \mathbf{a}^T \mathbf{h}\right).$$
Setting the gradient with respect to $\mathbf{a}$ to zero we get $2\mathbf{C}\mathbf{a} - \lambda\mathbf{h} = \mathbf{0}$, so $\mathbf{a} = \frac{\lambda}{2}\mathbf{C}^{-1}\mathbf{h}$. The Lagrange multiplier $\lambda$ is obtained by the constraint $\mathbf{a}^T \mathbf{h} = 1$: $\frac{\lambda}{2} = (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1}$.

Properties:
- $\mathrm{var}\{\hat{\theta}\} = \mathbf{a}^T \mathbf{C} \mathbf{a} = (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1}$

Best Linear Unbiased Estimation (Linear Model)

For the linear model the BLUE is given by
$$\hat{\theta} = (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1} \mathbf{h}^T \mathbf{C}^{-1} \mathbf{x}.$$

Remark: For the linear model the BLUE equals the MVU only when the noise is Gaussian.

Maximum Likelihood Estimation

Since the pdf of $\mathbf{x}$ depends on $\theta$, we often write it as a function that is parametrized on $\theta$: $p(\mathbf{x}; \theta)$. This function can also be interpreted as the likelihood function, since it tells us how likely it is to observe a certain $\mathbf{x}$ for a certain $\theta$.

The Maximum Likelihood Estimator (MLE) finds the $\theta$ that maximizes $p(\mathbf{x}; \theta)$.

The MLE is generally easy to derive. Asymptotically, the MLE has the same mean and variance as the MVU (although this does not make the MLE itself equivalent to the MVU).

Maximum Likelihood Estimation (Linear Gaussian Model)

For the linear Gaussian model, the likelihood function is given by
$$p(\mathbf{x}; \theta) = \frac{1}{(2\pi)^{N/2} \det^{1/2}(\mathbf{C})} \exp\left(-\frac{1}{2}(\mathbf{x} - \mathbf{h}\theta)^T \mathbf{C}^{-1} (\mathbf{x} - \mathbf{h}\theta)\right).$$

It is clear that this function is maximized by solving
$$\min_{\theta}\, (\mathbf{x} - \mathbf{h}\theta)^T \mathbf{C}^{-1} (\mathbf{x} - \mathbf{h}\theta).$$

Maximum Likelihood Estimation (Linear Gaussian Model)

Problem:
$$\min_{\theta}\, (\mathbf{x} - \mathbf{h}\theta)^T \mathbf{C}^{-1} (\mathbf{x} - \mathbf{h}\theta)$$

Solution:
$$\hat{\theta} = (\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h})^{-1} \mathbf{h}^T \mathbf{C}^{-1} \mathbf{x}$$

Proof: Rewriting the cost function that we have to minimize, we get
$$J(\theta) = \mathbf{x}^T \mathbf{C}^{-1} \mathbf{x} - 2\theta\, \mathbf{h}^T \mathbf{C}^{-1} \mathbf{x} + \theta^2\, \mathbf{h}^T \mathbf{C}^{-1} \mathbf{h}.$$
Setting the gradient with respect to $\theta$ to zero we get $-2\mathbf{h}^T \mathbf{C}^{-1} \mathbf{x} + 2\theta\, \mathbf{h}^T \mathbf{C}^{-1} \mathbf{h} = 0$.

Remark: For the linear Gaussian model, the MLE is equivalent to the MVU estimator.
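To illustrate the equivalence between maximizing the likelihood and minimizing the weighted least-squares cost, the sketch below (assuming SciPy is available) minimizes the cost numerically and compares the result with the closed-form solution; all data is synthetic.

```python
# Numerical MLE vs closed-form MLE for the linear Gaussian model.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
N, theta = 8, 0.7
h = rng.standard_normal(N)
A = rng.standard_normal((N, N))
C = A @ A.T + N * np.eye(N)
x = h * theta + rng.multivariate_normal(np.zeros(N), C)

Ci = np.linalg.inv(C)
cost = lambda t: (x - h * t) @ Ci @ (x - h * t)   # negative log-likelihood up to constants

theta_numeric = minimize_scalar(cost).x
theta_closed = (h @ Ci @ x) / (h @ Ci @ h)
print(theta_numeric, theta_closed)                # the two estimates coincide
```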

Least Squares Estimation

The Least Squares Estimator (LSE) finds the $\theta$ for which
$$J(\theta) = (\mathbf{x} - \mathbf{g}(\theta))^T (\mathbf{x} - \mathbf{g}(\theta))$$
is minimal.

Properties:
- No probabilistic assumptions required
- The performance highly depends on the noise

Least Squares Estimation (Linear Model)

For the linear model, the LSE solves the following problem.

Problem:
$$\min_{\theta}\, (\mathbf{x} - \mathbf{h}\theta)^T (\mathbf{x} - \mathbf{h}\theta)$$

Solution:
$$\hat{\theta} = (\mathbf{h}^T \mathbf{h})^{-1} \mathbf{h}^T \mathbf{x}$$

Proof: As before, with $\mathbf{C}$ replaced by $\mathbf{I}$.

Remark: For the linear model the LSE corresponds to the BLUE when the noise is white, and to the MVU when the noise is Gaussian and white.
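A short sketch of the LSE using NumPy's built-in least-squares solver follows; it also verifies the orthogonality condition derived on the next slide. The noise level is an arbitrary choice.

```python
# LSE for the linear model via np.linalg.lstsq, plus the orthogonality check.
import numpy as np

rng = np.random.default_rng(4)
N, theta = 8, 0.7
h = rng.standard_normal(N)
x = h * theta + 0.3 * rng.standard_normal(N)   # white noise here

theta_hat, *_ = np.linalg.lstsq(h[:, None], x, rcond=None)
print(theta_hat[0], (h @ x) / (h @ h))         # lstsq matches (h^T h)^{-1} h^T x
print(h @ (x - h * theta_hat[0]))              # ~0: residual orthogonal to h
```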

Least Squares Estimation (Linear Model)


Orthogonality Condition

Let us compute $\mathbf{h}^T (\mathbf{x} - \mathbf{h}\hat{\theta})$:
$$\mathbf{h}^T (\mathbf{x} - \mathbf{h}\hat{\theta}) = \mathbf{h}^T \mathbf{x} - \mathbf{h}^T \mathbf{h}\, (\mathbf{h}^T \mathbf{h})^{-1} \mathbf{h}^T \mathbf{x} = 0.$$

For the linear model the LSE leads to the following orthogonality condition:
$$\mathbf{h}^T (\mathbf{x} - \mathbf{h}\hat{\theta}) = 0,$$
i.e., the residual $\mathbf{x} - \mathbf{h}\hat{\theta}$ is orthogonal to the model vector $\mathbf{h}$.

The Bayesian Philosophy


$\theta$ is viewed as a random variable and we must estimate its particular realization. This allows us to use prior knowledge about $\theta$, i.e., its prior pdf $p(\theta)$.

Again, we would like to minimize the MSE, but this time both $\mathbf{x}$ and $\theta$ are random, hence the notation Bmse for Bayesian MSE.

Note the difference between these two MSEs:
$$\mathrm{mse}(\hat{\theta}) = E\{(\hat{\theta} - \theta)^2\} = \int (\hat{\theta} - \theta)^2\, p(\mathbf{x}; \theta)\, d\mathbf{x}$$
$$\mathrm{Bmse}(\hat{\theta}) = E\{(\hat{\theta} - \theta)^2\} = \int\!\!\int (\hat{\theta} - \theta)^2\, p(\mathbf{x}, \theta)\, d\mathbf{x}\, d\theta$$

Whereas the first MSE depends on $\theta$, the second MSE does not depend on $\theta$.

Minimum Mean Square Error Estimator


We know that $p(\mathbf{x}, \theta) = p(\theta|\mathbf{x})\, p(\mathbf{x})$, so that
$$\mathrm{Bmse}(\hat{\theta}) = \int \left[\int (\hat{\theta} - \theta)^2\, p(\theta|\mathbf{x})\, d\theta\right] p(\mathbf{x})\, d\mathbf{x}.$$

Since $p(\mathbf{x}) \ge 0$ for all $\mathbf{x}$, we have to minimize the inner integral for each $\mathbf{x}$.

Problem:
$$\min_{\hat{\theta}} \int (\hat{\theta} - \theta)^2\, p(\theta|\mathbf{x})\, d\theta$$

Solution: the mean of the posterior pdf of $\theta$:
$$\hat{\theta} = E\{\theta|\mathbf{x}\} = \int \theta\, p(\theta|\mathbf{x})\, d\theta$$

Proof: Setting the derivative with respect to $\hat{\theta}$ to zero we obtain
$$\int 2(\hat{\theta} - \theta)\, p(\theta|\mathbf{x})\, d\theta = 0 \quad\Rightarrow\quad \hat{\theta} = \int \theta\, p(\theta|\mathbf{x})\, d\theta.$$

Remarks: In contrast to the MVU estimator, the MMSE estimator always exists. The MMSE estimator has a smaller average MSE (Bayesian MSE) than the MVU, but the MMSE estimator is biased whereas the MVU estimator is unbiased.
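The posterior mean can also be computed by brute force. The sketch below does this on a grid for a toy scalar problem (one observation $x = \theta + n$ with Gaussian prior and Gaussian noise, variances chosen arbitrarily) and compares the result with the known closed form for that toy problem.

```python
# MMSE estimate as the posterior mean, computed on a grid.
import numpy as np

s2, st2, x = 0.5, 2.0, 1.3            # noise var, prior var, observation
t = np.linspace(-10, 10, 200_001)     # grid over theta

# Unnormalized posterior p(theta|x) ∝ p(x|theta) p(theta)
post = np.exp(-(x - t) ** 2 / (2 * s2) - t ** 2 / (2 * st2))
post /= post.sum()                    # normalize on the grid

theta_mmse = (t * post).sum()         # posterior mean
print(theta_mmse, st2 / (st2 + s2) * x)   # grid result vs closed form
```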

Minimum Mean Square Error Estimator (Linear Gaussian Model)

For the linear Gaussian model where $\theta$ is assumed to be Gaussian with mean $0$ and variance $\sigma_\theta^2$, the MMSE estimator can be found by means of the conditional pdf of a Gaussian vector random variable [Kay93]:
$$\hat{\theta} = \sigma_\theta^2\, \mathbf{h}^T \left(\sigma_\theta^2\, \mathbf{h}\mathbf{h}^T + \mathbf{C}\right)^{-1} \mathbf{x} = \left(\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h} + \sigma_\theta^{-2}\right)^{-1} \mathbf{h}^T \mathbf{C}^{-1} \mathbf{x},$$
where the last equality is due to the matrix inversion lemma (see notes).

Remark: Compare this with the MVU for the linear Gaussian model.
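The two expressions above are easy to check numerically; the sketch below evaluates both sides of the matrix inversion lemma identity for random data (all values synthetic).

```python
# Numerical check that the two MMSE expressions agree.
import numpy as np

rng = np.random.default_rng(5)
N, st2 = 8, 2.0                       # st2 plays the role of sigma_theta^2
h = rng.standard_normal(N)
A = rng.standard_normal((N, N))
C = A @ A.T + N * np.eye(N)
x = rng.standard_normal(N)

# sigma_theta^2 h^T (sigma_theta^2 h h^T + C)^{-1} x
form1 = st2 * h @ np.linalg.solve(st2 * np.outer(h, h) + C, x)
# (h^T C^{-1} h + sigma_theta^{-2})^{-1} h^T C^{-1} x
form2 = (h @ np.linalg.solve(C, x)) / (h @ np.linalg.solve(C, h) + 1 / st2)
print(form1, form2)                   # identical up to rounding
```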

Linear Minimum Mean Square Error Estimator

As for the BLUE, we now constrain the estimator to have the form $\hat{\theta} = \mathbf{a}^T \mathbf{x}$.

The Bayesian MSE can then be written as
$$\mathrm{Bmse}(\hat{\theta}) = E\{(\mathbf{a}^T \mathbf{x} - \theta)^2\} = \mathbf{a}^T E\{\mathbf{x}\mathbf{x}^T\}\, \mathbf{a} - 2\mathbf{a}^T E\{\mathbf{x}\theta\} + E\{\theta^2\}.$$

Setting the derivative with respect to $\mathbf{a}$ to zero, we obtain
$$2\, E\{\mathbf{x}\mathbf{x}^T\}\, \mathbf{a} - 2\, E\{\mathbf{x}\theta\} = \mathbf{0}.$$

The LMMSE estimator is therefore given by
$$\hat{\theta} = E\{\theta\, \mathbf{x}^T\} \left(E\{\mathbf{x}\mathbf{x}^T\}\right)^{-1} \mathbf{x}.$$

Linear Minimum Mean Square Error Estimator


Orthogonality Condition

Let us compute $E\{(\hat{\theta} - \theta)\, \mathbf{x}^T\}$:
$$E\{(\hat{\theta} - \theta)\, \mathbf{x}^T\} = E\{\theta\, \mathbf{x}^T\} \left(E\{\mathbf{x}\mathbf{x}^T\}\right)^{-1} E\{\mathbf{x}\mathbf{x}^T\} - E\{\theta\, \mathbf{x}^T\} = \mathbf{0}.$$

The LMMSE leads to the following orthogonality condition:
$$E\{(\hat{\theta} - \theta)\, \mathbf{x}^T\} = \mathbf{0},$$
i.e., the estimation error is uncorrelated with the data.

Linear Minimum Mean Square Error Estimator (Linear Model)

For the linear model where $\theta$ is assumed to have mean $0$ and variance $\sigma_\theta^2$, the LMMSE estimator is given by
$$\hat{\theta} = \sigma_\theta^2\, \mathbf{h}^T \left(\sigma_\theta^2\, \mathbf{h}\mathbf{h}^T + \mathbf{C}\right)^{-1} \mathbf{x} = \left(\mathbf{h}^T \mathbf{C}^{-1} \mathbf{h} + \sigma_\theta^{-2}\right)^{-1} \mathbf{h}^T \mathbf{C}^{-1} \mathbf{x},$$
where the last equality is again due to the matrix inversion lemma.

Remark: The LMMSE estimator is equivalent to the MMSE estimator when the noise and the unknown parameter are Gaussian.

Summary

$\theta$ deterministic:
- MVU — linear model: ?; linear Gaussian model: $\hat{\theta} = (\mathbf{h}^T \mathbf{C}^{-1}\mathbf{h})^{-1}\mathbf{h}^T \mathbf{C}^{-1}\mathbf{x}$
- BLUE — linear model: $\hat{\theta} = (\mathbf{h}^T \mathbf{C}^{-1}\mathbf{h})^{-1}\mathbf{h}^T \mathbf{C}^{-1}\mathbf{x}$; linear Gaussian model: same as linear model
- MLE — linear model: ?; linear Gaussian model: $\hat{\theta} = (\mathbf{h}^T \mathbf{C}^{-1}\mathbf{h})^{-1}\mathbf{h}^T \mathbf{C}^{-1}\mathbf{x}$
- LSE — linear model: $\hat{\theta} = (\mathbf{h}^T\mathbf{h})^{-1}\mathbf{h}^T\mathbf{x}$; linear Gaussian model: same as linear model

$\theta$ stochastic with mean $0$ and variance $\sigma_\theta^2$ (Gaussian in the linear Gaussian model):
- MMSE — linear model: ?; linear Gaussian model: $\hat{\theta} = (\mathbf{h}^T\mathbf{C}^{-1}\mathbf{h} + \sigma_\theta^{-2})^{-1}\mathbf{h}^T\mathbf{C}^{-1}\mathbf{x}$
- LMMSE — linear model: $\hat{\theta} = (\mathbf{h}^T\mathbf{C}^{-1}\mathbf{h} + \sigma_\theta^{-2})^{-1}\mathbf{h}^T\mathbf{C}^{-1}\mathbf{x}$; linear Gaussian model: same as linear model

Extensions to Complex Vector Parameters

The previous results extend to the estimation of a complex vector parameter $\boldsymbol{\theta}$ from $\mathbf{x} = \mathbf{H}\boldsymbol{\theta} + \mathbf{n}$, with the model vector $\mathbf{h}$ replaced by a model matrix $\mathbf{H}$ and transposes replaced by Hermitian transposes $(\cdot)^H$.

$\boldsymbol{\theta}$ deterministic:
- MVU — linear model: ?; linear Gaussian model: $\hat{\boldsymbol{\theta}} = (\mathbf{H}^H \mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^H \mathbf{C}^{-1}\mathbf{x}$
- BLUE — linear model: $\hat{\boldsymbol{\theta}} = (\mathbf{H}^H \mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^H \mathbf{C}^{-1}\mathbf{x}$; linear Gaussian model: same as linear model
- MLE — linear model: ?; linear Gaussian model: $\hat{\boldsymbol{\theta}} = (\mathbf{H}^H \mathbf{C}^{-1}\mathbf{H})^{-1}\mathbf{H}^H \mathbf{C}^{-1}\mathbf{x}$
- LSE — linear model: $\hat{\boldsymbol{\theta}} = (\mathbf{H}^H\mathbf{H})^{-1}\mathbf{H}^H\mathbf{x}$; linear Gaussian model: same as linear model

$\boldsymbol{\theta}$ stochastic with mean $\mathbf{0}$ and covariance $\mathbf{C}_{\boldsymbol{\theta}}$ (Gaussian in the linear Gaussian model):
- MMSE — linear model: ?; linear Gaussian model: $\hat{\boldsymbol{\theta}} = (\mathbf{H}^H\mathbf{C}^{-1}\mathbf{H} + \mathbf{C}_{\boldsymbol{\theta}}^{-1})^{-1}\mathbf{H}^H\mathbf{C}^{-1}\mathbf{x}$
- LMMSE — linear model: $\hat{\boldsymbol{\theta}} = (\mathbf{H}^H\mathbf{C}^{-1}\mathbf{H} + \mathbf{C}_{\boldsymbol{\theta}}^{-1})^{-1}\mathbf{H}^H\mathbf{C}^{-1}\mathbf{x}$; linear Gaussian model: same as linear model

 

Application to Communications

Consider a simple baseband communication link: the transmitted symbol sequence $s[n]$ passes through a channel and is corrupted by additive noise $w[n]$, producing the received sequence
$$x[n] = \sum_{l=0}^{L-1} h_l\, s[n-l] + w[n],$$
where the symbol block $\mathbf{s} = [s[0], \dots, s[K-1]]^T$ has length $K$ and the channel impulse response $\mathbf{h} = [h_0, \dots, h_{L-1}]^T$ has length $L$.

[Figure: block diagram showing the symbol sequence $s[n]$ passing through the channel $\mathbf{h}$, with additive noise $w[n]$, producing the received sequence $x[n]$.]

Application to Communications

Defining $\mathbf{x} = [x[0], \dots, x[N-1]]^T$ and $\mathbf{w} = [w[0], \dots, w[N-1]]^T$, we obtain two equivalent linear models for the received block.

Channel estimation model: $\mathbf{x} = \mathbf{S}\mathbf{h} + \mathbf{w}$, where $\mathbf{S}$ is the $N \times L$ Toeplitz matrix of transmitted symbols,
$$\mathbf{S} = \begin{bmatrix} s[0] & s[-1] & \cdots & s[-L+1] \\ s[1] & s[0] & \cdots & s[-L+2] \\ \vdots & \vdots & \ddots & \vdots \\ s[N-1] & s[N-2] & \cdots & s[N-L] \end{bmatrix}.$$

Symbol estimation model: $\mathbf{x} = \mathbf{H}\mathbf{s} + \mathbf{w}$, where $\mathbf{H}$ is the $N \times K$ Toeplitz matrix of channel coefficients with entries $[\mathbf{H}]_{n,k} = h_{n-k}$ (and $h_l = 0$ for $l < 0$ or $l \ge L$).
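The two Toeplitz data models can be built directly with scipy.linalg.toeplitz. The sketch below (with illustrative block sizes, real-valued for simplicity) verifies that both factorizations reproduce the same convolution.

```python
# Build the channel estimation and symbol estimation models from one convolution.
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(6)
L, K = 3, 6                       # channel length L, symbol block length K
h = rng.standard_normal(L)
s = rng.standard_normal(K)

# Full convolution output x[n] = sum_l h[l] s[n-l], length K + L - 1
x = np.convolve(h, s)

# Channel estimation model: x = S h, S built from the (known) symbols
S = toeplitz(np.r_[s, np.zeros(L - 1)], np.r_[s[0], np.zeros(L - 1)])
print(np.allclose(x, S @ h))      # True

# Symbol estimation model: x = H s, H built from the (known) channel taps
H = toeplitz(np.r_[h, np.zeros(K - 1)], np.r_[h[0], np.zeros(K - 1)])
print(np.allclose(x, H @ s))      # True
```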

Application to Communications

Most communication systems (GSM, UMTS, WLAN, ...) consist of two periods:
- Training period: During this period we try to estimate the channel by transmitting some known symbols, also known as training symbols or pilots.
- Data period: During this period we use the estimated channel to recover the unknown data symbols that convey useful information.

What kind of processing do we use in each of these periods?
- During the training period we use one of the previously developed estimation techniques on the channel estimation model, $\mathbf{x} = \mathbf{S}\mathbf{h} + \mathbf{w}$, assuming that $\mathbf{S}$ is known.
- During the data period we use one of the previously developed estimation techniques on the symbol estimation model, $\mathbf{x} = \mathbf{H}\mathbf{s} + \mathbf{w}$, assuming that $\mathbf{H}$ is known.

Application to Communications
Channel estimation

Let us assume that $\mathrm{cov}\{\mathbf{w}\} = \sigma^2 \mathbf{I}$ (white noise).

BLUE, LSE (or when the noise is Gaussian also the MVU and MLE):
$$\hat{\mathbf{h}} = (\mathbf{S}^H \mathbf{S})^{-1} \mathbf{S}^H \mathbf{x}$$

LMMSE (or when the noise and channel are Gaussian also the MMSE):
$$\hat{\mathbf{h}} = (\mathbf{S}^H \mathbf{S} + \sigma^2 \mathbf{C}_{\mathbf{h}}^{-1})^{-1} \mathbf{S}^H \mathbf{x}$$

Remark: Note that the LMMSE estimator requires the knowledge of $\mathbf{C}_{\mathbf{h}}$, which is generally not available.

Application to Communications
Symbol estimation

Let us assume that $\mathrm{cov}\{\mathbf{w}\} = \sigma^2 \mathbf{I}$ (white noise).

BLUE, LSE (or when the noise is Gaussian also the MVU and MLE):
$$\hat{\mathbf{s}} = (\mathbf{H}^H \mathbf{H})^{-1} \mathbf{H}^H \mathbf{x}$$

LMMSE (or when the noise and symbols are Gaussian also the MMSE):
$$\hat{\mathbf{s}} = (\mathbf{H}^H \mathbf{H} + \sigma^2 \mathbf{C}_{\mathbf{s}}^{-1})^{-1} \mathbf{H}^H \mathbf{x}$$

Remark: Note that the LMMSE estimator requires the knowledge of $\mathbf{C}_{\mathbf{s}}$, which can be set to $\sigma_s^2 \mathbf{I}$ if the data symbols have energy $\sigma_s^2$ and are uncorrelated.
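Putting the two periods together, the sketch below runs a toy baseband link: LS channel estimation from known pilots, followed by LMMSE-style symbol estimation with the estimated channel. All sizes, the BPSK-like constellation, and the noise level are illustrative assumptions, and everything is real-valued for simplicity.

```python
# End-to-end toy example: training-based channel estimation, then symbol estimation.
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(7)
L, K, sigma2, Es = 3, 8, 0.01, 1.0     # channel length, block length, noise var, symbol energy

def conv_matrix(v, ncols):
    # Tall Toeplitz matrix T such that T @ u == np.convolve(v, u)
    return toeplitz(np.r_[v, np.zeros(ncols - 1)],
                    np.r_[v[0], np.zeros(ncols - 1)])

h = rng.standard_normal(L)             # the true (unknown) channel

# Training period: known pilots, estimate h with LS (= BLUE for white noise)
pilots = rng.choice([-1.0, 1.0], K)
S = conv_matrix(pilots, L)
x_train = S @ h + np.sqrt(sigma2) * rng.standard_normal(S.shape[0])
h_hat = np.linalg.lstsq(S, x_train, rcond=None)[0]

# Data period: unknown symbols, estimate s with the LMMSE-style formula
# (H^H H + sigma^2/Es I)^{-1} H^H x, using the estimated channel
s = rng.choice([-1.0, 1.0], K)
H = conv_matrix(h_hat, K)
x_data = conv_matrix(h, K) @ s + np.sqrt(sigma2) * rng.standard_normal(K + L - 1)
s_hat = np.linalg.solve(H.T @ H + sigma2 / Es * np.eye(K), H.T @ x_data)
print(np.sign(s_hat) == s)             # at this SNR the detected symbols should all match
```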
