16 views

Uploaded by Abhishek Sarda

regression iit mumbai notes awesome explanation.

- Chap 14
- TERMINOLOGIA ECONOMETRICA
- Lecture 08
- HEDONICS FOR PERSONAL COMPUTERS: A REEXAMINATION OF SELECTED ECONOMETRIC ISSUES
- Egger and Lassmann (2012)_The Language Effect in International Trade a Meta-Analysis
- Subset Selection in Regression Analysis
- 102b_Lect1_Jan8
- NET June 2008 economics paper iii
- MT 201314 Feedecback
- class2
- 37 Using Octave
- Modelo Multiple 1
- Econo I Course Outline[1]
- The Classical Two-Variable Regression Model
- L1_Classical Linear Regression
- multiple reggresion slide
- Anderson 1997 Decision Support Systems
- iv_specification_tests
- Comparing the Performance of Public and Private
- MA-2015-Option-A.pdf

You are on page 1of 67

Automation Lab

IIT Bombay

Linear and Nonlinear Regression

Department of Chemical Engineering

I.I.T. Bombay

31-Mar-16

Regression

Automation Lab

IIT Bombay

Outline

Linear Regression Problem

Ordinary and Weighted Least Squares formulations through

algebraic viewpoint and geometric interpretations

Ordinary and Weighted Least Squares formulations through

probabilistic viewpoint

Ordinary Least Squares and Minimum Variance Estimation

Ordinary Least Squares and Maximum Likelihood Estimation

Confidences intervals for parameter estimates and

hypothesis testing

Nonlinear regression problem: Nonlinear in parameter

models and maximum likelihood parameter estimation

Examples of linear and nonlinear regression

Appendix: Ordinary Least Squares and Cramer-Rao Bound

31-Mar-16

Regression

31-03-2016

Automation Lab

IIT Bombay

Mathematical Models

physical process

Used in all fields: biology, physiology, engineering, chemistry,

biochemistry, physics, and economics

assigned a definite fixed number or a series of fixed

numbers, for any given set of conditions.

describe the input-output relationships and the structure of

the elements (and the constraints) are not precisely known

31-Mar-16

Regression

Automation Lab

IIT Bombay

Elements of a Model

Independent inputs (x)

Output (y) (dependent variable)

Parameters ()

Transformation operator (T)

Algebraic

Differential

x1

xn

31-Mar-16

Mathematical Model

T ( x1 ,.. xn , 1 ,..,

Regression

y

4

31-03-2016

Mathematical Models

Automation Lab

IIT Bombay

Behavior Prediction/Analysis: Understand the influence of the

independent inputs to a system on the observed system output

System/process/material design

Catalyst design, membrane design

Equipment Design: sizing of processing equipment

Flow-sheeting: deciding flow of material and energy in a

chemical plant

System / process operation: monitoring and control, safety and

hazard analysis, abnormal behavior diagnosis

31-Mar-16

Regression

Automation Lab

IIT Bombay

Transport phenomena based models: continuum equations

describing the conservation of mass, momentum, and energy

(RTD) and other age distributions

Empirical models based on data fitting: Typical examplepolynomials used to fit empirical data, thermodynamic

correlations, correlations based on dimensionless groups

used in heat, mass and momentum transfer, transfer

function models used in process control

31-Mar-16

Regression

31-03-2016

Automation Lab

IIT Bombay

Empirical Modeling

Exact expression relating the dependent and the

independent variable may not be known

can be approximated by a polynomial function with arbitrary

degree of accuracy.

Invoking Weierstrass theorem, relationship between the

dependent and independent variables is approximated as a

polynomial

The order of polynomial used typically depends on range of

values over which approximation has been constructed.

31-Mar-16

Regression

Automation Lab

IIT Bombay

Temperature dependence of resistance

R

R

a bT

T

for

T

[T1

for

T2 ]

[T3

T4 ]

Temperature dependence of Cp

Cp

Cp

a bT

for

T2

[T1

for

T

[T3

T2 ]

T

T4 ]

as function of no. of carbon atoms

T

T

31-Mar-16

a bn cn 2

n

n2

Regression

n3

8

31-03-2016

Automation Lab

IIT Bombay

Y

for

[T1

T P

T2 ] for [ P1

Y a bT cP dT

for [T3 T T4 ] for

eP

[ P3

P

2

Pr

Re

- rA

k0e

x

1 (

1) x

E / RT

fPT

P P4 ]

Nu

P2 ]

: relative volatility

CA

31-Mar-16

Regression

.

9

Automation Lab

IIT Bombay

represented in the following abstract form

Defining x

y

f (x )

x1

x2 .... xm

f ( x ) ...

1 1

2 2

y

1 1

2 2

f p (x)

f i ( x ), we can write

...

p p

Sources of error

Measurement Errors in dependent variable (y)

Modeling or Approximation Errors

in modeling and errors in the measurement of y , then

y

31-Mar-16

1 1

2 2

...

Regression

p p

v

10

31-03-2016

Defining z

z1

1 1

....

...

2 2

z 2 ....

zp

zp

Automation Lab

IIT Bombay

T

p

zT v

variable is a linear function of model parameter

x2 )

g ( x1 )

g (x 2 )

y1 , y2 ,........, yn and S Z

Sy

z( 1) , z( 2) ,...., z( n)

equations yi

z( i )

31-Mar-16

v1 , v2 ,...., vn is minimized.

Regression

11

Choice of Objective Function

n

2 - norm :

v1 ,..., vn

vi

or

i 1

n

v1 ,..., vn

wi vi

where wi

0 for all i

i 1

v1 ,..., vn

vi

or

v1 ,..., vn

i 1

wi vi

i 1

31-Mar-16

two choices because of

(a) Amenability to the

Max

i

interpretations

(c) Ease of interpretation

- Norm

v1 ,..., vn

based formulation is

analytical treatment

1 - Norm

n

Automation Lab

IIT Bombay

from viewpoint of

probability and statistics

vi

Regression

12

31-03-2016

Automation Lab

IIT Bombay

Consider estimation of parameters

( a,b)

y

a bx v

z

from S y

f1 ( x )

az1 bz2

f2 ( x)

1 x

y1 , y2 ,........, yn and S Z

y1

a bx1 v1

...(1)

y2

a bx2

v2

...( 2)

vi

...(i )

yn

a bxi

......

a bxn vn

...( n )

31-Mar-16

z (1) , z( 2 ) ,...., z( n )

.....

yi

y1

y2

1

1

x1

x2

v1

v2

...

yi

...

yn

... ... a

1 xi b

...

vi

...

vn

... ...

1 xn

Regression

13

Automation Lab

IIT Bombay

Number of unknown variables =

2 (parameters) + n (errors)

Number of equations = n

Number of equations < number of unknowns

solution.

To estimate model parameters, we resort to optimization

The necessary conditions for optimality provide 2 additional

constraints so that the combined system of equations has a

unique solution.

31-Mar-16

Regression

14

31-03-2016

Automation Lab

IIT Bombay

vi

az1 i

yi

J

a

J

b

and

Ordinary least square

n

( vi ) 2

J

i 1

n

wi ( vi ) 2 ( wi

0 for all i )

i 1

31-Mar-16

(a) Leads to analytical solution

(b) Has nice geometric

interpretation

(c) Facilitates interpretation

and analysis through

statistics

Regression

15

Automation Lab

IIT Bombay

Using vector - matrix notation

Y A

V TV

vi

i 1

J

V TV

2 A T Y A

1

OLS AT A A T Y

( xT By )

x

(xT Bx)

x

31-Mar-16

By

( xT By )

y

BT x

Regression

16

31-03-2016

Geometric Interpretations

1

1

v1

v2

...

vi

...

1

y1

y2

...

yi

...

...

vn

1 x1

1 x2

... ... a

1 xi b

... ...

1 xn

yn

x1

x2

b

...

xn

31-Mar-16

Automation Lab

IIT Bombay

lies in the

Vector Y

column space of matrix A

Assumption : True behavior

Y

A *

* : True Parameters

A

Defining Y

V

we have Y

Y A A * V

V

Regression

17

Geometric Interpretations

Automation Lab

IIT Bombay

A

Y A

AT V

TAT

V

i.e. vector V

: a projection of vector Y on the column space of A.

Y

V

H

A AT A

Y Y

I A AT A

AT Y

H Y : lying in the column space of A

Y

V

31-Mar-16

Regression

18

31-03-2016

Geometric Interpretations

31-Mar-16

Automation Lab

IIT Bombay

Regression

Ethanol-Water Example

19

Automation Lab

IIT Bombay

Experimental Data

Density and weight percent of ethanol in ethanol-water mixture

31-Mar-16

Regression

20

10

31-03-2016

Automation Lab

IIT Bombay

Ethanol-Water Example

31-Mar-16

Regression

21

Automation Lab

IIT Bombay

pressure dependence of reaction yield

Y

a bT

cP dT 2

eP 2

fPT

Corresponding model equations

Yi

a bTi

cPi

dTi

Defining

z (i )

1 Ti

Yi

Pi

Ti 2

a b c d

T

Pi 2

e

PiTi

f

z ( i ) vi for i 1,2,....., n

31-Mar-16

ePi

fPiTi

vi for i 1,2,..., n

Y1

Y2

....

Yn

.... .... .... .... ....

Y

n 1

A

n 6

Regression

T1

Tn

.... ....

P1T1

a

b

....

f

6 1

v1

v2

....

vn

V

n 1

22

11

31-03-2016

Automation Lab

IIT Bombay

Generalization of OLS

of a general multi - linear model of the form

y

from S y

1 1

...

2 2

p p

y1 , y2 ,........, yn and S Z

v

z (1) , z ( 2 ) ,...., z ( n )

y1

z11

z21

.... ....

z p1

y2

z1 2

z22

.... ....

z p2

....

....

....

.... ....

....

....

yn

z1 n

z2 n

.... .... z pn

vn

p 1

V

n 1

Y

n 1

A

n p

31-Mar-16

v1

.

v2

....

Regression

23

Automation Lab

IIT Bombay

Defining weighting matrix

Let W

diag w1

wi

w2 .... wn

0 for all i

Min

wi vi

V T WV

i 1

Subject to Y

A V

J

V T WV

2 A T W Y A

1

A T WA A T W Y

Selecting W

31-Mar-16

Regression

24

12

31-03-2016

Automation Lab

IIT Bombay

Laboratory experimental

data on Yield obtained from a

catalytic process at various

temperatures and pressures

Y

75.9

0.0757 T

3.21 P

Ref.: Ogunnaike, B. A., Random

Phenomenon, CRC Press, London,

2010

31-Mar-16

Regression

31-Mar-16

Regression

25

Automation Lab

IIT Bombay

26

13

31-03-2016

Estimated Model

Automation Lab

IIT Bombay

Y

75.9

0.0757 T

3.21 P

31-Mar-16

Regression

27

Automation Lab

IIT Bombay

Boiling points of a series of hydrocarbons

31-Mar-16

Regression

28

14

31-03-2016

Automation Lab

IIT Bombay

Candidate Models

Linear Model : T

a bn

Quadratic Model : T

n2

250

200

Linear Model

T = 39*n - 170

Quadratic Model

T = - 3*n2 + 67*n - 220

150

100

50

Data 1

Linear Model

Quadratic Model

0

-50

-100

-150

-200

-250

0

4

5

6

n, No. of Carbon Atoms

31-Mar-16

Regression

Unaddressed Issues

10

29

Automation Lab

IIT Bombay

the data set size, n, matrix A and vector Y change

Matrix A is same but only Y changes (due to

measurement errors)

n is same but a different set of input conditions i.e.

different A matrix is chosen

How do we compare estimates generated through two

independent sets of experiments?

Can we come up with confidence intervals for true

parameters?

31-Mar-16

Regression

30

15

31-03-2016

Automation Lab

IIT Bombay

systematically select a most suitable model?

If identified model is used for prediction, how to quantify

uncertainties in the model predictions?

Linear algebra/optimization based treatment of model

parameter estimation problem does not help in answering

these questions systematically.

Remedy: Formulate and solve the parameter estimation

using framework of probability and statistics

31-Mar-16

Regression

31

Automation Lab

IIT Bombay

Notations

Consider n indepedndent random variables

Y1 , Y2 ,........, Yn

one for each Yi , i.e. a set of realizations of Y1 , Y2 ,........, Yn

Sy

y1 , y2 ,........, yn

Model for RV Yi relating

Model Residuals V

Model for RV Yi

relating Random Error Vi

Yi

(i ) T

Vi

Model relating

realizations of RVs Yi and Vi

yi

(i ) T

* vi

Yi

Model relating

realizations of RVs Y , and V

i

*

31-Mar-16

T

z( i ) Vi

yi

Regression

( i) T

v

i

32

16

31-03-2016

Automation Lab

IIT Bombay

Y1

Y2

....

Yn

z1(1)

z2(1)

( 2)

1

(2)

2

.... ....

z (p1)

(2)

p

V1

V2

....

....

Vn

....

z1( n )

z2( n ) .... .... z (pn )

Y

(n 1)

.... .... z

( p 1) ( n 1)

A

(n p )

y1

z11

z21

y2

....

yn

2

2

z1

....

z1 n

Y

n 1

z

....

z2n

.... ....

Note :

Bold Y and bold V

represent Vectors

of Random Variables

z p1

v1

2

p

2

....

represent vectors

v2

....

vn

p 1

V

n 1

of Random Variables

.... .... z

.... .... ....

.... .... z pn

A

n p

31-Mar-16

Note :

Y and V

of " realizations"

Regression

33

Automation Lab

IIT Bombay

Notations

Note

*

(ordinary) : Parameter Estimates (a realization of )

Set S Z

z (1) , z ( 2) ,...., z ( n )

i.e. there are no errors in measurements

or knowledge of z (i ) : i 1,2,..., n

31-Mar-16

Regression

34

17

31-03-2016

Automation Lab

IIT Bombay

V

*

1 1

*

n

...

zp

0 and Var V

EV

, i.e.

been made about the form of FV (v )

vector and known exactly. Thus,

*

1 1

EY

where

*

1

,...,

*

p

*

p p

...

31-Mar-16

Regression

35

Automation Lab

IIT Bombay

yi , z1 i ,...,z pi :i 1,2 ,...,n

and corresponding model equations

* i

1 1

Yi

* i

p p

...

Vi

Vi

* i

1 1

Yi

...

* i

p p

Note : RVs Yi

* i

1 1

...

* i

p p

Vi for i 1,2,....n

31-Mar-16

Regression

36

18

31-03-2016

Automation Lab

IIT Bombay

collecting all model equations, we have

Y1

Y2

....

Yn

z11

z1 2

...

z1 p

Y

n 1

.... z p1

.... z p2

... ...

.... z pn

...

...

V1

V2

....

Vn

*

1

*

1

n p

p 1

V

n 1

or,

Y

31-Mar-16

A*

Regression

37

Since E Vi

Automation Lab

IIT Bombay

0,

it implies that E V

0n 1

and

EY

E A *

Let R

A *

Cov( V )

E[ VV T ]

and var(Vi )

R

, it follows that

E[ VV T ]

In

from measuremen ts Y

31-Mar-16

Regression

A *

V

38

19

31-03-2016

Automation Lab

IIT Bombay

is obtained by minimizing objective function

VTV

Y A

Y A

with respect to

Note : V T V

vi 2

i 1

minimizes the sample variance of the modeling errors

OLS

AT A

AT Y

3/31/2016

State Estimation

39

Automation Lab

IIT Bombay

Since Y is a realization of RV Y

it follows that

1

1

AT A AT Y

A T A A T A* V

OLS

1

E

AT A AT E Y

OLS

A T A A T A *

31-Mar-16

Regression

40

20

31-03-2016

Automation Lab

IIT Bombay

1

Defining matrix L

AT A

LY * LV

OLS

Cov OLS

E OLS

LRLT

Difficulty :

* OLS

2

LLT

1

n

LE VV T LT

AT A

Remedy : Estimate

AT

from samples

Y AOLS

TV

where V

31-Mar-16

2 AT A

Regression

41

Automation Lab

IIT Bombay

Given measurements Y

Y

A *

R n and a model

V

and V

EV

0n 1 and Cov V

Suppose we want to find an unbiased

estimate, , of unknown, * R p , such that

Cov

E * *

is as small as possible.

31-Mar-16

Regression

42

21

31-03-2016

Automation Lab

IIT Bombay

Taking clues from the OLS solution,

LY where L is a ( p n ) matrix

Cov

E * *

E LY * LY *

Minimum Variance

Parameter Estimation Problem :

Find Matrix L such that Cov is as small as possible.

3/31/2016

State Estimation

43

Automation Lab

IIT Bombay

The unbiasedness requirement implies that

E * E LY * E L A* V * 0

E LA I *

Since E V

AV

LA

.Min

L

J L

J Cov

J Cov LY

subject to LA

3/31/2016

State Estimation

I

44

22

31-03-2016

Automation Lab

IIT Bombay

a scalar objective function. Consider a scalar function

J Cov

tr E * *

T

1

tr E * *

2

Var 1 Var 2 .... Var p

a scalar function of Cov( ) can be formulated as

Min

L

J L

1

tr E LY * LY *

2

subject to LA

3/31/2016

State Estimation

45

Automation Lab

IIT Bombay

J

T

1

tr E * *

tr LA I

2

is minimized w.r.t. L, where

E * *

* *

LY L A* V * LV

3/31/2016

State Estimation

46

23

31-03-2016

Automation Lab

IIT Bombay

Since E VV T R , it follows that

T

E T

* *

LE VV T LT

* *

LRLT

or E * *

LRLT

reformulated as minimizing objective function

J

1

tr LRLT tr LA I

2

with respect to L

3/31/2016

State Estimation

47

Automation Lab

IIT Bombay

Using results

tr BAC

BT CT and

tr ABAT

A BT

B

A

A

the necessary conditions for optimality are

J

L

LR

J

AT

[0]

I LA [0]

Thus, we have

T

L

and I LA

3/31/2016

AT R

T

A T R 1A [0]

State Estimation

48

24

31-03-2016

Automation Lab

IIT Bombay

This implies

AT R 1A

T

AT R

A T R 1A A T R

MV

A T R 1A AT R 1 Y

LY

1

*

A T R 1A AT R 1 V

MV

which implies

Cov

E MV

min

* MV

3/31/2016

AT R 1A

State Estimation

49

Automation Lab

IIT Bombay

MV

LY

A T R 1A A T R 1 Y

WLS

X

A T WA A T W Y

Gauss-Markov theorem

The minimum variance unbiased linear estimator is

identical to the weighted least square estimator

when the weighting matrix is selected as inverse

of the measurement error covariance matrix.

31-Mar-16

Regression

50

25

31-03-2016

Automation Lab

IIT Bombay

Any other linear unbiased estimator of ,

say ~

~

LY, where ~

L

L will have

tr Cov

tr Cov ~

MV

R

Cov( V )

I,

is the minimum variance estimator, i.e.

MV

AT (

I) 1 A A T (

Cov( MV )

3/31/2016

I) 1 Y

AT A

AT A A T Y

1

OLS

Cov( OLS )

State Estimation

51

Automation Lab

IIT Bombay

Insights

errors in the parameter estimates can be reduced by

increasing the sample size.

OLS estimator can be viewed as an estimator

that minimizes sample variance of model residuals

that yields the parameter estimates with the minimum

possible variance (the most efficient linear estimator)

about the distribution of the model residuals.

31-Mar-16

Regression

52

26

31-03-2016

Automation Lab

IIT Bombay

explains the data best from candidate models, we

need to test hypothesis whether an estimated

model coefficient is close to zero or not close to

zero, i.e. whether the associated term in the

model can retained or neglected

We need to generate confidence intervals for the

true model parameters

We need to use the estimated model for carrying

out predictions

Thus, we cannot proceed further unless we select

a suitable distribution for the model residuals

31-Mar-16

Regression

53

Automation Lab

IIT Bombay

.

0.6

0.4

Y

Y

114.4 0.123 t 3.305 10- 5 t 2 V

0.2

Model

Developed

using OLS

0

-0.2

-0.4

Data

Linear Model

Quadratic Model

-0.6

-0.8

1850

1900

1950

2000

Year

31-Mar-16

Regression

54

27

31-03-2016

Automation Lab

IIT Bombay

Statistics

Linear Model

OLS

8.187 4.168 10

2

15.56

AT A

1.8208 10

3 T

-8.074 10

-8.074 10

4.19 10

Quadratic Model

OLS

114.4

0.123 3.305 10

2

AT A

1.582 10

4.29 104

4.46 101

1.16 10

4.46 101

4.63 10 2

1.202 10

31-Mar-16

5 T

1.16 10 2

1.202 10

7

3.118 10

Regression

55

Automation Lab

IIT Bombay

18

Normalized residual : ~

vi

16

14

Histogram

of Quadratic model

residuals (normalized)

10

8

6

15

4

2

0

-3

-2

-1

0

1

Normalized Residual

Histogram

of Linear model

Residuals (normalized)

10

Frequency

Frequency

12

vi

0

-3

31-Mar-16

Regression

-2

-1

0

1

Normalized Residual

56

28

31-03-2016

Choice of Distribution

Automation Lab

IIT Bombay

Penalizes square of the deviations from zero error (i.e.

mean)

Thus, it favors errors close to zero

Moreover, positive and negative errors of equal

magnitude are equally penalized

are approximately bell shaped in most LS

estimation

Thus, it is reasonable to assume that the model

residuals have Gaussian/normal distribution

This choice also follows from a generalized version

of the Central Limit Theorem

31-Mar-16

Regression

57

Automation Lab

IIT Bombay

Vi

( i)

1 1

Yi

( i)

p p

...

and identically distributed.

No distribution was specified for Vi

Now we additional ly assume that each Vi is Gaussian, i.e.

Vi ~ N 0,

for i 1,2,....n

or in other words

V ~ N 0n 1 ,

31-Mar-16

Regression

In

58

29

31-03-2016

Automation Lab

IIT Bombay

Modeling Error Densities

True

Regression

Line

31-Mar-16

Regression

59

Automation Lab

IIT Bombay

Consequences of Gaussianity

and i.i.d., we can construct the likelihood function

for unknown parameters as follows

L

NV1 v1|

NV2 v2|

where

NVi vi|

vi

ln L

31-Mar-16

yi

1

2

exp

zi

...

1 1

n

ln 2

2

n

ln

2

Regression

vi2

2 2

i

p p

vi2

i 1

60

30

31-03-2016

Automation Lab

IIT Bombay

Alternatively

L

f V V |

n/ 2

exp

exp

1

n/ 2

ln L

n

ln 2

2

n

ln

2

1

1

V T 2I V

2

1

Y A T Y A

2 2

Y A

Y A

ln L

31-Mar-16

A T A Y

Regression

61

Automation Lab

IIT Bombay

OLS as ML Estimator

1

A T A A T Y

ML

OLS

E ML

2

and Cov

AT A

ML

~ N * ,

ML

AT A

Cov OLS

samples from the Gaussian distribution, then

the OLS estimator turns out to be identical to

the Maximum Likelihood (ML) estimator.

31-Mar-16

Regression

62

31

31-03-2016

Consequences of Gaussianity

Automation Lab

IIT Bombay

1

1

AT A A T Y * A T A AT V

MV

OLS

ML

1

2

E * and Cov

AT A

pii

A TA

it follows that the marginal pdf of is univariate normal

i

~ N * , p

i

i

ii

31-Mar-16

Regression

63

Automation Lab

IIT Bombay

Since i ~ N *i , pii

P

za / 2

i *i

pii 2

za / 2

Difficulty :

is unknown

Remedy : estimate

1

n

TV

*

Question : what is the distribution of i 2i ?

pii

31-Mar-16

Regression

64

32

31-03-2016

Automation Lab

IIT Bombay

1

Consider

T

V V

* T

vi2

V TV

i 1

2

n

Since V

TA * 0

it follows that V

1

2

1 T

V V

2

V TV

1

AT A

31-Mar-16

AT A

AT A

2

p

1 T

V V~

2

Regression

2

n p

65

Automation Lab

IIT Bombay

Now 2

1

n

2

2

TV

i 1

1

p

i i

pii

Thus, (1

P t

i

31-Mar-16

/ 2, n p

2

n p

vi2

~

2

i 1

and

Y A

vi2 where V

/

pii

(/ )

i

n

~

p

2

n p

Z/

p

2

n p

~ Tn

p

p

t

1

ii

Regression

/ 2, n p

ii

66

33

31-03-2016

Automation Lab

IIT Bombay

Linear Model

OLS

8.187 4.168 10

P

A A

2 2*

p22

3 T

; 2

15.56

-8.074 10

2

4.19 10

4.19 10

P 4.168 10

/ 2 ,140

2.763 10

4

2

t

P 3.622 10

31-Mar-16

4.168 10

~ T142

2.763 10

/ 2 ,140

0.95

1.977

/ 2 ,140

3

2 2*

2.763 10

1.8208 10 2

-8.074 10

*

2

1.8208 10

*

2

4.714 10

0.95

Regression

67

Automation Lab

IIT Bombay

Hypothesis Testing

are often not clear about the terms to be

included in the model. For example, for the global

temperature data, should be develop a linear

model or a quadratic model?

To measure the importance of the contribution of

i' th component of * to E Y

*

1 1

31-Mar-16

...

*

i

*

p p

Null hypothesis

H0 :

*

i

Alternate hypothesis

H1 :

*

i

Regression

68

34

31-03-2016

Automation Lab

IIT Bombay

Hypothesis Testing

If H 0 is true then,

0

i

~ Tn

pii

Reject H 0 if

pii

/ 2 ,n p

p value

Let k

pii

p value 2 P Tn p k

31-Mar-16

Regression

69

Automation Lab

IIT Bombay

Quadratic Model

114.4

0.123 3.305 10

1.582 10

4.29 104

P

AT A

4.46 101

4.46 101

1.16 10

5 T

4.63 10

1.16 10

1.202 10

1.202 10

7

3.118 10

the quadratic term is contributing to the mean of Y

31-Mar-16

Null hypothesis

H0 :

*

3

Alternate hypothesis

H1 :

*

3

Regression

70

35

31-03-2016

Automation Lab

IIT Bombay

Hypothesis Testing

If H 0 is true then,

0

3

p33

3

1.582 3.118 10

Since

3.305 10

p33

Since k

0.05

4.705 t0.025,139

4.705, p value

p33

~T142

1.582 3.118 10 11

we reject H 0

3

11

2 P T139

1.9772

4.705

term contributes to the correlation

31-Mar-16

Regression

71

Automation Lab

IIT Bombay

Mean Response

Consider model Y z i

Suppose we select z i

zi

Yi

* Vi

y0 , j

z0

v0, j

Y0

EY z0

z0

an estimate of

Y0

Y

31-Mar-16

z0

Regression

72

36

31-03-2016

Automation Lab

IIT Bombay

Mean Response

Is Y0 an unbiased estimate of

E Y0

E z0

z0

Y0

Y0

Thus we have

Y

Cov Y0

or

E z0

and E

Y0

Cov Y0

2

Y0

z0

z0

z0

T

AT A

zT0 *

T

z0

z0

Cov z 0

T

z0

Pz0

Y ~ N zT0 * ,

0

31-Mar-16

z0

Pz0

Regression

73

Automation Lab

IIT Bombay

Mean Response

Since, in practice, we rarely know true

Y2

2 z 0

where 2

Y

Y0

1

n

Y0

Thus, (1

P Y0

Pz0

TV

and V

V

z0

Pz0

( / )

Y A

Z/

2

n p

~ Tn

/ 2 ,n p

31-Mar-16

z0

Pz0

Y0

Regression

/ 2 ,n p

z0

Pz0

1

74

37

31-03-2016

Automation Lab

IIT Bombay

Future Response

Given the model at fixed z i

Y z0

z0

Y0

z0

* V0

Since we do not know * ,

an estimate of Y0 can be constructed using as follows

T

Y

z 0

0

response, we are interested in finding a prediction interval

that with a given degree of confidence will contain the

response.

31-Mar-16

Regression

75

Automation Lab

IIT Bombay

Future Response

Consider Y0 Y0

Since V~N( 0,

), it follows that

Y0 ~N zT0 * ,

and

Y0 ~N z 0

* ,

z0

Pz0

Y , z 1 , Y , z 2 ,...., Y , z n used to obtain .

1

Y0 Y0 ~N 0,

31-Mar-16

Regression

z0

Pz0

76

38

31-03-2016

Automation Lab

IIT Bombay

Future Response

1

z0

Y0

z0

Pz0

Y0

z0

/

z0

Pz0

Z

2

n p

n

Thus, for any 0

P

Y0

/ 2 ,n p

31-Mar-16

1 we have

T

z0

T

z0

~Tn

Pz0

/ 2 ,n p

Regression

77

Automation Lab

IIT Bombay

Prediction Interval

A 100 1

% prediction interval

i.e. Y0

z0

Recall : 1

Y z 0 , then is :

/ 2 ,n p

z0

Pz0

Y

31-Mar-16

/ 2 ,n p

z0

Regression

Y0

E Y0 ) is

Pz0

78

39

31-03-2016

Automation Lab

IIT Bombay

CI and PI

prediction interval (PI):

Confidence interval (CI) is on a fixed parameter

of interest (like E[Y0] )

Prediction interval (PI) is on a random variable

(like Y0 )

At any z0, the prediction interval on future

response is wider than the confidence interval on

the mean response.

31-Mar-16

Regression

79

Automation Lab

IIT Bombay

displacement (x, inch3) data for various cars. An expert

displacement as: Y = mx + c

Proposed Model

Y

c mx V

T 33.73

c m

0.047 T

2

9.3941 ; P

A A

0.2219

-6.69 10

-6.69 10 4

2.347 10 6

31-Mar-16

Regression

80

40

31-03-2016

Automation Lab

IIT Bombay

40

raw data

regression model

mean response: lower

mean response: upper

ind. pred.:lower

ind. pred.:upper

35

y (gasoline mileage)

30

25

20

Note :

15

PI is narrowest

Mean Response CI

10

at x0

Prediction Interval

5

0

x and

increases as we

move away from x

100

150

x (engine displacement)

31-Mar-16

450

500

Regression

81

Automation Lab

IIT Bombay

is able to adequately explain the response variable Y?

Variability Analysis

Given data set yi , z i : i 1,2,...., n

n

SSY

yi

i 1

n

y i

yi

y i

i 1

yi

y i

y i

2 yi

y i y i

i 1

SS E

SS R

Residuals

Regression

2 yi

i 1

31-Mar-16

yi yi

2 V i yi

i 1

Regression

82

41

31-03-2016

Automation Lab

IIT Bombay

Variability Analysis

implies that

AT Y A

T

Y A A

T

Y A A

0T

vi yi

i 1

Note

Models used in multilinea r regression

are typically of the form

Yi

i

2 2

i.e. z1 i

...

i

p p

Vi

1 for i 1,2,..., n

31-Mar-16

Regression

83

Automation Lab

IIT Bombay

Variability Analysis

1n 1

1 1 ... 1 T

0

A T Y A A T V

vi

i 1

n

vi y

i 1

SSY

Total Variability

vi

i 1

SS E

SS R

31-Mar-16

Regression

84

42

31-03-2016

Variability Analysis

Automation Lab

IIT Bombay

of the residuals and is denoted as SS Re s or SS E

A good measure of the quality of the fit is

R2

SS R

SYY

SS E

SYY

(a direct measure of the quality of fit)

R2 quantifies the proportion of the variability in

the response variable explained by the input variable.

A good fit should result in high R2

31-Mar-16

Regression

Variability Analysis

R2

SS R

SYY

85

Automation Lab

IIT Bombay

Total observed variation in Y

Note : 0

R2 1

that the model adequately captures the relevant

information contained in the data.

Conversely, the coefficient of determination close to 0

indicates a model that is inadequate to captures the

relevant information contained in the data.

In general, it is possible to improve R2 by introducing

additional parameters in a model. However, note that

the improved R2 can be, at times, misleading.

31-Mar-16

Regression

86

43

31-03-2016

Automation Lab

IIT Bombay

Variability Analysis

An alternate measure of model fit

SS E /(n p )

2

Radj

1

SSY /( n 1)

SS E /( n

regardless of no. of variable in model

2

Radj

: penalizes a model that improve R2

2

Relatively high and comparable valaues of R 2 and Radj

adequately without using excessive number of parameters.

31-Mar-16

Regression

87

Automation Lab

IIT Bombay

SSY

1237.54,

R

SS R

955.72

0.77

Linear Model

SSY

R

6.6934,

0.6192,

SS R

2.5491

2

adj

0.6164

Quadratic Model

SSY

R

6.6934,

0.6715,

Note : 0

31-Mar-16

SS R

2

adj

R2

Regression

2.1989

0.6668

1

88

44

31-03-2016

Automation Lab

IIT Bombay

Boiling points of a series of hydrocarbons

31-Mar-16

Regression

89

Automation Lab

IIT Bombay

Candidate Models

Linear Model : T

a bn

Quadratic Model : T

n2

250

200

Linear Model

T = 39*n - 170

Quadratic Model

T = - 3*n2 + 67*n - 220

150

100

50

Data 1

Linear Model

Quadratic Model

0

-50

-100

-150

-200

-250

0

31-Mar-16

4

5

6

n, No. of Carbon Atoms

Regression

10

90

45

31-03-2016

Automation Lab

IIT Bombay

10

8

6

4

6.3373

0

-2

-4

-6

-8

0

4

6

n, No. of Carbon Atoms

Quadratic Model : T

31-Mar-16

Regression

91

Automation Lab

IIT Bombay

Confidence Interval

1.9464

P

(A A)

Thus, 1

-0.9107

0.0893

0.0893 -0.0536 0.0060

0.05 and t0.025,5

*

i

2.5706

Parameter 2 : 55.079, 78.254

Parameter 3 : -4.2807, -1.7670

3rd component of to the quadratic model

we can test hypothesis 3*

31-Mar-16

Regression

0

92

46

31-03-2016

Automation Lab

IIT Bombay

Hypothesis Testing

Null hypothesis

H0 :

*

3

Alternate hypothesis H1 :

0

3

3

If H 0 is true then,

~T

pii 0.4889 5

*

3

0.01, test of H 0 is to

Reject H 0 if

t

4.0321

0.4889 0.005, 5

Fail to reject H 0 Otherwise

3

-3.0238

Test Statistics : k

6.1845

0.4889 0.4889

Since k 4.0321, the null hypothesis is rejected

31-Mar-16

Regression

93

Automation Lab

IIT Bombay

Hypothesis Testing

p value

k

p value

2 P T5

6.1845

0.0016

Thus, there is strong evidence that the quadratic

term contributes to the correlation between the

boiling point and the carbon number

2

Coefficien t of Determinat in R 2 and Radj

Linear Model

: R2

0.974

2

Radj

0.9698

Quadratic Model : R 2

0.997,

2

Radj

0.9958

31-Mar-16

Regression

94

47

31-03-2016

Analysis of Residuals

Automation Lab

IIT Bombay

Linear Model

Normalized

Residuals show

a pattern

2

1

0

-1

Quadratic Model

Normalized

Residuals are

Randomly spread

Between +/- 2

-2

-3

Linear Model

Quadratic Model

-4

-5

0

31-Mar-16

4

6

n, No. of Carbon Atoms

Regression

95

Automation Lab

IIT Bombay

Laboratory experimental

data on Yield obtained from a

catalytic process at various

temperatures and pressures

(n = 32)

y

75.9 0.0757 x1

3.21x2

Ref.: Ogunnaike, B. A., Random

Phenomenon, CRC Press, London,

2010

31-Mar-16

Regression

96

48

31-03-2016

Automation Lab

IIT Bombay

2.5

2

75.8660

1.5

0.0757

0.5

3.2120

0

-0.5

-1

0.9415

-1.5

-2

-2.5

0

10

15

20

Sample No.

31-Mar-16

25

30

35

Regression

97

Automation Lab

IIT Bombay

Confidence Interval

P

(A A)

-0.0925 0.0010

0.0

-0.6500

0.0

0.4000

0.05 and t0.025, 29

*

i

2.0452

Parameter 2 : 0.015, 0.137

Parameter 3 : 1.994, 4.43

2 nd component of to the proposed model

we can test hypothesis

31-Mar-16

Regression

*

2

0

98

49

31-03-2016

Automation Lab

IIT Bombay

Hypothesis Testing

Null hypothesis

H0 :

*

2

Alternate hypothesis H1 :

0

2

2

If H 0 is true then,

~T

p22 0.0298 29

*

2

Reject H 0 if

0.05, test of H 0 is to

t

2.0452

0.0298 0.025, 29

Fail to reject H 0 Otherwise

2

0.0757

Test Statistics : k

2.5439

0.0298 0.0298

Since k 2.0452, the null hypothesis is rejected

31-Mar-16

Regression

99

Automation Lab

IIT Bombay

Hypothesis Testing

p value

k

p value 2 P T29 2.5439 0.0166

Note : p value level of significance ( 0.05)

At level of significan ce

Reject H 0 if

0.01, test of H 0 is to

t

0.0298 0.005, 29

Fail to reject H 0 Otherwise

Since k

Note : p value

31-Mar-16

2.7564

Regression

100

50

31-03-2016

Reaction Rate Equations : -RA

k0e

E / RT

CA

Automation Lab

IIT Bombay

Nu

Pr

Re

Sh

Sc

Re

x

1 (

1) x

Thermodynamic correlations

RT

a

P

V b V T V b

RT

a

P

V b V2

B

ln Pv

A

T C

Van Der Waal model

Antoine Equation

31-Mar-16

Regression

101

Nonlinear-in-Parameter Models

Automation Lab

IIT Bombay

Defining x

x1

x2 .... xn

....

T

m

g ( x, * )

: Model residual,

* : True parameters

Parameter Estimation

Ordinary least square

OLS

Min

problem has to be solved

using numerical

optimization tools.

( i )2

i 1

OLS

Min

31-Mar-16

wi ( i ) 2

( wi

0 for all i )

i 1

Regression

102

51

31-03-2016

Automation Lab

IIT Bombay

yi , x i : i 1,2,..., n

Sy

and model equations

g x i ,

Yi

for i 1,2,..., n

i

Yi

g x ,

i

~ N (0,

31-Mar-16

) for i 1,2,....n

Regression

103

Automation Lab

IIT Bombay

Consequences of Gaussianity

Under the assumption that

for unknown parameters as follows

L

N 1 e1|

e2|

.... N

en|

where

1

2

N i ei|

ei

log L()

31-Mar-16

yi

exp

ei2

2 2

g x i ,

n

log(2 )

2

n

log(

2

Regression

1

2

ei2

i 1

104

52

31-03-2016

n

ln 2

2

ln L

n

ln

2

g x i ,

yi

Automation Lab

IIT Bombay

i 1

Min

ln L

Min

i 1

g x i ,

yi

ML

Min

i 1

yi

g x i ,

OLS

Thus, under the Gaussian assumption, the OLS estimator turns out

to be identical to the Maximum Likelihood (ML) estimator.

31-Mar-16

Regression

105

Automation Lab

IIT Bombay

Gauss-Newton Method

Consider Taylor series based approximation

in the neighborhood of a guess solution,

g x i ,

g x i ,

g x i ,

Yi

g x ,

g x i ,

for i 1,2,...., n

Defining Yi( k )

Yi( k )

31-Mar-16

Yi

g x i ,

z( i ,k )

and z i ,k

Vi k

Regression

Vi

g x i ,

for i 1,2,..., n

106

53

31-03-2016

Automation Lab

IIT Bombay

Gauss-Newton Method

T

y1 k

.....

ynk

Y

z 1,k

........

T

z n ,k

k

Ak

Ak

v1k

.....

vnk

k

~ N 0,

2

k

estimate of

k

Ak

Ak

is

Ak

a new guess for as k 1 k

k

and continue the iterations till the folloing

termination criterion is satisfied

k

k 1

(tolerance) where

yi

g x i ,

i 1

31-Mar-16

Regression

107

Automation Lab

IIT Bombay

Let N represent the optimum solution obtained

when the Gauss Newton method terminates.

From the properties of OLS, it follows that

Cov

An estimate

2

V

2

N

2 N

V

( N)

V

1

n

N is only

translation of RV N we can argue that Cov N

is identical to that of Cov( ), i.e.

Cov

31-Mar-16

2

V

Regression

108

54

31-03-2016

Automation Lab

IIT Bombay

Defining P N

* .

N , it follows that ~N * , 2 p N

distribution of V

i

i

ii

V N

Since

2 N

V

2

V

and

i

N

V

2

n p

/ 2,n p

N /

V

piiN

N

V

piiN

2

n p

*

i

Thus, 1

P i

, it follows that

2

n p

Z/

~Tn

piiN

*

i

31-Mar-16

/ 2, n p

*

i

piiN

Regression

109

Linearizing Transformations

Automation Lab

IIT Bombay

can be derived using variable transformations

log Nu

log Pr

log Nu

log

rA

log Re

log Pr

log(k0 )

log

log Re

E 1

R T

1

1

x

n log(C A ) V

1

1

y

1/

models can be used for estimating parameters of the

transformed model.

31-Mar-16

Regression

110

55

31-03-2016

Automation Lab

IIT Bombay

Difficulty

The Original Model Residual ( ) cannot be transformed.

T

Min

i 1

Vi

is NOT equivalent to estimating by solving for

Min

2

i

i 1

a good initial guess for solving the nonlinear optimization problem.

31-Mar-16

Regression

111

Automation Lab

IIT Bombay

By this approach, we try to approximate

2

i

wiVi 2

Min

WLS

wi

i 1

Vi

for i 1,2,..., n

Vi

g 0

g

i

i

Vi

g

i

i

1 2

Choose wi

g

i

31-Mar-16

Regression

112

56

31-03-2016

Automation Lab

IIT Bombay

WLS Example

Consider model

Y

f1 x

f2 x

Y

fp

...

ln Y

ln

ln f1 x

....

ln f p

T

Min

i 1

T

Vi

ln

Yi

0,Vi 0

Min

i 1

31-Mar-16

1

Yi

ln yi

ln yi

....

ln Y

ln Yi

ln Yi

Vi

i

vi

ln Yi

Choose wi

yi2

Regression

113

Automation Lab

IIT Bombay

WLS Example

Thus, we solve for

T

Min

T

yi2 vi

Min

T

i 1

ln yi

yi2( ln yi

i 1

diag y12

ln y1

YT

and A

y22 ....

ln y2

yn2 ,

.... ln yn

ln f1 x 1

.... ....

ln f1 x 2

.... .... ln f p 1 x 2

....

1

....

ln f1 x

ln f p 1 x 1

.... ....

n

....

.... .... ln f p 1 x

T ,WLS

31-Mar-16

A T WA

Regression

AT W Y T

114

57

31-03-2016

Automation Lab

IIT Bombay

Some of the earliest contributions to chemical engineering science

occurred in the form of correlations that shed light on mass and heat

transfer in liquids and gases. These correlations were usually

presented in the form of dimensionless numbers, combinations of

physical variables that enable general characterization of transport

and other phenomena in a manner that is independent of specific

physical dimensions of the equipment in question. One such correlation

regarding mass-transfer in falling liquid films is due to Gilliland and

Sherwood, (1934). It relates the Sherwood number, Sh, (a dimension

number representing the ratio of convective to diffusive mass

transport) to two other dimensionless numbers: the Reynolds number,

Re, (a dimensionless number that gives a measure of the ratio of

inertial forces to viscous forces) and the Schmidt number, Sc (the

ratio of momentum diffusivity (viscosity) to mass diffusivity; it

represents the relative ease of molecular momentum and mass

transfer).

Ref.: Ogunnaike, B. A., Random Phenomenon, CRC Press, London, 2010

31-Mar-16

Regression

115

Automation Lab

IIT Bombay

A sample of data from the original

large set (of almost 400 data points!)

is shown in the table.

Postulated model : Sh

Re Sc

0.023, 0.830, 0.440

Parameters estimated using

linearizing transformation and OLS

0.0371,

0.777,

0.4553

OLS

OLS

OLS

linearizing transformation and WLS

0.0372,

0.7852,

0.4298

WLS

31-Mar-16

WLS

Regression

WLS

116

58

31-03-2016

Automation Lab

IIT Bombay

Parameter Estimation

Estimation using Gauss Newton Method

parameters to start the Gauss Newton method.

yi k

g x i ,

yi

Shi

Rei

Sci

for i 1,2,..., n

z

z

i ,k

Rei

g x i ,

i ,k

Sci

Initial Guess :

Generated using

the OLS solution

0.0371,

ln Rei

k

ln Sci

0.777,

0.4553

for i 1,2,..., n

31-Mar-16

Regression

117

Automation Lab

IIT Bombay

Parameter Estimation

0.0338,

0.7885,

0.4361

GN

Cov

GN

GN

95 % confidence intervals

10

-0.679 -0.276 36.707

Method

R2

: [1.06 10

5.70 10 2 ]

: [7.15 10

8.62 10 1 ]

: [3.07 10

5.65 10 1 ]

R2 (adjusted)

Linearized (OLS)

0.9795

0.9768

Linearized (WLS)

0.9797

0.9770

Gauss Newton

0.9798

0.9772

31-Mar-16

Regression

Note: R2

calculated

using the

original

model

118

59

31-03-2016

Automation Lab

IIT Bombay

Results of Regression

ln(Sh/Sc ) -ln(Sh/Sc0.44)

0.01

0.005

-0.005

-0.01

-0.015

7.5

31-Mar-16

Transformed Model (WLS)

Gauss Newton

8.5

ln(Re)

9.5

10

Regression

Results of Regression

119

Automation Lab

IIT Bombay

reference/true values of the parameters, then, from

the figure we can conclude that

Estimates obtained using linearizing transformation and

OLS are far from the reference values

Estimates obtained using linearizing transformation and

WLS are relatively closer to the reference values

Estimates obtained through the Gauss Newton method

are closest to the reference values

31-Mar-16

Regression

120

60

31-03-2016

Automation Lab

IIT Bombay

Michaelis-Menten Model

has been treated with Puromycin at several concentrations

r

C

C

1

2

31-Mar-16

Regression

121

Automation Lab

IIT Bombay

Parameter Estimation

Linearizing Transformation :

1

r

C V

195.8 and

0.04841

1,OLS

1

ri

1

ri

1

i

Ci Vi

Vi

g( i )

Vi

i

2 ,OLS

1

i

ri

1

2

i

ri

1

r

Choose wi

ri4

208.67 and

0.05309

1,WLS

31-Mar-16

2 ,WLS

Regression

122

61

31-03-2016

Automation Lab

IIT Bombay

Parameter Estimation

using the linearizing transformation.

212.68 and

0.06412

1,GN

Cov

10

48.26

0.044

2 ,GN

95 % confidence intervals

0.044

6.857 10

Method

R2

: [1.972 10 2 2.282 10 2 ]

: [4.567 10

R2 (adjusted)

Linearized (OLS)

0.9378

0.9315

Linearized (WLS)

0.9516

0.9468

Gauss Newton

0.9613

0.9574

8.257 10 2 ]

Note: R2

calculated

using the

original

model

The estimates obtained using the linearizing transformation and WLS are

relatively closer to the estimates obtained using Gauss Newton method.

31-Mar-16

Regression

123

Automation Lab

IIT Bombay

Michaelis-Menten Model

220

200

r, Reaction Rate

180

160

140

120

Experimental data

OLS using Linearized Model

Gauss-Newton using Nonlinear Model

WLS using Linearized Model

100

80

60

40

0

31-Mar-16

0.2

0.4

0.6

0.8

C, Concentration (ppm)

Regression

1.2

124

62

31-03-2016

Automation Lab

IIT Bombay

Summary

viewpoint has number of merits.

OLS is shown to be unbiased and most efficient (minimum

variance) estimator among the set of all linear unbiased

estimators

For Gaussian modelling errors, OLS turns out to be the

maximum likelihood estimator and the variance of the

parameter estimates attains the Cramer-Rao bound (see

Appendix). Thus, OLS is the minimum variance unbiased

estimator i.e. best amongst the set of all linear/nonlinear

unbiased estimators.

31-Mar-16

Regression

125

Automation Lab

IIT Bombay

Summary

The probabilistic viewpoint facilitates estimation of

Confidence intervals for the model parameters

Bounds on predicted outputs

add/remove terms in heuristic models

The probabilistic viewpoint provides a systematic method to

select a most suitable model from a set of multiple

candidate models.

31-Mar-16

Regression

126

63

31-03-2016

Automation Lab

IIT Bombay

Summary

which remain untouched in these notes. For example,

what if the values of the independent variable x (or z)

itself are corrupted with random noise?

what if we want to update model parameters on-line,

i.e. when size of the data matrix increases with time as

we collect more and more data?

A list of useful references that discuss such issues in

detail is given on the next slide.

31-Mar-16

Regression

127

Automation Lab

IIT Bombay

References

Jevanovich College Publisher, New York, 1988.

Ogunnaike, B. A., Random Phenomenon, CRC Press, London, 2010.

Myers, R. H., Montgomery, D. C., Vinning, G. G., Robinson, T. J.,

Generalized Linear Models with Applications in Engineering and

Sciences (2nd Ed.), Wiley, N. J., 2010.

Gourdin, A. and M Boumhrat; Applied Numerical Methods. Prentice

Hall India, New Delhi.

Crassidis, J. L. and Junkins, J. L., Optimal Estimation of Dynamic

Systems (2nd Ed.), Chapman and Hall/CRC, 2012.

31-Mar-16

Regression

128

64

31-03-2016

Automation Lab

IIT Bombay

being pursued for more than last 300 years. Through

these notes, we have only touched the tip of this

iceberg. Hope these notes give you some confidence to

explore the iceberg later when you need to do it in your

professional career!

Thank You!

31-Mar-16

Regression

129

Automation Lab

IIT Bombay

31-Mar-16

Regression

130

65

31-03-2016

Automation Lab

IIT Bombay

Cramer-Rao Inequality

errors, the Cramer-Rao inequality give us a lower

bound on the covariance of the errors between the

estimated quantities and the true values.

Consider the conditional density f Y Y |

The Cramer - Rao inequality for an arbitrary unbiased

is given by

weighted least square estimate X

W

E W

W

*

* T

F

ln f Y Y |

ln f Y Y |

3/31/2016

or F

ln f Y Y |

T

State Estimation

131

Automation Lab

IIT Bombay

from measurements Y

A*

Gaussian distribution, i.e. V ~ N 0, R

Now, if is given, then

E Y|

Cov Y|

Cov Y A

Cov V

This implies

f Y Y |

or f Y Y |

3/31/2016

exp

N A ,R

1

Y A

2

State Estimation

R 1Y A

132

66

31-03-2016

Automation Lab

IIT Bombay

1

Y A T R 1 Y A ln( )

2

and the Fisher information matrix become

ln f Y Y |

ln f Y Y |

T

AT R 1A

Cov W

E W

AT R 1A

MV

A T R 1A A T R 1 V

which implies

E MV

MV

3/31/2016

A T R 1A

State Estimation

133

Automation Lab

IIT Bombay

obtained using the Gauss-Markov theorem

is the most efficient possible estimate

For the regression problem under consideration

1

AT A AT V

MV

OLS

and R

E OLS

I, which implies

* OLS

AT A

Gaussian distribution implies that

E OLS

OLS

3/31/2016

State Estimation

134

67

- Chap 14Uploaded bywhist888
- TERMINOLOGIA ECONOMETRICAUploaded byIsaí Coa
- Lecture 08Uploaded byNaeem Ahmed Hattar
- HEDONICS FOR PERSONAL COMPUTERS: A REEXAMINATION OF SELECTED ECONOMETRIC ISSUESUploaded byKálosi Szilárd
- Egger and Lassmann (2012)_The Language Effect in International Trade a Meta-AnalysisUploaded byTan Jiunn Woei
- Subset Selection in Regression AnalysisUploaded byarifulmondal
- 102b_Lect1_Jan8Uploaded byvaleriafedyk
- NET June 2008 economics paper iiiUploaded bySunil Yadav
- MT 201314 FeedecbackUploaded byChris Davies
- class2Uploaded byNdivhuho Neosta
- 37 Using OctaveUploaded bygrecatprep9045
- Modelo Multiple 1Uploaded bynavs05
- Econo I Course Outline[1]Uploaded byAbbas Raza
- The Classical Two-Variable Regression ModelUploaded bythrphys1940
- L1_Classical Linear RegressionUploaded byMan Ho Li
- multiple reggresion slideUploaded byGülben
- Anderson 1997 Decision Support SystemsUploaded byIim Mucharam
- iv_specification_testsUploaded byVictor Haselmann Arakawa
- Comparing the Performance of Public and PrivateUploaded bynobitaenoki
- MA-2015-Option-A.pdfUploaded bycoolanki16
- Oliver Linton - Econometrics.pdfUploaded bysilvio de paula
- First Five Chapter GujratiUploaded byনিশীথিনী কুহুরানী
- kristen honors thesis presentationUploaded byapi-223080053
- HW_III (4).pdfUploaded byAnonymous os8J9R
- CheatsheetUploaded byसुनिल बज्राचार्य
- Econometrics BruceUploaded byWendel Mirbel
- Exercises Chapter 6Uploaded byBinasxx
- 1_12(1)Uploaded byRatna Yuniarti
- ECON2206 Assignment 2 William Chau z3376203Uploaded byPeter Dundaro
- GWR_WhitePaper.pdfUploaded byAgus Salim Arsyad

- tut1-2016-qUploaded byAbhishek Sarda
- Gas Absorption 2Uploaded byAbhishek Sarda
- Gas Absorption 2Uploaded byAbhishek Sarda
- Midsem-2012Uploaded byAbhishek Sarda
- Draft Report Ht 305Uploaded byAbhishek Sarda
- Apping GuideUploaded byAnirudh Subramanyam
- Quiz Sheet 1 SolutionUploaded byAbhishek Sarda
- Course Content CL305 2016 S1Uploaded byAbhishek Sarda
- Critical diameter of insulationUploaded bySemihBaşkale
- tut2-solnsUploaded byAbhishek Sarda
- Course OutlineUploaded byAbhishek Sarda
- DistillationUploaded byAbhishek Sarda
- DiameterUploaded byAbhishek Sarda
- Convective Mass Transfer.pdfUploaded byAnonymous 4XZYsImTW5
- Beau Ti Full MessagesUploaded byAbhishek Sarda
- Midsem-2012Uploaded byAbhishek Sarda
- Apping GuideUploaded byAnirudh Subramanyam
- Transient Flow AnalysisUploaded byAbhishek Sarda
- Culture IQ SternbergUploaded byAbhishek Sarda
- SolutioncxfUploaded byAbhishek Sarda
- Pentadienyl CationUploaded byAbhishek Sarda
- PsychoUploaded byAbhishek Sarda
- thermodynamicsUploaded byAbhishek Sarda
- R3000Uploaded byAbhishek Sarda
- mmvliqUploaded byAbhishek Sarda
- Emulsion Coplymerization of Mma and BaUploaded byAbhishek Sarda
- Issue 1 January 2015 BMW Models Price List NewUploaded byAbhishek Sarda

- lec3-08Uploaded byalanpicard2303
- post konsul uji delta niat.docUploaded byTUTIK SISWATI
- Lecture 09Uploaded bynandish mehta
- chap12_2Uploaded byNuur Ahmed
- A Complete Tutorial which teaches Data Exploration in detail.pdfUploaded byTeodor von Burg
- Statistik TugasUploaded byhalcyon12
- Time Series Lecture 1Uploaded byJason Wu
- Module 6 Analysis of VarianceUploaded byLao Mjie
- chapter06-091211160706-phpapp01Uploaded byshan4600
- EconometriaUploaded byJosePaulPeralta
- Desriptive Statistics_Zarni AmriUploaded byAVG2011
- Chapter 7Uploaded bysupermanedit
- Descriptive and Inferential StatisticsUploaded byabusroor2008
- Ece201 Probability Theory and Random Process Th 1.10 Ac29Uploaded bynetgalaxy2010
- Logistic Regression sample size calculationUploaded byAmado Saavedra
- Probability and StatisticsUploaded bytaylor
- AsaUploaded byShailesh Dewan
- May 2003 Exam AnswersUploaded byPete Jacopo Belbo Caya
- Chisquare.indep.sol1Uploaded byMunawar
- Generating Random Variables - Simulating methodsUploaded byyouyiz
- Regresion__15226_4fps%5B1%5DUploaded byCybelle M. Lopez Valentin
- AP Statistics Test ReviewUploaded byPongman92
- Livro_distribuicoesUploaded byVanêzä Souza
- Stat515 Midterm II Fall 2010Uploaded byRishabh Vaish
- TUGAS Statistics.docx 2003Uploaded byratna
- Monte carlo Simulation.xlsxUploaded byAnuj Popli
- Discrete DistributionsUploaded bymanishkapoor04
- cogsysII-9.pdfUploaded byabhijeet borkar
- hw4591Uploaded byRamai Azami
- Fundamentals of Statistical Reasoning in EducationUploaded byGabychis