
Statistical Digital Signal Processing and Modeling

Prof. Dr. Guido Schuster University of Applied Sciences of Eastern Switzerland in Rapperswil (HSR)

Chapter 4: Signal Modeling. These slides follow closely the book Statistical Digital Signal Processing and Modeling by Monson H. Hayes, and most of the figures and formulas are taken from there.

Introduction
The goal of signal modeling is to obtain a parametric description of the signal. This can be used for filter design, interpolation, extrapolation, and/or compression. We always use the same model: the signal is the output of a causal linear shift-invariant filter that has a rational system function. The filter input is typically a discrete impulse δ(n).

Direct Least Squares Method


The modeling error is the difference between the desired signal x(n) and the unit sample response h(n), e'(n) = x(n) - h(n). The goal is to minimize the squared error

E_LS = Σ_{n=0}^{∞} |x(n) - h(n)|^2,

which implies setting the partial derivatives of E_LS with respect to the model coefficients a_p(k) and b_q(k) to zero.

This results in a set of nonlinear equations, because h(n) depends nonlinearly on the denominator coefficients.

These are hard to solve, and hence the direct least squares method is not used in practice.

Exact Signal Matching


Let x(n) be a signal that we want to model with the unit sample response of a causal first-order all-pole filter, H(z) = b(0) / (1 + a(1) z^-1), i.e., h(n) = b(0) (-a(1))^n u(n). This unit sample response has two degrees of freedom, b(0) and a(1). Setting h(0) = x(0) and h(1) = x(1) implies b(0) = x(0) and -b(0) a(1) = x(1). Assuming x(0) is not zero, this results in the following model:

H(z) = x(0) / (1 - (x(1)/x(0)) z^-1).

Exact Signal Matching


If we increase the order of the model to include one pole and one zero, H(z) = (b(0) + b(1) z^-1) / (1 + a(1) z^-1), the problem becomes a bit more complicated. This unit sample response has three degrees of freedom: b(0), b(1) and a(1). Setting h(0) = x(0), h(1) = x(1) and h(2) = x(2) leads to equations that are nonlinear, but in this case they can still easily be solved.

Padé Approximation
The previous example showed that matching a number of unit sample response values h(n) equal to the degrees of freedom (p+q+1) in the system function H(z) can result in a set of nonlinear equations. This is true in general, but there is an elegant trick to avoid these nonlinear equations and STILL match a given number of x(n) values with the unit sample response of a linear time-invariant filter.

Padé Approximation
Instead of working with the system function H(z) = B_q(z)/A_p(z) directly, we use a little trick: we require A_p(z) X(z) = B_q(z). In the time domain, this becomes a convolution,

x(n) + Σ_{k=1}^{p} a_p(k) x(n-k) = b_q(n).

For n > q, the right-hand side is zero. The first p+q+1 of these equations can also be written in matrix notation.

Padé Approximation
This matrix equation is solved in two steps: first for the poles, then, with the now known poles, for the zeros.

Since the right-hand side is zero for n = q+1, ..., q+p, the a_p(k) parameters can be found by solving this set of p linear equations:

x(n) + Σ_{k=1}^{p} a_p(k) x(n-k) = 0,  n = q+1, ..., q+p.

Padé Approximation
There are three cases which have to be handled.

Case I: X_q is non-singular.
Hence the inverse exists and the coefficients of A_p(z) are unique.

Case II: X_q is singular and a solution exists.

The coefficients of A_p(z) are not unique, since there are non-zero solutions z to the homogeneous equation X_q z = 0. Hence, for any solution a_p and any such z, the combination a_p + αz is also a solution. In this case the solution where the most coefficients are zero might be a good choice.

Case III: X_q is singular and no solution exists.

Hence the assumption that a(0) is not zero was incorrect. So we set a(0) to zero and solve the homogeneous set of equations X_q a_p = 0 instead of the inhomogeneous one. Since X_q is singular, there is a nonzero solution to this.

Padé Approximation
Having found the coefficients of A_p(z), the second step is to find the coefficients of B_q(z) using the first q+1 equations, either in matrix notation or using the convolution formula directly:

b_q(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k),  n = 0, 1, ..., q.
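Both steps are small enough to sketch in code. Below is a minimal NumPy sketch of the Padé procedure for Case I (nonsingular data matrix); the function name pade and its interface are our own choices, not something defined in the slides or the book.

```python
import numpy as np

def pade(x, p, q):
    """Pade approximation: match x(0)..x(p+q) exactly with H(z) = B(z)/A(z).

    Covers Case I only, i.e., the p-by-p data matrix is assumed nonsingular.
    Returns (b, a) with a[0] = 1; x must hold at least p+q+1 samples.
    """
    x = np.asarray(x, dtype=float)
    # p equations for n = q+1..q+p:  x(n) + sum_k a(k) x(n-k) = 0
    X = np.array([[x[q + i - k] if q + i - k >= 0 else 0.0
                   for k in range(1, p + 1)]
                  for i in range(1, p + 1)])
    a = np.concatenate(([1.0], np.linalg.solve(X, -x[q + 1:q + p + 1])))
    # first q+1 equations give the numerator: b(n) = sum_k a(k) x(n-k)
    b = np.array([sum(a[k] * x[n - k] for k in range(min(n, p) + 1))
                  for n in range(q + 1)])
    return b, a
```

As a quick check, pade([2.0, 1.0, 0.5], p=1, q=0) returns b = [2.0] and a = [1.0, -0.5], i.e., H(z) = 2/(1 - 0.5 z^-1), whose impulse response 2(0.5)^n matches all three given samples.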



All-Pole Model Padé Approximation

Assume that H(z) is an all-pole model, i.e., q = 0:

H(z) = b(0) / (1 + Σ_{k=1}^{p} a_p(k) z^-k).

Hence the Padé equations become

x(n) + Σ_{k=1}^{p} a_p(k) x(n-k) = 0,  n = 1, ..., p,   and   b(0) = x(0).

Or in matrix notation: since X_0 is a lower triangular Toeplitz matrix, the denominator coefficients may be found easily by back substitution.
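For the all-pole case the Padé step reduces to a triangular solve. A minimal sketch under the same assumptions (x(0) != 0; the function name is ours):

```python
import numpy as np
from scipy.linalg import solve_triangular

def pade_all_pole(x, p):
    """All-pole Pade (q = 0): X0 is lower triangular Toeplitz with x(0) on
    the diagonal, so the denominator follows by substitution; b(0) = x(0)."""
    x = np.asarray(x, dtype=float)
    X0 = np.array([[x[i - k] if i - k >= 0 else 0.0
                    for k in range(p)] for i in range(p)])
    a = solve_triangular(X0, -x[1:p + 1], lower=True)
    return x[0], np.concatenate(([1.0], a))
```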

Padé Approximation Example

The first six values of a signal x(n) are given. The goal is to find three different Padé approximations, each with three degrees of freedom:
All pole (q=0, p=2), FIR (q=2, p=0), IIR (q=1, p=1).

[Figure: stem plot of the given signal x(n) for n = 0, ..., 5.]

All-pole model (q=0, p=2):

Padé Approximation Example

The last two Padé equations result in a 2x2 linear system. Substituting the given values and solving for a(1) and a(2) yields the denominator coefficients, and from the first equation b(0) = x(0) follows. Hence the model for x(n) is fully determined. Note that the model is not stable, since the poles are outside the unit circle, even though the unit sample response matches x(0), x(1) and x(2).

[Figures: impulse response of the all-pole model; it matches the first three samples of x(n) but grows without bound, since the model is unstable.]

Padé Approximation Example

FIR model (q=2, p=0): the result is trivial. The model is simply B_2(z) = x(0) + x(1) z^-1 + x(2) z^-2; the first three values of the unit sample response are matched, and all other values are zero.

[Figure: impulse response of the FIR model for n = 0, ..., 5.]

Padé Approximation Example

One pole and one zero: IIR model (q=1, p=1). The Padé equations are set up as before. The pole can be found from the last equation, which determines a(1). Knowing the pole allows us to calculate the zeros with the top two equations. Hence the model has a unit sample response identical to x(n) (special case!).

Padé Approximation Example

How can the Padé equations be solved if X_q is singular? Let x(n) be approximated with an IIR model (p=2, q=2), i.e., 5 degrees of freedom. The last two equations are used for finding the poles. Clearly this is a singular set of equations, and no non-zero a(1), a(2) combination can be found. This implies that the assumption a(0)=1 is incorrect.

Padé Approximation Example

Hence setting a(0) to 0 results in a new set of equations, which has a non-trivial solution. Using the first three equations to determine the zeros, the model and its unit sample response follow. The first 5 values of this response only match the first 4 values of x(n)!

Padé Approximation Example

The Padé approximation can also be used for filter design. The goal is an ideal halfband lowpass with an as yet unspecified filter delay of n_d. The unit sample response is i(n) = sin((n - n_d)π/2) / ((n - n_d)π). This response should now be approximated using the Padé approach with 11 degrees of freedom:
Once with FIR (p=0, q=10), once with IIR (p=5, q=5).

Padé Approximation Example

Since we can exactly match 11 values of i(n), it makes sense to set n_d to 5, so that we can capture most of the energy with the 11 degrees of freedom. As always, the FIR case is easy: note that it amounts to simply multiplying the ideal filter unit sample response with a rectangular window (window method).

The IIR model follows the steps we used before.


Prony's Method

Padé matches perfectly the number of samples of x(n) that corresponds to the degrees of freedom (p+q+1). What happens after p+q+1 is of no concern to the Padé method. Prony does not match p+q+1 samples perfectly, but tries to use its degrees of freedom such that an overall squared error is minimized.

[Figures: the given signal x(n) and the unstable impulse response of the all-pole Padé model from the earlier example.]

Prony's Method

Prony uses the same trick as Padé, but is also concerned about the values beyond p+q+1, where the equality does not hold anymore. The way the error is defined results in a linear problem, which is much easier to solve than the nonlinear equations of the direct least squares method.

Prony's Method

Since b_q(n) = 0 for n > q, the error for n > q can be written explicitly (with a(0) = 1):

e(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k).

Instead of setting e(n) = 0 for n = 0, ..., p+q as in the Padé approximation, Prony's method begins by finding the coefficients a_p(k) that minimize the squared error

E_{p,q} = Σ_{n=q+1}^{∞} |e(n)|^2.

As with Padé, since we focus on samples n > q, the error depends only on the coefficients a_p(k).

Prony's Method

These coefficients can be found by setting the partial derivatives of E_{p,q} to zero. Since the partial derivative of e*(n) with respect to a_p*(k) is x*(n-k), this leads to the orthogonality principle

Σ_{n=q+1}^{∞} e(n) x*(n-k) = 0,  k = 1, ..., p.

Substituting the error expression into this leads to the normal equations.

Prony's Method

This can be simplified by using the following definition, which is very similar to the sample autocorrelation sequence (here we are not dividing by the number of samples N, and the sum does not run over all samples, it starts at q+1):

r_x(k,l) = Σ_{n=q+1}^{∞} x(n-l) x*(n-k).

This gives a set of p linear equations in the p unknowns a_p(1), ..., a_p(p), referred to as the Prony normal equations:

Σ_{l=1}^{p} a_p(l) r_x(k,l) = -r_x(k,0),  k = 1, ..., p,

or in matrix notation R_x a_p = -r_x.

Prony's Method

The Prony normal equations can be expressed using the data matrix X_q, containing p infinite-dimensional column vectors (shifted versions of x(n)). The autocorrelation matrix R_x may be written in terms of X_q as R_x = X_q^H X_q, and the vector of autocorrelations as r_x = X_q^H x_{q+1}. Hence

(X_q^H X_q) a_p = -X_q^H x_{q+1}

is an equivalent form of the Prony normal equations.

Prony's Method

If R_x is nonsingular, then the coefficients a_p(k) that minimize the MSE are

a_p = -R_x^-1 r_x,

or equivalently a_p = -(X_q^H X_q)^-1 X_q^H x_{q+1}. Note that (X_q^H X_q)^-1 X_q^H is also called the pseudo-inverse of X_q.

Prony's Method

Now the value of the modeling error can be determined by expanding the sum of squared errors. It follows from the orthogonality principle that the second term is zero; therefore the minimum modeling error is

ε_{p,q} = Σ_{n=q+1}^{∞} e(n) x*(n),

which can be written in terms of the autocorrelation sequence as

ε_{p,q} = r_x(0,0) + Σ_{k=1}^{p} a_p(k) r_x(0,k).

Prony's Method

The normal equations can be written slightly differently, which will come in handy later on, as the so-called augmented normal equations; in this matrix form the minimum error ε_{p,q} appears explicitly on the right-hand side.

Prony's Method

Once the coefficients a_p(k) have been found, the coefficients b_q(k) are found in the same fashion as with the Padé approximation; in other words, the error is set to zero for n = 0, ..., q. This can be done using the convolution formula directly,

b_q(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k),  n = 0, ..., q,

or with a matrix multiplication.
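As a concrete illustration, here is a minimal NumPy sketch of Prony's method on a finite record, where the infinite error sum is truncated at the end of the available data; the function name prony and its interface are our own, not from the slides.

```python
import numpy as np

def prony(x, p, q):
    """Prony's method on a finite record x(0..N-1): least squares fit of
    a_p(1..p) over n = q+1..N-1, then b_q(0..q) by setting e(n) = 0 for
    n = 0..q. Returns (b, a) with a[0] = 1."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # overdetermined set: x(n) + sum_k a(k) x(n-k) = 0 for n = q+1..N-1
    Xq = np.array([[x[n - k] if n - k >= 0 else 0.0
                    for k in range(1, p + 1)]
                   for n in range(q + 1, N)])
    a, *_ = np.linalg.lstsq(Xq, -x[q + 1:], rcond=None)
    a = np.concatenate(([1.0], a))
    # exact match of the first q+1 samples: b(n) = x(n) + sum_k a(k) x(n-k)
    b = np.array([sum(a[k] * x[n - k] for k in range(min(n, p) + 1))
                  for n in range(q + 1)])
    return b, a
```

Solving the least squares problem directly is numerically equivalent to forming and solving the normal equations above.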


Prony's Method Example

Given a pulse x(n), find the IIR (p=1, q=1) model using Prony's method.

For p=1, Prony's normal equations reduce to a single equation in a(1). Evaluating the autocorrelation sums for the given pulse and solving yields a(1).

Prony's Method Example

This results in the value of a(1), and the denominator of H(z) follows. The numerator coefficients are found, in general, with a matrix multiplication; in particular, this gives b(0) and b(1). Thus the model for x(n) is determined.

Prony's Method Example

For the minimum squared error we have the expression derived above. Since the autocorrelation sums of the pulse are known, this can be evaluated; for example, for N = 21.

The error e(n) corresponds to the minimum squared Prony error; the true error e'(n) = x(n) - h(n) results in a squared true error (this is the error the direct method tries to minimize).

Prony's Method Example

[Figures: two impulse responses for n = 0, ..., 40, comparing the given pulse with the Prony model.]

Prony's Method Example

Comparing Prony's solution with the solution Padé would produce: using the last equation to find a(1) results in a(1) = -1, and using the top two equations to find b(0) and b(1) results in b(0) = 1 and b(1) = 0. The unit sample response is then h(n) = u(n), the unit step. The model error e(n) follows, and the true error is 0 for n = 0, ..., 21 and 1 for all other n!

Prony's Method Example

The same lowpass as with the Padé approximation should be designed. Again we use an IIR (p=5, q=5) model and n_d = 5. Solving Prony's normal equations for p = 5 yields the denominator coefficients.

Using these coefficients, the b_q(k) coefficients are found.


Prony's Method

We may also formulate Prony's method in terms of finding the least squares solution to a set of (infinitely many) overdetermined linear equations, all of which want the error to be zero; but that is not possible, since we have only p+q+1 degrees of freedom. In matrix notation, X_q a_p ≈ -x_{q+1}.

For such an overdetermined set of linear equations, the least squares solution can be found using the pseudo-inverse.

Shanks' Method

In Prony's method the numerator coefficients are found by setting the error to zero for n = 0, ..., q. This forces the model to be exact for those values, but does not take into account values greater than q. A better approach is to perform a least squares minimization of the model error over the entire length of the data record.

Shanks' Method

Note that H(z) can be interpreted as a cascade of two filters: 1/A_p(z) followed by B_q(z).

Once A_p(z) has been determined, the unit sample response g(n) of 1/A_p(z) can be computed. Instead of forcing e(n) to zero for the first q+1 values of n as in Prony's method, Shanks' method minimizes the squared error

E_S = Σ_{n=0}^{∞} |e(n)|^2,  with  e(n) = x(n) - Σ_{k=0}^{q} b_q(k) g(n-k).

Shanks' Method

Note that this is the same error as with the direct method, but since the poles are already determined, solving for the zeros is a linear problem and hence much simpler.

As we did with Prony's method, we use the deterministic auto- and cross-correlation sequences

r_g(k,l) = Σ_{n=0}^{∞} g(n-l) g*(n-k)   and   r_xg(k) = Σ_{n=0}^{∞} x(n) g*(n-k).

Note that here the lower limit starts at 0, while for Prony's method the lower limit starts at q+1.

Shanks' Method

In matrix form, these equations are the normal equations

Σ_{l=0}^{q} b_q(l) r_g(k,l) = r_xg(k),  k = 0, 1, ..., q.

These can be simplified by noticing that, since g(n) = 0 for n < 0 (causal filter), the terms from n < 0 vanish for k >= 0 and l >= 0. Therefore each term r_g(k,l) depends only on the difference between k and l.

Shanks' Method

Therefore r_g(k,l) becomes r_g(k-l), and since r_g(k-l) = r_g*(l-k) (*), the above equation can be written in matrix form with a Hermitian Toeplitz matrix R_g, or more compactly as R_g b_q = r_xg.

(*) Note that there is an error in the book: r_x(k-l) = r_x*(l-k).

Shanks' Method

The minimum squared error can now be found. Since e(n) and g(n) are orthogonal, the second term is zero; therefore the minimum error is

ε_S = Σ_{n=0}^{∞} e(n) x*(n),

or, in terms of r_x(k) and r_xg(k), using r_gx(-k) = r_xg*(k):

ε_S = r_x(0) - Σ_{k=0}^{q} b_q(k) r_xg*(k).

Shanks' Method

Shanks' method can also be interpreted as finding a least squares solution to an overdetermined set of linear equations, by letting e(n) = 0 for all n >= 0. Writing the convolution x(n) ≈ Σ_k b_q(k) g(n-k) as an overdetermined set of linear equations results in G_0 b_q ≈ x_0. The pseudo-inverse (G_0^H G_0)^-1 G_0^H will then find the least squares solution (Matlab: bq = G0\x0).
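A minimal NumPy sketch of Shanks' method on a finite record; the last line is the Python counterpart of the Matlab one-liner quoted above (function and variable names are our own).

```python
import numpy as np
from scipy.signal import lfilter

def shanks(x, a, q):
    """Shanks' method: with the denominator a (a[0] = 1) already fixed,
    choose b(0..q) by least squares over the whole record, i.e. minimize
    sum_n |x(n) - sum_k b(k) g(n-k)|^2, where g(n) is the unit sample
    response of 1/A(z)."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    delta = np.zeros(N)
    delta[0] = 1.0
    g = lfilter([1.0], a, delta)                 # g(n), n = 0..N-1
    # column k of G0 holds g(n-k): g delayed by k samples
    G0 = np.column_stack([np.concatenate((np.zeros(k), g[:N - k]))
                          for k in range(q + 1)])
    b, *_ = np.linalg.lstsq(G0, x, rcond=None)   # Matlab: bq = G0\x0
    return b
```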


Shanks' Method Example

Back to the unit pulse of length N. For an IIR model (p=1, q=1), Prony's method (and also Shanks' method) results in the same denominator polynomial as before. Shanks' method is now used to find the numerator polynomial. For this we need to know g(n).

Shanks' Method Example

Finding g(n) by using the inverse z-transform of 1/A_1(z) results in g(n).

Now the auto- and cross-correlation sequences are needed.

Shanks' Method Example

Therefore r_g(k) and r_xg(k) can be evaluated, and solving for b_q(0) and b_q(1) results in the numerator coefficients. For N = 21 the model becomes fully specified, with a true error that can be compared against the Prony solution.

All-Pole Modeling

The main advantage of all-pole models is that there are fast algorithms (Levinson-Durbin recursion) to solve the Prony normal equations. Another reason is that many physical processes, such as speech, can be well modeled with an all-pole model. The error that we are concerned with in Prony's method for finding the a_p(k) coefficients is (q = 0)

e(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k).

All-Pole Modeling

Since x(n) = 0 for n < 0, the error at time n = 0 is equal to x(0) - b(0) and hence does not depend on the a coefficients. We can therefore include it in the sum of squared errors, and the minimization will still result in the same a coefficients. Following the steps of the Prony derivation, we arrive at the same normal equations, where now

r_x(k,l) = Σ_{n=0}^{∞} x(n-l) x*(n-k).

Note that for Prony the sum would start at q+1 = 1, but since we minimize the error from n = 0 on, it starts at 0.

All-Pole Modeling

Again this can be simplified: since x(n) = 0 for n < 0, the extra terms vanish for k >= 0 and l >= 0. This means that r_x(k,l) depends only on the difference between k and l. We define

r_x(k) = Σ_{n=0}^{∞} x(n) x*(n-k).

Hence the all-pole normal equations are

Σ_{l=1}^{p} a_p(l) r_x(k-l) = -r_x(k),  k = 1, ..., p,

or, in matrix form, a Hermitian Toeplitz system, using the conjugate symmetry r_x(-k) = r_x*(k).

All-Pole Modeling

The modeling error is given by

ε_p = r_x(0) + Σ_{k=1}^{p} a_p(k) r_x*(k).

We still need to find b(0). The obvious choice would be b(0) = x(0). Since the entire unit sample response is scaled by b(0), it might be better to select b(0) such that the overall energy in x(n), r_x(0), is equal to the overall energy of the unit sample response h(n), r_h(0). Without proof, this can be achieved by setting b(0) = sqrt(ε_p).

All-Pole Modeling

As with the Padé approximation and Prony's method, all-pole modeling can be interpreted as finding the least squares solution to the overdetermined set of linear equations obtained by setting e(n) = 0 for all n > 0.


All-Pole Modeling Example

Let us find a first-order all-pole model of the form H(z) = b(0) / (1 + a(1) z^-1) for a given signal x(n). The autocorrelation sequence r_x(k) is computed first. The all-pole normal equations then give, for a first-order model, the single equation

r_x(0) a(1) = -r_x(1),  i.e.,  a(1) = -r_x(1)/r_x(0).

All-Pole Modeling Example

Therefore a(1) follows, and the modeling error is ε_1 = r_x(0) + a(1) r_x(1). Selecting b(0) = sqrt(ε_1) to match the energy, the model is complete.

All-Pole Modeling Example

Let us find a second-order all-pole model for the same signal x(n). For a second-order model, the normal equations become a 2x2 Toeplitz system. Evaluating the autocorrelations in this example and solving results in a(1) and a(2). The modeling error follows, and b(0) is again selected to match the energy. (Note: since r_x(k) = 0 for k > 1, the modeling error only depends on a(1).)

All-Pole Modeling

The all-pole normal equations can again be brought into a special form which contains the modeling error, the augmented normal equations:

R_x [1, a_p(1), ..., a_p(p)]^T = [ε_p, 0, ..., 0]^T.

Linear Prediction

We now establish the equivalence between all-pole signal modeling and linear prediction. Recall that Prony's method finds the set of all-pole parameters that minimize the sum of the squared errors

e(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k).

This error can be read as the difference between x(n) and its prediction from the p previous samples, x̂(n) = -Σ_{k=1}^{p} a_p(k) x(n-k); minimizing it is therefore exactly the linear prediction problem.

FIR Least Squares Inverse Filters

Let us find an inverse filter h(n) for the filter g(n), i.e., h(n) * g(n) = δ(n), or in the z-domain H(z) G(z) = 1. In other words, we are looking for an equalizer H(z) = 1/G(z). Note that most of the time equality is not possible, since we cannot equalize lost frequency bands without infinite amplification. Furthermore, as we will see later on, noise amplification is also a problem of inverse filters.

FIR Least Squares Inverse Filters

To be sure that the inverse filter is stable, we constrain it to be an FIR filter of length N. The error of the approximation is

e(n) = δ(n) - h_N(n) * g(n),

and the goal is to minimize the sum of the squared errors. This is the same setting as in Shanks' method.

FIR Least Squares Inverse Filters

Since this is Shanks' method, the solution for the optimum least squares inverse filter is given by the normal equations

Σ_{l=0}^{N-1} h_N(l) r_g(k-l) = r_dg(k),  k = 0, ..., N-1,

where r_g(k) is the deterministic autocorrelation of g(n) and r_dg(k) the cross-correlation between d(n) and g(n). Since d(n) = δ(n) and g(n) = 0 for n < 0, we get r_dg(k) = g*(0) δ(k). Hence the matrix equation becomes R_g h_N = g*(0) u_1, with u_1 the first unit vector.

FIR Least Squares Inverse Filters

The minimum squared error is ε = 1 - h_N(0) g(0) (for real-valued g(n)).

Again, this can be formulated as finding the least squares solution to a set of overdetermined linear equations by setting e(n) = 0 for n >= 0. This results in G h_N ≈ u_1, which can be solved with the pseudo-inverse.
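A minimal sketch for the real-valued case, solving R_g h = g(0) u_1 with SciPy's Toeplitz solver; the function name and the zero-padding convention are our own assumptions.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def ls_inverse(g, N):
    """FIR least squares inverse (real g): h of length N minimizing
    sum_n |delta(n) - (h*g)(n)|^2, via the Toeplitz normal equations
    R_g h = g(0) u_1."""
    g = np.asarray(g, dtype=float)
    rg = np.correlate(g, g, mode='full')[len(g) - 1:]   # r_g(0), r_g(1), ...
    rg = np.concatenate((rg, np.zeros(N)))[:N]          # pad/truncate to N lags
    rdg = np.zeros(N)
    rdg[0] = g[0]                                       # r_dg(k) = g(0) delta(k)
    return solve_toeplitz(rg, rdg)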

FIR Least Squares Inverse Example

The goal is to find the FIR least squares inverse for a system whose unit sample response g(n) depends on a parameter α with |α| < 1.

We look for an FIR inverse of length N = 2; for this, the autocorrelation sequence of g(n) is needed.

FIR Least Squares Inverse Example

In general the normal equations are R_g h_N = g*(0) u_1. In this example they reduce to a 2x2 Toeplitz system; hence the solution for h(0) and h(1) follows, together with the corresponding minimum squared error and system function.

FIR Least Squares Inverse Filters with Delay

In many cases, constraining the least squares inverse filter to minimize the difference between h_N(n) * g(n) and δ(n) is overly restrictive. Often a delay is tolerable, i.e., d(n) = δ(n - n_0). Clearly we need to solve the same normal equations, where r_dg(k) has now changed to

r_dg(k) = g*(n_0 - k).

Hence the equations that define the coefficients of the FIR least squares inverse filter are R_g h_N = r_dg, and the minimum squared error is ε = 1 - Σ_k h_N(k) g(n_0 - k).

FIR Least Squares Inverse Filters with Delay

Again, this can be formulated as finding the least squares solution to a set of overdetermined linear equations by setting e(n) = 0 for n >= 0. This results in G h_N ≈ u_{n_0+1}, where the 1 on the right-hand side is now at position n_0 + 1 (in this example n_0 = 2). This can be solved with the pseudo-inverse.
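The only change relative to the no-delay sketch above is the right-hand side; a minimal variant (names ours, real-valued g assumed):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def ls_inverse_delay(g, N, n0):
    """FIR least squares inverse with delay (real g): d(n) = delta(n - n0),
    so the right-hand side becomes r_dg(k) = g(n0 - k)."""
    g = np.asarray(g, dtype=float)
    rg = np.correlate(g, g, mode='full')[len(g) - 1:]
    rg = np.concatenate((rg, np.zeros(N)))[:N]
    rdg = np.array([g[n0 - k] if 0 <= n0 - k < len(g) else 0.0
                    for k in range(N)])
    return solve_toeplitz(rg, rdg)
```

Sweeping n0 over its allowed range and keeping the filter with the smallest error 1 - h·r_dg picks the most tolerable delay, which is the kind of search the following example calls for.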

FIR Least Squares Inverse Filters with Delay Example

The goal is to find the FIR least squares inverse filter of length N = 11 for a given system g(n). Since h(n) is of length 11 and g(n) is of length 3, h(n) * g(n) is of length 13. Hence the delay n_0 must be between 0 and 12.

FIR Least Squares Inverse Filters

So far the goal was to find an FIR inverse filter which results in a pure delay. Sometimes it might be interesting to find a filter which results in something other than a pure delay. Since there were no assumptions about d(n) in the derivation, the same result holds: in the interpretation as an overdetermined set of linear equations, the right-hand side simply becomes a vector containing d(n).

Finite Data Records

So far we have assumed that the data x(n) is available for all times. In reality, this is never the case, and hence we need to deal with the fact that the data is only known in the interval [0, N]. There are two distinct approaches to this problem: the autocorrelation method and the covariance method. Since they are most often used in the context of all-pole modeling, this is the focus of this section.

The Autocorrelation Method

Assume x(n) is only known over the finite interval [0, N] and should be approximated using an all-pole model. Using Prony's method, the coefficients a_p(k) are found that minimize the sum of squared errors, where

e(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k).

If x(n) is unknown outside [0, N], then e(n) cannot be evaluated for n < p or for n > N. So the autocorrelation method assumes that all values outside the interval [0, N] are zero.

The Autocorrelation Method

Now we can use the Prony normal equations to find an all-pole model for the windowed signal.

The main difference is that for the summation of the autocorrelation sequence, the lower limit is now k and the upper limit N:

r_x(k) = Σ_{n=k}^{N} x(n) x*(n-k).

The Toeplitz structure of the normal equations is preserved, which means that the Levinson-Durbin recursion can be used to efficiently find the solution.
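A minimal NumPy/SciPy sketch of the autocorrelation method for the real-valued case; scipy.linalg.solve_toeplitz uses a Levinson-Durbin-style recursion internally. The energy-matching choice of b(0) follows the earlier slides; the function name and interface are our own.

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def autocorrelation_method(x, p):
    """Autocorrelation method: treat x as zero outside [0, N], form
    r_x(k) = sum_{n=k}^{N} x(n) x(n-k), and solve the Toeplitz normal
    equations. Requires len(x) > p; returns (b0, a) with a[0] = 1."""
    x = np.asarray(x, dtype=float)
    rx = np.correlate(x, x, mode='full')[len(x) - 1:]   # r_x(0..N)
    a = np.concatenate(([1.0], solve_toeplitz(rx[:p], -rx[1:p + 1])))
    eps = rx[0] + a[1:] @ rx[1:p + 1]                   # modeling error
    return np.sqrt(eps), a                              # b(0) = sqrt(eps)
```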


Autocorrelation Method Example

Given the signal x(n), whose first N+1 values are known, use the autocorrelation method to find a first-order all-pole model of the form H(z) = b(0) / (1 + a(1) z^-1). The normal equations are as given above, and in this example (p = 1) a single equation follows:

a(1) = -r_x(1)/r_x(0).

Autocorrelation Method Example

Since x(n) is known on [0, N] and r_x(k) = 0 for k > N, the two autocorrelation values can be evaluated; therefore a(1), and hence the model, follows.

The Autocorrelation Method

Applying the window to x(n) forces the signal to zero outside the interval [0, N]. If this is a bad assumption, then the model will not be very good. The signal in the previous example is the unit sample response of an all-pole filter, but the solution is clearly biased. Without proving it, a nice property of this windowing is that the all-pole model will be stable.

The Autocorrelation Method

Again, these results can also be obtained as the least squares solution to an overdetermined set of linear equations using the pseudo-inverse. Specifically, in the autocorrelation method we would like to find a set of coefficients a_p(k) so that the error e(n) is equal to zero for n > 0.

The Autocorrelation Method

In matrix form this reads X a_p ≈ -x_1, and the solution can be found with the pseudo-inverse:

a_p = -(X^H X)^-1 X^H x_1.

The Autocorrelation Method

Clearly the autocorrelation method can also be used for pole-zero modeling: simply find the poles with the windowed data, and then use these poles and the windowed data to find the zeros. Since q is then no longer 0, the normal equations are not Toeplitz anymore, and hence one of the main advantages of the autocorrelation method, the fast Levinson-Durbin recursion, is lost.

The Autocorrelation Method

Besides the rectangular window, the autocorrelation method can also be used with other windows. One reason for this is that for n = 0, 1, ..., p-1 the prediction is based on fewer than p previous values. With, for example, a Hann window, the beginning and the end of the data are scaled down, and hence they are easier to predict from fewer values.

[Figure: a Hann window over the data record.]

The Covariance Method

The covariance method does not make any assumptions about the values outside the interval [0, N]. This usually results in better models, but both advantages of the autocorrelation method are lost: the fast recursion (Toeplitz structure) and the guaranteed stability. Since the error in Prony's method is defined as

e(n) = x(n) + Σ_{k=1}^{p} a_p(k) x(n-k),

it is clear that if x(n) is only known in the interval [0, N], the error can only be calculated in the interval [p, N].

The Covariance Method

This can be seen in the figure on the right. Hence it makes sense to define the sum of squared errors such that only errors e(n) which can actually be calculated are used:

E_C = Σ_{n=p}^{N} |e(n)|^2.

The Covariance Method

Since no data outside of the original interval is used, no assumptions about the signal outside of the original interval are necessary. Since the only difference to Prony's method is the definition of the sum of squared errors, the same normal equations still hold. The calculation of the autocorrelation sequence merely needs to reflect the fact that only data inside the original interval should be used:

r_x(k,l) = Σ_{n=p}^{N} x(n-l) x*(n-k).

The Covariance Method

As with the autocorrelation method, the covariance method may be formulated as a least squares approximation problem. If we set the error equal to zero for n in the interval [p, N], we have a set of N-p+1 linear equations, for which the least squares solution can again be found using the pseudo-inverse. Note that these equations are a subset of the equations used for the autocorrelation method.
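A minimal sketch of the covariance method as exactly this least squares problem (names ours; real-valued data assumed):

```python
import numpy as np

def covariance_method(x, p):
    """Covariance method: least squares over n = p..N only, so no
    assumption is made about x(n) outside the known record."""
    x = np.asarray(x, dtype=float)
    # rows n = p..N of:  x(n) + sum_k a(k) x(n-k) = 0
    Xp = np.array([[x[n - k] for k in range(1, p + 1)]
                   for n in range(p, len(x))])
    a, *_ = np.linalg.lstsq(Xp, -x[p:], rcond=None)
    return np.concatenate(([1.0], a))
```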


Covariance Method Example

Let us revisit the problem where x(n) in the interval [0, N] is given as before. The goal is to find a first-order all-pole model using the covariance method. In general, the covariance normal equations hold; in this example they reduce to a single equation with the covariance-style autocorrelation sums.

Covariance Method Example

Evaluating these sums for the given data and solving yields a(1); together with b(0), the model follows.

Comparison Example

The goal is to model x(n) as the unit sample response of a second-order all-pole filter; the first 20 values of x(n) are given. The autocorrelation method uses the windowed signal and then applies Prony's method. The normal equations are as before, with the windowed autocorrelation estimate r_x(k).

Comparison Example

Evaluating this sum at k = 0, 1 and 2, the normal equations become a 2x2 Toeplitz system. Solving for a(1) and a(2) gives the denominator polynomial; the modeling error follows, and b(0) is set to match the energy.

Comparison Example

Again the goal is to model x(n) as the unit sample response of a second-order all-pole filter, with the same first 20 values. The covariance method uses its different definition of the error and then applies Prony's method; the normal equations are as before, with the covariance autocorrelation sums.

Comparison Example

Evaluating this sum, the normal equations become singular! Hence a lower order is possible, i.e., a(2) = 0. Solving for a(1) gives the denominator polynomial, and setting b(0) = 1 the model matches the data perfectly for n = 0, 1, ..., N.

Stochastic ARMA Models

So far we have modeled deterministic signals as the unit sample response of an LTI system. In this section, the goal is to model stochastic signals as the output of a causal linear shift-invariant filter whose input is a unit-variance white noise sequence. An ARMA(p,q) process may be generated by filtering unit-variance white noise v(n) with a causal linear shift-invariant filter having p poles and q zeros.

Stochastic ARMA Models

The goal is now to find the missing coefficients by minimizing the mean square error; this, though, would result in a nonlinear problem.

Instead, note that the autocorrelation sequence of an ARMA(p,q) process satisfies the Yule-Walker equations (Eq. 3.115)

r_x(k) + Σ_{l=1}^{p} a_p(l) r_x(k-l) = c_q(k),

where c_q(k) is the convolution of b_q(k) and h*(-k), and r_x(k) is a statistical autocorrelation.

Stochastic ARMA Models

Since h(n) is causal, c_q(k) = 0 for k > q, and the Yule-Walker equations for k > q are a function only of the coefficients a_p(k). This can be expressed in matrix form for k = q+1, ..., q+p, which is a set of p linear equations in the p unknowns a_p(k).

These equations are called the modified Yule-Walker equations, and hence this approach is called the Modified Yule-Walker Equation (MYWE) method. If the autocorrelation is unknown, then an estimate is used.
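A minimal sketch of the MYWE method for a real-valued process, given the autocorrelation values r_x(0), ..., r_x(q+p); the function name and interface are ours.

```python
import numpy as np

def mywe(rx, p, q):
    """Modified Yule-Walker (real process): solve the p equations
    r_x(k) + sum_l a(l) r_x(k-l) = 0 for k = q+1..q+p,
    given rx = [r_x(0), ..., r_x(q+p)]."""
    rx = np.asarray(rx, dtype=float)
    R = np.array([[rx[abs(q + i - l)] for l in range(1, p + 1)]
                  for i in range(1, p + 1)])      # uses r_x(-k) = r_x(k)
    a = np.linalg.solve(R, -rx[q + 1:q + p + 1])
    return np.concatenate(([1.0], a))
```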

Stochastic ARMA Models

After the coefficients a_p(k) have been determined, the next step is to find the MA coefficients b_q(k). One approach is to filter x(n) with A_p(z), resulting in a process y(n) that is approximately MA(q). The moving average coefficients b_q(k) may then be estimated from y(n) using one of the moving average techniques presented later on.

Stochastic ARMA Models

A direct approach for finding the coefficients b_q(k) uses the Yule-Walker equations for k = 0, ..., q. Since c_q(k) = 0 for k > q, the sequence c_q(k) is then known for all k >= 0 (but not for k < 0). We denote the z-transform of the causal part of c_q(k) by [C_q(z)]_+; similarly, the anti-causal part is [C_q(z)]_-.

Stochastic ARMA Models

Recall that C_q(z) = B_q(z) H*(1/z*). Hence multiplying C_q(z) by A_p*(1/z*) results in the power spectrum of the MA(q) process:

P_y(z) = C_q(z) A_p*(1/z*) = B_q(z) B_q*(1/z*).

Since a_p(k) = 0 for k < 0, A_p*(1/z*) only contains positive powers of z. The causal part of P_y(z) can therefore be computed.

Stochastic ARMA Models

Although c_q(k) is unknown for k < 0, the causal part of P_y(z) may be computed from the causal part of c_q(k) and the coefficients a_p(k). Using the conjugate symmetry of P_y(z), we may then determine P_y(z) for all z. Finally, performing a spectral factorization of P_y(z) produces the polynomial B_q(z).

Stochastic ARMA Models Example

The goal is to find an ARMA(1,1) model for a real-valued random process with a given autocorrelation sequence. The modified Yule-Walker equations, which here reduce to a single equation, give a(1). This was the easy part; now let's find the MA coefficients.

Stochastic ARMA Models Example

In general, the causal part of C_q(z) follows from the Yule-Walker equations for k = 0, ..., q. In this example this yields [C_1(z)]_+; multiplying with A_1*(1/z*) results in the causal part of P_y(z).

Stochastic ARMA Models Example

Using the symmetry of P_y(z), the full power spectrum follows, which after spectral factorization results in B_1(z); this in turn leads to the desired ARMA(1,1) model.

Stochastic ARMA Models

Just like the Padé approximation, we have only used the autocorrelation sequence between q and q+p to estimate a_p(k). If the autocorrelation sequence is known for lags larger than q+p, then this knowledge can also be used, in an extension quite similar to Prony's method.

Stochastic ARMA Models

From the last L-q equations we get an overdetermined set of linear equations in the unknowns a_p(k); hence the least squares solution can be found using the pseudo-inverse.

Stochastic AR Models

This is clearly a special case of an ARMA(p, q=0) model. Hence its autocorrelation sequence must satisfy the Yule-Walker equations

r_x(k) + Σ_{l=1}^{p} a_p(l) r_x(k-l) = 0,  k > 0.

Writing these in matrix form for k > 0, using the conjugate symmetry of r_x(k), gives a Hermitian Toeplitz system R_x a_p = -r_x. Solving these p equations for the p unknowns a_p(k) is called the Yule-Walker method.
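A minimal sketch of the Yule-Walker method for a real-valued process (names ours); note that it is the statistical autocorrelation sequence, not the data, that is passed in:

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def yule_walker(rx, p):
    """Yule-Walker method (real process): solve R_x a = -r_x given
    rx = [r_x(0), ..., r_x(p)]."""
    rx = np.asarray(rx, dtype=float)
    a = solve_toeplitz(rx[:p], -rx[1:p + 1])
    return np.concatenate(([1.0], a))
```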

Stochastic AR Models

Note that these equations are equivalent to the normal equations for all-pole modeling using Prony's method. The only difference is in the definition of the autocorrelation sequence: a statistical definition, r_x(k) = E{x(n) x*(n-k)}, for the Yule-Walker method, and a deterministic definition, r_x(k) = Σ_n x(n) x*(n-k), for Prony's method. But what if the autocorrelation sequence for the Yule-Walker method has to be estimated from data? Then a sample estimate takes the place of the expectation, and the two methods essentially coincide.

Stochastic MA Models

This is clearly a special case of an ARMA(p=0, q) model. The Yule-Walker equations (which are nonlinear in the coefficients) relating the autocorrelation sequence to the filter coefficients b_q(k) are

r_x(k) = Σ_{l=0}^{q-|k|} b_q(l+|k|) b_q*(l),  |k| <= q.

Instead of solving these directly, one approach uses spectral factorization. Since the autocorrelation sequence of an MA(q) process is zero for |k| > q, the power spectrum has the form

P_x(z) = Σ_{k=-q}^{q} r_x(k) z^-k.

Stochastic MA Models

Using the spectral factorization given in Eq. 3.102,

P_x(z) = σ0^2 Q(z) Q*(1/z*),

where Q(z) is a minimum phase, monic (q(0) = 1) polynomial of degree q, i.e., all of its zeros α_k satisfy |α_k| <= 1.

Hence B_q(z) = σ0 Q(z), and Q(z) is the minimum phase version of B_q(z) that is formed by replacing each zero of B_q(z) that lies outside the unit circle with one that lies inside the unit circle at the conjugate reciprocal location.

Stochastic MA Models

In summary: given the autocorrelation sequence, we get the power spectrum, which is then factored into a minimum phase polynomial Q(z) and a maximum phase polynomial Q*(1/z*). Hence the process can now be modeled as the output of a minimum phase FIR filter driven by unit-variance white noise. Note that the model is not unique: any factor (1 - α_k z^-1) in Q(z) can be replaced by a factor with the zero at the conjugate reciprocal location 1/α_k*, without changing the power spectrum.
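A minimal NumPy sketch of this spectral factorization for a real-valued MA(q) process with no zeros on the unit circle; the root-flipping is done implicitly by keeping only the roots inside the unit circle (names ours):

```python
import numpy as np

def ma_model(rx):
    """MA(q) model via spectral factorization (real process, no zeros on
    the unit circle): the 2q roots of sum_{k=-q}^{q} r_x(k) z^-k come in
    conjugate reciprocal pairs; keep the ones inside the unit circle."""
    rx = np.asarray(rx, dtype=float)            # rx = [r_x(0), ..., r_x(q)]
    coeffs = np.concatenate((rx[::-1], rx[1:])) # r_x(-q), ..., r_x(q)
    roots = np.roots(coeffs)
    q_poly = np.real(np.poly(roots[np.abs(roots) < 1.0]))  # monic Q(z)
    sigma0_sq = rx[0] / np.sum(q_poly ** 2)     # scale so r_x(0) matches
    return np.sqrt(sigma0_sq) * q_poly          # B_q(z) = sigma0 * Q(z)
```

For example, ma_model([1.25, 0.5]) returns [1.0, 0.5], i.e., B(z) = 1 + 0.5 z^-1, whose autocorrelation is indeed (1.25, 0.5).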

Stochastic MA Models Example

For a given autocorrelation sequence, find an MA(1) model. The power spectrum is written down and the spectral factorization is performed following the general recipe. In particular, σ0^2 = 4 and α = -1/4; hence the model can be written either as a minimum phase FIR filter or as a maximum phase FIR filter.

Power Spectrum Estimation

Spectrum estimation is an important problem in stochastic modeling. Since the autocorrelation is usually unknown, P_x must be estimated from a sample realization. A direct approach is to estimate the autocorrelation sequence and then transform it. But with N+1 values of x(n) we can only estimate r_x(k) for -N <= k <= N.

Power Spectrum Estimation

This estimate is limited by two factors:
Since the autocorrelation was estimated, all errors in this estimation will directly show up in the estimate of P_x.
Since the estimate of r_x is of limited length 2N+1, the frequency resolution of P_x is limited too.

The estimate can be improved by including prior knowledge about the process, for example that x(n) is an AR(p) process. Then the Yule-Walker method with estimated autocorrelations can be used to estimate the missing parameters, as sketched below.
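A minimal sketch of such a model-based estimate, corresponding to the AR model-based estimate in the example that follows (real-valued data; the names and the biased autocorrelation estimator are our own choices):

```python
import numpy as np
from scipy.linalg import solve_toeplitz

def ar_spectrum(x, p, nfft=512):
    """AR(p) model-based spectrum estimate: Yule-Walker with the biased
    sample autocorrelation, then P_x(e^jw) = |b(0)|^2 / |A(e^jw)|^2."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    rx = np.correlate(x, x, mode='full')[N - 1:] / N    # biased estimate
    a = np.concatenate(([1.0], solve_toeplitz(rx[:p], -rx[1:p + 1])))
    b0_sq = rx[0] + a[1:] @ rx[1:p + 1]                 # error = |b(0)|^2
    w = np.linspace(0.0, np.pi, nfft)
    A = np.polyval(a, np.exp(1j * w))  # same magnitude as A(e^jw)
    return w, b0_sq / np.abs(A) ** 2
```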

Spectrum Estimation Example

Given 64 samples of an AR(4) process, generated by filtering unit-variance zero-mean white noise through an AR(4) model. The model coefficients are given; they correspond to a filter having two pairs of complex conjugate poles.

Spectrum Estimation Example

Estimating the autocorrelation sequence for |k| < N and substituting the resulting values into the direct formula, we obtain the power spectrum estimate called (a). On the other hand, using the estimates of the autocorrelation in the Yule-Walker equations, we get the AR(4) coefficients; using these coefficients in the model-based formula results in the power spectrum estimate (b).

Spectrum Estimation Example

[Figure: (a) the estimate obtained directly from the estimated autocorrelation sequence; (b) the AR(4) model-based estimate.]

Exercises

[Exercise and solution slides follow; their content was rendered as images.]
