
Nonlinear regression

Regression is fitting data by a given function (surrogate) with unknown coefficients by finding the coefficients that minimize the sum of the squares of the differences between the function and the data.
In linear regression, the assumed function is linear in the coefficients, for example, $\hat{y} = b_1 \sin x + b_2 \cos x$.
Regression is nonlinear when the function is nonlinear in the coefficients (not in x), e.g., $\hat{y} = b_1 \sin(b_2 x)$.
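One way to tell the two cases apart, offered here as a rule of thumb rather than something stated on the slide, is to differentiate the function with respect to each coefficient. If no derivative contains a coefficient, the function is linear in the coefficients.

```latex
\begin{align*}
\hat{y} = b_1 \sin x + b_2 \cos x
  &:\quad \frac{\partial \hat{y}}{\partial b_1} = \sin x,
  \quad \frac{\partial \hat{y}}{\partial b_2} = \cos x
  \quad \text{(no } b \text{ on the right: linear)} \\
\hat{y} = b_1 \sin(b_2 x)
  &:\quad \frac{\partial \hat{y}}{\partial b_1} = \sin(b_2 x),
  \quad \frac{\partial \hat{y}}{\partial b_2} = b_1 x \cos(b_2 x)
  \quad \text{(} b \text{ on the right: nonlinear)}
\end{align*}
```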

The most common use of nonlinear regression is for finding
physical constants given measurements.
Example: fitting crack propagation data with Paris law:
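Fit the unknown parameters, such as the initial crack size $a_0$ and the Paris-law exponent $m$, in

$$a = \left[ N C \left(1 - \frac{m}{2}\right) \left(\Delta\sigma\sqrt{\pi}\right)^m + a_0^{1-\frac{m}{2}} \right]^{\frac{2}{2-m}}$$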

Review of Linear Regression
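Surrogate is a linear combination of $n_b$ given shape functions $\xi_i(\mathbf{x})$:

$$\hat{y} = \sum_{i=1}^{n_b} b_i \xi_i(\mathbf{x})$$

For linear approximation: $\xi_1 = 1$, $\xi_2 = x$.
Difference (residual) between data and surrogate:

$$r_j = y_j - \sum_{i=1}^{n_b} b_i \xi_i(\mathbf{x}_j), \qquad \mathbf{r} = \mathbf{y} - X\mathbf{b}, \qquad X_{ji} = \xi_i(\mathbf{x}_j)$$

Minimize the square residual:

$$\mathbf{r}^T\mathbf{r} = (\mathbf{y} - X\mathbf{b})^T(\mathbf{y} - X\mathbf{b})$$

Differentiate with respect to $\mathbf{b}$ to obtain the normal equations:

$$X^T X \mathbf{b} = X^T \mathbf{y}$$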
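The normal equations can be tried out numerically. The following is a minimal sketch, assuming NumPy and a small made-up data set that lies exactly on y = 1 + 2x; the data and variable names are illustrative and not from the slides.

```python
import numpy as np

# Hypothetical data lying exactly on y = 1 + 2x, so the fit is easy to check.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix: column i holds shape function xi_i evaluated at each x_j.
# Here xi_1 = 1 and xi_2 = x.
X = np.column_stack([np.ones_like(x), x])

# Normal equations: X^T X b = X^T y.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Residual vector r = y - X b and the minimized sum of squares r^T r.
r = y - X @ b
sum_sq = r @ r

# np.linalg.lstsq solves the same least-squares problem without forming
# X^T X explicitly, which is usually better conditioned.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(b, sum_sq)
```

Both routes should return the same coefficients here; for ill-conditioned design matrices the `lstsq` route is generally the safer choice.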
Basic equations
General form $\hat{y} = \hat{y}(\mathbf{x}, \mathbf{b})$, e.g., $\hat{y} = b_1 x_1 + b_2 \sin(b_3 x_2)$
Residuals:

$$r_i = y_i - \hat{y}(\mathbf{x}_i, \mathbf{b})$$

Rms error:

$$e_{rms}^2 = \frac{1}{n_y} \sum_{i=1}^{n_y} r_i^2$$

Finding the coefficients requires the solution of an optimization problem.
However, minimizing the sum of squares is a specialized problem with specialized algorithms. Matlab's lsqnonlin is a good choice.
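For readers without Matlab, a comparable workflow can be sketched in Python. This example assumes SciPy's `scipy.optimize.least_squares` as a stand-in for lsqnonlin (a substitution, not the tool named on the slide) and fits the model $\hat{y} = b_1 \sin(b_2 x)$ to made-up, noise-free data. The true coefficients and the starting guess are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def model(b, x):
    """Surrogate y_hat = b1 * sin(b2 * x), which is nonlinear in b2."""
    return b[0] * np.sin(b[1] * x)

def residuals(b, x, y):
    """Residual vector r_i = y_i - y_hat(x_i, b)."""
    return y - model(b, x)

# Hypothetical noise-free data generated from b = (2.0, 1.5).
b_true = np.array([2.0, 1.5])
x = np.linspace(0.0, 3.0, 20)
y = model(b_true, x)

# Nonlinear least squares needs a starting guess; a poor guess can end in a
# different local minimum.
b0 = np.array([1.8, 1.4])
fit = least_squares(residuals, b0, args=(x, y))

b_fit = fit.x
e_rms = np.sqrt(np.mean(fit.fun ** 2))  # sqrt of (1/n_y) * sum of r_i^2
print(b_fit, e_rms)
```

The solver only needs the residual vector, not the sum of squares, which is what lets it exploit the least-squares structure.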

Example Linear vs. Nonlinear Regression
y(1) = 20, y(2) = 7, y(3) = 5, and y(4) = 4.
Data suggests a rational function $\hat{y} = b_1 + \frac{b_2}{b_3 - x}$
Compare to quadratic polynomial $\hat{y} = b_1 + b_2 x + b_3 x^2$
Both use three coefficients.
The fits are

$$\hat{y}_{rational} = 1.99 - \frac{6.99}{0.61 - x}, \qquad \hat{y}_{quadratic} = 36.5 - 20x + 3x^2$$

[Figure: data, rational-function fit, and polynomial fit for x from 1 to 4, y from 2 to 20.]
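The comparison can be reproduced with a short script. This is a sketch under two assumptions: the rational function is parameterized as $\hat{y} = b_1 + b_2/(b_3 - x)$, and SciPy's `least_squares` is used for the nonlinear fit with a starting guess chosen by hand.

```python
import numpy as np
from scipy.optimize import least_squares

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])

def rational(b, x):
    # Assumed parameterization: y_hat = b1 + b2 / (b3 - x).
    return b[0] + b[1] / (b[2] - x)

# Nonlinear fit of the rational function (the starting guess is an assumption).
fit = least_squares(lambda b: y - rational(b, x), x0=[2.0, -7.0, 0.5])
b_rat = fit.x

# The quadratic is linear in its coefficients, so ordinary least squares works.
# np.polyfit returns the highest power first: [b3, b2, b1].
b_quad = np.polyfit(x, y, deg=2)

def rms(r):
    """Root-mean-square of a residual vector."""
    return np.sqrt(np.mean(r ** 2))

rms_rat = rms(y - rational(b_rat, x))
rms_quad = rms(y - np.polyval(b_quad, x))
print(b_rat, b_quad, rms_rat, rms_quad)
```

With the same number of coefficients, the rational function should leave a much smaller rms error than the quadratic on these four points.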
Estimating uncertainty in coefficients
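Brute force approach: generate noise in the data, refit, and repeat multiple times.
Alternatively, linearize about the optimum set of coefficients $\mathbf{b}^*$:

$$r_i = y_i - \hat{y}(\mathbf{x}_i, \mathbf{b}^*) \approx \sum_{j=1}^{n_b} \left.\frac{\partial \hat{y}}{\partial b_j}\right|_{\mathbf{x}_i, \mathbf{b}^*} \Delta b_j$$

Now perform linear regression with $\Delta\mathbf{b}$ as the unknown coefficients.
Provides improvement to solution.
Provides estimate of uncertainty in $\Delta\mathbf{b}$, which is an estimate for the uncertainty in $\mathbf{b}$.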
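The brute force approach could be sketched as follows. Several choices here are assumptions made for illustration: the rational model $\hat{y} = b_1 + b_2/(b_3 - x)$ fitted to the four data points y(1..4) = 20, 7, 5, 4, a noise standard deviation of 0.1, 200 repetitions, noise added to the fitted curve rather than to the raw data, and an upper bound on $b_3$ that keeps the pole away from the data.

```python
import numpy as np
from scipy.optimize import least_squares

def rational(b, x):
    # Assumed parameterization: y_hat = b1 + b2 / (b3 - x).
    return b[0] + b[1] / (b[2] - x)

def fit_rational(x, y, b0):
    # The upper bound on b3 keeps the pole to the left of the data. It is a
    # safeguard chosen for this sketch, not part of the original method.
    bounds = ([-np.inf, -np.inf, -np.inf], [np.inf, np.inf, 0.99])
    result = least_squares(lambda b: y - rational(b, x), b0, bounds=bounds)
    return result.x

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])
b_star = fit_rational(x, y, b0=[2.0, -7.0, 0.5])

rng = np.random.default_rng(0)
noise_sd = 0.1   # assumed noise level
n_repeat = 200   # assumed number of repetitions

samples = np.empty((n_repeat, 3))
for k in range(n_repeat):
    # New noise realization on top of the fitted curve, then refit.
    y_noisy = rational(b_star, x) + rng.normal(0.0, noise_sd, size=x.size)
    samples[k] = fit_rational(x, y_noisy, b0=b_star)

# Sample standard deviation of each coefficient over the repetitions.
b_spread = samples.std(axis=0, ddof=1)
print(b_star, b_spread)
```

The cost is one nonlinear fit per repetition, which is why the linearized estimate is attractive when each fit is expensive.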
Model based error for linear regression
The common assumptions for linear regression:
Surrogate is in functional form of true function.
The data is contaminated with normally distributed error with the same standard deviation at every point.
The errors at different points are not correlated.
Under these assumptions, the noise standard deviation (called standard error) is estimated as

$$\hat{\sigma}^2 = \frac{\mathbf{r}^T\mathbf{r}}{n_y - n_b}$$

Similarly, the standard error in the coefficients is

$$se(b_i) = \hat{\sigma}\sqrt{\left[(X^T X)^{-1}\right]_{ii}}$$
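These two formulas translate directly into code. The sketch below assumes NumPy and applies them to a quadratic fit (shape functions 1, x, x²) of the four data points y(1..4) = 20, 7, 5, 4; the helper name `linear_regression_with_se` is made up for this example.

```python
import numpy as np

def linear_regression_with_se(X, y):
    """Return coefficients, noise estimate sigma_hat, and standard errors."""
    n_y, n_b = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                        # solves X^T X b = X^T y
    r = y - X @ b                                # residuals
    sigma_hat = np.sqrt(r @ r / (n_y - n_b))     # noise standard error
    se = sigma_hat * np.sqrt(np.diag(XtX_inv))   # coefficient standard errors
    return b, sigma_hat, se

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])

# Quadratic polynomial: shape functions 1, x, x^2.
X = np.column_stack([np.ones_like(x), x, x ** 2])
b, sigma_hat, se = linear_regression_with_se(X, y)
print(b, sigma_hat, se)
```

With four points and three coefficients there is only one degree of freedom, so the noise estimate rests on very little information.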
Rational function example
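Linearize with respect to the bs:

$$\hat{y} = b_1 + \frac{b_2}{b_3 - x}, \qquad r \approx \Delta b_1 + \frac{\Delta b_2}{b_3^* - x} - \frac{b_2^* \, \Delta b_3}{(b_3^* - x)^2}$$

Perform fit by linear regression:

$$X^T X \, \Delta\mathbf{b} = X^T \mathbf{r}$$

$\Delta\mathbf{b}$ = 1.0e-007 * [0.1435 0.3685 0.1230]
Finally perform error analysis:

$$\hat{\sigma} = \sqrt{\frac{\mathbf{r}^T\mathbf{r}}{n_y - n_b}} = 0.102, \qquad se(b_i) = \hat{\sigma}\sqrt{\left[(X^T X)^{-1}\right]_{ii}} = [0.20,\ 0.50,\ 0.024]$$

Standard errors range between 4% and 10% of the bs (1.99, -6.99, 0.612).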
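The linearized error analysis can be checked numerically. This sketch assumes the parameterization $\hat{y} = b_1 + b_2/(b_3 - x)$, the data y(1..4) = 20, 7, 5, 4, and the rounded optimum (1.99, -6.99, 0.612). Because the optimum is rounded, the computed correction is small but will generally be larger than the 1e-8 scale quoted above.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([20.0, 7.0, 5.0, 4.0])
b_star = np.array([1.99, -6.99, 0.612])  # rounded optimum quoted in the text

d = b_star[2] - x                        # denominator b3 - x
r = y - (b_star[0] + b_star[1] / d)      # residuals at b*

# Jacobian of y_hat with respect to (b1, b2, b3), evaluated at b*:
#   d/db1 = 1,  d/db2 = 1/(b3 - x),  d/db3 = -b2/(b3 - x)^2
X = np.column_stack([np.ones_like(x), 1.0 / d, -b_star[1] / d ** 2])

XtX_inv = np.linalg.inv(X.T @ X)
delta_b = XtX_inv @ X.T @ r              # linear-regression correction to b*

n_y, n_b = X.shape
sigma_hat = np.sqrt(r @ r / (n_y - n_b))     # noise standard error
se = sigma_hat * np.sqrt(np.diag(XtX_inv))   # coefficient standard errors
print(delta_b, sigma_hat, se)
```

The Jacobian plays the role of the design matrix X from linear regression, so the same standard-error formulas apply unchanged.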
Application to crack propagation
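Paris law and its solution:

$$\frac{da}{dN} = C(\Delta K)^m, \qquad \Delta K = \Delta\sigma\sqrt{\pi a}, \qquad a = \left[ N C \left(1 - \frac{m}{2}\right) \left(\Delta\sigma\sqrt{\pi}\right)^m + a_0^{1-\frac{m}{2}} \right]^{\frac{2}{2-m}}$$

Coppe, A., Haftka, R.T., and Kim, N.H. (2011) "Uncertainty Identification of Damage Growth Parameters Using Nonlinear Regression," AIAA Journal, Vol. 49(12), 2818-2821.
Properties to be identified from measurements (with bias $b$ and noise $v$ in the measured crack size):

$$\mathbf{b} = \{m, a_0, b\}^T, \qquad a_i^{meas} = a_i + b + v, \qquad r_i = a_i^{meas} - a^{calc}(N_i, \mathbf{b})$$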
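The crack-size model and its residuals might be coded as below. This is a sketch under stated assumptions: the values of C and the stress range are hypothetical illustrative constants (the slides do not give them), crack size is in meters, the bias is taken to be part of the calculated crack size, and the noise is drawn uniformly from [-1, 1] mm.

```python
import numpy as np

# Hypothetical illustrative constants (not given in the text).
# Assumed units: crack size in m, stress range in MPa.
C = 1.5e-10
DELTA_SIGMA = 78.6

def crack_size(N, m, a0):
    """Closed-form Paris-law solution a(N), valid for m != 2."""
    growth = N * C * (1.0 - m / 2.0) * (DELTA_SIGMA * np.sqrt(np.pi)) ** m
    return (growth + a0 ** (1.0 - m / 2.0)) ** (2.0 / (2.0 - m))

def residuals(params, N, a_meas):
    """r_i = a_meas_i - a_calc(N_i, b), with b = (m, a0, bias).

    Assumption: a_calc is the crack-size model plus the bias.
    """
    m, a0, bias = params
    return a_meas - (crack_size(N, m, a0) + bias)

# Synthetic inspections every 100 cycles, with m = 3.8 and a0 = 10 mm.
N = np.arange(0.0, 2501.0, 100.0)
a_true = crack_size(N, m=3.8, a0=0.01)

# Simulated measurements: zero bias, noise uniform in [-1, 1] mm.
rng = np.random.default_rng(0)
v = rng.uniform(-1e-3, 1e-3, size=N.size)
a_meas = a_true + 0.0 + v

# At the true parameters the residuals reduce to the noise itself.
r = residuals((3.8, 0.01, 0.0), N, a_meas)
print(a_true[-1], np.abs(r).max())
```

A residual function of this shape can be passed to a nonlinear least-squares solver. Note that the bracketed term must stay positive, which limits the range of m and N over which the model can be evaluated.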
Example with only m unknown
Simulation with b = 0, v = [-1, 1] mm, m = 3.8.
Excellent agreement between Monte Carlo (1,000
repetitions) simulation and linearization.
[Figure: uncertainty in m (log scale, 10^-4 to 10^-1) versus number of cycles at inspection (0 to 2500); curves: standard error and standard deviation.]
All three unknown
Difficult to differentiate between initial crack
size and bias
[Figure: uncertainty in a_0 (log scale, 10^-5 to 10^0) versus number of cycles at inspection (0 to 2500); curves: standard error and standard deviation.]
[Figure: uncertainty in m (log scale, 10^-3 to 10^2) versus number of cycles at inspection (0 to 2500); curves: standard error and standard deviation.]
[Figure: uncertainty in b (log scale, 10^-5 to 10^0) versus number of cycles at inspection (0 to 2500); curves: standard error and standard deviation.]

$$a_i^{meas} = a_i + b + v$$
Problems
Using the data for the rational function, repeat the fit and the uncertainty calculation for an exponential decay model $\hat{y} = b_1 + b_2 e^{-b_3 x}$.
Instead of using the data in Slide 4, generate your own data for 31 uniformly distributed points (1, 1.3, …) from the identified algebraic model $\mathbf{b}$ = (1.99, -6.99, 0.612) and contaminate the data with normally distributed random noise with zero mean and standard deviation of 1. Compare the standard error from linear regression with the value you get by repeating the process multiple times using different realizations of the noise.