Luis Gutiérrez
Departamento de Estadística
Facultad de Matemáticas
llgutier@mat.uc.cl
Y = f(X) + ε
with the response vector y and the error vector ε,

y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \qquad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix},

and the design matrix X,

X = \begin{pmatrix} 1 & x_{11} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{pmatrix}
Definition (1)
The model
y = Xβ + ε
is called the classical linear regression model if the following assumptions hold:
1. E(ε) = 0
2. Cov(ε) = E(εεᵀ) = σ²I
3. The design matrix X has full rank, that is, rank(X) = k + 1 = p
4. ε ∼ N(0, σ²I)
For stochastic covariates these assumptions are understood conditionally on X.
Properties
1. E(y) = Xβ
2. Cov(y) = σ²I
3. y | X ∼ N_n(Xβ, σ²I)
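These three properties can be checked by simulation. A minimal sketch, assuming NumPy; the sample size, coefficients, and error variance below are purely illustrative, not from the slides:

```python
import numpy as np

# Minimal sketch (illustrative values): draw many replications of
# y = X beta + eps under the classical assumptions and check empirically
# that E(y) = X beta and Cov(y) = sigma^2 I.
rng = np.random.default_rng(0)
n, sigma = 5, 2.0
X = np.column_stack([np.ones(n), rng.uniform(-3, 3, n)])
beta = np.array([-1.0, 2.0])

reps = 200_000
Y = X @ beta + sigma * rng.standard_normal((reps, n))  # reps draws of y

mean_y = Y.mean(axis=0)          # should approach X beta
cov_y = np.cov(Y, rowvar=False)  # should approach sigma^2 I
print(np.max(np.abs(mean_y - X @ beta)))
print(np.max(np.abs(cov_y - sigma**2 * np.eye(n))))
```

Both printed deviations shrink as `reps` grows, which is exactly what Properties 1 and 2 assert.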
Fig. 3.1 Illustration of homo- and heteroscedastic variances. The graphs on the left show simulated data together with the true regression line; the graphs on the right display the corresponding errors. The data are based on the models y_i ∼ N(−1 + 2x_i, 1) [panels (a), homoscedastic variance, and (b), homoscedastic variance, errors] and y_i ∼ N(−1 + 2x_i, (0.1 + 0.3(x_i + 3))²) [panels (c), funnel-shaped heteroscedastic variance, and (d), funnel-shaped heteroscedastic variance, errors].
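The simulation behind the figure is easy to reproduce. A sketch, assuming NumPy; the grid of design points and the seed are illustrative:

```python
import numpy as np

# Sketch of the Fig. 3.1 simulation (values from the caption):
#   homoscedastic:   y_i ~ N(-1 + 2 x_i, 1)
#   heteroscedastic: y_i ~ N(-1 + 2 x_i, (0.1 + 0.3 (x_i + 3))^2)
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 200)
mu = -1 + 2 * x

y_homo = mu + rng.standard_normal(x.size)      # constant error sd = 1
sd_het = 0.1 + 0.3 * (x + 3)                   # funnel-shaped sd
y_het = mu + sd_het * rng.standard_normal(x.size)

e_homo, e_het = y_homo - mu, y_het - mu        # the errors (right-hand panels)
print(e_homo.std())
print(e_het[:50].std(), e_het[-50:].std())     # spread grows with x
```

Plotting the errors against x reproduces the constant band versus the funnel shape shown in the figure.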
Fig. 3.2 Munich rent index: illustration of heteroscedastic variances. The left panel shows a scatter plot of net rent (in Euro) versus area together with the estimated regression line. The right panel displays the corresponding residuals versus area.
Fig. 3.3 Illustration of autocorrelated errors. Panels (a) and (b) show errors with positive autocorrelation; panels (c) and (d) correspond to negative autocorrelation. The graphs on the left show the (simulated) observations including the (true) regression line; the graphs on the right display the corresponding errors as time series. In the case of negative autocorrelation, observations are connected in order to emphasize the changing algebraic sign. The data with positive correlation are simulated according to the model y_i = −1 + 2x_i + ε_i, where ε_i = 0.9 ε_{i−1} + u_i and u_i ∼ N(0, 0.5²). The data with negative correlation in the errors are simulated according to y_i = −1 + 2x_i + ε_i, where ε_i = −0.9 ε_{i−1} + u_i and u_i ∼ N(0, 0.5²).
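The AR(1) error processes from the caption can be simulated directly. A minimal sketch, assuming NumPy; the series length and seed are illustrative:

```python
import numpy as np

# Sketch of the AR(1) error processes from Fig. 3.3:
# eps_i = rho * eps_{i-1} + u_i with u_i ~ N(0, 0.5^2).
def simulate_ar1(rho, n=2000, sd_u=0.5, seed=2):
    rng = np.random.default_rng(seed)
    eps = np.zeros(n)
    for i in range(1, n):
        eps[i] = rho * eps[i - 1] + sd_u * rng.standard_normal()
    return eps

eps_pos = simulate_ar1(0.9)    # positive autocorrelation: smooth runs
eps_neg = simulate_ar1(-0.9)   # negative: frequent sign changes

def lag1_corr(e):
    return np.corrcoef(e[:-1], e[1:])[0, 1]

print(lag1_corr(eps_pos))  # close to +0.9
print(lag1_corr(eps_neg))  # close to -0.9
```

The positive-ρ series wanders in long runs, while the negative-ρ series flips sign between neighbours, matching the two columns of the figure.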
Fig. 3.4 Illustration of correlated residuals when the model is misspecified. Panel (a) displays (simulated) data based on the function E(y_i | x_i) = sin(x_i) + x_i with ε_i ∼ N(0, 0.3²). Panel (b) shows the estimated regression line, i.e., the nonlinear relationship is ignored. The corresponding residuals can be found in panel (c).
Fig. 3.5 Illustration of autocorrelated errors when relevant covariates showing a temporal trend are ignored. Panels (a) and (b) show the covariates x_1 and x_2 over time. Panels (c)–(e) display the residuals over time for the regression models y_i = β_0 + β_1 x_{i1} + β_2 x_{i2} + ε_i (correct model), y_i = β_0 + β_1 x_{i1} + ε_i (x_2 ignored), and y_i = β_0 + β_1 x_{i2} + ε_i (x_1 ignored).
Additive errors:
There are situations where the errors can instead be assumed to be multiplicative.
Fig. 3.6 Example of a multiplicative model. Panels (a) and (b) show scatter plots of simulated data y versus x_1 and x_2, respectively, based on the model y_i = exp(1 + x_{i1} − x_{i2} + ε_i) with ε_i ∼ N(0, 0.4²), i.e., with multiplicative errors ε̃_i = exp(ε_i). Panels (c) and (d) display scatter plots of log(y) versus x_1 and x_2, respectively.

Models with a multiplicative error structure are more plausible for exponential relationships, since the errors are proportional to the level of the response.

Luis Gutiérrez — Métodos Estadísticos Avanzados I
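The log transform that linearizes this multiplicative model can be illustrated in a few lines. A sketch, assuming NumPy; the sample size, covariate ranges, and seed are illustrative:

```python
import numpy as np

# Sketch of the Fig. 3.6 idea: y_i = exp(1 + x_i1 - x_i2 + eps_i), so log(y)
# follows a linear model with additive Gaussian errors.
rng = np.random.default_rng(3)
n = 500
x1, x2 = rng.uniform(0, 3, n), rng.uniform(0, 3, n)
eps = 0.4 * rng.standard_normal(n)
y = np.exp(1 + x1 - x2 + eps)       # multiplicative errors exp(eps)

# After the log transform, ordinary least squares recovers the coefficients.
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
print(beta_hat)  # approximately [1, 1, -1]
```

On the original scale the relationship is exponential and the scatter fans out; on the log scale it is linear with constant spread, as in panels (c) and (d).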
The classical linear regression model
Fig. 3.7 Supermarket scanner data: scatter plots of the sales of a particular brand against its own price [panel (a)] and against the price of a competing brand [panel (b)].
Hence, we can treat an exponential model within the scope of linear models by taking the logarithm of the response variable. Panels (c) and (d) in Fig. 3.6 show scatter plots between the logarithmic response log(y) and the covariates x_1 and x_2 for the simulated model (3.3), which provide clear evidence of linear relationships.
Fig. 3.8 Munich rent index: illustration of modeling nonlinear relationships via variable transformation. The left column shows the estimated regression line including the observations [panel (a)], the corresponding residuals [panel (c)], and the average residuals for every distinct covariate value [panel (e)]. The right column displays the estimated nonlinear relationship rentsqm_i = 4.73 + 140.18 · 1/area_i [panel (b)] and the corresponding residual plots [panels (d) and (f)].
y_i = β_0 + β_1 x_i + ε_i

where y_i := rentsqm_i and x_i := area_i. In panels (b), (d), and (f) the model was

y_i = β_0 + β_1 f(x_i) + ε_i,

where f(x_i) = 1/x_i; thus, the design matrix for this second model is

X = \begin{pmatrix} 1 & 1/30 \\ 1 & 1/37 \\ \vdots & \vdots \\ 1 & 1/73 \\ 1 & 1/73 \end{pmatrix}
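Building such a transformed design matrix is a one-liner. A sketch, assuming NumPy; the vector of areas is illustrative (the slide only shows the first two and last two rows, so the middle value here is a made-up placeholder):

```python
import numpy as np

# Sketch: design matrix for y_i = beta0 + beta1 * (1/x_i) + eps_i with
# f(x) = 1/x. The area vector is illustrative; the middle entry (52) is a
# made-up placeholder, since the slide elides the intermediate rows.
area = np.array([30.0, 37.0, 52.0, 73.0, 73.0])
X = np.column_stack([np.ones_like(area), 1.0 / area])
print(X.shape)  # (5, 2): intercept column plus transformed covariate
print(X[0])
```

Any other transformation f (log, square root, polynomial terms) slots into the same column-stacking pattern.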
y_i = β_0 + β_1 x_i + β_2 x_i² + ε_i

y_i = β_0 + β_1 x_i + β_2 x_i² + β_3 x_i³ + ε_i

where y_i := rentsqm_i and x_i := area_i.
Fig. 3.9 Munich rent index: illustration of modeling nonlinear relationships using polynomials. The upper panels show the fitted quadratic and cubic polynomials including the observations. The lower panels display the corresponding residuals.
The results are obtained with Stata. Note that with other statistics packages we sometimes obtained slightly different results due to rounding errors.
d_{i,1} = \begin{cases} 1 & x_i = 1, \\ 0 & \text{otherwise,} \end{cases} \qquad \cdots \qquad d_{i,c-1} = \begin{cases} 1 & x_i = c - 1, \\ 0 & \text{otherwise.} \end{cases}
y = β_0 + β_1 z_1 + β_2 z_2 + β_3 x + β_4 x z_1 + β_5 x z_2 + ε
Note that
LS(β) = εᵀε = yᵀy − 2yᵀXβ + βᵀXᵀXβ   (1)
∂LS(β)/∂β = −2Xᵀy + 2XᵀXβ

then, solving the equation ∂LS(β)/∂β |_{β=β̂} = 0 yields

β̂ = (XᵀX)⁻¹Xᵀy.
β̂ = (XᵀX)⁻¹Xᵀy
∂l(β, σ²)/∂σ² = −n/(2σ²) + (1/(2σ⁴)) (y − Xβ)ᵀ(y − Xβ)

σ̂²_ML = (y − Xβ̂)ᵀ(y − Xβ̂) / n

Find E(σ̂²_ML) and propose an unbiased estimator of σ².
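A quick numerical sketch of the two variance estimators, assuming NumPy and simulated data with illustrative values: it computes the closed-form β̂ and compares σ̂²_ML = RSS/n with the unbiased version RSS/(n − p).

```python
import numpy as np

# Sketch: closed-form least squares estimate and the two variance estimators,
# sigma^2_ML = RSS/n (biased) versus RSS/(n - p) (unbiased). Illustrative data.
rng = np.random.default_rng(4)
n, p, sigma = 200, 3, 1.5
X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
beta = np.array([-1.0, 2.0, 0.5])
y = X @ beta + sigma * rng.standard_normal(n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # (X'X)^{-1} X'y
resid = y - X @ beta_hat
rss = resid @ resid
sigma2_ml = rss / n            # E(sigma2_ml) = (n - p)/n * sigma^2 (biased)
sigma2_unbiased = rss / (n - p)
print(beta_hat)
print(sigma2_ml, sigma2_unbiased)  # the unbiased one is slightly larger
```

The ratio of the two estimators is exactly n/(n − p), which is the bias factor the exercise above asks you to derive.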
We can generalize these observations to arbitrary linear models: the method of least squares yields parameter estimates β̂ such that the residuals ε̂ and the predicted values ŷ are orthogonal to each other. This can be easily proved using properties of the hat matrix H = X(XᵀX)⁻¹Xᵀ, which is symmetric and idempotent.
0 ≤ R² ≤ 1

When R² is close to 1, the residual sum of squares Σ_{i=1}^n ε̂_i² is small, indicating a good fit. When R² is close to 0, the residual sum of squares is relatively large, indicating a poor fit.
Var(β̂_j) = σ² / [(1 − R_j²) Σ_{i=1}^n (x_{ij} − x̄_j)²]

β̂ ∼ N(β, σ²(XᵀX)⁻¹)

(β̂ − β)ᵀ(XᵀX)(β̂ − β) / σ² ∼ χ²_p
y_i = β_1 x_i + ε_i,   i = 1, …, n

where x_i = i, i = 1, …, n. In this case we have:

(1/n) XₙᵀXₙ = (1/n) Σ_{i=1}^n x_i² = (1/n)(1 + … + i² + … + n²) → ∞
Var(ε̂_i) = σ²(1 − h_ii)
1. ε̂ ∼ N(0, σ²(I − H))
2. ε̂ᵀε̂ / σ² = (n − p) σ̂²/σ² ∼ χ²_{n−p}
Studentized residuals

r_i* = (y_i − x_iᵀβ̂_{(i)}) / [σ̂_{(i)} (1 + x_iᵀ(X_{(i)}ᵀX_{(i)})⁻¹x_i)^{1/2}] ∼ t_{n−p−1}

Hypothesis tests

H_0: Cβ = d vs. H_1: Cβ ≠ d,

where d is a vector of dimension r.
Hypothesis tests

Example 1:
Let C = (0 1 0), β = (β_0, β_1, β_2)ᵀ, and d = 0; then the hypothesis H_0: Cβ = d becomes

H_0: (0 1 0)(β_0, β_1, β_2)ᵀ = 0 ⟺ H_0: β_1 = 0
Hypothesis tests

Example 2:
Let C = I_3 (the 3 × 3 identity matrix), β = (β_0, β_1, β_2)ᵀ, and d = (0, 0, 0)ᵀ; then the hypothesis H_0: Cβ = d becomes

H_0: I_3 (β_0, β_1, β_2)ᵀ = (0, 0, 0)ᵀ ⟺ H_0: (β_0, β_1, β_2)ᵀ = (0, 0, 0)ᵀ
Hypothesis tests

Example 3:
Let C = (0 1 −1), β = (β_0, β_1, β_2)ᵀ, and d = 0; then the hypothesis H_0: Cβ = d becomes

H_0: (0 1 −1)(β_0, β_1, β_2)ᵀ = 0 ⟺ H_0: β_1 − β_2 = 0 ⟺ H_0: β_1 = β_2
Hypothesis tests

H_0: Cβ = d vs. H_1: Cβ ≠ d

For a single coefficient:

H_0: β_j = 0 vs. H_1: β_j ≠ 0

t_j = β̂_j / [V̂ar(β̂_j)]^{1/2} ∼ t_{n−p}

H_0 is rejected if |t_j| > t_{n−p}(1 − α/2).
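A numerical sketch of this t test, assuming NumPy and simulated data (all values illustrative). The statistic would be compared with the t_{n−p}(1 − α/2) quantile, which is not computed here:

```python
import numpy as np

# Sketch: t statistic t_j = beta_hat_j / se(beta_hat_j) for H0: beta_j = 0.
# Illustrative simulated data; beta_2 = 0, so H0 is true for j = 2.
rng = np.random.default_rng(5)
n = 100
X = np.column_stack([np.ones(n), rng.standard_normal(n), rng.standard_normal(n)])
beta = np.array([-1.0, 2.0, 0.0])
y = X @ beta + rng.standard_normal(n)

p = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)
se = np.sqrt(sigma2_hat * np.diag(XtX_inv))  # estimated standard errors
t = beta_hat / se
print(t)  # |t_1| is large (beta_1 = 2 != 0); |t_2| is typically small
```

This is exactly the t-value column that regression software such as Stata reports for each coefficient.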
Hypothesis tests

For a subvector β_1 of β (of dimension r):

H_0: β_1 = 0 vs. H_1: β_1 ≠ 0

F = (1/r) β̂_1ᵀ [Ĉov(β̂_1)]⁻¹ β̂_1 ∼ F_{r,n−p}

H_0 is rejected if F > F_{r,n−p}(1 − α).
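The same machinery covers the general linear hypothesis H_0: Cβ = d. A sketch, assuming NumPy; the data are simulated (illustratively) so that the hypothesis of Example 3, β_1 = β_2, is true:

```python
import numpy as np

# Sketch: F statistic for the general linear hypothesis H0: C beta = d,
# F = (C b - d)' [C Covhat(b) C']^{-1} (C b - d) / r. Illustrative data,
# simulated so that H0: beta_1 = beta_2 (Example 3) holds.
rng = np.random.default_rng(6)
n = 100
X = np.column_stack([np.ones(n), rng.standard_normal(n), rng.standard_normal(n)])
beta = np.array([-1.0, 1.0, 1.0])   # beta_1 = beta_2, so H0 is true
y = X @ beta + rng.standard_normal(n)

p = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)

C = np.array([[0.0, 1.0, -1.0]])    # H0: beta_1 - beta_2 = 0
d = np.array([0.0])
r = C.shape[0]
diff = C @ beta_hat - d
F = (diff @ np.linalg.solve(C @ (sigma2_hat * XtX_inv) @ C.T, diff)) / r
print(F)  # compare with the F_{r, n-p}(1 - alpha) quantile
```

With r = 1 this F statistic equals the square of the corresponding t statistic, which connects the two tests above.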
Hypothesis tests

For the overall test H_0: β_1 = … = β_k = 0,

F = ((n − p)/k) · R²/(1 − R²) ∼ F_{k,n−p}

H_0 is rejected if F > F_{k,n−p}(1 − α).
Confidence interval for x_0ᵀβ and prediction interval for y_0:

x_0ᵀβ̂ ± t_{n−p}(1 − α/2) σ̂ (x_0ᵀ(XᵀX)⁻¹x_0)^{1/2}

x_0ᵀβ̂ ± t_{n−p}(1 − α/2) σ̂ (1 + x_0ᵀ(XᵀX)⁻¹x_0)^{1/2}
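A sketch of both intervals at a new point x_0, assuming NumPy and simulated data; for simplicity the exact t_{n−p} quantile is replaced by the large-sample value 1.96 (an assumption, not from the slides):

```python
import numpy as np

# Sketch: 95 % confidence interval for E(y0) and prediction interval for y0
# at a new point x0. Illustrative data; the exact t_{n-p}(0.975) quantile is
# approximated by the large-sample value 1.96 (an assumption).
rng = np.random.default_rng(7)
n = 200
x = rng.uniform(20, 160, n)
X = np.column_stack([np.ones(n), x])
y = X @ np.array([5.0, 0.05]) + rng.standard_normal(n)

p = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
sigma_hat = np.sqrt(resid @ resid / (n - p))

x0 = np.array([1.0, 80.0])          # new covariate vector (with intercept)
fit = x0 @ beta_hat
q = 1.96
se_mean = sigma_hat * np.sqrt(x0 @ XtX_inv @ x0)       # for E(y0)
se_pred = sigma_hat * np.sqrt(1 + x0 @ XtX_inv @ x0)   # for y0 itself
print((fit - q * se_mean, fit + q * se_mean))
print((fit - q * se_pred, fit + q * se_pred))  # prediction interval is wider
```

The extra "1 +" in the prediction-interval formula accounts for the variance of the new error ε_0, which is why the dashed bands in Fig. 3.16 sit outside the solid ones.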
Fig. 3.16 Munich rent index: estimated rent per square meter depending on the living area, including 95 % confidence intervals (solid lines) and 95 % prediction intervals (dashed lines). The values of the remaining covariates have been set to yearc = 1918, nkitchen = 0, gkitchen = 0, and year01 = 0. Additionally included are the observations available for this covariate pattern.
Variable selection and model choice
[Figure: scatter plots of the response y against the covariates x1, x2, and x3.]
Table 3.4 Results for the correctly specified model based on covariates x1 and x3

Variable    Coefficient   Std. error   t-value   p-value   95 % Confidence interval
intercept   −0.967        0.039        −24.91    <0.001    [−1.042, −0.889]
x1           0.173        0.055          3.17     0.002    [0.065, 0.281]
x3           0.226        0.052          4.33    <0.001    [0.123, 0.330]
The covariates x1 and x3 are independent and uniformly distributed on [0, 1]. The variable x2 is defined as x2 = x1 + u, where u is also uniformly distributed on [0, 1]; thus, the variables x1 and x2 are highly correlated. Finally, the response variable y is simulated according to the model

y | x1, x2, x3 ∼ N(−1 + 0.3 x1 + 0.2 x3, 0.2²)

When the model is correctly specified, not only is the variable x3 significant but so is the variable x1, which had been non-significant in the full model.
y_i = β_0 + β_1 x_i + β_2 x_i² + … + β_l x_i^l + ε_i

Consider the following measure of the quality of the fit:

MSE(l) = (1/n) Σ_{i=1}^n (y_i − ŷ_i(l))²
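The training/validation comparison behind Fig. 3.17 can be sketched as follows, assuming NumPy; the seeds and the data-generating details are illustrative reconstructions of the model in the caption:

```python
import numpy as np

# Sketch of the Fig. 3.17 experiment: polynomials of increasing degree are fit
# on a training set and MSE(l) is evaluated on training and validation data.
def make_data(seed):
    rng = np.random.default_rng(seed)
    x = np.sort(rng.uniform(0, 1, 50))
    y = -1 + 0.3 * x + 0.4 * x**2 - 0.8 * x**3 + 0.07 * rng.standard_normal(50)
    return x, y

x_tr, y_tr = make_data(8)   # training data
x_va, y_va = make_data(9)   # validation data

mse_tr, mse_va = [], []
for l in range(1, 10):
    coef = np.polyfit(x_tr, y_tr, l)  # degree-l polynomial, least squares
    mse_tr.append(np.mean((y_tr - np.polyval(coef, x_tr)) ** 2))
    mse_va.append(np.mean((y_va - np.polyval(coef, x_va)) ** 2))

print(mse_tr)  # non-increasing in the degree l
print(mse_va)  # eventually penalized by overfitting
```

The training MSE can only decrease as l grows (the models are nested), so it cannot be used on its own to choose l; that is the motivation for validation data and for criteria such as AIC and BIC below.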
Fig. 3.17 Simulated training data y_i [panel (a)] and validation data y_i* [panel (b)] based on 50 design points x_i, i = 1, …, 50. The true model used for the simulation is y_i = −1 + 0.3x_i + 0.4x_i² − 0.8x_i³ + ε_i with ε_i ∼ N(0, 0.07²). Panels (c)–(e) show estimated polynomials of degree l = 1, 2, 5 based on the training set. Panel (f) displays the mean squared error MSE(l) of the fitted values as a function of the polynomial degree (solid line); the dashed line shows MSE(l) when the estimated polynomials are used to predict the validation data y_i*.
Derive an expression for the AIC of the linear model with Gaussian errors.
Derive an expression for the BIC of the linear model with Gaussian errors.
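As a hedged sketch of where these exercises lead (one common normalization; additive constants are kept or dropped differently across texts), plugging the ML estimates into the Gaussian log-likelihood gives:

```latex
% Log-likelihood of the Gaussian linear model evaluated at the ML estimates:
l(\hat\beta, \hat\sigma^2_{ML})
  = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\hat\sigma^2_{ML}
    - \frac{1}{2\hat\sigma^2_{ML}}(y - X\hat\beta)^T(y - X\hat\beta)
  = -\frac{n}{2}\left(\log(2\pi) + \log\hat\sigma^2_{ML} + 1\right),
% since (y - X\hat\beta)^T(y - X\hat\beta) = n\,\hat\sigma^2_{ML}.
% With p + 1 estimated parameters (the p coefficients in beta plus sigma^2):
\mathrm{AIC} = -2\,l(\hat\beta, \hat\sigma^2_{ML}) + 2(p + 1)
             = n\log\hat\sigma^2_{ML} + 2(p + 1) + \text{const},
\qquad
\mathrm{BIC} = -2\,l(\hat\beta, \hat\sigma^2_{ML}) + \log(n)\,(p + 1)
             = n\log\hat\sigma^2_{ML} + \log(n)\,(p + 1) + \text{const}.
```

The key step is that the quadratic term collapses to n/2 at the MLE, leaving the model complexity to enter only through the penalty term.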