Managing Editor:
M. HAZEWINKEL
Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
Volume 389
Asymptotic Theory of
Nonlinear Regression
by
Alexander V. Ivanov
Glushkov Institute for Cybernetics,
Kiev, Ukraine
All Rights Reserved
1997 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1997
Softcover reprint of the hardcover 1st edition 1997
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
Contents
INTRODUCTION 1
1 CONSISTENCY 5
1 Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . .. 5
2 Large Deviations of the Least Squares Estimator in the Case of
Errors Having an Exponential Moment . . . . . . . . . . . . . . .. 9
3 Large Deviations of the Least Squares Estimator in the Case of
Errors with a Moment of Finite Order . . . . . . . . . . . . . . .. 25
4 The Differentiability of Regression Functions and the Consistency
of the Least Squares Estimator . . . . . . 45
5 Strong Consistency . . . . . . . . . . . . . . . 58
6 Taking the Logarithm of Non-Linear Models. 73
APPENDIX 289
I Subsidiary Facts . . . . . . 289
II List of Principal Notations. 298
COMMENTARY 303
Chapter 1 303
Chapter 2 304
Chapter 3 305
Chapter 4 306
BIBLIOGRAPHY 309
INDEX 325
Introduction
and the coefficients of the approximating polynomials play the role of the estimated
parameters. Non-linear regression models have important advantages over linear
ones. The main one consists in the greater adequacy of non-linear models, which
have essentially fewer unknown parameters. Often the parameters of non-linear
models have the meaning of physical variables, while the linear parameters are
usually devoid of physical significance.
The last quarter of a century's study of non-linear regression models has
steadily attracted the interest of specialists; we refer only to the books of Bard [9],
Ratkowsky [192,193], Gallant [84], Bates and Watts [27], Demidenko [70,71], Seber
and Wild [201], and Ross [196]. However, in introducing non-linear regression
analysis into statistical practice it is necessary to overcome a series of mathematical
difficulties which have no analogues in the linear theory. For example, the least
squares estimator (l.s.e.) of parameters entering non-linearly in a regression
function cannot be found in explicit form, and this complicates the description of
its mathematical properties. Another principal peculiarity consists in this:
characteristics of the l.s.e. (bias, correlation matrix, and the like) are local, i.e.,
they depend on the unknown values of the parameters. These two features bring the
model (0.1) close to the general models of parametric statistical inference from
independent, non-identically distributed observations, although one should not
overlook that all the inhomogeneity of the sample (0.1), X = (X_1, ..., X_n), is
concentrated only in the shifts g(j, θ), j = 1, ..., n.
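Because the l.s.e. of parameters entering non-linearly has no closed form, in practice it is obtained by numerical minimisation of L(θ, X) = Σ (X_j − g(j, θ))^2. A minimal sketch in Python; the exponential-decay model g(j, θ) = e^{−θj}, the search interval, and all names are illustrative assumptions, not the book's prescriptions:

```python
import math
import random

def g(j, theta):
    # illustrative regression function: g(j, theta) = exp(-theta * j)
    return math.exp(-theta * j)

def L(theta, X):
    # sum of squares L(theta, X) = sum_j (X_j - g(j, theta))^2
    return sum((x - g(j, theta)) ** 2 for j, x in enumerate(X, start=1))

def lse(X, lo=1e-6, hi=5.0, iters=200):
    # golden-section search for the minimiser of L over a compact [lo, hi];
    # this plays the role of the l.s.e., which has no explicit formula here
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - phi * (b - a), a + phi * (b - a)
        if L(c, X) < L(d, X):
            b = d
        else:
            a = c
    return (a + b) / 2.0

random.seed(0)
theta0, n = 0.7, 50
X = [g(j, theta0) + 0.01 * random.gauss(0.0, 1.0) for j in range(1, n + 1)]
theta_hat = lse(X)
```

With noise-free data the search recovers θ_0 up to the tolerance of the interval shrinkage; with noisy data θ̂ is random, which is exactly why its bias and variance must be studied asymptotically.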
The maximum likelihood method plays an important role in the theory of
parametric estimation; for its application it is necessary to know the distribution
(or the density) of the observations to within an unknown parameter. The Bayes
method of estimation and the method of maximal a posteriori density share the
same peculiarity.
Without forgetting the success achieved by the theory of normal regression
(the ε_j are Gaussian (0, σ^2) r.v.-s), let us emphasise that the meaning of
regression analysis consists in performing statistical inference using minimal
information about the distribution of the r.v.-s ε_j. Consequently the methods of
estimation used in regression analysis must be oriented towards the use of meagre
information about the errors of observation ε_j. Such methods are those of
M-estimators, the minimisation of empirical risk, minimal contrast (if the
contrast function does not depend upon the density of observations), the method of
recurrent estimation, and others (see the book by Birkes and Dodge [36]).
In the stream of publications devoted to the problems of regression analysis
the central place is occupied by the least squares method of estimation of
parameters, which has a protracted history. The basic statements of the linear
theory were worked out in the researches of Gauss [85] and Markov [153,154], and
then developed by Aitken [7], Neyman and David [166], Linnik [148], Rao [189],
and many others.
(the symbol 'Σ' stands in the whole book for the summation over the index j from
1 to n).
It follows that we should recognise that the asymptotic (i.e., n → ∞) properties
of the l.s.e. of the parameter θ of the non-linear model (0.1) were in fact not studied
until the works of Jennrich [138] and Malinvaud [152], and the basic results in this
area of mathematical statistics were obtained in the course of the last two decades.
A large part of the proposed investigation is devoted to the study of the
asymptotic statistical properties of the l.s.e. θ̂_n of the parameter θ of the model
(0.1): the probabilities of large deviations of the l.s.e., its weak and strong
consistency, stochastic asymptotic expansions (s.a.e.) and asymptotic expansions
(a.e.) of the distribution functions of the l.s.e. θ̂_n, and various functionals of
them (in particular, of the estimator of the variance σ^2 = Eε_j^2 > 0 of the errors
of observation ε_j, and others). A series of questions is also examined that is
connected with the differential geometry of the a.e. obtained in this book. Less
attention is given in this book to the least moduli estimator (l.m.e.).
DEFINITION: The l.m.e. of the parameter θ ∈ Θ for the observations X = (X_1, ...,
X_n) of the form (0.1) is any random vector θ̃_n = θ̃_n(X) = (θ̃_n^1, ..., θ̃_n^q) ∈ Θ^c having
the property
Consistency
1 INTRODUCTORY REMARKS
Here the simplest fact, but one very important in applications, is this: if Θ^c is
compact and g(j, θ) ∈ C(Θ^c), j ≥ 1, then the l.s.e. θ̂_n exists [138].
Clearly, if Θ^c is compact and g(j, θ) ∈ C(Θ^c), j ≥ 1, then an analogous
assertion holds for the l.m.e. θ̃_n and any other estimators of the parameter θ
which are defined as optimum points of a functional continuous in the arguments
(θ, X).
It is clear that there is also interest in the case when Θ is an unbounded set.
Let us mention one assertion covering this case for the continuous functions g(j, θ),
j ≥ 1.
Let us assume that the functions g(j, θ) ∈ C(Θ^c), j ≥ 1, are such that inf L(θ, X)
is attained in Θ^c for each X ∈ ℝ^n. Then the l.s.e. θ̂_n exists [174].
If for at least one of the functions g(j, θ)

lim_{|θ|→∞} |g(j, θ)| = ∞,

then for each X ∈ ℝ^n and number m > 0 it is possible to determine a compact
set T_{X,m} ⊂ Θ^c such that
Let us note that the assertions mentioned above are a very special case of the
theorems on measurable choice [8,145] and, in actuality, one is able to prove the
existence of the l.s.e. θ̂_n under considerably weaker requirements. We do not cite
the corresponding formulations, since only continuous and differentiable regression
functions are considered below.
Let E_n, n ≥ 1, be a sequence of statistical experiments generated by independent
observations X = (X_1, ..., X_n), and let θ̂_n = θ̂_n(X) be a certain sequence of
estimators of the parameter θ ∈ Θ.
DEFINITION: A sequence θ̂_n, n ≥ 1, is called a consistent sequence of estimators
of θ (θ̂_n is a consistent estimator of θ) if for any r > 0

P_θ^n{ |θ̂_n − θ| ≥ r } →_{n→∞} 0.   (1.1)
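For a model in which the l.s.e. is explicit, (1.1) can be verified directly via Chebyshev's inequality. A small sketch in Python, under the assumed (not the book's) linear model X_j = θj + ε_j with Eε_j = 0, Eε_j^2 = σ^2: here θ̂_n = Σ jX_j / Σ j^2, so P_θ^n{|θ̂_n − θ| ≥ r} ≤ σ^2/(r^2 Σ j^2) → 0:

```python
def chebyshev_bound(n, sigma2=1.0, r=0.1):
    # Chebyshev bound on P{|theta_hat_n - theta| >= r} for the assumed model
    # X_j = theta * j + eps_j, where theta_hat_n = sum(j X_j) / sum(j^2)
    # is unbiased with variance sigma2 / sum(j^2)
    s = sum(j * j for j in range(1, n + 1))
    return sigma2 / (r * r * s)

bounds = [chebyshev_bound(n) for n in (10, 100, 1000)]
```

The bound decays like n^{−3}, since Σ j^2 grows like n^3/3; any design for which the analogous sum diverges yields (1.1).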
Since the experimenter does not know the value of the parameter θ, it is
important to guarantee uniform convergence to zero of the probability (1.1) over
some sets of parameters in the set Θ.
DEFINITION: A sequence θ̂_n, n ≥ 1, is called a uniformly consistent sequence of
estimators of θ in the set T ⊂ Θ (θ̂_n is a uniformly consistent estimator of θ in
the set T) if for any r > 0

sup_{θ∈T} P_θ^n{ |θ̂_n − θ| ≥ r } →_{n→∞} 0.   (1.2)
If the observations X have the form (0.1) and θ̂_n is the l.s.e., then it is not
difficult to adduce an example of a sequence of functions g(j, θ) for which θ̂_n does
not satisfy (1.1).
EXAMPLE 1: (Cf. [152]). Let

g(j, θ) = e^{−θj}, θ ∈ Θ = (0, A), A < ∞; Eε_j^2 = σ^2 < ∞.
Let us assume that θ̂_n is a consistent estimator of the parameter θ = θ_0. If
|θ̂_n − θ_0| < r for r < θ_0, then θ̂_n satisfies the equation

dL(θ̂_n)/dθ = 0,

or

0 = Σ j e^{−θ_0 j} ε_j + Σ j (e^{−θ̂_n j} − e^{−θ_0 j}) ε_j + Σ j e^{−θ̂_n j} (e^{−θ_0 j} − e^{−θ̂_n j})
  = α_n + β_n + γ_n,

with

|β_n| < r Σ j^2 e^{−(θ_0 − r)j} |ε_j|.

Let

A = { Σ j^2 e^{−(θ_0 − r)j} |ε_j| < r^{−1/2} }.

Then for elementary events from the set {|θ̂_n − θ_0| < r} ∩ A the inequality |α_n| <
r^{1/2} + c_1(θ_0) r holds. Consequently,
By Markov's inequality the probability of the event complementary to A is small
for sufficiently small r, whence the contradiction obtained shows that in the
example under consideration the l.s.e. θ̂_n is not a consistent estimator of θ_0.
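The failure in Example 1 traces to the convergence of Σ j^2 e^{−2θj}: the information about θ_0 carried by the design stays bounded as n → ∞. A quick numerical illustration (Python; the helper name is ours):

```python
import math

def info_sum(theta, n):
    # partial sums of sum_j (d g / d theta)^2 = sum_j j^2 exp(-2 theta j)
    # for g(j, theta) = exp(-theta j); the series converges, so the
    # accumulated "information" is bounded no matter how large n is
    return sum(j * j * math.exp(-2.0 * theta * j) for j in range(1, n + 1))

s_small, s_large = info_sum(0.5, 50), info_sum(0.5, 5000)
```

The partial sums are numerically indistinguishable long before n = 5000, in contrast with, say, a linear design, where the analogous sum grows without bound.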
Even when the l.s.e. θ̂_n is consistent, the property (1.2) does not always hold. Let
us denote
In the model (0.1) let ε_j have a Gaussian distribution with parameters (0, 1). If
the sequence of functions g(j, θ) is such that for some θ_1, θ_2 ∈ T ⊂ Θ, θ_1 ≠ θ_2,

∫_{−∞}^{∞} (p'(x))^2 / p(x) dx < ∞.
If there exists a consistent estimator θ̂_n of the parameter θ, then for any θ_1, θ_2 ∈ Θ,
θ_1 ≠ θ_2,

P_θ^∞ = ∏_{j∈ℕ} P_θ^{(j)}.
The definitions of consistency have the same meaning if in (1.1) and (1.2) we
write P_θ^∞ instead of P_θ^n. If the estimator θ̂_n is consistent, then the measures
{P_θ^∞, θ ∈ Θ} form a class of mutually singular measures. The sequence of r.v.-s
X^ℕ which corresponds to the measure P_{θ_1}^∞ is a shift of the sequence X^ℕ which
corresponds to the measure P_{θ_2}^∞. For the singular measures P_{θ_1}^∞ and P_{θ_2}^∞ the
series Σ_{j=1}^∞ [g(j, θ_1) − g(j, θ_2)]^2 diverges [115,203,213].
(1.4)
2. ERRORS WITH EXPONENTIAL MOMENTS 9
Let B_n(θ) ∈ 𝔅^q be a Borel set, the closure B_n^c(θ) of which does not contain θ. If
θ̂_n ∈ B_n(θ), then from (1.4) and (1.5) it follows that
In the proofs of the assertions of this and the following sections a general approach
to the study of the probabilities of large deviations of statistical estimators is used,
which was developed in the works of Ibragimov and Has'minskii [120]. In this
connection moment conditions are imposed upon the r.v.-s ε_j only.
Let us assume that the r.v. ε_j satisfies the condition:

I_∞: There exist constants τ > 0 and 0 < R ≤ ∞ such that

E e^{t ε_j} ≤ e^{(τ/2) t^2} for |t| ≤ R.
In this Section it will be shown that the fulfilment of the condition I_∞ and of
a series of requirements on the functions g(j, θ) ensures an exponential bound for
the probability of large deviations of the normalised l.s.e. θ̂_n.
Let d_n(θ) be a diagonal matrix of order q with elements d_{in}, i = 1, ..., q, along
the diagonal; the l.s.e. will be normalised by the matrices d_n(θ).
Let us write

II_0: For some α ∈ (0, 1] there exist constants c_1 = c_1(T) < ∞ and c_2 = c_2(T) ≥ 0
such that

sup_{θ∈T} sup_{u_1, u_2 ∈ v^c(Q) ∩ U_n(θ)} Φ_n^{1/2}(u_1, u_2) |u_1 − u_2|^{−α} ≤ c_1(1 + Q^{c_2}), Q > 0.   (2.1)
Subsequent conditions, conditions of the distinguishability of the parameters of the
regression function g(j, θ), are more subtle. Let

Ψ_n(x_1, ..., x_q)

be a sequence of functions,

Ψ_n(0, ..., 0) = 0.

Let

Ψ_{mn}(x) ≤ Ψ_{mn}^0(x), x ≥ 0,

monotonically non-decreasing in n and x, be such that for any a > 0

lim_{n→∞, x→∞} x^a e^{−Ψ_{mn}(x)} = 0;   (2.2)

Ψ_{qn}(x) ≤ Ψ_{qn}^0(x), x ≥ 0,

monotonically non-decreasing in n and x, be such that for any a > 0 and some
constant c_3(a) < ∞

(2.3)
The fulfilment of the condition III_0 signifies that the function Φ_n^{1/2}(u, θ)
distinguishes all variables u^1, ..., u^q (or, what amounts to the same, the
regression function g(j, θ) distinguishes all the parameters θ^1, ..., θ^q) so well
that (2.2) holds. The conditions III_m, m = 1, ..., q − 1, describe the situation in
which the variables u^1, ..., u^m (the parameters θ^1, ..., θ^m) are distinguished
well only in certain intervals of values depending upon n, and outside these
intervals the functions Φ_n^{1/2}(u, θ) lose sensitivity as the quantities
|u^1|, ..., |u^m| grow. If also |u^i| ≤ z_{in}, i = 1, ..., m, then good discrimination
of the variables as |u|_0 → ∞ is expressed by the relation (2.2), which is realised
at the expense of the growth of the function Φ_n^{1/2}(u, θ) in response to the growth
of |u^{m+1}|, ..., |u^q|. The condition III_q includes the cases when Φ_n^{1/2}(u, θ)
discriminates the variables well (relation (2.3)) only inside the parallelepiped
defined by the sequences z_{in}, i = 1, ..., q.
EXAMPLE 2: Let

g(j, θ) = θ^1 cos θ^2 j, j ≥ 1,

Θ = (0, A) × (h, π − h), h > 0, A < ∞,

T = [a, b] × [φ, π − φ], φ > h, a > 0, b < A.
Let us set d_n = diag(n^{1/2}, n^{3/2}). Then

Φ_n(u, θ) = (u^1)^2 + θ^1 (θ^1 + n^{−1/2} u^1) ( 1 − sin((n + 1/2) n^{−3/2} u^2) / (2n sin((1/2) n^{−3/2} u^2)) ) + χ_n,

φ_n(ρ^1, ρ^2) = (ρ^1)^2 + a^2 n ( 1 − sin((n + 1/2) n^{−3/2} ρ^2) / (2n sin((1/2) n^{−3/2} ρ^2)) ) + χ_n,

Ψ_{1n}(x) = min( a^2 (1 − sin 1), π^2/2 ) x^2.
Let us introduce the events
THEOREM 1: Let the conditions I_∞, II_0, III_m be satisfied for some 0 ≤ m ≤ q.
Then there exist positive constants c_4-c_7 such that for H > 0
(2.4)
(2.5)
A_n^{(0)} = ℝ^q, A_n^{(m)} = { u ∈ ℝ^q : |u^i| ≤ z_{in}, i = 1, ..., m }, m = 1, ..., q.

≤ P_θ^n{ sup_{u ∈ v^c(H) ∩ A_n^{(m)} ∩ U_n(θ)} ν(θ + d_n^{−1} u, θ) ≥ 1/2 }.   (2.6)
If m = q then the sequence U(p) contains only a finite number of non-empty sets,
not exceeding [max_{1≤i≤q} z_{in} − H] − 1 in number. Extending the inequality (2.6) we find
Let
In the condition I_∞ let R < ∞ be a constant. Let us fix i and apply
Theorem A.1 to the r.v.
The conditions of this theorem are satisfied for R from condition I_∞ and for
where τ is the constant from condition I_∞. Consequently, in the given case

G = Σ τ_j = τ.

Let us take
where one can take c_9 = κ(s)(μ_s + μ_2^{s/2}) for s > 2 and c_9 = 2μ_s for 1 ≤ s ≤ 2. On
the other hand,
(2.13)
From conditions II_0 and III_m, and inequalities (2.11)-(2.13), for any θ ∈ T and
any u', u'' ∈ U(p), we obtain
where
Let s be large enough for αs > q to hold. Applying Theorem A.3 to the random
field
where κ_0 depends on s, q, α and does not depend upon h and the set U(p).
In this way, for any θ ∈ T and R < ∞, from (2.8), (2.9) and (2.15) it follows
that

P_θ^n{ sup_{u ∈ U(p)} ν(θ + d_n^{−1} u, θ) ≥ 1/2 }   (2.16)
and in (2.17)
Then the inequalities (2.16) and (2.17) can be rewritten in the following form:
< (2.18)
R = ∞,
where
Since the function f_n(Q) has no more than a power growth as Q → ∞, condition
III_m and the bound (2.18) give an opportunity to obtain an inequality
extending (2.7):
< (2.19)
R = ∞,
Let us assume that Ψ_n(ρ^1, ..., ρ^q) ∈ D(0), Ψ_n(ρ^1, ..., ρ^q) = Ψ_n(|ρ|), and that
the function Ψ_n(x), x ≥ 0, is monotonically non-decreasing in x and n. Since
Ψ_{0n}(x) ≤ Ψ_n(x), x ≥ 0, then in (2.2) it is possible to take Ψ_{0n}(x) = Ψ_n(x).
THEOREM 2: Let conditions I_∞, II_0 and III_0 be satisfied, with

R = ∞.   (2.22)
Proof: Theorem 2 is an 'isotropic', or, as one can say, 'radial', variant of Theorem 1
and is proved analogously: in the argument one need only substitute the Euclidean
norm |·| for the uniform norm |·|_0.

α = 1, d_n = n^{1/2} I_q.
and
sup_{θ∈T} E_θ^n exp{ c_28 Ψ_n(|d_n(θ)(θ̂_n − θ)|) } < ∞, R < ∞;

Proof: Let c_28 < c_19. Integrating by parts, for θ ∈ T we obtain

E_θ^n exp{ c_28 Ψ_n(|d_n(θ)(θ̂_n − θ)|) } = ∫_0^∞ e^{c_28 Ψ_n(x)} d( −P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ x } ).
Let us return to the proof of Theorem 1. Having obtained the inequality (2.8), let
us apply, for fixed i, Theorem A.1 to the r.v.
The conditions of Theorem A.1 are fulfilled for R from condition I_∞ and for
and therefore
if Φ_{0n}(u_i, θ) ≥ 1/2 − δ.

Proof: If

Φ_{0n}(u_i, θ) < 1/2 − δ,

then from (2.23) and condition III_m there follows the inequality
then from (2.23), condition III_m, and (2.24) there follows the inequality
where m = 1, ..., q, and a_{ℓi} = a_{ℓi}(T) < ∞ and b_{ℓi} = b_{ℓi}(T) > 0 are some
constants, ℓ = 1, 2.

LEMMA 4.1: Let conditions I_∞, II_0, III_m be fulfilled for the functions Ψ_n ∈ D(m).
Then there exist constants a_{ℓi} < ∞, b_{ℓi} > 0, f_{ℓi} < ∞, and g_{ℓi} > 0 such that
P_q = P_θ^n{ sup_{u ∈ U(ρ^1, ..., ρ^q) ∩ U_n(θ)} ν(θ + d_n^{−1} u, θ) ≥ 1/2 }

≤ c_32 h^{−q} exp{ −(1/2)(1/2 − δ) R Ψ_n(z_{1n}, ρ^2, ..., ρ^q) }   (2.28)

+ δ^{−s}( c_33 + c_34( Q_q^{2c_2 s + q} Ψ_n^{−2s}(1, z_{1n}) + Q_q^{2c_2 s + αs + q} Ψ_n^{−3s}(1, z_{1n}) ) ) h^{αs − q},

where

Q_q = max(z_{1n} + ρ^1 + 1, ρ^2 + 1, ..., ρ^q + 1) ≤ Q_{q−1} + ρ^q + 1,

Q_{q−1} = max(z_{1n} + ρ^1 + 1, ρ^2 + 1, ..., ρ^{q−1} + 1).
for those values of x for which the right hand side of this inequality has a meaning.
From (2.30), in particular, it follows that

Σ_{ρ^1=0}^∞ ( z_{1n} + ρ^1 )^{2c_2 s + αs + q} e^{−c_42 Ψ_n(1, z_{1n} + ρ^1)} < c_41 e^{−c_42 Ψ_n(1, z_{1n})}.   (2.32)
where

b_{11} < (R/2)(1/2 − δ) (αs − q) / (αs(2c_2 s + αs + q + 1)).   (2.33)
In the inequality (2.33) let us define the maximum point s* of the function

f(s) = (αs − q) / (αs(2c_2 s + αs + q + 1)).
Thus we evidently obtain the upper bound values of the constant b_{11} for which
the constant g_{11} still remains positive. Some simple calculations show that
In particular, if c_2 = 0, α = 1, then

s* = q + √(2q^2 + q), f(s*) = ( √q + √(2q + 1) )^{−2}, b_{11} < (R/4) f(s*).

Analogous arguments show that for c_2 = 0 and α = 1 with R = ∞,

b_{21} < (1/8) f(s*).
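The closed forms above can be checked numerically: with c_2 = 0 and α = 1 the function f(s) = (s − q)/(s(s + q + 1)) attains its maximum at s* = q + √(2q^2 + q), where f(s*) = (√q + √(2q + 1))^{−2}. A sketch in Python (grid maximisation; the function names are ours):

```python
import math

def f(s, q, alpha=1.0, c2=0.0):
    # the function f(s) = (alpha s - q) / (alpha s (2 c2 s + alpha s + q + 1))
    return (alpha * s - q) / (alpha * s * (2.0 * c2 * s + alpha * s + q + 1.0))

def argmax_grid(q, lo, hi, steps=200_000):
    # crude grid maximisation of f over [lo, hi] with c2 = 0, alpha = 1
    best_s, best_v = lo, f(lo, q)
    for k in range(1, steps + 1):
        s = lo + (hi - lo) * k / steps
        v = f(s, q)
        if v > best_v:
            best_s, best_v = s, v
    return best_s, best_v

q = 3
s_star = q + math.sqrt(2.0 * q * q + q)                     # s* = q + sqrt(2q^2 + q)
f_star = (math.sqrt(q) + math.sqrt(2.0 * q + 1.0)) ** -2.0  # f(s*) closed form
s_num, f_num = argmax_grid(q, q + 1e-6, 10.0 * q)
```

The grid maximum agrees with both closed forms, which can also be verified algebraically by solving f'(s) = 0, i.e. s^2 − 2qs − (q^2 + q) = 0.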
The same bounds hold for b_{1i}, b_{2i}, i = 2, ..., m. Let us denote

≤ c_4 e^{−c_5 Ψ_{mn}(H)} + Σ_{i=1}^m f_{1i} e^{−(g_{1i}/c_6) Ψ_n(i, H)}, R < ∞;

≤ c_8 e^{−c_7 Ψ_{mn}^0(H)} + Σ_{i=1}^m f_{2i} e^{−(g_{2i}/c_9) Ψ_n^0(i, H)}, R = ∞,

where
Proof: The assertion of the Theorem follows from Theorem 1, Lemma 4.1 and
inequality (2.26).
The following result is closely related to the previous one. Let condition I_∞ be
satisfied for R < ∞. Then there exists a constant a > 0 such that

E e^{t(|ε_j| − μ_1)} < ∞ for |t| ≤ a.

Consequently constants τ_0 > 0 and R_0 > 0 can be found ([172], pp. 54-55) such
that for |t| ≤ R_0

E e^{t(|ε_1| − μ_1)} ≤ e^{(τ_0/2) t^2}.
If conditions III_m and (2.24) are satisfied, then for any θ ∈ T and the sets
A_{in} = { u ∈ ℝ^q : |u^i| ≤ z_{in} }, i = 1, ..., m,

≤ Σ_{i=1}^m P_θ^n{ |d_{in}(θ)(θ̂_n^i − θ^i)| ≥ z_{in} }   (2.34)

Applying Theorem A.1 to the sum of the r.v.-s ε̃_j = |ε_j| − μ_1, we find for i =
1, ..., m,

≤ exp{ −(z_{0n}/(2τ_0)) ( n^{−1} Ψ_n(i, z_{in}) − μ_1 )^2 }, n^{−1} z_{0n} Ψ_n(i, z_{in}) ≤ 2(τ_0 R_0 + μ_1).
THEOREM 5: Let the conditions I_∞ (R < ∞), II_0, III_m, (2.24) and (2.34) be
satisfied. Then

(z_{0n}/(2n)) Ψ_n(i, z_{in}) − μ_1.
Proof: The result follows from Theorem 3 and the inequality (2.35).
3 LARGE DEVIATIONS OF THE LEAST SQUARES ESTIMATOR IN
THE CASE OF ERRORS WITH A MOMENT OF FINITE ORDER
(3.1)
n, x → ∞.
The constants c_2 and α are taken from condition II_1, which reproduces condition
II_0 of Section 2 in the following form:

II_1: For some α ∈ (0, 1] there exist constants c_1 = c_1(T) < ∞ and c_2 = c_2(T) ≥ 0
such that

sup_{θ∈T} sup_{u_1, u_2 ∈ v^c(Q) ∩ U_n(θ)} Φ_n^{1/2}(u_1, u_2) |u_1 − u_2|^{−α} ≤ c_1(1 + Q^{c_2}), Q > 0.
THEOREM 6: If the conditions I_s, II_1, III_{q+1} and αs > q are satisfied, then for
some constants c_3, c_4 < ∞
Proof: The proof is analogous to the proof of the Theorem of Section 2. For
p = 0, 1, ... let us write

U(p) = ( v^c(H(p + 2)) \ v(H(p + 1)) ) ∩ U_n(θ).

Then for any θ ∈ T

P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ H } ≤ Σ_{p=0}^∞ P_θ^n{ sup_{u ∈ U(p)} ν(θ + d_n^{−1} u, θ) ≥ 1/2 };
By the inequality of Theorem A.2 and condition II_1, for u_1, u_2 ∈ v^c(H(p + 2)) ∩
U_n(θ) we have

E_θ^n |w(θ + d_n^{−1} u_1, θ + d_n^{−1} u_2)|^s ≤ κ(s)(μ_s + μ_2^{s/2}) c_1^s (1 + (H(p + 2))^{c_2})^s |u_1 − u_2|^{αs}.

Therefore Theorem A.3 applied to the random field

u ∈ v^c(H(p + 2)) ∩ U_n(θ),

allows one to arrive at an upper bound for the probability (3.3) of the form

c_5 (H(p + 1))^{(c_2 + α)s} Ψ_n^{−2s}(H(p + 1)).
Consequently

P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ H } ≤ Σ_{p=0}^∞ c_5 (H(p + 1))^{(c_2 + α)s} Ψ_n^{−2s}(H(p + 1)).

In (3.4) let us set H = n^{1/2} r, where r > 0 is an arbitrary number. Then with the
condition s > q the inequality
THEOREM 7: Let condition I_s be satisfied with αs > q, condition II_1 with c_2 = 0,
condition III_{q+2}, and (2.24). Then

where γ ∈ (0, 1) is some constant such that 2βγ − α > s^{−1}.
+ P_θ^n{ a_0 Ψ_n^{b_0}(z_n) ≤ |d_n(θ)(θ̂_n − θ)| ≤ z_n },

where a_0 is some constant and b_0 = 2γs/(αs + 1). Let H ≤ z_n. Then, using the
condition 2β − α > s^{−1} analogously to the proof of Theorem 6, we obtain
P_1 ≤ c_12 Σ_{p=0}^{[z_n/H]−1} (H(p + 1))^{αs} Ψ_n^{−2s}(H(p + 1))

≤ c_12 κ_0^{−2s} H^{−(2β−α)s} Σ_{p=1}^{[z_n/H]} p^{−(2β−α)s}

≤ c_13 κ_0^{−2s} H^{−(2β−α)s}.   (3.7)

P_2 ≤ c_14 Ψ_n^{−2s}(z_n) z_n^{αs} Σ_{p=1}^{[a_0 Ψ_n^{b_0}(z_n) z_n^{−1}]} p^{αs}
Therefore, by Theorem A.4 applied to the r.v.-s ε̃_j = |ε_j| − μ_1 and thanks to the
inequality (3.6), we obtain the bound

≤ c_16 Ψ_n^{−2s}(H) H^{αs} ∫_0^{a_0 Ψ_n^{b_0}(z_n) H^{−1} + 1} ρ^{αs} dρ ≤ c_17 H^{−1} Ψ_n^{−2s}(H) Ψ_n^{2γs}(z_n).
Then
P_1 = P_2 = 0,
and by virtue of the Theorem's conditions
REMARK 7.1: In the proof of Theorem 7 the relation (3) of condition III_{q+2} is not
used directly. It shows, however, that we would not be justified in arguing as in the
proof of Theorem 6. In fact, if (3) is satisfied then for x > z_n and n > n_0
where
(3.10)
The constraint (3.10) covers the bulk of the regression models used in practice,
and allows us to apply various limit theorems of probability theory to the study of
the asymptotic statistical properties of the l.s.e. θ̂_n. Strictly speaking, the
general asymptotic theory of non-linear regression for the case when the condition
(3.10) is violated has not been constructed up to now.
We prove an assertion analogous to the preceding theorems of this section for
H = r n^{1/2}, r > 0 an arbitrary number. Let us set
III_{q+3}: For some R_0 > 0 and any r ∈ (0, R_0] there exist numbers Δ = Δ(R_0) > 0
and ρ = ρ(r, R_0) > 0 such that

inf_{θ∈T} inf_{u ∈ (v^c(R_0)\v(r)) ∩ U_n(θ)} n^{−1} Φ_n(u, θ) ≥ ρ,   (3.12)
(3.13)
Let us denote

Proof: Let r ∈ (0, R_0] be a fixed number, and R_0, ρ, Δ numbers the existence of
which is guaranteed by the condition III_{q+3}. Let us write

ν(u) = ν(θ + n^{1/2} d_n^{−1}(θ) u, θ),

π(Δ) = P{ s* ≥ μ_2 + Δ }.

By the inequality (1.6) for any θ ∈ T we obtain

+ P_θ^n{ sup_{u ∈ (v^c(R_0)\v(r)) ∩ U_n(θ)} ν(u) ≥ Δ } = P_1 + P_2
the diameter of each F^{(i)} is less than δ, corresponding in condition II_2 to the
numbers ε and R_0 (the quantity ε will be chosen below), u_i ∈ F^{(i)} ∩ Ū_n(θ), i =
1, ..., l_0, l_0 ≤ l (we consider the F^{(i)} to be renumbered so that F^{(i)} ∩ Ū_n(θ) ≠ ∅
for i ≤ l_0). Then

+ P_θ^n{ sup_{u', u'' ∈ F^{(i)} ∩ Ū_n(θ)} |ν(u') − ν(u'')| ≥ 1/4 }
Using the Chebyshev inequality, the inequality of Theorem A.2 for s from condition
I_s, and (3.11), we obtain

P_3 ≤ c_18 Φ_n^{−s/2}(u_i, θ) ≤ c_19 n^{−s/2}.

We remark that

|ν(u') − ν(u'')| ≤ |w(θ + n^{1/2} d_n^{−1} u', θ)| |Φ_n^{−1}(u', θ) − Φ_n^{−1}(u'', θ)|
+ |w(θ + n^{1/2} d_n^{−1} u', θ) − w(θ + n^{1/2} d_n^{−1} u'', θ)| Φ_n^{−1}(u'', θ).

Collecting together the bounds for P_1-P_4, we obtain the assertion of Theorem 8.
REMARK 8.1: Let us assume that

i = 1, ..., q,

and Θ be a bounded set. Then Theorem 8 is valid with the assumption III_{q+3} in
the simplified version:
For any r > 0 there exists a number ρ = ρ(r) > 0 such that

inf_{θ∈T} inf_{u ∈ (Θ^c − θ)\v(r)} n^{−1} Φ_n(θ + u, θ) ≥ ρ.   (3.14)
Let us take one result on the consistency of the l.m.e. θ̃_n for the model (0.1)
and symmetric r.v.-s ε_j. Let us write

(2) For any r > 0 and s from condition I_s' there exist constants κ^{(s)} = κ^{(s)}(r) <
∞ such that

sup_{θ∈T} sup_{u ∈ v^c(r) ∩ U_n(θ)} n^{−1} Φ_{sn}(u, θ) ≤ κ^{(s)}, s ≥ 2,   (3.17)

sup_{θ∈T} sup_{u ∈ v^c(r) ∩ U_n(θ)} Φ_{0n}(u, θ) ≤ κ^{(1)}, s = 1.   (3.18)
THEOREM 9: Let ε_j be a symmetric r.v. If the assumptions I_s', II_3, III_{q+4} are
satisfied, then for any r > 0

sup_{θ∈T} P_θ^n{ |d_n(θ)(θ̃_n − θ)| ≥ r n^{1/2} } = O(n^{−s+1}), s ≥ 2; = o(1), s = 1.
Clearly

P_θ^n{ |d_n(θ)(θ̃_n − θ)| ≥ r n^{1/2} } = P_1 + P_2,

where γ ∈ (0, 1) is some number.
The probability

P_1 = o(n^{−s+1})
by Theorem A.4. On the other hand,
Since, evidently,
then
(3.20)
Let us set r = r_0 and γ = 2/ρ_0. Then by condition III_{q+4}, the inequality (3.20),
and Theorem A.4 the probability (3.19) is a quantity of order o(n^{−s+1}). Consequently,
it remains to estimate the probability

+ P_θ^n{ inf_{u ∈ (v^c(r_0)\v(r)) ∩ Ū_n(θ)} n^{−1} h_n(θ, u) ≤ −γ' Δ(r) }

⋃_{i=1}^l F^{(i)} = v^c(r_0).
Since

E|ε_j + g| = g ∫_{−g}^{g} P(dx) + 2 ∫_{g}^{∞} x P(dx),

then

D|ε_j + g| = μ_2 + g^2 ( 1 − ( ∫_{−g}^{g} P(dx) )^2 ) − 4g ∫_{−g}^{g} P(dx) ∫_{g}^{∞} x P(dx) − 4 ( ∫_{g}^{∞} x P(dx) )^2.

The convergence of the latter two summands of the right hand side of (3.22) is
quite clear. Consequently

inf_{g ≥ 0} D|ε_j + g| > 0,   (3.23)
The inequalities (3.21) and (3.23) make it possible to apply Theorem A.5 to
the r.v.-s

η_{jn} = ξ_{jn} − E_θ^n ξ_{jn}

and, on this basis, for s ≥ 3 to write

P_θ^n{ n^{−1} |h_n(θ, u_i)| ≥ (1 − β) γ' Δ(r) }

≤ P_θ^n{ |h_n(θ, u_i)| (E_θ^n h_n^2(θ, u_i))^{−1/2} ≥ (1 − β) γ' Δ(r) n^{1/2} (μ_2 + κ^{(2)}(r_0))^{−1/2} }

≤ κ(τ) (μ_2 + κ^{(2)}(r_0))^{s/2} ((1 − β) γ' Δ(r))^{−s} n^{−s+1}.   (3.24)

In Theorem A.5 we took
c < 1,

∫_{|x| > g_{1n}} |x| P(dx) →_{n→∞} 0,
where ε' ∈ (0, ε) is some number. Consequently condition (1) of Theorem A.6 is
satisfied.
Let us next verify that condition (2) is satisfied for τ = 1:

≤ E_θ^n[ ε_i^2 + 2|ε_i| κ^{(1)}(r_0) + (κ^{(1)}(r_0))^2 + (μ_1 + κ^{(1)}(r_0))^2 ]
× χ{ |ε_i| < n + μ_1 + 2κ^{(1)}(r_0) }

≤ ∫_{|x| < n + μ_1 + 2κ^{(1)}(r_0)} x^2 P(dx) + 4μ_1 κ^{(1)}(r_0) + 2(κ^{(1)}(r_0))^2 + μ_1^2.
inf_{θ∈T} inf_{u ∈ (Θ^c − θ)\v(r)} n^{−1} E_θ^n R(θ + u) ≥ μ_1 + Δ(r).   (3.26)
Nevertheless, the relations (3.15) and (3.26) are awkward to verify. Let us
mention, for example, one sufficient condition for (3.26) to be fulfilled. Let us
assume that the d.f. of the r.v. ε_j has a Lebesgue decomposition with

κ_a > 0.
In particular, the l_2-estimator is the l.s.e., and the l_1-estimator is the l.m.e.
Keeping the notation Φ_{kn}(u_1, u_2) for non-integral k, let us assume that 1 < α < 2,
μ_{2α} < ∞ and:

I_α: (1) For any ε > 0, R > 0 there exists δ = δ(ε, R) such that

sup_{θ∈T} sup_{u_1, u_2 ∈ v^c(R) ∩ U_n(θ), |u_1 − u_2| < δ} n^{−1} Φ_{αn}(u_1, u_2) ≤ ε;   (3.29)

(2) For any R > 0 there exists a constant

κ^{(2α)} = κ^{(2α)}(R) < ∞

such that
III_{q+5}: For any r > 0 there exists Δ = Δ(r) > 0 such that

inf_{θ∈T} inf_{u ∈ U_n(θ)\v(r)} n^{−1/α} E_θ S_n^{1/α}(θ + n^{1/2} d_n^{−1}(θ) u)   (3.31)
THEOREM 10: Let μ_{2α} < ∞ for some α ∈ (1, 2), and let the conditions II_4 and
III_{q+5} hold. Then for any r > 0
Proof: Although the proof is similar to the proof of Theorem 9, it contains some
details that differ from the preceding arguments.
Let us denote
P_θ^n{ |n^{−1/2} d_n(θ)(θ̃_n − θ)| ≥ r }

+ P_θ^n{ inf_{u ∈ U_n(θ)\v(r)} n^{−1/α} h_n(θ, u) ≤ −γ Δ(r) }   (3.32)
Evidently,
P_1 → 0 a.s.,
and
Therefore
Let 0 < c_20 < (1 − γ)Δ(r) be some number. Then for n > n_0 and

c_21 = ( μ_{1/α} + (1 − γ)Δ(r) − c_20 )^α − μ_α > 0,

we have

Φ_n^{1/α}(u, θ) − S_n^{1/α}(θ) ≤ S_n^{1/α}(u) ≤ Φ_n^{1/α}(u, θ) + S_n^{1/α}(θ) (mod P_θ^n),
and consequently
≤ P_θ^n{ sup_{u ∈ v^c(R_0) ∩ Ū_n(θ)} n^{−1/α} |h_n(θ, u)| ≥ γ Δ(r) } + O(n^{−1}).   (3.33)
Let us introduce, as above, the closed sets F^{(1)}, ..., F^{(l)}, the diameters of which
do not exceed the number δ, and which correspond to the condition (3.29) with
the numbers R_0 and

⋃_{i=1}^l F^{(i)} = v^c(R_0).
Therefore for the probability P_3 of the right hand side of (3.33) we can write

P_3 ≤ Σ_{i=1}^{l_0} P_θ^n{ n^{−1/α} |h_n(θ, u_i)| ≥ (1 − c_23) γ' Δ(r) }.   (3.34)
We note that
From the inequalities (3.35)-(3.37) it follows that for some constant 0 < c_24 <
(1 − c_23) γ' Δ(r) and n > n_0
where the latter bound is valid for each summand of the right hand side of (3.34) .
Let us show that the condition III_{q+5} can in many cases be replaced by one
more convenient for checking. It was established above that

n^{−1/α} E_θ^n S_n^{1/α}(θ) →_{n→∞} μ_{1/α}.
Consequently, from the bound for (3.34) it follows that if instead of (3.30) the
condition

lim sup_{n→∞} sup_{θ∈T, τ∈Θ^c} n^{−1} Σ |g(j, θ) − g(j, τ)|^{2α} < ∞   (3.38)

sup_{θ_1, θ_2 ∈ Θ^c, |θ_1 − θ_2| < δ} n^{−1} Σ |g(j, θ_1) − g(j, θ_2)|^α ≤ ε;

(3) sup_{j≥1} sup_{θ∈T, τ∈Θ^c} |g(j, θ) − g(j, τ)| ≤ g_0 < ∞.   (3.40)

as n → ∞.
Proof: Since (3.40) ensures (3.38), it is sufficient to verify the inequality
(3.39), which is now turned into the inequality
(3.41)
Let us indicate one bound for the quantity Δ*(r). Let us consider the function

k(z) = ∫_{−∞}^{∞} |x|^α p(x − z) dx.

Let us note that

k'(0) = 0;

k''(z) = α ∫_{−∞}^{0} |x|^{α−1} p'(x − z) dx − α ∫_{0}^{∞} x^{α−1} p'(x − z) dx,

− α ∫_{0}^{∞} x^{α−1} p'(x − z) dx > α z^{α−1} p(0).

Therefore
Let us write

a_j = |g(j, θ) − g(j, τ)|.
4. DIFFERENTIABILITY OF REGRESSION FUNCTIONS 45
Using the conditions of the Theorem and the bound (3.42), for |τ − θ| ≥ r we
obtain

> (α/2) G_0 ρ(r),

where ρ(r) is the number from the condition (3.14). Consequently, in (3.41) it is
possible to take

Δ*(r) = (α/2) G_0 ρ(r).
p(x) = ( α β^{1/α} / (2Γ(1/α)) ) e^{−β|x|^α}, β > 0, α ∈ (1, 2), x ∈ ℝ^1,

for which the l_α-estimators of the parameter θ are maximum likelihood
estimators.
d_{in}(θ) = ( Σ g_i^2(j, θ) )^{1/2}, g_i = ∂g/∂θ^i, i = 1, ..., q.
Let us introduce a series of assumptions.

II_5: The set Θ is convex. The functions g(j, θ), j ≥ 1, are continuous on Θ^c and
continuously differentiable in Θ, whence for any R > 0:

(1) there exist constants β̄_i = β̄_i(R) < ∞ and β_i = β_i(R) < ∞ such that

sup_{θ∈T} sup_{u ∈ v^c(R) ∩ U_n(θ)} d_{in}(θ + n^{1/2} d_n^{−1}(θ) u) d_{in}^{−1}(θ) ≤ β̄_i, i = 1, ..., q,   (4.1)

sup_{θ∈T} sup_{u_1, u_2 ∈ v^c(R) ∩ U_n(θ)} d_{in}^{−1}(θ) ( Φ_n^{(i)}(u_1, u_2) )^{1/2} |u_1 − u_2|^{−1} ≤ γ_i,   (4.3)
i = 1, ..., q.
Let us denote

f_i(j, u) = g_i(j, θ + n^{1/2} d_n^{−1}(θ) u).

Let us show that from the condition (4.1) there follows a condition that makes
the requirement II_2 of Theorem 8 more precise.
LEMMA 12.1: If (4.1) holds, then

sup_{θ∈T} sup_{u_1, u_2 ∈ v^c(R) ∩ U_n(θ)} n^{−1} Φ_n(u_1, u_2) |u_1 − u_2|^{−2} ≤ 4|β̄(R)|^2 < ∞,   (4.6)

where |β̄(R)| is the norm of the vector β̄(R) = (β̄_1(R), ..., β̄_q(R)).
Proof: Let θ ∈ T be fixed. By the finite increments formula for u_1, u_2 ∈ v^c(R) ∩
U_n(θ), with the aid of the Cauchy-Bunyakovskii inequality we find

n^{−1} Φ_n(u_1, u_2)

where η_n ∈ (0, 1), and ∇f(j, u) is the gradient of the function f(j, ·) at the point u.
Then from the inequality obtained and (4.1) it follows that
θ ∈ Θ.

The symmetric matrix I(θ) is non-negative definite. Let λ_min(I(θ)) be the smallest
eigenvalue of the matrix I(θ). We now introduce a condition which plays an
important role in the following chapters:
If a second derivative exists for the function g(j, θ), then let us set

d_{iln}(θ) = ( Σ g_{il}^2(j, θ) )^{1/2}, i, l = 1, ..., q.
From (4.6) and II_3, for any element I_{il}(u, θ) of the matrix I(u, θ) we obtain
(4.9)
On the other hand, for u ∈ v^c(R) ∩ U_n(θ) and for the difference between the
general elements of the matrices I(u, θ) and I(θ), with the help of (4.1) and (4.3)
we obtain

|I_{il}(θ, u) − I_{il}(θ)|   (4.10)

The inequalities (4.9), (4.10) and condition V show that there exist numbers r_0 > 0
and λ_0 > 0 such that for any θ ∈ T (4.4) holds.
Let us now convince ourselves that condition (4.3) is a corollary of II_3. In fact,
we apply the finite increments formula to the function
and repeat the arguments which led to the inequality (4.6). We then obtain
(4.3) with the constants
After the elucidation of the relations between the conditions introduced above
we are able to formulate the basic assertion of this section.
Proof: Let 0 < r* ≤ r_0, where r_0 is the constant in condition III_{q+6}. By Theorem 8
and the conditions that follow from the conditions of Theorem 12,

sup_{θ∈T} P_θ^n{ |u_n(θ)| ≥ r* } = o(n^{−(s−2)/2}).
we have
whence (4.12) follows by arguments analogous to the previous ones. Let us fix
θ ∈ T. Let us introduce the sets

P_θ^n{ sup_{u ∈ (v^c(r*)\v(κτ_n)) ∩ U_n(θ)} ν(u) ≥ 1/2 }

≤ Σ_{r=1}^{[r*/κτ_n]} P_θ^n{ sup_{u ∈ Q_{rn}} ν(u) ≥ 1/2 },
where

Let us cover the ball v^c((r + 1)κτ_n) ∩ v(r*) with a g-net N = {u^{(m)}}. The number
of points |N| in N does not exceed the quantity

where the constant c(q) depends only on the dimension q of the parametric set Θ.
Let {s^{(m)}} be the sets formed by the intersection of balls of radius g with centres at
the points u^{(m)} with the set v(r*) ∩ U_n(θ). Then
{ sup_{u ∈ v^c((r+1)κτ_n) ∩ v^c(r*) ∩ U_n(θ)} |w(θ + n^{1/2} d_n^{−1} u, θ)| ≥ (1/2) n r^2 κ^2 λ_0 τ_n^2 }

⊂ { sup_{u ∈ v^c((r+1)κτ_n) ∩ v(r*) ∩ U_n(θ)} Σ_{i=1}^q |b_i(θ + n^{1/2} d_n^{−1} u)| d_{in}^{−1} ≥ (1/2) n^{1/2} (r + 1)^{−1} r^2 λ_0 κ τ_n }   (4.14)

where δ ∈ (0, 1/2) is some number. The last inclusion was made possible thanks to
condition (4.3):

sup_{u ∈ s^{(m)}} Σ_{i=1}^q | b_i(θ + n^{1/2} d_n^{−1} u^{(m)}) − b_i(θ + n^{1/2} d_n^{−1} u) | d_{in}^{−1}
≤ (n s*)^{1/2} g Σ_{i=1}^q γ_i(r*).

Let us denote

γ(r*) = Σ_{i=1}^q γ_i(r*).

Then
(4.15)
Let us consider the last summand of the right hand side of (4.15), setting
g = κτ_n; κ = κ(r) is a number which will be chosen later. We need to estimate
the probability π(Δ_r), where
Consequently
Δ_r > 0, r = 1, ..., r_1,
if
(4.16)
and
if
If we consider the inequalities (4.16), (4.17) and (4.18) as being satisfied, then for
r = 1, ..., r_1 we find that
(4.19)
by statement (1) of Theorem A.4. For $r>r_1$, by statement (2) of the same theorem
(4.20)
where
$$\zeta_n=o\big(n^{-(s-2)/2}\big)$$
and does not depend upon r.
Using the condition (4.1) let us estimate, for fixed $m$, the probability entering into the sum on the right hand side of the inequality (4.15):
$$\sum_{i=1}^{q}P_\theta^n\big\{\big|b_i\big(\theta+n^{1/2}d_n^{-1}u^{(m)}\big)\big|\ \ge\ \cdots\big\},\qquad r\ge1,$$
or
(4.21)
where

we find that $\sigma_n^2(\theta)=1$, and the requirement of Theorem A.5 on the quantity $\rho_{s,n}(\theta)$ is satisfied at the expense of the condition IV$_s$ and the relation (4.2). Therefore
$$P_\theta^n\Big\{\sup_{u\in(v^c(r^*)\setminus v(\varkappa T_n))\cap U_n^c(\theta)}v(u)\ \ge\ \frac12\Big\}\ \le\ \sum_{r=1}^{r_1}\cdots\ +\ \cdots\sum_{r=1}^{[r^*/\varkappa T_n]}r^{-2s}(r+1)^{s+q}\varkappa^{-q}(r).\qquad(4.23)$$
Let us set
$$\varkappa(r)=r^{q/(q+s)}.$$
Then the series $\sum_r(r+1)^sr^{-2s}\varkappa^s(r)$ and $\sum_r(r+1)^{q+s}r^{-2s}\varkappa^{-q}(r)$ converge if the series $\sum_r r^{-s^2/(s+q)}$ converges, i.e., if $s^2>s+q$. Since the latter is stipulated in the formulation of the theorem, the relation (4.13) will then be established if the constant $\varkappa$ is chosen appropriately.
Clearly the function
$$\frac{r+1}{r^2}\,\varkappa(r)=\frac{1}{r^{s/(s+q)}}+\frac{1}{r^{1+s/(s+q)}}\qquad(4.24)$$
Let us consider the inequality (4.17). If $4\mu_2>1$ then an integer $r_1\ge1$ can be found such that for $r>r_1$

Consequently (4.17) will be satisfied for $r>r_1$ if (4.24) holds. Let $4\mu_2\le1$. Then
if
(4.25)
then even for $r=1$ (4.17) will hold. Independently of the value that $\mu_2$ takes, the inequality (4.18) holds for
$$r>r_1=\min\big(r:\ r^{s/(s+q)}>1+r^{-1}\big).$$
In this way it is possible, in the formulation of the Theorem, to take $\varkappa>\max(\varkappa_1,\varkappa_2,\varkappa_3)$, where the quantities $\varkappa_i$, $i=1,2,3$, are obtained from (4.21), (4.24) and (4.25).
EXAMPLE 4: Let $g(j,\theta)=\theta^1\cos\theta^2 j$. Then
$$d_{1n}(\theta)=\Big(\sum_{j=1}^{n}\cos^2\theta^2 j\Big)^{1/2}=\Big(\frac n2+\frac14\Big(\frac{\sin(2n+1)\theta^2}{\sin\theta^2}-1\Big)\Big)^{1/2},$$
and
$$n^{1/2}d_{2n}^{-1}(\theta)=\frac{\sqrt6}{\theta^1 n}\,\big(1+O(n^{-1})\big)$$
uniformly in $\theta\in T$. Therefore it is appropriate to take the matrix
$$\frac{\theta^1\big(\theta^1+\sqrt2\,u^1\big)}{2n}\times\Bigg(\ \cdots\ \frac{\sin\big(n+\frac12\big)\Big(2\theta^2+\frac{\sqrt6\,u^2}{\theta^1 n}\Big)}{2\sin\Big(\theta^2+\frac{\sqrt6\,u^2}{2\theta^1 n}\Big)}\ \cdots\ \Bigg)\qquad(4.26)$$
uniformly in $\theta\in T$.
Clearly condition (3.11) is satisfied. Let us verify that (3.12) is satisfied. Let us consider first the case $|u^2|>c$, where $c>0$ is some number. For a fixed $u^2$ we obtain
$$\ \ge\ \frac{(\theta^1)^2}{\cdots}\Bigg(1-\frac{1}{\Big|2n\sin\frac{\sqrt6\,u^2}{2\theta^1 n}\Big|}\ \cdots\Bigg)\qquad(4.27)$$
If $(\theta^1)^2>8\mu_2$, then it follows from (4.26) and (4.27) that a number $R_1$ can always be found such that for $|u^2|\ge R_1$ the inequality (3.12) is satisfied. In fact,
$$\Big|2n\sin\frac{\sqrt6\,u^2}{2\theta^1 n}\Big|\ \ge\ \frac{\sqrt6}{\theta^1}\ \inf_{x\in(0,\pi/2)}\frac{\sin x}{x}\,\big|u^2\big|\ \ge\ \frac{2\sqrt6}{\pi\theta^1}\,R_1.$$
And so, if
$$\theta^1/\mu_2^{1/2}>2\sqrt2,$$
then in (3.12) it is possible to take
$$R_0=\big(R_1^2+4\mu_2\big)^{1/2}.$$
Indeed, if

then
$$|u^1|^2\ \ge\ R_1^2+4\mu_2-c^2.$$
Now let us choose $R_1$ so large that $u^1$ cannot take such values for $\theta\in T$. This means that all points $u=(u^1,u^2)$ with $|u^2|\le c$ are found in the set
$$v^c\Big(\sqrt{R_1^2+2\mu_2}\Big)\cap U_n^c(\theta).$$
Let us note that the quotient $\theta^1/\mu_2^{1/2}$ is called, in the statistical theory of communication, the signal-to-noise ratio, and the condition $\theta^1/\mu_2^{1/2}\ge2\sqrt2$ has a physical meaning: the observations $X_j$ are the signal $g(j,\theta)=\theta^1\cos\theta^2 j$ contaminated by noise.
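As a numerical illustration of this example (not from the original text; the parameter values, noise level and grid are all invented for the sketch), the l.s.e. for the model $X_j=\theta^1\cos\theta^2 j+\varepsilon_j$ can be computed by a grid search over the frequency $\theta^2$ combined with the closed-form least-squares amplitude:

```python
import numpy as np

rng = np.random.default_rng(0)
theta1, theta2 = 2.0, 0.7          # hypothetical true amplitude and frequency
n = 400
j = np.arange(1, n + 1)
x = theta1 * np.cos(theta2 * j) + 0.3 * rng.standard_normal(n)  # noisy observations

# Grid search over the frequency; for each candidate frequency the amplitude
# minimising the sum of squares has the closed form (x, c) / |c|^2.
freqs = np.linspace(0.05, np.pi - 0.05, 5000)
best = None
for w in freqs:
    c = np.cos(w * j)
    a = x @ c / (c @ c)             # closed-form least-squares amplitude
    rss = np.sum((x - a * c) ** 2)  # residual sum of squares L(theta)
    if best is None or rss < best[0]:
        best = (rss, a, w)

_, a_hat, w_hat = best
```

The estimated frequency is far more precise than the amplitude, reflecting the $n^{3/2}$ normalisation $d_{2n}(\theta)$ of this example.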
Let us verify the condition III$_{q+\delta}$. The calculations show that
It is equally easy to convince ourselves of the validity of (4.2). Let us verify (4.3). Let us note that
$$d_{2n}^2(\theta)=\frac{(\theta^1)^2n^3}{6}\big(1+O(n^{-1})\big),\qquad \frac{\partial^2}{(\partial u^1)^2}\,\Phi_n^{(2)}(u_1,u_2)\Big|_{u_2=u_1}=\frac{(\theta^1)^2}{2}+O(n^{-1}),\qquad\cdots=O(n^{-1}),$$
and

or
5 STRONG CONSISTENCY
From the theorems of the preceding sections it is possible to obtain some sufficient conditions for the strong consistency of the l.s.e.-s $\hat\theta_n$ and the l.m.e.-s $\check\theta_n$.
Let us mention two examples. Let us assume that in the conditions of Theorem 8 $\mu_s<\infty$ for some $s>4$. Then for any $r>0$ and $\theta\in T$

and, consequently,

Analogously, under the conditions of Theorem 9, if $\mu_s<\infty$ for some natural number $s\ge3$, then for any $\theta\in T$
In fact, these estimators are strongly consistent for less severe constraints. The
corresponding assertions will be introduced in the second part of the section, here
we shall formulate and prove one general assertion about the strong consistency of
minimal contrast estimators (m.c.e.) for non-identically distributed observations.
under the condition that there exist the mathematical expectations $E_\theta$, $E_\theta^{(j)}$ under the measures $P_\theta^N$, $P_j^\theta$.
We shall assume that the assumption (3.10) is satisfied.
DEFINITION: The sequence (5.1) is a sequence of families of contrast functions for the family of measures $\{P_j^\theta,\ \theta\in\Theta\}$ if

(1) for any $j\in\mathbb N$ and any $\theta_1,\theta_2\in\Theta$ there exists $E_{\theta_1}^{(j)}f_j(X_j,\theta_2)$;

(2) for any $\theta\in\Theta$ and any $r>0$ there exists $\Delta=\Delta(r)>0$ such that

is satisfied.
Below we shall assume that $\Theta^c$ is compact and that $\inf_{\theta\in\Theta^c}n^{-1}\sum f_j(x_j,\theta)$ is attained in $\Theta^c$ for any $x=(x_1,\dots,x_n)\in\mathbb R^n$. We shall also assume that for any $j\in\mathbb N$ and any Borel set $B\subseteq\Theta^c$ the functions $\inf_{\tau\in B}f_j(x,\tau)$ and $\sup_{\tau\in B}f_j(x,\tau)$ are Borel functions of $x\in\mathbb R^1$.
Let us denote
$$v_{\tau_0}(r)=\{\tau:\ |\tau-\tau_0|<r\},\qquad f_j(\,\cdot\,,\tau)=f_j(\tau),$$
and introduce the assumptions:
VI$_1$. For any $\tau_0,\theta\in\Theta$ and $\rho>0$
$$\sup_{j\ge1}P_j^\theta\Big\{\sup_{\tau\in v_{\tau_0}(r)\cap\Theta^c}\big|f_j(\tau)-g_j(\tau_0)\big|>\rho\Big\}\ \xrightarrow[r\to0]{}\ 0.\qquad(5.3)$$
VI$_2$. For any $\tau_0\in\Theta$ there exists $r_0>0$ such that the sequence of r.v.-s $\inf_{\tau\in v_{\tau_0}(r_0)}f_j(\tau)$ is uniformly integrable in the sense that for any $\theta\in\Theta$
$$\sup_{j\ge1}E_\theta^{(j)}\Big|\inf_{\tau\in v_{\tau_0}(r_0)}f_j(\tau)\Big|\,\chi\Big\{\Big|\inf_{\tau\in v_{\tau_0}(r_0)}f_j(\tau)\Big|>R\Big\}\ \xrightarrow[R\to\infty]{}\ 0.\qquad(5.4)$$
LEMMA 13.1: Let us assume that the r.v.-s $\xi_n^{(j)}$, $n\ge0$, are given on the probability spaces $(\Omega^{(j)},\mathcal F^{(j)},P^{(j)})$, $j\ge1$, and

Then
$$\sup_{j\ge1}E^{(j)}\big|\xi_n^{(j)}-\xi_0^{(j)}\big|\ \xrightarrow[n\to\infty]{}\ 0.$$
Clearly
$$\big|\ \cdots\ \big|\ \le\ \rho,$$
LEMMA 13.2: Let the conditions VI$_1$ and VI$_2$ be satisfied, and let $r_n\downarrow0$ as $n\to\infty$. Then for any $\tau_0,\theta\in\Theta$

Then the r.v.-s $\xi_n^{(j)}$ satisfy the conditions of Lemma 13.1.
Let us set
LEMMA 13.3: If the conditions VI$_1$, VI$_2$, VI$_3$ are satisfied, then for any $\theta\in\Theta$
$$\sup_{\tau\in\Theta^c}\big|H(\tau)\big|\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}\qquad(5.6)$$
$$E_\theta^{(j)}f_j(\tau_{i_0})-\frac\varepsilon2\ \le\ E_\theta^{(j)}\inf_{\tau\in v}f_j(\tau)\ \le\ E_\theta^{(j)}f_j(\tau),\qquad(5.7)$$
Since $\Theta^c$ is compact there exists a finite number of points $\tau_1,\dots,\tau_m\in\Theta^c$ and corresponding neighbourhoods $v_1,\dots,v_m$ such that $\Theta^c\subseteq\bigcup_{i=1}^m v_i$, and for each neighbourhood $v_i$ the inequality (5.7) is satisfied. From these inequalities for $i=1,\dots,m$ and $\tau\in v_i$ there follow the inequalities
$$E_\theta^{(j)}f_j(\tau)-\varepsilon\ \le\ E_\theta^{(j)}\inf_{\tau\in v_i}f_j(\tau)\ \le\ E_\theta^{(j)}f_j(\tau),$$
$$\inf_{\tau\in\Theta^c}H(\tau)\ \ge\ \min_{1\le i\le m}\Big(n^{-1}\sum_{j=1}^{n}\inf_{\tau\in v_i}f_j(\tau)-n^{-1}\sum_{j=1}^{n}E_\theta^{(j)}\inf_{\tau\in v_i}f_j(\tau)\Big)-\varepsilon.$$
Evidently
$$H(\theta)\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}\qquad(5.11)$$
Let $X^{(1)},X^{(2)}\subseteq\mathbb R^{\mathbb N}$ be sets of total $P_\theta^N$ probability for which the relations (5.8) and (5.11) are satisfied, respectively. Let us fix the elementary event $x\in X^{(1)}\cap X^{(2)}$ and let us assume that

($\hat\theta_n(x)$ in fact depends only upon the first $n$ coordinates $x_1,\dots,x_n$ of a point $x=(x_1,x_2,\dots)\in\mathbb R^{\mathbb N}$.) This means that there exists $\varepsilon_0>0$ such that for an infinite sequence of indices $n_k$, $k\ge1$,
$$\cdots\ >\ \Delta(\varepsilon_0),\qquad H(\theta)\ \le\ \frac{\Delta(\varepsilon_0)}{3}.\qquad(5.13)$$
The latter is always possible, since from (5.8) it follows that for any sequence of sets $\Theta_n\subseteq\Theta^c$

Since
On the other hand, from (5.10), (5.12) and (5.13), for nk > no we obtain
Let us note that in the proof of Theorem 13 only a part of Lemma 13.3 was
used, namely the inequality (5.8).
Let us assume that $\mu_2<\infty$. For a contrast function let us take

Then
and the contrast condition (5.2) takes the following form: for any $\theta\in\Theta$ and any $r>0$ there exists $\Delta=\Delta(r)>0$ such that
$$\inf_{u\in U_n^c(\theta)\setminus v(r)}n^{-1}\Phi_n(u,\theta)\ >\ \Delta(r).\qquad(5.18)$$
Let us assume that $\mu_{2+\delta}<\infty$ for some $\delta>0$ and that
$$A=\sup_{j\ge1}\sup_{\theta\in\Theta^c}\big|g(j,\theta)\big|<\infty.$$
Then
$$\sup_{j\ge1}E_\theta^{(j)}\Big(\sup_{\tau\in v_{\tau_0}(r_0)}\big|f_j(\tau)\big|\Big)^{1+\delta/2}<\infty,\qquad(5.19)$$
and therefore the condition VI$_2$ of uniform integrability holds. On the other hand (5.19) implies that the conditions of Theorem A.7 are satisfied for the r.v.-s
$$\sup_{\tau\in v_{\tau_0}(r_0)}f_j(\tau),\qquad \inf_{\tau\in v_{\tau_0}(r_0)}f_j(\tau),$$
$\mu_{2+\delta}<\infty$ for some $\delta>0$, satisfying the inequality (5.18), and that the sequence of functions $g(j,\theta)$, $j\ge1$, is compact in the space of continuous functions $C(\Theta^c)$. Then the m.c.e. $\tilde\theta_n$ (equal to the l.s.e. $\hat\theta_n$) satisfies the conclusion of Theorem 13.
For concrete contrast functions one can present conditions, milder than those of Theorem 13, which yield the validity of the 'uniform' law of large numbers (5.6); Theorem 13 then remains valid under milder assumptions. To illustrate this, let us use the contrast functions $f_j(X_j,\theta)=[X_j-g(j,\theta)]^2$, for which the requirements of Corollary 13.1 can be relaxed.
THEOREM 14: In the model (0.1) let $\mu_2<\infty$, and let $d_n(\theta)\equiv n^{1/2}I_q$ satisfy the conditions (3.14) for $T=\Theta^c$ and

(5.20)
if
$$\sup_{\tau\in\Theta^c}n^{-1}\big|W(\theta,\tau)\big|\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}\qquad(5.21)$$
Let us establish (5.21). By the condition (5.20), for any $\delta>0$ and $n>n_0$
$$\sup_{\theta_1,\theta_2\in\Theta^c}\big(n^{-1}\varphi_n(\theta_1,\theta_2)-\varphi(\theta_1,\theta_2)\big)<\delta.$$
Clearly
$$n^{-1}\varphi_n(\theta,\tau)-(n-1)^{-1}\varphi_{n-1}(\theta,\tau)=n^{-1}\big(g(n,\theta)-g(n,\tau)\big)^2-n^{-1}(n-1)^{-1}\varphi_{n-1}(\theta,\tau).$$
The series
$$\sum_{n=2}^{\infty}n^{-1}\big(n^{-1}\varphi_n(\theta,\tau)-(n-1)^{-1}\varphi_{n-1}(\theta,\tau)\big)$$
converges by the Dirichlet criterion, thanks to (5.20). On the other hand, the series
$$\sum_{n=2}^{\infty}n^{-2}(n-1)^{-1}\varphi_{n-1}(\theta,\tau)$$
also converges by the condition (5.20). Consequently
$$\sum_{n=2}^{\infty}n^{-2}\big(g(n,\theta)-g(n,\tau)\big)^2\ <\ \infty,$$
and
$$n^{-1}W(\theta,\tau)\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}$$
(Theorem A.7).
Let $\Theta'\subset\Theta^c$ be countably dense in the set $\Theta^c$ and let $X^{(3)}\subseteq\mathbb R^{\mathbb N}$ be a set of total $P_\theta^N$ probability for whose elementary events
$$n^{-1}W(\theta,\tau)\ \xrightarrow[n\to\infty]{}\ 0,\quad \tau\in\Theta',\qquad s^*\ \xrightarrow[n\to\infty]{}\ \mu_2.$$
where $v_{\tau_i}(\delta)$, $i=1,\dots,m_0$, is a finite covering of $\Theta^c$, $\tau_i\in\Theta'$. If $x\in X^{(3)}$, then for $n>n_0$
But by (5.21)
$$n^{-1}W(\hat\theta_n,\theta)\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}$$
Therefore also
From the latter relation and (3.14) we obtain the assertion of the Theorem in
the following way. Let us assume that for an elementary event $x\in\mathbb R^{\mathbb N}$

This means that there exist $r>0$ and a sequence of indices $n_k$, $k\ge1$, such that

(5.24)
Since the condition (5.24) also implies the validity of (3.14), the following assertion, coinciding with the result of Jennrich [138], is valid.
COROLLARY 14.1: In the model (0.1) let $\mu_2<\infty$, $d_n(\theta)=n^{1/2}I_q$, and let the functions $\varphi_n(\theta_1,\theta_2)$ have the property (5.24). Then for any $\theta\in\Theta$
$$\hat\theta_n\ \xrightarrow[n\to\infty]{}\ \theta\quad P_\theta^N\text{-a.c.}$$
where $\chi(x)=1$ if $x>0$ and $\chi(x)=0$ if $x\le0$. Let us assume that the sequence $F_n(y)$ converges weakly to some probability d.f. $F(y)$, i.e., for any continuous and bounded function $a(y)$, $y\in Y$,
Then
and
$$\cdots\ \le\ \int_Y\sup_{\tau\in v_i}a(y,\tau)\,F_n(dy)-\int_Y\inf_{\tau\in v_i}a(y,\tau)\,F(dy),$$
$$\cdots\ \le\ \int_Y\sup_{\tau\in v_i}a(y,\tau)\,F_n(dy)-\int_Y\inf_{\tau\in v_i}a(y,\tau)\,F(dy),$$
where
Since $\Delta_n(t)\xrightarrow[n\to\infty]{}0$ for any $t\in T'$, (5.26) follows from (5.27). In order that (5.24) hold it is sufficient to require that the function $g(y,\theta)$ have the following property: for any $\theta_1,\theta_2\in\Theta^c$, $\theta_1\neq\theta_2$, the $F$-measure of those points $y\in Y$ for which $g(y,\theta_1)\neq g(y,\theta_2)$ is positive.
considered in Example 4 of Section 4, does not satisfy the condition (5.20). Clearly, for this function $d_n(\theta)\neq n^{1/2}I_q$ also. Nevertheless, the relation (5.21) can be obtained in this case as well. For this it is sufficient to show that if $\mu_2<\infty$, then
$$\eta_n=\sup_{\tau\in[0,\pi]}n^{-1}\Big|\sum_{j=1}^{n}e^{i\tau j}\varepsilon_j\Big|\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}\qquad(5.28)$$
$$\cdots\ \le\ n^{-2}\sum_{j=1}^{n}\varepsilon_j^2+n^{-2}\sum_{\substack{k=-n\\ k\neq0}}^{n}\ \sum_{j=1}^{n-|k|}\varepsilon_j\varepsilon_{j+|k|},\qquad\cdots\ =\ O(n^{-1/2}).$$
Let us set
$$n(m)=[m^\alpha]+1,\qquad\alpha>2.$$
Then
$$\zeta_{n(m)}\ \xrightarrow[m\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.},$$
since
$$\cdots\ \le\ \frac{n(m+1)}{n(m)}\ \cdots\ \Big(\zeta_{n(m)}+n^{-1}(m)\sum_{j=n(m)+1}^{n(m+1)}|\varepsilon_j|\Big),$$
$$E_\theta\Big(n^{-1}(m)\sum_{j=n(m)+1}^{n(m+1)}|\varepsilon_j|\Big)^2\ \le\ \mu_2\Big(\frac{n(m+1)-n(m)}{n(m)}\Big)^2=O(m^{-2}),\qquad \frac{n(m+1)}{n(m)}\ \xrightarrow[m\to\infty]{}\ 1.$$
Therefore $\zeta_m\xrightarrow[m\to\infty]{}0$ $P_\theta^N$-a.c., and consequently (5.28) holds. Since the inequality (5.18) in the case considered is also satisfied (see Example 4, Section 4), it is then possible to conclude that for any $\theta\in\Theta$
i.e.,
The relation (5.28), in models with discrete and continuous time, can be used for the solution of the problem of detecting hidden periodicities. In the language of the model (0.1) the question is that of the estimation of the parameters of the regression function
$$g(j,\theta)=\sum_{i=1}^{Q}\big(A_i\sin\omega_ij+B_i\cos\omega_ij\big),\qquad \theta=(A_1,B_1,\omega_1,\dots,A_Q,B_Q,\omega_Q).$$
Then
for any $\theta\in\Theta$.
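A brief numerical sketch of hidden-periodicity detection (not from the original text; for clarity of the illustration the amplitudes are invented and the frequencies are placed exactly on the Fourier grid): the hidden frequencies $\omega_i$ appear as the dominant peaks of the periodogram of the observations.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2048
j = np.arange(n)
w1, w2 = 2 * np.pi * 300 / n, 2 * np.pi * 700 / n   # hidden frequencies (on the Fourier grid)
x = 1.5 * np.sin(w1 * j) + 1.0 * np.cos(w2 * j) + rng.standard_normal(n)

# Periodogram over the Fourier frequencies 2*pi*k/n, k = 0, ..., n/2
periodogram = np.abs(np.fft.rfft(x)) ** 2 / n
freqs = 2 * np.pi * np.arange(periodogram.size) / n

# The two largest peaks estimate the two hidden frequencies
peaks = np.sort(freqs[np.argsort(periodogram)[-2:]])
```

Each sinusoid of amplitude $A$ contributes a peak of height of order $nA^2/4$, while the noise bins stay bounded in probability, which is why the peak frequencies are consistent estimators.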
Let us formulate one assertion which uses essentially the form of the function
Let us notice that (5.30) is a corollary of (5.20). In fact, if (5.20) is true, then
$$n^{-1}\sum_{j=1}^{n}\big|g(j,\theta_1)-g(j,\theta_2)\big|\ \le\ \big(n^{-1}\varphi_n(\theta_1,\theta_2)\big)^{1/2},$$
THEOREM 15: In the model (0.1) let $\mu_1<\infty$, $d_n(\theta)\equiv n^{1/2}I_q$, and let the relations (3.14), (3.27), (3.28) and (5.30) hold. Then for any $\theta\in\Theta$
$$\check\theta_n\ \xrightarrow[n\to\infty]{}\ \theta\quad P_\theta^N\text{-a.c.}$$
Therefore also, as in Theorem 14, using condition (5.30) it is possible to show that
$$\sup_{\tau\in\Theta^c}\big|H(\tau)\big|\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.},$$
if for any $\tau\in\Theta^c$
$$H(\tau)\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.}$$
For the verification of this fact let us apply Theorem A.8 to the sequence of r.v.-s
$$\xi_j=\big|X_j-g(j,\tau)\big|-E_\theta^{(j)}\big|X_j-g(j,\tau)\big|.$$
Using the notation of (3.27), we obtain sequentially
$$\sum\sum P\{m-\mu_1-\cdots\}\ \le\ \cdots\ \le\ \sum m\,P\{m-\cdots\}\ \le\ \cdots\qquad(5.32)$$
is used for the definition of the l.s.e. instead of the functional $L(\theta)$. We call the l.s.e. $\hat\theta_n$ obtained in this way the logarithmic l.s.e.
A particularly natural case of such an estimation procedure is the log-linear model, i.e., when $a(j,\theta)=\langle Y_j,\theta\rangle$ and $\Theta=\mathbb R^q$; the logarithmic l.s.e. can then be calculated in explicit form. It is therefore of great interest to determine under which conditions on the function $a(j,\theta)$ and on the errors of observation the logarithmic l.s.e. has satisfactory statistical properties. Unfortunately, as was to be expected, the logarithmic l.s.e. for the observational model (0.1) with additively entering errors 'rarely' happens to be consistent. This section is dedicated to the explanation of this property. Up to the end of this section we shall assume that $\Theta$ is a bounded set.
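In the log-linear case the explicit form is just ordinary least squares applied to the logged observations. A minimal sketch (the design, parameter values and error law are invented for the illustration; as the section explains, with additive errors the logged model only holds approximately, so the estimator carries a bias of order of the error variance):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
theta = np.array([0.5, -0.2])                                # hypothetical true parameter
Y = np.column_stack([np.ones(n), rng.uniform(0.0, 1.0, n)])  # regressors Y_j
eps = rng.uniform(-0.1, 0.1, n)                              # bounded errors keep X_j > 0
x = np.exp(Y @ theta) + eps            # model (0.1): X_j = exp(<Y_j, theta>) + eps_j

# Logarithmic l.s.e.: minimise sum_j (log X_j - <Y_j, theta>)^2,
# i.e. ordinary least squares of log X_j on Y_j
theta_log, *_ = np.linalg.lstsq(Y, np.log(x), rcond=None)
```

Here $\log X_j=\langle Y_j,\theta\rangle+\log\big(1+\varepsilon_je^{-a(j,\theta)}\big)$, and the second term has nonzero mean in general, which is the source of the inconsistency discussed in this section.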
Let us assume that

(1) $\inf_{j\ge1}\inf_{\theta\in\Theta^c}a(j,\theta)\ge a_0>-\infty$;

(2) $\varepsilon_j\ge-b_0$ $P$-a.c. for some $b_0>0$, where the constants $a_0$ and $b_0$ are linked by the relation $b_0e^{-a_0}<1$.
The conditions introduced ensure the formal possibility of taking logarithms of the observations $X_j$ for any realisation of the errors $\varepsilon_j$.
EXAMPLE 7: Let $a(j,\theta)=\theta y_j$, $\theta\in\Theta=(a,b)$, $a>0$, $b<\infty$. Let us assume that the logarithmic l.s.e. $\hat\theta_n$ is a strongly consistent estimator of the parameter $\theta$. Then for $P_\theta^N$-almost all $x\in\mathbb R^{\mathbb N}$, for $n>n_0(x)$, $\hat\theta_n=\hat\theta_n(x)$ satisfies the equality
or
(6.2)
Consequently (6.2) holds, and if the series (6.1) converges then $\hat\theta_n$ is a consistent estimator of $\theta$.
Let us write
$$E_\theta^N W(\hat\theta_n,\theta)=E_\theta^N W(\hat\theta_n,\tau)\big|_{\tau=\theta}.$$
LEMMA 16.1: If the conditions (1)-(4) are satisfied, then for any $\theta\in\Theta$
$$n^{-1}\varphi_n(\hat\theta_n,\theta)\ \le\ 2n^{-1}E_\theta^N W(\hat\theta_n,\theta)+o(1)\quad P_\theta^N\text{-a.c.},\qquad(6.3)$$
where $o(1)\xrightarrow[n\to\infty]{}0$ $P_\theta^N$-a.c.
if
$$\sum_{j=1}^{\infty}\cdots$$
If $\varepsilon_j\le0$ then
$$\log^2\big(1+\varepsilon_jg^{-1}(j,\theta)\big)\ \le\ \log^2\big(1-b_0e^{-a_0}\big).$$
Let us further note that if $\varepsilon_j>0$ then
$$1+\varepsilon_jg^{-1}(j,\theta)\ \le\ 1+\varepsilon_je^{-a_0}\ \le\ \cdots\ \varepsilon_j^2e^{-2a_0}\ \cdots$$
$$\log^2\big(1-b_0e^{-a_0}\big)\,P\{-b_0\le\varepsilon_j\le0\}\ \cdots\ \le\ c_3<\infty.\qquad(6.7)$$
$$\sum_{j=1}^{\infty}\cdots\ <\ \infty.$$
The latter fact is a corollary of (3) and the compactness of the set $\Theta^c$.
On the other hand, for $\tau_1,\tau_2\in\Theta^c$ we obtain
$$\cdots\ \xrightarrow[n\to\infty]{}\ 0\quad P_\theta^N\text{-a.c.},$$
if
$$\sum_{j=1}^{\infty}\cdots$$
But the convergence of the series (6.9) follows from (4), i.e., analogously to (6.7)
it is possible to obtain the uniform bound
Therefore (6.5) follows from (6.8) in the same way as relation (5.21). In this way
we obtain (6.3) from (6.4).
The result of Lemma 16.1 shows that if the function $\varphi_n(\theta_1,\theta_2)$ distinguishes the parameters, for example in the sense of inequality (3.14), then $\hat\theta_n$ consistently estimates $\theta$ if
It is hardly probable that one can give a simple sufficient condition for the fulfilment of (6.10) that covers a more or less wide and interesting class of functions
6. TAKING THE LOGARITHM OF NON-LINEAR MODELS 77
$a(j,\theta)$ and errors $\varepsilon_j$. However, using the result of Lemma 16.1, in one case it is easy to exhibit a neighbourhood of $\theta$ which contains all the limit points of the sequence $\hat\theta_n$. In addition to the assumptions introduced we shall assume that:
(5) $\varepsilon_j\in(-c,c)$, $j\ge1$, for some $c\in(0,b_0]$, and $P(0)=\frac12$;

(6) for $\alpha>0$ from condition (3) and some $\varkappa>0$

(6.11)

$$\varlimsup_{n\to\infty}\big|\hat\theta_n-\theta\big|\ \le\ \varkappa^{-1/\alpha}c^{2/\alpha}e^{-2a_0/\alpha}\Big(1+\min\Big(\frac{1}{1-\beta},\ \Big(\frac{1}{1-\beta}\cdot\frac{\pi^2}{6}\Big)^{1/2}\Big)\Big)^{2/\alpha},$$
Proof: Applying the Cauchy-Bunyakovsky inequality to the sum $n^{-1}E_\theta^N W(\hat\theta_n,\theta)$, from (6.3) we obtain
$$\cdots\ +\ o(1)\quad P_\theta^N\text{-a.c.},$$
or
$$\cdots\quad P_\theta^N\text{-a.c.},\qquad(6.12)$$
hence it is possible to take $o_1(1)=o^{1/2}(1)$. Let us estimate
$$\big|E_\theta^N\log\big(1+\varepsilon_jg^{-1}(j,\theta)\big)\big|=\Big|\int\log\big(1+xg^{-1}(j,\theta)\big)P(dx)\Big|=I_1+I_2,$$
$$I_1\ \le\ \frac12\,c\,g^{-1}(j,\theta)\,\cdots,\qquad I_2\ \le\ -\frac12\log\big(1-cg^{-1}(j,\theta)\big)\ \le\ \frac12\,c\,g^{-1}(j,\theta)\,\varkappa_\beta,$$
where
$$\varkappa_\beta=1+\frac{\beta}{2}+\frac{\beta^2}{3}+\cdots$$
$$\varkappa_\beta\ \le\ \begin{cases}\dfrac{1}{1-\beta},\\[6pt]\Big(\dfrac{1}{1-\beta^2}\cdot\dfrac{\pi^2}{6}\Big)^{1/2},&\beta>\cdots\end{cases}$$
Therefore
$$\sum n^{-1}\big(E_\theta^N\log\big(1+\varepsilon_jg^{-1}(j,\theta)\big)\big)^2\ \cdots$$
In this second chapter we find sufficient conditions that must be satisfied by the function $g(j,\theta)$ and the r.v.-s $\varepsilon_j$ in order to ensure that, as $n\to\infty$, the distributions of the normalised differences $\hat\theta_n-\theta$ and $\check\theta_n-\theta$ tend uniformly to Gaussian distributions in the proximity of the parameter. The basic result of the chapter consists in obtaining the asymptotic expansion (a.e.) of the distribution of the l.s.e.-s $\hat\theta_n$ (see Sections 10-11), significantly revising the usual statements about the asymptotic normality of statistical estimators.
In this section a stochastic asymptotic expansion (s.a.e.) is obtained for the normed l.s.e.-s $\hat\theta_n$, i.e., a result about the approximation of $\hat\theta_n$ by a sum of vector polynomials of the standard sums of r.v.-s, with a stochastically small random remainder term. Theorems about s.a.e.-s of an l.s.e. reveal the structure of the l.s.e.-s and are the starting point for the further study of subtle properties of the estimator $\hat\theta_n$.
Let us assume that condition (3.10) is satisfied, that $\Theta\subseteq\mathbb R^q$ is an open convex set and that $T\subset\Theta$ is compact. Let us introduce some notation. Let $\alpha=(\alpha_1,\dots,\alpha_q)$ be a multi-index with $|\alpha|=\alpha_1+\cdots+\alpha_q$. For a smooth function $c(\theta)$ we denote

We also make use of other notations for derivatives. Let $k\ge2$ be an integer. Then for $r=1,\dots,k$ and $i_1,\dots,i_r=1,\dots,q$
$$c_{ij}^{(\alpha)}=\Big(\frac{\partial^2}{\partial\theta^i\,\partial\theta^j}\,c\Big)^{(\alpha)},\quad\text{etc.}$$
79
Throughout this chapter we shall assume that for each $j$ the functions $g(j,\theta)$ possess all continuous partial derivatives with respect to the variables $\theta=(\theta^1,\dots,\theta^q)$ in $\Theta$ up to order $k\ge2$ inclusive.
Let us set

Analogously $f^{(\alpha)}$, $f_i^{(\alpha)}$ will denote the functions $g^{(\alpha)}$, $g_i^{(\alpha)}$ with the same complicated arguments. Let us denote
$$d_n^2(\alpha;\theta)=\sum_{j=1}^{n}\big(g^{(\alpha)}(j,\theta)\big)^2.$$
In particular,

for the mathematical expectations of the derivatives $L^{(\alpha)}$, $L_i^{(\alpha)}$, $L_{ij}^{(\alpha)}$, etc. We also write
For the formulation of the assertions of Section 7 the following condition will be needed.

II. For any $R>0$ there exist constants $c_i(\alpha,R)<\infty$, $i=1,2$, such that

(1) $\displaystyle\sup_{\theta\in T}\ \sup_{u\in v^c(R)\cap U_n^c(\theta)}n^{(|\alpha|-1)/2}\big(d_n^\alpha(\theta)\big)^{-1}d_n\big(\alpha;\theta+n^{1/2}d_n^{-1}(\theta)u\big)\ \le\ c_1(\alpha,R)$;

(2) $\displaystyle\sup_{\theta\in T}\ \sup_{u_1,u_2\in v^c(R)\cap U_n^c(\theta)}n^{|\alpha|-1}\big(d_n^\alpha(\theta)\big)^{-2}\,\Phi_n^{(\alpha)}(u_1,u_2)\,|u_1-u_2|^{-2}\ \le\ c_2(\alpha,R)$,

where $h_\nu(\theta)$, $\nu=0,\dots,k-2$, are homogeneous vector polynomials of degree $\nu+1$ in the random variables $b(\alpha;\theta)$, $|\alpha|=1,\dots,\nu+1$, with coefficients uniformly bounded in $n$ and $\theta\in T$.
The proof of Theorem 17 will be carried out according to the plan of the work [59]. Let us first prove a few lemmas.

LEMMA 17.1: The conditions (7.2) for the vectors $\alpha$, $0\le|\alpha|\le k-1$, follow from the relations (7.1) satisfied for $\alpha+e_i$, $i=1,\dots,q$.

Proof: The proof consists of the application of the finite increments formula to the functions $n^{|\alpha|-1}\big(d_n^\alpha(\theta)\big)^{-1}\Phi_n^{(\alpha)}(u_1,u_2)$, $u_1,u_2\in v^c(R)\cap U_n^c(\theta)$, and is analogous to the proof of Lemma 12.1 of Section 4.
Let us denote

and let us write the Maclaurin expansion in terms of the variable $u$ for the gradient of the function $n^{-1}L\big(\theta+n^{1/2}d_n^{-1}(\theta)u\big)$:
$$n^{-1}\nabla L\big(\theta+n^{1/2}d_n^{-1}(\theta)u\big)=-2n^{-1/2}B(0;\theta)+2I(\theta)u-2n^{-1/2}B^{(2)}(\theta)u+\cdots\qquad(7.7)$$
The analogous expansion for the function $n^{-1}L_{il}$, $i,l=1,\dots,q$, has the form
$$n^{-1}L_{il}\big(\theta+n^{1/2}d_n^{-1}(\theta)u\big)=\cdots\qquad(7.10)$$
7. STOCHASTIC ASYMPTOTIC EXPANSION OF LSEs 83
If k = 2 then the sums in (7.7) and (7.9) are absent, and the remainder term
in (7.10) is equal to
LEMMA 17.2: Let $|u|\le\delta<1$ and let the event $\{s^*\le\mu_2+1\}$ be realised. Then if condition II is satisfied we have

$i,l=1,\dots,q$,
where
$$\zeta^{(1)}(\alpha;u)=2n^{(|\alpha|/2)-1}\big(d_n^\alpha(\theta)\big)^{-1}\cdots$$
where the $c(\beta,\gamma)$ are integer constants. Thanks to the conditions of the Lemma and the statement of Lemma 17.1,

(7.11)

$$c_3=4\sum_{i=1}^{q}c_2(e_i,1).$$
84 CHAPTER 2. APPROXIMATION BY A NORMAL DISTRIBUTION
$$n^{(|\alpha|/2)-1}\big(d_n^\alpha(\theta)\big)^{-1}\Big|\sum\big(f^{(\gamma)}(j,u)f^{(\beta)}(j,u)-f^{(\gamma)}(j,0)f^{(\beta)}(j,0)\big)\Big|\ \le\ \Big(n^{(|\gamma|-1)/2}\big(d_n^\gamma(\theta)\big)^{-1}\big(\Phi_n^{(\gamma)}(u,0)\big)^{1/2}\Big)\times\Big(n^{(|\beta|-1)/2}\big(d_n^\beta(\theta)\big)^{-1}d_n\big(\beta;\theta+n^{1/2}d_n^{-1}(\theta)u\big)\Big)+\cdots$$
The assertion of the Lemma follows from (7.8), (7.10)-(7.12), and from the obvious inequality $|u^\alpha|\le|u|^{|\alpha|}$.
Let us write
$$L^{(2)}=\big(L_{il}\big)_{i,l=1}^{q}$$
for the Hessian of the function $L$.
LEMMA 17.3: Let $\theta\in T$ and let the events $\{s^*\le\mu_2+1\}$, $\{n^{-1/2}|b(\alpha,\theta)|\le\delta\}$, $|\alpha|=2,\dots,k$, be realised, and let the conditions II and V (condition (4.7)) be satisfied. Then a number $r_0=r_0(T)>0$ can be found such that for $n>n_0$
Since
$$n^{-1/2}\big|b_{il}(\theta)\big|\ \le\ \delta$$
in the right hand side of the expansion (7.9) by the condition of the Lemma, and the sum of the subsequent terms does not exceed $c_6\delta$, $c_6=c_6(T)<\infty$, by the conditions of the Lemma and Lemma 17.2, the right hand side of (7.13) is not larger in value than $q(2+c_6)\delta$. Consequently the Lemma holds for $r_0\le\delta\le\lambda_0q^{-1}(2+c_6)^{-1}$.
If the event $\{|u_n(\theta)|\le r_0\}$ is realised and the conditions of Lemma 17.3 are satisfied, then the mapping
$$u\ \longrightarrow\ n^{-1}L\big(\theta+n^{1/2}d_n^{-1}(\theta)u\big)$$
into $\mathcal L_\infty(u,t)$ the terms containing $t^i$, $i=1,\dots,k-1$, vanish. Therefore it is possible to write

$i=1,\dots,k-1$,

$h_{i,k-1}=h_{i,k-1}(\theta)$, $i\ge k$; the vectors from the representations (7.18) and (7.19) are $q$-dimensional, their coordinates being homogeneous polynomials of degree $i$ in $b(\alpha,\theta)$, $|\alpha|=1,\dots,i$, with coefficients uniformly bounded in $n$ and $\theta\in T$.
Proof: The proof proceeds by induction on $k$. If $k=2$ then
$$h_1=A(\theta)B(0,\theta)$$
and the assertion for $h_{i,1}$, $i\ge2$, is verified immediately.
Let the assertion be true for some $k\ge2$. Then $h_k$ is defined by the condition of the vanishing of the coefficient of $t^k$ in the expression

In (7.20) the discarded terms are of degree in $t$ larger than $k$. From (7.19) we find
$$h_k=\frac12\,A(\theta)\,h_{k,k-1}.\qquad(7.21)$$
(7.23)
Hence the assertion of the theorem follows. In fact, let us set $c_8^{1/k}=c_7$. Then from (7.23) we obtain

(7.24)

Let us, for a fixed $\alpha$, estimate the probability $P_\theta^n\big\{|b(\alpha,\theta)|\ge c_7T_n^{1/k}\big\}$, setting

$j=1,\dots,n$,

in the formulation of Theorem A.5. Thanks to (7.1), (7.3) and (7.4) the r.v.-s $\xi_{jn}$ satisfy the conditions of Theorem A.5 if
$$\varkappa_n(T)\ \xrightarrow[n\to\infty]{}\ 0\qquad(7.27)$$
Since the terms $t^ih_i$ and $t^ih_{i,k-1}$ in (7.18) and (7.21) are homogeneous in $tb_n(\alpha,\theta)$, then
Hence
(7.28)
From Lemma 17.2, (7.27), and the first of the inequalities (7.28) it follows that
(7.29)
From the second of the inequalities (7.28) and the second of the inequalities (7.29)
we obtain
(7.30)
Using (7.7), (7.29), and the inequality (14.12) of the book [33],
$$\cdots\times\Big(2\lambda_0-2q\max_{1\le i,l\le q}\big|b_{il}(\theta)\big|\ \cdots\Big)\qquad(7.31)$$
Since $\nabla L(\hat\theta_n)=0$, we obtain (7.23) from (7.30) and (7.31), whence we can take $c_8=(c_{13}+c_{14}+2c_{15})\lambda_0^{-1}$.
We find the exact form of the polynomials $h_i=h_{i+1}$ occurring in the formulation of Theorem 17 by substituting $u^{(k-1)}(t)$ in (7.17) and equating to zero the coefficients of the powers of $t^i$ (see Lemma 17.4). In order to write the vector polynomials $h_0$, $h_1$ and $h_2$ obtained in this way let us adopt the tensor contraction convention, which is constantly used below: if in a product of two or more factors any index appears twice, then summation over all the values of this index from 1 to $q$ is understood.
For example,

(7.36)

(7.37)
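The summation convention just described can be mimicked directly with `numpy.einsum`, where a repeated index likewise denotes summation over its range (a side illustration, not from the text; the arrays merely stand in for quantities such as $a_{il}(\theta)$ and $b_l(\theta)$):

```python
import numpy as np

rng = np.random.default_rng(3)
q = 3
a = rng.standard_normal((q, q))   # plays the role of a_{il}
b = rng.standard_normal(q)        # plays the role of b_l

# Repeated index l => summation over l = 1, ..., q: c_i = a_{il} b_l
c = np.einsum('il,l->i', a, b)

# A doubly contracted example: s = a_{il} a_{il}
s = np.einsum('il,il->', a, a)
```

The `'il,l->i'` signature states exactly which indices are contracted, mirroring the convention used for the polynomials $h_0$, $h_1$, $h_2$.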
Only the notation $b_{i_1i_2i_3}(\theta)$ needs clarification. In general, for $r=1,\dots,k$ and $i_1,\dots,i_r=1,\dots,q$, we set
$$b_{i_1\dots i_r}(\theta)=n^{(r-1)/2}\big(d_{i_1n}(\theta)\cdots d_{i_rn}(\theta)\big)^{-1}\sum_{j=1}^{n}\varepsilon_j\,g_{i_1\dots i_r}(j,\theta).\qquad(7.38)$$
Let us derive the general recursion relations for the calculation of the polynomials $h_\nu$ in the case where the normalisation $n^{1/2}I_q$ is used instead of the normalisation $d_n(\theta)$. In this case the expressions for the functions $a_{i_1\dots i_r}(\theta)$ and the sums $b_{i_1\dots i_r}(\theta)$ are considerably simplified. For example, (7.38) turns into the expression

(7.39)
Assuming that the functions $g(j,\theta)$ are infinitely differentiable we can write (cf. (7.7) and (7.17))

(7.40)

$i=1,\dots,q$,
$$0=n^{-1}L_i(\hat\theta_n)=\sum_{r=0}^{\infty}n^{-(r+1)/2}\big(a_{ii_1\dots i_r}(\theta)-2n^{-1/2}b_{ii_1\dots i_r}(\theta)\big)\prod_{j=1}^{r}\Big(\sum_{\alpha_j=0}^{\infty}h_{\alpha_j}^{i_j}(\theta)\,n^{-\alpha_j/2}\Big).$$
It is important to emphasise that in the sums (7.43) the integral vectors $\alpha^{(r)}$ contain $r$ coordinates for each $r$.
The next Theorem is closely related to the one just proved.
THEOREM 18: Let the estimator $\hat\theta_n$ have the following property: for any $r>0$
$$\sup_{\theta\in T}P_\theta^n\big\{|u_n(\theta)|\ge r\big\}=o\big(n^{-(m-2)/2}\big).\qquad(7.44)$$
Proof: The relation (7.44) was established in Theorem 8 of Section 3. The conditions of Theorem 17 are sufficient for the l.s.e.-s $\hat\theta_n$ to have the property (7.44) if (3.11) and (3.12) are added to them. By virtue of Remark 8.2, for the normalisation $n^{1/2}I_q$ and bounded $\Theta$ we should add only (3.14) instead of (3.11) and (3.12). Since $\pi^{(1)}=o(n^{-(m-2)/2})$ in Theorem A.4, (7.45) follows from (7.6) if we take $T_n=c_*\log^{k/2}n$.
In the process of proving Theorem 17 we obtained simultaneously:
THEOREM 19: Let the conditions of Theorem 17 be satisfied for $k=2$, as well as (7.44). Then for any fixed $A>0$
$$=o\big(n^{-(m-2)/2}\big).\qquad(7.46)$$
Proof: The required estimate follows from (7.16) and Theorem A.5 applied to the sums of the r.v.-s $b_i(\theta)$, $i=1,\dots,q$.
REMARK 19.1: The estimate (7.46) sharpens the estimate (4.11) of Theorem 12. However, Theorem 12 was obtained without the assumption of the existence of the second derivatives of the regression function $g(j,\theta)$.
EXAMPLE 8 (see Example 4 of Section 4): Using Theorem 19, the bound (4.29) for the function $g(j,\theta)=\theta^1\cos\theta^2 j$ can be sharpened. Since now $m=3$,
$$J(\theta)=\begin{pmatrix}1&O(n^{-1})\\ O(n^{-1})&1\end{pmatrix},$$
This Section contains statements about the rate of convergence of the distribution of the normalised l.s.e.-s $\hat\theta_n$ to a Gaussian distribution. For a scalar $\theta$ it is shown (Theorem 21) that the rate of the Gaussian approximation is of order $O(n^{-1/2})$ as $n\to\infty$. This and other results of the Section are of a preliminary nature; later on they will be considerably extended at the cost of additional requirements on $g(j,\theta)$ and $\varepsilon_j$.
Let us write $\mathfrak C_q\subset\mathcal B_q$ for the class of all convex Borel subsets of $\mathbb R^q$. Let $\varphi_K(x)$, $x\in\mathbb R^q$, be the density of a Gaussian random vector ($q>1$) with zero mean vector and correlation matrix $K$ (for $q=1$, the density of a Gaussian r.v. with zero mean and variance $K$).
THEOREM 20: Let us assume that the conditions of Theorem 18 hold for $k=2$ and some $m\ge3$, and that (7.5) is satisfied instead of (7.4) for $\alpha=e_i$, $i=1,\dots,q$. Then
Proof: We shall assume that $m=3$. From (7.45) and the positive definiteness of the matrix $I(\theta)$ follows the existence of a constant $c_*$ such that

(8.2)
For $A\in\mathcal B_q$ and $x>0$ we denote
$$A^x=\{y\in\mathbb R^q:\ \rho(y,A)<x\},\qquad \rho(y,A)=\inf_{z\in A}|y-z|,$$
and $A^{-x}$ is the internal set parallel to $A$. Let us remark that the set $A^x$ is open and $A^{-x}$ is closed.
8. LSE ASYMPTOTIC NORMALITY: FIRST RESULTS 93
Let us set
$$x_n=c_*n^{-1/2}\log n.$$
For any $C\in\mathfrak C_q$, from (8.2) there follows the inequality
$$P_\theta^n\big\{\mu_2^{-1/2}I^{1/2}(\theta)\,d_n(\theta)(\hat\theta_n-\theta)\in C\big\}\ \le\ P_\theta^n\big\{\mu_2^{-1/2}A^{1/2}(\theta)B(0,\theta)\in C^{x_n}\big\}+\gamma_n.\qquad(8.3)$$
The inequality '$\ge$' is already quite clear, as is the inequality '$\le$' if we take into consideration that $(C^{-x_n})^{x_n}\subseteq C$.
Let us apply Theorem A.9 to the sequence of random vectors

$j=1,\dots,n$, $n\ge1$. Setting
$$K_n(\theta)=\mu_2I(\theta),\qquad n^{-1/2}\sum K_n^{-1/2}(\theta)\xi_{jn}=\mu_2^{-1/2}A^{1/2}(\theta)B(0,\theta),\qquad \rho_{3,n}(\theta)=n^{-1}\sum E_\theta\big|\xi_{jn}\big|^3,$$
we have

(8.5)
$$I_n(\theta)\ \xrightarrow[n\to\infty]{}\ I(\theta),$$
where
$$A(\theta)=I^{-1}(\theta).$$
Proof: Clearly

(8.6)

uniformly in $\theta\in T$. When $\theta\in T$ is fixed, (8.6) is an immediate corollary of two facts:

(1) the measure $\Phi_{\mu_2A_n(\theta)}$ converges weakly as $n\to\infty$ to the measure $\Phi_{\mu_2A(\theta)}$;

(2) $\mathfrak C_q$ is a uniform class of sets for the Gaussian measure $\Phi_{\mu_2A(\theta)}$ ([33], p. 36).

However, both the direct proof of this and of the uniformity in $\theta\in T$ of the variant (8.6) are almost obvious.
For a symmetric positive definite matrix $A=(a_{ij})_{i,j=1}^{q}$
$$\det A\ \le\ a_{11}\cdots a_{qq}.$$
Consequently
$$\det J(\theta)\ \le\ 1.$$
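The determinant bound invoked here is Hadamard's inequality for symmetric positive definite matrices. A quick numerical check (the matrix is randomly generated for the illustration and is not from the text):

```python
import numpy as np

rng = np.random.default_rng(4)
m = rng.standard_normal((4, 4))
A = m @ m.T + 4.0 * np.eye(4)        # symmetric positive definite by construction

det_A = np.linalg.det(A)
diag_prod = np.prod(np.diag(A))
# Hadamard's inequality for positive definite matrices: det A <= a_11 * ... * a_qq
assert det_A <= diag_prod
```

Equality holds exactly when $A$ is diagonal, which is why $\det J(\theta)\le1$ when the diagonal entries equal 1.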
The same inequality is valid for $\Phi_{\mu_2A(\theta)}(B)$. For an arbitrary $\varepsilon>0$ let $r=r(\varepsilon)<\infty$ be a number such that
$$\cdots\ <\ \varepsilon+\cdots\det I(\theta)\Big(1-\big(\det I(\theta)\big)^{1/2}\Big)\Phi_{\mu_2A(\theta)}\big(C\cap v(r)\big)\ \cdots\ \int_{C\cap v(r)}|x|^2e^{-(1/2\mu_2)(I(\theta)x,x)}\,dx.$$
COROLLARY 20.2: Let the l.s.e. $\hat\theta_n$ have the following property for some integer $m\ge3$:

for sufficiently large $H$ and some constant $\varkappa<\infty$. Then under the conditions of Theorem 20 and Corollary 20.1:

(1) (8.8)

(2) (8.9)

for real $r_i\ge0$, $r=\sum_{i=1}^{q}r_i<m$, where the vector $\xi=(\xi^1,\dots,\xi^q)$ has the Gaussian distribution $\Phi_{\mu_2A(\theta)}$.
Proof: The conditions under which the inequality (8.7) holds were discussed in Section 3. In Section 2 we obtained exponential bounds on the probabilities of large deviations of the r.v.-s $|d_n(\theta)(\hat\theta_n-\theta)|$, from which (8.7) also follows. For (8.7) to be satisfied for any $r<m$ and $n>n_0$ (see, for example, [80])
Since
$$\prod_{i=1}^{q}\big|d_{in}(\theta)\big(\hat\theta_n^i-\theta^i\big)\big|^{r_i}\ \le\ \big|d_n(\theta)\big(\hat\theta_n-\theta\big)\big|^{r},$$
(8.8) and (8.9) are variants of the theorems about convergence [151], pp. 196-198.
From (8.8) it follows in particular that the mathematical expectation and the correlation matrix of the vector $d_n(\theta)(\hat\theta_n-\theta)$ converge uniformly in $\theta\in T$ to $0$ and $\mu_2A(\theta)$, respectively, as $n\to\infty$. In Section 12 the relation (8.8) is investigated at greater length.
The estimate (8.1), which will be useful in the future, does not determine the exact order in $n$ of the rate of convergence of the distribution of the l.s.e.-s $\hat\theta_n$ to a Gaussian distribution. It may be expected that the exact bound is $O(n^{-1/2})$. Below, just such a rate of approximation to the Gaussian law is obtained for a scalar $\theta$.
Up to the end of the Section we assume that $\theta\in\Theta$, where $\Theta$ is an open finite or infinite interval of the real axis $\mathbb R^1$, and that $T\subset\Theta$ is a finite closed interval of $\mathbb R^1$.
Let us denote
$$d_n^2(\theta)=\sum_{j=1}^{n}\big(g'(j,\theta)\big)^2,\qquad d_n^2(2,\theta)=\sum_{j=1}^{n}\big(g''(j,\theta)\big)^2,$$
$$f'(j,u)=g'\big(j,\theta+n^{1/2}d_n^{-1}(\theta)u\big),\qquad f''(j,u)=g''\big(j,\theta+n^{1/2}d_n^{-1}(\theta)u\big),$$
$\le\ c_5(R)$,

(2) $\displaystyle\sup_{\theta\in T}\ \sup_{u_1,u_2\in[-R,R]\cap U_n^c(\theta)}n\,d_n^{-4}(\theta)\,\Phi_{2n}(u_1,u_2)\,|u_1-u_2|^{-2}\ \le\ c_6(R)$.
III. $\displaystyle\varliminf_{n\to\infty}\ \inf_{\theta\in T}\ n^{1/2}d_n^{-2}(\theta)\,d_n(2;\theta)\ >\ 0$.

$<\infty$.
we obtain
If the event
$$\inf_{|u|\le2\mu_2^{1/2}(1+\delta)^{1/2}n^{-1/2}\log^{1/2}n}t(u)\ \cdots\ +\ \big(n^{1/2}d_n(2;\theta)\,d_n^{-2}(\theta)\big)\big(n^{-1/2}\Phi_n^{1/2}(u,0)\big)\ \cdots$$
In particular,

(8.15)
Then
$$X\cap\big\{\mu_2^{-1/2}d_n(\theta)(\hat\theta_n-\theta)<x\big\}$$
In fact,

where $\varkappa\in(0,\tfrac12)$ is a fixed number. Therefore the points $u_n(\theta)$ and $\mu_2^{1/2}n^{-1/2}x$ lie on the ascending branch of the parabola
$$y_+(x)=t(\theta)x+c_7x^2.$$
Consequently, from the inequality
$$u_n(\theta)\ <\ \mu_2^{1/2}n^{-1/2}x\qquad(8.17)$$
$$=0.$$
The points $u_n(\theta)$ and $\mu_2^{1/2}n^{-1/2}x$ lie on the ascending branch of the parabola

Therefore if
$$u_n(\theta)\ \ge\ \mu_2^{1/2}n^{-1/2}x,$$
then
$$0\ <\ -n^{-1/2}b^{(1)}(\theta)+\mu_2^{1/2}n^{-1/2}t(\theta)x-\mu_2n^{-1}c_7x^2,$$
$$\sup_{\theta\in T}P_\theta^n\{X(\theta)\}=o(n^{-1/2}).$$
Let us denote
$$\eta_n(z)=n^{-1/2}\sum_{j=1}^{n}\xi_{jn}(z),\qquad \xi_{jn}(z)=\varepsilon_j\big(d_n^{-1}(\theta)g'(j,\theta)+\mu_2^{1/2}d_n^{-2}(\theta)g''(j,\theta)\,z\,n^{1/2}\big).\qquad(8.19)$$
It is easy to see that
$$\sigma_n^2(\theta,z)=E_\theta\eta_n^2(z)\ \cdots$$
And so
$$\rho_{3,n}(\theta,z)\ \cdots$$
$$\sup_{\theta\in T,\ z\in Z_n,\ y\in\mathbb R^1}\Big|P_\theta^n\big\{\big(E_\theta\eta_n^2(z)\big)^{-1/2}\eta_n(z)<y\big\}-\Phi(y)\Big|=O(n^{-1/2}).\qquad(8.23)$$
Let us estimate the second term on the right hand side of (8.20). Using Newton's binomial expansion for $(1+w)^{-1/2}$ and the condition of the Theorem it is easy to establish that
where
Since
$$|z|\ \le\ 2(1+\delta)^{1/2}\log^{1/2}n,$$
then for $n>n_0$
$$|z^*|\ >\ \frac{|z|}{2},$$
and consequently
The relations (8.23) and (8.24) prove (8.18); it remains to estimate the case

(8.25)
Together with (8.11) the bound (8.26) leads to the relation
$$\sup_{\theta\in T}\ \sup_{x\le2(1+\delta)^{1/2}\log^{1/2}n}\Big|P_\theta^n\big\{\mu_2^{-1/2}d_n(\theta)(\hat\theta_n-\theta)<x\big\}-\Phi(x)\Big|=o(n^{-1/2}).$$
The case
$$x\ \le\ -2(1+\delta)^{1/2}\log^{1/2}n$$
is analysed analogously.
Let us set
$$\hat\sigma_n^2 = n^{-1} L(\hat\theta_n),$$
and change the normalisation of $\hat\theta_n$ in (8.10): instead of $\mu_2$ and $d_n(\theta)$ we
substitute their statistical estimators $\hat\sigma_n^2$ and $d_n(\hat\theta_n)$. In this case there holds:
THEOREM 22: Let $\mu_6 < \infty$ and let the l.s.e. $\hat\theta_n$ satisfy (8.11) for $m = 6$. Then if
conditions II, III, IV are satisfied,
Let us assume that the events $X_1^{(\delta)}$ and $X_3$ are realised. By the finite-increments
formula we find
$$\times \sum f'(j,u)\, f''(j,u)\ \le\ \Sigma_1 + \Sigma_2.$$
Let us estimate $\Sigma_1$, taking into consideration that, first,
$$\cdots \le c_9\, n^{1/2}.$$
Secondly,
$$n^{-1} L^{1/2}\big(\theta + n^{1/2} d_n^{-1}(\theta)\,u\big) > (S^*)^{1/2} - n^{-1/2}\Delta^{1/2}(u,\theta).$$
Thirdly, $\cdots$ Therefore $\cdots$ and consequently
$$\Sigma_1 \le c_{12}\, d_n^{-1}(\theta)\ \cdots\ \le c_{13}\, n^{1/2}.$$
$$\big\{ \cdots < x \big\} \cap X_1^{(\delta)} \cap X_3$$
It is easy to see that if the events $X_1^{(\delta)}$ and $X_3$ are realised, then
$$\big( 1 - x n^{-1/2} d_n(\theta)\,\zeta(u^*) \big)^{-1} = 1 + x n^{-1/2} d_n(\theta)\,\zeta(u^*,x),$$
with
$$\sup_{|u|\le 2\mu_2^{1/2}(4+\delta)^{1/2}\log^{1/2} n}\ \sup_{|x|\le 2(4+\delta)^{1/2}\log^{1/2} n} d_n(\theta)\,|\zeta(u^*,x)| \le c_{18} < \infty.$$
Let us denote
$$X_4(\theta,x) = \big\{ \mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) \ \cdots \big\}, \qquad
X_5(\theta,x) = \big\{ \hat\sigma_n^{-1} d_n(\theta)(\hat\theta_n - \theta) \ \cdots \big\}$$
106 CHAPTER 2. APPROXIMATION BY A NORMAL DISTRIBUTION
where the upper sign is chosen if $x \ge 0$ and the lower one if $x < 0$. Clearly, for
$n > n_0$
$$\cdots - \sqrt{2}\, c_{18}\, x^2 n^{-1/2}\big) - \Phi(x)$$
Since the same bound holds for $X_5(\theta,x)$, the assertion of the Theorem is established
for
and that the events $X_1^{(\delta)}(\theta)$ and $X_3$ are realised. Then by condition II
$$\sup_{\theta\in T} d_n(\hat\theta_n)\, d_n^{-1}(\theta) \le c_4(1),$$
and by (8.29) $\cdots$ Therefore if
$$x \ge \mu_2^{1/2}\, c_4(1)\, c_{10}^{-1},$$
then $\cdots$ i.e., $\cdots$ or $\cdots$ Consequently for $\cdots$ we have
$$\Big| P_\theta^n\big\{ \hat\sigma_n^{-1} d_n(\hat\theta_n)(\hat\theta_n - \theta) < x \big\} - \Phi(x) \Big|
\le \Phi(-x) + P_\theta^n\big\{ \hat\sigma_n^{-1} d_n(\hat\theta_n)(\hat\theta_n - \theta) \ge x \big\}$$
In this Section we prove one theorem about the asymptotic normality of the l.m.e.
$\hat\theta_n$, using a method of partitioning the parameter set due to Huber
[118, 119]. Using the notation of the preceding sections, let us assume the following.
(ii) The r.v. $\varepsilon_j$ is symmetric and has a bounded density $p(x) = P'(x)$ satisfying
the condition
$$|p(x) - p(0)| \le H|x|, \qquad p(0) > 0,$$
where $H < \infty$ is some constant.
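A least-moduli estimator of the kind studied here can be computed in a toy scalar model. The sketch below is an illustration under our own assumptions: a linear-in-$\theta$ model $g(j,\theta)=\theta t_j$ (so the $L_1$ criterion is convex and a ternary search is valid) and Laplace errors, which are symmetric with a bounded density and $p(0)>0$, i.e. they satisfy condition (ii).

```python
import math, random

random.seed(1)
theta0, n = 2.0, 400
t = [(j + 1) / n for j in range(n)]

def laplace(scale=0.5):
    # symmetric error with bounded density and p(0) > 0, via inverse CDF
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

X = [theta0 * t[j] + laplace() for j in range(n)]

def L1(th):                         # least-moduli criterion sum |X_j - g(j, theta)|
    return sum(abs(X[j] - th * t[j]) for j in range(n))

lo, hi = 0.0, 4.0                   # L1 is convex in theta for this linear model
for _ in range(80):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if L1(m1) < L1(m2):
        hi = m2
    else:
        lo = m1
theta_lme = (lo + hi) / 2
print(round(theta_lme, 1))
```

With $n = 400$ observations the l.m.e. lands close to the true $\theta_0 = 2$, as the asymptotic normality result predicts.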
From (9.1) there follow the inequalities (3.17) and (3.18), and the inequality
strengthening (3.16):
$$\sup_{\theta\in T}\ \sup_{u_1,u_2\in v^c(R)\cap U_n^c(\theta)} n^{-1}\Delta_n(u_1,u_2)\,|u_1-u_2|^{-2} \le c(R) < \infty. \qquad (9.3)$$
And so if, in addition to (9.1), it is assumed that $\mu_s < \infty$ for some integer $s \ge 1$
and that (3.15) holds, then by Theorem 9 of Section 3, for any $r > 0$,
$$\sup_{\theta\in T} P_\theta^n\big\{ |n^{-1/2} d_n(\theta)(\hat\theta_n - \theta)| \ge r \big\} = z_n(s), \qquad (9.4)$$
where
$$z_n(s) = O(n^{-s+1}),\quad s \ge 2; \qquad z_n(1) \xrightarrow[n\to\infty]{} 0.$$
From condition (9.2), which coincides with condition (4.8) as may be inferred
from Lemma 12.2, one can obtain the inequality
$$\sup_{\theta\in T}\ \sup_{u_1,u_2\in v^c(R)\cap U_n^c(\theta)} d_{in}^{-2}\,\Delta_n^{(i)}(u_1,u_2)\,|u_1-u_2|^{-2} \ \cdots$$
$$R(\hat\theta_n) \cong 0$$
Proof: We shall divide the proof into several steps. Let $l_1,\dots,l_q$ be the positive
directions of the coordinate axes. Let us consider the vectors $R(\theta)$ with
coordinates ($i = 1,\dots,q$)
and the vectors $E_\theta R(\tau)$ with coordinates ($i = 1,\dots,q$)
Clearly
LEMMA 23.1: Under the conditions of Theorem 23, for any $\varepsilon > 0$ and sufficiently
small $\rho > 0$
Proof: We carry out the proof for the quantity $z_n^+(\theta,u)$. For simplicity we shall
assume that $\rho = 1$ and the inner supremum in (9.7) is taken over the cube
$$a(j) = t\, n^{-\gamma m/m_0}$$
for some $m = m(j)$, $j = 1,\dots,N_0 - 1$. In fact, let the cube $C^{(j)}$ be an element of
the covering of the set $C^{(m)}$. Then
In this way there is no more 'room' in the given octant than for
$$\frac{(1-t)^{mq} - (1-t)^{(m+1)q}}{t^q\,(1-t)^{mq}} = \frac{1 - (1-t)^q}{t^q}$$
Let us estimate each term in (9.8). The general element of the derivative matrix
$D_n(u)$ of the mapping
$$u \mapsto E_\theta\, R^+\big(\theta + n^{1/2} d_n^{-1}(\theta)\,u\big)$$
has the form
$$\cdots = {}^{1}\!D_n^{ik}(u) + \Delta_n^{ik}(u).$$
Taking into account (9.2), (9.3) and the inequality
$$\sup_{x\in\mathbb{R}^1} p(x) = p_0 < \infty,$$
$$\big|\, n^{-1/2}\Delta_n^{ik}(u) - p(0)\, I_{ik}(\theta) \,\big| \le \cdots + d_{in}^{-1}(\theta)\big(\Phi_n^{(i)}(u,0)\big)^{1/2}\Big]$$
By (9.1) and (9.5) the terms in square brackets are bounded by the quantity
$$p_0\Big( \big(c^{(i)}(\rho)\big)^{1/2} + c^{(i)}(\rho)\, c^{-1/2}(\rho) \Big)\,|u|.$$
For the last term of (9.10), using condition (ii) and (9.1) with $u = 0$, we find the
majorant
(9.11)
Since by condition V the matrix
$$n^{-1/2} D_n(0) = 2p(0)\, I(\theta)$$
is positive definite, the arguments sketched show that for sufficiently small $u$ (for
simplicity we assume that $u \in C_0$) and some $c_0 > 0$
(9.12)
Let $k \ne N_0$, and let $v \in C^{(k)}$ be an arbitrary point. Then with (9.12) one can
write
$$w_n^{(1)}(\theta,u,v) = 2\Big|\, d_n^{-1}(\theta) \sum \nabla f(j,u)\big( \chi\{X_j < f(j,u)\} - \chi\{X_j < f(j,v)\} \big) \Big|,$$
$$\cdots = 2\Big|\, d_n^{-1}(\theta) \sum \nabla f(j,u)\big( P(f(j,u) - f(j,0)) - P(f(j,v) - f(j,0)) \big) \Big|,$$
$$w_n^{(2)}(\theta,u,v) = \Big|\, d_n^{-1}(\theta) \sum \big( \nabla f(j,u) - \nabla f(j,v) \big)\big( 2P(f(j,v) - f(j,0)) - 1 \big) \Big|,$$
$$y_n^{(k)}(\theta,v)$$
Let us further note that, in accordance with (9.1), (9.3) and (ii),
(9.14)
Analogously
$$\cdots = X_j \pmod{P_\theta^n}.$$
Consequently by (9.1)
(9.16)
$$= n^{-1}\sum \Big( P\Big( \sup_{u\in C^{(k)}} f(j,u) - g(j,\theta) \Big) - P\Big( \inf_{u\in C^{(k)}} f(j,u) - g(j,\theta) \Big) \Big)$$
$$\le p_0 \Big( \sum_{i=1}^{q} \Big( n^{1/2} d_{in}^{-1}(\theta) \sup_{u\in C^{(k)}} \max_{1\le j\le n} |f_i(j,u)| \Big)^2 \Big)^{1/2} a(k)\, q^{1/2}$$
The quantity
$$c_\rho(k) = \big( c_2(1-t) - c_7 t \big)\, a(k)\, n^{-\gamma m/m_0} > 0 \qquad (9.19)$$
Let us denote
$$Y_{1i}(j) = \big( f_i(j,v) - f_i(j,0) \big)\big( 2\chi\{X_j < f(j,v)\} - 1 \big),$$
$$Y_{2i}(j) = 2 f_i(j,0)\big( \chi\{X_j < f(j,v)\} - \chi\{\varepsilon_j < 0\} \big), \qquad i = 1,\dots,q.$$
Then
$$P_1 = P_\theta^n\Big\{ Y_n^{(k)}(\theta,v)\big( 1 + c_0 n^{1/2}\rho(k) \big)^{-1} > \tfrac12 \Big\} \ \cdots \qquad (9.21)$$
The relations (9.20)–(9.22) and the condition of the Theorem show that
$$P_1 \le c_9 n^{-1}\big[ (a(k) + \rho(k))^2 \rho^{-2}(k) + (a(k) + \rho(k))\,\rho^{-2}(k) \big]
= c_9 n^{-1}\big[ (1-t)^{-2} + (1-t)^{-2}\, n^{\gamma m/m_0} \big] \qquad (9.23)$$
The bounds (9.19) and (9.23) show that, for $k = 1,\dots,N_0 - 1$ and some $m = m(k) < m_0$,
(9.24)
$$P_\theta^n\Big\{ \sup_{u\in C^{(N_0)}} z_n^+(\theta,u) > \cdots \Big\}
\le P_\theta^n\Big\{ \sup_{|u| \le n^{-\gamma m/m_0}} \big| R^+\big(\theta + n^{1/2} d_n^{-1}(\theta)\,u\big) - R^+(\theta) \big| \ \cdots \Big\}$$
Let us write the expression standing under the norm sign in (9.25) in the form of
a sum of vectors
(9.26)
If $\gamma > \tfrac12$, then for $n > n_0$ the exponents in (9.26) and (9.27) are negative. And
so it remains to estimate the probability ($\varepsilon' < \varepsilon$)
(9.28)
$i = 1,\dots,n$,
instead of (9.28) it is sufficient to estimate, for any $\varepsilon'' > 0$, the probability
$$P_\theta^n\Big\{ n^{-1/2}\sum (X_j - E_\theta X_j) > \varepsilon'' \Big\} \le (\varepsilon'')^{-2}\, c_{12}\, n^{-\gamma}.$$
As all the bounds are uniform in $\theta \in T$ and the case of $z_n^-(\theta,u)$ is investigated
analogously, Lemma 23.1 is proved.
Let us set
LEMMA 23.2: Under the conditions of Theorem 23, for any $\varepsilon > 0$
$$\sup_{\theta\in T} P_\theta^n\big\{ \big| R^+(\theta) + E_\theta^n R^+(\hat\theta_n) \big| > \varepsilon \big\} \xrightarrow[n\to\infty]{} 0. \qquad (9.29)$$
9. ASYMPTOTIC NORMALITY OF LEAST MODULI ESTIMATORS 117
$i = 1,\dots,q$.
From (9.4) and the assertion of the preceding Lemma it follows that
$$\cdots \supset A_r(\theta)$$
as well. On the other hand ($i = 1,\dots,q$),
(9.31)
$$\cdots \subset \big\{ |E_\theta^n R^+(\hat\theta_n)| \le (1 - q\varepsilon)^{-1}\big( q\varepsilon + |R^+(\theta)| \big) \big\} = X^+(\theta),$$
i.e.,
(9.32)
We note that
$$P_\theta^n\big\{ |E_\theta^n R^+(\hat\theta_n)| > M \big\} \ \cdots$$
Let us denote $\cdots$ Then $P_\theta^n$-a.s.
$$\cdots = 0,$$
i.e., the vector $R^+(\theta)$ is bounded in probability. From (9.32) and (9.33) it follows
that the vector $E_\theta^n R^+(\hat\theta_n)$ is also bounded in probability, uniformly in $\theta \in T$.
According to (9.31)
LEMMA 23.3: Under the conditions of Theorem 23, for any $\varepsilon > 0$
$$\sup_{\theta\in T} P_\theta^n\big\{ \big| E_\theta^n R^+(\hat\theta_n) - 2p(0)\, I(\theta)\, d_n(\theta)(\hat\theta_n - \theta) \big| > \varepsilon \big\} \xrightarrow[n\to\infty]{} 0. \qquad (9.34)$$
Proof: If the quantity $n^{-1/2}|d_n(\theta)(\hat\theta_n - \theta)|$ is small, then from the inequality (9.12)
and the boundedness in probability of the r.v. $|E_\theta^n R^+(\hat\theta_n)|$ it follows that the norm
of the vector $d_n(\theta)(\hat\theta_n - \theta)$ is bounded in probability. The assertion of the Lemma
follows from (9.4) and the inequalities (9.9)–(9.11).
Proof of Theorem 23: The relations (9.29) and (9.34) show that for any $\varepsilon > 0$
$$\sup_{\theta\in T} P_\theta^n\big\{ \big| (2p(0))^{-1} A(\theta)\, R^+(\theta) + d_n(\theta)(\hat\theta_n - \theta) \big| > \varepsilon \big\} \xrightarrow[n\to\infty]{} 0. \qquad (9.35)$$
10. ASYMPTOTIC EXPANSION OF LSE DISTRIBUTION 119
From the relations (9.35) and (9.36) it follows that for any $\varepsilon > 0$ and $C \in \mathfrak{C}^q$
$$-\,\Delta_n + \Phi(C^{-\varepsilon}) \le P_\theta^n\big\{ 2p(0)\, I^{1/2}(\theta)\, d_n(\theta)(\hat\theta_n - \theta) \in C \big\} \qquad (9.37)$$
where $C^{\varepsilon}$ and $C^{-\varepsilon}$ are exterior and interior sets parallel to $C$, and $\Delta_n \xrightarrow[n\to\infty]{} 0$
uniformly in $\theta \in T$ and $C \in \mathfrak{C}^q$. The assertion of the Theorem follows from (9.37)
and Theorem A.11 (see (8.5)).
In this Section an a.e. is obtained for the distribution of the normed l.s.e. $\hat\theta_n$.
In Sections 10 and 11, instead of the normalisation $d_n(\theta)$ the normalisation
$n^{1/2} I_q$ is used. This simplification is not a fundamental one; it only makes it
easier to write the tedious formulae of those Sections. The normalisation $n^{1/2} I_q$
leads to an alteration in the written forms of quantities introduced earlier, for
which the previous notations are kept. In particular, we immediately have
$$\Theta - \theta = U(\theta),$$
etc.
We shall use the conditions of Section 7 in the following form.
II. For any $R > 0$ there exist constants $c_i(a,R) < \infty$, $i = 1,2$, such that
(1) $\displaystyle\sup_{\theta\in T}\ \sup_{u\in v^c(R)\cap U^c(\theta)} n^{-1/2}\, d_n(a;\theta+u)$
Omitting the r.v.-s $V_{i_1\dots i_r}(\theta)$ that vanish for this reason, let us consider the vector
$V_r(\theta)$ consisting of all the different r.v.-s $V_{i_1\dots i_s}(\theta)$, ordered in the natural order, i.e.,
$s = 0,\dots$
then
where $\cdots$ is the arithmetic mean of the correlation matrices of the vectors $w_j(\theta)\varepsilon_j$. Let us
assume that
$$w_m(\theta,t) = \prod_{j=m+1}^{m+u} \big|\psi_j\big( K_n^{-1/2}(\theta)\,t \big)\big|, \qquad 0 \le m \le n-u,\ n \ge u+1,$$
$$\sup_{0\le m\le n-u}\ \sup_{\theta\in T} \int_{\mathbb{R}^p} w_m(\theta,t)\, dt < \infty, \qquad n \ge u+1. \qquad (10.7)$$
The condition VII imposes restrictions simultaneously upon the r.v.-s $\varepsilon_j$ and the
function $g(j,\theta)$, and therefore the inequalities (10.7) and (10.8) are not obvious.
We shall find requirements that can be imposed individually on the r.v.-s
$\varepsilon_j$ and the functions $g(j,\theta)$ in order to guarantee the fulfilment of VI and VII.
VIII. There exists an integer $h \ge p$ such that amongst any $h$ vectors from the
totalities
$$\{ w_j(\theta),\ j = m+1,\dots,m+h \}, \qquad 0 \le m \le n-h,\ n \ge h+1,$$
there can be found $p$ vectors $w_{j_1},\dots,w_{j_p}$ such that the matrix
for some $\rho \ge 1$.
Let us show that (10.6) follows from (10.9). We shall use the following facts,
which arise from the Courant–Fischer theorem on the minimax representation of
eigenvalues ([29], pp. 143–144).
Let $A$ and $B$ be symmetric matrices, with the matrix $B$ non-negative
definite. Then:
where
$$A = \cdots \sum_{s=0}^{[\,\cdot\,]-1} W_n^{(s)}(\theta).$$
Using next the inequality (2), first for two terms of the sum
$$\sum_{s=0}^{[\,\cdot\,]-1} W_n^{(s)}(\theta),$$
$$\cdots\ \lambda_{\min}(B)\ \cdots\ \ge\ \prod_{s=1}^{[\rho]+1} \lambda_s^{1/([\rho]+1)},$$
where the indices $j_i$ correspond to the vectors $w_{j_1},\dots,w_{j_p}$ of condition VIII. In
the last integral we substitute the variables
$$\big( w_{j_i}(\theta),\ K_n^{-1/2}(\theta)\,t \big) = u_i, \qquad i = 1,\dots,p.$$
where $D_n(\theta)$ is the matrix with columns $w_{j_i}(\theta)$, $i = 1,\dots,p$. Since by (10.9)
then
where the $K^{ii}(\theta)$ are the diagonal elements of the matrix $K_n(\theta)$, and by conditions
(10.1) and (10.5) we have, uniformly with respect to $\theta \in T$,
$$\cdots < \infty$$
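One of the Courant–Fischer facts used above — adding a non-negative definite matrix cannot decrease the smallest eigenvalue, since $\lambda_{\min}(A+B) \ge \lambda_{\min}(A) + \lambda_{\min}(B)$ — is easy to check numerically. A minimal sketch with $2\times2$ symmetric matrices (the closed-form eigenvalue and the random test are purely illustrative):

```python
import math, random

def eig_min(a, b, c):
    # smallest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]
    return (a + c) / 2 - math.sqrt(((a - c) / 2) ** 2 + b ** 2)

random.seed(2)
ok = True
for _ in range(1000):
    a, b, c = (random.uniform(-2, 2) for _ in range(3))
    u, v = random.uniform(-2, 2), random.uniform(-2, 2)
    # B = w w' is rank one and non-negative definite
    Ba, Bb, Bc = u * u, u * v, v * v
    ok = ok and (eig_min(a + Ba, b + Bb, c + Bc) >= eig_min(a, b, c) - 1e-9)
print(ok)
```

Every random trial respects the monotonicity inequality (up to floating-point tolerance), as the theorem guarantees.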
For the same collection of vectors $w_{j_1},\dots,w_{j_p}$ we obtain for $|t| > b$
$$\sum_{i=1}^{p} \Big( \big( K_n^{-1/2}(\theta)\, w_{j_i}(\theta),\ t \big) \Big)^2
= \Big( K_n^{-1/2}(\theta)\, W_n(\theta)\, K_n^{-1/2}(\theta)\, t,\ t \Big) \ \cdots$$
Therefore among the numbers $\big( K_n^{-1/2}(\theta)\, w_{j_i}(\theta),\ t \big)$, $i = 1,\dots,p$, a number can be
found that has the property
Then
Let $\hat G$ be the c.f. of the probability measure $G$ on $(\mathbb{R}^p, \mathcal{B}^p)$, $\nu = (\nu_1,\dots,\nu_p)$
being a multi-index. Then the quantity
$$\chi_\nu = \frac{1}{i^{|\nu|}}\, (\log \hat G)^{(\nu)}(0)$$
is the cumulant of order $\nu$ of $G$, and $\chi_s(z)$ is a form of degree $s$ with respect to the variables $z^1,\dots,z^p$. Let us define the polynomial $P_s(z;\{\chi_\nu\})$
in the variables $z^1,\dots,z^p$ by equating two formal power series in the variable $u$:
$$1 + \sum_{s\ge1} P_s(z;\{\chi_\nu\})\, u^s = \exp\Big\{ \sum_{s\ge1} \frac{\chi_{s+2}(z)}{(s+2)!}\, u^s \Big\}.$$
To obtain the general form of the polynomial $P_s(z;\{\chi_\nu\})$ we use a fact about
the derivative of the exponent of a power series ([173], p. 169):
where $\sum^*$ denotes the summation over all integral non-negative solutions $k_1,\dots,k_s$
of the equation $k_1 + 2k_2 + \dots + s k_s = s$,
$$P_s(z;\{\chi_\nu\}) = \sum{}^{*} \prod_{m=1}^{s} \frac{\chi_{m+2}^{k_m}(z)}{k_m!\,\big((m+2)!\big)^{k_m}}. \qquad (10.12)$$
From formula (10.12) it is seen that $P_s(z;\{\chi_\nu\})$ is a polynomial in $z^1,\dots,z^p$ of
order $3s$ and that its coefficients depend upon the cumulants $\chi_\nu$ with $|\nu| \le s+2$.
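The index set of the sum $\sum^*$ in (10.12) — all non-negative integer solutions of $k_1 + 2k_2 + \dots + sk_s = s$ — can be enumerated mechanically; its size is the number of integer partitions of $s$. A small illustrative sketch (the function name is ours):

```python
def partitions_of(s):
    """All tuples (k1, ..., ks) with k1 + 2*k2 + ... + s*ks = s,
    i.e. the index set of the sum defining P_s."""
    def rec(rem, m):
        # solutions (k1, ..., km) of k1 + 2*k2 + ... + m*km = rem
        if m == 0:
            return [[]] if rem == 0 else []
        out = []
        for k in range(rem // m + 1):
            for head in rec(rem - m * k, m - 1):
                out.append(head + [k])
        return out
    return rec(s, s)

print([len(partitions_of(s)) for s in (1, 2, 3, 4)])
```

The counts 1, 2, 3, 5 are the partition numbers $p(s)$, so $P_1$ has a single term while higher polynomials accumulate terms quickly.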
Let us consider a sequence of independent vectors $\xi_j$ with values in $\mathbb{R}^p$ and
zero means. Let $G_j$ be the distribution of $\xi_j$. The c.f. $\prod_{j=1}^n \hat G_j(t n^{-1/2})$ corresponds
to the sum $n^{-1/2}\sum \xi_j$. Assuming that moments of any order exist for the vectors
$\xi_j$, $j = 1,\dots,n$, we obtain formally
$$\log \prod_{j=1}^{n} \hat G_j(t n^{-1/2}) = -\tfrac12\, (K_n t,\, t) + \sum_{s\ge1} \frac{\bar\chi_{s+2}(it)}{(s+2)!}\, n^{-s/2}, \qquad t \in \mathbb{R}^p,$$
where
$$\bar\chi_s(z) = s! \sum_{|\nu|=s} \frac{\bar\chi_\nu}{\nu!}\, z^\nu,$$
and the $\bar\chi_\nu$ are the arithmetic means of the cumulants of order $\nu$ of the distributions
$G_1,\dots,G_n$. Consequently we formally obtain the a.e.
$$\prod_{j=1}^{n} \hat G_j(t n^{-1/2}) \approx \exp\Big\{ -\tfrac12\, (K_n t,\, t) \Big\} \exp\Big\{ \sum_{s\ge1} \frac{\bar\chi_{s+2}(it)}{(s+2)!}\, n^{-s/2} \Big\}. \qquad (10.13)$$
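The practical effect of keeping the first correction term in an expansion of the type (10.13) can be seen in a scalar example. Below — an illustration under our own assumptions, not a construction from the text — the standardised sum of $n$ independent $\mathrm{Exp}(1)$ variables is an exact Gamma distribution, so no simulation error enters; the one-term expansion $\Phi(x) - \gamma_3(6\sqrt n)^{-1}(x^2-1)\varphi(x)$ (here $\gamma_3 = 2$, the third cumulant of $\mathrm{Exp}(1)$) is compared with the plain normal approximation.

```python
import math

n, gamma3 = 20, 2.0            # sum of n Exp(1) variables; skewness cumulant is 2

def gamma_cdf(y, shape):       # exact CDF of Gamma(shape, 1) for integer shape
    s, term = 0.0, 1.0
    for k in range(shape):
        if k > 0:
            term *= y / k      # term = y^k / k!
        s += term
    return 1.0 - math.exp(-y) * s

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def edgeworth(x):              # one correction term kept, as in (10.13)
    return Phi(x) - gamma3 / (6 * math.sqrt(n)) * (x * x - 1) * phi(x)

for x in (-0.5, 0.0, 0.5):
    true_F = gamma_cdf(n + math.sqrt(n) * x, n)   # P((S - n)/sqrt(n) <= x)
    print(x, abs(true_F - Phi(x)) > abs(true_F - edgeworth(x)))
```

At each of these points the corrected approximation is markedly closer to the exact CDF than the plain Gaussian one (near $x = \pm1$ the correction vanishes, so the gain disappears there).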
The first term in the a.e. (10.13) is the c.f. of the Gaussian distribution $\Phi_{K_n}$. The
function
is the Fourier transform of the function $P_s(-\varphi_{K_n};\{\bar\chi_\nu\})$ formally obtained by
substituting $(-1)^{|\nu|}\varphi_{K_n}^{(\nu)}$ in place of $(it)^\nu$ for each $\nu$ in the polynomial $P_s(it;\{\bar\chi_\nu\})$.
In other words, we have the equality
where the written form $P_s(-\nabla;\{\bar\chi_\nu\})\Phi_{K_n}$ is understood as the application of the
differential operator $P_s(-\nabla;\{\bar\chi_\nu\})$ to the function $\Phi_{K_n}$. In fact
$$\widehat{\varphi^{(\nu)}_{K_n}}(t) = (-it)^{\nu}\, \hat\varphi_{K_n}(t), \qquad t \in \mathbb{R}^p,$$
which is obtained by taking the $\nu$th derivative with respect to $x$ of both parts of
the inverse Fourier transform
$j = 1,\dots,n$.
Let
be the cumulant of order $s$ of the r.v. $\varepsilon_j$. Since the c.f. of $K_n^{-1/2}(\theta)\,\xi_{jn}(\theta)$ has the
form
then, provisionally assuming that $\varepsilon_j$ has moments of any order, we find formally
$$\log \prod_{j=1}^{n} \hat G_j\big( t n^{-1/2}, \theta \big) \qquad (10.14)$$
And so
$$\bar P_s\big( it;\ \{\bar\chi_\nu(\theta)\} \big) = \sum{}^{*} \prod_{m=1}^{s} \frac{\gamma_{m+2}^{k_m}\, B_{m+2,n}^{k_m}(t,\theta)}{k_m!\,\big((m+2)!\big)^{k_m}}, \qquad s \ge 1, \qquad (10.15)$$
where
In particular,
$$\frac{\bar\chi_3(it)}{3!}, \qquad \frac{\bar\chi_4(it)}{4!} + \frac12\,\frac{\bar\chi_3^2(it)}{(3!)^2}, \qquad s = 1,2,\dots.$$
$$H_s(z) = (-1)^s\, e^{z^2/2}\, \frac{d^s}{dz^s}\, e^{-z^2/2}, \qquad s = 0,1,2,\dots.$$
Then for $x \in \mathbb{R}^p$ and a multi-index $\mu = (\mu_1,\dots,\mu_p)$ let us set
$$H_\mu(x) = H_{\mu_1}(x^1)\cdots H_{\mu_p}(x^p).$$
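The univariate polynomials $H_s$ and their multi-index products $H_\mu$ are cheap to evaluate through the standard three-term recurrence $H_{s+1}(z) = zH_s(z) - sH_{s-1}(z)$, which is equivalent to the Rodrigues-type definition above; a minimal sketch:

```python
def hermite(s, z):
    """Probabilists' Hermite polynomial H_s(z) via the recurrence
    H_{s+1}(z) = z*H_s(z) - s*H_{s-1}(z), with H_0 = 1, H_1 = z."""
    h0, h1 = 1.0, z
    if s == 0:
        return h0
    for k in range(1, s):
        h0, h1 = h1, z * h1 - k * h0
    return h1

def hermite_multi(mu, x):
    # H_mu(x) = prod_i H_{mu_i}(x^i), as in the multi-index definition above
    out = 1.0
    for s, xi in zip(mu, x):
        out *= hermite(s, xi)
    return out

print(hermite(3, 2.0))   # H_3(z) = z^3 - 3z, so this prints 2.0
```

For instance $H_\mu$ with $\mu = (1,2)$ at $x = (2,2)$ equals $H_1(2)\,H_2(2) = 2\cdot3 = 6$.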
Clearly
Therefore
$$P_1\big( -\varphi;\ \{\chi_\nu(\theta)\} \big)(x)$$
LEMMA 24.1: Let $\mu_{k+1} < \infty$ and let the conditions IV–VII (or IV, V, VIII, IX)
be satisfied. Then for the distribution $Q_n(\theta)$ of the sum of vectors
$$n^{-1/2}\sum K_n^{-1/2}(\theta)\, w_j(\theta)\, \varepsilon_j$$
we have the a.e.
$$\sup_{\theta\in T}\ \sup_{B\in\mathcal{B}^p} \Big| \int_B Q_n(\theta)(dx)
- \int_B \Big( \varphi(x) + \sum_{r=1}^{k-2} n^{-r/2}\, P_r\big( -\varphi;\ \{\chi_\nu(\theta)\} \big)(x) \Big)\, dx \Big|
= O\big( n^{-(k-1)/2} \big). \qquad (10.23)$$
Proof: The proof consists in the verification that the conditions of Theorem A.13
are satisfied for the random vectors $\xi_{jn} = w_j(\theta)\varepsilon_j$, $j = 1,\dots,n$. However, conditions
(1) and (2) of Theorem A.13 coincide with conditions VI and VII. Therefore
only condition (3) needs to be verified. Let us remark that
Therefore
$$n^{-1}\sum |w_j(\theta)|^{k+1}
\le q^{(k+1)/2}\, \Big( \max_{\alpha} |A^{\alpha}(\theta)| \Big)^{k+1}
\Big[ n^{-1}\sum_j \Big( \sum_{|\alpha|=1} \big| g^{(\alpha)}(j,\theta) \big| \Big)^{k+1} \Big]
< \infty. \qquad (10.25)$$
$$\sup_{B\in\mathcal{B}^p} \Big| \int_B Q_n(\theta)(dx)
- \int_B \Big( \varphi(x) + \sum_{r=1}^{k-2} n^{-r/2}\, P_r\big( -\varphi;\ \{\chi_\nu(\theta)\} \big)(x) \Big)\, dx \Big|
= \sup_{A\in\mathcal{B}^p} \Big| \int_A Q_n'(\theta)(dx)\ \cdots \Big|$$
Since $\cdots$ In particular we find
$$\big( \varkappa^{(1)} + \varkappa^{(2)} \big)(x)\, \varphi_{K_n(\theta)}(x). \qquad (10.29)$$
$$\sup_{\theta\in T}\ \sup_{B\in\mathcal{B}^p} \Big| \int_B Q_n'(\theta)(dx) - \int_B \tilde Q_n(\theta,x)\, dx \Big| = O\big( n^{-(k-1)/2} \big). \qquad (10.31)$$
(10.34)
are observed, where $\theta \in \mathbb{R}^p$ is an unknown parameter, and the vectors $g_j \in \mathbb{R}^p$
satisfy the same conditions as the $w_j(\theta)$ in Lemma 24.1. If $\hat\theta_n$ is an l.s.e. of $\theta$,
obtained from the observations $X_j$, $j = 1,\dots,n$, then
Consequently Lemma 24.1 gives, in particular, the a.e. for the distribution of the
normed l.s.e. of the vector parameter of the linear regression (10.34).
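For the linear model (10.34) the statement can be made concrete: the l.s.e. solves the normal equations, and $\hat\theta_n - \theta$ is exactly a linear functional of the errors — the kind of normed sum to which Lemma 24.1 applies. A small two-parameter sketch (the design vectors $g_j$ and the noise law are illustrative assumptions):

```python
import random

random.seed(3)
n, theta = 500, (1.0, -0.5)
g = [(1.0, (j + 1) / n) for j in range(n)]      # design vectors g_j in R^2
eps = [random.gauss(0, 1) for _ in range(n)]
X = [theta[0] * a + theta[1] * b + e for (a, b), e in zip(g, eps)]

# normal equations: (sum g_j g_j') theta_hat = sum g_j X_j
A = [[sum(gj[r] * gj[c] for gj in g) for c in range(2)] for r in range(2)]
y = [sum(gj[r] * Xj for gj, Xj in zip(g, X)) for r in range(2)]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
th = ((A[1][1] * y[0] - A[0][1] * y[1]) / det,
      (A[0][0] * y[1] - A[0][1] * y[0]) / det)

# identity: theta_hat - theta = A^{-1} * (sum g_j eps_j), exactly
r = [sum(gj[k] * e for gj, e in zip(g, eps)) for k in range(2)]
lhs = (th[0] - theta[0], th[1] - theta[1])
rhs = ((A[1][1] * r[0] - A[0][1] * r[1]) / det,
       (A[0][0] * r[1] - A[0][1] * r[0]) / det)
print(max(abs(lhs[k] - rhs[k]) for k in range(2)) < 1e-9)
```

Since $\hat\theta_n - \theta$ is exactly $\bigl(\sum g_j g_j'\bigr)^{-1}\sum g_j\varepsilon_j$ here, the a.e. of Lemma 24.1 transfers to the estimator without the remainder analysis needed in the nonlinear case.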
where the $h_r^i(\theta,x)$ are polynomials in $x = (x^1,\dots,x^p)$ with coefficients uniformly
bounded in $n$ and $\theta \in T$. Then if the conditions (10.1), V, and VI are satisfied,
where
(10.37)
where $P_r(\theta,y)$, $r = 1,\dots,k-2$, are polynomials in $y = (y^1,\dots,y^p)$ with coefficients
uniformly bounded in $n$ and $\theta \in T$.
Proof: It is easy to see that the polynomials $P_r(\theta,y)$ are defined from the expansion
of the functions
(10.39)
with
thanks to the bounds for the diagonal elements $K^{ii}(\theta)$ of the matrix $K_n(\theta)$ obtained
above. Consequently for $\theta \in T$
(10.40)
Therefore
(10.41)
Let us set
(10.42)
The first terms of the expansion (10.42) can be obtained in the following way.
Let us formally write
$r \ge 1$
where
where
or
$$0 = \Delta y + \sum_{r=1}^{k-2} n^{-r/2}\, h_r(y + \Delta y). \qquad (10.43)$$
(10.44)
where
$$\big( h^i \big)_j = \frac{\partial h^i}{\partial y^j}, \qquad \cdots \le R(\theta,y)\, \varphi_{K_n(\theta)}(y)$$
For the proof of (10.45) it is sufficient to write the expansion of the quantity
$$P_2 + \sum_{i=1}^{p} \big( h_i Q_1 + ( h_i P_1 + (P_1)_i )\ \cdots \big) \qquad (10.47)$$
where
Since the functions $R_n^i(\theta,y)$, $i = 1,\dots,p$, are bounded by polynomials, for
$n > n_0$
Consequently
(10.51)
where $\delta_{ij}$ is the Kronecker symbol. In this way, for $y \in F_n$ and $n > n_0$
$$\frac{\partial J_n^{-1}(y)}{\partial y} > 0.$$
We find the expansion of the Jacobian $\partial J_n^{-1}(y)/\partial y$ in powers of $n^{-1/2}$. The relation
(10.42) shows that there exist polynomials $Q_i(\theta,y)$, $i = 0,1,\dots,k-2$, with
coefficients uniformly bounded in $n$ and $\theta \in T$ such that for $y \in F_n$
The first polynomials of the expansion (10.52) can be found starting from the
following considerations. The polynomial $Q_1$ is the sum of polynomials of order
$n^{-1/2}$ of the diagonal elements of the Jacobian matrix of the mapping $J_n^{-1}(y)$, i.e.,
(10.53)
(10.54)
The expansions (10.51) and (10.52) show that there exists a polynomial $P_0(\theta,y)$,
with coefficients that are uniformly bounded with respect to $\theta$ and $n$, for which
(see formula (10.38))
$$n^{(k-1)/2}\, \big| Q_n\big(\theta, J_n^{-1}(y)\big) - \tilde Q_n(\theta,y) \big|$$
with
$$\tilde P_r = \sum_{v=0}^{r} P_v\, \tilde Q_{r-v}, \qquad r = 1,\dots,k-2,$$
if we adopt
In particular,
We further find
$$\tilde P_2 = \tilde Q_2 + P_1 \tilde Q_1 + P_2
= P_2\, \varphi_{K_n} - \sum_{i=1}^{p} \big( P_1 h_i\, \varphi_{K_n} + h_i'\, \varphi_{K_n} \big)_i
+ \tfrac12 \sum_{i,j} \big( h_i h_j\, \varphi_{K_n} \big)_{ij}. \qquad (10.58)$$
Thanks to the bounds obtained in the course of the proof of the Lemma
$$\sup_{\theta\in T}\ \sup_{B\in\mathcal{B}^p} \Big| \int_{v^c(\log n)\cap B} \tilde Q_n(\theta,x)\, dx - \int_{F_n\cap B} \tilde Q_n(\theta,y)\, dy \Big|$$
$$= \sup_{\theta\in T}\ \sup_{B\in\mathcal{B}^p} \Big| \int_{F_n\cap B} \tilde Q_n\big(\theta, J_n^{-1}(y)\big)\, \Big| \frac{\partial J_n^{-1}(y)}{\partial y} \Big|\, dy - \int_{F_n\cap B} \tilde Q_n(\theta,y)\, dy \Big|
= O\big( n^{-(k-1)/2} \big). \qquad (10.59)$$
The relation (10.36) is now a consequence of (10.41), the inclusion
$$\big( \mathbb{R}^p \setminus F_n \big) \subset \Big( \mathbb{R}^p \setminus v^c\big( \tfrac12 \log n \big) \Big),$$
and the bound of the form (10.41) for the a.e. $\tilde Q_n(\theta,y)$.
The following Lemma is a sharpening of Theorem 18 of Section 7, useful in the
proof of Theorem 24.
LEMMA 24.3: Let conditions $I_{k+1}$, II–V be satisfied, and let the l.s.e. $\hat\theta_n$ have the
following property: for any $r > 0$
$$\sup_{\theta\in T} P_\theta\big\{ |\hat\theta_n - \theta| \ge r \big\} = o\big( n^{-(k-1)/2} \big).$$
where the $h_\nu(\theta)$ are vectors whose coordinates are polynomials in the coordinates
of the vectors $V_{\nu+1}(\theta)$, $\nu = 0,\dots,k-2$, with coefficients that are uniformly bounded
in $\theta \in T$ and $n$. Then
(10.63)
$$h_0(\theta) = V_1(\theta).$$
Let the assertion of the Lemma hold for $h_i(\theta)$, $i = 0,\dots,l-1$. Let us substitute
$$u^{(l+1)}(t) = u^{(l)}(t) + t^{(l+1)} h_l$$
(see (7.18)) into the equality (7.17):
$$\mathcal{L}\big( u^{(l+1)}(t),\, t \big)
= -\,2t\, B(0;\theta) + 2 I(\theta)\big( u^{(l)}(t) + t^{(l+1)} h_l \big)
- 2t\, B^{(2)}(\theta)\big( u^{(l)}(t) + t^{l+1} h_l \big)
+ \sum_{|\alpha|\ge2} \frac{1}{\alpha!}\big( A(\alpha,\theta) - 2t\, B(\alpha,\theta) \big)\big( u^{(l)}(t) + t^{l+1} h_l \big)^{\alpha}. \qquad (10.64)$$
$$\cdots + \sum_{2\le|\alpha|\le l+1} \frac{1}{\alpha!}\, A(\alpha,\theta)\, Q_\alpha^{[A]}
+ \sum_{2\le|\alpha|\le l} \frac{1}{\alpha!}\, B(\alpha,\theta)\, Q_\alpha^{[B]} = 0, \qquad (10.65)$$
where $Q_\alpha^{[A]}$ and $Q_\alpha^{[B]}$ are polynomials in the coordinates of the vectors $h_0,\dots,h_{l-1}$.
The statement about the $h_l$ is justified by the induction hypothesis and by the
presence in the expression for the $h_l$ in (10.65) of the vector
$$-\,\tfrac12 \sum_{|\alpha|=1} A(\theta)\, B(\alpha,\theta)\, Q_\alpha^{[B]}.$$
It is also easy to establish the uniform boundedness of the coefficients of the
polynomials $h_\nu(\theta)$ by induction, based on the equality (10.65). The
identity of the polynomials (7.35)–(7.37) and (10.61)–(10.63) is verified immediately.
Proof of Theorem 24: Let us consider the distribution of the sum of random vectors
$n^{-1/2}\sum w_j(\theta)\varepsilon_j$,
$$Q_n'(\theta)(B) = \big( P_\theta \circ V_{k-1}(\theta) \big)(B),$$
where $B \in \mathcal{B}^p$, and the mapping (10.35)
By Lemma 24.2
$$\sup_{\theta\in T}\ \sup_{B\in\mathcal{B}^p} \Big| \int_B \big( Q_n'(\theta) \circ f_n(\cdot\,;\theta) \big)(dx) - \int_B \tilde Q_n(\theta,y)\, dy \Big|
= O\big( n^{-(k-1)/2} \big), \qquad (10.67)$$
where the first polynomials $\tilde P_1(\theta)$ and $\tilde P_2(\theta)$ of the expansion $\tilde Q_n(\theta,y)$ are given
by the equalities (10.57) and (10.58).
By Lemma 24.3 there exists a constant $c_*$ and a vector function
such that
$$\cdots = o\big( n^{-(k-1)/2} \big), \qquad
H_n(x;\theta) = \sum_{\nu=0}^{k-2} h_\nu(x,\theta)\, n^{-\nu/2}. \qquad (10.68)$$
Let us set
$$x = c_*\, n^{-(k-1)/2} \log^{k/2} n.$$
Then from (10.68) it follows that
Then from Lemma 24.2, (10.67), (10.69) and (10.70) it follows that
(10.74)
Then [6]
(10.75)
The positive definiteness of the matrix $S$ follows, for example, from the equality
$$M_r(\theta,u) = \int_{\mathbb{R}^{p-q}} \varphi_{\mu_2 S(\theta)}\big( z - \Sigma_{21} I(\theta)\, u \big)\, P_r(\theta,y)\, dz, \qquad r = 1,\dots,k-2. \qquad (10.77)$$
The functions $M_r(\theta,u)$ are polynomials in $u$; the degree of $M_r$ coincides with the
degree of $P_r$, which is equal to $3r$, and the coefficients of $M_r$ are uniformly bounded
in $\theta \in T$ and $n$.
It is easy to be persuaded of the existence of constants $a = a(T) < \infty$ and
$b = b(T) > 0$ such that for $r = 1,\dots,k-2$
$$\sup_{\theta\in T}\ \sup_{C\in\mathfrak{C}^q} \int_{C^x\setminus C} \varphi_{\mu_2 A(\theta)}(u)\, |M_r(\theta,u)|\, du
\le a\, \sup_{C\in\mathfrak{C}^q} \int_{C^x\setminus C} \varphi_{b I_q}(u)\, du. \qquad (10.78)$$
Applying Theorem A.11 to the function $\varphi_{b I_q}(u)$ we find that the right-hand side
of (10.78) is of order
$$O(x) = O\big( n^{-(k-1)/2} \log^{k/2} n \big).$$
Consequently
The opposite inequality, with the same uniform bound for the remainder term as
in (10.79), can be obtained starting from the inequality (10.73).
(10.80)
This Section is closely related to the preceding one. Using the notations introduced
earlier, we shall not indicate dependence upon $n$ and $\theta$ in the formulae.
11. FIRST POLYNOMIALS OF AN ASYMPTOTIC EXPANSION 145
$$\int_{\mathbb{R}^{p-q}} z\, \varphi_{\mu_2 S}\big( z - \Sigma_{21} I u \big)\, dz = \Sigma_{21} I u, \qquad (11.1)$$
$$\int_{\mathbb{R}^{p-q}} z z'\, \varphi_{\mu_2 S}\big( z - \Sigma_{21} I u \big)\, dz
= \mu_2 \Sigma_{22} - \mu_2 \Sigma_{21} I \Sigma_{12} + \Sigma_{21} I u u' I \Sigma_{12}, \qquad (11.2)$$
$$\int_{\mathbb{R}^{p-q}} y^{iq+j}\, \varphi_{\mu_2 S}\big( z - \Sigma_{21} I u \big)\, dz = A^{ir}\, \Pi_{(rj)(\alpha)}\, u^{\alpha}. \qquad (11.3)$$
For
let the indices $i, j, s = 1,\dots,q$ be chosen so that the $t$th coordinate of the vector
$V_{k-1}$ is
$$V_t^{k-1} = A^{ir}\, b_{rjs}.$$
Then
(11.4)
$$\int_{\mathbb{R}^{p-q}} y^{iq+j}\, y^{lq+m}\, \varphi_{\mu_2 S}\big( z - \Sigma_{21} I u \big)\, dz$$
$$h_1^i(y) = \begin{cases} y^{iq+\alpha}\, u^{\alpha} - \tfrac14\, A^{ir}\, a_{r\alpha\beta}\, u^{\alpha} u^{\beta}, & i = 1,\dots,q, \\ 0, & i = q+1,\dots,p. \end{cases} \qquad (11.7)$$
Let us further note that from (11.3) and (11.7) it follows that
$$\sum_i \int_{\mathbb{R}^{p-q}} \big( h_1^i(y)\, \varphi_K(y) \big)_i\, dz
= -\,\big( A^{ir}\, \Pi_{(r)(\alpha\beta)}\, u^{\alpha} u^{\beta}\, \varphi_{\mu_2 A}(u) \big)_i
= \Big( -\,A^{\beta r}\, \Pi_{(r)(\alpha\beta)}\, u^{\alpha} + \frac{1}{2\mu_2}\, \Pi_{(\alpha\beta)(\gamma)}\, u^{\alpha} u^{\beta} u^{\gamma} \Big)\, \varphi_{\mu_2 A}(u). \qquad (11.8)$$
$$\int_{\mathbb{R}^{p-q}} P_1(y)\, \varphi_K(y)\, dz
= -\,\frac{\gamma_3}{6}\, A^{i\alpha} A^{j\beta} A^{l\gamma}\, \Pi_{(\alpha)(\beta)(\gamma)}\, \big( \varphi_{\mu_2 A}(u) \big)_{ijl}
= \frac{\gamma_3}{6\mu_2^3}\, \Pi_{(\alpha)(\beta)(\gamma)}\, u^{\gamma}\big( u^{\alpha} u^{\beta} - 3\mu_2 A^{\alpha\beta} \big)\, \varphi_{\mu_2 A}(u). \qquad (11.9)$$
Combining the equalities (11.8) and (11.9), from (11.6) we find
$$M_1(u) = \Big( \frac{\gamma_3}{6\mu_2^3}\, \Pi_{(\alpha)(\beta)(\gamma)} - \frac{1}{2\mu_2}\, \Pi_{(\alpha\beta)(\gamma)} \Big)\, u^{\alpha} u^{\beta} u^{\gamma}\ \cdots \qquad (11.10)$$
$$= \int_{\mathbb{R}^{p-q}} P_2(y)\, \varphi_K(y)\, dz
+ \sum_{j=1}^{q} \frac{\partial}{\partial u^j} \Big[ \frac12 \sum_{i=1}^{q} \frac{\partial}{\partial u^i} \int_{\mathbb{R}^{p-q}} h_1^i(y)\, h_1^j(y)\, \varphi_K(y)\, dz\ \cdots \Big]$$
$$\int_{\mathbb{R}^{p-q}} P_2(y)\, \varphi_K(y)\, dz
= \frac{\gamma_4}{24}\, A^{i\alpha} A^{j\beta} A^{k\gamma} A^{l\delta}\, \Pi_{(\alpha)(\beta)(\gamma)(\delta)}\, \big( \varphi_{\mu_2 A}(u) \big)_{ijkl}
+ \frac{\gamma_3^2}{72}\, A^{i\alpha} A^{j\beta} A^{k\gamma} A^{l\delta} A^{m\varepsilon} A^{r\nu}\, \Pi_{(\alpha)(\beta)(\gamma)}\, \Pi_{(\delta)(\varepsilon)(\nu)}\, \big( \varphi_{\mu_2 A}(u) \big)_{ijklmr}$$
$$\cdots + \frac{\gamma_3^2}{4\mu_2^4}\, A^{ij} A^{kl}\, \Pi_{(i)(k)(\alpha)}\, \Pi_{(j)(l)(\beta)}\, u^{\alpha} u^{\beta}
+ \frac{\gamma_3^2}{8\mu_2^4}\, A^{ij} A^{kl}\, \Pi_{(i)(j)(\alpha)}\, \Pi_{(k)(l)(\beta)}\, u^{\alpha} u^{\beta}
+ \Big( \frac{\gamma_4}{8\mu_2^2}\, A^{ij} A^{kl}\, \Pi_{(i)(j)(k)(l)}
- \frac{\gamma_3^2}{12\mu_2^3}\, A^{ij} A^{kl} A^{mr}\, \Pi_{(i)(k)(m)}\, \Pi_{(j)(l)(r)}
- \frac{\gamma_3^2}{8\mu_2^3}\, A^{ij} A^{kl} A^{mr}\, \Pi_{(i)(j)(k)}\, \Pi_{(l)(m)(r)} \Big) \Big]. \qquad (11.12)$$
(11.13)
$$= \Big[ \frac{1}{8\mu_2^2}\, \Pi_{(\alpha\beta)(\gamma)}\, \Pi_{(\delta\varepsilon)(\nu)}\, u^{\alpha} u^{\beta} u^{\gamma} u^{\delta} u^{\varepsilon} u^{\nu}
+ \Big( -\,\frac{5}{8}\, A^{rk}\, \Pi_{(r)(\alpha\beta)}\, \Pi_{(k)(\gamma\delta)} + \frac12\, \Pi_{(\alpha\beta)(\gamma\delta)} \Big)\, \frac{y^{\alpha} y^{\beta} y^{\gamma} y^{\delta}}{\mu_2}
+ \Big( -\,\tfrac12\, A^{rk}\, \Pi_{(r\alpha)(k\beta)} - A^{rk}\, \Pi_{(rk)(\alpha\beta)}\ \cdots \qquad (11.15)$$
$$v^{j}_{\alpha\beta} = A^{jr}\, b_{r\alpha\beta}, \qquad t = q^2 + q + 1,\ \dots,\ \tfrac12\, q(q+1)(q+2).$$
In accordance with (11.4)
(11.16)
(11.17)
And so
$$\cdots = -\,A^{jr}\Big( \tfrac12\, \Pi_{(r\alpha)(\beta\gamma)} + \tfrac16\, \Pi_{(\alpha\beta\gamma)(r)} \Big)\, u^{\alpha} u^{\beta} u^{\gamma}\, \varphi_{\mu_2 A}(u). \qquad (11.18)$$
$$-\,\tfrac12\, A^{jr}\, a_{r\alpha i}\, u^{\alpha} u^{\beta} \int_{\mathbb{R}^{p-q}} y^{iq+\beta}\, \varphi_K(y)\, dz
+ u^{\alpha} \int_{\mathbb{R}^{p-q}} y^{iq+\alpha}\, y^{jq+i}\, \varphi_K(y)\, dz.$$
Effecting the calculation of the integrals in (11.19) by the formulae (11.3) and
(11.5) and collecting similar terms, we obtain
$$\Big[ \Big( -\,\frac{1}{6\mu_2}\, \Pi_{(\alpha)(\beta\gamma\delta)}
+ \frac{1}{\mu_2}\, A^{rs}\, \Pi_{(r)(\alpha\beta)}\, \Pi_{(s)(\gamma\delta)}
- \frac{1}{2\mu_2}\, \Pi_{(\alpha\beta)(\gamma\delta)} \Big)\, u^{\alpha} u^{\beta} u^{\gamma} u^{\delta}\ \cdots$$
Let us consider
The first term on the right-hand side of (11.22) is calculated by formula (11.9).
Resorting to integration by parts we obtain
(11.23)
From the equalities (11.3), (11.9) and (11.23) we find after differentiation
$$\cdots + \frac{\gamma_3}{\mu_2^2}\, A^{ir}\Big( \tfrac12\, \Pi_{(r\alpha)(\beta)(\gamma)} + \tfrac14\, A^{is}\, \Pi_{(r)(\alpha\beta)}\, \Pi_{(i)(s)(\gamma)}\ \cdots$$
$$-\,\sum_i \frac{\partial}{\partial u^i} \int_{\mathbb{R}^{p-q}} h_1^i(y)\, P_1(y)\, \varphi_K(y)\, dz
= \Big[ -\,\frac{\gamma_3}{12\mu_2^3}\, \Pi_{(\alpha)(\beta)(\gamma)}\, \Pi_{(\delta\varepsilon)(\nu)}\, u^{\alpha} u^{\beta} u^{\gamma} u^{\delta} u^{\varepsilon} u^{\nu}
+ \Big( \frac{\gamma_3}{2\mu_2^2}\, \Pi_{(\alpha)(\beta)(\gamma\delta)}
+ \frac{\gamma_3}{\mu_2^2}\, \Pi_{(\alpha)(\beta)(\gamma)}\, \Pi_{(\delta i)(r)} \Big)\, u^{\alpha} u^{\beta} u^{\gamma} u^{\delta}\ \cdots$$
From (11.11), (11.12), (11.14), (11.21) and (11.25), after collecting similar terms
in the expressions (11.14) and (11.21), we obtain, at last, the expression for the
polynomial $M_2(u)$:
$$\cdots + \Big[ \frac{\gamma_4}{24\mu_2^4}\, \Pi_{(\alpha)(\beta)(\gamma)(\delta)}
+ \frac{\gamma_3}{2\mu_2^3}\, \Pi_{(\alpha)(\beta)(\gamma\delta)}
- \frac{1}{6\mu_2}\, \Pi_{(\alpha\beta\gamma)(\delta)}
+ \frac{\gamma_3}{6\mu_2^3}\, \Pi_{(\alpha)(\beta)(\gamma)}\, \Pi_{(\delta j)(r)}
- \frac{1}{8\mu_2}\, \Pi_{(\alpha\beta)(j)}\, \Pi_{(\gamma\delta)(r)}
+ \frac{\gamma_3^2}{4\mu_2^4}\, \Pi_{(\alpha)(i)(j)}\, \Pi_{(\beta)(s)(r)}
+ \frac{\gamma_3}{4\mu_2^2}\, \Pi_{(i)(s)(j)}\, \Pi_{(\alpha\beta)(r)}
+ \frac{\gamma_3}{\mu_2^2}\, \Pi_{(\alpha)(i)(j)}\, \Pi_{(\beta s)(r)}
- \frac{\gamma_3}{2\mu_2^2}\, \Pi_{(\alpha)(i)(s)}\, \Pi_{(\beta j)(r)}\ \cdots$$
$$\cdots + \frac{1}{2\mu_2}\, \Pi_{(kl)(is)} - \frac{1}{2\mu_2}\, \Pi_{(ki)(ls)} \Big)
+ A^{kl} A^{is} A^{jr}\Big( -\,\frac{\gamma_3^2}{8\mu_2^3}\, \Pi_{(k)(l)(i)}\, \Pi_{(s)(j)(r)}
- \frac{\gamma_3^2}{12\mu_2^3}\, \Pi_{(k)(i)(j)}\, \Pi_{(l)(s)(r)} \Big)\ \cdots \qquad (11.26)$$
The polynomial $M_2(u)$ contains 40 terms, each of which is in turn a sum. For a
symmetric r.v. $\varepsilon_j$ the cumulant $\gamma_3 = 0$, and the written form of the polynomials
$M_1(u)$ and $M_2(u)$ becomes less cumbersome; for example, in $M_2(u)$ there remain
18 terms. For a Gaussian r.v. $\varepsilon_j$, $\gamma_3 = 0$ and $\gamma_4 = 3\mu_2^2$.
From the expressions (11.10) and (11.26) it is easy to obtain the form of the
polynomials $M_1(u)$ and $M_2(u)$ for $q = 1$. For this it is sufficient to note that
$$n^{-1}\sum g''(j,\theta)\, g'(j,\theta),$$
Then for $q = 1$
$$\cdots + A\Big( -\,\frac{5\gamma_3^2}{24\mu_2^3}\, \big( \Pi_{111} \big)^2
+ \frac{\gamma_3}{6\mu_2^2}\, \Pi_{111}\, \Pi_{112}
- \frac{5}{8\mu_2}\, \Pi_{112}^2 \Big)\Big]\, u^4 \qquad (11.28)$$
Chapter 3
In this Chapter we find the a.e.-s of the moments of the l.s.e. $\hat\theta_n$ and the a.e.-s of the
distributions of a series of functionals of the l.s.e. used in mathematical statistics.
In this Chapter the assumptions of Chapter 2 about the smoothness of the regression
functions $g(j,\theta)$ are kept: for each $j$ there exist derivatives with respect to the
variables $\theta = (\theta^1,\dots,\theta^q)$ up to some order $k \ge 4$ inclusive that are continuous in
$\Theta^c$, where $\Theta \subset \mathbb{R}^q$ is an open convex set. The assumption of Section 10 about the
normalisation $n^{1/2} I_q$ instead of $d_n(\theta)$ is also used.
This Section contains the a.e. of mixed moments of the coordinates of the normed
l.s.e. $\hat\theta_n$ as $n \to \infty$. In particular, the first terms of the a.e.-s of the bias vector and
correlation matrix of $\hat\theta_n$ are indicated.
Let
$$m \ge \max(3, k).$$
We shall assume that the l.s.e. $\hat\theta_n$ has the property (3.4):
$$\sup_{\theta\in T} P_\theta^n\big\{ n^{1/2}|\hat\theta_n - \theta| \ge H \big\} \le c\, H^{-m}. \qquad (12.1)$$
LEMMA 25.1: Let the conditions II, III, V of Section 10, IV$_1$, and $\mu_{m+\Delta} < \infty$
for some $\Delta > 0$ (the condition $I_{m+\Delta}$) be satisfied. Then if the l.s.e. $\hat\theta_n$ satisfies
the relation (12.1), then for some $c_* > 0$
$$\sup_{\theta\in T} P_\theta^n\Big\{ \Big| n^{1/2}(\hat\theta_n - \theta) - \sum_{\nu=0}^{k-2} n^{-\nu/2}\, h_\nu(\theta) \Big| \ge c_*\, n^{-(k-1)/2}\log^{k/2} n \Big\}$$
LEMMA 25.2: Let the conditions III of Section 10, IV$_1$, and $\mu_m < \infty$ be satisfied.
Then for the r.v.
where $\varkappa_T < \infty$ is one and the same constant for any sequence
$$a_n \ge (m - 2 + \delta)^{1/2}\log^{1/2} n,$$
in which $\delta > 0$ is an arbitrary fixed number.
Proof: The Lemma is a rephrasing of Theorem A.5 for the r.v.-s
$$\xi_{jn} = \varepsilon_j\, g^{(\alpha)}(j,\theta).$$
For
let us assume
$$\langle h_\nu(\theta),\, \lambda \rangle = h_\nu(\lambda), \qquad \nu = 0,\dots,k-2,$$
and introduce the set of matrices of dimension $(l+1)\times q$ with non-negative integer coefficients
LEMMA 25.3: Let the conditions of Lemma 25.1 be satisfied, and let $s \ge 1$ be an
integer. Then
(1) $\displaystyle \hat\theta_n^{\,s}(\lambda) = \sum_{l=0}^{k-2} h_{l,s}(\lambda)\, n^{-l/2} + h_{k-1,s}(\lambda)\, n^{-(k-1)/2}$;
(2) $\displaystyle h_{l,s}(\lambda) = \sum s!\, \prod_{\nu=0}^{l} \frac{1}{d_\nu!}\, h_\nu^{d_\nu}(\lambda), \qquad l = 0,1,\dots,k-2,$
$$h_{l,r} = \sum_{\Lambda_{ls}(r)} s!\, \prod_{\nu=0}^{l} \prod_{j=1}^{q} \frac{\big( h_\nu^j \big)^{d_{\nu j}}}{d_{\nu j}!}, \qquad l = 0,\dots,k-2,$$
where $\sum_{\Lambda_{ls}(r)}$ is the sum over the set of matrices $\Lambda_{ls}(r)$;
(3) the coefficients $h_{k-1,r}(\theta)$, $|r| = s$, of the polynomial $h_{k-1,s}(\lambda)$ have the
following property: a number $b > 0$ can be found such that for some constant
$c_1 = c_1(T) < \infty$ there holds
$$\sup_{\theta\in T} P_\theta^n\Big\{ \max_{|r|=s} |h_{k-1,r}(\theta)| \ge c_1 \log^b n \Big\}
= O\big( n^{-(m-2)/2} \log^{-m/2} n \big). \qquad (12.6)$$
158 CHAPTER 3. ASYMPTOTIC EXPANSIONS RELATED TO THE LSE
Proof: The proof of (1) is evident; (3) follows from Lemma 25.2 and equality
(12.4). The assertion (2) follows from the equality
$$B_l = \Big\{ (d_{\nu j}) : \sum_{j=1}^{q} d_{\nu j} = i_\nu,\ \nu = 0,\dots,l \Big\}.$$
Let us denote by $M_{l+s}$, $l = 0,\dots,k-2$, the set of integer-valued vectors $\mu$
with coordinates $\mu_\alpha \ge 0$, $|\alpha| = 1,\dots,l+1$, for which
$$\sum_{|\alpha|=1}^{l+1} \mu_\alpha = l + s.$$
The assertions (1) and (2) of the preceding Lemma show that for the coefficients
$h_{l,r}$, $|r| = s$, of the degree $\lambda^r = \lambda_1^{r_1}\cdots\lambda_q^{r_q}$ of the polynomial $h_{l,s}(\lambda)$ we have the
representation
$$h_{l,r} = \sum_{\mu\in M_{l+s}} c_{\mu,r}(\theta) \prod_{|\alpha|=1}^{l+1} b^{\mu_\alpha}(\alpha;\theta), \qquad (12.7)$$
where the coefficients $c_{\mu,r}(\theta)$ are uniformly bounded in $\theta \in T$ and $n$, and some of
which may possibly be zero. Indeed, the quantities $h_\nu^j$ are polynomials of degree
$\nu + 1$ in the variables $b(\alpha;\theta)$, $|\alpha| \le \nu + 1$, and
$$\sum_{\nu=0}^{l}\sum_{j=1}^{q} (\nu+1)\, d_{\nu j} = l + s$$
Let us introduce the following sets of matrices with non-negative integral elements
12. ASYMPTOTIC EXPANSION OF LSE MOMENTS 159
$t \le p$.
$$G_{\mu,n}(\theta;\, t, l, s) =$$
where the $\Phi$-quantities are polynomials in the moments of the r.v.-s $\varepsilon_j$ and the arithmetic
means over all indices $i = 1,\dots,n$ of the products of different powers of different
partial derivatives of the functions $g(i,\theta)$ (of the types, introduced earlier, of the
quantities $\Pi_{(i_1)(i_2)(i_3)}$, $\Pi_{(i_1)(i_2 i_3)}$, etc.).
Proof: The quantity
$$n^{-1}\sum_i \prod_{|\alpha|=1}^{l+1} \big( g^{(\alpha)}(i,\theta) \big)^{\mu_\alpha}\ \cdots$$
is a polynomial of the stated form. Therefore for $G_{\mu,n}(\theta;1,l,s)$ the assertion of
the Lemma is justified. Let us consider the case $t > 1$. Let
$$D_t^{(n)} = \{ 1 \le i_\varepsilon \ne i_\delta \le n,\ \varepsilon \ne \delta;\ \varepsilon,\delta = 1,\dots,t \}$$
a set of indices. Then
where $\delta_{\varkappa\varepsilon}$ is the Kronecker symbol. If $t > 2$ let us write the analogous equation
for the sums over the sets of indices $D_{t-1}^{(n)}$, etc. In the result we obtain the required
representation for the sum corresponding to the matrix $(\varkappa_{\alpha i})$.
Let us set
$$\bar w^{(p)}_{r,n}(\theta;\, t, l, s) =$$
where the $c_{\mu,r}(\theta)$ are quantities bounded uniformly in $\theta \in T$ and $n$. Let us agree
that an empty sum equals zero. Let us also introduce the set of indices
with, moreover,
(1) The coefficients $h^{*}_{l,r}(\theta)$ of the polynomial $h^{*}_{l,s}(\lambda)$ of the degree
$$\lambda^r = \lambda_1^{r_1}\cdots\lambda_q^{r_q}, \qquad |r| = s,$$
have the following property:
$$\max_{|r|=s}\ \sup_{\theta\in T} |h^{*}_{l,r}(\theta)| = o\big( n^{-(k-2)/2} \big),$$
for which
(a) $s \le m - 1$ for $k = 2$ and $m \ge 3$,
(b) $s \le m - k + 1$ for $k > 2$ and $m \ge 2k - 2$,
(c) $s \le m - k$ for $k > 2$ and $m < 2k - 2$;
(2) $\displaystyle E_\theta\, h_{l,r}(\theta) = \sum_{t=1}^{[\frac{l+s}{2}]}\ \sum_{u=1}^{t} \bar w^{(t-u)}_{r,n}(\theta;\, t, l, s)\, n^{-((l+s)/2)+u}, \qquad l = 0,\dots,k-2; \qquad (12.9)$
(3)
$$\frac{s!}{r_1!\cdots r_q!}\, n^{s/2}\, E_\theta \prod_{i=1}^{q} \big( \hat\theta_n^i - \theta^i \big)^{r_i}
= \sum_{p=0}^{k-2} n^{-p/2} \Big( \sum_{B_p(k,s)}\ \sum_{t,u} \bar w^{(t-u)}_{r,n}(\theta;\, t, l, s) \Big) + h^{**}_{n,r}(\theta), \qquad (12.10)$$
where the function $h^{**}_{n,r}(\theta)$ has the property of the coefficient $h^{*}_{l,r}(\theta)$, and the
coefficients of $n^{-p/2}$ in (12.10) are uniformly bounded in $\theta \in T$ and $n$. If $s$ is an
even number then the sum in (12.10) is carried out over even $p$, and if $s$ is odd
then over odd $p$.
Proof: For $\theta \in T$ let us introduce the event
$$W_n(\theta) = \Big\{ \max_{|r|=s} |h_{k-1,r}(\theta)| < c_1 \log^b n \Big\},$$
where $c_1$ and $b$ are the constants from (12.6). Let $\chi\{A\}$ be the indicator function
of the event $A$, and $\bar\chi = 1 - \chi$.
From Lemma 25.3 it follows that
$$E_\theta\, \chi\{W_n(\theta)\}\, \hat\theta_n^{\,s}(\lambda)
= \sum_{l=0}^{k-2} E_\theta\, \chi\{W_n(\theta)\}\, h_{l,s}(\lambda)\, n^{-l/2} + h^{(1)}_{n,s}(\lambda), \qquad (12.11)$$
where $h^{(1)}_{n,s}(\lambda)$ is a polynomial all of whose coefficients are quantities that are
$O\big( n^{-(k-1)/2}\log^b n \big)$ uniformly in $\theta \in T$.
Let us estimate the mathematical expectation $E_\theta\, \chi\{W_n(\theta)\}\, h_{l,s}(\lambda)$, $l = 0,\dots,k-2$.
Let $l$ be fixed. The equality (12.7) shows that it is sufficient to estimate
quantities of the form
$$E_\theta\, \chi\{W_n(\theta)\} \prod_{|\alpha|=1}^{l+1} \big( b(\alpha;\theta) \big)^{\mu_\alpha}, \qquad \mu \in M_{l+s}.$$
Since
$$W^{(\alpha)}_{jn}(\theta) = \big\{ \gamma_{j-1,n} \le |b(\alpha;\theta)|\, \mu_2^{-1/2}\, n\, d_n^{-1}(\alpha;\theta) < \gamma_{jn} \big\}, \qquad j = 1,2,\dots.$$
Clearly
(12.13)
$$\sum_{j=1}^{\infty} E_\theta\, \bar\chi\{W_n(\theta)\}\, \chi\{W^{(\alpha)}_{jn}(\theta)\}\, |b(\alpha;\theta)|^{l+s}
= O\big( n^{-(m-2)/2} (\log n)^{(l+s-m)/2} \big), \qquad (12.14)$$
$$\sum_{j=1}^{\infty} E_\theta\, \chi\{W^{(\alpha)}_{jn}(\theta)\}\, |b(\alpha;\theta)|^{l+s}
\le c_2(T)\Big( \sum_{j=1}^{\infty} \gamma_j^{(l+s)-(j-1)m} \Big)\, n^{-(m-2)/2} (\log n)^{(l+s-m)/2}. \qquad (12.15)$$
(12.16)
$$E_\theta\, |\hat\theta_n - \theta|^{s} + \sum_{j=1}^{\infty} n^{s/2}\, E_\theta\, \chi\{W_{jn}(\theta)\}\, |\hat\theta_n - \theta|^{s}\ \cdots$$
$$H = 2^{\,j-1}\, n^{1/2-\beta} \log^{1/2} n.$$
For the majorisation of the right-hand side of the relations (12.17) and (12.18)
by a quantity $O\big( n^{-(k-2)/2}(\log n)^{-(m-s)/2} \big)$ it is sufficient to take an integer $s$ and
a number $\beta \in (0, \tfrac12)$
such that
$$1 \le \frac{2(m-k)}{m-2} < 2,$$
i.e., $k > 2$ and $m \ge 2k - 2$. Then
$$s \le m - k + 1.$$
Finally, let
$$\frac{2(m-k)}{m-2} < 1,$$
i.e.,
$$m < 2k - 2.$$
In this case
$$s \le m - k,$$
and the assertion (1) is proved.
Let us fix $\mu \in M_{l+s}$. We obtain successively
$$E_\theta \prod_{|\alpha|=1}^{l+1} \big( b(\alpha;\theta) \big)^{\mu_\alpha}
= n^{-(l+s)/2}\, E_\theta \prod_{|\alpha|=1}^{l+1} \Big( \sum_j g^{(\alpha)}(j,\theta)\, \varepsilon_j \Big)^{\mu_\alpha}$$
$$= n^{-(l+s)/2} \sum_{t=1}^{[\frac{l+s}{2}]}\ \sum_{1\le j_1<\dots<j_t\le n}\ \sum_{K^{(t)}\cap \bar K^{(t)}(j_1,\dots,j_t)}
E_\theta \prod_{|\alpha|=1}^{l+1} \mu_\alpha! \prod_{i=1}^{t} \frac{1}{\varkappa_{\alpha j_i}!}\, \big( g^{(\alpha)}(j_i,\theta) \big)^{\varkappa_{\alpha j_i}}\, \varepsilon_{j_i}^{\varkappa_{\alpha j_i}}$$
$$= n^{-(l+s)/2} \sum_{t=1}^{[\frac{l+s}{2}]}\ \sum_{K^{(t)}\cap \bar K^{(t)}}\ \sum_{1\le j_1<\dots<j_t\le n}
E_\theta \prod_{|\alpha|=1}^{l+1} \mu_\alpha! \prod_{i=1}^{t} \frac{1}{\varkappa_{\alpha j_i}!}\, \big( g^{(\alpha)}(j_i,\theta) \big)^{\varkappa_{\alpha j_i}}\, \varepsilon_{j_i}^{\varkappa_{\alpha j_i}}$$
$$= n^{-(l+s)/2} \sum_{t=1}^{[\frac{l+s}{2}]} n^{t}\, G_{\mu,n}(\theta;\, t, l, s).$$
12. ASYMPTOTIC EXPANSION OF LSE MOMENTS 165
Taking into account the equality obtained and bringing in the result of Lemma 25.4
we obtain assertion (2) of the Theorem.
To prove (3) let us note that
    Σ_{l=0}^{k-2} E_θ h_{l,r}(θ) n^{-l/2}
      = Σ_{l=0}^{k-2} Σ_{t=1}^{[(l+s)/2]} Σ_{u=1}^{t} Ψ_{r,n}^{(t-u)}(θ; t, l, s) n^{-l-(s/2)+u}.
The cases s = 1, 2 are especially interesting for applications. For s = 1 Theorem 25 gives the a.e. for the bias vector of the l.s.e. θ̂_n. For s = 2 we obtain the a.e. of the second-order moments of the coordinates of θ̂_n, and thence the a.e. of the correlation matrix of θ̂_n. Let us denote
Let us find the first terms of the a.e. of the bias vector of θ̂_n and of the matrix C_n(θ).
    E_θ h₁(λ) = −(σ²/2) A^{i i₁}(θ) A^{i₂ i₃}(θ) Π_{(i₁)(i₂i₃)}(θ) λ_i n^{-1/2} + l_n(λ),   (12.19)
where the coefficients of the linear form l_n(λ) are of order o(n^{-1}) uniformly in θ ∈ T.
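A quick consistency check of the leading bias term, under our reading of the notation (we assume Π_{(i₁)(i₂i₃)}(θ) = n^{-1} Σ_j g^{(i₁)}(j, θ) g^{(i₂i₃)}(j, θ), in line with the definition of Π^{(β)(γ)} in Section 13): for a regression function linear in θ every second derivative vanishes, so the n^{-1/2} bias term in (12.19) disappears, in agreement with the exact unbiasedness of the l.s.e. in the linear model.

```latex
g(j,\theta)=\textstyle\sum_i \theta^i z_i(j)
\;\Longrightarrow\;
g^{(i_2 i_3)}(j,\theta)\equiv 0
\;\Longrightarrow\;
\Pi_{(i_1)(i_2 i_3)}(\theta)
   = n^{-1}\sum_j g^{(i_1)}(j,\theta)\,g^{(i_2 i_3)}(j,\theta)=0 .
```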
Clearly
We further find
(12.20)
Lastly let us note that the coefficients of the form E_θ h₂(λ) are O(n^{-1/2}). The relation (12.19) now follows from (12.20) and (7.33).
The relation (12.19) can be rewritten in the form
(12.21)
i = 1, ... ,q.
For k = 3 the equalities (12.21) hold, with the remainder terms being o(n- 1 ). For
k = 2 it is only possible to state that
(12.23)
furthermore:
(1) the elements of the matrix A^{(2)}(θ) have values of order o(n^{-1}) uniformly in θ ∈ T;
    − Π_{(i₁)(j₁j₂)} Π_{(i₂)(j₁j₂)} ] )_{i,j=1}^{q}.   (12.24)
Proof: Evidently,
(12.26)
    × ( 2Π_{(i₁i₂)(j₁)(j₂)}   (12.27)
      + ( Π_{(i₁)(i₂i₃)} + Π_{(i₂)(i₁i₃)} ) )
The result of Corollary 25.2 follows from the relations (12.25)-(12.28) after the collection of similar terms and the symmetrisation of the expressions obtained.
For s = 2 and k = 3, thanks to the relations (12.25) and (12.26), equality (12.8) can be rewritten in the form
    (12.29)
where by o(n^{-1/2}) we denote a matrix whose elements decrease at that rate in n.
For k = 2 it is possible to state that
In Table 3.1 are shown the minimal values of m, determined by Theorem 25 for k = 2, 3, 4, for which the relations mentioned above for the bias vector and the matrix of second moments of the l.s.e. θ̂_n hold.
the variance μ₂ = σ² > 0
of the errors of observation ε_j in the model (0.1) is unknown. The rigorous statistical treatment of the observations X_j, j = 1, …, n, in the model (0.1) provides a means of obtaining estimators both for θ and for σ². Therefore, along with the
13. AEs RELATED TO THE VARIANCE OF ERRORS 169
problem of the estimation of the parameter θ there arises the problem of estimating the parameter σ². As an estimator of the variance σ² of the observations X_j, j = 1, …, n, let us take the statistic
    σ̂²_n = n^{-1} L(θ̂_n),
where θ̂_n is the l.s.e. of the parameter θ. The estimator σ̂²_n already appeared in the formulation of Theorem 22 of Section 8.
In this Section a.e.-s will be obtained for the normed estimator σ̂²_n and its first two moments, and also the a.e. of the distribution of the estimator σ̂²_n.
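The estimator just introduced can be sketched numerically. In the minimal example below everything concrete (the regression function g, the grid optimiser, and all numerical values) is an illustrative assumption, not the book's: the l.s.e. θ̂_n minimises L(θ) = Σ_j (X_j − g(j, θ))², and then σ̂²_n = n^{-1} L(θ̂_n).

```python
import numpy as np

def g(theta, x):
    # illustrative regression function g(j, theta) = exp(-theta * x_j)
    return np.exp(-theta * x)

def lse_and_sigma2(x, X, grid=None):
    # crude grid search for the l.s.e.; in practice Gauss-Newton or a
    # library optimiser would be used instead
    if grid is None:
        grid = np.linspace(0.01, 5.0, 2000)
    losses = np.array([np.sum((X - g(t, x)) ** 2) for t in grid])
    theta_hat = grid[np.argmin(losses)]
    sigma2_hat = losses.min() / len(x)   # sigma2_hat = n^{-1} L(theta_hat)
    return theta_hat, sigma2_hat

rng = np.random.default_rng(0)
n = 400
x = np.linspace(0.0, 2.0, n)
theta0, sigma0 = 1.0, 0.1
X = g(theta0, x) + sigma0 * rng.normal(size=n)
theta_hat, sigma2_hat = lse_and_sigma2(x, X)
```

For moderate noise both estimates land close to the true θ and σ², illustrating the consistency results of Chapter 1.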
Let us assume that
    μ_{m+Δ} < ∞
for some m ≥ max(3, k) and Δ > 0. Assuming the conditions of Lemma 25.1 of Section 12 to be satisfied, let us consider the quantity
    n^{-1/2} φ_n(θ + n^{-1/2}u, θ),   (13.1)
where |u*| < |u|. The derivative φ_n^{(α)}, |α| ≥ 2, has the following form:
    φ_n^{(α)}(θ + n^{-1/2}u, θ)
      = n^{-|α|/2} Σ_{β+γ=α} c(β, γ) Σ_j g^{(β)}(j, θ + n^{-1/2}u) g^{(γ)}(j, θ + n^{-1/2}u),
where β ≥ 0, γ ≥ 0 are multi-indices and the c(β, γ) are constants. Therefore the remainder term in the expansion (13.2) can be written in the form
    R_{k-1}(θ)
This relation, close to (13.4), was mentioned earlier in the statement of Theorem 19 of Section 7. And so, under the conditions of Lemma 25.1, for
    u = n^{1/2}(θ̂_n − θ)
we have
    sup_{θ∈T} P_θ{ |η_{k-1}^{(1)}(θ)| ≥ c₃ log^{k/2} n } = O( n^{-(m-2)/2} (log n)^{-m/2} ).   (13.5)
Consequently
    … + η_{k-1}^{(1)}(θ) n^{-(k-1)/2},   (13.6)
where
    Π^{(β)(γ)}(θ) = n^{-1} Σ_j g^{(β)}(j, θ) g^{(γ)}(j, θ).
Analogously we find
    … = Σ_{|α|=1}^{k-2} (1/α!) b(α; θ) ( n^{1/2}(θ̂_n − θ) )^{α} n^{-|α|/2} + η_{k-1}^{(2)}(θ) n^{-(k-1)/2},   (13.7)
13. AEs RELATED TO THE VARIANCE OF ERRORS 171
where the r.v. η_{k-1}^{(2)}(θ) has the property (13.5) with some constant c₄. Therefore
    n^{1/2}(σ̂²_n − σ²)
      = n^{-1/2} Σ_j ( ε_j² − σ² )
        + Σ_{|α|=ν+1} (1/α!) Σ_{β+γ=α} c(β, γ) Π^{(β)(γ)}(θ) (ν + 1)! …   (13.10)
where the r.v. ζ_{k-1}(θ) has the property (13.5) with the constant c₆.
Let us denote by P_p the polynomials in the sums b(α; θ) which in the a.e. (13.10) are the coefficients of the powers n^{-p/2}, p ≥ 1, and let us set
The a.e. (13.10) determines the order of decrease of the remainder term, but is of little use for the calculation of the polynomials P_p(θ). Let us find formulae giving an explicit form of the polynomials P_p(θ) of the a.e. (13.10).
Assuming that the functions g(j, θ) are infinitely differentiable, we find formally
    (13.11)
where the functions a_{i₁…i_s}(θ) and the sums of the r.v.-s b_{i₁…i_s}(θ) are defined in Section 7. Substituting in (13.11) the formal expansion
    Σ_{ν=0}^{∞} h_ν n^{-ν/2}   (13.12)
we find
    n^{1/2}(σ̂²_n − σ²) = P₀(θ) + Σ_{ν=1}^{∞} ( A_ν − 2B_ν ) n^{-ν/2},   (13.13)
where
    A_ν = Σ_{r+|α(r)|=ν+1} (1/r!) a_{i₁…i_r}(θ) h^{i₁}_{α₁}(θ) ⋯ h^{i_r}_{α_r}(θ),   (13.14)
    B_ν = Σ_{r+|α(r)|=ν} (1/r!) b_{i₁…i_r}(θ) h^{i₁}_{α₁}(θ) ⋯ h^{i_r}_{α_r}(θ).   (13.15)
THEOREM 26: Under the conditions of Lemma 25.1 of Section 12 there exists a
constant C7 > 0 such that
(13.16)
where
REMARK 26.1: If in the conditions of Lemma 25.1 the condition (12.1) is replaced by a weaker condition, for example, for any r > 0
    sup_{θ∈T} P_θ{ |θ̂_n − θ| > r } = o( n^{-(m-2)/2} ),   (13.18)
then the conclusion of Theorem 26 remains true with the right-hand side of (13.16) replaced by a quantity that is O(n^{-(m-2)/2}). For this it would be sufficient for the moment μ_m to be finite.
Let us find the first polynomials of the a.e. (13.10) or, what is the same thing, the polynomials of the a.e. (13.16). Using the relations (7.32), (7.33) and (7.35), (7.36), from (13.14) and (13.15) we obtain
    A₁ = (1/2!) a_{i₁i₂} h₀^{i₁} h₀^{i₂} = A^{ij} b_i b_j,
    B₁ = (1/1!) b_i h₀^{i} = A^{ij} b_i b_j,
    P₁ = A₁ − 2B₁ = −A^{ij} b_i b_j,   (13.19)
    A₂ = (1/3!) a_{i₁i₂i₃} h₀^{i₁} h₀^{i₂} h₀^{i₃} + (2/2!) a_{i₁i₂} h₀^{i₁} h₁^{i₂},   (13.20)
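Reading (13.19) with b_i = n^{-1/2} Σ_j g^{(i)}(j, θ) ε_j and A^{ij}(θ) the inverse of the matrix Π_{(i)(j)}(θ) (an assumption about the notation on our part), taking expectations recovers the classical degrees-of-freedom bias of the residual variance:

```latex
\mathrm{E}_\theta\, b_i b_j = \sigma^2\,\Pi_{(i)(j)}(\theta)
\;\Longrightarrow\;
\mathrm{E}_\theta\, P_1 = -A^{ij}\,\mathrm{E}_\theta\, b_i b_j
   = -\sigma^2 A^{ij}\Pi_{(i)(j)}(\theta) = -q\sigma^2 ,
```

so that E_θ σ̂²_n = σ²(1 − q/n) + o(n^{-1}), the familiar underestimation that is corrected by the factor n/(n − q) in the classical residual-variance estimator.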
More cumbersome, but equally simple computations with the invocation of the
equalities (7.34) and (7.37) lead to the expression
(13.21)
Having available the a.e. (13.16) and using the method of Section 12, it is possible to obtain the a.e. for the moments of any order of the r.v. n^{1/2}(σ̂²_n − σ²). However, we concentrate on the considerably more special problem (but the one most interesting for applications) of the determination of the initial terms of the a.e. of the first two moments E_θ n^{1/2}(σ̂²_n − σ²) and E_θ n(σ̂²_n − σ²)². Thus we shall start from the expansion (13.16) for k = 4:
    n^{1/2}(σ̂²_n − σ²) = Σ_{ν=0}^{2} P_ν(θ) n^{-ν/2} + ζ₃(θ) n^{-3/2}.   (13.22)
THEOREM 27: Let the conditions of Lemma 25.1 be satisfied for k = 4 and m ~ 6.
Then
m=6,7,
(13.24)
m~8.
Proof: The proof is close to the proof of Theorem 25. Let us introduce the event
Then we have
    E_θ n^{1/2}(σ̂²_n − σ²) χ{Ω_n(θ)} = Σ_{ν=0}^{2} E_θ P_ν(θ) χ{Ω_n(θ)} n^{-ν/2} + …,
where the c_ν(θ) are coefficients (some of which may be zero) that are bounded uniformly in θ ∈ T and n. Therefore, for the estimation of E_θ P_ν χ{Ω_n} it is sufficient to estimate the quantities E_θ |b(α; θ)|^{ν+1} χ{Ω_n(θ)}, |α| = 1, …, ν. Fixing α and using the notation (12.12), by analogy with (12.13)-(12.15) we obtain
    E_θ |b(α; θ)|^{ν+1} χ{Ω_n(θ)} ≤ Σ_{j=1}^{∞} …
The bound (13.26) is non-trivial, since m > ν + 1 under the conditions of the Theorem being proved. Let us further observe that the r.v.-s (μ₄ − σ⁴)^{-1/2}(ε_j² − σ²), j = 1, …, n, have finite moments of order [m/2] ≥ 3. Therefore the application of Theorem A.5 to the sum of the r.v.-s (μ₄ − σ⁴)^{-1/2} P₀, analogously to (13.26), gives the bound
    E_θ |P₀| χ{Ω_n(θ)} ≤ c₁₀ n^{-(m-2)/2} (log n)^{-(m-1)/2}
    n^{ν/2} E_θ |θ̂_n − θ|^{2ν} χ{Ω_n(θ)}
    n^{1/2} E_θ ( σ̂²_n − σ² )
where
uniformly in 0 E T, and
(13.32)
(13.33)
THEOREM 28: Let the conditions of Lemma 25.1 be satisfied for k = 4 and m ≥ 8. Then
    sup_{θ∈T} | E_θ n(σ̂²_n − σ²)² − μ₄ + σ⁴
Proof: From the expansion (13.22), on squaring we obtain
    n(σ̂²_n − σ²)² = P₀²(θ) + 2P₀(θ)P₁(θ) n^{-1/2} + ( 2P₀(θ)P₂(θ) + P₁²(θ) ) n^{-1} + ζ̃₃(θ) n^{-3/2},   (13.35)
where
    ζ̃₃ = 2P₀ζ₃ + 2P₁P₂ + 2P₁ζ₃ n^{-1/2} + 2P₂ζ₃ n^{-1} + ζ₃² n^{-3/2}.
The r.v. ζ̃₃ has the following property: a number c₁₅ > 0 can be found such that
    sup_{θ∈T} P_θ{ |ζ̃₃(θ)| ≥ c₁₅ (log n)^{2.5} } = O( n^{-(m-2)/2} (log n)^{-m/2} ).   (13.36)
    E_θ P₀²(θ) = μ₄ − σ⁴,
    E_θ P₀(θ)P₁(θ) = −q(μ₄ − σ⁴) n^{-1/2},   (13.38)
    E_θ P₁²(θ) = σ⁴(q² + 2q) + O(n^{-1}).
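The first of these equalities, E_θ P₀² = μ₄ − σ⁴, is easy to check numerically, since P₀ = n^{-1/2} Σ_j (ε_j² − σ²) and Var(ε²) = μ₄ − σ⁴. The Monte Carlo sketch below (all numerical choices are illustrative assumptions) uses Gaussian errors, for which μ₄ = 3σ⁴ and hence E_θ P₀² = 2σ⁴.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 1.0
n, reps = 50, 20000
eps = rng.normal(0.0, sigma, size=(reps, n))
# P0 = n^{-1/2} * sum_j (eps_j^2 - sigma^2)
P0 = (eps ** 2 - sigma ** 2).sum(axis=1) / np.sqrt(n)
emp = (P0 ** 2).mean()                    # Monte Carlo estimate of E P0^2
theory = 3 * sigma ** 4 - sigma ** 4      # mu4 - sigma^4 for Gaussian errors
```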
A somewhat longer calculation is required for the expectation
And so,
+O(n- l ). (13.39)
(13.40)
Let us prove one assertion about the a.e. of the distribution of the estimator σ̂²_n.
We assume that:
X. The r.v. ε_j has a density p(x) of bounded variation on ℝ¹.
Let [a, b], a > 0, b < ∞, be an arbitrary but fixed interval,
    T* = [a, b] × T,   θ* = (σ², θ),
the coefficient of the excess of the r.v. ε_j.
THEOREM 29: Let us assume satisfied the conditions II-V, VIII of Section 10, the condition X, and μ_{2(k+1)} < ∞. Also let the l.s.e. θ̂_n have the following property: for any r > 0
    (13.43)
Then
    sup_{θ*∈T*} sup_{z∈ℝ¹} …
where R_ν(θ*, z) are polynomials in z of degree 3ν with coefficients that are uniformly bounded in θ* ∈ T* and n.
The property (13.43) is the property (13.18) for m = k + 1.
Let G be the d.f. of the vector (ε_j² − σ², ε_j) and Ĝ its c.f.
LEMMA 29.1: If μ₄ < ∞ and condition X is satisfied, then for any δ > 0
    |Ĝ(λ₁, λ₂)|^{5(1+δ)} ≤ c₁₆ (1 + |λ₁|^{1+δ})^{-1} (1 + |λ₂|^{1+δ})^{-1}.   (13.45)
We have
    ψ²(λ₁, λ₂) = e^{-iλ₁²/(2λ₂)} ∫₀^{2π} dφ ∫₀^{∞} ρ e^{iλ₂ρ²} p( ρ cos φ − λ₁/(2λ₂) ) p( ρ sin φ − λ₁/(2λ₂) ) dρ.   (13.46)
The inner integral in (13.46) is equal to
    −(2iλ₂)^{-1} ∫₀^{∞} p( ρ cos φ − λ₁/(2λ₂) ) p( ρ sin φ − λ₁/(2λ₂) ) d e^{iλ₂ρ²}.   (13.47)
    I₁ = ∫₀^{∞} e^{iλ₂ρ²} p( ρ cos φ − λ₁/(2λ₂) ) dp( ρ sin φ − λ₁/(2λ₂) )
       + ∫₀^{∞} e^{iλ₂ρ²} p( ρ sin φ − λ₁/(2λ₂) ) dp( ρ cos φ − λ₁/(2λ₂) )
       = I₂ + I₃.
Let us estimate the integral I₂; the integral I₃ is estimated in just the same way:
    I₂ ≤ p₀ ∫₀^{∞} | dp( ρ cos φ − λ₁/(2λ₂) ) | ≤ p₀ ∫_{ℝ¹} | dp(ρ) |,   (13.48)
    p₀ = sup_{x∈ℝ¹} p(x).
(13.50)
Multiplying the inequality (13.49), raised to the second power, by the inequality (13.50), for |λ₂| ≥ 1 we obtain
    |ψ(λ₁, λ₂)|⁵
    dim( V(θ*) ) = 1 + p,
LEMMA 29.2: Under the conditions of Theorem 29, for the distribution Q_n(θ*) we have the a.e.
    sup_{θ*∈T*} sup_{B∈𝔅_{p+1}} … = O( n^{-(k-1)/2} ),
where the polynomials P_r(−Φ; {χ̄_ν(θ*)}) were introduced in Section 10, and the χ̄_ν(θ*) are the arithmetic means of the cumulants of order ν of the vectors ξ_{jn}(θ*), j = 1, …, n.
Proof: We show that the conditions of the Theorem being proved guarantee that the conditions of Theorem A.13 are satisfied.
Let us show that
From condition VIII, as was demonstrated in Section 10, (10.6) follows. Let us introduce the (p + 1) × (p + 1)-dimensional matrices built from the first row (1, 0, …, 0) and from the entries E( K_n^{-1/2}(θ) w_j(θ) ).
Then
And so,
    μ₄ − μ₂² − m₃² n^{-2} Σ_{i,j=1}^{n} ⟨ K_n^{-1}(θ) w_i(θ), w_j(θ) ⟩ ≥ μ₂^{-1}( μ₄μ₂ − m₃² − μ₂³ ) > 0,
since μ₄μ₂ − m₃² − μ₂³ is the determinant of the correlation matrix of the vector (1, ε_j, ε_j²). Consequently (13.52) is true.
Let us set u = rh, where r ≥ 6 is an integer and h ≥ p is taken from condition VIII of Section 10, and let τ = (t⁰; t), t⁰ ∈ ℝ¹, t ∈ ℝᵖ. Then for 0 ≤ m ≤ n − u, n ≥ u + 1, and
    Ψ_m(θ*, B_n^{1/2}(θ*)τ) = ∏_{j=m+1}^{m+u} | Ĝ( t⁰, ⟨t, w_j(θ)⟩ ) |,
we obtain
    = ∏_{s=1}^{r} a_s^{1/r},   (13.53)
where the w_{j_i(s)}(θ), i = 1, …, p, are the p vectors from condition VIII. Let us make the substitution of variables
    …,   i = 1, …, p,
in the integral (13.53). The Jacobian of this transformation is equal to det W_s, where W_s is the matrix with columns w_{j_i(s)}(θ), i = 1, …, p. From condition VIII it follows that
    det( W_s W_s' ) ≥ (p*)ᵖ > 0
uniformly in m, n, and θ ∈ T.
Therefore
From (13.53), (13.54), Lemma 29.1, and the conditions of Theorem 29 there follows
the finiteness of the integral a and the validity of the relation
This implies, after the substitution of variables x → B_n^{-1/2}(θ*)x,
    sup_{θ∈T} sup_{A∈𝔅_{p+1}} … = O( n^{-(k-1)/2} ),   (13.59)
(13.60)
(13.62)
Proof: The proof of the Lemma is identical to the proof of Lemma 24.2 of Sec-
tion 10.
Proof of Theorem 29: From (13.59) and (13.60) there follows the relation
    sup_{θ*∈T*} sup_{A∈𝔅_{p+1}} | ∫_A ( Q_k(θ*) ∘ f_n^{-1} )(x) dx − ∫_A Q_k(θ*, y) dy | = O( n^{-(k-1)/2} ).   (13.63)
We further note that from (13.14), (13.15), (13.17), and Lemma 24.3 it follows that the polynomials P_ν(θ), ν = 1, …, k − 2, of the a.e. (13.16) of Theorem 26 are polynomials in the coordinates of the vectors v_ν(θ), ν = 1, …, k − 2, i.e., the P_ν(θ) are polynomials in the coordinates of the vector v(θ*). Let us introduce the mapping
    y = (y⁰, y'),   y' = (y¹, …, yᵖ),
    R*_ν(θ*, y⁰) = ∫_{ℝᵖ} P_ν(θ*, y) φ( y' − ( m₃/(μ₄ − μ₂²) ) n^{-1} Σ_j w_j(θ) y⁰ ) dy',   (13.66)
    ν = 1, …, k − 2.
The calculation of the first polynomials of the a.e. (13.44) can be carried out,
for example, by the following method. For the mapping (13.64) it follows from
formulae (13.61) and (13.62) that
(13.67)
(13.68)
It then follows that we should integrate (13.66) taking into account the equalities
(13.67) and (13.68). In the calculations there arise no new difficulties in comparison
with the calculations of Section 11. Therefore we present only the final result.
Let H_s, s ≥ 1, be the Chebyshev-Hermite polynomials:
    ρ₀ = (μ₄ − μ₂²)^{1/2},
    ρ₁ = A^{ij} Π_{(i)} Π_{(j)},   (13.69)
    { ρ₀^{-4} [ 2( μ₆/6 + μ₄σ²/2 − σ⁶/3 ) + m₃²ρ₁ + ( 6σ²m₃² − m₃m₅ )ρ₁ + m₃²ρ₃
        + ( μ₈/24 − μ₆σ²/6 + μ₄σ⁴/4 − σ⁸/8 ) ] } H₄(z)
    + ρ₀^{-2} ( q(2σ⁴ − μ₄) + m₃σ²ρ₂ + qσ⁴/2 ) H₂(z).   (13.70)
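The expansions (13.69)-(13.70), and the later formulae (14.32)-(14.34), are organised by the Chebyshev-Hermite polynomials H_s(z). A small sketch of their standard three-term recurrence, H₀ = 1, H₁(z) = z, H_{s+1}(z) = zH_s(z) − sH_{s-1}(z), evaluated pointwise:

```python
def hermite(s, z):
    # Chebyshev-Hermite (probabilists' Hermite) polynomial H_s at the point z,
    # via the recurrence H_{k+1}(z) = z*H_k(z) - k*H_{k-1}(z)
    h_prev, h = 1.0, z
    if s == 0:
        return h_prev
    for k in range(1, s):
        h_prev, h = h, z * h - k * h_prev
    return h
```

For example, H₂(z) = z² − 1, H₃(z) = z³ − 3z and H₄(z) = z⁴ − 6z² + 3, the polynomials that carry the leading correction terms above.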
COROLLARY 29.1: If the conditions of Theorem 29 are satisfied and the ε_j are symmetric r.v.-s, then the first polynomials R₁ and R₂ of the a.e. (13.44) are independent of the parameter θ.
The polynomial (13.69) was, in fact, obtained in [217].
In this Section a result (Theorem 30) is obtained that coincides with Theorem 29 of the preceding Section for Gaussian (0, σ²) errors of observation ε_j. The point of separating such a fact into a distinct statement is that the specifically Gaussian character of the errors also enables the third polynomial R₃(θ*, z) of the a.e. (13.44), which has a remarkable property considered in Chapter 4, to be found.
THEOREM 30: Let us assume the conditions II-V, VIII of Section 10, that the ε_j are Gaussian (0, σ²) r.v.-s, and that the l.s.e. θ̂_n has the property (13.43). Then
    sup_{θ*∈T*} sup_{z∈ℝ¹} …   (14.1)
    R₁(θ*, z) = R₁(z) = (√2/3) z³ − ( (q + 2)/√2 ) z,   (14.2)
    R₂(θ*, z) = R₂(z) = (1/9) z⁶ − ( q/6 + 7/36 ) z⁴ + ( q²/4 + … ) z² − ( q²/4 + q/2 + 1/6 ),   (14.3)
    R₃(θ*, z) = … − ( q³/24 + (7/12) q² + 2q + 37/18 ) z³
        + ( (1/8) q³ + (1/2) q² + (7/12) q + 1/6 + (1/8) nY(θ) ) z,   (14.4)
where
    nY(θ) = σ² A^{i₁j₁}(θ) A^{i₂j₂}(θ) ( 2Π_{(i₁i₂)(j₁j₂)}(θ) − Π_{(i₁j₁)(i₂j₂)}(θ) ).   (14.5)
Proof: The relation (14.1) was obtained in Theorem 29 and coincides with (13.44). Therefore we shall turn our attention to those details of the proof related to the method of obtaining in explicit form the initial terms of the a.e. (14.1).
14. AEs DISTRIBUTION OF THE VARIANCE ESTIMATOR 189
Let
    τ = (t⁰; t),   t = (t¹, …, tᵖ).
Then the c.f. of the vector
Using the normality of the r.v. ε_j it is possible to show (see, for example, [142] p. 381) that
    Ĝ_j(τ) = ( e^{-it⁰σ²} / √(1 − 2it⁰σ²) ) exp{ −(σ²/2) ⟨t, w_j(θ)⟩² / (1 − 2it⁰σ²) }.   (14.6)
Consequently the c.f. of the sum of random vectors B_n^{-1/2}(θ*) v(θ*) is
    Ψ_n(τ) = ∏_{j=1}^{n} Ĝ_j( t⁰/(√(2n) σ²), σ^{-1} K_n^{-1/2}(θ) t ).   (14.7)
    ∫_{ℝ^{p+1}} Ψ_m(θ*, τ) dτ
      = ∫_{ℝ¹} dt⁰ (1 + 2(t⁰)²)^{-u/4} ∫_{ℝᵖ} exp{ −( 1/(2(1 + 2(t⁰)²)) ) Σ_{j=m+1}^{m+u} ⟨ σ K_n^{-1/2}(θ) w_j(θ), t ⟩² } dt
      = ( (2π)^{p/2} det K_n(θ) / ( σ^{2p} det W_m^{(u)}(θ) ) ) ∫_{ℝ¹} dt⁰ / (1 + 2(t⁰)²)^{u/4 − p/2},   (14.8)
where
    W_m^{(u)}(θ) = Σ_{j=m+1}^{m+u} w_j(θ) w_j'(θ).
The integral (14.8) is convergent if u ≥ 2p + 3. From the conditions of the Theorem it also follows that for u ≥ p there exists a constant c > 0 such that
    Ψ_m(τ) ≤ (1 + 2(t⁰)²)^{-u/4} exp{ −c|t|² / (2(1 + 2(t⁰)²)) } < 1   (14.9)
if |τ| ≥ b > 0.
Let us find the first terms of the a.e. (13.51). Taking (14.7) into account we obtain for the polynomials P_r(iτ; {χ̄_ν}), introduced in Section 10, the following expressions:
    P₁(iτ; {χ̄_ν}) …
    P₂(iτ; {χ̄_ν}) …
    |t|² = −Σ_{j=1}^{p} (it^j)²,
the polynomials P_r(−φ; {χ̄_ν})(x), r = 1, 2, 3, of the a.e. (13.51) are immediately defined from (14.10)-(14.12). The passage to the polynomials of the a.e. (13.59) is obtained by the substitution of variables x → B_n^{1/2}(θ*)x in the polynomials P_r(−φ; {χ̄_ν})(x):
    P̄₁(θ*, x) = (x⁰)³/(6σ⁶) − ( (p + 2)/(2σ²) ) x⁰ + ( x⁰/(2σ²) ) ⟨ K_n^{-1}(θ) x', x' ⟩,   (14.13)
    (14.14)
    − ( 37/(36σ⁶) )(x⁰)³ + ( 1/(6σ²) ) x⁰
      + ( (1/(144σ¹⁴))(x⁰)⁷ − (7/(48σ¹⁰))(x⁰)⁵ + (1/(2σ⁶))(x⁰)³ − (1/(12σ²)) x⁰ ) ( ⟨K_n^{-1}(θ)x', x'⟩ − p )
      + ( (1/(48σ¹⁰))(x⁰)⁵ − (1/(6σ⁶))(x⁰)³ − (1/(8σ²)) x⁰ ) …
      + ( (1/(48σ⁶))(x⁰)³ − (1/(8σ²)) x⁰ ) …
For the transformation (13.64) the first polynomials of the a.e. (13.63) are given by the formulae (13.67) and (13.68). We must find the polynomial P̄₃. For this it is necessary to turn to the details of the proof of Lemma 24.2 of Section 10 for the special transformation (13.64). From the reasoning of Lemma 24.2 it follows that
    …
where the p̄_r are the polynomials of the a.e. (10.51) and of the a.e. (10.52).
From the identity (10.43) it is easy to find
    Q̄_r^i = 0,   i = 1, …, p.   (14.17)
Since
    ∂J_n^{-1}(y)/∂y = det( ( δ_{ij} + Σ_{r≥1} n^{-r/2} (Q_r^i)_j )_{i,j=1}^{p} ) = 1,
the polynomials p̄_r of the a.e. (10.45) are the polynomials of the a.e.
    φ_{2σ⁴}( y⁰ − Σ_{r≥1} n^{-r/2} p_r(θ*, y¹, …, yᵖ) )   (14.18)
      × ( 1 + Σ_{r=1}^{k-2} n^{-r/2} p̄_r( θ*, y⁰ − Σ_{r≥1} n^{-r/2} p_r(θ*, y¹, …, yᵖ), y¹, …, yᵖ ) ).
    p̃₁ φ_{2σ⁴} = …,   (14.19)
    p̃₂ φ_{2σ⁴} = …,   (14.20)
    p̃₃ φ_{2σ⁴} = ( p̄₃ − (p₁)⁰ p̄₂ − (p₂)⁰ p̄₁ ) φ_{2σ⁴} + ½ p₁² ( p̄₁ φ_{2σ⁴} )⁰⁰ + p₁ p₂ φ'_{2σ⁴} + …,   (14.21)
    p₁(θ*, y') = − … y^i y^j,   (14.22)
    p₂(θ*, y')   (14.23)
    (14.25)
    (14.26)
We observe that, using the moments up to the fourth order of the centralised Gaussian vector (see, for example, [89] Chapter 1), we obtain
    q( −(y⁰)⁴/(12σ⁸) + (y⁰)²/(2σ⁴) ) φ_{2σ⁴}(y⁰).
Let us further note that from (14.23) there follows the equality
    ½ ∫_{ℝᵖ} p₂ φ_{K_n}(y') dy' · φ_{2σ⁴}(y⁰) = ( q²/2 + q ) ( (y⁰)²/(4σ⁴) − ½ ) φ_{2σ⁴}(y⁰).
And so
    Q̃*(θ*, z) = (1/72) (y⁰)⁶/σ¹² − (1/12)( q + … ) (y⁰)⁴/σ⁸ + ( q²/8 + (3/4) q + 1 ) (y⁰)²/σ⁴ − ( q²/4 + q/2 + 1/6 ).   (14.29)
Let us find the cumulants k_j, j = 1, …, 5, of the quantity M/(√2 σ²), without specifying terms of the order o(n^{-3/2}). We shall thus look for a representation
For this one should find the mixed moments of the polynomials P₀, P₁, P₂ which enter into the expressions for the moments of M up to the fifth order inclusive (terms o(n^{-3/2}) not being taken into account), then pass by the standard formulae (see the Appendix, Subsidiary Facts) from the moments to the cumulants. Using this plan we arrive at the matrix with coefficients k_{jl}, j = 1, 2, 3, 4, 5, l = 0, 1, 2, 3, from the representation (14.30):
    (k_{jl}) =
    | 0   −√2·q    0     (√2/8)·nY(θ) |
    | 1    0      −q      0           |
    | 0   2√2      0    −2√2·q        |   (14.31)
    | 0    0      12      0           |
    | 0    0       0     48√2         |
The large number of zeros in the matrix (k_{jl}) is a consequence of the normality of the errors of observation ε_j.
Having the matrix (14.31) available it is not difficult to obtain the a.e. of the c.f. of the r.v. (n/2)^{1/2}( σ̂²_n/σ² − 1 ), from which in turn, using the inverse Fourier transform, we find R₁, R₂, and R₃.
To conclude the Section we shall indicate the representation of the polynomials R₁, R₂ and R₃ in terms of the Chebyshev-Hermite polynomials:
    (14.32)
    (14.33)
    R₃(θ*, z) = (√2/8) nY(θ) H₁(z) + (√2/24)( −q³ + 6q² − 8 ) H₃(z)
        + (√2/12)( q² − … ) H₅(z) + √2( −q/18 + 1/6 ) H₇(z).   (14.34)
The 'jack knife' and 'cross-validation' methods have long been widely used in applied statistics and thoroughly investigated theoretically (see, for example, [82, 215]). Both methods belong to the methods of statistical estimation linked to resampling. Their basic idea is to use a special way of treating the experimental data to obtain an estimator of an unknown parameter, namely, some function (most often the arithmetic mean) is calculated of the estimators obtained from reduced samples. As a result the probabilistic characteristics of the estimator change in comparison with the standard estimators: for example, the bias of the estimator decreases.
In this Section of the book the 'jack knife' and 'cross-validation' methods are used for the estimation of the variance of the errors of observation in the non-linear regression model: first the stochastic a.e.-s of these estimators are obtained, then the initial terms of the a.e.-s and of their first two moments are found.
If, for the estimation of the variance of the errors of observation, the statistic
    σ̂²_n = n^{-1} L(θ̂_n)
is usually used, then in the 'jack knife' method it is replaced by the statistic
    (15.1)
    (15.2)
where the θ̂_{(−j)} are the l.s.e.-s of the parameter obtained from the sample from which the observation X_j is removed, and … is the estimator analogous to σ̂²_n for such a reduced sample. We shall further mark with the index (−t) the quantities relating to the truncated sample.
Let us assume that for the functions g(j, θ), for each j, there also exist in Θ^c all the partial derivatives with respect to the variables θ = (θ¹, …, θ^q) up to order k + 4 inclusive, k ≥ 2.
For the proof of theorems on the stochastic a.e. of the functionals J_n and C_n we require conditions which are modifications of the conditions for obtaining the stochastic a.e. of the l.s.e. θ̂_n and of the variance estimator σ̂²_n of the errors of observation ε_j.
15. JACK KNIFE AND CROSS- VALIDATION ESTIMATORS 197
III(l). (1) for all |α_r| = 1, …, s for which g^{(α_r)}(j, θ) ≢ 0 and s = 1, …, l;
(2)
IV(l, m). (1)
    lim sup_{n→∞} sup_θ n^{-1} Σ_j |g^{(α)}(j, θ)|^{m(l-1)} < ∞,
    |α| = 1, …, l − 2,
    |α| = l − 1, l,
V. lim inf_{n→∞} inf_{θ∈T} min_{1≤j≤n} λ_min( J(θ) − n^{-1} J(j, θ) ) > λ₀ > 0,
where
    J(j, θ) = ( g_i(j, θ) g_r(j, θ) )_{i,r=1}^{q}.
For certain sets of indices
    k_s = ( i₁^{(s)}, …, i_{r_s}^{(s)} ),   s = 1, …, l,   l = 1, 2, …,
we shall denote
    _l b_{(k₁)…(k_l)}(θ) = n^{-1/2} Σ_j ε_j^l ∏_{s=1}^{l} g^{(k_s)}(j, θ),
    _l b̄_{(k₁)…(k_l)}(θ) = n^{-1/2} Σ_j ( ε_j^l − m_l ) ∏_{s=1}^{l} g^{(k_s)}(j, θ),
    ₀b = 1,
with one and the same constant c₃ < ∞ for θ̂ = θ̂_n and θ̂ = θ̂_{(−t)}, t = 1, …, n. Then
where
    G₀^J = n^{-1/2} Σ_j ( ε_j² − σ² ),
and the G_ν^J, ν = 1, …, k − 2, are polynomials of degree ν + 1 with respect to the quantities _l b_{(k₁)…(k_l)}, l = 1, …, [ν/2] + 1, with coefficients uniformly bounded in θ ∈ T and n.
In particular
    G₁^J   (15.5)
It should be stressed that, in contrast to the s.a.e. of the estimator σ̂²_n (see Section 13), the polynomials G_ν are now not homogeneous with respect to the sums of r.v.-s _l b. The conditions under which (15.3) holds are mentioned in Section 2 of Chapter 1.
Proof: The regularity conditions of the Theorem being proved ensure that Theorem 26 of Section 13 holds not only for the original but also for the truncated samples. Therefore the application of this Theorem to the 'jack knife' functional J_n results in the s.a.e.
    = G₀^J + Σ_{ν=1}^{k} { A_ν(θ) n^{-ν/2+1} − 2B_ν(θ) n^{-ν/2+1} − C_ν(θ) n^{1/2}(n − 1)^{-(ν-1)/2} … }
      + n^{-(k-1)/2} R_{k+1}(θ) − n^{-1} Σ_{t=1}^{n} n^{1/2}(n − 1)^{-k/2} R_{k+1,(−t)}(θ),   (15.7)
where
(1) G₀^J = P₀ and A_ν − 2B_ν = P_ν are the polynomials (13.17) of the expansion (13.16) of the functional σ̂²_n;
(2) C_ν = n^{-1} Σ_{t=1}^{n} A_{ν(−t)},   D_ν = n^{-1} Σ_{t=1}^{n} B_{ν(−t)},
    sup_{θ∈T} P_θ{ |R_{k+1,(−t)}(θ)| ≥ c₆ (log n)^{(k+2)/2} } ≤ c₇ n^{-(m-2)/2} log^{-m/2} n,   (15.9)
with the constants c₆ and c₇ not depending upon t = 1, …, n.
The next statement gives important information about the structure of the polynomials of the expansion (13.16).
LEMMA 31.1: The polynomials P_ν(θ), ν = 1, …, k − 2, are linear combinations of quantities of the form
    (15.10)
where (k'_r), (k''_r), (k_r) are sets of indices from {i₁, …, i_{ν+μ}} ∪ {j₁, …, j_{ν+μ}}, and
    ∪_{r=1}^{ν+1} ( (k'_r) ∪ (k''_r) ) ∪ (k_r)   (15.11)
and with
    ∪_{r=1}^{ν+1} ( (k'_r) ∪ (k''_r) ) ∪_{r=1}^{ν+1} (k_r) = {i₁, …, i_{ν+μ+1}} ∪ {j₁, …, j_{ν+μ+1}}.
In doing so, the recurrence relations (7.43) are used. Then we also obtain (15.10)
by induction using the relations (13.14), (13.15) and (15.11).
To obtain the general formula for the polynomials G_ν^J of the stochastic a.e. (15.4), the quantities containing the index (−t) should first of all be eliminated. Let us carry out the following substitution:
    A_{(−t)} = (1 − n^{-1})^{-1} [ J(θ) − n^{-1} J(t, θ) ]^{-1}   (15.14)
    = …   (15.15)
      + n^{-2} A^{i₁j₁} A^{i₂j₂} A^{i₃j₃} g_{i₁}(t, θ) g_{j₁}(t, θ) g_{i₂}(t, θ) g_{i₃}(t, θ) + ⋯.
As follows from (15.10), on substituting the expressions (15.12)-(15.15) (retaining in the series (15.15) only a finite number of terms) into the terms of formula (15.7) containing C_ν and D_ν, we obtain certain polynomials in n^{-1/2}. The coefficients of these polynomials at the powers n^{-ν/2} are polynomials in the sums of r.v.-s _l b_{(k₁)…(k_l)}, l = 1, …, [ν/2] + 1, ν = 1, …, 2k, and each monomial of the latter polynomials contains no more than one factor _l b with l ≥ 2. Centring the quantities _l b, l ≥ 2, at the power n^{-ν/2}, i.e., converting to the sums _l b̄, leads to the appearance of additional terms in the coefficients of n^{-(ν+1)/2}.
Performing the centring, let us gather together all the coefficients of the powers n^{-ν/2}, ν = 1, …, k. These are just the polynomials G_ν^J, which have the form
    (15.16)
A_ν n^{-ν/2+1} − 2B_ν n^{-ν/2+1}. In its turn, from the terms containing only one of the quantities
    −n^{-1} g^{(k₁)}(t, θ) g^{(k₂)}(t, θ),   −n^{-1} ε_t g^{(α)}(t, θ),   n^{-1} A^{i₁j₁} g_{j₁}(t, θ) g_{j₂}(t, θ) A^{i₂j₂},
after averaging with respect to t we obtain −P_ν, whence, in view of the signs of C_ν and D_ν, we now obtain P_ν. The quantity P_ν emerges here thanks to one property of the polynomial P_ν: each term entering P_ν has one more factor of Π and b than factors of A.
We can now rewrite the expression (15.7) in the form
    n^{1/2}(J_n − σ²)
      = Σ_{ν=0}^{k} G_ν^J(θ) n^{-ν/2} + G_{k+1}(θ) n^{-(k+1)/2} + n^{-(k-1)/2} R_{k+1}(θ)
        − n^{-1} Σ_{t=1}^{n} n^{1/2}(n − 1)^{-k/2} R_{k+1,(−t)}(θ),   (15.17)
where G_{k+1}(θ) is a polynomial in the variables A, Π and b; moreover, the maximal degree of this polynomial in _l b and the maximal value of l are equal to k + 1.
Let us estimate the remainder terms of the s.a.e. (15.17). The remainder term R_{k+1}(θ) is estimated by the formula (15.8). On the other hand, by using (15.9) we find
    ≤ Σ_{t=1}^{n} sup_{θ∈T} P_θ{ |R_{k+1,(−t)}(θ)| ≥ c₈ (log n)^{(k+2)/2} }
    = O( n^{-(m-4)/2} (log n)^{-m/2} ).   (15.18)
Let us further observe that
    n^{-(k+1)/2} G_{k+1} = n^{-(k-1)/2} ( n^{-1} G¹ + n^{-3/2} G² + ⋯ ),   (15.19)
and each term Gⁱ of the finite sum (15.19) has the following property: there exists a constant c₈ such that
    (15.20)
i = 1, 2, ….
In fact, Gⁱ is a linear combination of the products
    …,   r = 0, …, k,   l = 0, …, k + 1.
    ≤ Σ_{i=1}^{k} P_θ{ |_l b(k_i)| ≥ c γ^{(k+1)} log^{1/2} n }
In this way (15.20) is a corollary of the conditions of the Theorem being proved and of Theorem A.5. We obtain similar bounds also for the polynomials G_{k-1} and n^{-1/2} G_k.
Close to the Theorem just proved is the following:
Close to the Theorem just proved is the following:
THEOREM 32: For some integer m ~ k + 4 let the conditions
(15.23)
Proof: Let us outline the proof of the Theorem as formulated. From the technical point of view it is expedient to represent the functional (15.2) in the form
    C_n = n Q_n − ( (n − 1)/n ) Σ_{t=1}^{n} σ̂²_{(−t)},   (15.24)
where
    Q_n = n^{-1} Σ_{t=1}^{n} n^{-1} Σ_{j=1}^{n} [ X_j − g(j, θ̂_{(−t)}) ]²,   (15.25)
i.e., the statistic (15.25) plays for C_n the same role as the statistic σ̂²_n plays for J_n. The latter gives grounds for the use of Q_n as an estimator of the variance σ² of the errors of observation. In fact there holds:
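The leave-one-out idea behind Q_n can be sketched as follows; here we use the classical cross-validation score, in which each observation is predicted from the l.s.e. computed without it, as a stand-in for (15.25), and again assume a model linear in θ with illustrative numerical values.

```python
import numpy as np

def loo_cv_score(x, X):
    # mean squared leave-one-out prediction error
    n = len(x)
    out = np.empty(n)
    for t in range(n):
        xr, Xr = np.delete(x, t), np.delete(X, t)
        theta_loo = (xr @ Xr) / (xr @ xr)   # l.s.e. on the reduced sample
        out[t] = (X[t] - theta_loo * x[t]) ** 2
    return out.mean()

rng = np.random.default_rng(3)
n = 40
x = np.linspace(1.0, 2.0, n)
X = 0.5 * x + rng.normal(0.0, 1.0, size=n)
theta_hat = (x @ X) / (x @ x)
sigma2_hat = np.mean((X - theta_hat * x) ** 2)   # n^{-1} L(theta_hat)
q_n = loo_cv_score(x, X)
```

The cross-validation score always exceeds the in-sample residual variance σ̂²_n, which is one way to see the opposite signs of their biases discussed below.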
LEMMA 32.1: Under the conditions of Theorem 31
where the polynomials G_ν^Q have the properties of the polynomials G_ν^J and G_ν^C, and furthermore
    G₀^Q = P₀,   (15.27)
and the s.a.e.-s of all the statistics on the right-hand side of (15.28) are already obtained. Taking advantage of these expansions, let us equate in (15.28) the polynomials with the same degrees of n^{-ν/2}:
    ν = 0, …, k − 2.   (15.29)
Analogously, for the remainder terms of the s.a.e.-s (13.16), (15.4), (15.21) and (15.24) the relation
    sup_{θ∈T} P_θ{ |R^{(·)}| ≥ c₁₅ log^{(k+2)/2} n } = O( n^{-(m-4)/2} log^{-m/2} n ),
    n^{1/2}(J_n − σ²) = Σ_{ν=0}^{2} n^{-ν/2} G_ν^J(θ) + n^{-3/2} R^J(θ),   (15.30)
    n^{1/2}(C_n − σ²) = Σ_{ν=0}^{2} n^{-ν/2} G_ν^C(θ) + n^{-3/2} R^C(θ),   (15.31)
We shall write
    B(J_n) = E_θ n^{1/2}(J_n − σ²),   S(J_n) = E_θ n(J_n − σ²)²,   D(J_n) = D_θ n^{1/2}(J_n − σ²).
Analogously, B(C_n), S(C_n), D(C_n) are the bias, the mean square deviation, and the variance of the normed estimator n^{1/2}(C_n − σ²).
THEOREM 33: Let the conditions of Theorem 32 be satisfied for k = 4. Then we have, uniformly in θ ∈ T,
    (1) B(J_n) = { O(n^{-1/2} log^{-3} n),  m = 8;
                   O(n^{-1} log^{-7/2} n),  m = 9;   (15.34)
                   O(n^{-3/2} log³ n),      m ≥ 10, }
    (2) S(J_n), D(J_n) = σ⁴(β₂ + 2) + 2qσ⁴ n^{-1} + o(n^{-1}),   m ≥ 9,   (15.35)
    (3) B(C_n) = qσ² n^{-1/2} + { O(n^{-1/2} log^{-3} n),  m = 8;
                                  O(n^{-1} log^{-7/2} n),  m = 9;   (15.36)
                                  O(n^{-3/2} log⁴ n),      m ≥ 10, }
    (4) S(C_n) = σ⁴(β₂ + 2) + n^{-1} σ⁴( q² + 2q(β₂ + 3) − 2β₁ σ Z(θ) ) + o(n^{-1}),   m ≥ 9,   (15.37)
where
    β₁ = m₃/σ³   and   β₂ = μ₄/σ⁴ − 3
are the coefficients of skewness and excess of the distribution of the r.v. ε_j, and
    (15.39)
(cf. (19.94)).
Proof: The proof is close to the proofs of Theorems 27 and 28. Therefore let us direct our attention only to certain details. Let us consider first the estimator J_n. Instead of the event Ω_n(θ) of Theorem 27 let there be introduced the event
we obtain the bounds that in the powers of n and log n contain the exponent m instead of [m/2]. Instead of the inequality (13.28) we have the inequality
    n^{1/2} |J_n − σ²| ≤ 4n|P₀| + c₁₈ n^{3/2} |θ̂_n − θ|² + c₁₉ n^{-1} Σ_{t=1}^{n} |θ̂_{(−t)} − θ|²   (15.40)
and instead of (15.40) let us use an analogous inequality for C_n. The calculations of the mathematical expectations analogous to (15.41) are obvious.
To within o(n^{-1}) the estimator J_n has the least bias. The sizes of the biases of σ̂²_n and C_n are identical in modulus but differ in sign. From (13.34) and (15.35) it follows that
The sign of the difference depends upon the sign of the expression on the right-hand side of (15.42). Let us note that β₂ ≥ −2, and the case β₂ = −2 corresponds to a degenerate r.v. ε_j. Let β₁ = 0; then for q > 2(2 + β₂) and n > n₀
But, for example, for Gaussian (0, σ²) r.v.-s ε_j (β₂ = 0), for dimensions q = 1, 2, 3 and n > n₀
Analogously
Secondly,
In this way, for β₁ = 0 and n > n₀ the variance and the mean square deviation of the functional σ̂²_n are the smallest (with the exception of the case q > 2(2 + β₂), when S(σ̂²_n) > S(J_n)). By these indicators J_n occupies the second place, and the functional C_n in this case has the worst characteristics. If, moreover, the r.v. ε_j has a non-zero skewness (β₁ ≠ 0), then the properties of the regression function which are reflected in the term Z(θ) will influence the relations between the variances and mean square deviations (see (15.39)).
16. AEs OF QUADRATIC FUNCTIONALS' DISTRIBUTIONS 207
Setting
u(8) = n 1 / 2 (9 n - 8),
let us consider the following functionals of 9n :
7(1)(8) = u- 2 (L(8) - L(9n )) , (16.1)
7(2)(8) = u- 2 (I(9 n )u(8), u(8)} , (16.2)
7(3)(8) = u- 2 (I(8)u(8), u(8)} , (16.3)
2
7(4)(8) = u- 'Pn(8n , 8). (16.4)
A
For Gaussian (0, σ²) r.v.-s ε_j the r.v.-s (16.1) and (16.2) are the statistics of the
Neyman-Pearson criterion (with coefficient σ²/2) and of the Wald criterion of
hypothesis testing in which the value of the unknown parameter is equal to θ ([189],
Section 6e.2). The functional (16.1) is widely used in regression analysis to
construct confidence regions for the unknown parameter θ. The functional (16.4) is
naturally called the Kullback-Leibler statistic, since for Gaussian (0, σ²)
r.v.-s ε_j the quantity σ⁻²φ_n(θ₁, θ₂) is twice the Kullback-Leibler distance
[39] between the Gaussian measures P^n_{θ₁} and P^n_{θ₂}. And, finally, the functional
(16.3) is a modification of the statistic (16.2) of Wald's criterion.
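The identification of σ⁻²φ_n with twice a Kullback-Leibler distance can be checked directly. A short derivation, assuming (as this sketch does, in the book's notation) that X_j = g(j, θ) + ε_j with ε_j i.i.d. Gaussian (0, σ²) and that φ_n(θ₁, θ₂) = Σ_{j=1}^{n} (g(j, θ₁) − g(j, θ₂))²:

```latex
% KL distance between two Gaussian shift measures with common variance.
% Assumed model: X_j = g(j,\theta) + \varepsilon_j, \varepsilon_j \sim N(0,\sigma^2) i.i.d.
\begin{aligned}
K(P^n_{\theta_1}, P^n_{\theta_2})
  &= \mathbf{E}_{\theta_1} \log \frac{dP^n_{\theta_1}}{dP^n_{\theta_2}}(X)
   = \sum_{j=1}^n \mathbf{E}_{\theta_1}
     \frac{(X_j - g(j,\theta_2))^2 - (X_j - g(j,\theta_1))^2}{2\sigma^2} \\
  &= \frac{1}{2\sigma^2} \sum_{j=1}^n \bigl(g(j,\theta_1) - g(j,\theta_2)\bigr)^2
   = \tfrac{1}{2}\,\sigma^{-2}\varphi_n(\theta_1,\theta_2),
\end{aligned}
```

since the cross term has expectation zero; hence σ⁻²φ_n(θ₁, θ₂) = 2K(P^n_{θ₁}, P^n_{θ₂}), as stated.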
The functionals (16.2)-(16.4) are quadratic in the sense that they converge
weakly to the χ²_q distribution as n → ∞.
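The weak convergence to χ²_q is easy to observe numerically. The following Monte Carlo sketch is purely illustrative and not from the book: it uses a regression function that is linear in the q = 2 parameters (so the l.s.e. is explicit) and deliberately non-Gaussian errors, and evaluates the Wald-type quadratic form of (16.2).

```python
# Monte Carlo sketch of the weak convergence tau -> chi-squared(q).
# Hypothetical setup (not from the book): g(j, theta) = theta1 + theta2 * x_j,
# uniform (hence non-Gaussian) errors with variance sigma2 = 1/3.
import math, random

def wald_statistic(n, theta=(1.0, 2.0), sigma2=1.0 / 3.0, rng=random):
    xs = [j / n for j in range(1, n + 1)]
    eps = [rng.uniform(-1.0, 1.0) for _ in xs]        # Var = 1/3 = sigma2
    ys = [theta[0] + theta[1] * x + e for x, e in zip(xs, eps)]
    # explicit least squares for the two coefficients
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    u = (math.sqrt(n) * (a - theta[0]), math.sqrt(n) * (b - theta[1]))
    # normalised information-type matrix I with entries 1, sx/n, sxx/n
    quad = u[0] ** 2 + 2 * (sx / n) * u[0] * u[1] + (sxx / n) * u[1] ** 2
    return quad / sigma2                              # analogue of (16.2)

random.seed(0)
stats = [wald_statistic(200) for _ in range(2000)]
mean = sum(stats) / len(stats)    # close to q = 2, the mean of chi^2_2
```

The empirical mean of the simulated statistics hovers near q = 2, the mean of the limiting χ²₂ law, regardless of the error distribution.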
This Section contains a theorem about the a.e. of the distributions of the
functionals (16.1)-(16.4). Our goal is to obtain and analyse the initial terms of the
a.e.: they are the most important ones for applications. Therefore the assertions in
this Section are deduced only in the generality necessary for this purpose, although
they are true in more general formulations.
The central place in the Section is occupied by the concept of a virtual vector.
We say that a random vector is virtual if it is similar to the s.a.e. of an l.s.e.,
although, generally speaking, it is not an a.e. of any estimator. As will be seen later,
the concept of a virtual vector is technically convenient for obtaining the a.e. of the
distributions of functionals in θ̂_n of statistics of the Neyman-Pearson type, which
do not admit the expansions (16.7)-(16.9).
It is necessary for us to use a special case of Theorem 18 of Section 7 (see
also Lemma 25.1 of Section 12 and Remark 26.1 of Section 13).
LEMMA 34.1: Let μ₅ < ∞ (condition I₅), let conditions II, III, V of Section 10
be satisfied for k = 4, and condition IV₁ of Section 12 for m = 5. Then, if for any r > 0,
sup_{θ∈T} P^n_θ{|θ̂_n − θ| ≥ r} = o(n^{−3/2}),
and h_ν, ν = 0, 1, 2, are the vector polynomials of Section 7 (taking into account
the normalisation n^{1/2}I_q instead of d_n(θ)),
p₄ = −1/4,   p₅ = −1/2,
and the quantities a_{αjk} and a_{αjkl} are those introduced in Section 7.
LEMMA 34.2: Under the conditions of Lemma 34.1 the functionals γ⁽ᵐ⁾, m =
2, 3, 4, admit the s.a.e.
(16.7)
moreover,
γ[m] = σ⁻²{I_{ij} uⁱuʲ + c⁽ᵐ⁾ Π_{(i)(jk)} uⁱuʲuᵏ n^{−1/2} + ⋯}.   (16.8)
Proof: For the quantities γ⁽ᵐ⁾(θ + n^{−1/2}u), m = 2, 3, 4, let us write the Taylor
expansions in u up to derivatives of the fourth order inclusive, with the remainder
term in Lagrange form, and rewrite them in the form (16.7). Using the
conditions of the Lemma, for t⁽ᵐ⁾, m = 2, 3, 4, we obtain the bound
(16.11)
From the conditions of the Lemma it is not difficult to deduce (cf. (13.4))
that there exist constants c₄ = c₄(T) > 0 and c₅ = c₅(T), c₅′ = c₅′(T) such that
P^n_θ{|u(θ)| ≥ c₄ log^{1/2} n} + Σ P(bᵢ) + Σ P(b_{i₁i₂}) + ⋯ = o(n^{−3/2})
uniformly in θ ∈ T. (16.7) is then evident from (16.11) and (16.12).
LEMMA 34.3: Under the conditions of Lemma 34.1 the functionals τ⁽ᵐ⁾, m =
2, 3, 4, admit the s.a.e.
where:
(1) the ε⁽ᵐ⁾ are r.v.-s having the following property: there exist constants
c⁽ᵐ⁾ = c⁽ᵐ⁾(T) > 0 such that
sup_{θ∈T} P^n_θ{|ε⁽ᵐ⁾| ≥ c⁽ᵐ⁾ log^{2.5} n} = o(n^{−3/2});
(3) B₁ = I_{iα} v^α vⁱvᵏ,
B₂ = (Π_{(α)(jk)} + 2Π_{(j)(αk)}) v^α vⁱvʲvᵏ,
B₃ = Λ^{rs}(Π_{(r)(kl)}Π_{(s)(ij)} + 4Π_{(r)(kl)}Π_{(i)(js)} + 4Π_{(k)(rl)}Π_{(i)(js)}) vⁱvʲvᵏvˡ,
B₄ = Π_{(ij)(kl)} vⁱvʲvᵏvˡ,
B₅ = Π_{(i)(jkl)} vⁱvʲvᵏvˡ,
B₆ = I_{jk,α} v^α vʲvᵏ;
(4) The coefficients α_j⁽ᵐ⁾ and β_j⁽ᵐ⁾ satisfy the following relations:
{1} 2π₁ = α₁⁽ᵐ⁾,
{2} 12π₂ + c⁽ᵐ⁾ = α₂⁽ᵐ⁾,
{3} 2p₂ + π₁² = β₁⁽ᵐ⁾,
{4} 4(p₄ + p₅) + 4π₁π₂ + π₁c⁽ᵐ⁾ = β₂⁽ᵐ⁾,
{5} 8p₆ + 4π₂² + 2π₂c⁽ᵐ⁾ = β₃⁽ᵐ⁾,
{6} 12p₃ + d⁽ᵐ⁾ = β₄⁽ᵐ⁾,
{7} 16p₃ + e⁽ᵐ⁾ = β₅⁽ᵐ⁾.
Clearly,
τ⁽¹⁾(θ) = −σ⁻² Σ_{ν=0}^{∞} P_{ν+1}(θ) n^{−ν/2},
where P_ν, ν = 1, 2, …, are the polynomials of the a.e. (13.16). Analogously we have,
formally,
τ⁽⁴⁾(θ) = σ⁻² Σ_{ν=0}^{∞} A_{ν+1}(θ) n^{−ν/2},
where the quantities A_ν(θ) are given by (13.14). The first terms of the expansions
mentioned are given by the expressions (13.19)-(13.21).
In Table 3.2 are listed the values of the coefficients α_j⁽ᵐ⁾ and β_j⁽ᵐ⁾ for the various
criteria τ⁽ᵐ⁾. For m = 2, 3, 4 the values of α_j⁽ᵐ⁾ and β_j⁽ᵐ⁾ are obtained from (16.6),
(16.10), and (16.14); α_j⁽¹⁾ and β_j⁽¹⁾ are taken immediately from (16.13).
Table 3.2: the values of the coefficients α₁⁽ᵐ⁾, α₂⁽ᵐ⁾ and β₁⁽ᵐ⁾, …, β₆⁽ᵐ⁾ for the criteria τ⁽ᵐ⁾, m = 1, 2, 3, 4.
(16.16)
Let us approximate the d.f. F_n in (16.16) by its a.e. We shall arrive at the
final result after the necessary changes of variables in the approximated integral,
taking into account the values of the remainder terms in the expressions so
obtained.
Unfortunately we cannot apply this method to the important functional τ⁽¹⁾,
since it lacks a representation of the form (16.7), (16.8). Nevertheless, the result of
Lemma 34.3 permits us to unify the method of obtaining the a.e. of the
distributions of all four functionals. Let
(16.17)
where h₃(θ), θ ∈ Θ, is a bounded non-random function on T, h̃₀ = h₀, and h̃₁, h̃₂ are
vector polynomials of the form (16.5) whose coefficients π₁, π₂, p₁–p₆ are
arbitrary and are not obliged to coincide with the values (16.6) corresponding
to θ̂_n. Also let
and
τ[1] = σ⁻²{I_{ij} ūⁱūʲ + c⁽¹⁾ Π_{(i)(jk)} ūⁱūʲūᵏ n^{−1/2} + ⋯},
p₄ + p₅ = −5/24,   p₆ = 1/36,   (16.20)
where the r.v. t⁽¹⁾ has the property (16.9) with the constant
are one of the realisations of a virtual vector and of an s.a.e. respectively. Let us keep
the notation F̃_n(x) for the d.f. of the virtual vector ū. Then for the functional τ⁽¹⁾
the relations (16.15) and (16.16) hold. Consequently the a.e. of its distribution
can be obtained, once the a.e. of the d.f. of the virtual vector ū is available.
In the work [16] an assertion about the a.e. of the d.f. of the functional τ⁽¹⁾ is
proved which uses the a.e. of the d.f. of the vector v(θ) (see Section 10) and the
a.e. (16.13). In spite of the greater naturalness of such an approach in comparison
with the virtual approach just stated, it turns out to be unsuccessful from the
computational point of view. As we have already seen above, the proofs
of theorems about a.e.-s usually also contain a computational scheme, following
which it is possible to find the initial terms of the a.e. that are important in
applications. The proof of the theorem in [16] is no exception to this
rule. However, an attempt to calculate the second term of the asymptotic d.f. of
τ⁽¹⁾ while confining oneself to [16] proved unsuccessful: it came to a
complete halt owing to the extraordinary tedium of the calculation required.
In solving the problem under consideration, by a distinctive law of
conservation of calculational difficulty, it becomes clear that the use of the virtual
approach does not free us from the huge volume of processing. But here the
fundamental calculational difficulty is absorbed into Theorem 34 about the a.e. of
the d.f. of the virtual vector ū, which is close to Theorem 24 of Section 10 about
the a.e. of the d.f. of the vector θ̂_n.
THEOREM 34: Let the conditions of Theorem 24 of Section 10 be satisfied for
k = 4. Then
sup_{θ∈T} sup_{C∈𝔈_q}
(16.21)
where M_ν, ν = 1, 2, are polynomials of degree 3ν in the variables y = (y¹, …, y^q)
with coefficients uniformly bounded in θ ∈ T and n.
We do not give the proof of Theorem 34, since it coincides with the proof
of Theorem 24. The calculation of the polynomials M₁ and M₂ is carried out
r = 1, 2, ….
THEOREM 35: Under the conditions of the preceding Theorem, for any z₀ > 0
(q = 1), z₀ = 0 (q > 1), and m = 1, 2, 3, 4,
sup_{θ∈T} sup_{z≥z₀}
where
λ_{jk}⁽ᵐ⁾ = λ_{jk}⁽ᵐ⁾(α₁⁽ᵐ⁾, α₂⁽ᵐ⁾, β₁⁽ᵐ⁾, …, β₆⁽ᵐ⁾)
are the numerical coefficients characterising the functionals τ⁽ᵐ⁾, and the
quantities P_k(θ) do not depend upon m and are given by the expressions
P₄ = (γ₃²/σ⁶) Λ^{is} Λ^{jr} Λ^{αβ} Π_{(α)(β)(i)} Π_{(s)(j)(r)},
P₅ = (γ₃²/σ⁶) Λ^{is} Λ^{jr} Λ^{αβ} Π_{(α)(j)(i)} Π_{(β)(s)(r)},
P₆ = (γ₃/σ²) Λ^{is} Λ^{jr} Λ^{αβ} Π_{(i)(s)(j)} Π_{(αβ)(r)},
P₇ = (γ₃/σ²) Λ^{is} Λ^{jr} Λ^{αβ} Π_{(α)(i)(j)} Π_{(βs)(r)},
P₉ = σ² Λ^{is} Λ^{jr} Λ^{αβ} Π_{(αβ)(j)} Π_{(ir)(s)},
P₁₀ = σ² Λ^{is} Λ^{jr} Λ^{αβ} Π_{(αi)(j)} Π_{(βr)(s)},
P₁₁ = σ² Λ^{is} Λ^{jr} Λ^{αβ} Π_{(i)(js)} Π_{(α)(βr)},
P₁₂ = σ² Λ^{is} Λ^{jr} Λ^{αβ} Π_{(i)(jr)} Π_{(s)(αβ)},
P₁₃ = σ² Λ^{is} Λ^{jr} Λ^{αβ} Π_{(i)(jα)} Π_{(s)(rβ)},
P₁₄ = σ² Λ^{ij} Λ^{kl} Π_{(ij)(kl)},
P₁₅ = σ² Λ^{ij} Λ^{kl} Π_{(ik)(jl)},
P₁₆ = σ² Λ^{ij} Λ^{kl} Π_{(i)(jkl)}.
Proof: We shall carry out the proof for the virtual vector ū and its d.f. F̃_n(x);
in particular, this includes the case ū = u. The coefficients λ_{jk}⁽ᵐ⁾ are contained in
Table 3.3.
Let us denote
X_n^± = {x : τ[m](θ + n^{−1/2}x) < z ∓ δ⁽ᵐ⁾},
s_n(θ, x) = {u : ⟨J(θ)u, u⟩ ≤ x² log n},
where x > 0 is some constant. Thanks to (16.15), (16.16), the Theorem will be
proved if the required expansion can be obtained for the integrals ∫_{X_n^±} dF̃_n. The
sets X_n^± ∩ s_n(θ, x) are convex for n > n₀. On the other hand the constant x can
be chosen such that
Table 3.3 (fragment, rows k = 12, …, 16): the coefficients λ_{jk}⁽ᵐ⁾, j = 0, 1, 2, 3, expressed through α₁, α₂ and β₁, …, β₆.
uniformly in θ ∈ T, since clearly the virtual vector ū has the property (16.12).
Therefore it is sufficient to restrict ourselves to the integrals ∫_{X_n^± ∩ s_n} dF̃_n.
By Theorem 34
sup_{θ∈T} sup_{z≥z₀}
Let us consider the integral Y_n^+; the integral Y_n^− is treated analogously. In
Y_n^+ let us carry out the substitution of variables u → σΛ^{1/2}u, and then the polar
substitution u → (r, φ), φ = (φ¹, …, φ^{q−1}):
uⁱ = r ∏_{α=1}^{i−1} sin φ^α · cos φⁱ,  i = 1, …, q,
φ^α ∈ [0, π),  α = 1, …, q − 2,
φ^{q−1} ∈ [0, 2π),  φ^q ≡ 0,  r ≥ 0.
Then the function τ[m] is transformed into the form
where a₁ and a₂ are trigonometric polynomials in the variables φ¹, …, φ^{q−1}. For
example,
where
I(r, φ) = r^{q−1} ∏_{i=1}^{q−2} (sin φⁱ)^{q−i−1}
and the m_ν⁽ʲ⁾ are trigonometric polynomials in φ¹, …, φ^{q−1} determined by the
substitutions of variables mentioned above from the formulae for the polynomials
M₁ and M₂.
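The Jacobian I(r, φ) above can be sanity-checked numerically: integrating I(1, φ) over Π_q = [0, π)^{q−2} × [0, 2π) must reproduce the surface area of the unit (q−1)-sphere, 2π^{q/2}/Γ(q/2). A small illustrative sketch (the midpoint grid size m is an arbitrary choice):

```python
# Numerical check of the Jacobian I(r, phi) = r^{q-1} * prod (sin phi^i)^{q-i-1}
# of the polar substitution: integrating I(1, phi) over the angular domain
# gives the surface area 2 * pi^{q/2} / Gamma(q/2) of the unit sphere.
import math
from itertools import product

def sphere_area_by_polar(q, m=300):
    # midpoint rule over the q-1 angular variables
    ranges = [math.pi] * (q - 2) + [2.0 * math.pi]
    steps = [r / m for r in ranges]
    total = 0.0
    for idx in product(range(m), repeat=q - 1):
        phis = [(k + 0.5) * h for k, h in zip(idx, steps)]
        jac = 1.0
        for i in range(q - 2):                # exponent q - i - 1 for i = 1, ...
            jac *= math.sin(phis[i]) ** (q - i - 2)   # 0-based loop index
        total += jac
    return total * math.prod(steps)
```

For q = 3 the value should be close to 4π ≈ 12.566, the area of the ordinary sphere.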
By Fubini's Theorem
(16.23)
Let us denote by H_n(φ) the image of the interval i_n under the mapping r →
τ[m](r, φ). It is clear that for n > n₀ and some constant c₈ ∈ (0, ∞)
uniformly in θ ∈ T. Let
Ψ_n(ρ, φ) : H_n → ℝ¹
be the function inverse to τ[m](r, φ)|_{i_n}. The function Ψ_n can be found formally
in the form of a series in half-integer powers of ρ, whose coefficients are
trigonometric polynomials in φ¹, …, φ^{q−1} with coefficients uniformly bounded in
θ ∈ T and n:
r = Ψ_n(ρ, φ) = ρ^{1/2} + Σ_{ν=1}^{∞} n^{−ν/2} ρ^{(ν+1)/2} σ^ν Δ_ν(φ).   (16.24)
The coefficients Δ_ν are calculated by substituting τ[m](r, φ) into the series
(16.24) and setting equal to zero the coefficients of all powers of r > 1. Let us find
the quantities Δ₁ and Δ₂. Let us denote
ρ = τ[m](r, φ).
Then
τ = σrn^{−1/2}.
Since
Consequently
(16.25)
½Δ₁ + Δ₁′ = 0,
½Δ₂ − ⅛Δ₁² + Δ₁Δ₁′ + Δ₂′ = 0.
In particular we find
Δ₂′ = (5/8)Δ₁² − ½Δ₂.   (16.26)
with
r_n⁽¹⁾ = O(log² n),
and
where
r_n⁽²⁾ = O(log n),
uniformly in θ ∈ T and φ ∈ Π_q, and for ρ < c₈ log n.
Let us denote by Z_n^+ the integral in (16.23) over the set X_n^+, and let us carry
out in it the change of variable
r → ρ = τ[m](r, φ).
For n > n₀ we obtain
Z_n^+ = ∫_{H_n ∩ [0, z+δ⁽ᵐ⁾]} Ψ_n^{q−1} e^{−Ψ_n²/2} (1 + Σ_{ν=1}^{2} M_ν(Ψ_n, φ) n^{−ν/2}) (∂Ψ_n/∂ρ) dρ.   (16.29)
In (16.29) let us substitute the representations (16.27) and (16.28) for Ψ_n and
∂Ψ_n/∂ρ, inserting for the Δ̃₁ and Δ̃₂ in them the quantities Δ₁ and Δ₂ from
(16.26). Simple transformations show that
Z_n^+ = ½ ∫_{H_n ∩ [0, z+δ⁽ᵐ⁾]} e^{−ρ/2} ρ^{q/2−1} (1 + Σ_{ν=1}^{2} M̃_ν(ρ, φ) n^{−ν/2}) dρ + r_n⁽³⁾ n^{−3/2}.   (16.30)
In the representation (16.30)
r_n⁽³⁾ = O(log² n),
M̃₁ = (a₃ m₁⁽³⁾ + Δ₁) ρ^{3/2} + (a₁ m₁⁽¹⁾ − ((q − 1)/2) Δ₁) ρ^{1/2},   (16.31)
(16.32)
The coefficients of the polynomials M̃_ν are trigonometric polynomials in φ¹, …,
φ^{q−1} with coefficients uniformly bounded in θ ∈ T and n. For ν = 1 each term of
these trigonometric polynomials contains either one odd power of cos φⁱ, i = 1, …,
q − 1, or an odd power of sin φ^{q−1}; for ν = 2 it contains terms with even powers
of the functions stated.
It is easy to see that for a proper x one can obtain
½ ∫_{ℝ¹∖H_n} e^{−ρ/2} ρ^{q/2−1} (1 + Σ_{ν=1}^{2} M̃_ν n^{−ν/2}) dρ = O(n^{−3/2})
uniformly in θ ∈ T and φ ∈ Π_q. Therefore (16.30) can be rewritten in the form
Z_n^+ = ½ ∫_{0}^{z+δ⁽ᵐ⁾} e^{−ρ/2} ρ^{q/2−1} (1 + Σ_{ν=1}^{2} M̃_ν n^{−ν/2}) dρ + r_n⁽⁴⁾ n^{−3/2},
Y_n^+ = ½ (2π)^{−q/2} ∫_{0}^{z+δ⁽ᵐ⁾} e^{−ρ/2} ρ^{q/2−1} ∫_{Π_q} I(1, φ) (1 + Σ_{ν=1}^{2} M̃_ν n^{−ν/2}) dφ dρ   (16.33)
and
uniformly in θ ∈ T.
Let us integrate (16.33) with respect to φ. Thanks to the properties of the
trigonometric coefficients of the polynomial (16.31), the integral with respect to φ
of the polynomial I(1, φ)M̃₁ is equal to zero. In fact each coefficient of I(1, φ)M̃₁
contains:
(1) either one integral of the form ∫₀^π sin^a φ^j cos^b φ^j dφ^j, j = 1, …, q − 1, where
b = 1 or 3;
(2) or the integral ∫₀^{2π} sin^a φ^{q−1} cos^b φ^{q−1} dφ^{q−1} with a = 1 or 3 and even b.
But such integrals vanish.
Let each
uⁱ = ∏_{j=1}^{i−1} sin φ^j · cos φⁱ,  i = 1, …, q,
appear in M̃₂ in degree ℓᵢ. As has just been established, the integral with
respect to φ differs from zero if and only if all the ℓᵢ are even numbers (in
particular, zero). In such a case, when the integration with respect to φ is performed, in the
expressions (16.33) there appear the integrals
Y_n^+ = (1/(2^{q/2} Γ(q/2))) ∫_{0}^{z} e^{−ρ/2} ρ^{q/2−1} (1 + n⁻¹ P_n(ρ)) dρ + O(δ⁽ᵐ⁾).
An analogous representation also holds for Y_n^−. Analysing the coefficients P_j(θ),
j = 0, 1, 2, 3, of the polynomial P_n(ρ) it is not difficult to establish that
(16.34)
where
λ̃_{jk}⁽ᵐ⁾ = λ̃_{jk}⁽ᵐ⁾(c⁽ᵐ⁾, d⁽ᵐ⁾, e⁽ᵐ⁾, π₁, π₂, p₁, …, p₆)
are the coefficients characterising the functional τ⁽ᵐ⁾. With the help of the system
of equations (16.14) let us express the quantities d⁽ᵐ⁾, π₁, π₂, p₁, …, p₆ through
the variables c⁽ᵐ⁾, e⁽ᵐ⁾, α₁, α₂, β₁–β₆ in the following way:
(16.35)
π₁ = ½α₁,   π₂ = (α₂ − c⁽ᵐ⁾)/12,
β₆ = 3e⁽ᵐ⁾ + 4β₄ − 3β₅,
p₂ = ½β₁ − ⅛α₁²,
p₃ = (β₅ − e⁽ᵐ⁾)/16, …
Let us substitute the relations (16.35) into the coefficients λ̃_{jk}⁽ᵐ⁾ in equality
(16.34). Collecting similar terms gives the coefficients λ_{jk}⁽ᵐ⁾ the
form indicated in Table 3.3.
REMARK 35.1: For m = 2, 3, 4 the quantities π₁, π₂, p₁–p₆ are given by (16.6).
Therefore in (16.34)
λ̃_{jk}⁽ᵐ⁾ = λ̃_{jk}⁽ᵐ⁾(c⁽ᵐ⁾, d⁽ᵐ⁾, e⁽ᵐ⁾)
and their values as functions of c⁽ᵐ⁾, d⁽ᵐ⁾ and e⁽ᵐ⁾ can be set out in a table
analogous to Table 3.3.
Let us consider the question of Bartlett's corrections for the a.e. (16.22) of
Theorem 35. Bartlett's correction is understood here as a perturbation of the
argument of the d.f. of the functionals τ⁽ᵐ⁾ for which the term of order n⁻¹ in
their a.e. vanishes.
Let us write the a.e. for P^n_θ{τ⁽ᵐ⁾ < z} in a form more suitable for
applications. Let us denote
w_j⁽ᵐ⁾ = Σ_{k=1}^{16} λ_{jk}⁽ᵐ⁾ P_k,  j = 0, 1, 2, 3.
The quantities w_j⁽ᵐ⁾ differ from the coefficients P_j of the polynomial P_n(ρ) in
(16.34) only by constant factors. There holds the remarkable equality
Σ_{j=0}^{3} w_j⁽ᵐ⁾ = 0,   (16.36)
which was discussed in another context in [47]. We have (see Table 3.3)
Σ_{j=0}^{3} λ_{jk}⁽ᵐ⁾ = 0,  k = 1, …, 16,
Denoting
b_k⁽ᵐ⁾ = Σ_{j=0}^{k−1} w_j⁽ᵐ⁾
and once again taking (16.36) into account, we finally obtain
Σ_{j=0}^{3} w_j⁽ᵐ⁾ G_{q+2j} = −2 Σ_{k=1}^{3} b_k⁽ᵐ⁾ g_{q+2k}.   (16.39)
Let z = a² and let
φ(a) = (2π)^{−1/2} e^{−a²/2}
be the standard Gaussian density. Then, combining (16.37) and (16.39), we obtain
P^n_θ{τ⁽ᵐ⁾ ≥ a²} = G_q(a²) + hφ(a) {δ − Σ_{k=1}^{3} b_k⁽ᵐ⁾ a^{2k} Γ(q/2) / (2^{k−1} Γ(½(q + 2k)))} n⁻¹ + o(n⁻¹),   (16.41)
where
h = √(2π) a^{q−2} / (2^{q/2} Γ(q/2)).
Choosing δ by equating to zero the term of order n⁻¹ on the right-hand side of
(16.41) we arrive at the expression
δ = Σ_{k=1}^{3} b_k⁽ᵐ⁾ a^{2k} Γ(q/2) / (2^{k−1} Γ(½(q + 2k))).   (16.42)
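The correction δ is a simple closed form in the b_k⁽ᵐ⁾. A small sketch of its evaluation follows; the numerical coefficients b passed in are purely hypothetical placeholders, since the actual b_k⁽ᵐ⁾ depend on the model through (16.34):

```python
# Sketch of the Bartlett-type correction of (16.42):
#   delta = sum_{k=1}^{3} b_k * a^{2k} * Gamma(q/2) / (2^{k-1} * Gamma((q+2k)/2)).
# The tuple b holds the partial sums b_k^{(m)} = w_0^{(m)} + ... + w_{k-1}^{(m)};
# the values used in the test below are hypothetical.
import math

def bartlett_delta(a, q, b):
    """a: argument (z = a^2), q: dimension, b: (b_1, b_2, b_3)."""
    return sum(
        bk * a ** (2 * k) * math.gamma(q / 2)
        / (2 ** (k - 1) * math.gamma((q + 2 * k) / 2))
        for k, bk in enumerate(b, start=1)
    )
```

Perturbing the argument of the d.f. by this δ (of order n⁻¹) is what removes the n⁻¹ term of (16.41).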
uniformly in θ ∈ T, where
(16.44)
To conclude the Section we shall give explicit expressions for the polynomials
M₁ and M₂:
M₁(y)
− (γ₃/2σ⁶)(π₁ + 6π₂) Λ^{ij} Π_{(αβ)(γ)} Π_{(i)(j)(δ)}
− (γ₃/2σ⁶)(π₁ + 2π₂) Λ^{ij} Π_{(αβ)(i)} Π_{(i)(j)(δ)}
+ (γ₃/2σ⁶) π₁ Π_{(α)(β)(γδ)}
− (γ₃/6σ⁶)(π₁ + 4π₂) Λ^{ij} Π_{(ij)(α)} Π_{(β)(γ)(δ)}
− (γ₃/2σ⁶)(π₁ + 4π₂) Λ^{ij} Π_{(α)(βi)} Π_{(j)(γ)(δ)}
− (1/σ⁴)(π₁² + 12π₁π₂ + 40π₂²) Λ^{rs} Π_{(r)(sα)} Π_{(β)(γδ)}
+ (1/σ⁴)[−(π₁² + 10π₁π₂ + 32π₂²) + 2(p₄ + p₅ + 8p₆)] Λ^{rs} Π_{(r)(αβ)} Π_{(sγ)(δ)}
− (1/σ⁴)(π₁ + 4π₂)(π₁ + 8π₂) Λ^{rs} Π_{(rs)(α)} Π_{(βγ)(δ)}
+ (γ₃²/4σ⁸) Λ^{ij} Λ^{kl} Π_{(i)(j)(k)} Π_{(l)(α)(β)}
+ (γ₃²/4σ⁸) Λ^{ij} Λ^{kl} Π_{(i)(k)(α)} Π_{(j)(l)(β)}
+ (γ₃²/8σ⁸) Λ^{ij} Λ^{kl} Π_{(i)(j)(α)} Π_{(k)(l)(β)}
+ (γ₃/2σ⁴) π₁ Λ^{ab} Λ^{rs} Π_{(rs)(a)} Π_{(b)(α)(β)}
− (γ₃/σ⁴) π₁ Λ^{ab} Π_{(aα)(b)(β)}
− (γ₃/2σ⁴) π₁ Λ^{ab} Π_{(αβ)(a)(b)}
− (γ₃/2σ⁴) π₁ Λ^{ab} Π_{(ab)(α)(β)}
− (γ₃²/12σ⁶) Λ^{ij} Λ^{kl} Λ^{rs} Π_{(i)(k)(r)} Π_{(j)(l)(s)}
− (γ₃²/8σ⁶) Λ^{ij} Λ^{kl} Λ^{rs} Π_{(i)(j)(k)} Π_{(l)(r)(s)}
+ (γ₃/2σ²) π₁ Λ^{ir} Λ^{js} Π_{(ir)(j)(s)}
+ 2π₁² σ² Λ^{ir} Λ^{js} Π_{(ir)(js)}
− 2π₁² σ² Λ^{ir} Λ^{js} Λ^{ab} Π_{(ir)(a)} Π_{(js)(b)}
+ σ²(p₂ − ½π₁²) Λ^{ir} Λ^{js} Λ^{ab} Π_{(rj)(a)} Π_{(si)(b)} }.
In this Section the model (0.1) is considered with a regression function g(j, θ)
depending on a scalar parameter θ ∈ Θ, where Θ is an open interval of the real
line. Let us assume that the function g(j, θ) has continuous derivatives
with respect to θ up to the fourth order inclusive.
Let us write
It is clear that these notations correspond to the notations for q > 1 introduced
earlier.
Let us consider three functionals of the observations (0.1):
where
u(θ) = n^{1/2}(θ̂_n − θ).
The Neyman-Pearson functional τ⁽¹⁾ and the Wald functional τ⁽²⁾ have already
been considered in the preceding Section. For Gaussian (0, σ²) r.v.-s ε_j the
functional τ⁽⁰⁾ is Rao's criterion for testing the simple hypothesis H₀ : θ =
θ₀, where θ₀ is the prescribed value of the parameter θ ([189], Section 6e.2). The
important difference between the functional τ⁽⁰⁾ and the two other functionals is
that it does not depend upon θ̂_n, and it therefore seems preferable
from a practical point of view. In fact, using the functional τ⁽⁰⁾ we economise on
the procedure of calculating the l.s.e. θ̂_n.
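This practical point is easy to make concrete: a Rao-type score statistic is evaluated at θ₀ directly, with no least-squares fit at all. The following sketch uses a hypothetical scalar model g(j, θ) = exp(θ x_j) with known error variance, which is an assumption of this illustration, not the book's example:

```python
# Sketch: a Rao-type score statistic needs no optimisation step.
# Hypothetical model: g(j, theta) = exp(theta * x_j), known sigma^2.
import math, random

def rao_statistic(xs, obs, theta0, sigma2):
    g = [math.exp(theta0 * x) for x in xs]           # g(j, theta0)
    dg = [x * math.exp(theta0 * x) for x in xs]      # dg/dtheta at theta0
    score = sum((o - gj) * dgj for o, gj, dgj in zip(obs, g, dg))
    info = sum(d * d for d in dg)                    # sum of squared derivatives
    return score ** 2 / (sigma2 * info)              # ~ chi^2_1 under H0

random.seed(1)
theta0, sigma2, n = 0.5, 0.25, 500
xs = [j / n for j in range(1, n + 1)]
obs = [math.exp(theta0 * x) + random.gauss(0.0, math.sqrt(sigma2)) for x in xs]
t0 = rao_statistic(xs, obs, theta0, sigma2)          # no l.s.e. computed
```

By contrast, a Wald- or Neyman-Pearson-type statistic would first require minimising the sum of squares to obtain θ̂_n.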
No longer assuming that the ε_j are Gaussian r.v.-s, we shall use the functionals
τ⁽ᵐ⁾, m = 0, 1, 2, for the construction of criteria Ψ_n⁽ᵐ⁾ for the solution of the
following problem of hypothesis testing.
Let us fix a number δ ≠ 0 and consider a sequence of criteria Ψ_n⁽ᵐ⁾ for
testing, by the observations X = (X₁, …, X_n), the simple hypothesis
H₀ : θ = θ₀,  θ₀ ∈ Θ,
against
H_δ : θ = θ₀ + δn^{−1/2}.
The question is: which of the criteria Ψ_n⁽ᵐ⁾ generated by the statistics (17.1)-
(17.3) are asymptotically locally more powerful, as n → ∞, than their competitors,
for properly equated errors of the first kind?
In this Section an answer is given to this question under the conditions of
regularity assembled below. For this we shall not assume that for the r.v.-s ε_j
17. POWERS OF TESTS 231
there exists a (smooth) density p. In this way the solution of the problem
cannot be reduced to a direct application of the theory developed in the works
[46, 48-51, 65, 66, 143], for two reasons.
Firstly, in the model (0.1) the distributions of the X_j differ by shifts, whereas
the investigations mentioned were conducted for a repeated sample.
Secondly, in the works on statistical criteria mentioned, the conditions of
regularity and the expressions for the deficiencies of the criteria calculated under
these conditions (see, for example, [65]) contain a density of the observations and its
first derivative. We cannot, however, use the concept of Fisher information
within the framework of our regularity conditions.
A method of comparison of powers for a class of tests is stated below, and the two-
sided modifications of the Ψ_n⁽ᵐ⁾-tests with statistics (17.1)-(17.3) are representatives
of this class. The essence of the results consists in the following.
The critical regions of the criteria Ψ_n⁽ᵐ⁾ are chosen so that the errors of the
first kind coincide up to a magnitude o(n⁻¹). It is shown that the powers of these
criteria begin to differ only in terms of order n⁻¹, and that these differences are
determined only by the terms of order n^{−1/2} of the s.a.e. of the generating functionals
τ⁽ᵐ⁾. The choice of the more powerful criteria essentially depends upon the symmetry
of the distribution of the r.v.-s ε_j. If the third-order cumulant, and with it
the skewness coefficient β₁ of the r.v. ε_j, is equal to zero, then the answer
coincides with the answer in the works [49-51], namely: the two-sided
modification of Rao's criterion is the most powerful of the three under
consideration. If γ₃ ≠ 0, then the answer turns out to be considerably more complicated,
and any of the three criteria Ψ_n⁽ᵐ⁾, m = 0, 1, 2, can be the most powerful.
The conditions of regularity in this Section repeat the conditions introduced
earlier in Section 10 for q = 1, k = 4, and retain their previous numbering.
Let us write
II. For any R > 0 there exist constants c_i = c_i(R, T) < ∞, i = 1, 2, such that
(1) sup_{θ∈T} sup_{|u|≤R, u∈U^c(θ)} n^{−1/2} d_r(θ + u) ≤ c₁,  r = 1, 2, 3, 4;
VIII. There exists an integer h ≥ 3 such that amongst any h vectors of the
collection {y_j, j = m + 1, …, m + h}, 0 ≤ m ≤ n − h, n ≥ h + 1, three vectors
y_{j₁}, y_{j₂}, y_{j₃} can be found such that the matrix
Y_m^{(h)} = ⅓ Σ_{i=1}^{3} y_{jᵢ}(θ) y_{jᵢ}ᵀ(θ)
Let us further assume that T ⊂ Θ is a fixed compact set containing the point
θ₀ together with some neighbourhood of it.
The functionals τ⁽ᵐ⁾, m = 0, 1, 2, admit s.a.e.-s analogous to Lemma 34.3 of the
preceding Section.
LEMMA 36.1: Let us assume that conditions I-IV are satisfied, and that for any
ρ > 0
(17.4)
where
(1) the ε⁽ᵐ⁾ are r.v.-s having the following property:
(17.8)
where for q = 1
h̃₀ = Λb₁,
+ ((9/2)Λ⁵Π₁₂² − (2/3)Λ⁴Π₁₃ − (1/2)Λ⁴Π₂₂) b₁³   (17.9)
Let us note further that under the conditions of the Lemma the functional γ⁽²⁾
admits a representation analogous to (16.7)-(16.9):
γ⁽²⁾ = γ[2] + t⁽²⁾ n^{−3/2},   (17.10)
γ[2] = σ⁻²{I u² + 2Π₁₂ u³ n^{−1/2} + (Π₂₂ + Π₁₃) u⁴ n⁻¹},   (17.11)
Without specifying the dependence upon m, let us rewrite the principal part
of the s.a.e. (17.5) in the form
σ²τ′ = Λb₁²(1 + An^{−1/2} + Bn⁻¹),   (17.12)
where
LEMMA 36.2: Let the conditions (17.4) and I-IV be satisfied. Then for τ = τ⁽ᵐ⁾,
m = 0, 1, 2,
(17.15)
and let the events
{|b_r(θ)| ≤ c₃⁽ʳ⁾ log^{1/2} n},  r = 1, …,
be realised, where
c₃⁽ʳ⁾ = c₃⁽ʳ⁾(T) < ∞
are certain constants. Then by the binomial expansion
W = W̃ + En^{−3/2},
where E is the r.v. for which
where
c₆ = c₆(T) < ∞
is some constant.
Consequently, if the event X₁ is realised then the representation (17.15) holds;
however, instead of (17.17) it is possible only to assert that
where γ_s is the cumulant of the sth order of the r.v. ε_j; the key role in the
presentation later on is played by the quantities
l_k = σ^{−k} J_k = γ_k σ^{−k} Π_{1⋯1}  (k indices 1),   (17.19)
Let us set
where
(17.24)
(17.25)
For obtaining the a.e. of the probability of the error of the first kind and of the
power of the modified criterion it is necessary to have available the a.e. of the
statistic
W = W(θ₀, X)
under the null hypothesis θ₀ and under the alternative
θ_δ = θ₀ + δn^{−1/2}.
We obtain the expansion of the r.v. W for θ₀ by writing the expression (17.16) in
orders of n^{−1/2}.
Let us denote
+ α₂ Λ^{5/2}(θ₀) Π₁₂(θ₀) b₁³(θ₀; X) } n^{−1/2}
+ ½β₄ Λ^{7/2}(θ₀) Π₂₂(θ₀) b₁³(θ₀; X)
+ ½β₅ Λ^{7/2}(θ₀) Π₁₃(θ₀) b₁³(θ₀; X)
(17.26)
Let us call the reader's attention to the fact that criteria using the statistics W,
in spite of their specific awkwardness, are simpler than the generating criteria (except
Rao's criterion), in the sense that the r.v. W depends on θ₀ and the observations
X, and does not depend upon the estimator θ̂_n. This last property brings them
close to Rao's criterion.
The expansion of W (mod P^n_{θ_δ}) is not difficult to obtain from (17.26) in the
following way. Clearly
(mod P^n_{θ_δ})
Let us denote
λ = σ⁻¹ I^{1/2}(θ₀) δ.
Substituting (17.27) and (17.28) into (17.26) and collecting similar terms, we obtain
P^n_{θ_δ}-a.c.
(17.29)
where
W* = Σ_{ν=0}^{2} W*_ν n^{−ν/2}
and:
(1) W*₀ = λ + σ⁻¹ Λ^{1/2} b₁,   (17.30)
(2) W*₂ is a r.v. having the following property: there exists a constant c₇ < ∞
such that
(17.33)
If we omit the terms in (17.30)-(17.32) containing λ^r, r = 1, 2, 3, then we
obtain the expansion of the r.v. W under the null hypothesis. Let us emphasise
that all the functions of θ in the formulae (17.30)-(17.32) are calculated at the
point θ = θ₀.
From further arguments it follows that
sup_{u≥0} |P^n_{θ_δ}{W > u} − P^n_{θ_δ}{W* > u}| = O(n^{−3/2} log^{3/2} n),
and, analogously,
sup_{u≥0} |P^n_{θ_δ}{W < −u} − P^n_{θ_δ}{W* < −u}| = O(n^{−3/2} log^{3/2} n).
Let us denote by
k_j = k_{j0} + k_{j1} n^{−1/2} + k_{j2} n⁻¹ + o(n⁻¹),  j = 1, 2, 3, 4,   (17.34)
the a.e.-s of the cumulants of the jth order of the r.v. W* in the measure P^n_{θ_δ}. Then,
using the s.a.e. (17.29)-(17.32) of the statistic W*, one can formally calculate the
quantities required later on:
k_{jν} = Σ_{r=0}^{m_{jν}} k_{jν}(r) λ^r,  ν = 0, 1, 2,   (17.35)
k₁₀(1) = 1,
k₁₁(0) = ½(α₁ + α₂) l₁₁,
k₁₂(2) = ½(α₁ + α₂ + 1) l₁₁,
k₁₁(1) = (−⅛α₁² + (3/2)β₁ + ⋯) l₀₂ + (3/2)(β₆ + ½β₄ + β₅) l₁₀l₀₁
k₂₂(2) = (½α₁ + ¼α₁² + (3/2)β₄) l₀₂ + 3(β₆ + β₅) l₁₀l₀₁
+ {½α₁ + α₂ + ½α₁α₂ + ¼α₂² + 3(β₁ + β₂ + β₅)} l₁₁
+ 3{(5/2)α₁α₂ + (3/4)α₁² + (5/4)α₂² + 3(β₁ + β₂ + β₅)} l₁₁²
(17.36)
Let us remark in passing that for δ = 0 (17.36) turns into the equality
p₀ = p₀⁺ + p₀⁻,   (17.37)
where p₀ is the error of the first kind of the modified criterion.
THEOREM 36: Let the conditions I-IV, VIII, and IX be satisfied. Then
P_δ = Σ_{ν=0}^{2} π_ν n^{−ν/2} + o(n⁻¹),   (17.38)
where
and where f(x; 1, λ²) is the density of the non-central χ² distribution with one
degree of freedom and non-centrality parameter λ²;
(2) π_ν = φ(a − λ) Σ_{j=1}^{r_ν} λʲ Q_{νj}(a) + φ(a + λ) Σ_{j=1}^{r_ν} (−λ)ʲ Q_{νj}(a),   (17.40)
and the Q_{νj}(a) are polynomials in a with coefficients depending upon the cumulants
k_j, j = 1, 2, 3, 4, of the r.v. W*.
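For one degree of freedom the density f(x; 1, λ²) has a convenient closed form: it is the law of (Z + λ)² for standard Gaussian Z, which gives f(x) = e^{−(x+λ²)/2} cosh(λ√x)/√(2πx). A self-contained sketch:

```python
# Density f(x; 1, lam^2) of the non-central chi-square law with one degree
# of freedom and non-centrality lam^2, i.e. the law of (Z + lam)^2, Z ~ N(0,1).
# Closed form: f(x) = exp(-(x + lam^2)/2) * cosh(lam*sqrt(x)) / sqrt(2*pi*x).
import math

def ncx2_pdf_1df(x, lam):
    if x <= 0.0:
        return 0.0
    return math.exp(-(x + lam * lam) / 2.0) * math.cosh(lam * math.sqrt(x)) \
        / math.sqrt(2.0 * math.pi * x)
```

For λ = 0 this reduces to the central χ²₁ density, and the density integrates to one over (0, ∞).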
Proof: Let us outline the proof of the Theorem. We use the Edgeworth
expansion of the d.f. of the vector v(θ) ∈ ℝ³ and the a.e. of the statistic W, understood
as a function of the coordinates of the vector v(θ), for writing the desired
probabilities p_δ^± in the form of integrals, with subsequent transformation and
estimation of these integrals. Such an approach was fruitfully used in the preceding
part of the book.
The calculation of p_δ^± can be carried out by at least two methods. The first
method consists in calculating the integrals arising in the course of the proof of the
Theorem by the method just indicated. The second method is the application
of the δ-method, described in Section 14, to the asymptotically normal statistic
W.
Both methods, unfortunately, are equally laborious; this is an intrinsic
peculiarity of the problem under consideration. In this Section the calculations
are carried out by the δ-method. Following [49-51] we find
p_δ⁺ = P^n_{θ_δ}{W − λ > z_α⁺ − λ}
= 1 − Φ(x) + φ(x) Σ_{ν=1}^{2} V_ν⁺(x) n^{−ν/2} + o(n⁻¹),   (17.41)
where
Φ(x) = ∫_{−∞}^{x} φ(t) dt,  x = a − λ,
and
V_ν⁺ = Σ_{r=1}^{3ν} β⁺_{rν} H_{r−1}(x),  ν = 1, 2,   (17.42)
where the H_r(x) are the Chebyshev-Hermite polynomials of order r. The
coefficients β⁺_{rν} in (17.42) have the form
β⁺₃₁ = (1/6) k₃₁,
β⁺₁₂ = −c⁺ + k₁₂,
β⁺₄₂ = (1/12) k₂₁ k₃₁,
β⁺₆₂ = (1/72) k₃₁².   (17.43)
In (17.41) let us choose b⁺ and c⁺ from the condition V₁⁺ = V₂⁺ = 0 for λ = 0.
Then
(17.44)
Since
k₁₂(0) = k₂₁(0) = k₃₂(0) = 0,
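The Chebyshev-Hermite polynomials entering (17.42) are generated by the probabilists' recurrence H_{r+1}(x) = x H_r(x) − r H_{r−1}(x) with H₀ = 1, H₁(x) = x — the normalisation paired with the density φ. A minimal sketch:

```python
# Chebyshev-Hermite (probabilists' Hermite) polynomials via the recurrence
#   H_{r+1}(x) = x*H_r(x) - r*H_{r-1}(x),  H_0 = 1,  H_1(x) = x.

def hermite(r, x):
    h_prev, h = 1.0, x          # H_0, H_1
    if r == 0:
        return h_prev
    for k in range(1, r):
        h_prev, h = h, x * h - k * h_prev
    return h
```

For example, the recurrence yields H₂(x) = x² − 1 and H₃(x) = x³ − 3x, the polynomials that carry the k₃₁ terms of (17.43).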
(17.45)
(17.46)
The calculation of p_δ⁻ is realised completely analogously and leads to the
relations
and
p₀⁻ = P^n_{θ₀}{W < z_α⁻} = α/2 + o(n⁻¹),   (17.48)
p_δ⁻ = P^n_{θ_δ}{W − λ < z_α⁻ − λ}
= Φ(x̄) + φ(x̄) Σ_{ν=1}^{2} V_ν⁻(x̄) n^{−ν/2} + o(n⁻¹),   (17.49)
where
x̄ = −a − λ
and
The relations (17.44) and (17.48) show that the errors of the first kind of all
the modified criteria are equal to
For the completion of the proof of Theorem 36 it remains to write out the
polynomials Q_{νj}(a):
− (5/36) k₃₁²(0) − ⅓ k₃₂(1) + ⅛ k₄₂(0),
+ {k₁₂(3) − ½ k₂₂(2) + ⅓ k₃₂(1) − (1/24) k₄₂(0)},
− ⅓ k₃₁(0) k₁₁(2) + (1/12) k₂₁(1) k₃₁(0) − (1/72) k₃₁²(0).
Let us consider the problem of the comparison of the powers of the criteria Ψ_n⁽ᵐ⁾
based upon the statistics W. In the preceding part of the Section it was shown, in
particular, that under the appropriate conditions of regularity these statistics have
identical limiting Gaussian (0, 1) distributions under the null hypothesis and identical
accompanying Gaussian distributions (λ, 1) for a sequence of close alternatives.
(We can speak of the limiting law (λ₀, 1) only in the event that there exists
lim_{n→∞} I(θ₀) = I₀.
Then
λ₀ = σ⁻¹ I₀^{1/2} δ.)
The quantity
e₁ = π₀ = I(a; 1, λ²)
gives an idea of the local behaviour of the power curves of the modified criteria in
a neighbourhood of the point θ₀. From (17.39) it follows that e₁ does not depend
upon m. In particular, if there exists
then the preceding assertion means that all the criteria considered have one and the
same Pitman efficiency, i.e., they are asymptotically equivalent.
In order to establish the distinguishability of the criteria it is necessary to
investigate their equivalence at a higher order. It is known [179, 34] that for a
broad class of one-sided criteria efficiency of the first order implies efficiency
of the second order. Below we come up against a phenomenon, similar in form
(the values π₁ in (17.38) do not depend upon m), which, however, has a quite
different nature, and is associated essentially with the two-sided character of the modified
criteria under consideration. A corresponding fact does not
hold for the original one-sided variants of the criteria under consideration.
And so our goal is the calculation and comparison of the quantities π₂ = π₂⁽ᵐ⁾.
THEOREM 37: Let the conditions of Theorem 36 be satisfied. Then the polynomials
Q_{νj} and the quantities b⁺, c⁺ defining the powers P_δ and the error of the first kind
p₀ of the modified criteria Ψ_n⁽ᵐ⁾ can be represented in the form:
Q₁₁ = −⅓ l₃ a²,
Q₁₂ = ⅙ l₃ + ½ l₁₁,   (17.53)
Q₂₁ = {−⅛ α₁ B₁ − ½ α₁ B₂ + ⋯} a² + ⋯,
b⁺ = {½(α₁ + α₂) l₁₁ − ⅙ l₃} a + [¼(α₁² − α₂ − ⋯) l₁₁ + ⅓ l₃ l₁₁] a³
for the rth and sth criteria, and determine the condition under which the rth criterion
is more powerful than the other two.
Expanding φ(a ∓ λ) by Taylor's formula we obtain from (17.40), (17.53), (17.54)
where
(17.57)
And so the quantity Δ_{rs}, to within O(δ⁴), is a linear combination of the quantities
B₁ and B₂.
!
Let d2,rs > 0 (this condition is satisfied for a 2 > and the proper enumeration
of the criteria, for which a~s) > a~r)). Then the inequality Drs 2: 0 is equivalent
to the inequality
B d 1 rs ( a(S) + a(r)) a 2 - 2
- 2 > - -' - =- 1 1
--'------:---;:-..!....-,-:---- (17.60)
B1 - d2 ,rs 2(2a 2 - 1)
Hence, knowing the coefficients a₁^(m) of the compared criteria, we find in the half-
plane OB₁B₂ the regions of preference of the criteria w_n^(m), m = 0, 1, 2 (see
Figure 3.1). A more subtle interpretation of the results obtained is presented in
the final section of Chapter 4.
The method of comparison of the criteria w_n^(m), m = 0, 1, 2, as n → ∞ admits
a generalisation to a broad class of two-sided criteria, the critical regions of
which are assigned by statistics of the form (3.4) and different sets of coefficients
{a₁, a₂, β₁, ..., β₆}. In particular, the criteria with generating statistics T repre-
sentable in the form (17.5) belong to this class. Let us denote by K(α) the class
of two-sided criteria w_n of size α_n = α + o(n⁻¹), the critical regions of which are
given by the statistics (17.26).
DEFINITION: We shall say that the criteria w_n^(r), w_n^(s) ∈ K(α) are asymptotically
equivalent if
Figure 3.1: Regions of preference of the criteria w_n^(m) in the plane (β₁, β₂):
Rao criterion, a₁ = 0; Neyman-Pearson criterion, a₁ = 1; Wald criterion, a₁ = 2.
(a² > 2 (α < 0.1576).)
Let us assume that the error of observation ε_i has the property γ₃ = 0. Then
the difference of the powers of the criteria w_n^(r), w_n^(s) ∈ K(α) is determined by the
quantity d_{1,rs}. Let the quantity a₁^(s) = a₁ be given. Let us indicate that it is
the criterion w_n^(r) which has the maximal power in the class K(α). For this let us
define the point of maximum of the function
And so the maximal power in the class K(α) is attained by the criteria for which
a₁ = a⁻² in the representation of the statistics (17.26). The simplest criterion of such
a form is obtained for a₂ = β₁ = ... = β₆ = 0 and is given by the statistic
(17.65)
r = pI + tn- 3/ 2 ,
β₁ = 1,  β₂ = 3,
β₄ = (9/4)(5 − 2c₁),  β₅ = c₂ − 1,   (17.67)
The linear theory of estimation by the method of least squares uses the language
of algebra and plane geometry. In the non-linear theory planes yield place to
surfaces and inference acquires a local character. Therefore the natural geometrical
language in non-linear regression analysis is the language of differential geometry
and tensor calculus. Nowadays an intensive geometric reinterpretation of the basic
concepts of mathematical statistics is under way. One of the goals pursued in this
consists in the move from geometric invariants of statistical objects, allotted a
geometric structure, to invariant statistical inference.
In this Chapter we consider a series of questions about the differential geometry
of non-linear regression models, and we suggest a geometric interpretation of the
results about a.e.-s of Chapter 3. In what follows the tensor summation convention
is extended also to summation over indices from 1 to n, etc.
Let M be the basic model (0.1) of observations. The model M is embedded in the
'free model' S
X=g+c, (18.1)
where
For Gaussian ε_j, s(X, g) is clearly the logarithm of the probability density of
the vector X. Otherwise (18.2) is the initial formula of the geometric theory.
We shall consider S as a parametric family of functions S = {s(X,g)} which
forms an n-dimensional manifold with coordinate system g. The model M corre-
sponds to the family of functions
(18.4)
(18.5)
(18.6)
According to (18.4)-(18.6) the metric tensor Tij induced on M has the form
(18.7)
(18.8)
The metric τ_{ab} of the enveloping manifold S does not depend upon s ∈ S, and the
induced metric τ_{ij} of the embedded manifold M depends on the local coordinates
θ of the point m ∈ M.
Let us denote
18. DIFFERENTIAL GEOMETRY OF NON-LINEAR MODELS 253
where
∂_j = (∂s(X, g)/∂θ^j)|_{g=g(θ)} = −σ⁻² g_j(θ) (...),  P^n_θ-a.c.,
i.e., P^n_θ-a.c.
∂_ij = σ⁻² n^{1/2} b_{ij}(θ) − σ⁻² n Π_{(i)(j)}(θ)
consists in the connection taking into account, for α ≠ 1, γ₃ ≠ 0, the skewness of
the errors of observation ε_j. Clearly
Γ^{(1)}_{ij,k} = Γ_{ij,k},
and for all α
Γ^{(α)}_{ij,k} = Γ_{ij,k}
if
γ₃ = m₃ = 0
(in particular, if the ε_j are symmetric r.v.-s).
In Section 19 of this book the first terms of the a.e.-s of Chapter 3 are in-
vestigated in detail from the viewpoint of the exponential connection. A fuller
geometric theory of the a.e. in non-linear regression analysis, at any rate, in the
spirit of Section 18, is not available to date.
The exponential connection V = V(l) is expressed in Riemannian manifolds
by the metric tensor, namely:
Γ_{ij,k} = (1/2)(∂τ_{jk}/∂θ^i + ∂τ_{ik}/∂θ^j − ∂τ_{ij}/∂θ^k).   (18.14)
Since the symbols Γ_{ij,k} are expressed in terms of τ_{ij}, the connection ∇ reflects the
intrinsic geometry of the manifold M. However, for an embedded manifold the
connection can also be defined externally.
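The intrinsic formula (18.14) can be illustrated numerically: the Christoffel symbols of the first kind are built from partial derivatives of the metric tensor, which may be approximated by central differences. The following sketch (function and variable names are ours, not the book's) recovers the classical polar-coordinate values.

```python
import numpy as np

def christoffel_first_kind(metric, theta, h=1e-5):
    """Gamma_{ij,k} = (1/2)(d_i tau_jk + d_j tau_ik - d_k tau_ij),
    with d_i = d/d theta^i approximated by central differences."""
    q = len(theta)
    dtau = np.empty((q, q, q))   # dtau[i] ~ derivative of the metric w.r.t. theta^i
    for i in range(q):
        e = np.zeros(q)
        e[i] = h
        dtau[i] = (metric(theta + e) - metric(theta - e)) / (2 * h)
    gamma = np.empty((q, q, q))
    for i in range(q):
        for j in range(q):
            for k in range(q):
                gamma[i, j, k] = 0.5 * (dtau[i][j, k] + dtau[j][i, k] - dtau[k][i, j])
    return gamma

# polar coordinates (r, phi) in the plane: metric diag(1, r^2)
polar = lambda th: np.diag([1.0, th[0] ** 2])
G = christoffel_first_kind(polar, np.array([2.0, 0.3]))
# classical values at r = 2: Gamma_{22,1} = -r, Gamma_{12,2} = r
```

For a Euclidean (constant) metric all symbols vanish, in accordance with the remark about geodesic coordinates below.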
254 CHAPTER 4. GEOMETRIC PROPERTIES OF AEs
Let a covariant tensor field v_i = v_i(θ) be given. Then its absolute (covariant)
derivative is given by the second rank tensor [191]
Let
N_α = N^a_α ∂_a,  α = 1, ..., n − q,
be a basis of the space T^⊥_m(M) orthogonal to the space T_m(M). The derivatives
of the vectors ∂_i are written via the derivation formulae
where the B^α_{ij} are the matrix elements of the second quadratic form corresponding
to the direction N_α [155].
The relations (18.16) characterise the infinitesimal alterations of the vectors
of the moving frame referred to itself [155,191].
If in (18.15) we set
v_i = ∂_i,
then from (18.16) it follows that the covariant differentiation (18.15) means geo-
metrically the projection of
g_{ij}(a, θ) ∂_a = σ⁻² n^{1/2} b_{ij}(θ)
onto T^⊥_m(M).
Let there further be given a contravariant tensor field v^i = v^i(θ). Then the
covariant derivative
(18.17)
represents the tensor with one covariant and one contravariant component [191].
The operation of covariant differentiation is not commutative: differentiating
equation (18.17) covariantly and then interchanging the indices j and k we obtain
v^i_{,jk} − v^i_{,kj} = v^s R^i_{s,jk},
where
(18.18)
is the rank four tensor called the curvature tensor (or the Riemann-Christoffel
tensor). Lowering the upper index of the tensor (18.18) we obtain the covariant
curvature tensor
R_{lk,ij} = (∂Γ^s_{li}/∂θ^k + Γ^s_{kp} Γ^p_{li}) τ_{sj} − (∂Γ^s_{ki}/∂θ^l + Γ^s_{lp} Γ^p_{ki}) τ_{sj}.   (18.19)
Using the properties of the covariant differentiation operation, this tensor can
be written in a form more convenient for calculation [191]:
R_{lk,ij} = ∂Γ_{li,j}/∂θ^k − Γ^p_{kj} Γ_{li,p} − ∂Γ_{ki,j}/∂θ^l + Γ^p_{lj} Γ_{ki,p},   (18.20)
where
A = σ² Λ^{lj} Λ^{ik} Π_{(li)(jk)},
B = σ² Λ^{lj} Λ^{ik} Π_{(lj)(ik)},
D = σ² τ^{ik} Λ^{lp} Λ^{jr} Π_{(li)(p)} Π_{(jk)(r)}.   (18.25)
For
q = dim M = 1
the curvature tensor R_{lk,ij}, together with the Ricci tensor R_{ki} and the scalar
curvature R, becomes zero. In this connection let us consider one more concept of
curvature, defining the 'extrinsic' geometry of the manifold M.
Let
N_α = N^a_α ∂_a,  α = 1, ..., n − q,
be an orthogonal basis of T^⊥_m(M). Associated with the direction N_α are the
principal curvatures k^α_1, ..., k^α_q of the manifold M at the point m ∈ M, defined
as the eigenvalues of the pencil of forms B^α_{li} − kτ_{ij}, i.e., as the roots of the
equation
det(B^α − kΓ) = 0,
where
B^α = (B^α_{ij}) and Γ = (τ_{ij})
are the matrices of the second quadratic form corresponding to N_α and of the
metric tensor. The mean curvature in the direction N_α is the name given to the
quantity [161]
k^α = Σ_{i=1}^q k^α_i = tr(Γ⁻¹ B^α) = τ^{ij} B^α_{ij}.   (18.26)
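Equality (18.26) rests on the elementary fact that the roots of det(B − kΓ) = 0 are the eigenvalues of Γ⁻¹B, whose sum is tr(Γ⁻¹B). A numerical sketch with arbitrarily chosen matrices (not taken from any concrete model):

```python
import numpy as np

rng = np.random.default_rng(0)
q = 4
M = rng.standard_normal((q, q))
Gamma = M @ M.T + q * np.eye(q)   # symmetric positive definite "metric" Gamma
B = rng.standard_normal((q, q))
B = (B + B.T) / 2                 # symmetric "second quadratic form" B

# principal curvatures: roots k of det(B - k*Gamma) = 0,
# i.e. eigenvalues of Gamma^{-1} B
k = np.linalg.eigvals(np.linalg.solve(Gamma, B))

# mean curvature (18.26): sum of principal curvatures = tr(Gamma^{-1} B)
assert np.isclose(np.real(k.sum()), np.trace(np.linalg.solve(Gamma, B)))
```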
Let us introduce the mean curvature vector [155,156] N = k^α N_α. Using (18.16)
we obtain
(18.27)
i.e., the vector N does not depend upon the choice of the basis of the space T^⊥_m(M).
We find the square of its length
The quantity
(18.28)
is called the Efron curvature. In contrast to R, the curvature H does not become
zero when q = 1. By its definition it is non-negative, i.e., H ≥ 0, whereas the
curvature R can take values of both signs.
Measures of non-linearity are characteristic numbers defining the extent of the
divergence of a non-linear regression model from its linear approximation and
the possibility of using this approximation in statistical inference. The Ricci and
Efron curvatures could serve as examples of measures of non-linearity. However,
the immediate use of the curvature R proves to be inconvenient because it can
admit not only positive but also negative values, and consequently cannot serve
as an index of the non-linearity of a model M.
In the theory expounded there exists a clear correspondence between the basic
concepts of differential geometry of an embedded statistical manifold M and a
q-dimensional surface S^q in ℝⁿ, given in (18.3). This fact is easy to explain in the
following way. Let
(σ⁻¹ g_k(a, θ))_{a=1}^n
correspond to the basis vector ∂_k ∈ T_m(M). Let us introduce the n × q matrix
Then
σ⁻² F′F = Γ = (τ_{ij})
is the matrix consisting of the coordinates of the metric tensor of M, which coincides
with the metric tensor of the surface S^q to within the factor σ⁻².
The material of this section can be conveniently presented using the geometry
of S^q. Let us consider the tangent plane T^q(θ) of the surface S^q at the point g(θ).
We shall give the name of normal plane of the surface S^q at the point g(θ) to
the orthogonal complement N^{n−q}(θ) of the tangent plane T^q(θ). Evidently T^q(θ)
corresponds to T_m(M) and N^{n−q}(θ) corresponds to T^⊥_m(M). Let us denote
H^a = (g_{ik}(a, θ))_{i,k=1}^q,
and let
P = F(F′F)⁻¹F′,  P^⊥ = I_n − P   (18.29)
be the orthogonal projection operators onto T^q(θ) and N^{n−q}(θ). Let us set
(18.30)
(18.31)
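The operators in (18.29) are the usual least-squares projectors. A small numerical check of their characteristic properties (idempotency, symmetry, rank q), with an arbitrary full-rank matrix F standing in for the derivative matrix of a regression function:

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 12, 3
F = rng.standard_normal((n, q))            # full-rank n x q stand-in for F(theta)

P = F @ np.linalg.solve(F.T @ F, F.T)      # projector onto the tangent plane T^q(theta)
P_perp = np.eye(n) - P                     # projector onto the normal plane N^{n-q}(theta)

assert np.allclose(P @ P, P)               # idempotent
assert np.allclose(P, P.T)                 # symmetric
assert np.isclose(np.trace(P), q)          # rank q
assert np.allclose(P @ P_perp, 0.0, atol=1e-10)  # orthogonal ranges
```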
Proof: Using the expressions (18.24) and (18.25) for the scalar curvature, we obtain
successively
σ⁻² R = n⁻¹ Λ^{ik} Λ^{lj} { Π_{(ij)(lk)} − Π_{(ik)(lj)} }
(18.33)
We call the quantities K^T, K^N and K the tangential (geodesic), normal, and total
curvatures of the surface S^q (of the manifold M). The tangential component K^T
of the decomposition (18.33) is defined by the parametrisation of the model M
and becomes zero in a geodesic system of coordinates, i.e., Γ^k_{ij}(θ) = 0 at the given
point θ for such a parametrisation of the regression function. In order to make the
latter assertion more obvious it is convenient to carry out a certain orthogonal
transformation of the sample space ℝⁿ.
Let us consider the orthogonal transformation
U = U(g(O))
of the space IRn defined by the matrix
in which
T=D'F'
is a (qxn)-matrix such that
(F'F)-l = DD'.
and let
λ_j = Σ_{a=1}^n T_{ja} ε_a,  j = 1, ..., q,   (18.35)
λ_j = Σ_{a=1}^n N_{ja} ε_a,  j = q + 1, ..., n.   (18.36)
By reason of the orthogonality of the transformation U we have
where
(18.38)
K^T = tr(P_U U G U′)
  = Σ_{i=1}^q { tr A_i² − tr² A_i },   (18.39)
K^N = Σ_{i=q+1}^n { tr A_i² − tr² A_i }.   (18.40)
Let us consider the Christoffel symbols
Γ^k_{i1 i2} = Σ_{a=1}^n Λ^{ik} F_{ak} H^a_{i1 i2} = Σ_{a=1}^n (DT)_{ka} H^a_{i1 i2}.   (18.41)
Let us denote
Then
(18.42)
Since
Γ^i(θ) = 0,  i = 1, ..., q,
for a geodesic parametrisation, the equalities
A_k = 0,  k = 1, ..., q,
hold. Let
θ̄ = θ̄(θ)
be a twice-differentiable one-to-one mapping of Θ into Θ̄, and let
be the Jacobi matrices of the mappings φ and φ⁻¹. Also let
k = 1, ..., q.
Let us set
F̄(θ̄) = F(θ)Φ,
H̄_j(θ̄) = (∂²g(j, θ(θ̄))/∂θ̄^i ∂θ̄^k)_{i,k=1}^q.
Then
(F̄′(θ̄)F̄(θ̄))⁻¹ = (Φ⁻¹D(θ))(Φ⁻¹D(θ))′,
and
T̄(θ̄).
From the identity
(∂²θ̄^i/∂θ^m ∂θ^s)(∂θ^s/∂θ̄^k)(∂θ^m/∂θ̄^j) + (∂θ̄^i/∂θ^m)(∂²θ^m/∂θ̄^j ∂θ̄^k) = 0
we find
Φ′Ψ̄^iΦ + (Φ⁻¹)^i_m Φ^m = 0,  i = 1, ..., q,
or
Consequently
(18.44)
Since
D̄⁻¹ = D⁻¹Φ,
we finally obtain
(18.45)
θ̄ = (θ̄¹, ..., θ̄^q)
(18.46)
From (18.46), with the aid of (18.42), we obtain the relation
Γ̄^i = Φ^i_r Ψ^r,  i = 1, ..., q,   (18.47)
connecting the Christoffel symbols of the surface S^q with the required transform-
ation of the parameters. The equality (18.47) corresponds to the formulae (91.4)
of the book [191] and is suitable for the case when the mapping θ = θ̄⁻¹(θ̄) is given.
Then the right-hand side of (18.47) can be written in terms of the original parameter θ.
For this it is sufficient to calculate
Differentiating the curve L^q twice at the point λ = 0 we define the vectors a_l, b_l ∈
ℝⁿ with the coordinates
i.e.,
b_l = (l′H^j l)_{j=1}^n,  b_l = b^T_l + b^N_l,
where
The quantity
(18.48)
K^T_l = |b^T_l| / |a_l|²   (18.49)
we call the tangential (geodesic) curvature of the surface S^q at the point g(θ) in
the direction l.
Let d ∈ ℝ^q be a vector of unit length: |d| = 1. Let us consider the direction
l = Dd. Then
a_l = T′d,
b_l = (d′C^a d)_{a=1}^n,
U a_l = (d₁, ..., d_q, 0, ..., 0)′,
U b_l = (d′A_j d)_{j=1}^n.   (18.50)
where h₀ is the first term of the a.e. of the l.s.e. obtained in Section 7.
In the quantities K^T and K^N let us substitute the random direction
B = E^n_θ|b_l|² / E^n_θ|a_l|⁴
  = (E^n_θ|b^T_l|² + E^n_θ|b^N_l|²) / E^n_θ|a_l|⁴
  = B^T + B^N.   (18.53)
Since, now,
σ⁻²|a_l|² = |Pσ⁻¹ε|²,
thanks to the idempotency of the matrix P, has a χ²_q distribution, then
(18.54)
Therefore
B = (1/(q(q+2))) tr G,
B^T = (1/(q(q+2))) tr PG,
B^N = (1/(q(q+2))) tr P^⊥G.   (18.56)
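The normalising constant q(q + 2) in (18.56) is the second moment of a χ²_q variable: E(χ²_q)² = 2q + q² = q(q + 2). A Monte Carlo sketch of this, together with the fact that |Pε|² is χ²_q for Gaussian ε and the projector P of (18.29) (the concrete matrix F below is our illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(2)
n, q, reps = 10, 3, 200_000

F = rng.standard_normal((n, q))
P = F @ np.linalg.solve(F.T @ F, F.T)      # idempotent projector of rank q

eps = rng.standard_normal((reps, n))
u = np.einsum('ri,ij,rj->r', eps, P, eps)  # |P eps|^2 for each replication

# |P eps|^2 ~ chi^2_q: mean q, second moment q(q+2)
assert abs(u.mean() - q) < 0.05
assert abs((u ** 2).mean() - q * (q + 2)) < 0.5
```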
Let us set
r=Tc: and l = Dr.
B = (1/(q(q+2))) Σ_i (2 tr A_i² + tr² A_i) = (1/(q(q+2))) tr A,
B^T = (1/(q(q+2))) Σ_{j=1}^q (2 tr A_j² + tr² A_j),   (18.58)
B^N = (1/(q(q+2))) Σ_{j=q+1}^n (2 tr A_j² + tr² A_j).   (18.59)
Let S^q(1) be a sphere of unit radius in the space ℝ^q, ds the Lebesgue measure on
the surface of the sphere S^q(1), and |S^q(1)| the area of S^q(1). As Bates and Watts
[25] showed, Beale's measure of non-linearity (18.56) is linked with the curvatures
K^T_{Dd} and K^N_{Dd} by the relations
B^T =
B^N =
Let us set
B̄^T = tr PG,   (18.60)
The quantities (18.60) differ from the measures of non-linearity introduced
by Beale only by numerical factors, and have a structure analogous to the curv-
atures K^T and K^N. The latter circumstance allows us to introduce the following
generalisation of the relations (18.32) and (18.60).
Let δ₁, δ₂ ∈ ℝ¹ and
where
(18.61)
The quantities
we call the generalised curvatures of the surface S^q (of the manifold M). For
δ₁, δ₂ ≥ 0 the generalised curvatures C^T and C^N are measures of non-linearity.
According to (18.24) and (18.28), R can be written as
R = n⁻¹{(A − D) − (B − C)}.
Hence, with regard to the equalities
n tr G(θ) = δ₁A + δ₂B = (δ₁D + δ₂C) + (δ₁(A − D) + δ₂(B − C)),
we obtain
C^T = n⁻¹(δ₁D + δ₂C),   (18.64)
COROLLARY 39.2: The Efron and Ricci curvatures of the manifold M are related
by the inequality
H + R ≥ 0.   (18.66)
An object of any sort which is not changed under a local coordinate transformation
is said in differential geometry to be invariant. For example, any point m ∈ M
is invariant, because a transformation changes its coordinates θ = (θ¹, ..., θ^q) but
does not change the point.
Every system of points is an invariant, and every function of points is also an
invariant. If such a function in one coordinate system θ has the form T(θ), then in
a new coordinate system θ̄, linked to the system θ by the relation θ^i = φ^i(θ̄¹, ..., θ̄^q),
it is represented in the form
(18.69)
depending on a finite number of formal variables w₀, ∂w₀/∂θ^i, etc. Then I(w, θ)
is some function of the variables θ^i.
DEFINITION: ([2], p. 207). If, for a structure w and any coordinate system
θ̄ = (θ̄¹, ..., θ̄^q),
(18.71)
In particular,
w_{ij} = E ∂_i ∂_j = τ_{ij}.
Let
(18.72)
The matrix G = (G_{ab}) is the Gram matrix of the system of vectors F_a, a =
1, ..., n.
Let us introduce the parameter
s = ∫_{θ₀}^θ G^{1/2}_{11}(u) du,   (18.73)
corresponding to the length of arc of the curve S¹ between the points g₀, g ∈ S¹
that have local coordinates θ₀ and θ. Let us write {e_i(s)}_{i=1,...,n} for the Frenet
frame of the curve S¹ at the point g ∈ S¹. The rate of its change as g moves along
S¹ is given by the Frenet formulae [191]
k₀(s) = k_n(s) = 0.
The functions k_i(s), i = 1, ..., n, are called the curvatures of the curve S¹. The
quantity σ²k₁²(s) coincides with the Efron curvature (18.28).
Let us denote by Δ_k, k = 0, ..., n, the principal minors of order k of the Gram
matrix G. By definition we set Δ₀ = 1.
THEOREM 40: The equalities
k_i²(s) = Δ_{i−1}(s) Δ_{i+1}(s) / Δ_i²(s),  i = 1, ..., n − 1,   (18.74)
hold (to simplify the equations we also omit the argument s below).
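Behind Theorem 40 lies the classical fact, used in the proof below, that the kth principal minor of a Gram matrix equals the product of the squared norms of the first k Gram-Schmidt orthogonalised vectors. A numerical check of this identity (illustrative code, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
V = rng.standard_normal((n, n))     # rows: the vectors being orthogonalised
G = V @ V.T                         # their Gram matrix

# Gram-Schmidt orthogonalisation of the rows of V
E = np.zeros_like(V)
for i in range(n):
    E[i] = V[i]
    for j in range(i):
        E[i] -= (V[i] @ E[j]) / (E[j] @ E[j]) * E[j]

# Delta_k = det of the leading k x k block of G = product of |E_j|^2
for k in range(1, n + 1):
    Delta_k = np.linalg.det(G[:k, :k])
    assert np.isclose(Delta_k, np.prod([E[j] @ E[j] for j in range(k)]))
```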
where the α_{ik} are chosen from the condition that the vectors E_i = (E_i^a) be
orthogonal to the vectors E₁, ..., E_{i−1}. Then from (18.75)-(18.77), with regard
to the normalisation e_i = E_i/|E_i|, we obtain
=f
if i > j,
if i = j,
otherwise.
Then from (18.77) we obtain
i.e.,
G=AEA'.
Rewriting the latter equality in the form
It is not difficult to see that a relation of the type (18.79) can be extended to
all the principal minors of the matrices E and G. Since E is a diagonal matrix we
have
Δ_k = ∏_{j=1}^k ⟨E_j, E_j⟩,
and
k₁²(θ) = Δ₂(θ) / Δ₁²(θ).   (18.81)
Proof: Equations (18.80) follow from (18.74), taking into account that Δ₁(s) = 1
and G₁₂(s) = 0. The relation (18.81) follows from the equalities
A.(s) = Ai(8) i ~ 2,
, A~(i-l)(8) ,
Let us consider the quantities B₁, B₂, B₃ which are given by equations (17.20)-
(17.22) and appear in the formulations of Section 17.
THEOREM 41: The quantities
B₁ = I₀₂ − I₁₁²,
Let
φ: Θ → Θ̄
be a diffeomorphism defining a reparametrisation of the curve S¹. From the rela-
tions
Ā(θ̄) = (dθ/dθ̄)² A(θ),
Π̄_{11...1}(θ̄) = (dθ/dθ̄)^k Π_{11...1}(θ),
D̄ = Δ^{1/2} dθ/dθ̄,
the invariance of B₂ is a consequence of the equality
B₂ = Δ^{1/2} (dI₃/dθ).
Let us observe that I₃ is the coefficient of skewness of the r.v. b₁(θ). The
invariant B₁ is always non-negative, and the invariant B₂, describing the rate of
change of I₃ as a function of θ, can take any value, with B₂ = 0 if γ₃ = 0.
The proof of Theorem 41 is not complete, since there remain some unexplained
questions about the system of basis quantities from which the statistical invari-
ants of the regression model with a scalar parameter are constructed. Clearly, the
collection of quantities I_{i1...ik} given in (17.18) can serve as such a basis. A more
precise answer is contained in the Subsection that follows.
Using the definitions of the affine connection Γ^k_{ij} and the tensors w_{i1...ik} according
to (18.11) and (18.71), let us rewrite the system of quantities P_k, k = 1, ..., 16,
from the formulation of Theorem 35 of Section 16 (see equation (16.22)) in the
following form:
P₁ = nτ^{is}τ^{jr} w_{ijrs},
P₂ = nτ^{is}τ^{jr}τ^{kt} w_{ikt} w_{sjr},
P₃ = nτ^{is}τ^{jr}τ^{kt} w_{ijk} w_{srt},
P₄ = n²τ^{is}τ^{jr} (σ⁻⁶γ₃² Π_{(is)(j)(r)}),
P₅ = n²τ^{is}τ^{jr} (σ⁻⁶γ₃² Π_{(ij)(r)(s)}),
P₆ = nτ^{is}τ^{jr} w_{isk} Γ^k_{jr},
P₇ = nτ^{is}τ^{jr} w_{ijk} Γ^k_{rs},
P₈ = nτ^{is}τ^{jr} w_{ijs} Γ^k_{rk},
P₉ = nτ^{ij} Γ^u_{ij} Γ^v_{uv},
P₁₀ = nτ^{ij} Γ^u_{iv} Γ^v_{ju},
P₁₁ = nτ^{ij} Γ^u_{iu} Γ^v_{jv},
P₁₂ = nτ^{is}τ^{jr} Γ_{is,u} Γ^u_{jr},
P₁₄ = n²τ^{is}τ^{jr} (σ⁻² Π_{(is)(jr)}),
P₁₅ = n²τ^{is}τ^{jr} (σ⁻² Π_{(ij)(rs)}),
P₁₆ = n²τ^{is}τ^{jr} (σ⁻² Π_{(i)(jrs)}).   (18.82)
Table 4.1: The correspondence between I_{i1...ik} and P_j.

  q = 1      q ≥ 2
  I₄         P₁
  I₃²        P₂, P₃
  I₂₁        P₄, P₅
  I₃₁₁₁      P₆, P₇, P₈
  I₁₁²       P₉-P₁₃
  I₀₂        P₁₄, P₁₅
  I₁₀₁       P₁₆

q > 1. The correspondence between the quantities (17.91) and (18.82) is listed in
Table 4.1. Consequently the quantities I_{i1...ik} contained in it are a basis for q = 1.
The quantities P₁, P₂, P₃ are scalar invariants of the model M, since they re-
present contractions of the tensors τ_{ij} and w_{i1...ik}, k = 3, 4. Indeed, an arbitrary
tensor under transfer from the coordinate system θ to θ̄ is transformed according
to the law
(18.83)
The linear combination αP₁ + βP₂ + γP₃ is the multi-dimensional analogue of the
invariant B₃.
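The transformation law (18.83) can be checked numerically on a toy one-parameter model (the exponential regression function and the reparametrisation below are our illustrative choices): the metric Γ = σ⁻²F′F, being twice covariant, picks up one Jacobian factor per index.

```python
import numpy as np

t = np.linspace(0.1, 1.0, 10)   # observation points of a toy model g_j(theta) = exp(-theta*t_j)
sigma = 1.0

def gamma(theta):
    """Metric Gamma = sigma^-2 F'F; for q = 1 it is a scalar."""
    F = -t * np.exp(-theta * t)          # dg_j/dtheta
    return (F @ F) / sigma ** 2

# reparametrisation theta = phi(u)
phi = lambda u: u ** 3 + u
u = 0.7
dphi = 3 * u ** 2 + 1                    # Jacobian d theta / d u

# direct computation in the new parameter: chain rule gives F_tilde = F * dphi
F_tilde = -t * np.exp(-phi(u) * t) * dphi
gamma_tilde = (F_tilde @ F_tilde) / sigma ** 2

# tensor law for a twice-covariant tensor: gamma_tilde = dphi * gamma * dphi
assert np.isclose(gamma_tilde, dphi * gamma(phi(u)) * dphi)
```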
Further, it is not difficult to see that
(18.86)
In the theory considered below an important part is played by the normal
McCullagh curvature [155]
Y = H + 2R = n⁻¹{(P₁₂ + 2P₁₅) − (P₁₄ + 2P₁₃)}.   (18.87)
Proof: Let
θ = θ(θ̄): Θ̄ → Θ
be a diffeomorphism specifying a reparametrisation of the model M. Then
Π̄_{(α)(βγ)} = Π_{(i)(jk)} Φ^i_α Φ^j_β Φ^k_γ + Π_{(i)(j)} Φ^i_α Φ^j_{βγ},
Π̄_{(α)(β)(γ)} = Π_{(i)(j)(k)} Φ^i_α Φ^j_β Φ^k_γ.
Hence we obtain
(18.90)
(18.91)
= n⁻¹(P₄ + 2P₅),
= n⁻¹(P₃ + P₅),
τ^{is} τ^{jr} τ^{kt} w_{ijk} ∂w_{rs}/∂θ^t = 2n⁻¹P₇,
τ^{is} τ^{jr} τ^{kt} w_{ijs} ∂w_{kt}/∂θ^r = 2n⁻¹P₈,
(18.92)
is some function of the tensors w_{ij} and w_{ijk}. Since, according to (18.90) and (18.91),
we have
Q̄ = Q,
the validity of the assertion being proved follows from (18.69), (18.70), and
(18.92).
The scalar differential invariant Q is a joint invariant of the tensors w_{ij} and
w_{ijk}. The quantities
(18.93)
are merely scalar invariants with respect to the reparametrisation, with
(18.94)
t_n = E^n_θ n^{1/2}(θ̂_n − θ)
In its turn, from (19.1) and (18.42) it follows that in the language of the 'tangential'
matrices A_i, i = 1, ..., q,
t_n^k = −(1/2) n^{1/2} D_{kl} tr A_l + o(n⁻¹),  k = 1, ..., q.   (19.2)
The relations (19.1) and (19.2) show that the bias t_n depends upon the parametris-
ation of the model and can be made equal to zero to within terms of order o(n⁻¹)
upon passage to a geodesic coordinate system.
Turning to the a.e. (12.24), for the correlation matrix of the l.s.e. θ̂_n we obtain
the expression
and
(19.3)
In order to make the expression (19.6) more intuitive let us take advantage of the
equation
σ⁻² n τ^{ij} ∂Γ^k_{ij}/∂θ^k = −P₁₀ − P₁₃ + P₁₅ + P₁₆.   (19.7)
ζ = −P₇ + P₉ + P₁₀ + (1/2)P₁₃,   (19.10)
ζ* = −σ² τ^{ij} ∂Γ^k_{ij}/∂θ^k,   (19.11)
(19.12)
(19.13)
Let us introduce the normed bias
t̄_n = t_n D_n^{−1/2}.
Hence from the definition (18.64) of the measure of non-linearity C^T in the one-
dimensional case it also follows that
C^T = n⁻¹ (δ₁ + δ₂) σ² Λ³ m₂.
Equalities (19.14) and (19.15) give the statistical interpretation of the tangential
measure of non-linearity C^T. For q ≥ 2 there exist no simple relations of the type
(19.15).
Let
γ₃ = 0.
Let us rewrite the expressions ζ^N and ζ^T in the language of the matrices A_i,
i = 1, ..., n.
By Theorem 39 of the preceding Section
ζ^N = nY = nσ² Σ_{i=q+1}^n (2 tr A_i² − tr² A_i).   (19.16)
(DD′)_{i2 l} Σ_{a,b=1}^n (DT)_{ha} H^a_{i2 h} (DT)_{lb} H^b_{i1 l}
= Σ_{a,b=1}^n T_{pa} (D′_{i2} H^a_{i2 h} D_{hk}) T_{mb} (D′_{i1} H^b_{i1 l} D_{lm})
= Σ_{k=1}^q Σ_{a=1}^n T_{pa} C^a_{pk} Σ_{b=1}^n T_{mb} C^b_{km}
= Σ_{k,p,m=1}^q A_{p,kk} A_{m,pm}
= Σ_{p,m=1}^q (tr A_p) A_{m,pm},
Σ_{k,p,m=1}^q A_{p,km} A_{m,pk} = Σ_{m,p=1}^q {A_m A_p}_{pm}.   (19.18)
We find the quantity (1/2)P₁₃ from equations (18.62) and (18.64). Finally we obtain
Let us comment from the geometric viewpoint on the form of the coefficients of the
polynomials (13.69) and (13.70) of the a.e. (13.44) of Theorem 29 of Section 13.
First of all, all quantities which contain P₁, P₂ and P₃ do not enter the set of basis
variables P₁-P₁₆. Consequently the complete interpretation of the a.e. (13.44) in
the spirit of Subsection 19.1 is not possible, and we restrict ourselves only to some
remarks. Clearly,
Σ_{a,b=1}^n (DD′)_{ij} F_{ai} F_{bj} = Σ_{a,b=1}^n T′_{am} T_{mb} = Σ_{a,b=1}^n P_{ab}   (19.20)
is the sum of the elements of the matrix of the orthogonal projection onto the
tangent space T_m(M) (or, in terms of the surface S^q, onto the tangent plane T^q(θ)).
On the other hand,
[formulas (19.21)-(19.23) expressing P₂ and P₃ in terms of Λ and the quantities Π]
Turning to the a.e. (13.34) of Theorem 28 of Section 13 and the a.e.-s (15.37)
and (15.38) of Theorem 33 of Section 15, let us note that the function z(θ), defined
by (15.39), is the difference of two quantities z₁ − z₂, where z₂ coincides with
(19.24), and
z₁ = σ⁻² n τ^{i1 j1} Γ^{i2}_{i1 j1} Π_{(j2)}.
Σ_{a,b=1}^n D_{i1 m} D′_{m i1} D_{i2 r} D′_{r h} H^a_{i1} F_{a i2} F_{b h}
= Σ_{a,b=1}^n Σ_{m=1}^q T_{rb} T_{ra} C^a_{mm}
= Σ_{b=1}^n T_{rb} tr A_r
= Σ_{a,b=1}^n T′_{br} T_{ra} tr C^a
= Σ_{a,b=1}^n P_{ab} tr C^a.   (19.25)
Let us consider the a.e. (14.1) of Theorem 30 of Section 14. The polynomials R₁
and R₂ of this expansion do not depend upon the parameter θ, and the polynomial
R₃ contains the function Y(θ). Rewriting equation (14.5) for Y(θ) through
the basis quantities (18.82), we see that Y(θ) coincides with the McCullagh
curvature (18.87). Since R = 0 when q = 1, in this case
Y = H = n⁻¹B₁ = σ² k₁²(θ)
is the Efron curvature, or, to within the factor σ², the square of the first curvature
of the curve (18.72).
is invariant, since it is a function of the three points X ∈ ℝⁿ, g(θ), g(θ̂_n) ∈ S^q. T^(4)
is invariant analogously. On the other hand, T^(2) and T^(3) are not invariants, and
this is reflected in the properties of the a.e.-s of the d.f.-s of the functionals T^(1)-T^(4).
For the formulation of the geometric results let us rewrite (16.37) in the form
P^n_θ{ γ^(m) ≥ z_{1−α} } = α + n⁻¹ Σ_{j=1}^3 c_j^(m) g_{q+2j}(z_{1−α}) + o(n⁻¹),   (19.26)
where
c_j^(m) = Σ_{k=1}^{16} μ_{jk}^(m) P_k,  j = 1, 2, 3,
μ_{1k}^(m) = 2λ_{0k}^(m),
μ_{2k}^(m) = −2(λ_{0k}^(m) + λ_{1k}^(m)),   (19.27)
μ_{3k}^(m) = 2λ_{3k}^(m).
The numerical coefficients μ_{jk}^(m) obtained from Tables 3.2 and 3.3 are given in
Table 4.3.
Let us set
[formula for S₃]
[Table 4.3: the numerical coefficients μ_{jk}^(m), j = 1, 2, 3, k = 1, ..., 16]
uniformly in θ ∈ T.
The proof is obvious.
(19.32)
Let s_n²/σ² be a statistic that does not depend upon γ^(1) and has a χ² distribution.
Then from (19.32) it follows that
Let u_α be the quantile of the distribution S_{q,r}. Then from (19.33) there follows
a formula generalising (19.31):
The relation (19.34) differs only notationally from equation (A1.26) of Beale's
work [28].
Let us note that the quantities P₆-P₁₃ contain the Christoffel symbols of the
second kind Γ^k_{ij}, which vanish in a geodesic coordinate system.
THEOREM 44: Let the ε_j be Gaussian r.v.-s. Then, in a geodesic coordinate system,
under the conditions of Theorem 43,
and the tails of the distributions of T^(1) and T^(4) satisfy (19.30) and (19.31).
Proof: The proof of this assertion is also not difficult.
19.4 GEOMETRY OF THE STATISTICAL CRITERIA FOR TESTING
HYPOTHESES ABOUT NON-LINEAR REGRESSION PARAMETERS
Let
θ = θ(θ̄): Θ̄ → Θ,
where
ḡ(j, θ̄) = g(j, θ(θ̄)),
and the conditions II, III, IV, VIII, and (17.4) of Section 17 are satisfied.
Here it is important to remark that not all the enumerated conditions for the
reparametrised model (19.37) follow automatically from the analogous condi-
tions for the initial model (0.1). This remark relates to every result of Chapter 4
associated with a reparametrisation of the model (0.1), i.e., with the passage to
another local coordinate system. Rigorously speaking, these results are true only
for regular reparametrisations.
Let us consider the statistical experiment {ℝⁿ, 𝔅ⁿ, P̄^n_θ̄, θ̄ ∈ Θ̄} generated by
the observations (19.37), and let us introduce for the reparametrised model (19.37)
the class of criteria K̄(α) analogous to the class K(α) of Section 17. It is easy
to establish a one-to-one correspondence between K(α) and K̄(α) if the criterion
Ψ_n ∈ K(α) with the statistic W and set of coefficients {a₁, a₂, β₁, ..., β₆} is set in
correspondence with the criterion Ψ̄_n ∈ K̄(α) with the statistic W̄ and the same
set of coefficients.
We shall say that Ψ_n ∈ K(α) is a criterion with the statistic W that is
invariant under a regular reparametrisation θ = θ(θ̄) if, for any θ₀ ∈ Θ and θ̄₀ = θ̄(θ₀),
(19.38)
as n → ∞. Henceforward, for simplicity, we shall call such a criterion Ψ_n ∈ K(α)
invariant.
THEOREM 45: The criterion Ψ_n ∈ K(α) is invariant if and only if
Proof: For the distributions of the statistics W and W̄ one can write the Edgeworth
expansion with remainder term o(n⁻¹) and compare the first terms of the expan-
sions. Since they depend upon the cumulants k_{jν}(θ) of the statistic W and the
cumulants k̄_{jν}(θ̄) of the statistic W̄, j = 1, 2, 3, 4, ν = 0, 1, 2, respectively, the
invariance condition is the condition that k₁₁(θ), k₂₀(θ), k₂₂(θ), k₃₁(θ), and
k₄₂(θ) (the remaining k_{jν}(θ) = 0) depend only on statistical invariants. The rela-
tions (19.39) ensure that this condition is satisfied. Let us note that b, c are
statistical invariants if a₁ + a₂ = 0.
COROLLARY 45.1: There exists a unique u-representable criterion, which is defined
by the coefficients
β₂ = −9,  β₃ = 27/4,  β₄ = −3/4,  β₅ = −1,  β₆ = 1,  c₁ = 1,  c₃ = 1/3.
W = u-IAI/2(90)bl(90;X)
+ 2~2 u-IA3/2(90)bl(90;X)
Proof: The assertion (1) follows from the relations a₁^(m) = m, m = 0, 1, 2, and
Theorem 38. For the proof of the remaining assertions it is sufficient to mention
examples of the relevant criteria. Let us enlarge the list (17.1)-(17.3) by the
functionals:
T_n^(0) = σ⁻² Λ(θ̂_n) b₁²(θ̂_n; X),   (19.41)
(19.44)
T_n^(2) = σ⁻² φ(θ̂_n, θ)
[table of the coefficients of the criteria (19.41)-(19.44)]
Appendix
I SUBSIDIARY FACTS
For ease of reference, in this Appendix a series of results is included on which the
presentation of the principal material is based. In many cases an assertion is given
in a 'uniform' form.
Let us consider the independent r.v.-s ξ₁, ..., ξₙ and let us set Sₙ = Σ ξⱼ. The
following assertion is due to Petrov ([172], pp. 52-54) and strengthens
Bernstein's inequalities.
THEOREM A.1: Let there exist positive constants r₁, ..., rₙ and R such that, for
j = 1, ..., n,
Then
Let us set
Then
E|Sₙ|^s ≤ χ(s)(M_{s,n} + B_n^{s/2}),  s ≥ 2.
The first inequality is due to Rosenthal ([173], p. 86), the second to Berry
and Esseen ([173], p. 98). In particular, for the r.v.-s ξⱼ = gⱼεⱼ, j ≥ 1, where gⱼ
is a sequence of numbers and εⱼ a sequence of independent identically distributed
r.v.-s, we obtain
P{ sup_{u′,u″ ∈ F ∩ v₀(Q), |u′−u″|₀ ≤ h} |η(u′) − η(u″)| > ε } ≤ χ₀ ( sup_{u ∈ F ∩ v₀(Q)} l(u) ) Q^q h^{m−q} ε^{−s},
where the constant χ₀ depends upon s, m and q and does not depend upon Q, h, ε
and the set F.
In particular, under the conditions outlined above,
P{ sup_{u′,u″ ∈ F ∩ v₀(Q)} |η(u′) − η(u″)| > ε } ≤ χ₀ ( sup_{u ∈ F ∩ v₀(Q)} l(u) ) Q^m ε^{−s},
where
εₙ = o(n^{−s+1})
and is independent of r, and m(n⁻¹Sₙ) is the median of the r.v. n⁻¹Sₙ.
The assertion (1) of Theorem A.4 coincides with Theorem 28 of the book [172],
p. 286. Assertion (2) is proved in the same way as Theorem 27 on p. 283 of the
same book (Billingsley, Baum and Katz). See also the work of Nagaev and Fuk.
θ ∈ Θ,  j = 1, ..., n,  n ≥ 1.
Then
where
aₙ ≥ (s − 2 + δ)^{1/2} log^{1/2} n;
then
sup_{θ∈T} P^n_θ{ n⁻¹ |Σ ηⱼ| > ε } → 0 as n → ∞.
(1) sup_{θ∈T} Σ ∫_{|x| ≥ τₙ} Pⱼₙ(θ, dx) → 0 as n → ∞,
(2) → 0 as n → ∞,
(3) sup_{θ∈T} n⁻¹ Σ ∫_{|x| < τₙ} x Pⱼₙ(θ, dx) → 0 as n → ∞.
THEOREM A.7: ([172], p. 272). Let ξₙ, n = 1, 2, ..., be a sequence of independent
r.v.-s with zero mathematical expectation, and let, for some p, 1 ≤ p ≤ 2, and some
sequence aₙ ↑ ∞,
Σ_{j=1}^∞ E|ξⱼ|^p / aⱼ^p < ∞.
Then
aₙ⁻¹ Sₙ → 0 a.c. as n → ∞.
THEOREM A.8: ([172]). Let ξₙ, n = 1, 2, ..., be a sequence of independent
r.v.-s with d.f.-s Fⱼ(x). Let the conditions
(1) Σ_{j=1}^∞ P{ |ξⱼ| ≥ j } < ∞,
be satisfied. Then
n⁻¹ Sₙ → 0 a.c. as n → ∞.
j = 1, ..., n,  n ≥ 1,  θ ∈ T,
β₃ = lim_{n→∞} sup_{θ∈T} β₃,ₙ(θ) < ∞,
(1) b = ∫₁^∞ |g′(t)| t^{q−1} dt < ∞,
r
}C.\C
g(lxl)dx ~ b (21l"(~/2)) .
r '2Q
r
}C\C-.
g(lxl) dx.
The full proof of Theorem A.ll is contained in Section 3 of the book [33]
and is associated with the names of Ranga Rao [190], Sazonov [221,222] and von
Bahr [10].
THEOREM A.12: Let a r.v. εⱼ with c.f. ψ(λ) have a density p(x),
and let μ₂ < ∞. Then the following inequalities hold:
(2)
The assertion (1) was obtained by Survila [207], and assertion (2) is due to
Statulevičius [206].
Let $\mu$ be a finite signed measure on $(\mathbb{R}^p, \mathcal{B}^p)$. With the signed measure $\mu$ there
are associated three set functions $\mu^+$, $\mu^-$ and $|\mu|$, which are called the positive,
negative, and absolute variations of the signed measure $\mu$, and \ldots
From the Hahn-Jordan decomposition [89] it follows that $\mu^+$, $\mu^-$, $|\mu|$ are finite
measures on $(\mathbb{R}^p, \mathcal{B}^p)$ and that \ldots
The following Theorem indicates one subtle property of the absolute variation
$|\mu|$ of the signed measure
\[
\mu \stackrel{\mathrm{def}}{=} Q_n(\theta) - \sum_{r=0}^{k-2} n^{-r/2} P_r\bigl(-\Phi; \{\bar\chi_\nu(\theta)\}\bigr), \qquad P_0 = \Phi.
\]
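For a discrete signed measure the three variations can be computed directly from the signed weights; a minimal sketch (the dict representation and function name are illustrative, not the book's notation):

```python
def jordan_decomposition(mu):
    """Hahn-Jordan decomposition of a discrete signed measure,
    given as a dict point -> signed weight: returns the positive,
    negative and absolute variations mu+, mu-, |mu|."""
    mu_plus = {x: w for x, w in mu.items() if w > 0}
    mu_minus = {x: -w for x, w in mu.items() if w < 0}
    mu_abs = {x: abs(w) for x, w in mu.items() if w != 0}
    return mu_plus, mu_minus, mu_abs

mu = {"a": 0.5, "b": -0.2, "c": 0.1}
plus, minus, absvar = jordan_decomposition(mu)
# the variational norm of mu equals |mu| of the whole space
print(sum(absvar.values()))
```

Here $|\mu| = \mu^+ + \mu^-$, and the variational norm of $\mu$ is $|\mu|$ evaluated on the whole space.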
THEOREM A.13: On the statistical experiment $\{\mathbb{R}^n, \mathcal{B}^n, P_\theta^n, \theta \in \Theta\}$ let there be
given a triangular array $\xi_{jn}$, $j = 1, \ldots, n$, $n \ge 1$, of random vectors, independent
in each row, with values in $\mathbb{R}^p$, having zero means. Let us assume that:
(1) \ldots
\[
0 \le m \le n - u, \qquad n \ge u + 1,
\]
where $\bar\chi_\nu(\theta)$ is the arithmetic mean of the cumulants of $\nu$th order ($|\nu| = 3, \ldots, k+1$)
of the vectors $K_n^{-1/2}(\theta)\,\xi_{jn}$, $j = 1, \ldots, n$.
Theorem A.13 is a uniform variant of one of the assertions in Section 19 of the
book [33] (cf. p. 206, equation (19.100)).
$\ldots = O\bigl(n^{-(k-1)/2}\bigr)$ in
the variational norm of the signed measure $\mu$. Then by Theorem A.13 \ldots
have been used. The first ten polynomials $H_s$ have the forms:

H_0(z) = 1,
H_1(z) = z,
H_2(z) = z^2 - 1,
H_3(z) = z^3 - 3z,
H_4(z) = z^4 - 6z^2 + 3,
H_5(z) = z^5 - 10z^3 + 15z,
H_6(z) = z^6 - 15z^4 + 45z^2 - 15,
H_7(z) = z^7 - 21z^5 + 105z^3 - 105z,
H_8(z) = z^8 - 28z^6 + 210z^4 - 420z^2 + 105,
H_9(z) = z^9 - 36z^7 + 378z^5 - 1260z^3 + 945z.
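The list above can be regenerated from the standard three-term recurrence $H_{s+1}(z) = z H_s(z) - s H_{s-1}(z)$ of the Chebyshev-Hermite polynomials; a minimal sketch (the function name is illustrative):

```python
def hermite(n):
    """Chebyshev-Hermite (probabilists') polynomial H_n via the
    recurrence H_{s+1}(z) = z*H_s(z) - s*H_{s-1}(z).
    Returns coefficients in ascending powers of z."""
    polys = [[1], [0, 1]]          # H_0 = 1, H_1 = z
    for s in range(1, n):
        prev, cur = polys[s - 1], polys[s]
        nxt = [0] + cur            # multiply H_s by z
        for k, c in enumerate(prev):
            nxt[k] -= s * c        # subtract s * H_{s-1}
        polys.append(nxt)
    return polys[n]

# H_3(z) = z^3 - 3z, i.e. [0, -3, 0, 1] in ascending powers
print(hermite(3))
```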
The cumulant $\gamma_k$ of arbitrary order of some r.v. $\varepsilon$ is expressed through the
moments $m_1, \ldots, m_k$ of the r.v. considered, by the formula [172].
In particular,
\[
\gamma_1 = m_1, \qquad \gamma_2 = m_2 - m_1^2, \qquad \gamma_3 = m_3 - 3m_1 m_2 + 2m_1^3,
\]
and conversely
\[
m_2 = \gamma_2 + \gamma_1^2, \qquad m_3 = \gamma_3 + 3\gamma_1\gamma_2 + \gamma_1^3.
\]
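For low orders these relations can be checked mechanically; a minimal sketch of the first four moment-to-cumulant inversion formulas (the function name is illustrative, and the identities are the classical ones rather than a formula quoted from the book):

```python
def cumulants_from_moments(m):
    """First four cumulants gamma_1..gamma_4 from the raw moments
    m = [m_1, m_2, m_3, m_4]; higher orders follow by expanding
    the logarithm of the moment generating function."""
    m1, m2, m3, m4 = m
    g1 = m1
    g2 = m2 - m1**2
    g3 = m3 - 3*m1*m2 + 2*m1**3
    g4 = m4 - 4*m1*m3 - 3*m2**2 + 12*m1**2*m2 - 6*m1**4
    return g1, g2, g3, g4

# Poisson(1) has raw moments 1, 2, 5, 15 and every cumulant equal to 1
print(cumulants_from_moments([1, 2, 5, 15]))
```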
If $\varepsilon = \varepsilon_j$ and $m_1 = 0$, then $\gamma_3 = m_3$.
$P_\theta$, $P_\theta^n$ are measures induced by a sequence of observations $X_1, \ldots, X_n, \ldots$ of the
form (0.1).
$E_\theta^n$ ($E_\theta$) is the mathematical expectation under the measure $P_\theta^n$ ($P_\theta$).
$D_\theta \xi = E_\theta \xi^2 - (E_\theta \xi)^2$ is the variance of the r.v. $\xi$.
$E$ is the mathematical expectation under the measure $P$.
$m_s = E\varepsilon_j^s$.
$\mu_s = E|\varepsilon_j|^s$.
$\gamma_s$ is the cumulant of degree $s$ of the r.v. $\varepsilon_j$.
$\sum = \sum_{j=1}^{n}$.
$s_n^2 = n^{-1} \sum \varepsilon_j^2$.
$g_i = \dfrac{\partial}{\partial\theta_i}\, g$.
$u_1, u_2 \in U_n(\theta)$.
$g_{i_1 \ldots i_r} = \Bigl(\dfrac{\partial}{\partial\theta_{i_1}} \cdots \dfrac{\partial}{\partial\theta_{i_r}}\Bigr) g$, $r \ge 1$.
$h_{i_1 \ldots i_r} = g_{i_1 \ldots i_r}(j, \theta + n^{1/2} d_n^{-1}(\theta) u)$.
$\alpha = (\alpha_1, \ldots, \alpha_q)$ is a multi-index, with $|\alpha| = \alpha_1 + \cdots + \alpha_q$,
$g^{(\alpha)} = g^{(\alpha)}(j, \theta + n^{1/2} d_n^{-1}(\theta) u)$,
$g^{(\alpha)} = \dfrac{\partial^{|\alpha|}}{(\partial\theta_1)^{\alpha_1} \cdots (\partial\theta_q)^{\alpha_q}}\, g$,
etc.
$P_s(it; \{\bar\chi_\nu(\theta)\})$, $P_s(-\Phi; \{\bar\chi_\nu(\theta)\})$ are polynomials of the a.e. of Section 10 (no-
tations are taken from [33]).
$\blacksquare$ denotes end of proof, end of example, end of remark.
Commentary
CHAPTER 1
1. A lot of attention was paid to questions of the existence of the l.s.e. in the works
of Demidenko [70,71], see also the work of Gallant [83]. The inequality (1.6) was
first made use of by Malinvaud [152] for the proof of the consistency of the l.s.e.
2. The results of Section 2 are basically new. A result close to Theorem 2
is contained in the work of Dzhaparidze and Sieders [75]. Prakasa Rao [184]
considered the case of Gaussian regression. Exponential bounds for the probability
of large deviations of the l.s.e. for an infinite dimensional model of non-linear
regression were presented in the work of Kukush [146].
3. Power bounds of the probability of large deviations for the l.s.e. of the form
(3.4) for q = 1 were first obtained by Dorogovtsev [72] and then by the author
[124]. Theorem 8 was published in the author's work [128] and generalises a result
of Malinvaud [152]. Theorem 9 is due to the author [129]. The property of
consistency of the l.m.e. was discussed in the works of Oberhofer [167], Kozlov
and the author [132,133] and others. Theorems 10 and 11 were published in a
note by the author [130].
4. This Section contains a revised form of the results of the author's work
[128]. A series of assertions about the consistency of statistical estimators of
the amplitude and angular frequency of harmonic oscillations is contained in the
book by Ibragimov and Has'minskii [120]. Other references may be found in the
commentary on Section 5.
5. In this Section we consider the generalisation, to the case of non-identically
distributed observations, of the concept of minimum contrast defined in the work
of Pfanzagl [174] and taken up by Kozlov and the author [133]. The method of the
proof of the consistency, used in Theorem 13, goes back to Wald [208] and Le Cam
[216], and has been developed for various statistical estimators by
many authors: see the bibliography in the article by Wolfowitz [213] and the book
of Zacks [216], also the works of Pfanzagl [174], Dorogovtsev [73], Knopov [144] and
others. The problem of the detection of hidden periodicities in a setting similar
to that of our book was solved in the works of Whittle [211], Walker [209,210],
Hannan [109,110], Has'minskii [120], Knopov [144], Kutoyants [147], the author
[126], and others.
The theme of Sections 2-5 is close to the work of Prakasa Rao [185-187],
van de Geer [86], Bunke [43], Jennrich [138], Hartley and Baker [112], and Birge
and Massart [35].
6. One example, considered at length and related to taking the logarithm of a
non-linear regression model, can be found in Rao [189]. Other examples may be
found in the works of Demidenko [70,71] and in Christensen's monograph [67].
CHAPTER 2
7. The first s.a.e.-s of maximum likelihood estimators in a scheme of identically
distributed observations were obtained by Linnik and Mitrofanova [149,150,162].
Then, under weakened restrictions, such results for more general m.c.e.-s were ob-
tained by Chibisov [55-64] and Pfanzagl [176,177]. Michel [158,159] proposed
an approach, related to the Newton-Raphson computational procedure, for ob-
taining the s.a.e. of statistical estimators. The s.a.e. of different estimators are
obtained in the works of Burnashev [44,45], Gusev [102], and many other authors.
In the work of Bhattacharya and Ghosh [32] broad regularity conditions ensuring
the existence of a s.a.e. of statistical estimators are given. The s.a.e. of the l.s.e.
for the model (0.1) was first obtained by Zwanzig and the author [136], and in
a more general form by Bardadym and the author [14]. Theorem 17 is a revised
variant of the results [136] and uses the scheme of proof in the work of Chibisov
[59]. Theorem 19 has not been previously published.
8. The bound in Theorem 20 for q = 1 for the m.c.e. in a scheme of independent
identically distributed observations was obtained by Michel and Pfanzagl [160]. For
the l.s.e. (q = 1) this result is contained in the work of the author [125]. The first
results on the approximation of the distributions of statistical estimators by a
Gaussian distribution with a rate O(n^{-1/2}) (the Berry-Esseen inequality) are due
to Pfanzagl [175,178]. One sharpening of such results, related to the computation
of the constants in the Berry-Esseen inequality for the m.c.e., was obtained by
Michel [157]. The Berry-Esseen inequalities were also obtained by Prakasa Rao
[183,187], and Bhattacharya and Ghosh [32]. For the distribution of the l.s.e. of a
scalar parameter, using the method of [175] the author obtained [123] the Berry-
Esseen inequality. Theorems 21 and 22 substantially strengthen the results of [123]
and have not been published before. The property of asymptotic normality of the
l.s.e. of an infinite dimensional parameter was studied by Dorogovtsev [74] and
Kukush [146].
9. The asymptotic normality of the l.m.e. for linear and non-linear regression
was studied by Bloomfield and Steiger [37,38], Basset and Koenker [23], Gure-
vich [101], and van de Geer [87]. Theorem 23 was published in [129]. Various
generalisations of it in the case of correlated observations are contained in the
articles of Borshchevsky and the author [40,41]; cf. also the book of Leonenko
and the author [134]. The work [15] of Bardadym and the author is devoted to
the asymptotic normality of the la-estimators.
10. In recent years the theory of the a.e. of the probabilistic characteristics of statis-
tical estimators has been largely completed. In addition to the source literature
mentioned in the commentary on Section 7, we refer to the important theoretical
investigations of Bhattacharya [30,31], Chibisov [58], Hayakawa [113], Chandra
and Ghosh [47,48], Pfanzagl [180], Goetze [91], Skovgaard [205], and Pfanzagl and
Wefelmeyer [181].
An alternative to the Edgeworth expansions of the distributions of statistical
estimators is given by the expansions obtained by the saddlepoint approximation
method, which have a series of useful statistical properties. From the large amount
of current literature on this problem we point to the publications closest to the
theme of our book: Robinson et al. [194], Shuteev [204], and Jing and Robinson [139].
The application of the saddlepoint approximation to non-linear regression analysis
will be carried out in the near future.
The first variants of Theorem 24 were obtained by the author [124] for q = 1
and by Zwanzig and the author [135,137] for q ≥ 1. A full proof of Theorem 24
on the a.e. of a distribution of the l.s.e. with weakened regularity conditions is
published for the first time.
11. The calculation of the initial terms of the a.e. of Theorem 24 and others
is a subject of much applied interest, and is a laborious task. In the calculations
of Section 11 we follow the scheme of Michel's work [158]. Very helpful machinery
for a.e.-s in statistics is developed in the book by Barndorf-Nielsen and Cox [223].
CHAPTER 3
12. The a.e. of the moments of some statistical estimators of a scalar parameter
were obtained by Gusev [102] in connection with a problem of Linnik, see also the
work of Burnashev [45]. For the model (0.1) with Gaussian errors of observation
the first terms of the a.e. of the correlation matrix of the normed l.s.e. without
accounting for the remainder terms was found by Clarke [68]. The first terms of
the a.e. of the bias vector of an l.s.e. were obtained earlier by Box [42]. The Section
contains a revised version of results of the author [127].
13. Theorem 26 generalises a result of Bardadym and the author [13]. Theo-
rem 29 was published in a note by Ivanitskaya and the author [122] and is an essen-
tial strengthening of the results of Schmidt and Zwanzig [199,217]. Lemma 29.1
is a special case of more general assertions of Sadikova [197] and Yurinsky [219],
based on a Lemma of van der Corput. Nevertheless, the proof of Lemma 29.1 uses
other conditions and is not based on van der Corput's Lemma. One general fact
about the c.f. of singular distributions is contained in Bhattacharya [30]. Our
Lemma 29.2 is close to one of the assertions of Qumasiyeh [188].
14. The polynomials Rl and R2 can, in fact, be obtained within the framework
of the linear theory, see also the article by Schmidt and Zwanzig [199]. The
polynomial R3 was found by the δ-method by Grigor'ev [92]. The results of the
Section were published in the work [97] by Grigor'ev and the author.
15. This Section contains, in a revised form, the results of the work of Bar-
dadym and the author [17-19,11]. Theorems 31 and 32 generalise and sharpen
the results of Zwanzig [217]. The work of Shao [220] borders on the theme of this
Section, and we also point to the monograph of Hall [105].
16. Theorem 35 is a basic result of the work of Grigor'ev and the author [99].
The preliminary assertions about the a.e. of the distributions of the functionals
τ^{(4)} and τ^{(1)} were obtained by Bardadym and the author [12,16]. The first a.e. of
the distribution of τ^{(1)} was in fact found by Beale [28]. The χ² a.e. for the likelihood
ratio and other statistics is contained in the principal work of Chandra and Ghosh
[47], see also the work of Hayakawa [113].
The results of the section together with the corresponding results of Chapter 4
immediately border upon the problem of the construction of intervals and regions
of confidence for the non-linear regression parameter. At different times this prob-
lem was considered by Beale [28], Halperin [106], Hamilton, Watts and Bates [24],
Khorosani and Milliken [143], Pazman [168], Bates and Watts [26], Hamilton [107],
and others.
17. We follow the account in the article of Grigor'ev and the author [100].
The extension of the results of this Section to the case q ≥ 2 is more likely to be
associated with difficulties that are calculational rather than of principle.
CHAPTER 4
18. Many publications have been dedicated to the development of geometric
methods in mathematical statistics. We must mention the works of Chentsov [53],
Amari [3-5], Barndorf-Nielsen and Cox [21], Barndorf-Nielsen, Cox and Reid [22],
McCullagh [155], McCullagh and Cox [156], Efron [76,77], Eguchi [78], Saville and
Wood [198], Murray and Rice [164], to name but a few, see the surveys by Kass
[141], and Morozova and Chentsov [54].
The presentation of Section 18 is based on the texts of the theses of Grigor'ev
[93] and the author [131] and their joint publications [94,95]. Subsections 1 and 2
are presented in the spirit of the approach of Amari [4,5], see also [156]. The
measures of non-linearity of a regression model were first studied by Beale [28]
and then by Guttman and Meeter [103], Bates and Watts [25,26], Bates, Watts
and Hamilton [24], and Hamilton and Wiens [108]. The reparametrisation of non-
linear regression models was studied by Haugaard [116,117], Kass [140], Ross [195],
see also the works [25,26] cited above. A series of geometric questions in the theory
of non-linear regression was considered by Clarke [69], Pazman [168,169] (see also
his works [170, 171]). The proof of Theorem 40 is borrowed from [93].
19. Box [42] indicated the possibility of decreasing the bias of the l.s.e. with the
help of the reparametrisation of a model. The representation (19.8) was obtained
by Grigor'ev and the author [95]. The presentation used here for the material of
Subsection 1 is based on the theses of Grigor'ev [93] and the author [131]. The
result of Subsection 2 on the invariance with respect to the reparametrisation
Bibliography
[13] Bardadym, T.A. and Ivanov, A.V. (1986) Asymptotic Expansions Related
to the Estimation of the Error Variance of Observations in the 'Signal Plus
Noise' Model, Theor. Prob. Math. Statist., 33, 11-20
[14] Bardadym, T.A. and Ivanov, A.V. (1985) On a Polynomial Approximation
of Least Squares Estimator Using Non-Standard Normalisation, Dokl. Akad.
Nauk Ukrain. A, 7, 59-61 (in Russian)
[15] Bardadym, T.A. and Ivanov, A.V. (1988) Asymptotic Normality of la-
Estimator of Non-Linear Regression Model Parameters, Dokl. Akad. Nauk
Ukrain. A, 8, 68-70 (in Russian)
[16] Bardadym, T.A. and Ivanov, A.V. (1993) An Asymptotic Expansion of the
Distribution of Some Functional of the Least Squares Estimator, Theor.
Prob. Math. Statist., 47, 1-8
[17] Bardadym, T.A. and Ivanov, A.V. (1995) Asymptotic Expansions Connected
with Jack-Knife Estimator of the Error Variance of the Observation in Non-
Linear Regression Model. I. Ukrain. Mat. Zh., 47, 443-451 (in Russian)
[18] Bardadym, T.A. and Ivanov, A.V. (1995) Asymptotic Expansions Connected
with Jack-Knife Estimator of the Error Variance of the Observation in Non-
Linear Regression Model. II. Ukrain. Mat. Zh., 47, 731-736 (in Russian)
[19] Bardadym, T.A. and Ivanov, A.V. (1995) Asymptotic Properties of a Func-
tional Used as an Estimator of Observational Error Variance in the 'Signal
Plus Noise' Model, Dokl. Akad. Nauk Ukrain. A, 5, 69-72 (in Ukrainian)
[20] Bardadym, T.A. and Ivanov, A.V. Cross-Validation Functional Asymptotic
Expansion, Theor. Prob. Math. Statist., (to appear) (in Ukrainian)
[21] Barndorf-Nielsen, O.E. and Cox, D.R. (1986) Differential and Integral Geo-
metry in Statistical Inference, Some Aspects of Differential Geometry in
Statistical Inference, IMS Monographs, (Institute of Mathematical Statis-
tics: Hayward, CA)
[22] Barndorf-Nielsen, O.E., Cox, D.R. and Reid, N. (1986) The Role of Differ-
ential Geometry in Statistical Theory, Int. Statist. Rev., 54, 83-96
[24] Bates, D.M., Hamilton, D.G. and Watts, D.G. (1982) Accounting for Intrin-
sic Nonlinearity in Nonlinear Regression Parameter Inference Regions, Ann.
Statist., 10, 386-393
[25] Bates, D.M. and Watts, D.G. (1980) Relative Curvature Measures of Non-
linearity, J. Roy. Statist. Soc. B, 42, 1-25
[26] Bates, D.M. and Watts, D.G. (1981) Parameter Transformations for Im-
proved Approximate Confidence Regions in Nonlinear Least Squares, Ann.
Statist., 9, 1152-1167
[27] Bates, D.M. and Watts, D.G. (1988) Nonlinear Regression Analysis and Its
Applications, (Wiley)
[28] Beale, E.M.L. (1960) Confidence Regions in Non-Linear Estimation (with
Discussion), J. Roy. Statist. Soc. B, 22, 41-88
[29] Bellman, R. (1960) Introduction to Matrix Analysis, (McGraw-Hill)
[40] Borshchevsky, A.V. and Ivanov, A.V. (1985) A Property of the Optimum
Point in a Problem of Data Processing by the Least Deviations Method,
Dokl. Akad. Nauk Ukrain. A, 1, 53-56 (in Russian)
[41] Borshchevsky, A.V. and Ivanov, A.V. (1985) On the Normal Approximation
of the Distribution of the Optimum Point in a Problem of Data Processing
by the Least Moduli Method, Kibernetika, 6, 86-92 (in Russian)
[42] Box, M.J. (1971) Bias in Nonlinear Estimation (with Discussion), J. Roy.
Statist. Soc. B, 32, 171-201
[43] Bunke, H. (1977) Parameter Estimation in Nonlinear Regression Models,
Math. Oper.forsch. Statist.: Ser. Statist., 8, 23-40
[44] Burnashev, M.V. (1977) Asymptotic Expansions of Signal Parameter Es-
timators in White Gaussian Noise, Mat. Sbornik 104(146), 179-206 (in
Russian)
[45] Burnashev, M.V. (1981) The Investigation of the Second Order Properties of
Statistical Estimates in the Scheme of Independent Observations, Izv. Akad.
Nauk SSSR: Ser. Mat., 45, 509-539 (in Russian)
[46] Chandra, T.K. (1980) Asymptotic Expansions and Deficiency, (Doctoral
Thesis), (Indian Statistical Institute: Calcutta)
[47] Chandra, T.K. and Ghosh, J.K. (1979) Valid Asymptotic Expansions for
the Likelihood Ratio Statistic and Other Perturbed Chi-Square Variables,
Sankhya A, 41, 22-47
[48] Chandra, T.K. and Ghosh, J.K. (1980) Valid Asymptotic Expansions for
the Likelihood Ratio and Other Statistics Under Contiguous Alternatives,
Sankhya A, 42, 170-184
[49] Chandra, T.K. and Joshi, S.N. (1983) Comparison of the Likelihood Ratio,
Rao's and Wald's Tests and Conjecture of C.R. Rao, Sankhya A, 41, 226-
246
[50] Chandra, T.K. and Mukerjee, R. (1984) On the Optimality of Rao's Statis-
tics, Commun. Statist. - Theor. Meth., 13, 1507-1515
[51] Chandra, T.K. and Mukerjee, R. (1985) Comparison of the Likelihood Ratio,
Wald's and Rao's Tests, Sankhya A, 47, 271-284
[52] Chandra, T.K. and Mukerjee, R. (1991) Bartlett Type Modification for Rao's
Efficient Score Statistics, J. Multivar. Anal., 36, 101-112
[53] Chentsov, N.N. (1972) Statistical Decision Rules and Optimal Inference,
(Nauka: Moscow) (in Russian)
[54] Chentsov, N.N. and Morozova, E.A. (1991) Natural Geometry of the Families
of Probabilistic Laws, The Modern Problems of Mathematics. Fundamental
Directions, (VINITI: Moscow), 83, 133-265 (in Russian)
[55] Chibisov, D.M. (1972) Asymptotic Methods in Mathematical Statistics,
(Doctor of Sciences Thesis), (Moscow) (in Russian)
[56] Chibisov, D.M. (1972) An Asymptotic Expansion for Maximum Likelihood
Estimators, Teor. Veroyatnost. Primen., 17, 387-388 (in Russian)
[68] Clarke, G.P.Y. (1980) Moments of the Least Squares Estimators in a Non-
Linear Regression Model, J. Roy. Statist. Soc. B, 42, 227-237
[69] Clarke, G.P.Y. (1987) Marginal Curvatures and Their Usefulness in the Ana-
lysis of Nonlinear Regression Models, J. Amer. Stat. Assocn., 82, 844-850
[71] Demidenko, E.Z. (1989) Optimisation and Regression, (Nauka: Moscow) (in
Russian)
[75] Dzhaparidze, K. and Sieders, A. (1987) A Large Deviation Result for Para-
meter Estimators and Its Applications to Non-Linear Regression Analysis,
Ann. Statist., 15, 1031-1049
[76] Efron B. (1975) Defining the Curvature of a Statistical Problem (with Ap-
plications to Second Order Efficiency) (with Discussion), Ann. Statist., 3,
1199-1242
[81] Fikhtengolts, G.M. (1966) The Course of Differential and Integral Calculus.
Vol. II, (Nauka: Moscow) (in Russian)
[86] van de Geer, S.A. (1986) On Rates of Convergence in Least Squares Esti-
mation, Report, (Centre for Mathematics and Computational Science: Am-
sterdam)
[87] van de Geer, S.A. (1988) Asymptotic Normality of Minimum L1-Norm Es-
timators in Linear Regression, (Centre for Mathematics and Computational
Science: Amsterdam), MS-R8806
[88] Gelfand, I.M. (1966) Lecture Notes in Linear Algebra, (Nauka: Moscow) (in
Russian)
[89] Gikhman, I.I. and Skorokhod, A.V. (1977) Introduction to the Theory of
Random Processes, (Nauka: Moscow) (in Russian)
[90] Ghosh, J.K. (1991) Higher Order Asymptotic for the Likelihood Ratio, Rao's
and Wald's Tests, Statist. Prob. Lett., 12, 505-509
[91] Goetze, F. (1981) On Edgeworth Expansions in Banach Spaces, Ann. Prob.,
9, 852-859
[92] Grigor'ev, Yu.D. (1992) Asymptotic Distribution of Observation Variance
Estimator in Non-Linear Regression Model, Avtomat. Telemekh., 4, 37-43
(in Russian)
[93] Grigor'ev, Yu.D. (1994) Development and Investigation of Algorithms of
Non-Linear Regression Models Analysis, (Doctor of Sciences Thesis) (State
Technical University: Novosibirsk)
[94] Grigor'ev, Yu.D. and Ivanov, A.V. (1987) Asymptotic Expansions in Non-
Linear Regression Analysis, Zavod. Labor., 53, 48-51 (in Russian)
[95] Grigor'ev, Yu.D. and Ivanov, A.V. On Measures of the Non-Linearity of
Regression Models, Zavod. Labor., 53, 57-61 (in Russian)
[96] Grigor'ev, Yu.D. and Ivanov, A.V. (1991) Asymptotic Expansion of Power
of Criteria for Hypothesis Testing on Nonlinear Regression Parameter under
Contiguous Alternatives, Dokl. Akad. Nauk Ukrain., 1, 7-10 (in Russian)
[97] Grigor'ev, Yu.D. and Ivanov, A.V. (1992) Asymptotic Expansions Associ-
ated with an Estimator of the Observation Error Variance in Non-Linear
Gaussian Regression, Cybernet. Sys. Anal., 28, 62-71
[98] Grigor'ev, Yu.D. and Ivanov, A.V. (1992) On Asymptotic Expansion of the
Distribution of Some Functional of Least Squares Estimator, Dokl. Akad.
Nauk Ukrain., 7, 26-30 (in Russian)
[99] Grigor'ev, Yu.D. and Ivanov, A.V. (1993) Asymptotic Expansions for
Quadratic Functionals of the Least Squares Estimator of a Nonlinear Re-
gression Parameter, Math. Meth. Statist., 2, 269-294
[100] Grigor'ev, Yu.D. and Ivanov, A.V. (1995) Comparison of Powers of a Certain
Class of Criteria for Hypothesis Testing for Non-Linear Regression Paramet-
ers, Siber. Adv. Math., 5, N2, 68-98
[101] Gurevich, V.A. (1983) The Least Moduli Method for the Non-Linear Re-
gression Model, Applied Statistics, (Nauka: Moscow) (in Russian)
[102] Gusev, S.I. (1975) Asymptotic Expansions Associated with Some Statistical
Estimators in the Smooth Case. I, Teor. Veroyatnost. Primen., 20, 488-514;
Asymptotic Expansions Associated with Some Statistical Estimators in the
Smooth Case. II, Teor. Veroyatnost. Primen., 21, 16-33 (in Russian)
[103] Guttman, I. and Meeter, D. (1965) On Beal's Measures of Nonlinearity,
Technometrics, 7, 623-637
[104] Haines, L.M. (1994) A Note on the Differential Geometry of Least Squares
Estimator for Nonlinear Regression Models, S. Afric. Statist. J., 28, 73-91
[105] Hall, P. (1992) The Bootstrap and Edgeworth Expansion, Springer Series in
Statistics, (Springer)
[106] Halperin, M. (1963) Confidence Interval Estimation in Nonlinear Regression,
J. Roy. Statist. Soc. B, 25, 330-333
[132] Ivanov, A.V. and Kozlov, O.M. (1980) On Properties of the Regression Es-
timators for Nonlinear Subjects, Kibernetika, 5, 113-119 (in Russian)
[133] Ivanov, A.V. and Kozlov, O.M. (1981) On Consistency of Minimum Contrast
Estimators in the Case of Non-Identically Distributed Observations, Theor.
Prob. Math. Statist., 23, 63-72
[134] Ivanov, A.V. and Leonenko, N.N. (1989) Statistical Analysis of Random
Fields, (Kluwer Academic Publishers: Dordrecht)
[135] Ivanov, A.V. and Zwanzig, S. (1981) An Asymptotic Expansion for the Dis-
tribution of Least Squares Estimator of a Vector Parameter in Non-Linear
Regression, Sov. Math. Dokl., 23, 118-121
[136] Ivanov, A.V. and Zwanzig, S. (1983) An Asymptotic Expansion of the Least
Squares Estimator of a Non-Linear Regression Vector Parameter, Theor.
Prob. Math. Statist., 26, 45-52
[137] Ivanov, A.V. and Zwanzig, S. (1983) An Asymptotic Expansion of the Dis-
tribution of Least Squares Estimators in the Nonlinear Regression Model,
Math. Oper.forsch. Statist.: Ser. Statist., 14, 7-27
[141] Kass, R.E. (1989) The Geometry of Asymptotic Inference, Statist. Sci., 4,
188-234
[142] Kendall, M., Stuart, A. and Ord, J.K. (1987) Kendall's Advanced Theory
of Statistics, Vol. I, (5th edn), (Charles Griffin & Co.: London)
[143] Korosani, F. and Milliken, G.A. (1982) Simultaneous Confidence Bands for
Nonlinear Regression Models, Commun. Statist. - Theor. Meth., 11, 1241-
1253
[148] Linnik, Yu.V. (1962) Least Squares Method and Foundations of Observation
Theory, (Fizmatgiz: Moscow) (in Russian)
[150] Linnik, Yu.V. and Mitrofanova, N.M. (1965) Some Asymptotic Expansions
for the Distribution of the Maximum Likelihood Estimate, Sankhya. A, 27,
73-82
[154] Markov, A.A. (1898) The Law of Large Numbers and the Least Squares
Methods, Selected Works, (1951) (Izdatel'stvo Akademiya Nauk S.S.S.R:
Moscow) 233-251 (in Russian)
[156] McCullagh, P. and Cox, D.R. (1986) Invariants and Likelihood Ratio Statis-
tics, Ann. Statist., 14, 1419-1430
[157] Michel, R. (1973) The Bound in the Berry-Esseen Result for Minimum Con-
trast Estimates, Metrika, 20, 148-155
[160] Michel, R. and Pfanzagl, J. (1971) The Accuracy of the Normal Approx-
imation for Minimum Contrast Estimates, Z. Wahr.theor. verw. Geb., 18,
73-84
[161] Mishchenko, A.S. and Fomenko, A.T. (1980) Lecture Notes on Differential
Geometry and Topology, (Moscow University Press)
[164] Murray, M.K. and Rice, J.W. (1993) Differential Geometry in Statistics,
Monographs in Statistics and Applied Probability, 48, (Chapman and Hall:
London and New York)
[165] Nagaev, S.V. and Fook, D.X. (1971) Probability Inequalities for the Sums
of Independent Random Variables, Theor. Prob. Appl., 16, 660-675
[166] Neyman, J. and David, F.N. (1938) Extension of the Markoff Theorem on
Least Squares, Statist. Res. Mem., 2, 105-116
[173] Petrov, V.V. (1987) Limit Theorems for Sums of Independent Random Vari-
ables, (Nauka: Moscow) (in Russian)
[175] Pfanzagl, J. (1971) The Berry-Esseen Bound for Minimum Contrast Estim-
ates, Metrika, 17, 82-91
[176] Pfanzagl, J. (1973) Asymptotically Optimum Estimation and Test Proced-
ures, Proc. Prague Symp. on Asymptotic Statistics (1973) Vol. I, (Charles
University: Prague) 201-272
[177] Pfanzagl, J. (1973) Asymptotic Expansions Related to Minimum Contrast
Estimators, Ann. Statist., 1, 993-1026
[178] Pfanzagl, J. (1973) The Accuracy of the Normal Approximation for Estim-
ates of Vector Parameters, Z. Wahr.theor. verw. Geb., 25, 171-198
[179] Pfanzagl, J. (1979) First Order Efficiency Implies Second Order Efficiency,
Contributions to Statistics. J. Hajek Memorial Volume, (Academia: Prague)
167-196
[180] Pfanzagl, J. (1980) Asymptotic Expansions in Parametric Decision Theory,
Developments in Statistics, 3, (Academic Press: New York) 1-97
[181] Pfanzagl, J. and Wefelmeyer, W. (1985) Asymptotic Expansions for General
Statistical Models, Lecture Notes in Statistics, 31, (Springer-Verlag: Berlin)
[184] Prakasa Rao, B.L.S. (1984) On the Exponential Rate of Convergence of the
Least Squares Estimator in the Nonlinear Regression Model with Gaussian
Errors, Statist. Prob. Lett., 2, 139-142
[185] Prakasa Rao, B.L.S. (1984) The Rate of Convergence of the Least Squares
Estimator in a Non-Linear Regression Model with Dependent Errors, J. Mul-
tivar. Anal., 14, 315-322
[186] Prakasa Rao, B.L.S. (1984) The Rate of Convergence of the Least Squares
Estimator in the Nonlinear Regression Model for Multiparameter, (Preprint),
(Indian Statistical Institute)
[189] Rao, C.R. (1965) Linear Statistical Inference and Its Applications, (Wiley)
[190] Ranga Rao, R. (1960) Some Problems in Probability Theory, (D.Phil. Thesis:
University of Calcutta)
[191] Rashevsky, P.C. (1967) Riemannian Geometry and Tensor Analysis, (Nauka:
Moscow) (in Russian)
[192] Ratkowsky, D.A. (1983) Nonlinear Regression Modelling, (Marcel Dekker:
New York)
[193] Ratkowsky, D.A. (1983) Handbook of Nonlinear Regression Models, (Marcel
Dekker: New York)
[194] Robinson, J., Hoeglund, T., Holst, L. and Quine, M.P. (1990) On Approx-
imating Probabilities for Small and Large Deviations in ℝ^d, Ann. Prob., 18,
727-753
[195] Ross, G.J.S. (1982) Nonlinear Models, Math. Oper.forsch. Statist.: Ser.
Statist., 13, 445-453
[196] Ross, G.J.S. (1990) Nonlinear Estimation, (Springer-Verlag)
[197] Sadikova, S.M. (1966) Some Inequalities for Characteristic Functions, Teor.
Veroyatn. Primen., 11, 500-506 (in Russian)
[198] Saville, D.J. and Wood, G.R. (1991) Statistical Methods: The Geometrical
Approach, (Springer)
[199] Schmidt, W.H. and Zwanzig, S. (1986) Second Order Asymptotics in Non-
linear Regression, J. Multivar. Anal., 18, 187-215
[200] Seber, G.A.F. (1977) Linear Regression Analysis, (Wiley)
[201] Seber, G.A.F. and Wild, C.J. (1989) Nonlinear Regression, (Wiley)
[202] Sen, A. and Srivastava, M. (1990) Regression Analysis Theory, Springer
Texts in Statistics, (Springer)
[206] Statulevicius, V.A. (1965) Limit Theorem for Densities and Asymptotic Ex-
pansions for Distributions of the Sums of Independent Random Variables,
Theory Prob. Appl., 10, 645-659 (in Russian)
[207] Survila, P. (1964) One Local Limit Theorem for Densities, Lit. Mat. Sbornik,
4, 535-540 (in Russian)
[208] Wald, A. (1949) Note on the Consistency of the Maximum Likelihood Es-
timate, Ann. Math. Statist., 20, 595-601
[209] Walker, A.M. (1971) On the Estimation of a Harmonic Component in a Time
Series with Stationary Independent Residuals, Biometrika, 58, 21-36
[220] Shao, J. (1992) Consistency of Least Squares Estimator and Its Jackknife
Variance Estimator in the Non-Linear Model, Can. J. Statist., 20, 415-428
[222] Sazonov, V.V. (1972) On a Bound for the Rate of Convergence in the Mul-
tidimensional Central Limit Theorem, Proc. 6th Berkeley Symposium on
Mathematical Statistics and Probability, Vol. II, (University of California
Press: Berkeley), 563-581
[223] Barndorf-Nielsen, O.E. and Cox, D.R. (1989) Asymptotic Techniques for Use
in Statistics, Monographs on Statistics and Applied Probability, (London,
New York: Chapman and Hall)
Index

Jack Knife estimator of error variance 196
  asymptotic expansion of moments 204-206
signed measure 125, 126, 130, 295
simple hypothesis 230
skewness 205, 206, 253, 271
statistical manifold 252
strong consistency of
  least moduli estimator 70, 71
  least squares estimator 64, 65, 67
  minimum contrast estimator 62
P.M. Alberti and A. Uhlmann: Stochasticity and Partial Order. Doubly Stochastic
Maps and Unitary Mixing. 1982,128 pp. ISBN 90-277-1350-2
A.V. Skorohod: Random Linear Operators. 1983,216 pp. ISBN 90-277-1669-2
I.M. Stancu-Minasian: Stochastic Programming with Multiple Objective Functions.
1985,352 pp. ISBN 90-277-1714-1
L. Arnold and P. Kotelenez (eds.): Stochastic Space-Time Models and Limit
Theorems. 1985,280 pp. ISBN 90-277-2038-X
Y. Ben-Haim: The Assay of Spatially Random Material. 1985,336 pp.
ISBN 90-277-2066-5
A. Pazman: Foundations of Optimum Experimental Design. 1986, 248 pp.
ISBN 90-277-1865-2
P. Kree and C. Soize: Mathematics of Random Phenomena. Random Vibrations of
Mechanical Structures. 1986,456 pp. ISBN 90-277-2355-9
Y. Sakamoto, M. Ishiguro and G. Kitagawa: Akaike Information Criterion Statis-
tics. 1986,312 pp. ISBN 90-277-2253-6
G.J. Szekely: Paradoxes in Probability Theory and Mathematical Statistics. 1987,
264 pp. ISBN 90-277-1899-7
O.I. Aven, E.G. Coffman (Jr.) and Y.A. Kogan: Stochastic Analysis of Computer
Storage. 1987,264 pp. ISBN 90-277-2515-2
N.N. Vakhania, V.I. Tarieladze and S.A. Chobanyan: Probability Distributions on
Banach Spaces. 1987,512 pp. ISBN 90-277-2496-2
A.V. Skorohod: Stochastic Equations for Complex Systems. 1987,196 pp.
ISBN 90-277-2408-3
S. Albeverio, Ph. Blanchard, M. Hazewinkel and L. Streit (eds.): Stochastic
Processes in Physics and Engineering. 1988,430 pp. ISBN 90-277-2659-0
A. Liemant, K. Matthes and A. Wakolbinger: Equilibrium Distributions of
Branching Processes. 1988,240 pp. ISBN 90-277-2774-0
G. Adomian: Nonlinear Stochastic Systems Theory and Applications to Physics.
1988,244 pp. ISBN 90-277-2525-X
J. Stoyanov, O. Mirazchiiski, Z. Ignatov and M. Tanushev: Exercise Manual in
Probability Theory. 1988,368 pp. ISBN 90-277-2687-6
E.A. Nadaraya: Nonparametric Estimation of Probability Densities and Regression
Curves. 1988,224 pp. ISBN 90-277-2757-0
H. Akaike and T. Nakagawa: Statistical Analysis and Control of Dynamic Systems.
1998,224 pp. ISBN 90-277-2786-4
Other Mathematics and Its Applications titles of interest:
A.V. Ivanov and N.N. Leonenko: Statistical Analysis of Random Fields. 1989, 256 pp. ISBN 90-277-2800-3
V. Paulauskas and A. Rackauskas: Approximation Theory in the Central Limit Theorem. Exact Results in Banach Spaces. 1989, 176 pp. ISBN 90-277-2825-9
R.Sh. Liptser and A.N. Shiryayev: Theory of Martingales. 1989, 808 pp. ISBN 0-7923-0395-4
S.M. Ermakov, V.V. Nekrutkin and A.S. Sipin: Random Processes for Classical Equations of Mathematical Physics. 1989, 304 pp. ISBN 0-7923-0036-X
G. Constantin and I. Istratescu: Elements of Probabilistic Analysis and Applications. 1989, 488 pp. ISBN 90-277-2838-0
S. Albeverio, Ph. Blanchard and D. Testard (eds.): Stochastics, Algebra and Analysis in Classical and Quantum Dynamics. 1990, 264 pp. ISBN 0-7923-0637-6
Ya.I. Belopolskaya and Yu.L. Dalecky: Stochastic Equations and Differential Geometry. 1990, 288 pp. ISBN 90-277-2807-0
A.V. Gheorghe: Decision Processes in Dynamic Probabilistic Systems. 1990, 372 pp. ISBN 0-7923-0544-2
V.L. Girko: Theory of Random Determinants. 1990, 702 pp. ISBN 0-7923-0233-8
S. Albeverio, Ph. Blanchard and L. Streit: Stochastic Processes and their Applications in Mathematics and Physics. 1990, 416 pp. ISBN 0-7923-0894-8
B.L. Rozovskii: Stochastic Evolution Systems. Linear Theory and Applications to Non-linear Filtering. 1990, 330 pp. ISBN 0-7923-0037-8
A.D. Wentzell: Limit Theorems on Large Deviations for Markov Stochastic Processes. 1990, 192 pp. ISBN 0-7923-0143-9
K. Sobczyk: Stochastic Differential Equations. Applications in Physics, Engineering and Mechanics. 1991, 410 pp. ISBN 0-7923-0339-3
G. Dall'Aglio, S. Kotz and G. Salinetti: Distributions with Given Marginals. 1991, 300 pp. ISBN 0-7923-1156-6
A.V. Skorohod: Random Processes with Independent Increments. 1991, 280 pp. ISBN 0-7923-0340-7
L. Saulis and V.A. Statulevicius: Limit Theorems for Large Deviations. 1991, 232 pp. ISBN 0-7923-1475-1
A.N. Shiryaev (ed.): Selected Works of A.N. Kolmogorov, Vol. 2: Probability Theory and Mathematical Statistics. 1992, 598 pp. ISBN 90-277-2795-X
Yu.I. Neimark and P.S. Landa: Stochastic and Chaotic Oscillations. 1992, 502 pp. ISBN 0-7923-1530-8