
Asymptotic Theory of Nonlinear Regression

Mathematics and Its Applications

Managing Editor:

M. HAZEWINKEL
Centre for Mathematics and Computer Science, Amsterdam, The Netherlands

Volume 389
Asymptotic Theory of
Nonlinear Regression

by

Alexander V. Ivanov
Glushkov Institute for Cybernetics,
Kiev, Ukraine

Springer-Science+Business Media, B.V.


A C.I.P. Catalogue record for this book is available from the Library of Congress.

ISBN 978-90-481-4775-5 ISBN 978-94-015-8877-5 (eBook)


DOI 10.1007/978-94-015-8877-5

Printed on acid-free paper

All Rights Reserved
© 1997 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1997
Softcover reprint of the hardcover 1st edition 1997
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner.
Contents

INTRODUCTION 1

1 CONSISTENCY 5
1 Introductory Remarks . . . . . . . . . . . . . . . . . . . . . . . .. 5
2 Large Deviations of the Least Squares Estimator in the Case of
Errors Having an Exponential Moment . . . . . . . . . . . . . . .. 9
3 Large Deviations of the Least Squares Estimator in the Case of
Errors with a Moment of Finite Order . . . . . . . . . . . . . . .. 25
4 The Differentiability of Regression Functions and the Consistency
of the Least Squares Estimator . . . . . . 45
5 Strong Consistency . . . . . . . . . . . . . . . 58
6 Taking the Logarithm of Non-Linear Models. 73

2 APPROXIMATION BY A NORMAL DISTRIBUTION 79


7 Stochastic Asymptotic Expansion of Least Squares Estimators 79
8 Asymptotic Normality of Least Squares Estimators: First Results. 92
9 Asymptotic Normality of Least Moduli Estimators . . . . . 108
10 Asymptotic Expansion of the Distribution of Least Squares
Estimators. . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
11 Calculation of First Polynomials of an Asymptotic Expansion of the
Distribution of a Least Squares Estimator . . . . . . . . . . 144

3 ASYMPTOTIC EXPANSIONS RELATED TO THE LEAST SQUARES
ESTIMATOR 155
12 Asymptotic Expansion of Least Squares Estimator Moments. 155
13 Asymptotic Expansions Related to the Estimator of the Variance
of Errors of Observation . . . . . . . . . . . . . . . . . . . . . . .. 168
14 Asymptotic Expansion of the Distribution of the Variance
Estimator of Observational Error in Gaussian Regression 188
15 Jack Knife and Cross-Validation Methods of Estimation of the
Variance of Errors of Observation. . . . . . . . . . . . . . . . . 196
16 Asymptotic Expansions of Distributions of Quadratic Functionals
of the Least Squares Estimator . . . . . . . . . . . . . . . . . . . . 207


17 Comparison of Powers of a Class of Tests of Hypotheses on a
Non-Linear Regression Parameter . . . . . . . . . . . . . . 229

4 GEOMETRIC PROPERTIES OF ASYMPTOTIC EXPANSIONS 251


18 Certain Aspects of the Differential Geometry of Models of
Non-Linear Regression . . . . . . . . . . . . . . . . . . . 251
18.1 Embedded Riemannian Manifolds and Statistical
Connectedness . . . . . . . . . . . . . . . . . . . 251
18.2 Statistical Curvature . . . . . . . . . . . . . . . . 254
18.3 Measures of the Non-Linearity of Regression Models 256
18.4 Statistical Invariants . . . . . . . . . . . . . . . . . 267
18.5 Invariants of Non-Linear Regression with a Scalar
Parameter. . . . . . . . . . . . . . . . . . . . . . . 268
18.6 Invariants of Non-Linear Regression with a Vector
Parameter. . . . . . . . . . . . . . . . . . . . . . . 272
19 The Geometric Interpretation of Asymptotic Expansions. 276
19.1 Geometry of the AE of the LSE Moments. . . . . 276
19.2 The Geometry of AEs Associated with the Estimator of the
Variance σ² . . . . . . . . . . . . . . . . . . . . . . 279
19.3 Geometry of AE of Distributions of Quadratic Functionals
of the LSE . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
19.4 Geometries of the Statistical Criteria for Testing Hypotheses
about Non-Linear Regression Parameters .. . . . . . . .. 284

APPENDIX 289
I Subsidiary Facts . . . . . . 289
II List of Principal Notations. 298

COMMENTARY 303
Chapter 1 303
Chapter 2 304
Chapter 3 305
Chapter 4 306

BIBLIOGRAPHY 309

INDEX 325
Introduction

Let us assume that an observation Xᵢ is a random variable (r.v.) with values in
(ℝ¹, B¹) and distribution Pᵢ (ℝ¹ is the real line, and B¹ is the σ-algebra of its
Borel subsets). Let us also assume that the unknown distribution Pᵢ belongs to a
certain parametric family {P_{iθ}, θ ∈ Θ}. We call the triple ℰᵢ = {ℝ¹, B¹, P_{iθ}, θ ∈ Θ}
a statistical experiment generated by the observation Xᵢ.
We shall say that a statistical experiment ℰⁿ = {ℝⁿ, Bⁿ, P_θⁿ, θ ∈ Θ} is the
product of the statistical experiments ℰᵢ, i = 1, …, n, if P_θⁿ = P_{1θ} × ⋯ × P_{nθ} (ℝⁿ
is the n-dimensional Euclidean space, and Bⁿ is the σ-algebra of its Borel subsets).
In this manner the experiment ℰⁿ is generated by n independent observations
X = (X₁, …, Xₙ).
In this book we study the statistical experiments ℰⁿ generated by observations
of the form

    Xⱼ = g(j, θ) + εⱼ, j = 1, …, n. (0.1)

In (0.1) g(j, θ) is a non-random function defined on Θᶜ, where Θᶜ is the closure in
ℝ^q of the open set Θ ⊆ ℝ^q, and the εⱼ are independent r.v.-s with common distribution
function (d.f.) P not depending on θ.
Let E_θ (E) denote expectation with respect to the measure P_θⁿ (P). Let us
assume that E_θ Xⱼ = g(j, θ) (E εⱼ = 0). We shall call the representation (0.1)
a regression model, and g(j, θ), understood as a function on ℕ × Θᶜ (ℕ is the
set of natural numbers), a regression function. If there exists a parametrisation of
the regression function for which g(j, θ) is a linear form in the coordinates of the
vector θ = (θ¹, …, θ^q), then the observational model (0.1) is called a linear
regression model. Otherwise the model (0.1) is called non-linear.
The totality of methods of statistical inference about the unknown parameters θ =
(θ¹, …, θ^q) of the function g(j, θ) and about probability characteristics of the r.v.-s εⱼ
(for example, the variance σ²) from the observations Xⱼ, j = 1, …, n, traditionally
belongs to regression analysis. Accordingly, the models (0.1) admitted by
experimentalists, whether linear or non-linear, are referred to as linear or
non-linear regression models.
The prevalence in regression analysis of linear models (see, for example, the
books of Seber [200] and Sen and Srivastava [202]) reflects the fact that for solv-
ing concrete problems - using the terminology accepted in applications - the un-
known response of the system studied is replaced by its polynomial approximation


and the coefficients of the approximating polynomials play the role of the estimated
parameters. Nevertheless, non-linear regression models have important advantages over
linear ones. The main one consists in the greater adequacy of non-linear
models that have essentially fewer unknown parameters. Often the parameters of
non-linear models have the meaning of physical variables, while the parameters of
linear models are usually devoid of physical significance.
Over the last quarter of a century the study of non-linear regression models has
steadily attracted the interest of specialists; we refer only to the books of Bard [9],
Ratkowsky [192,193], Gallant [84], Bates and Watts [27], Demidenko [70,71], Seber
and Wild [201], and Ross [196]. However, in introducing non-linear regression
analysis into statistical practice it is necessary to overcome a series of mathematical
difficulties which have no analogues in the linear theory. For example, the least
squares estimator (l.s.e.) of parameters that enter a regression function non-linearly
cannot be found in explicit form, and this complicates the description of its
mathematical properties. Another principal peculiarity consists in this: character-
istics of the l.s.e. (bias, correlation matrix, and the like) are local, i.e., depend on
the unknown values of the parameters. These two circumstances bring the model (0.1)
close to the general models of parametric statistical inference from independent,
non-identically distributed observations, although one should not overlook that all
the inhomogeneity of the sample X = (X₁, …, Xₙ) in (0.1) is concentrated only in
the shifts g(j, θ), j = 1, …, n.
The maximum likelihood method plays an important role in the theory of
parametric estimation; for its application it is necessary to know, to within
the unknown parameter, the distribution (or the density) of the observations.
The Bayes method of estimation and the method of maximum a posteriori
density share the same peculiarity.
Without forgetting the success that has fallen to the theory of normal regression
(the εⱼ are Gaussian (0, σ²) r.v.-s), let us emphasise that the point of regres-
sion analysis consists in the performance of statistical inference using minimal
information about the distribution of the r.v.-s εⱼ. Consequently the methods of
estimation used in regression analysis must be oriented towards the use of mea-
gre information about the errors of observation εⱼ. Such methods are those of
M-estimators, the minimisation of empirical risk, minimal contrast (if the con-
trast function does not depend upon the density of the observations), the method of
recurrent estimation, and others (see the book by Birkes and Dodge [36]).
In the stream of publications devoted to the problems of regression analysis
the central place is occupied by the least squares method of parameter estimation,
which has a long history. The basic statements of the linear theory
were worked out in the researches of Gauss [85] and Markov [153,154], and then
developed by Aitken [7], Neyman and David [166], Linnik [148], Rao [189], and
many others.

DEFINITION: The l.s.e. of an unknown parameter θ ∈ Θ, obtained from the ob-
servations X = (X₁, …, Xₙ) of the form (0.1), is any random vector (θ̂ₙ¹, …, θ̂ₙ^q) = θ̂ₙ =
θ̂ₙ(X) ∈ Θᶜ having the following property:

    L(θ̂ₙ) = inf_{τ ∈ Θᶜ} L(τ), L(τ) = Σ [Xⱼ − g(j, τ)]², (0.2)

(the symbol 'Σ' stands in the whole book for the summation over the index j from
1 to n).
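Definition (0.2) translates directly into computation: one searches Θᶜ for a minimiser of the sum of squares. A minimal Python sketch under assumed ingredients (the regression function g(j, θ) = e^{−θj/50}, the parameter set Θᶜ = [0, 5], and all numerical values are illustrative choices, not taken from the text):

```python
import math
import random

random.seed(1)

# Illustrative regression function and true parameter (assumptions).
def g(j, theta):
    return math.exp(-theta * j / 50.0)

n, theta_true, sigma = 200, 1.5, 0.1

# Observations X_j = g(j, theta) + eps_j, j = 1, ..., n, as in (0.1).
X = [g(j, theta_true) + random.gauss(0.0, sigma) for j in range(1, n + 1)]

# L(tau) = sum over j of [X_j - g(j, tau)]^2, the functional in (0.2).
def L(tau):
    return sum((X[j - 1] - g(j, tau)) ** 2 for j in range(1, n + 1))

# The l.s.e. is any minimiser of L over Theta^c; a crude grid search
# over Theta^c = [0, 5] stands in for a real optimiser here.
theta_hat = min((k * 0.005 for k in range(1001)), key=L)
```

For a non-linear g no closed form for θ̂ₙ exists in general, which is exactly the difficulty discussed above; in practice the grid search would be replaced by Gauss–Newton or a similar iterative method.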
It follows that we should recognise that the asymptotic (i.e., n → ∞) properties
of the l.s.e. of the parameter θ of the non-linear model (0.1) were in fact not studied
until the works of Jennrich [138] and Malinvaud [152], and the basic results in this
area of mathematical statistics were obtained in the course of the last two decades.
A large part of the proposed investigation is devoted to the study of the asymp-
totic statistical properties of the l.s.e. θ̂ₙ of the parameter θ of the model (0.1):
for example, the probabilities of large deviations of the l.s.e., its weak and strong
consistency, stochastic asymptotic expansions (s.a.e.) and asymptotic expansions
(a.e.) of the l.s.e. θ̂ₙ, of its d.f., and of various functionals of it (in particular, of the
estimator of the variance E ε₁² = σ² > 0 of the errors of observation εⱼ, and others).
A series of questions is also examined that is connected with the differential geometry
of the a.e. obtained in this book. Less attention is given in this book to the least
moduli estimator (l.m.e.).
DEFINITION: The l.m.e. of the parameter θ ∈ Θ for the observations X = (X₁, …,
Xₙ) of the form (0.1) is any random vector (θ̂ₙ¹, …, θ̂ₙ^q) = θ̂ₙ = θ̂ₙ(X) ∈ Θᶜ having
the property

    R(θ̂ₙ) = inf_{τ ∈ Θᶜ} R(τ), R(τ) = Σ |Xⱼ − g(j, τ)|. (0.3)
The least moduli method, as a method of smoothing out observations, first
appeared in the work of Boscovitch [79], earlier than Gauss' method of least squares;
however, it did not attain such wide prevalence as the latter because of the non-
differentiability of the functional (0.3). This book contains some statements about
the consistency and asymptotic normality of the l.m.e. θ̂ₙ.
In the text the sections and theorems are numbered consecutively
(in the Appendix a theorem's number contains the letter 'A'). Lemmas, remarks
and corollaries have two numbers: the first refers to the number of the theorem
to which they are related; the second is their own number. In the body of the work
a double numbering is adopted for formulae: the first number refers to the section
and the second to the number of the formula within it. In conditions and proofs
we often write 'for n > n₀' in place of the expression 'for sufficiently large n', or
perhaps nothing at all if we seek to avoid needless repetition.
Chapter 1

Consistency

1 INTRODUCTORY REMARKS

Let ℰⁿ = {ℝⁿ, Bⁿ, P_θⁿ, θ ∈ Θ} be a statistical experiment generated by the inde-
pendent observations (0.1), Θ ⊆ ℝ^q. Let us write

    L(τ) = L(τ, X) = Σ [Xⱼ − g(j, τ)]²,

    X = (X₁, …, Xₙ) ∈ ℝⁿ, τ = (τ¹, …, τ^q) ∈ Θᶜ.
We consider briefly the question of the existence of the l.s.e. θ̂ₙ, i.e., the question
of the existence of an r.v. θ̂ₙ = θ̂ₙ(X) with values in Θᶜ satisfying the equation

    L(θ̂ₙ, X) = inf_{τ ∈ Θᶜ} L(τ, X).

Here the simplest fact, but one very important in applications, is this: if Θᶜ is
compact and g(j, θ) ∈ C(Θᶜ), j ≥ 1, then the l.s.e. θ̂ₙ exists [138].
Clearly, if Θᶜ is compact and g(j, θ) ∈ C(Θᶜ), j ≥ 1, then an analogous assertion
holds for the l.m.e. θ̂ₙ and for any other estimators of the parameter θ that are
defined as optimum points of a functional continuous in the arguments (θ, X).
It is clear that the case when Θ is an unbounded set is also of interest.
Let us mention one assertion covering this case for continuous functions g(j, θ),
j ≥ 1.
Let us assume that the functions g(j, θ) ∈ C(Θᶜ), j ≥ 1, are such that
inf_{θ∈Θᶜ} L(θ, X) is attained in Θᶜ for each X ∈ ℝⁿ. Then the l.s.e. θ̂ₙ exists [174].
If, even for only one of the functions g(j, θ),

    lim_{|θ|→∞} |g(j, θ)| = ∞,

then for each X ∈ ℝⁿ and number m > 0 it is possible to determine a compact
set T_{X,m} ⊂ Θᶜ such that

    inf_{θ ∈ Θᶜ \ T_{X,m}} L(θ, X) > m,

and consequently inf L(θ, X) is attained in Θᶜ.


Let us note that the assertions mentioned above are a very special case of
theorems on measurable choice [8,145] and that, in actuality, one is able to prove
the existence of the l.s.e. θ̂ₙ under considerably weaker requirements. We do not
cite the corresponding formulations, since only continuous and differentiable
regression functions are considered below.
Let ℰⁿ, n ≥ 1, be a sequence of statistical experiments generated by inde-
pendent observations X = (X₁, …, Xₙ), and let θ̂ₙ = θ̂ₙ(X) be a sequence of
estimators of the parameter θ ∈ Θ.
DEFINITION: A sequence θ̂ₙ, n ≥ 1, is called a consistent sequence of estimators
of θ (θ̂ₙ is a consistent estimator of θ) if for any r > 0

    P_θⁿ{|θ̂ₙ − θ| ≥ r} → 0, n → ∞. (1.1)
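The convergence in (1.1) can be watched numerically. For a model in which the l.s.e. is explicit, namely the linear regression function g(j, θ) = θ·j/n (chosen here only so that θ̂ₙ has a closed form; the model and all constants are illustrative assumptions), the frequency of the event {|θ̂ₙ − θ| ≥ r} falls as n grows:

```python
import random

random.seed(3)

theta0, r, reps = 2.0, 0.3, 400

def freq_of_large_deviation(n):
    # For g(j, theta) = theta * x_j with x_j = j / n the l.s.e. is explicit:
    # theta_hat = sum_j x_j X_j / sum_j x_j^2.
    xs = [j / n for j in range(1, n + 1)]
    sxx = sum(x * x for x in xs)
    hits = 0
    for _ in range(reps):
        X = [theta0 * x + random.gauss(0.0, 1.0) for x in xs]
        theta_hat = sum(x * v for x, v in zip(xs, X)) / sxx
        hits += abs(theta_hat - theta0) >= r
    return hits / reps

f25, f400 = freq_of_large_deviation(25), freq_of_large_deviation(400)
```

Under these assumptions the variance of θ̂ₙ is roughly 3σ²/n, so the deviation frequency at n = 400 is already negligible compared with n = 25.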

Since the experimenter does not know the value of the parameter θ, it is im-
portant to guarantee uniform convergence to zero of the probability (1.1) over some
sets of parameters in the set Θ.
DEFINITION: A sequence θ̂ₙ, n ≥ 1, is called a uniformly consistent sequence of
estimators of θ in the set T ⊆ Θ (θ̂ₙ is a uniformly consistent estimator of θ in the
set T) if for any r > 0

    sup_{θ∈T} P_θⁿ{|θ̂ₙ − θ| ≥ r} → 0, n → ∞. (1.2)

If the observations X have the form (0.1) and θ̂ₙ is the l.s.e., then it is not
difficult to adduce an example of a sequence of functions g(j, θ) for which θ̂ₙ does
not satisfy (1.1).
EXAMPLE 1: (Cf. [152]). Let

    g(j, θ) = e^{−θj}, θ ∈ Θ = (0, A), A < ∞; E εⱼ² = σ² < ∞.

Let us assume that θ̂ₙ is a consistent estimator of the parameter θ = θ₀. If
|θ̂ₙ − θ₀| < r for r < θ₀, then θ̂ₙ satisfies the equation

    dL(θ̂ₙ)/dθ = 0,

or

    0 = Σ j e^{−θ₀j} εⱼ + Σ j (e^{−θ̂ₙj} − e^{−θ₀j}) εⱼ + Σ j e^{−θ̂ₙj}(e^{−θ₀j} − e^{−θ̂ₙj})
      = αₙ + βₙ + γₙ,

with

    |βₙ| ≤ r Σ j² e^{−(θ₀−r)j} |εⱼ|,

    |γₙ| ≤ r Σ j² e^{−(θ₀−r)j} ≤ c₁(θ₀) r.

Let us introduce the event

    A = {Σ j² e^{−(θ₀−r)j} |εⱼ| < r^{−1/2}}.

Then for elementary events from the set {|θ̂ₙ − θ₀| < r} ∩ A the inequality |αₙ| <
r^{1/2} + c₁(θ₀) r holds. Consequently,

    P_{θ₀}ⁿ{|αₙ| ≥ r^{1/2} + c₁(θ₀) r} ≤ P_{θ₀}ⁿ{|θ̂ₙ − θ₀| ≥ r} + P{Ā}. (1.3)

By Markov's inequality

    P{Ā} ≤ r^{1/2} E|ε₁| Σ j² e^{−(θ₀−r)j} ≤ c₂(θ₀) r^{1/2}.

Since θ̂ₙ is assumed to be a consistent estimator of θ₀, from (1.3) it follows
that αₙ converges in P_{θ₀}ⁿ-probability to zero. On the other hand, the r.v. αₙ, as
n → ∞, converges in mean square to the r.v.

    α = Σ_{j=1}^{∞} j e^{−θ₀j} εⱼ,

whence

    E α² = σ² Σ_{j=1}^{∞} j² e^{−2θ₀j} > 0.

The contradiction obtained shows that in the example under consideration the
l.s.e. θ̂ₙ is not a consistent estimator of θ₀.
For the l.s.e. θ̂ₙ the property (1.2), moreover, does not always hold. Let
us denote

    φₙ(θ₁, θ₂) = Σ [g(j, θ₁) − g(j, θ₂)]².
In the model (0.1) let the εⱼ have a Gaussian distribution with parameters (0, 1). If
the sequence of functions g(j, θ) is such that for some θ₁, θ₂ ∈ T ⊆ Θ, θ₁ ≠ θ₂,

    sup_{n≥1} φₙ(θ₁, θ₂) < ∞,

then there do not exist uniformly consistent estimators of the parameter θ in T
obtained from the observations X [120]. Indeed, for the model (0.1) a more general
fact holds [214].
In the model (0.1) let the r.v.-s εⱼ have an almost everywhere positive differentiable
density p(x) with a finite Fisher information number

    ∫_{−∞}^{∞} (p′(x))²/p(x) dx < ∞.

If there exists a consistent estimator θ̂ₙ of the parameter θ, then for any θ₁, θ₂ ∈ Θ,
θ₁ ≠ θ₂,

    φₙ(θ₁, θ₂) → ∞, n → ∞.
Let us restrict ourselves to giving only a short explanation. Let (ℝ^ℕ, B^ℕ) be
the direct product of a countable number of components (ℝ¹, B¹). Let us consider
the statistical experiment ℰ^ℕ = {ℝ^ℕ, B^ℕ, P_θ^ℕ, θ ∈ Θ} generated by the sequence
of independent observations X^ℕ = (X₁, X₂, …) having the form (0.1),

    P_θ^ℕ = ∏_{j∈ℕ} P_{jθ}.

The definitions of consistency retain their meaning if in (1.1) and (1.2) we
write P_θ^ℕ instead of P_θⁿ. If the estimator θ̂ₙ is consistent, then the measures
{P_θ^ℕ, θ ∈ Θ} form a family of mutually singular measures. The sequence of r.v.-s
X^ℕ which corresponds to the measure P_{θ₁}^ℕ is a shift of the sequence X^ℕ which
corresponds to the measure P_{θ₂}^ℕ. For the singular measures P_{θ₁}^ℕ and P_{θ₂}^ℕ the
series Σ_{j=1}^{∞} [g(j, θ₁) − g(j, θ₂)]² diverges [115,203,213].

Let us remark that in Example 1 the series Σ_{j=1}^{∞} (e^{−θ₁j} − e^{−θ₂j})² converges,
and consequently under its conditions there generally does not exist a consistent
estimator of θ.
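The divergence criterion just stated is easy to probe numerically. For Example 1, g(j, θ) = e^{−θj}, the partial sums of Σⱼ [g(j, θ₁) − g(j, θ₂)]² plateau almost immediately, while for a regression function such as g(j, θ) = θ cos j (an illustrative choice, not from the text) they grow without bound; only in the second case can a consistent estimator exist. A small check under these assumptions:

```python
import math

def phi_n(g, theta1, theta2, n):
    # phi_n(theta1, theta2) = sum_{j=1}^{n} (g(j, theta1) - g(j, theta2))^2
    return sum((g(j, theta1) - g(j, theta2)) ** 2 for j in range(1, n + 1))

def g_exp(j, t):           # Example 1: the series converges
    return math.exp(-t * j)

def g_cos(j, t):           # illustrative alternative: the series diverges
    return t * math.cos(j)

conv_50, conv_500 = phi_n(g_exp, 1.0, 2.0, 50), phi_n(g_exp, 1.0, 2.0, 500)
div_50, div_500 = phi_n(g_cos, 1.0, 2.0, 50), phi_n(g_cos, 1.0, 2.0, 500)
```

The tail of the convergent series beyond n = 50 is far below machine precision, mirroring the failure of the measures to separate; for g_cos, φₙ grows like n/2 times (θ₁ − θ₂)².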
Thus there exists a link between the consistency properties of estimators
of the parameter θ in the model (0.1) and the character of the behaviour of the
function φₙ(θ₁, θ₂). In Sections 2-4 this link is investigated in detail in order to
obtain sufficient conditions for the uniform consistency of the l.s.e. θ̂ₙ of the
parameter θ of the model (0.1). The study of strong consistency, i.e., convergence
of an estimator to the true value of the parameter almost certainly (a.c.), is
postponed to Section 5.
Let us denote

    b(θ) = Σ εⱼ g(j, θ), w(θ₁, θ₂) = b(θ₁) − b(θ₂);

    v(θ₁, θ₂) = φₙ^{−1}(θ₁, θ₂) w(θ₁, θ₂), θ₁ ≠ θ₂; v(θ₁, θ₂) = 0, θ₁ = θ₂.

By the definition of the l.s.e. θ̂ₙ, P_θⁿ-a.c.,

    s* = n^{−1} L(θ) ≥ n^{−1} L(θ̂ₙ) = s* − 2n^{−1} w(θ̂ₙ, θ) + n^{−1} φₙ(θ̂ₙ, θ),

or

    v(θ̂ₙ, θ) ≥ 1/2. (1.4)

Let us assume that for any δ > 0 and θ₁, θ₂ ∈ Θᶜ with |θ₁ − θ₂| ≥ δ

    φₙ(θ₁, θ₂) > 0. (1.5)

Let Bₙ(θ) ∈ B^q be a Borel set whose closure B̄ₙ(θ) does not contain θ. If
θ̂ₙ ∈ Bₙ(θ), then from (1.4) and (1.5) it follows that

    P_θⁿ{θ̂ₙ ∈ Bₙ(θ)} ≤ P_θⁿ{ sup_{τ ∈ B̄ₙ(θ)} v(τ, θ) ≥ 1/2 }. (1.6)

The inequality (1.6) provides the basis for obtaining sufficient conditions for the
consistency of the l.s.e. θ̂ₙ. In particular, if Bₙ(θ) is the exterior of an open ball
of radius r with centre at θ, then the convergence of the right-hand side of (1.6)
to zero as n → ∞ implies (1.1). The inequality (1.6) coincides in meaning with
the observation on p. 60 of the book by Ibragimov and Has'minskii [120].

2 LARGE DEVIATIONS OF THE LEAST SQUARES ESTIMATOR IN
THE CASE OF ERRORS HAVING AN EXPONENTIAL MOMENT

In the proofs of the assertions of this and the following sections a general approach
to the study of the probabilities of large deviations of statistical estimators is used,
which was developed in the works of Ibragimov and Has'minskii [120]. In this
connection moment conditions are imposed upon the r.v.-s εⱼ only.
Let us assume that the r.v. ε₁ satisfies the condition:
I₀₀: There exist constants τ > 0 and 0 < R ≤ ∞ such that

    E exp{t ε₁} ≤ exp{τ t²/2} for |t| ≤ R.
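Condition I₀₀ can be checked in closed form for concrete error distributions. For ε uniform on [−1, 1] (an illustrative choice, not taken from the text) the moment generating function is E e^{tε} = sinh(t)/t, and, taking the condition in the sub-Gaussian form E e^{tε₁} ≤ e^{τt²/2} used here, the bound holds with τ = 1 and R = ∞; a numerical verification under these assumptions:

```python
import math

# Check E exp(t * eps) <= exp(tau * t^2 / 2) for eps ~ Uniform[-1, 1],
# whose moment generating function is sinh(t) / t (equal to 1 at t = 0).
def mgf_uniform(t):
    return math.sinh(t) / t if t != 0.0 else 1.0

tau = 1.0  # claimed constant; verified on a grid of t values below
grid = [k / 10.0 for k in range(-100, 101)]
ok = all(mgf_uniform(t) <= math.exp(tau * t * t / 2.0) for t in grid)
```

Since the bound holds for every t, such an error law falls in the R = ∞ regime of the results below; a Gaussian (0, σ²) error satisfies the same inequality exactly, with τ = σ².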
In this Section it will be shown that the fulfilment of the condition I₀₀ and of
a series of requirements on the functions g(j, θ) ensures an exponential bound for
the probability of large deviations of the normalised l.s.e. θ̂ₙ.
Let

    dₙ = dₙ(θ) = diag(d_{in}(θ); i = 1, …, q)

be a diagonal matrix of order q with elements d_{in}, i = 1, …, q, along the diagonal.
The matrices dₙ will be used to normalise the deviations θ̂ₙ − θ.
Let us write

    Φₙ(u₁, u₂) = φₙ(θ + dₙ^{−1}u₁, θ + dₙ^{−1}u₂).

For a fixed θ ∈ Θ the function Φₙ is defined on the set Uₙᶜ(θ) × Uₙᶜ(θ),

    Uₙᶜ(θ) = dₙ(θ)(Θᶜ − θ).

Let T ⊆ Θ be some compact set. Let us assume the following:


II₀. For some α ∈ (0, 1] there exist constants c₁ = c₁(T) < ∞ and c₂ = c₂(T) ≥
0 such that

    sup_{θ∈T} sup_{u₁,u₂ ∈ v₀(Q) ∩ Uₙᶜ(θ)} Φₙ^{1/2}(u₁, u₂) |u₁ − u₂|₀^{−α} ≤ c₁(1 + Q^{c₂}), Q > 0, (2.1)

where v₀(Q) = {u ∈ ℝ^q : |u|₀ ≤ Q} and |·|₀ is the uniform norm.
The subsequent conditions, conditions of distinguishability of the parameters of the
regression function g(j, θ), are more subtle. Let

    Ψₙ(ρ¹, …, ρ^q), n ≥ 1,

be a sequence of functions,

    Ψₙ(0, …, 0) = 0.

Let

    z_{in} ↑ ∞, n → ∞, i = 1, …, m, m ≤ q,

be some numerical sequences. For m = 0, 1, …, q let us put

    Ψ⁰_{mn}(x) = inf{ Ψₙ(|ρ¹|, …, |ρ^q|) : |ρⁱ| ≤ z_{in}, i = 1, …, m; |ρ|₀ ≥ x }.

DEFINITION: We shall say that a function Ψₙ(ρ¹, …, ρ^q):

(1) belongs to the class D(m), 0 ≤ m < q, if there exists a function

    Ψ_{mn}(x) ≤ Ψ⁰_{mn}(x), x ≥ 0,

monotonically non-decreasing in n and x, such that for any a > 0

    lim_{n→∞, x→∞} x^a e^{−Ψ_{mn}(x)} = 0; (2.2)

(2) belongs to the class D(q) if there exists a function

    Ψ_{qn}(x) ≤ Ψ⁰_{qn}(x), x ≥ 0,

monotonically non-decreasing in n and x, such that for any a > 0 and some
constant c₃(a) < ∞

    x^a e^{−Ψ_{qn}(x)} ≤ c₃(a), x ≥ 0, n ≥ 1. (2.3)

Let us introduce the conditions (m = 0, 1, …, q)

III_m: Φₙ^{1/2}(u, 0) ≥ Ψₙ(|u¹|, …, |u^q|), u ∈ Uₙᶜ(θ), θ ∈ T,

for some function Ψₙ ∈ D(m).

The fulfilment of the condition III₀ signifies that the function Φₙ^{1/2}(u, 0) dis-
tinguishes all variables u¹, …, u^q (or, what amounts to the same, that the regression
function g(j, θ) distinguishes all the parameters θ¹, …, θ^q) so well that (2.2) holds.
The conditions III_m, m = 1, …, q − 1, describe the situation in which the vari-
ables u¹, …, u^m (the parameters θ¹, …, θ^m) are distinguished well only in certain
intervals of values depending upon n, and outside these intervals the functions
Φₙ^{1/2}(u, 0) lose sensitivity as the quantities |u¹|, …, |u^m| grow. If also |uⁱ| ≤ z_{in},
i = 1, …, m, then good discrimination of the variables as |u|₀ → ∞ is expressed
by the relation (2.2), which is realised at the expense of the growth of the function
Φₙ^{1/2}(u, 0) in response to the growth of |u^{m+1}|, …, |u^q|. The condition III_q
covers the cases when Φₙ^{1/2}(u, 0) discriminates the variables well (relation (2.3))
only inside the parallelepiped defined by the sequences z_{in}, i = 1, …, q.
EXAMPLE 2: Let

    g(j, θ) = θ¹ cos θ²j, j ≥ 1,
    Θ = (0, A) × (h, π − h), h > 0, A < ∞,
    T = [a, b] × [φ, π − φ], φ > h, a > 0, b < A.

Let us set dₙ = diag(n^{1/2}, n^{3/2}). Then

    Φₙ(u, 0) = (u¹)² + θ¹(θ¹ + n^{−1/2}u¹) n (1 − sin((n+1)n^{−3/2}u²) / (2n sin(½ n^{−3/2}u²))) + χₙ,

where χₙ is uniformly bounded in n and θ ∈ T. If

    |u²| ≤ πn^{1/2},

then, taking advantage of the inequalities

    x − x³/6 ≤ sin x ≤ x − x³/6 + x⁵/120, x ≥ 0, θ¹(θ¹ + n^{−1/2}u¹) ≥ a²,

we find that

    Φₙ(u, 0) ≥ (u¹)² + a²n (1 − sin((n+1)n^{−3/2}u²) / (2n sin(½ n^{−3/2}u²))) − κ,

where κ > 0 is some constant. And so it is possible to take

    Ψₙ(ρ¹, ρ²) = (ρ¹)² + a²n (1 − sin((n+1)n^{−3/2}ρ²) / (2n sin(½ n^{−3/2}ρ²))) − κ,

    Ψ₁ₙ(x) = min(1, (a²/2)(1 − π²/20)) x² − κ.
Let us introduce the events

    A⁽ᵐ⁾ₙ(θ) = {|d_{in}(θ)(θ̂ₙⁱ − θⁱ)| ≤ z_{in}, i = 1, …, m}, m = 1, …, q;

    A⁽⁰⁾ₙ(θ) =def ℝ^q.

THEOREM 1: Let the conditions I₀₀, II₀, III_m be satisfied for some 0 ≤ m ≤ q.
Then there exist positive constants c₄-c₇ such that for all sufficiently large H

    sup_{θ∈T} P_θⁿ{A⁽ᵐ⁾ₙ(θ); |dₙ(θ)(θ̂ₙ − θ)|₀ ≥ H} ≤ c₄ exp{−c₅ Ψ_{mn}(H)}, R < ∞; (2.4)

    sup_{θ∈T} P_θⁿ{A⁽ᵐ⁾ₙ(θ); |dₙ(θ)(θ̂ₙ − θ)|₀ ≥ H} ≤ c₆ exp{−c₇ Ψ²_{mn}(H)}, R = ∞. (2.5)

Proof: Let us define the sets

    𝔞⁽⁰⁾ₙ =def ℝ^q, 𝔞⁽ᵐ⁾ₙ = {u ∈ ℝ^q : |uⁱ| ≤ z_{in}, i = 1, …, m}, m = 1, …, q.

By the inequality (1.6), rewritten for the normalised θ̂ₙ,

    P_θⁿ{A⁽ᵐ⁾ₙ(θ); |dₙ(θ)(θ̂ₙ − θ)|₀ ≥ H}
    ≤ P_θⁿ{ sup_{u : |u|₀ ≥ H, u ∈ 𝔞⁽ᵐ⁾ₙ ∩ Uₙᶜ(θ)} v(θ + dₙ^{−1}u, θ) ≥ 1/2 }. (2.6)

Let us consider the sequence of sets

    U(p) = (v₀(H + p + 1) \ v₀(H + p)) ∩ 𝔞⁽ᵐ⁾ₙ ∩ Uₙᶜ(θ), p = 0, 1, ….

If m = q then the sequence U(p) contains only a finite number of non-empty sets,
not exceeding [max_{1≤i≤q} z_{in} − H] − 1 in number. Extending the inequality (2.6)
we find

    P_θⁿ{A⁽ᵐ⁾ₙ(θ); |dₙ(θ)(θ̂ₙ − θ)|₀ ≥ H}
    ≤ Σ_{p≥0} P_θⁿ{ sup_{u∈U(p)} v(θ + dₙ^{−1}u, θ) ≥ 1/2 }. (2.7)

Let

    U_h = {uᵢ, i = 1, …, b}, b ~ h^{−q},

be an h-net of the set U(p) in the norm |·|₀.
Then for any δ ∈ (0, 1/2)

    P_θⁿ{ sup_{u∈U(p)} v(θ + dₙ^{−1}u, θ) ≥ 1/2 }

    ≤ Σ_{i=1}^{b} P_θⁿ{ v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ }

    + P_θⁿ{ sup_{|u′−u″|₀ ≤ h; u′,u″ ∈ U(p)} |v(θ + dₙ^{−1}u′, θ) − v(θ + dₙ^{−1}u″, θ)| ≥ δ }. (2.8)
u',U"EU(P)

In the condition I₀₀ let R < ∞ be a constant. Let us fix i and apply
Theorem A.1 to the r.v.

    w(θ + dₙ^{−1}uᵢ, θ) Φₙ^{−1/2}(uᵢ, 0).

The conditions of this theorem are satisfied for R from condition I₀₀ and for

    rⱼ = τ (g(j, θ + dₙ^{−1}uᵢ) − g(j, θ))² Φₙ^{−1}(uᵢ, 0),

where τ is the constant from condition I₀₀. Consequently, in the given case

    G = Σ rⱼ = τ.

Let us take

    x = (1/2 − δ) Φₙ^{1/2}(uᵢ, 0).

By condition III_m, x > τR for sufficiently large n and H, and consequently

    P_θⁿ{v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ}

    = P_θⁿ{w(θ + dₙ^{−1}uᵢ, θ) Φₙ^{−1/2}(uᵢ, 0) ≥ (1/2 − δ) Φₙ^{1/2}(uᵢ, 0)}

    ≤ exp{−(R/2)(1/2 − δ) Φₙ^{1/2}(uᵢ, 0)}

    ≤ exp{−(R/2)(1/2 − δ) Ψ_{mn}(H + p)}. (2.9)

If R = ∞, then let us apply the same theorem to the r.v.

    w(θ + dₙ^{−1}uᵢ, θ) Φₙ^{−1}(uᵢ, 0).

But this time

    rⱼ = τ (g(j, θ + dₙ^{−1}uᵢ) − g(j, θ))² Φₙ^{−2}(uᵢ, 0),

and by condition III_m

    P_θⁿ{v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ} = P_θⁿ{w(θ + dₙ^{−1}uᵢ, θ) ≥ (1/2 − δ) Φₙ(uᵢ, 0)}

    ≤ exp{−(1/(2τ))(1/2 − δ)² Φₙ(uᵢ, 0)}

    ≤ exp{−(1/(2τ))(1/2 − δ)² Ψ²_{mn}(H + p)}. (2.10)
Let us further remark that for s ≥ 1

    E_θ|v(θ + dₙ^{−1}u′, θ) − v(θ + dₙ^{−1}u″, θ)|^s

    ≤ 2^{s−1}{E_θ|w(θ + dₙ^{−1}u′, θ + dₙ^{−1}u″)|^s Φₙ^{−s}(u′, 0)

    + E_θ|w(θ + dₙ^{−1}u″, θ)|^s |Φₙ^{−1}(u′, 0) − Φₙ^{−1}(u″, 0)|^s}. (2.11)

Using the inequality of Theorem A.2 we find

    E_θ|w(θ + dₙ^{−1}u′, θ + dₙ^{−1}u″)|^s ≤ c₉ Φₙ^{s/2}(u′, u″), (2.12)

where one can take c₉ = κ(s)(μ_s + μ_s^{1/2}) for s > 2 and c₉ = 2μ_s for 1 ≤ s ≤ 2. On
the other hand,

    |Φₙ^{−1}(u′, 0) − Φₙ^{−1}(u″, 0)|^s (2.13)

    ≤ 2^{s−1} Φₙ^{s/2}(u′, u″)(Φₙ^{−s/2}(u′, 0) Φₙ^{−s}(u″, 0) + Φₙ^{−s/2}(u″, 0) Φₙ^{−s}(u′, 0)).

From the conditions II₀ and III_m and the inequalities (2.11)-(2.13), for any θ ∈ T
and any u′, u″ ∈ U(p), we obtain the estimate (2.14) for the moments of the
increments of the field v.

Let s be large enough for αs > q to hold. Applying Theorem A.3 to the random
field

    η(u) = v(θ + dₙ^{−1}u, θ), u ∈ F = U(p), Q = H + p + 1,

we find that

    P_θⁿ{ sup_{|u′−u″|₀ ≤ h; u′,u″ ∈ U(p)} |v(θ + dₙ^{−1}u′, θ) − v(θ + dₙ^{−1}u″, θ)| ≥ δ }

    ≤ δ^{−s} κ₀ l_{sn}(H + p + 1)(H + p + 1)^q h^{αs − q}, (2.15)

where κ₀ depends on s, q, α and does not depend upon h and the set U(p).
In this way, for any θ ∈ T and R < ∞, from (2.8), (2.9) and (2.15) it follows
that

    P_θⁿ{ sup_{u∈U(p)} v(θ + dₙ^{−1}u, θ) ≥ 1/2 }

    ≤ c₈(q)(H + p)^{q−1} h^{−q} exp{−(R/2)(1/2 − δ) Ψ_{mn}(H + p)}

    + δ^{−s} κ₀ l_{sn}(H + p + 1)(H + p + 1)^q h^{αs − q}. (2.16)

Analogously, from (2.8), (2.10) and (2.15), for R = ∞ we obtain

    P_θⁿ{ sup_{u∈U(p)} v(θ + dₙ^{−1}u, θ) ≥ 1/2 }

    ≤ c₈(q)(H + p)^{q−1} h^{−q} exp{−(1/(2τ))(1/2 − δ)² Ψ²_{mn}(H + p)}

    + δ^{−s} κ₀ l_{sn}(H + p + 1)(H + p + 1)^q h^{αs − q}. (2.17)

In (2.16) let us set

    h = exp{−(R/2)(1/2 − δ)(1/(αs)) Ψ_{mn}(H + p)},

and in (2.17)

    h = exp{−(1/(2τ))(1/2 − δ)²(1/(αs)) Ψ²_{mn}(H + p)}.

Then the inequalities (2.16) and (2.17) can be rewritten in the following form:

    P_θⁿ{ sup_{u∈U(p)} v(θ + dₙ^{−1}u, θ) ≥ 1/2 }

    ≤ fₙ(H + p + 1) exp{−(R/2)(1/2 − δ)(1 − q/(αs)) Ψ_{mn}(H + p)}, R < ∞;

    ≤ fₙ(H + p + 1) exp{−(1/(2τ))(1/2 − δ)²(1 − q/(αs)) Ψ²_{mn}(H + p)}, R = ∞, (2.18)

where the function fₙ(Q) absorbs all the power-like factors. Since the function
fₙ(Q) has no more than a power growth as Q → ∞, condition III_m and the
bound (2.18) give an opportunity to obtain an inequality extending (2.7):

    P_θⁿ{A⁽ᵐ⁾ₙ(θ); |dₙ(θ)(θ̂ₙ − θ)|₀ ≥ H}

    ≤ c₁₀ Σ_{p≥0} exp{−c₁₁ (1/2 − δ)(R/2)(1 − q/(αs)) Ψ_{mn}(H + p)}, R < ∞;

    ≤ c₁₂ Σ_{p≥0} exp{−c₁₃ (1/2 − δ)²(1/(2τ))(1 − q/(αs)) Ψ²_{mn}(H + p)}, R = ∞, (2.19)

where c₁₁, c₁₃ ∈ (0, 1).


If R < ∞ and m ≠ q, then

    Σ_{p≥0} exp{−c₁₁ (1/2 − δ)(R/2)(1 − q/(αs)) Ψ_{mn}(H + p)}

    ≤ c₁₄ exp{−c₁₅ (1/2 − δ)(R/2)(1 − q/(αs)) Ψ_{mn}(H)}

    × ∫_{−1}^{∞} exp{−c₁₆ (1/2 − δ)(R/2)(1 − q/(αs)) Ψ_{mn}(H + p)} dp

    ≤ c₁₇ exp{−c₁₅ (1/2 − δ)(R/2)(1 − q/(αs)) Ψ_{mn}(H)}, (2.20)

    c₁₅ + c₁₆ = c₁₁.

If R < ∞ and m = q, then in the inequality (2.20) we should
integrate from −1 to max_{1≤i≤q} z_{in} − H. The case R = ∞ is considered similarly.


Let us assume that Ψₙ(ρ¹, …, ρ^q) ∈ D(0), that Ψₙ(ρ¹, …, ρ^q) = Ψₙ(|ρ|), and that
the function Ψₙ(x), x ≥ 0, is monotonically non-decreasing in x and n. Since
Ψ₀ₙ(x) ≤ Ψₙ(x), x ≥ 0, in (2.2) it is possible to take Ψ₀ₙ(x) = Ψₙ(x).
THEOREM 2: Let the conditions I₀₀, II₀ and III₀ be satisfied. Then there exist
positive constants c₁₈-c₂₁ such that

    sup_{θ∈T} P_θⁿ{|dₙ(θ)(θ̂ₙ − θ)| ≥ H} ≤ c₁₈ exp{−c₁₉ Ψₙ(H)}, R < ∞; (2.21)

    sup_{θ∈T} P_θⁿ{|dₙ(θ)(θ̂ₙ − θ)| ≥ H} ≤ c₂₀ exp{−c₂₁ Ψₙ²(H)}, R = ∞. (2.22)

Proof: Theorem 2 is an 'isotropic', or, as one may say, 'radial', variant of Theorem 1
and is proved analogously: in the argument one need only substitute
the Euclidean norm |·| for the uniform norm |·|₀.

Let us mention some corollaries of Theorem 2.

COROLLARY 2.1: In condition II₀ let

    α = 1, dₙ ≡ n^{1/2} I_q.

Then for any r > 0

    sup_{θ∈T} P_θⁿ{|θ̂ₙ − θ| ≥ r} ≤ c₁₈ exp{−c₁₉ Ψₙ(rn^{1/2})}, R < ∞;

    sup_{θ∈T} P_θⁿ{|θ̂ₙ − θ| ≥ r} ≤ c₂₀ exp{−c₂₁ Ψₙ²(rn^{1/2})}, R = ∞.

Proof: In fact, in (2.21) and (2.22) one only has to set H = rn^{1/2}.

COROLLARY 2.2: Let Ψₙ(x) = x^β, β > 0. In (2.21), (2.22) let us set

    H = H₁ = (κ c₁₉^{−1} log n)^{1/β} if R < ∞,

and

    H = H₂ = (κ c₂₁^{−1} log n)^{1/(2β)} if R = ∞.

Then for any κ > 0 and i = 1, 2

    sup_{θ∈T} P_θⁿ{|dₙ(θ)(θ̂ₙ − θ)| ≥ Hᵢ} = O(n^{−κ}).

COROLLARY 2.3: Under the conditions of Theorem 2 let Ψₙ(x), x ≥ 0, be a
continuous function. Then for certain constants c₂₈, c₂₉ > 0

    sup_{θ∈T} E_θ exp{c₂₈ Ψₙ(|dₙ(θ)(θ̂ₙ − θ)|)} < ∞, R < ∞;

    sup_{θ∈T} E_θ exp{c₂₉ Ψₙ²(|dₙ(θ)(θ̂ₙ − θ)|)} < ∞, R = ∞.

Proof: Let c₂₈ < c₁₉. Integrating by parts, for θ ∈ T we obtain

    E_θ exp{c₂₈ Ψₙ(|dₙ(θ)(θ̂ₙ − θ)|)}

    = ∫₀^∞ e^{c₂₈Ψₙ(x)} d(−P_θⁿ{|dₙ(θ)(θ̂ₙ − θ)| ≥ x})

    = 1 + c₂₈ ∫₀^∞ P_θⁿ{|dₙ(θ)(θ̂ₙ − θ)| ≥ x} e^{c₂₈Ψₙ(x)} dΨₙ(x)

    ≤ 1 + c₂₈ c₁₈ ∫₀^∞ e^{−(c₁₉ − c₂₈)Ψₙ(x)} dΨₙ(x)

    ≤ 1 + c₁₈ c₂₈ / (c₁₉ − c₂₈).

The case R = ∞ is analysed similarly.
For R < ∞ the inequalities (2.4) and (2.21) can be sharpened at the expense
of additional assumptions about the behaviour of the regression function g(j, θ).
Let us set

    Φ₀ₙ(u₁, u₂) = max_{1≤j≤n} |g(j, θ + dₙ^{−1}u₁) − g(j, θ + dₙ^{−1}u₂)|.

Let us return to the proof of Theorem 1. In obtaining the inequality (2.8) let us
apply, for fixed i, Theorem A.1 to the r.v.

    w(θ + dₙ^{−1}uᵢ, θ) Φ₀ₙ^{−1}(uᵢ, 0).

The conditions of Theorem A.1 are fulfilled for R from condition I₀₀ and for

    rⱼ = τ (g(j, θ + dₙ^{−1}uᵢ) − g(j, θ))² Φ₀ₙ^{−2}(uᵢ, 0),

where τ is the constant from condition I₀₀. Consequently,

    G = Σ rⱼ = τ Φₙ(uᵢ, 0) Φ₀ₙ^{−2}(uᵢ, 0),

and therefore

    P_θⁿ{v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ}

    = P_θⁿ{w(θ + dₙ^{−1}uᵢ, θ) Φ₀ₙ^{−1}(uᵢ, 0) ≥ (1/2 − δ) Φₙ(uᵢ, 0) Φ₀ₙ^{−1}(uᵢ, 0)}

    ≤ exp{−(1/2 − δ)(R/2) Φₙ(uᵢ, 0) Φ₀ₙ^{−1}(uᵢ, 0)}, if Φ₀ₙ(uᵢ, 0) ≥ τR(1/2 − δ)^{−1};

    ≤ exp{−(1/(2τ))(1/2 − δ)² Φₙ(uᵢ, 0)}, if Φ₀ₙ(uᵢ, 0) ≤ τR(1/2 − δ)^{−1}. (2.23)

Let us assume that for a certain sequence z₀ₙ ↑ ∞, n → ∞,

    z₀ₙ sup_{θ∈T} sup_{u∈Uₙᶜ(θ)} Φ₀ₙ(u, 0) Φₙ^{−1/2}(u, 0) ≤ 1. (2.24)

Let us note that z₀ₙ ≤ n^{1/2}.

THEOREM 3: Let the conditions I₀₀ (R < ∞), II₀, III_m and (2.24) be satisfied. Then
there exist constants c₃₀ > 0 and c₃₁ > 0 such that

    sup_{θ∈T} P_θⁿ{A⁽ᵐ⁾ₙ(θ); |dₙ(θ)(θ̂ₙ − θ)|₀ ≥ H} ≤ c₃₀ exp{−c₃₁ Ψ_{mn}(H) min(Ψ_{mn}(H), z₀ₙ)}.

Proof: If

    Φ₀ₙ(uᵢ, 0) < τR(1/2 − δ)^{−1},

then from (2.23) and condition III_m there follows the inequality

    P_θⁿ{v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ} ≤ exp{−(1/(2τ))(1/2 − δ)² Ψ²_{mn}(H + p)}.

If, on the contrary,

    Φ₀ₙ(uᵢ, 0) ≥ τR(1/2 − δ)^{−1},

then from (2.23), condition III_m, and (2.24) there follows the inequality

    P_θⁿ{v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ} ≤ exp{−(1/2 − δ)(R/2) Ψ_{mn}(H + p) z₀ₙ}.

Consequently, in both cases

    P_θⁿ{v(θ + dₙ^{−1}uᵢ, θ) ≥ 1/2 − δ} ≤ exp{−c(δ) Ψ_{mn}(H + p) min(Ψ_{mn}(H + p), z₀ₙ)}

with some constant c(δ) > 0.

Further arguments repeat the proof of Theorem 1.

An assertion analogous to Theorem 3 can be formulated for the isotropic case
analysed in Theorem 2. Here we also remark that if instead of (2.24) we require
the fulfilment of the condition

    lim sup_{n→∞} sup_{θ∈T} sup_{u∈Uₙᶜ(θ)} Φ₀ₙ(u, 0) < ∞, (2.25)

then the inequality

    Φ₀ₙ(uᵢ, 0) ≤ τR(1/2 − δ)^{−1}

can always be satisfied by choosing an appropriate number δ ∈ (0, 1/2). Therefore
upon the fulfilment of (2.25) it is possible to obtain the analogue of Theorems 1
and 2 with the quantity Ψ²_{mn}(H) or Ψₙ²(H) in the exponent. Clearly (2.25)
holds for a bounded regression function g(j, θ).
Let us consider the question of to what extent one can broaden the set A⁽ᵐ⁾ₙ(θ)
with retention of the exponential bound in the formulation of Theorem 1 for m ≠ 0.
Let the functions

    Ψₙ(ρ¹, …, ρ^q) ∈ D(m), 1 ≤ m ≤ q,
    Ψₙ(i, ρ) = Ψₙ(0, …, ρ, …, 0),

with ρ in the i-th place, i = 1, …, q.
DEFINITION: We shall say that the function Ψₙ(ρ¹, …, ρ^q) belongs to the class
D̄(m) if Ψₙ ∈ D(m), m = 1, …, q, Ψₙ is monotonically non-decreasing in each of
its arguments ρⁱ, i = 1, …, q, and the condition (2.26) is satisfied.

Let us introduce the events

    Ā⁽ᵐ⁾ₙ(i, θ) = {|d₁ₙ(θ)(θ̂ₙ¹ − θ¹)| ≤ z₁ₙ; …;
    z_{in} ≤ |d_{in}(θ)(θ̂ₙⁱ − θⁱ)| ≤ a_{ξi} e^{b_{ξi} Ψₙ^ξ(i, z_{in})}; …;
    |d_{mn}(θ)(θ̂ₙᵐ − θᵐ)| ≤ z_{mn}},

where m = 1, …, q, and a_{ξi} = a_{ξi}(T) < ∞ and b_{ξi} = b_{ξi}(T) > 0 are some
constants, and ξ = 1, 2.
LEMMA 4.1: Let the conditions I₀₀, II₀, III_m be fulfilled for functions Ψₙ ∈ D̄(m).
Then there exist constants a_{ξi} < ∞, b_{ξi} > 0, f_{ξi} < ∞, and g_{ξi} > 0 such that

    sup_{θ∈T} P_θⁿ{Ā⁽ᵐ⁾ₙ(i, θ)} ≤ f_{ξi} exp{−g_{ξi} Ψₙ^ξ(i, z_{in})}, i = 1, …, m,

where ξ = 1 if R < ∞ in condition I₀₀ and ξ = 2 if R = ∞.


Proof: Let us consider the case R < ∞, m < q, i = 1. The remaining possibilities are argued analogously. Let us introduce the sets

  U(ρ^1, …, ρ^q) = { u ∈ ℝ^q : z_{1n} + ρ^1 ≤ |u^1| ≤ z_{1n} + ρ^1 + 1; ρ^2 ≤ |u^2| ≤ ρ^2 + 1; …; ρ^q ≤ |u^q| ≤ ρ^q + 1 },
  ρ^1 = 0, …, [a_{11} e^{b_{11} Ψ_n(1, z_{1n})} − z_{1n}] = ρ_*^1;
  ρ^i = 0, …, [z_{in}], i = 2, …, m; ρ^i = 0, 1, …, i = m + 1, …, q,

and the set

  { u ∈ ℝ^q : z_{1n} ≤ |u^1| ≤ a_{11} e^{b_{11} Ψ_n(1, z_{1n})}, |u^2| ≤ z_{2n}, …, |u^m| ≤ z_{mn} }.


Analogously to (2.6) and (2.7) we obtain

  P_θ^n{ sup_u ν(θ + d_n^{-1}u, θ) ≥ 1/2 }
   ≤ Σ_{ρ^1} ⋯ Σ_{ρ^q} P_θ^n{ sup_{u∈U(ρ^1,…,ρ^q)∩U_n^c(θ)} ν(θ + d_n^{-1}u, θ) ≥ 1/2 },   (2.27)

the first supremum being taken over the intersection of the latter set with U_n^c(θ).
For fixed values of ρ^i, i = 1, …, q − 1, let us estimate the interior series in (2.27). Let U_h = {u_i, i = 1, …, b} be an h-net (in the norm |·|_0) of the set U(ρ^1, …, ρ^q), with b ∼ h^{-q}. Then for a general term of the series investigated it is possible to obtain bounds analogous to (2.8), (2.9) and (2.15), with one difference: we do not pass to the function Ψ_{mn}(x), but instead use the monotonicity of the function Ψ_n(ρ^1, …, ρ^q) in each of its arguments. As a result, instead of (2.16) we obtain the inequality

  P_q = P_θ^n{ sup_{u∈U(ρ^1,…,ρ^q)∩U_n^c(θ)} ν(θ + d_n^{-1}u, θ) ≥ 1/2 }
     ≤ c_{32} h^{-q} exp{ −(1/2)(1/2 − δ) R Ψ_n(z_{1n}, ρ^2, …, ρ^q) }   (2.28)
      + δ^{-s}( c_{33} + c_{34}( Q_q^{2c_2 s+q} Ψ_n^{-2s}(1, z_{1n}) + Q_q^{2c_2 s+αs+q} Ψ_n^{-3s}(1, z_{1n}) ) ) h^{αs−q},

where

  Q_q = max( z_{1n} + ρ^1 + 1, ρ^2 + 1, …, ρ^q + 1 ) ≤ Q_{q−1} + ρ^q + 1,
  Q_{q−1} = max( z_{1n} + ρ^1 + 1, ρ^2 + 1, …, ρ^{q−1} + 1 ).

In (2.28) let us set

  h = exp{ −(1/(αs))(1/2 − δ)(R/2) Ψ_n(z_{1n}, ρ^2, …, ρ^q) }.

Then, coarsening the bound (2.28), we find

  P_q ≤ ( c_{35} + c_{36} Q_{q−1}^{2c_2 s+αs+q} + c_{37}(ρ^q + 1)^{2c_2 s+αs+q} )
       × exp{ −c_{38} Ψ_n(z_{1n}, ρ^2, …, ρ^{q−1}, 0) − c_{39} Ψ_n(q, ρ^q) },   (2.29)

  c_{38}, c_{39} > 0,  c_{38} + c_{39} = (1/2 − δ)(R/2)( 1 − q/(αs) ).


Let us further note that for any binary collection χ = (χ^1, …, χ^q) ≠ 0

  inf_{Σ χ^i ρ^i ≥ x} Ψ_n(χ^1 ρ^1, …, χ^q ρ^q) ≥ Ψ_n(q, x)   (2.30)

for those values of x for which the right hand side of this inequality has a meaning. From (2.30), in particular, it follows that

  Ψ_{mn}(x) ≥ Ψ_n(q, x).

Therefore from (2.29) there follows the inequality

  Σ_{ρ^q=0}^{∞} P_q ≤ c_{40} ( 1 + Q_{q−1}^{2c_2 s+αs+q} ) exp{ −c_{38} Ψ_n(z_{1n}, ρ^2, …, ρ^{q−1}, 0) }   (2.31)

with the constant c_{40} not depending on ρ^1, …, ρ^{q−1}.


Successively estimating the series Σ_{ρ^{q−1}=0}^{∞}, …, Σ_{ρ^{m+1}=0}^{∞} and the summations Σ_{ρ^m=0}^{[z_{mn}]}, …, Σ_{ρ^2=0}^{[z_{2n}]} by means of (2.31) and analogously adduced arguments, we arrive at the inequalities

  Σ_{ρ^1=0}^{ρ_*^1} ⋯ ≤ c_{41} e^{−c_{42} Ψ_n(1, z_{1n})} Σ_{ρ^1=0}^{ρ_*^1} ( z_{1n} + ρ^1 )^{2c_2 s+αs+q}
   ≤ f_{11} exp{ −g_{11} Ψ_n(1, z_{1n}) },   (2.32)

where

  g_{11} = c_{42} − b_{11}( 2c_2 s + αs + q + 1 ).

The index of the exponent in (2.32) is negative if

  b_{11} < (R/2)(1/2 − δ) (αs − q) / ( αs( 2c_2 s + αs + q + 1 ) ).   (2.33)

In the inequality (2.33) let us determine the maximum point s* of the function

  f(s) = (αs − q) / ( αs( 2c_2 s + αs + q + 1 ) ).

We thereby obtain an upper bound on the values of the constant b_{11} for which the constant g_{11} still remains positive. Some simple calculations show that

  s* = ( q(2c_2 + α) + [ q(2c_2 + α)( q(2c_2 + α) + α(q + 1) ) ]^{1/2} ) / ( α(2c_2 + α) ),
  f(s*) = α ( (q(2c_2 + α))^{1/2} + ( q(2c_2 + α) + α(q + 1) )^{1/2} )^{-2}.

In particular, if c_2 = 0, α = 1, then

  s* = q + (2q^2 + q)^{1/2},  f(s*) = ( √q + (2q + 1)^{1/2} )^{-2},  b_{11} < (R/4) f(s*).

Analogous arguments show that for c_2 = 0 and α = 1 with R = ∞,

  b_{21} < (1/8) f(s*).
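The stated extremum is elementary to verify numerically. The sketch below (plain Python; a hypothetical check, not part of the original text) evaluates f(s) = (αs − q)/(αs(2c_2 s + αs + q + 1)) and confirms that, for c_2 = 0 and α = 1, the point s* = q + (2q^2 + q)^{1/2} attains the value (√q + (2q+1)^{1/2})^{-2} and dominates nearby points:

```python
import math

def f(s, q, alpha, c2):
    """f(s) = (alpha*s - q) / (alpha*s*(2*c2*s + alpha*s + q + 1)) from (2.33)."""
    return (alpha * s - q) / (alpha * s * (2 * c2 * s + alpha * s + q + 1))

def s_star(q, alpha, c2):
    """Claimed maximum point of f(s)."""
    K = 2 * c2 + alpha
    return (q * K + math.sqrt(q * K * (q * K + alpha * (q + 1)))) / (alpha * K)

q, alpha, c2 = 3, 1.0, 0.0
s0 = s_star(q, alpha, c2)
# Special case c2 = 0, alpha = 1: s* = q + sqrt(2 q^2 + q) ...
assert abs(s0 - (q + math.sqrt(2 * q ** 2 + q))) < 1e-12
# ... f(s*) = (sqrt(q) + sqrt(2 q + 1))^(-2) ...
assert abs(f(s0, q, alpha, c2) - (math.sqrt(q) + math.sqrt(2 * q + 1)) ** (-2)) < 1e-12
# ... and s* dominates neighbouring grid points (local check of maximality).
assert all(f(s0, q, alpha, c2) >= f(t * s0, q, alpha, c2) for t in (0.5, 0.9, 1.1, 2.0))
```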
The same bounds hold for b_{1i}, b_{2i}, i = 2, …, m. Let us denote
THEOREM 4: Under the conditions of Lemma 4.1

  ≤ c_4 e^{−c_5 Ψ_{mn}(H)} + Σ_{i=1}^{m} f_{1i} e^{−g_{1i} Ψ_n(i, H)}, R < ∞;
  ≤ c_6 e^{−c_7 Ψ_{mn}^2(H)} + Σ_{i=1}^{m} f_{2i} e^{−g_{2i} Ψ_n^2(i, H)}, R = ∞,

where

Proof: The assertion of the Theorem follows from Theorem 1, Lemma 4.1 and the inequality (2.26).

The following result is closely related to the previous one. Let condition I_∞ be satisfied with R < ∞. Then there exists a constant a > 0 such that

  E e^{t(|ε_1| − μ_1)} < ∞ for |t| ≤ a.

Consequently constants τ_0 > 0 and R_0 > 0 can be found ([172], pp 54-55) such that for |t| ≤ R_0

  E e^{t(|ε_1| − μ_1)} ≤ e^{(τ_0/2) t^2}.

If conditions III_m and (2.24) are satisfied, then for any θ ∈ T and the sets a_{in} = { u ∈ ℝ^q : |u^i| ≥ z_{in} }, i = 1, …, m,

  ≤ Σ_{i=1}^{m} P_θ^n{ |d_{in}(θ)(θ̂_n^i − θ^i)| ≥ z_{in} }.

Let us assume that

(2.34)

Applying Theorem A.1 to the sum of the r.v.-s ε̄_j = |ε_j| − μ_1, we find for i = 1, …, m,

  P_θ^n{ Σ |ε_j| ≥ (1/2) z_{0n} Ψ_n(i, z_{in}) }
   ≤ exp{ −(n R_0/2)( (z_{0n}/2n) Ψ_n(i, z_{in}) − μ_1 ) } if n^{-1} z_{0n} Ψ_n(i, z_{in}) ≥ 2(τ_0 R_0 + μ_1);
   ≤ exp{ −(n/2τ_0)( (z_{0n}/2n) Ψ_n(i, z_{in}) − μ_1 )^2 } if n^{-1} z_{0n} Ψ_n(i, z_{in}) ≤ 2(τ_0 R_0 + μ_1).
THEOREM 5: Let the conditions I_∞ (R < ∞), II_0, III_m, (2.24) and (2.34) be satisfied. Then

  ≤ c_{30} exp{ −c_{31}( z_{0n} Ψ_{mn}(H) ∧ Ψ_{mn}^2(H) ) } + Σ_{i=1}^{m} exp{ −(n/2)( R_0 y_n ∧ τ_0^{-1} y_n^2 ) },

where

  y_n = (z_{0n}/2n) Ψ_n(i, z_{in}) − μ_1.

Proof: The result follows from Theorem 3 and the inequality (2.35).

3 LARGE DEVIATIONS OF THE LEAST SQUARES ESTIMATOR IN
THE CASE OF ERRORS WITH A MOMENT OF FINITE ORDER

We shall assume that the r.v. ε_j satisfies the condition:

I_s. μ_s < ∞ for some real s ≥ 2.
Our goal, as in Section 2, lies in obtaining a statement about the probabilities of large deviations of the normed estimator θ̂_n. Let T ⊂ Θ be some compact set. Let us restrict ourselves to the case of isotropic (radial) discrimination of the parameters of the regression function and let us assume the following (keeping the notation of Section 2):

  (3.1)

where the function Ψ_n(x), x ≥ 0, is monotonically non-decreasing in each of its arguments n and x, and

  Ψ_n(x) → ∞, n, x → ∞.
The constants c_2 and α are taken from condition II_1, which reproduces condition II_0 of Section 2 in the following form:

II_1. For some α ∈ (0, 1] there exist constants c_1 = c_1(T) < ∞ and c_2 = c_2(T) ≥ 0 such that

  sup_{θ∈T} sup_{u_1,u_2∈v^c(Q)∩U_n^c(θ)} Φ_n^{1/2}(u_1, u_2) |u_1 − u_2|^{-α} ≤ c_1( 1 + Q^{c_2} ), Q > 0.

THEOREM 6: If the conditions I_s, II_1, III_{q+1} and αs > q are satisfied, then for some constants c_3, c_4 < ∞

Proof: The proof is analogous to the proof of the Theorem of Section 2. For p = 0, 1, … let us write

  U(p) = ( v^c(H(p+2)) \ v(H(p+1)) ) ∩ U_n^c(θ).

Then for any θ ∈ T

  P_θ{ |d_n(θ)(θ̂_n − θ)| ≥ H } ≤ Σ_{p=0}^{∞} P_θ^n{ sup_{u∈U(p)} ν(θ + d_n^{-1}u, θ) ≥ 1/2 };

  P_θ^n{ sup_{u∈U(p)} ν(θ + d_n^{-1}u, θ) ≥ 1/2 }
   ≤ P_θ^n{ sup_{u∈v^c(H(p+2))∩U_n^c(θ)} |w(θ + d_n^{-1}u, θ)| ≥ (1/2) Ψ_n^2(H(p + 1)) }.   (3.3)

By the inequality of Theorem A.2 and condition II_1, for u_1, u_2 ∈ v^c(H(p+2)) ∩ U_n^c(θ) we have

  E_θ^n | w(θ + d_n^{-1}u_1, θ + d_n^{-1}u_2) |^s ≤ κ(s)( μ_s + μ_2^{s/2} ) c_1^s ( 1 + (H(p+2))^{c_2} )^s | u_1 − u_2 |^{αs}.

Therefore Theorem A.3, applied to the random field w(θ + d_n^{-1}u, θ), u ∈ v^c(H(p+2)) ∩ U_n^c(θ), allows one to arrive at an upper bound for the probability (3.3) of the form

  c_5 ( H(p+1) )^{(c_2+α)s} Ψ_n^{-2s}( H(p+1) ).
Consequently

  P_θ{ |d_n(θ)(θ̂_n − θ)| ≥ H } ≤ c_5 Σ_{p=0}^{∞} ( H(p+1) )^{(c_2+α)s} Ψ_n^{-2s}( H(p+1) ).

It will be understood that the inequality (3.2) is non-trivial if the integral


converges.
COROLLARY 6.1: Let Ψ_n(x) ≥ c_6 x^β, 0 < β ≤ α, 2β − α − c_2 > 0. Then

  sup_{θ∈T} P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ H } ≤ c_7 H^{-(2β−α−c_2)s}.

In particular, if α = β = 1, c_2 = 0, d_n(θ) ≡ n^{1/2} I_q, then

  sup_{θ∈T} P_θ^n{ n^{1/2} |θ̂_n − θ| ≥ H } ≤ c_7 H^{-s}.   (3.4)

In (3.4) let us set H = n^{1/2} r, where r > 0 is an arbitrary number. Then under the condition s > q the inequality

  sup_{θ∈T} P_θ^n{ |θ̂_n − θ| ≥ r } ≤ c_7 r^{-s} n^{-s/2}

holds, indicating that the l.s.e. θ̂_n is a uniformly consistent estimator.
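The uniform-consistency bound (3.4) can be illustrated by simulation. The sketch below (a hypothetical one-parameter model g(j, θ) = e^{−θ t_j} with Gaussian errors and a grid-search l.s.e. — all assumptions for illustration, not from the text) estimates the frequency of the event n^{1/2}|θ̂_n − θ| ≥ H by the Monte Carlo method:

```python
import math, random

def lse_grid(xs, ts, grid):
    """Least squares estimate by brute-force search over a parameter grid:
    minimizes sum_j (X_j - exp(-theta * t_j))^2 (hypothetical model)."""
    def sse(theta):
        return sum((x - math.exp(-theta * t)) ** 2 for x, t in zip(xs, ts))
    return min(grid, key=sse)

def tail_freq(n, H, theta0=1.0, sigma=0.1, reps=100, seed=1):
    """Monte Carlo frequency of the event n^{1/2} |theta_hat - theta0| >= H."""
    rng = random.Random(seed)
    grid = [k / 100 for k in range(201)]          # theta in [0, 2], step 0.01
    hits = 0
    for _ in range(reps):
        ts = [j / n for j in range(1, n + 1)]     # design points in (0, 1]
        xs = [math.exp(-theta0 * t) + rng.gauss(0, sigma) for t in ts]
        if math.sqrt(n) * abs(lse_grid(xs, ts, grid) - theta0) >= H:
            hits += 1
    return hits / reps

f_small_H = tail_freq(50, 0.2)
f_large_H = tail_freq(50, 2.0)
assert f_large_H <= 0.05
assert f_small_H >= f_large_H
```

For large H the frequency is essentially zero, while for small H it is of order one, in line with the power decay H^{-s} of (3.4).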


The following assertion is analogous to the Theorems of Section 2, relating to the case of weaker discrimination of parameters than condition III_{q+1}, where, as previously, only the isotropic case is considered. Let us introduce the condition

III_{q+2}. The relation (3.1) is fulfilled, with:

(1) the function Ψ_n(x), x ≥ 0, monotonically non-decreasing in n and x, and Ψ_n(0) = 0;

(2) Ψ_n(x) ≥ κ_0 x^β for x ≤ z_n, where z_n ↑ ∞, n → ∞, is some sequence, 0 < β ≤ α, 2β − α > s^{-1} (α and s are the numbers from conditions II_1 and I_s);

(3) lim_{n→∞} sup_{x≥z_n} z_n^{-β} Ψ_n(x) ≤ c_9 < ∞;

(4) for the sequence z_{0n} from condition (2.24) and some Δ > 0

  lim_{n→∞} n^{-1} z_{0n} Ψ_n(z_n) ≥ μ_1 + Δ.   (3.6)

THEOREM 7: Let condition I_s be satisfied with αs > q, condition II_1 with c_2 = 0, condition III_{q+2}, and (2.24). Then

where γ ∈ (0, 1) is some constant such that 2βγ − α > s^{-1}.

Proof: For any θ ∈ T and H > 0,

  P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ H }
   ≤ P_1 + P_θ^n{ a_0 Ψ_n^{b_0}(z_n) ≥ |d_n(θ)(θ̂_n − θ)| ≥ z_n } + P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ a_0 Ψ_n^{b_0}(z_n) }
   = P_1 + P_2 + P_3,

where a_0 is some constant and b_0 = 2γs/(αs + 1). Let H ≤ z_n. Then, using the condition 2β − α > s^{-1} analogously to the proof of Theorem 6, we obtain

  P_1 ≤ c_{12} Σ_{p=0}^{[z_n/H]−1} ( H(p+1) )^{αs} Ψ_n^{-2s}( H(p+1) )
     ≤ c_{12} κ_0^{-2s} H^{-(2β−α)s} Σ_{p=1}^{[z_n/H]} p^{-(2β−α)s}
     ≤ c_{13} κ_0^{-2s} H^{-(2β−α)s}.   (3.7)

Let us further note that

  P_2 ≤ c_{14} Σ_{p=0}^{[a_0 Ψ_n^{b_0}(z_n)/z_n]−1} ( z_n(p+1) )^{αs} Ψ_n^{-2s}( z_n(p+1) )
     ≤ c_{14} Ψ_n^{-2s}(z_n) z_n^{αs} Σ_{p=1}^{[a_0 Ψ_n^{b_0}(z_n)/z_n]} p^{αs}
     ≤ c_{15} a_0^{αs+1} κ_0^{-2s(1−γ)} H^{-2βs(1−γ)}.   (3.8)

From the conditions of the Theorem it follows that

Therefore, by Theorem A.4 applied to the r.v.-s ε̄_j = |ε_j| − μ_1 and thanks to the inequality (3.6), we obtain the bound

  P_3 ≤ P_θ^n{ Σ ( |ε_j| − μ_1 ) ≥ z_{0n} Ψ_n(z_n) − n μ_1 } = o(n^{-s+1}).   (3.9)

The result of the Theorem then follows from (3.7)-(3.9) if H ≤ z_n.


Let z_n ≤ H ≤ a_0 Ψ_n^{b_0}(z_n). Then P_1 = 0, and instead of P_2 one should evaluate the probability

  ≤ c_{16} Ψ_n^{-2s}(H) H^{αs} ∫_0^{a_0 Ψ_n^{b_0}(z_n) H^{-1} + 1} p^{αs} dp
  ≤ c_{17} H^{-1} Ψ_n^{-2s}(H) Ψ_n^{2γs}(z_n)
  ≤ c_{17} H^{-1} Ψ_n^{-2s(1−γ)}(H).

Lastly, let

Then
Pl = P2 = 0,
and by virtue of the Theorem's conditions

REMARK 7.1: In the proof of Theorem 7 the relation (3) of condition III_{q+2} is not used directly. It shows, however, that we would not be justified in arguing as in the proof of Theorem 6. In fact, if (3) is satisfied then for x > z_n and n > n_0

  x^α Ψ_n^{-2}(x) ≥ c_9^{-2} x^α z_n^{-2β}.

If, for example,

  x_n = z_n^{2β/α} > z_n,

then x_n^α Ψ_n^{-2}(x_n) does not tend to zero as n → ∞.

REMARK 7.2: Let us assume that Θ is a bounded set and

where

  d(Θ) = sup_{x,y∈Θ} |x − y|.

Then in the formulation of Theorem 7 the term o(n^{-s+1}) is missing.



Let us further restrict ourselves to the case, important in applications, in which the relations

  (3.10)

are satisfied. The constraint (3.10) covers the bulk of the regression models used in practice, and allows us to apply various limit theorems of probability theory to the study of the asymptotic statistical properties of the l.s.e. θ̂_n. Strictly speaking, a general asymptotic theory of non-linear regression for the case when condition (3.10) is violated has not been constructed up to now.
We prove an assertion analogous to the preceding theorems of this section, for
H = rn 1 / 2, r > 0 an arbitrary number. Let us set

~n (Ul' U2) = i.pn(O + nl/2d;;,I (O)Ul' 0 + nl/2d~1 (0)U2)'


For a fixed 0 E e the function ~ n (Ul , U2) is defined on the set iJ:;, (0) x iJ:;, (0),

Let us assume the following.


II_2. For any ε > 0 and R > 0 there exists δ = δ(ε, R) > 0 such that

  sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩Ū_n(θ), |u_1−u_2|<δ} n^{-1} Φ̄_n(u_1, u_2) ≤ ε.   (3.11)

III_{q+3}. For some R_0 > 0 and any r ∈ (0, R_0] there exist numbers Δ = Δ(R_0) > 0 and ρ = ρ(r, R_0) > 0 such that

  inf_{θ∈T} inf_{u∈(v^c(R_0)\v(r))∩Ū_n(θ)} n^{-1} Φ̄_n(u, 0) ≥ ρ,   (3.12)

  (3.13)

Let us denote

  u_n(θ) = n^{-1/2} d_n(θ)( θ̂_n − θ ).

The following Theorem generalises a result of Malinvaud [152].

THEOREM 8: If the conditions I_s, II_2 and III_{q+3} are satisfied, then for any r > 0

  sup_{θ∈T} P_θ{ |u_n(θ)| ≥ r } = o(n^{-(s−2)/2}).

Proof: Let r ∈ (0, R_0] be a fixed number, and R_0, ρ, Δ numbers whose existence is guaranteed by condition III_{q+3}. Let us write

  ν(u) = ν( θ + n^{1/2} d_n^{-1}(θ) u, θ ),
  π(Δ) = P{ s* ≥ μ_2 + Δ }.

By the inequality (1.6), for any θ ∈ T we obtain

  P_θ^n{ |u_n(θ)| ≥ r } ≤ P_θ^n{ sup_{u∈Ū_n(θ)\v(R_0)} ν(u) ≥ 1/2 } + P_θ^n{ sup_{u∈(v^c(R_0)\v(r))∩Ū_n(θ)} ν(u) ≥ 1/2 } = P_1 + P_2.

Using the Cauchy-Bunyakovskii inequality, condition (3.13) and Theorem A.4, we find

  P_1 ≤ P_θ^n{ s* ≥ (1/4) inf_{u∈Ū_n(θ)\v(R_0)} n^{-1} Φ̄_n(u, 0) } ≤ π(Δ) = o(n^{-(s−2)/2}).

Let F^{(1)}, …, F^{(l)} ⊂ v^c(R_0) \ v(r) be closed sets with

  ∪_{i=1}^{l} F^{(i)} = v^c(R_0) \ v(r),

the diameter of each F^{(i)} being less than the δ corresponding in condition II_2 to the numbers ε and R_0 (the quantity ε will be chosen below); let u_i ∈ F^{(i)} ∩ Ū_n(θ), i = 1, …, l_0, l_0 ≤ l (we consider the F^{(i)} to be renumbered in such a way that F^{(i)} ∩ Ū_n(θ) ≠ ∅ for i ≤ l_0). Then

  P_2 ≤ Σ_{i=1}^{l_0} P_θ^n{ ν(u_i) ≥ 1/4 } + Σ_{i=1}^{l_0} P_θ^n{ sup_{u′,u″∈F^{(i)}∩Ū_n(θ)} |ν(u′) − ν(u″)| ≥ 1/4 } = P_3 + P_4.

Using the Chebyshev inequality, the inequality of Theorem A.2 for the s from condition I_s, and (3.11), we obtain

  P_3 ≤ c_{18} Σ_{i=1}^{l_0} Φ̄_n^{-s/2}(u_i, 0) ≤ c_{19} n^{-s/2}.
We remark that

  |ν(u′) − ν(u″)| ≤ |w(θ + n^{1/2} d_n^{-1} u′, θ)| |Φ̄_n^{-1}(u′, 0) − Φ̄_n^{-1}(u″, 0)|
     + Φ̄_n^{-1}(u″, 0) |w(θ + n^{1/2} d_n^{-1} u′, θ + n^{1/2} d_n^{-1} u″)|;

  |Φ̄_n^{-1}(u′, 0) − Φ̄_n^{-1}(u″, 0)| ≤ 2^{1/2} Φ̄_n^{1/2}(u′, u″) Φ̄_n^{-1/2}(u′, 0) Φ̄_n^{-1}(u″, 0).

Consequently, for u′, u″ ∈ F^{(i)} ∩ Ū_n(θ),

  |ν(u′) − ν(u″)| ≤ (s*)^{1/2} Φ̄_n^{1/2}(u′, u″) ( (1 + 2^{1/2}) Φ̄_n^{-1}(u″, 0) + 2^{1/2} Φ̄_n^{-1/2}(u′, 0) Φ̄_n^{-1/2}(u″, 0) )
   ≤ (1 + 2^{3/2}) ε^{1/2} ρ^{-1} (s*)^{1/2}.
Therefore

  P_4 = o(n^{-(s−2)/2})

if ε in (3.13) is chosen appropriately. Collecting together the bounds for P_1–P_4, we obtain the assertion of Theorem 8.

REMARK 8.1: Let us assume that

  i = 1, …, q,

uniformly in θ ∈ T. Then in the conditions and formulations of this and the subsequent statements it is possible to replace the normalising matrix d_n(θ) by the standard norming n^{1/2} I_q, which leads to the disappearance of the factors n^{-1/2} d_n(θ).

REMARK 8.2: Let

  d_n(θ) = n^{1/2} I_q

and Θ be a bounded set. Then Theorem 8 is valid with the assumption III_{q+3} in the simplified version: for any r > 0 there exists a number ρ = ρ(r) > 0 such that

  inf_{θ∈T} inf_{u∈(Θ^c−θ)\v(r)} n^{-1} φ_n(θ + u, θ) ≥ ρ.   (3.14)

Let us state one result on the consistency of the l.m.e. θ̂_n for the model (0.1) and symmetric r.v.-s ε_j. Let us write

  Φ_{kn}(u_1, u_2) = Σ |f(j, u_1) − f(j, u_2)|^k, k = 1, 2, …,
  Φ_{0n}(u_1, u_2) = max_{1≤j≤n} |f(j, u_1) − f(j, u_2)|,

and so Φ_{2n} = Φ_n. Let us assume that:


I_s′. μ_s < ∞ for some natural number s.

III_{q+4}. For any r > 0 there exists Δ = Δ(r) > 0 such that

  inf_{θ∈T} inf_{u∈Ū_n(θ)\v(r)} n^{-1} E_θ^n R( θ + n^{1/2} d_n^{-1}(θ) u ) ≥ μ_1 + Δ(r),   (3.15)

and moreover an r_0 > 0 can be found such that

  Δ(r_0) = ρ_0 μ_1 + Δ_0,

where ρ_0 > 2 and Δ_0 > 0 are some numbers.

II_3. (1) For any ε > 0 and r > 0 there exists δ = δ(r, ε) such that

  sup_{θ∈T} sup_{u_1,u_2∈v^c(r)∩Ū_n(θ), |u_1−u_2|≤δ} n^{-1} Φ_{1n}(u_1, u_2) ≤ ε.   (3.16)

(2) For any r > 0 and the s from condition I_s′ there exist constants κ^{(s)} = κ^{(s)}(r) < ∞ such that

  sup_{θ∈T} sup_{u∈v^c(r)∩Ū_n(θ)} n^{-1} Φ_{sn}(u, 0) ≤ κ^{(s)}, s ≥ 2,   (3.17)

  sup_{θ∈T} sup_{u∈v^c(r)∩Ū_n(θ)} Φ_{0n}(u, 0) ≤ κ^{(1)}, s = 1.   (3.18)

THEOREM 9: Let ε_j be a symmetric r.v. If the assumptions I_s′, II_3, III_{q+4} are satisfied, then for any r > 0

  sup_{θ∈T} P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ r n^{1/2} } = O(n^{-s+1}), s ≥ 2;  = o(1), s = 1.

Proof: Let us fix θ ∈ T and set

  h_n(θ, u) = R( θ + n^{1/2} d_n^{-1} u ) − E_θ^n R( θ + n^{1/2} d_n^{-1} u ).

Clearly

  h_n(θ, 0) = Σ |ε_j| − n μ_1.

By the definition of the l.m.e. θ̂_n,

  R(θ̂_n) ≤ h_n(θ, 0) + n μ_1 (mod P_θ^n).

Therefore by condition III_{q+4}

  P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ r n^{1/2} }
   ≤ P_θ^n{ n^{-1} h_n(θ, 0) ≥ (1 − γ) Δ(r) } + P_θ^n{ inf_{u∈Ū_n(θ)\v(r)} n^{-1} h_n(θ, u) ≤ −γ Δ(r) }
   = P_1 + P_2,

where γ ∈ (0, 1) is some number.
The probability

  P_1 = o(n^{-s+1})

by Theorem A.4. On the other hand,

  P_2 ≤ P_θ^n{ inf_{u∈Ū_n(θ)\v(r)} n^{-1} h_n(θ, u) ≤ −γ Δ(r) }.   (3.19)

Since, evidently,

  Φ_{1n}(u, 0) − Σ |ε_j| ≤ R( θ + n^{1/2} d_n^{-1} u ) ≤ Σ |ε_j| + Φ_{1n}(u, 0) (mod P_θ^n),

then

  (3.20)

Let us set r = r_0 and γ = 2/ρ_0. Then by condition III_{q+4}, the inequality (3.20), and Theorem A.4 the probability (3.19) is a quantity of order o(n^{-s+1}). Consequently, it remains to estimate the probability

  P_θ{ r_0 > |n^{-1/2} d_n(θ)(θ̂_n − θ)| ≥ r }
   ≤ P_θ{ n^{-1} h_n(θ, 0) ≥ (1 − γ′) Δ(r) } + P_θ^n{ inf_{u∈(v^c(r_0)\v(r))∩Ū_n(θ)} n^{-1} h_n(θ, u) ≤ −γ′ Δ(r) },

where γ′ ∈ (0, 1) is some number.


Let F^{(1)}, …, F^{(l)} ⊂ v^c(r_0) be closed sets whose diameters do not exceed the value δ corresponding by condition (3.16) to the numbers r = r_0 and ε = β Δ(r) γ′/2, where β ∈ (0, 1) is some number, and

  ∪_{i=1}^{l} F^{(i)} = v^c(r_0).

Let us fix the points u_i ∈ F^{(i)} ∩ Ū_n(θ), i = 1, …, l_0, l_0 ≤ l. Then

Let us remark that

  |h_n(θ, u′) − h_n(θ, u″)| ≤ | R(θ + n^{1/2} d_n^{-1} u′) − R(θ + n^{1/2} d_n^{-1} u″) | + E_θ | R(θ + n^{1/2} d_n^{-1} u′) − R(θ + n^{1/2} d_n^{-1} u″) |
   ≤ Σ | |X_j − f(j, u′)| − |X_j − f(j, u″)| | + E_θ^n Σ | |X_j − f(j, u′)| − |X_j − f(j, u″)| |
   ≤ 2 Φ_{1n}(u′, u″).
Therefore by condition (3.16)

  P_3 ≤ Σ_{i=1}^{l_0} P_θ^n{ n^{-1} |h_n(θ, u_i)| ≥ (1 − β) γ′ Δ(r) },

and consequently it is sufficient to bound each term of the latter summation separately.
Let us single out certain properties of the r.v.-s ξ_{jn} = |X_j − f(j, u_i)|. By condition I_s′ and (3.17),

  n^{-1} Σ E_θ^n ξ_{jn}^s ≤ 2^{s−1}( μ_s + n^{-1} Φ_{sn}(u_i, 0) ) ≤ 2^{s−1}( μ_s + κ^{(s)}(r_0) ), s ≥ 3;
  n^{-1} Σ E_θ^n ξ_{jn}^2 = μ_2 + n^{-1} Φ_{2n}(u_i, 0) ≤ μ_2 + κ^{(2)}(r_0).
Assuming that μ_2 < ∞, let us consider the r.v. |ε_j + g| and its variance

  D|ε_j + g| = μ_2 + g^2 − ( E|ε_j + g| )^2, g ≥ 0.

Evidently D|ε_j + g| is a continuous function of g. Let us show that

  D|ε_j + g| → μ_2, g → ∞.

Since

  E|ε_j + g| = g ∫_{−g}^{g+} P(dx) + 2 ∫_{g+}^{∞} x P(dx),

then

  D|ε_j + g| = μ_2 + g^2( 1 − ( ∫_{−g}^{g+} P(dx) )^2 )
     − 4g ∫_{−g}^{g+} P(dx) ∫_{g+}^{∞} x P(dx) − 4 ( ∫_{g+}^{∞} x P(dx) )^2,   (3.22)

and

  g^2( 1 − ( ∫_{−g}^{g+} P(dx) )^2 ) ≤ 4 ∫_{g+}^{∞} x^2 P(dx) → 0, g → ∞.

The convergence to zero of the latter two summands of the right hand side of (3.22) is quite clear. Consequently

  inf_{g≥0} D|ε_j + g| > 0.   (3.23)
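For Gaussian errors the quantities above are available in closed form, which makes the limit D|ε_j + g| → μ_2 and the positivity (3.23) easy to check. A minimal sketch (standard normal P — an assumption; the text only requires symmetry and μ_2 < ∞):

```python
import math

def mean_abs(g):
    """E|Z + g| for Z ~ N(0,1): g*(1 - 2*Phi(-g)) + 2*phi(g)."""
    phi = math.exp(-g * g / 2) / math.sqrt(2 * math.pi)   # standard normal density at g
    Phi_neg = 0.5 * math.erfc(g / math.sqrt(2))           # Phi(-g) = P{Z <= -g}
    return g * (1 - 2 * Phi_neg) + 2 * phi

def var_abs(g):
    """D|Z + g| = mu_2 + g^2 - (E|Z + g|)^2, with mu_2 = 1."""
    return 1 + g * g - mean_abs(g) ** 2

# D|Z + g| tends to mu_2 = 1 as g grows, and inf_g D|Z + g| > 0 as in (3.23).
values = [var_abs(g) for g in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0)]
assert abs(values[0] - (1 - 2 / math.pi)) < 1e-12   # known value at g = 0
assert abs(values[-1] - 1.0) < 1e-6
assert min(values) > 0.3
```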

The inequalities (3.21) and (3.23) make it possible to apply Theorem A.5 to the r.v.-s

  η_{jn} = ξ_{jn} − E_θ^n ξ_{jn}

and, on this basis, for s ≥ 3 to write

  P_θ^n{ n^{-1} |h_n(θ, u_i)| ≥ (1 − β) γ′ Δ(r) }
   ≤ P_θ^n{ |h_n(θ, u_i)| ( E_θ^n h_n^2(θ, u_i) )^{-1/2} ≥ (1 − β) γ′ Δ(r) n^{1/2} ( μ_2 + κ^{(2)}(r_0) )^{-1/2} }
   ≤ κ(T) ( μ_2 + κ^{(2)}(r_0) )^{s/2} ( (1 − β) γ′ Δ(r) )^{-s} n^{-s+1}.   (3.24)

In Theorem A.5 we took

For s = 2 the inequality (3.24) is a consequence of the Chebyshev inequality (κ(T) = 1).
Let us consider the case s = 1. Let us show that the triangular array of r.v.-s

  η_{jn}, j = 1, …, n, n ≥ 1,

satisfies the conditions of Theorem A.6 for a compact T. In this case, for any ε > 0,

  sup_{θ∈T} P_θ^n{ n^{-1} |h_n(θ, u_i)| ≥ ε } → 0, n → ∞,

which completes the proof of the Theorem.


Let us remark that for θ ∈ T, by condition (3.18),

  | ξ_{jn} − E_θ^n ξ_{jn} | ≤ |ε_j| + μ_1 + 2 Φ_{0n}(u_i, 0) ≤ |ε_j| + μ_1 + 2 κ^{(1)}(r_0) (mod P_θ^n).

Therefore for any ε > 0

  Σ ∫_{|x|≥εn} P_{jn}(θ, dx) = Σ P_θ^n{ |η_{jn}| ≥ εn }
   ≤ n P{ |ε_1| ≥ εn − μ_1 − 2 κ^{(1)}(r_0) } ≤ (1/ε′) ∫_{|x|≥ε′n} |x| P(dx) → 0, n → ∞,

where ε′ ∈ (0, ε) is some number. Consequently condition (1) of Theorem A.6 is satisfied.

Let us next verify that condition (2) is satisfied for τ = 1:

  E_θ^n η_{jn}^2 χ{ |η_{jn}| < n } ≤ E_θ^n[ ε_j^2 + 2|ε_j| κ^{(1)}(r_0) + ( κ^{(1)}(r_0) )^2 + ( μ_1 + κ^{(1)}(r_0) )^2 ] χ{ |ε_j| < n + μ_1 + 2 κ^{(1)}(r_0) }
   ≤ ∫_{|x|<n+μ_1+2κ^{(1)}(r_0)} x^2 P(dx) + 4 μ_1 κ^{(1)}(r_0) + 2( κ^{(1)}(r_0) )^2 + μ_1^2.

And so (2) is satisfied if

  n^{-1} ∫_{|x|<n+μ_1+2κ^{(1)}(r_0)} x^2 P(dx) → 0, n → ∞.   (3.25)

However, (3.25) follows ([172], p 318) from the condition

  n P{ |ε_1| ≥ n } ≤ ∫_{|x|≥n} |x| P(dx) → 0, n → ∞.

To verify (3) (for τ = 1) let us write for θ ∈ T

By analogy with Theorem 8 we can make the following remark.


REMARK 9.1: Let

  d_n(θ) = n^{1/2} I_q,

and Θ be a closed set. Then Theorem 9 can be formulated with a simpler discrimination condition on the parameters than III_{q+4}: for any r > 0 there exists a number Δ = Δ(r) > 0 such that

  inf_{θ∈T} inf_{u∈(Θ^c−θ)\v(r)} n^{-1} E_θ^n R(θ + u) ≥ μ_1 + Δ(r).   (3.26)

Nevertheless, the relations (3.15) and (3.26) are awkward to verify. Let us mention one sufficient condition for (3.26) to be fulfilled. Let us assume that the d.f. of the r.v. ε_j has in its Lebesgue decomposition an absolutely continuous component P_a of weight κ_a > 0, with density P_a′ = p_a, and that

(1) sup_{j≥1} sup_{θ_1,θ_2∈Θ^c} |g(j, θ_1) − g(j, θ_2)| = g_0 < ∞,   (3.27)

(2) inf_{|x|≤g_0} p_a(x) = p_0 > 0.   (3.28)

Then, since for any g ∈ [−g_0, g_0]

  E|ε_j + g| ≥ μ_1 + p_0 κ_a g^2,

it follows that

  n^{-1} E_θ R(θ + u) ≥ μ_1 + p_0 κ_a n^{-1} φ_n(θ + u, θ).

Consequently in the situation outlined (3.26) is a corollary of (3.14).
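The quadratic lower bound E|ε_j + g| ≥ μ_1 + p_0 κ_a g^2 can again be checked against a concrete error law. A sketch assuming standard normal errors (an assumption; then κ_a = 1 and one may take p_0 = p(g_0)):

```python
import math

def mean_abs(g):
    """E|Z + g| for Z ~ N(0,1): g*(1 - 2*Phi(-g)) + 2*phi(g)."""
    phi = math.exp(-g * g / 2) / math.sqrt(2 * math.pi)
    Phi_neg = 0.5 * math.erfc(g / math.sqrt(2))   # Phi(-g)
    return g * (1 - 2 * Phi_neg) + 2 * phi

mu1 = math.sqrt(2 / math.pi)                                 # E|Z|
g0 = 1.0                                                     # the bound g_0 of (3.27)
p0 = math.exp(-g0 * g0 / 2) / math.sqrt(2 * math.pi)         # inf of the density on [-g0, g0]
# E|Z + g| - mu_1 >= p_0 * g^2 on [0, g0]  (kappa_a = 1 here).
for g in (0.1, 0.25, 0.5, 0.75, 1.0):
    assert mean_abs(g) - mu1 >= p0 * g * g
```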
To conclude this section let us consider the question of the consistency of a class of estimators of the parameter θ in the model (0.1), defined in a sense close to the l.s.e. and l.m.e.

DEFINITION: An l_α-estimator of the parameter θ ∈ Θ, obtained from the observations X_1, …, X_n, is the name given to any random vector θ̂_n^α having the property

In particular, the l_2-estimator is the l.s.e., and the l_1-estimator is the l.m.e. Keeping the notation Φ_{kn}(u_1, u_2) for non-integral k, let us assume that 1 < α < 2, μ_{2α} < ∞, and:
II_4. (1) For any ε > 0 and R > 0 there exists δ = δ(ε, R) such that

  sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩Ū_n(θ), |u_1−u_2|<δ} n^{-1} Φ_{αn}(u_1, u_2) ≤ ε;   (3.29)

(2) For any R > 0 there exists a constant κ^{(2α)} = κ^{(2α)}(R) < ∞ such that

  sup_{θ∈T} sup_{u∈v^c(R)∩Ū_n(θ)} n^{-1} Φ_{2α,n}(u, 0) ≤ κ^{(2α)}.   (3.30)

III_{q+5}. For any r > 0 there exists Δ = Δ(r) > 0 such that

  inf_{θ∈T} inf_{u∈Ū_n(θ)\v(r)} n^{-1/α} E_θ S_n^{1/α}( θ + n^{1/2} d_n^{-1}(θ) u ) ≥ μ_α^{1/α} + Δ(r),   (3.31)

and moreover an r_0 > 0 can be found such that

  Δ(r_0) = ρ_0 μ_α^{1/α} + Δ_0,

where ρ_0 > 2 and Δ_0 > 0 are some numbers.

THEOREM 10: Let μ_{2α} < ∞ for some α ∈ (1, 2), and let the conditions II_4 and III_{q+5} hold. Then for any r > 0

  sup_{θ∈T} P_θ^n{ |n^{-1/2} d_n(θ)( θ̂_n^α − θ )| ≥ r } = O(n^{-1}).

Proof: Although the proof is similar to the proof of Theorem 9, it contains some details that differ from the preceding arguments. Let us denote

let us fix θ ∈ T and set

  h_n(θ, u) = S_n^{1/α}(u) − E_θ S_n^{1/α}(u).

By the definition of the l_α-estimator

Therefore by condition III_{q+5}, for γ ∈ (0, 1),

  P_θ^n{ |n^{-1/2} d_n(θ)( θ̂_n^α − θ )| ≥ r }
   ≤ P_θ^n{ n^{-1/α} h_n(θ, 0) ≥ (1 − γ) Δ(r) } + P_θ^n{ inf_{u∈Ū_n(θ)\v(r)} n^{-1/α} h_n(θ, u) ≤ −γ Δ(r) }
   = P_1 + P_2.   (3.32)
Evidently,

  S_n^{1/α}(0) = ( Σ |ε_j|^α )^{1/α} P_θ^n-a.s.,

and therefore

  E_θ n^{-1/α} S_n^{1/α}(0) → μ_α^{1/α}, n → ∞.

Let 0 < c_{20} < (1 − γ) Δ(r) be some number. Then for n > n_0 and

  c_{21} = ( μ_α^{1/α} + (1 − γ) Δ(r) − c_{20} )^α − μ_α > 0,

we have

  P_1 ≤ P_θ^n{ n^{-1} Σ |ε_j|^α − μ_α ≥ c_{21} } ≤ ( μ_{2α} − μ_α^2 ) c_{21}^{-2} n^{-1}


by the Chebyshev inequality. Clearly,

  Φ_{αn}^{1/α}(u, 0) − S_n^{1/α}(0) ≤ S_n^{1/α}(u) ≤ Φ_{αn}^{1/α}(u, 0) + S_n^{1/α}(0) (mod P_θ^n),

and consequently

  n^{-1/α} h_n(θ, u) ≥ −n^{-1/α} S_n^{1/α}(0) − n^{-1/α} E_θ S_n^{1/α}(0) (mod P_θ^n).

In (3.32) let us set r = r_0 and γ = 2/ρ_0. Then by condition III_{q+5}

  P_2 ≤ P_θ^n{ n^{-1} Σ |ε_j|^α − μ_α ≥ c_{22} } ≤ ( μ_{2α} − μ_α^2 ) c_{22}^{-2} n^{-1},
  c_{22} = ( μ_α^{1/α} + 2 ρ_0^{-1} Δ_0 )^α − μ_α.
And so it remains to estimate the probability (γ′ ∈ (0, 1))

  P_θ^n{ r_0 ≥ |n^{-1/2} d_n(θ)( θ̂_n^α − θ )| ≥ r }
   ≤ P_θ^n{ sup_{u∈v^c(r_0)∩Ū_n(θ)} n^{-1/α} |h_n(θ, u)| ≥ γ′ Δ(r) } + O(n^{-1}).   (3.33)

Let us introduce, as above, the closed sets F^{(1)}, …, F^{(l)}, whose diameters do not exceed the number δ corresponding to the condition (3.29) with the numbers r_0 and

  ε = ( c_{23} Δ(r) γ′ / 2 )^α, c_{23} ∈ (0, 1),
  ∪_{i=1}^{l} F^{(i)} = v^c(r_0).

Then for u′, u″ ∈ F^{(i)}, by condition (3.29),

Therefore for the probability P_3 on the right hand side of (3.33) we can write

  P_3 ≤ Σ_{i=1}^{l_0} P_θ^n{ n^{-1/α} |h_n(θ, u_i)| ≥ (1 − c_{23}) γ′ Δ(r) }.   (3.34)

We note that

  (3.35)

Using the inequalities

  (3.36)

and condition (3.30), we obtain

  n^{-1/α} ( E_θ S_n(u) )^{1/α} − n^{-1/α} E_θ S_n^{1/α}(u) ≥ 0,
  | S_n^{1/α}(u) − ( E_θ S_n(u) )^{1/α} | ≤ | S_n(u) − E_θ S_n(u) |^{1/α}.   (3.37)

From the inequalities (3.35)-(3.37) it follows that for some constant 0 < c_{24} < (1 − c_{23}) γ′ Δ(r) and n > n_0

  P_θ^n{ n^{-1/α} |h_n(θ, u_i)| ≥ (1 − c_{23}) γ′ Δ(r) } ≤ P_θ^n{ n^{-1} | S_n(u_i) − E_θ S_n(u_i) | ≥ c_{24} } ≤ c_{25} n^{-1},

where the latter bound is valid for each summand of the right hand side of (3.34).


Let us show that in many cases the condition III_{q+5} can be replaced by one more convenient to check. It was established above that

  n^{-1/α} E_θ^n S_n^{1/α}(θ) → μ_α^{1/α}, n → ∞.

Together with this,

Consequently, from the bound for (3.34) it follows that if instead of (3.30) the condition

  lim sup_{n→∞} sup_{θ∈T, τ∈Θ^c} n^{-1} Σ |g(j, θ) − g(j, τ)|^{2α} < ∞   (3.38)

is satisfied, then instead of (3.31) it is sufficient to verify the inequality

  inf_{θ∈T} inf_{u∈Ū_n(θ)\v(r)} n^{-1/α} ( E_θ^n S_n(u) )^{1/α} ≥ μ_α^{1/α} + Δ(r).   (3.39)

Clearly (3.38) holds for a bounded regression function g(j, θ).


Let us assume that Θ is a bounded set and

Let us introduce the following conditions:

(1) The d.f. P(x) of the r.v. ε_j is absolutely continuous (P′(x) = p(x)), the density p(x) is a bounded, even, continuously differentiable function on ℝ^1 that is monotonically non-decreasing on (−∞, 0], and μ_{2α} < ∞;

(2) For any ε > 0 there exists a δ = δ(ε) > 0 such that

  sup_{θ_1,θ_2∈Θ^c; |θ_1−θ_2|<δ} n^{-1} Σ |g(j, θ_1) − g(j, θ_2)|^α ≤ ε;

(3) sup_{j≥1} sup_{θ∈T, τ∈Θ^c} |g(j, θ) − g(j, τ)| ≤ g_0 < ∞.   (3.40)
THEOREM 11: If the conditions (3.14), (1)-(3), and

  G_α = ∫_{−∞}^{0} |x|^{α−1} p′(x) dx > 0

are satisfied, then for any r > 0

as n → ∞.

Proof: Since (3.40) ensures (3.38), it is sufficient to verify the inequality (3.39), which now turns into the inequality

  (3.41)

Let us indicate one bound for the quantity Δ*(r). Let us consider the function

  k(z) = ∫_{−∞}^{∞} |x + z|^α p(x) dx = ∫_{−∞}^{∞} |x|^α p(x − z) dx, z ≥ 0.

Clearly,

  k(0) = μ_α.

Let us note that

  k′(z) = −∫_{−∞}^{∞} |x|^α p′(x − z) dx = α ∫_{−∞}^{∞} sgn x |x|^{α−1} p(x − z) dx;
  k′(0) = 0;
  k″(z) = α ∫_{−∞}^{0} |x|^{α−1} p′(x − z) dx − α ∫_{0}^{∞} x^{α−1} p′(x − z) dx.

Since p′(x) ≥ 0 for x ≤ 0 and p′(x) ≤ 0 for x ≥ 0, then

  −α ∫_{0}^{∞} x^{α−1} p′(x − z) dx ≥ α z^{α−1} p(0),
  α ∫_{−∞}^{0} |x|^{α−1} p′(x − z) dx = α ∫_{−∞}^{−z} |x + z|^{α−1} p′(x) dx.

Therefore

  k″(z) ≥ α ∫_{−∞}^{0} |x|^{α−1} p′(x) dx = α G_α.   (3.42)

Let us write

  a_j = |g(j, θ) − g(j, τ)|.

Using the conditions of the Theorem and the bound (3.42), for |τ − θ| ≥ r we obtain

  n^{-1} Σ ( k(a_j) − μ_α ) ≥ (α/2) G_α n^{-1} Σ a_j^2 ≥ (α/2) G_α ρ(r),

where ρ(r) is the number from the condition (3.14). Consequently, in (3.41) it is possible to take

  Δ*(r) = (α/2) G_α ρ(r).

EXAMPLE 3: The conditions of Theorem 11 are satisfied by the r.v.-s ε_j with densities

  p(x) = ( α β^{1/α} / ( 2 Γ(1/α) ) ) e^{−β |x|^α}, β > 0, α ∈ (1, 2), x ∈ ℝ^1,

for which the l_α-estimators of the parameter θ are maximum likelihood estimators.
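A quick numerical sanity check of Example 3 (hypothetical parameter values α = 1.5, β = 2, chosen only for illustration): the density integrates to one, and minus its logarithm is β|x|^α plus a constant, so maximizing the likelihood of the residuals X_j − g(j, τ) is the same as minimizing the l_α criterion Σ |X_j − g(j, τ)|^α.

```python
import math

def p(x, alpha, beta):
    """Density of Example 3: alpha*beta^(1/alpha)/(2*Gamma(1/alpha)) * exp(-beta*|x|^alpha)."""
    c = alpha * beta ** (1 / alpha) / (2 * math.gamma(1 / alpha))
    return c * math.exp(-beta * abs(x) ** alpha)

alpha, beta = 1.5, 2.0
# Trapezoidal check that the density integrates to one over [-20, 20]
# (the tail mass beyond 20 is negligible for these parameter values).
n = 200000
h = 40 / n
total = sum(p(-20 + h * k, alpha, beta) for k in range(n + 1))
total = h * (total - 0.5 * (p(-20.0, alpha, beta) + p(20.0, alpha, beta)))
assert abs(total - 1.0) < 1e-6
# -log p(x) = beta*|x|^alpha + const: the l_alpha criterion is the negative
# log-likelihood up to an affine transformation, hence l_alpha-estimator = m.l.e.
```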

4 THE DIFFERENTIABILITY OF REGRESSION FUNCTIONS AND

THE CONSISTENCY OF THE LEAST SQUARES ESTIMATOR

If we regard condition (3.10) as being satisfied, we show that the assumption of differentiability of the regression function g(j, θ) allows one to sharpen the result on the consistency of θ̂_n contained in Theorem 8. If g(j, θ), j ≥ 1, is a differentiable function, then it is natural to take as the normalising matrix d_n(θ) the matrix composed of the elements

  d_{in}(θ) = ( Σ g_i^2(j, θ) )^{1/2}, g_i = ∂g/∂θ^i, i = 1, …, q.
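For a concrete model the elements d_{in}(θ) are easy to compute; the sketch below (hypothetical scalar model g(j, θ) = e^{−θ t_j}, t_j = j/n — an assumption, not from the text) also illustrates the n^{1/2} growth of d_n(θ) assumed in condition (3.10):

```python
import math

def d_1n(theta, n):
    """d_{1n}(theta) = (sum_j g_1(j, theta)^2)^(1/2) for the hypothetical model
    g(j, theta) = exp(-theta * t_j), t_j = j/n, where g_1(j, theta) = -t_j * e^{-theta t_j}."""
    s = sum(((j / n) * math.exp(-theta * j / n)) ** 2 for j in range(1, n + 1))
    return math.sqrt(s)

theta = 1.0
# n^{-1/2} d_{1n}(theta) stabilises (it is a Riemann sum of t^2 e^{-2 theta t} on [0, 1]),
# i.e. d_n(theta) grows like n^{1/2} -- the regime singled out by condition (3.10).
vals = [d_1n(theta, n) / math.sqrt(n) for n in (100, 1000, 10000)]
assert abs(vals[-1] - vals[-2]) < 1e-3
```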
Let us introduce a series of assumptions.

II_5. The set Θ is convex. The functions g(j, θ), j ≥ 1, are continuous on Θ^c and continuously differentiable in Θ, whence for any R > 0:

(1) there exist constants β̄_i = β̄_i(R) < ∞ and β_i = β_i(R) < ∞ such that

  sup_{θ∈T} sup_{u∈v^c(R)∩Ū_n(θ)} d_{in}( θ + n^{1/2} d_n^{-1}(θ) u ) d_{in}^{-1}(θ) ≤ β̄_i, i = 1, …, q,   (4.1)

  sup_{θ∈T} sup_{u∈v^c(R)∩Ū_n(θ)} d_{in}^{-1}( θ + n^{1/2} d_n^{-1}(θ) u ) d_{in}(θ) ≤ β_i, i = 1, …, q;   (4.2)

(2) there exists a constant γ_i = γ_i(R) < ∞ such that

  sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩Ū_n(θ)} d_{in}^{-1}(θ) ( Φ_n^{(i)}(u_1, u_2) )^{1/2} |u_1 − u_2|^{-1} ≤ γ_i, i = 1, …, q.   (4.3)

The following condition sharpens (3.11) in some neighbourhood of zero.

III_{q+6}. For some r_0 > 0 there exists a number κ_0 > 0 such that

  inf_{θ∈T} inf_{u∈v^c(r_0)∩Ū_n(θ)} n^{-1} Φ̄_n(u, 0) |u|^{-2} ≥ κ_0.   (4.4)

Let us denote

  f_i(j, u) = g_i( j, θ + n^{1/2} d_n^{-1}(θ) u ),
  φ_n^{(i)}(θ_1, θ_2) = Σ [ g_i(j, θ_1) − g_i(j, θ_2) ]^2, θ_1, θ_2 ∈ Θ,
  Φ_n^{(i)}(u_1, u_2) = Σ [ f_i(j, u_1) − f_i(j, u_2) ]^2, u_1, u_2 ∈ Ū_n(θ), i = 1, …, q.

IV_t. For some integer t ≥ 3 and any R > 0

  lim sup_{n→∞} sup_{θ∈T} sup_{u∈v^c(R)∩Ū_n(θ)} n^{t/2−1} d_{in}^{-t}(θ) Σ |f_i(j, u)|^t < ∞, i = 1, …, q.   (4.5)

Let us show that from the condition (4.1) there follows a condition that makes the requirement II_2 of Theorem 8 more precise.

LEMMA 12.1: If (4.1) holds, then

  sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩Ū_n(θ)} n^{-1} Φ̄_n(u_1, u_2) |u_1 − u_2|^{-2} ≤ 4 |β̄(R)|^2 < ∞,   (4.6)

where |β̄(R)| is the norm of the vector β̄(R) = ( β̄_1(R), …, β̄_q(R) ).
Proof: Let θ ∈ T be fixed. By the finite increments formula, for u_1, u_2 ∈ v^c(R) ∩ Ū_n(θ), with the aid of the Cauchy-Bunyakovskii inequality we find

  n^{-1} Φ̄_n(u_1, u_2) = 2 n^{-1} Σ ( f(j, u_1) − f(j, u_1 + η_n(u_2 − u_1)) ) ⟨ ∇f(j, u_1 + η_n(u_2 − u_1)), n^{1/2} d_n^{-1}(θ)(u_2 − u_1) ⟩,



where η_n ∈ (0, 1) and ∇f(j, u) is the gradient of the function f(j, ·) at the point u. Then from the inequality obtained and from (4.1) it follows that

  sup_{u_1,u_2∈v^c(R)∩Ū_n(θ)} n^{-1} Φ̄_n(u_1, u_2) |u_2 − u_1|^{-2}
   ≤ 2 |β̄(R)| sup_{u_1,u_2′∈v^c(R)∩Ū_n(θ)} n^{-1/2} Φ̄_n^{1/2}(u_1, u_2′) |u_2′ − u_1|^{-1},

from whence we also obtain (4.6).


The relation between the smoothness of the function g(j, θ) and the condition III_{q+6} is somewhat more complicated. Let us set

  I(θ) = ( d_{in}^{-1}(θ) d_{ln}^{-1}(θ) Σ g_i(j, θ) g_l(j, θ) )_{i,l=1}^{q}, θ ∈ Θ.

The symmetric matrix I(θ) is non-negative definite. Let λ_min(I(θ)) be the smallest eigenvalue of the matrix I(θ). We now introduce a condition which plays an important role in the following chapters:

V. For n > n_0 the quantity inf_{θ∈T} λ_min(I(θ)) is bounded away from zero.   (4.7)

If a second derivative exists for the function g(j, θ), then let us set

  d_{iln}(θ) = ( Σ g_{il}^2(j, θ) )^{1/2}, i, l = 1, …, q.

II_6. The functions g_i(j, θ), i = 1, …, q, j ≥ 1, are continuous on Θ^c and continuously differentiable in the convex set Θ, whence for any R > 0 there exist constants γ̄_{il} = γ̄_{il}(R) < ∞ such that

  sup_{θ∈T} sup_{u∈v^c(R)∩Ū_n(θ)} n^{1/2} d_{in}^{-1}(θ) d_{ln}^{-1}(θ) d_{iln}( θ + n^{1/2} d_n^{-1}(θ) u ) ≤ γ̄_{il}, i, l = 1, …, q.   (4.8)


LEMMA 12.2: The condition III_{q+6} follows from the conditions (4.1), II_6 and V. The condition (4.3) follows from II_6.

Proof: By the Taylor expansion, for u ∈ v^c(R) ∩ Ū_n(θ), we find

  n^{-1} Φ̄_n(u, 0) = ⟨ I(θ, u) u, u ⟩ + ⟨ I^{(1)}(θ, u) u, u ⟩,
  I(θ, u) = ( d_{in}^{-1}(θ) d_{ln}^{-1}(θ) Σ f_i(j, η_n u) f_l(j, η_n u) )_{i,l=1}^{q},
  I^{(1)}(θ, u) = ( d_{in}^{-1}(θ) d_{ln}^{-1}(θ) Σ ( f(j, η_n u) − f(j, 0) ) f_{il}(j, η_n u) )_{i,l=1}^{q},

where f_i is defined above,

  f_{il}(j, u) = g_{il}( j, θ + n^{1/2} d_n^{-1}(θ) u ), i, l = 1, …, q, η_n ∈ (0, 1).

From (4.6) and II_6, for any element I_{il}^{(1)} of the matrix I^{(1)} we obtain

  (4.9)

On the other hand, for u ∈ v^c(R) ∩ Ū_n(θ), for the difference between the general elements of the matrices I(θ, u) and I(θ), with the help of (4.1) and (4.3) we obtain

  | I_{il}(θ, u) − I_{il}(θ) |
   ≤ d_{in}^{-1}(θ) d_{in}( θ + n^{1/2} d_n^{-1}(θ) η_n u ) d_{ln}^{-1}(θ) ( Φ_n^{(l)}(η_n u, 0) )^{1/2} + d_{in}^{-1}(θ) ( Φ_n^{(i)}(η_n u, 0) )^{1/2}.   (4.10)

The inequalities (4.9), (4.10) and condition V show that there exist numbers r_0 > 0 and κ_0 > 0 such that (4.4) holds for any θ ∈ T.
Let us now convince ourselves that condition (4.3) is a corollary of II_6. In fact, we apply the finite increments formula to the function

and repeat the arguments which led to the inequality (4.6). We then obtain (4.3) with the constants

  γ̄^{(i)} = ( γ̄_{i1}, …, γ̄_{iq} ), i = 1, …, q.

After the elucidation of the relations between the conditions introduced above we are able to formulate the basic assertion of this section.

THEOREM 12: Let the conditions of Section 3 be satisfied: I_s for some s ≥ 3 and III_{q+3}, together with the conditions II_5, III_{q+6}, IV_s. Then if s^2 > s + q there exists a constant κ > 0 such that

  sup_{θ∈T} P_θ^n{ |d_n(θ)(θ̂_n − θ)| ≥ κ log^{1/2} n } = o(n^{-(s−2)/2}).   (4.11)

Proof: Let 0 < r* ≤ r_0, where r_0 is the constant in condition III_{q+6}. By Theorem 8, whose conditions follow from the conditions of Theorem 12,

  sup_{θ∈T} P_θ^n{ |u_n(θ)| ≥ r* } = o(n^{-(s−2)/2}).

Therefore in order to obtain (4.11) it is sufficient to show that for

  τ_n = n^{-1/2} log^{1/2} n

we have

  sup_{θ∈T} P_θ^n{ r* > |u_n(θ)| ≥ κ τ_n } = o(n^{-(s−2)/2}).   (4.12)

We now show that for

we have

  sup_{θ∈T} P_θ^n{ sup_{u∈(v^c(r*)\v(κτ_n))∩Ū_n(θ)} ν(u) ≥ 1/2 } = o(n^{-(s−2)/2}),   (4.13)

whence (4.12) follows by arguments analogous to the previous ones. Let us fix θ ∈ T and introduce the sets

  Q_{rn} = { r κ τ_n ≤ |u| ≤ (r + 1) κ τ_n } ∩ v^c(r*) ∩ Ū_n(θ), r = 1, …, [r*/(κτ_n)].


By the condition III_{q+6}

  P_θ^n{ sup_{u∈(v^c(r*)\v(κτ_n))∩Ū_n(θ)} ν(u) ≥ 1/2 } ≤ Σ_{r=1}^{[r*/(κτ_n)]} P_θ^n{ sup_{u∈Q_{rn}} ν(u) ≥ 1/2 }.

Applying the finite increments formula to w(θ + n^{1/2} d_n^{-1} u, θ) as a function of u ∈ v((r+1)κτ_n) ∩ v^c(r*) ∩ Ū_n(θ), we obtain

  w(θ + n^{1/2} d_n^{-1} u, θ) = b(θ + n^{1/2} d_n^{-1} u) − b(θ) = Σ_{i=1}^{q} b_i( θ + n^{1/2} d_n^{-1} η_n u ) n^{1/2} d_{in}^{-1}(θ) u^i,

where

  b_i(τ) = Σ_j ε_j g_i(j, τ), i = 1, …, q, η_n ∈ (0, 1), (u^1, …, u^q) = u.

Let us cover the ball v((r + 1)xτ_n) ∩ v^c(r*) with an ε-net N = {u^{(m)}}. The number
of points |N| in N does not exceed the quantity c(q)((r + 1)xτ_n ε^{−1})^q,
where the constant c(q) depends only on the dimension q of the parametric set Θ.
Let {s^{(m)}} be the sets formed by the intersection of the balls of radius ε with centres at
the points u^{(m)} with the set v^c(r*) ∩ U_n^c(θ). Then

{ sup_{u∈v^c((r+1)xτ_n)∩v^c(r*)∩U_n^c(θ)} |w(θ + n^{1/2}d_n^{−1}u, θ)| ≥ (1/2) n^{1/2} r² x_0 x τ_n }

⊂ { max_m Σ_{i=1}^q |b_i(θ + n^{1/2}d_n^{−1}u^{(m)})| d_{in}^{−1} ≥ δ x_0 x r²(r + 1)^{−1} n^{1/2} τ_n },  (4.14)

where δ ∈ (0, 1/2) is some number. The last inclusion was made possible thanks to
condition (4.3):

sup_{u∈s^{(m)}} Σ_{i=1}^q |b_i(θ + n^{1/2}d_n^{−1}u^{(m)}) − b_i(θ + n^{1/2}d_n^{−1}u)| d_{in}^{−1} ≤ (n s*)^{1/2} ε Σ_{i=1}^q γ_i(r*).

Let us denote

γ(r*) = Σ_{i=1}^q γ_i(r*).
Then

p;{ sup II(U) ~ ~}


uEQr ..

(4.15)

Let us consider the last summand of the right hand side of (4.15), setting
g = XTn; X = x(r)is a number which will be chosen later. We need to estimate
the probability 1r(~r)' where

Consequently

Δ_r > 0,  r = 1, …, r₁,

if

 (4.16)

and

if

x > γ(r*) (1 + μ₂(r + 1)² r^{−4} x²(r))^{1/2},  (4.17)

x(r) < x_0(2 − δ) r²(r + 1)^{−1},  (4.18)

If we consider the inequalities (4.16), (4.17) and (4.18) as being satisfied, for
r = 1, …, r₁ we find that

 (4.19)

by statement (1) of Theorem A.4. For r > r₁, by statement (2) of the same
theorem,

 (4.20)

where

ζ_n = o(n^{−(s−2)/2})

and does not depend upon r.
Using the condition (4.1) let us estimate, for fixed m, the probability entering
into the sum on the right-hand side of the inequality (4.15):

P₁ = P_θ^n { Σ_{i=1}^q |b_i(θ + n^{1/2}d_n^{−1}u^{(m)})| d_{in}^{−1} ≥ δ x_0 x r²(r + 1)^{−1} n^{1/2} τ_n }

≤ Σ_{i=1}^q P_θ^n { |b_i(θ + n^{1/2}d_n^{−1}u^{(m)})| d_{in}^{−1}(θ + n^{1/2}d_n^{−1}u^{(m)}) ≥ μ₂^{1/2} a_{in} },

where

a_{in} = δ x_0 x r² (μ₂^{1/2}(r + 1) q β_i(r*))^{−1} log^{1/2} n,  i = 1, …, q.
For the estimation of P₁ let us take advantage of the conditions (4.2), IV_s, and
Theorem A.5. For this it is necessary that for some fixed δ′ > 0 there is
satisfied the inequality

r ≥ 1,

or

 (4.21)

where

If (4.21) holds, then setting, in Theorem A.5,

c_j = μ₂^{−1/2} g_i(j, θ + n^{1/2}d_n^{−1}(θ)u^{(m)}) n^{1/2} d_{in}^{−1}(θ + n^{1/2}d_n^{−1}(θ)u^{(m)}),  j = 1, …, n,

we find that

σ_n²(θ) = 1,

and the requirement of Theorem A.5 on the quantity ρ_{s,n}(θ) is satisfied at the
expense of the condition IV_s and the relation (4.2). Therefore

where the constants xi(T) do not depend upon r.


Let us assume that the constant x satisfies the inequalities (4.16), (4.17) and
(4.21). Then, thanks to (4.14), (4.15), (4.19), (4.20) and (4.22), we obtain

P_θ^n { sup_{u∈(v^c(r*)∖v(xτ_n))∩U_n^c(θ)} ν(u) ≥ 1/2 }

≤ Σ_{r=1}^{r₁} π(Δ_r) + ζ_n Σ_{r=1}^{[r*/(xτ_n)]} (r + 1)^s r^{−2s} x^s(r) + ζ_n Σ_{r=1}^{[r*/(xτ_n)]} r^{−2s} (r + 1)^{s+q} x^{−q}(r).  (4.23)

Let us set

x(r) = r^{q/(q+s)}.

Then the series Σ_r (r + 1)^s r^{−2s} x^s(r) and Σ_r (r + 1)^{q+s} r^{−2s} x^{−q}(r) converge if the
series Σ_r r^{−s²/(s+q)} converges, i.e., if s² > s + q. Since the latter is stipulated in
the formulation of the theorem, the relation (4.13) will then be established if the
constant x is chosen appropriately.
Clearly the function

((r + 1)/r²) x(r) = r^{−s/(s+q)} + r^{−(1+s/(s+q))}

decreases monotonically for r > 0. Therefore in the inequality (4.16) it is possible
to take

 (4.24)

Let us consider the inequality (4.17). If 4μ₂ > 1 then an integer r₁ ≥ 1
can be found such that for r > r₁

Consequently (4.17) will be satisfied for r > r₁ if (4.24) holds. Let 4μ₂ ≤ 1. Then
if

 (4.25)

then even for r = 1 (4.17) will hold. Independently of the value that μ₂ takes, the
inequality (4.18) holds for

r > r₁ = min( r : r^{s/(s+q)} > 1 + r^{−1} ).

In this way it is possible, in the formulation of the Theorem, to take x >
max(x₁, x₂, x₃), where the quantities x_i, i = 1, 2, 3, are obtained from (4.21),
(4.24) and (4.25).  ∎
EXAMPLE 4: Let

g(j, θ) = θ¹ cos θ²j,  j ≥ 1,

Θ = (0, A) × (h, π − h),  A < ∞,  h > 0,

T = [a, b] × [φ, π − φ],  a > 0,  b < A,  φ > h,  s = 3.

In this case

d_{1n}(θ) = ( Σ_{j=1}^n cos² θ²j )^{1/2} = ( n/2 + (1/4)( sin(2n+1)θ² / sin θ² − 1 ) )^{1/2},

and

n^{1/2} d_{1n}^{−1}(θ) = √2 (1 + O(n^{−1}))

uniformly with respect to θ ∈ T.
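The closed form for d_{1n}(θ) rests on a trigonometric identity which is easy to confirm numerically. The sketch below (sample size and frequency are assumed values, chosen only for illustration) compares the direct sum of squared cosines with the closed form:

```python
import numpy as np

n, w = 500, 1.1                          # assumed n and frequency theta^2
j = np.arange(1, n + 1)
lhs = np.sum(np.cos(w * j) ** 2)         # direct evaluation of d_{1n}^2
rhs = n / 2 + 0.25 * (np.sin((2 * n + 1) * w) / np.sin(w) - 1)
print(lhs, rhs)                          # the two agree to rounding error
```

The identity shows at a glance that d_{1n}²(θ) = n/2 + O(1), which is what makes the normalisation √2 n^{−1/2} for the amplitude component appropriate.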


On the other hand,

d_{2n}(θ) = θ¹ ( Σ_{j=1}^n j² sin² θ²j )^{1/2}

= θ¹ ( n³/6 + n²/4 − n/12 + (n²/4)(sin(2n+1)θ² / sin θ²) + (n/4)(cos 2nθ² / sin² θ²) − (1/8)(sin 2nθ² cos θ² / sin³ θ²) )^{1/2},

n^{1/2} d_{2n}^{−1}(θ) = (√6/(θ¹ n)) (1 + O(n^{−1}))

uniformly with respect to θ ∈ T. Therefore it is appropriate to take the matrix

diag( (n/2)^{1/2}, θ¹(n³/6)^{1/2} )

as the normalising matrix instead of d_n(θ).


Keeping the meaning of the notation introduced above, we obtain for the stated
norm

n^{−1}Φ_n(u; θ) = (θ¹ + √2 u¹)² ( 1/2 + (1/4n)( sin((2n+1)(θ² + √6 u²/(θ¹n))) / sin(θ² + √6 u²/(θ¹n)) − 1 ) )

+ (θ¹)² ( 1/2 + (1/4n)( sin(2n+1)θ² / sin θ² − 1 ) )

− (θ¹(θ¹ + √2 u¹)/2n) ( sin((n + 1/2)(2θ² + √6 u²/(θ¹n))) / sin(θ² + √6 u²/(2θ¹n)) + sin((n + 1/2) √6 u²/(θ¹n)) / sin(√6 u²/(2θ¹n)) − 2 )  (4.26)

uniformly in θ ∈ T.
Clearly condition (3.11) is satisfied. Let us verify that (3.12) is satisfied. Let
us consider first the case |u²| > c, where c > 0 is some number. For a fixed u² we
obtain

n^{−1}Φ_n(u; θ) ≥ (1/2)(θ¹)² ( 1 − 1/( 2n sin(√6 u²/(2θ¹n)) ) )  (4.27)

If (θ¹)² > 8μ₂, then it follows from (4.26) and (4.27) that a number R₁ can always
be found such that for |u²| ≥ R₁ the inequality (3.12) is satisfied. In fact,

2n sin( √6 |u²|/(2θ¹n) ) ≥ inf_{x∈(0,π/2)} (sin x / x) · (√6/θ¹)|u²| ≥ (2√6/(πb)) R₁.

And so, if

a/μ₂^{1/2} > 2√2,

then in (3.12) it is possible to take

R₀ = (R₁² + 4μ₂)^{1/2}.

Indeed, if

then (3.12) holds. If

then

Now let |u²| ≤ c. Then if

then

|u¹|² ≥ R₀² + 4μ₂ − c².

Now let us choose R₁ so large that u¹ cannot take such values for θ ∈ T.
This means that all the points u = (u¹, u²) with |u²| ≤ c are found in the set
v^c((R₁² + 2μ₂)^{1/2}) ∩ U_n^c(θ).
Let us note that the quotient θ¹/μ₂^{1/2} is called, in the statistical theory of communication,
the signal-to-noise ratio, and the condition a/μ₂^{1/2} ≥ 2√2 has a physical
meaning: the observations X_j are regarded as the signal
g(j, θ) = θ¹ cos θ²j contaminated by noise.
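The signal-plus-noise reading of Example 4 can be tried numerically. The sketch below (all parameter values and the noise law are assumed, purely for illustration) computes the l.s.e. of (θ¹, θ²) by profiling out the amplitude, which enters the sum of squares linearly for each trial frequency:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
theta1, theta2 = 2.0, 1.1                # assumed true amplitude and frequency
j = np.arange(1, n + 1)
x = theta1 * np.cos(theta2 * j) + rng.normal(0.0, 1.0, n)

# For each trial frequency w the least-squares amplitude is available in
# closed form, so only a one-dimensional search over w is needed.
best_rss, a_hat, w_hat = np.inf, None, None
for w in np.linspace(0.05, np.pi - 0.05, 5000):
    c = np.cos(w * j)
    a = (x @ c) / (c @ c)                # profiled amplitude for this w
    rss = np.sum((x - a * c) ** 2)
    if rss < best_rss:
        best_rss, a_hat, w_hat = rss, a, w
print(a_hat, w_hat)                      # close to (2.0, 1.1)
```

The very sharp dip of the residual sum of squares near θ² reflects the fast n^{3/2} normalisation of the frequency component discussed above.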
Let us verify the condition III_{q+6}. The calculations show that

uniformly in θ ∈ T. Consequently the number r₀ in condition III_{q+6} can always
be found if we choose x₀ < 2.
Let r* = r₀. It is easy to see that the condition (4.1) is satisfied for R = r* if

β₁(r*)

It is equally easy to convince ourselves of the validity of (4.2). Let us verify
(4.3). Let us note that

Φ_n^{(1)}(u₁, u₂) ≤ (2(θ¹)² + O(n^{−1})) |u₁ − u₂|²,

i.e., it is possible to take
Some simple calculations show that

(6/((θ¹)² n³)) (∂/∂u¹)² Φ_n^{(2)}(u₁, u₂) |_{u₂=u₁} = 2/(θ¹)² + O(n^{−1}),

= O(n^{−1}),

uniformly in θ ∈ T and u ∈ v(r*), where |η_n| ≤ r*. Consequently it is possible
to choose

and

Let us set δ = 1/4. Then it is possible to assert that

sup_{θ∈T} P_θ^n { ( (n/2)(θ̂_n¹ − θ¹)² + ((θ¹)² n³/6)(θ̂_n² − θ²)² )^{1/2} ≥ x′ log^{1/2} n } = o(n^{−1/2}),  (4.28)

where

x′ > max(x̄₄, x̃₄),

x̄₄ = 8μ₂^{1/2}(1 + √2 r* a^{−1}),

x̃₄ = max(2√2, 4μ₂^{1/2}) · max( √2 a^{−1}, √3.6 (b + √2 r*) ),

or

5 STRONG CONSISTENCY

Let us consider the statistical experiment E^N = {ℝ^N, 𝔅^N, P_θ^N, θ ∈ Θ} generated
by the sequence of observations X₁, X₂, … of the form (0.1). The probability
spaces (ℝ^N, 𝔅^N, P_θ^N) are naturally used in the study of the almost certain (a.c.) convergence
of statistical estimators to the true value of the parameter θ ∈ Θ. It is understood
that all the assertions of the preceding sections can be reformulated for the experiments
considered. Let

θ̂_n = θ̂_n(X₁, …, X_n),  n ≥ 1,

be a certain sequence of estimators.
DEFINITION: The sequence θ̂_n, n ≥ 1, is called a strongly consistent sequence of
estimators of the parameter θ ∈ Θ (θ̂_n is a strongly consistent estimator of θ ∈ Θ)
if

P_θ^N { θ̂_n →_{n→∞} θ } = 1   ( θ̂_n →_{n→∞} θ  P_θ^N-a.c. ).
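The definition can be watched at work in a simulation (a sketch with an assumed model and assumed values, not taken from the book): for the scalar model g(j, θ) = θ y_j the l.s.e. has a closed form, and along a single fixed realisation the trajectory θ̂_n settles onto the true θ as n grows.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 0.7                               # assumed true parameter
N = 20000
y = 1.0 + np.sin(np.arange(1, N + 1))     # bounded deterministic regressors
x = theta * y + rng.normal(0.0, 1.0, N)   # model (0.1) with g(j, theta) = theta * y_j

# Closed-form l.s.e. for every sample size n along one realisation
theta_hat = np.cumsum(x * y) / np.cumsum(y * y)
for n in (100, 1000, 20000):
    print(n, abs(theta_hat[n - 1] - theta))   # shrinks along the trajectory
```

Strong consistency is precisely the statement that, with probability one, this whole trajectory (not just its distribution) converges to θ.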

From the theorems of the preceding sections it is possible to obtain some sufficient
conditions for the strong consistency of the l.s.e.-s θ̂_n and the l.m.e.-s θ̃_n.
Let us mention two examples. Let us assume that under the conditions of Theorem
8, μ_s < ∞ for some s > 4. Then for any r > 0 and θ ∈ T

and, consequently,

Analogously, under the conditions of Theorem 9, if μ_s < ∞ for some natural
number s ≥ 3, then for any θ ∈ T

In fact, these estimators are strongly consistent under less severe constraints. The
corresponding assertions will be introduced in the second part of the section; here
we shall formulate and prove one general assertion about the strong consistency of
minimal contrast estimators (m.c.e.) for non-identically distributed observations.

Let us consider a sequence of families of Borel functions

F_j = { f_j(·, θ) : ℝ¹ → ℝ̄¹,  θ ∈ Θ^c },  j ≥ 1,  ℝ̄¹ = [−∞, +∞].  (5.1)

Clearly, for any θ₁, θ₂ ∈ Θ and j ≥ 1,

E_{θ₁}^N f_j(X_j, θ₂) = E_{θ₁}^{(j)} f_j(X_j, θ₂),

on condition that there exist the mathematical expectations E_{θ₁}^N, E_{θ₁}^{(j)} under the
measures P_{θ₁}^N, P_{θ₁}^{(j)}.
l
We shall assume that the assumption (3.10) is satisfied.
DEFINITION: The sequence (5.1) is a sequence of families of contrast functions for
the family of measures {P_θ^N, θ ∈ Θ} if
(1) for any j ∈ ℕ and any θ₁, θ₂ ∈ Θ there exists E_{θ₁}^{(j)} f_j(X_j, θ₂);
(2) for any θ ∈ Θ and any r > 0 there exists Δ = Δ(r) > 0 such that

inf_{u∈U_n^c(θ)∖v(r)} n^{−1} ( Σ E_θ^N f_j(X_j, θ + n^{1/2}d_n^{−1}(θ)u) − Σ E_θ^N f_j(X_j, θ) ) > Δ(r).  (5.2)

DEFINITION: The m.c.e. of the parameter θ ∈ Θ, obtained from the observations
X₁, …, X_n, is the name given to any random vector θ̂_n for which the relation

is satisfied.
Below we shall assume that Θ^c is compact and that inf_{θ∈Θ^c} n^{−1} Σ f_j(x_j, θ)
is attained in Θ^c for any x = (x₁, …, x_n) ∈ ℝ^n. We shall also assume, for
any j ∈ ℕ and any Borel set B ⊆ Θ^c, that inf_{τ∈B} f_j(x, τ) and sup_{τ∈B} f_j(x, τ) are
Borel functions of x ∈ ℝ¹.
Let us denote

v_{τ₀}(r) = { τ : |τ − τ₀| < r },  f_j(·, τ) = f_j(τ),

and introduce the assumptions:
VI₁. For any τ₀, θ ∈ Θ and ρ > 0

sup_{j≥1} P_θ^{(j)} { sup_{τ∈v_{τ₀}(r)∩Θ^c} |f_j(τ) − f_j(τ₀)| > ρ } →_{r→0} 0.  (5.3)

VI₂. For any τ₀ ∈ Θ there exists r₀ > 0 such that the sequence of r.v.-s
inf_{τ∈v_{τ₀}(r₀)} f_j(τ) is uniformly integrable in the sense that for any θ ∈ Θ

sup_{j≥1} E_θ^{(j)} | inf_{τ∈v_{τ₀}(r₀)} f_j(τ) | χ{ | inf_{τ∈v_{τ₀}(r₀)} f_j(τ) | > R } →_{R→∞} 0.  (5.4)

The r.v. sup_{τ∈v_{τ₀}(r₀)} f_j(τ) has a similar property.

VI₃. For any θ, τ₀ ∈ Θ there exists ρ₀ > 0 such that for any r ∈ [0, ρ₀]

sup_{τ∈v_{τ₀}(r)} f_j(τ), r ∈ [0, ρ₀], has a similar property.


The condition VI₁ is the condition of equicontinuity in probability of the sequence
of random fields f_j(τ), τ ∈ Θ^c. From VI₂ it follows, in particular, that for
any τ ∈ Θ the sequence f_j(τ) is uniformly integrable with respect to P_θ^N for any
θ ∈ Θ. The following assertion is almost obvious.

LEMMA 13.1: Let us assume that the r.v.-s ξ_n^{(j)}, n ≥ 0, are given on the probability
spaces (Ω^{(j)}, 𝔉^{(j)}, P^{(j)}), j ≥ 1, and

(1) sup_{j≥1} P^{(j)} { |ξ_n^{(j)} − ξ_0^{(j)}| > ρ } →_{n→∞} 0 for any ρ > 0;

(2) the r.v.-s ξ_n^{(j)}, n ≥ 0, j ≥ 1, are uniformly integrable.

Then

sup_{j≥1} E^{(j)} |ξ_n^{(j)} − ξ_0^{(j)}| →_{n→∞} 0.

Proof: Let ρ > 0 be fixed, and

Then for R > ρ

∫_{Ω^{(j)}} |ξ_0^{(j)} − ξ_n^{(j)}| dP^{(j)}

Clearly

I₁^{(j)} ≤ ρ,

I₂^{(j)} →_{n→∞} 0 uniformly in j because of the condition (1), and I₃^{(j)} →_{R→∞} 0 uniformly
in j owing to the condition (2) of uniform integrability.  ∎

LEMMA 13.2: Let the conditions VI₁ and VI₂ be satisfied, and r_n ↓ 0 as n → ∞.
Then for any τ₀, θ ∈ Θ

sup_{j≥1} | E_θ^{(j)} sup_{τ∈v_{τ₀}(r_n)} f_j(τ) − E_θ^{(j)} f_j(τ₀) | →_{n→∞} 0.

Proof: Let us set

ξ_n^{(j)} = sup_{τ∈v_{τ₀}(r_n)} f_j(τ),   resp.  inf_{τ∈v_{τ₀}(r_n)} f_j(τ).

Then the r.v.-s ξ_n^{(j)} satisfy the conditions of Lemma 13.1.  ∎

Let us set

H(τ) = n^{−1} Σ ( f_j(τ) − E_θ^{(j)} f_j(τ) ).

LEMMA 13.3: If the conditions VI₁, VI₂, VI₃ are satisfied, then for any θ ∈ Θ

sup_{τ∈Θ^c} |H(τ)| →_{n→∞} 0  P_θ^N-a.c.  (5.6)

Proof: Let θ ∈ Θ be fixed. By Lemma 13.2, for any ε > 0 and τ₀ ∈ Θ it is possible
to show that there is a neighbourhood v ⊆ v_{τ₀}(r₀) of the point τ₀ such that
simultaneously for all j ≥ 1 and τ ∈ v

E_θ^{(j)} f_j(τ₀) − ε/2 ≤ E_θ^{(j)} inf_{τ∈v} f_j(τ) ≤ E_θ^{(j)} f_j(τ).  (5.7)

Since Θ^c is compact there exists a finite number of points τ₁, …, τ_m ∈ Θ
and corresponding neighbourhoods v₁, …, v_m such that Θ^c ⊂ ∪_{i=1}^m v_i, and for
each neighbourhood v_i the inequality (5.7) is satisfied. From these inequalities, for
i = 1, …, m and τ ∈ v_i, there follow the inequalities

E_θ^{(j)} f_j(τ) − ε ≤ E_θ^{(j)} inf_{τ∈v_i} f_j(τ) ≤ E_θ^{(j)} f_j(τ),

which are satisfied for all j ≥ 1 simultaneously. Therefore

inf_{τ∈Θ^c} H(τ) ≥ min_{1≤i≤m} ( n^{−1} Σ inf_{τ∈v_i} f_j(τ) − n^{−1} Σ E_θ^{(j)} inf_{τ∈v_i} f_j(τ) ) − ε.

Consequently, by condition VI₃ and by reason of the arbitrariness of ε > 0,

liminf_{n→∞} inf_{τ∈Θ^c} H(τ) ≥ 0  P_θ^N-a.c.  (5.8)

Analogously one establishes the inequality

limsup_{n→∞} sup_{τ∈Θ^c} H(τ) ≤ 0  P_θ^N-a.c.  (5.9)

Finally, (5.6) follows from (5.8) and (5.9).  ∎
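Lemma 13.3 asserts a uniform law of large numbers over the compact parameter set. A simulation sketch (model, grid and numerical values all assumed for illustration) for the quadratic contrast f_j(τ) = (X_j − g(j, τ))² shows sup_τ |H(τ)| shrinking with n:

```python
import numpy as np

rng = np.random.default_rng(5)
theta = 1.0                               # assumed true parameter
taus = np.linspace(0.0, 2.0, 201)         # grid over a compact parameter set
sups = []
for n in (100, 1000, 100000):
    y = 1.0 + np.sin(np.arange(1, n + 1))            # bounded regressors
    x = theta * y + rng.normal(0.0, 1.0, n)          # g(j, tau) = tau * y_j
    # H(tau) = n^{-1} sum_j (f_j(tau) - E f_j(tau)), with
    # E f_j(tau) = 1 + (theta - tau)^2 y_j^2 for standard normal errors
    H = np.array([np.mean((x - t * y) ** 2 - (1.0 + (theta - t) ** 2 * y ** 2))
                  for t in taus])
    sups.append(np.abs(H).max())
    print(n, sups[-1])
```

The point of the uniformity is that the same realisation controls H(τ) simultaneously at every τ, which is what the compactness-and-covering argument of the proof delivers.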


THEOREM 13: Let the sequence of functions (5.1) satisfy the property (5.2) as
well as the conditions VI₁, VI₂ and VI₃. Then for any θ ∈ Θ

n^{−1/2} d_n(θ)(θ̂_n − θ) →_{n→∞} 0  P_θ^N-a.c.

Proof: By the definition of the estimator θ̂_n,

 (5.10)

Evidently

H(θ) →_{n→∞} 0  P_θ^N-a.c.  (5.11)

Let X^{(1)}, X^{(2)} ⊆ ℝ^N be sets of total P_θ^N probability on which the relations
(5.8) and (5.11) are satisfied, respectively. Let us fix an elementary event x ∈
X^{(1)} ∩ X^{(2)} and let us assume that

(θ̂_n(x) in fact depends only upon the first n coordinates x₁, …, x_n of the point
x = (x₁, x₂, …) ∈ ℝ^N.) This means that there exists ε₀ > 0 such that for an
infinite sequence of indices n_k, k ≥ 1,

Let Δ(ε₀) be the number from (5.2). Then for n > n₀

inf_{u∈U_n^c(θ)∖v(ε₀)} n^{−1} ( Σ E_θ^N f_j(θ + n^{1/2}d_n^{−1}(θ)u) − Σ E_θ^N f_j(θ) ) > Δ(ε₀),  (5.12)

H(θ) ≤ Δ(ε₀)/4,  (5.13)

inf_{u∈U_n^c(θ)∖v(ε₀)} H(θ + n^{1/2}d_n^{−1}(θ)u) ≥ −Δ(ε₀)/2.  (5.14)

The latter is always possible, since from (5.8) it follows that for any sequence of
sets Θ_n ⊆ Θ^c

liminf_{n→∞} inf_{τ∈Θ_n} H(τ) ≥ 0.

Since

≥ inf_{u∈U_n^c(θ)∖v(ε₀)} H(θ + n^{1/2}d_n^{−1}u) + inf_{u∈U_n^c(θ)∖v(ε₀)} n^{−1} Σ E_θ^N f_j(θ + n^{1/2}d_n^{−1}u),

then from (5.14), for n > n₀, there follows the inequality

From (5.15) we find, for n_k > n₀,

On the other hand, from (5.10), (5.12) and (5.13), for n_k > n₀ we obtain

The inequalities (5.16) and (5.17) contradict each other. Consequently

n^{−1/2} d_n(θ)(θ̂_n − θ) →_{n→∞} 0  P_θ^N-a.c.  ∎

Let us note that in the proof of Theorem 13 only a part of Lemma 13.3 was
used, namely the inequality (5.8).
Let us assume that μ₂ < ∞. For a contrast function let us take

f_j(X_j, θ) = [X_j − g(j, θ)]².

Then

E_θ^N f_j(X_j, τ) = μ₂ + (g(j, θ) − g(j, τ))²,

and the contrast condition (5.2) takes the following form: for any θ ∈ Θ and any
r > 0 there exists Δ = Δ(r) > 0 such that

inf_{u∈U_n^c(θ)∖v(r)} n^{−1} Φ_n(u, θ) > Δ(r).  (5.18)

Let us assume that μ_{2+δ} < ∞ for some δ > 0 and that

A = sup_{j≥1} sup_{θ∈Θ^c} |g(j, θ)| < ∞.

Then

sup_{j≥1} E_θ^{(j)} sup_{τ∈v_{τ₀}(r₀)} |f_j(τ)|^{1+δ/2} < ∞,  (5.19)

and therefore the condition VI₂ of uniform integrability holds. On the other hand,
(5.19) implies that the conditions of Theorem A.7 are satisfied for the r.v.-s

sup_{τ∈v_{τ₀}(r₀)} f_j(τ),   inf_{τ∈v_{τ₀}(r₀)} f_j(τ);

therefore condition VI₃ is satisfied. Let us further remark that

sup_{τ∈v_{τ₀}(r)} |f_j(τ) − f_j(τ₀)| ≤ 2 sup_{τ∈v_{τ₀}(r)} |g(j, τ) − g(j, τ₀)| (|X_j| + A).

Therefore VI₁ is satisfied if the sequence of functions g(j, θ), j ≥ 1, is equicontinuous
on Θ^c, A < ∞, and, for example, μ₁ < ∞.
The remarks made allow us to formulate, for the contrast functions f_j(X_j, θ) =
[X_j − g(j, θ)]², the following:
COROLLARY 13.1: Let us assume, for the contrast functions f_j(X_j, θ) = [X_j − g(j, θ)]²,
that μ_{2+δ} < ∞ for some δ > 0, that the inequality (5.18) is satisfied, and that the sequence
of functions g(j, θ), j ≥ 1, is compact in the space of continuous functions C(Θ^c).
Then the m.c.e. θ̂_n (equal to the l.s.e.) satisfies the conclusion of Theorem 13.

For concrete contrast functions one can present conditions, milder than those
of Theorem 13, which yield the validity of the 'uniform' law of large numbers
(5.6); Theorem 13 then remains valid under these milder assumptions. Let us use
the contrast functions f_j(X_j, θ) = [X_j − g(j, θ)]², for which the requirements of
Corollary 13.1 can be relaxed, to illustrate what has just been said.
THEOREM 14: In the model (0.1) let μ₂ < ∞, let d_n(θ) ≡ n^{1/2} 1_q satisfy the
conditions (3.14) for T = Θ^c, and let

n^{−1} φ_n(θ₁, θ₂) →_{n→∞} φ(θ₁, θ₂)  uniformly in θ₁, θ₂ ∈ Θ^c  (5.20)

be satisfied, where the function φ(θ₁, θ₂) ≥ 0 is continuous on Θ^c × Θ^c, with
φ(θ₁, θ₂) = 0 if and only if θ₁ = θ₂. Then the l.s.e. θ̂_n is a strongly consistent
estimator of the parameter θ ∈ Θ.
Proof: For fixed θ ∈ Θ

H(τ) = s* − μ₂ + 2n^{−1} w(θ, τ)  P_θ^N-a.c.

Consequently

sup_{τ∈Θ^c} |H(τ)| →_{n→∞} 0  P_θ^N-a.c.

if

sup_{τ∈Θ^c} n^{−1} |w(θ, τ)| →_{n→∞} 0  P_θ^N-a.c.  (5.21)

Let us establish (5.21). By the condition (5.20), for any δ > 0 and n > n₀,

sup_{θ₁,θ₂∈Θ^c} ( n^{−1}φ_n(θ₁, θ₂) − φ(θ₁, θ₂) ) < δ.

Therefore for any θ₁, θ₂ ∈ Θ^c and n > n₀

|n^{−1} w(θ₁, θ₂)|² ≤ n^{−1} φ_n(θ₁, θ₂) s* < (φ(θ₁, θ₂) + δ) s*.  (5.22)

Let us show that for a fixed τ ∈ Θ^c

n^{−1} w(θ, τ) →_{n→∞} 0  P_θ^N-a.c.

Clearly

n^{−1}φ_n(θ, τ) − (n − 1)^{−1}φ_{n−1}(θ, τ) = n^{−1}(g(n, θ) − g(n, τ))² − n^{−1}(n − 1)^{−1}φ_{n−1}(θ, τ).

The series

Σ_{n=2}^∞ n^{−1} ( n^{−1}φ_n(θ, τ) − (n − 1)^{−1}φ_{n−1}(θ, τ) )

converges by the Dirichlet criterion, thanks to (5.20). On the other hand, the
series

Σ_{n=2}^∞ n^{−2}(n − 1)^{−1} φ_{n−1}(θ, τ)

also converges by the condition (5.20). Consequently

Σ_{n=2}^∞ n^{−2} (g(n, θ) − g(n, τ))² < ∞

and

n^{−1} w(θ, τ) →_{n→∞} 0  P_θ^N-a.c.

(Theorem A.7).
Let Θ′ ⊂ Θ^c be a countably dense subset of the set Θ^c and let X^{(3)} ⊆ ℝ^N be a set of total
P_θ^N probability for the elementary events of which

n^{−1} w(θ, τ) →_{n→∞} 0,  τ ∈ Θ′,   s* →_{n→∞} μ₂.

sup_{τ∈Θ^c} |n^{−1} w(θ, τ)|²

where v_{τ_i}(δ), i = 1, …, m₀, is a finite covering of Θ^c, τ_i ∈ Θ′. If x ∈ X^{(3)}, then for
n > n₀

s* < μ₂ + ε.

Then from (5.22) and (5.23) it follows that for n > n₀

sup_{τ∈Θ^c} |n^{−1} w(θ, τ)|² ≤ 2ε(μ₂ + ε + 1),

and thus (5.21) is established.


From the definition of the l.s.e. θ̂_n we obtain (cf. Section 1)

But by (5.21)

n^{−1} w(θ̂_n, θ) →_{n→∞} 0  P_θ^N-a.c.

Therefore also

From the latter relation and (3.14) we obtain the assertion of the Theorem in
the following way. Let us assume that for an elementary event x ∈ ℝ^N

n^{−1}φ_n(θ̂_n, θ) →_{n→∞} 0   but   θ̂_n(x) ↛_{n→∞} θ.

This means that there exist r > 0 and a sequence of indices n_k, k ≥ 1, such that

For this sequence

And so we arrive at a contradiction.  ∎


The following conditions are sufficient for (5.20) to hold:

(1)  (5.24)

where the function φ(θ₁, θ₂) is as in the formulation of Theorem 14.

(2) For some α > 0 and c < ∞

sup_{θ₁,θ₂∈Θ^c} n^{−1} φ_n(θ₁, θ₂) |θ₁ − θ₂|^{−α} ≤ c.  (5.25)

Since the condition (5.24) also implies the validity of (3.14), the assertion
coinciding with the result of Jennrich [138] is valid.
COROLLARY 14.1: In the model (0.1) let μ₂ < ∞, d_n(θ) = n^{1/2} 1_q, and let the
functions φ_n(θ₁, θ₂) have the property (5.24). Then for any θ ∈ Θ

θ̂_n →_{n→∞} θ  P_θ^N-a.c.

If (5.25) also holds, then in the formulation of Theorem 14 it is possible to
set

EXAMPLE 5: Let g(j, θ) = g(y_j, θ), y_j ∈ Y ⊆ ℝ^m, θ ∈ Θ^c, j ≥ 1, where the
functions g(y, θ) are continuous and bounded on Y × Θ^c ⊂ ℝ^{m+q}. With respect
to the location of the points y_j = (y_j¹, …, y_j^m) let us assume the following.
We define the "empirical d.f."

where χ(x) = 1 if x > 0 and χ(x) = 0 if x ≤ 0. Let us assume that the sequence
F_n(y) converges weakly to some probability d.f. F(y), i.e., for any continuous and
bounded function a(y), y ∈ Y,

∫_Y a(y) F_n(dy) →_{n→∞} ∫_Y a(y) F(dy).

Then

 (5.26)

uniformly in t = (θ₁, θ₂) ∈ Θ^c × Θ^c = T′.

Let us demonstrate this. Let

and

Δ_n(t) = ∫_Y a(y, t) F_n(dy) − ∫_Y a(y, t) F(dy).

Let v_t ⊂ T′ be some neighbourhood of the point t. Then

Δ_n(τ) ≤ ∫_Y sup_{τ∈v_t} a(y, τ) F_n(dy) − ∫_Y inf_{τ∈v_t} a(y, τ) F(dy),

Δ_n(τ) ≥ ∫_Y inf_{τ∈v_t} a(y, τ) F_n(dy) − ∫_Y sup_{τ∈v_t} a(y, τ) F(dy).

By virtue of the weak convergence of F_n to F,

limsup_{n→∞} sup_{τ∈v_t} |Δ_n(τ)| ≤ ∫_Y ω(y, v_t) F(dy),  (5.27)

where

ω(y, v_t) = sup_{τ₁,τ₂∈v_t} |a(y, τ₁) − a(y, τ₂)|.

Since Δ_n(t) →_{n→∞} 0 for any t ∈ T′, (5.26) follows from (5.27). In order that
(5.24) hold it is sufficient to require that the function g(y, θ) have the following
property: for any θ₁, θ₂ ∈ Θ^c, θ₁ ≠ θ₂, the F-measure of those points y ∈ Y for which
g(y, θ₁) ≠ g(y, θ₂) is positive.
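The weak-convergence hypothesis on the design is easy to probe numerically. In the sketch below (the design is assumed: midpoints y_j equispaced in [0, 1], so that F_n converges weakly to the uniform d.f. F), the F_n-integral of a bounded continuous function approaches its F-integral:

```python
import numpy as np

def a(y):                                 # a bounded continuous test function
    return np.cos(3.0 * y)

exact = np.sin(3.0) / 3.0                 # integral of cos(3y) over [0, 1] w.r.t. F
errs = []
for n in (10, 100, 10000):
    y = (np.arange(1, n + 1) - 0.5) / n   # design points; F_n -> uniform F
    approx = a(y).mean()                  # integral of a w.r.t. the empirical d.f. F_n
    errs.append(abs(approx - exact))
    print(n, errs[-1])                    # the error shrinks as n grows
```

With such a design, any g(y, θ) separating parameters on a set of positive F-measure yields the limit function φ(θ₁, θ₂) of condition (5.24).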

It is interesting to notice that the function

g(j, θ) = θ¹ cos θ²j,   θ ∈ Θ = (0, A) × (h, π − h),

considered in Example 4 of Section 4, does not satisfy the condition (5.20). Clearly,
for this function d_n(θ) ≠ n^{1/2} 1_q as well. Nevertheless, the relation (5.21) can be
obtained in this case too. For this it is sufficient to show that if μ₂ < ∞, then

η_n = sup_{τ∈[0,π]} n^{−1} | Σ_{j=1}^n e^{iτj} ε_j | →_{n→∞} 0  P_θ^N-a.c.  (5.28)

Consequently we obtain sequentially

η_n² = sup_{τ∈[0,π]} n^{−2} | Σ_{k=−n}^{n} e^{−iτk} Σ_{j=1}^{n−|k|} ε_j ε_{|k|+j} |

≤ n^{−2} Σ_{j=1}^n ε_j² + n^{−2} Σ_{k≠0} | Σ_{j=1}^{n−|k|} ε_j ε_{|k|+j} |,

E_θ^N η_n² ≤ μ₂ n^{−1} + n^{−2} Σ_{k≠0} ( E_θ^N ( Σ_{j=1}^{n−|k|} ε_j ε_{|k|+j} )² )^{1/2}

= μ₂ ( n^{−1} + 2n^{−2} Σ_{j=1}^{n−1} (n − j)^{1/2} )

= O(n^{−1/2}).

Let us set

n(m) = [m^α] + 1,  α > 2.

Then

η_{n(m)} →_{m→∞} 0  P_θ^N-a.c.,

since

Σ_{m=1}^∞ E_θ^N η²_{n(m)} < ∞.
Let us consider next the r.v.

ζ_m = max_{n(m)<n≤n(m+1)} |η_n − η_{n(m)}|

≤ ( n(m+1)/n(m) − 1 ) ( η_{n(m)} + n^{−1}(m) Σ_{j=n(m)+1}^{n(m+1)} |ε_j| ).

It is easy to see that

E_θ^N ( n^{−1}(m) Σ_{j=n(m)+1}^{n(m+1)} |ε_j| )² ≤ μ₂ ( (n(m+1) − n(m)) / n(m) )² = O(m^{−2}),

n(m+1)/n(m) →_{m→∞} 1.
n (m) m--+oo

Therefore ζ_m →_{m→∞} 0 P_θ^N-a.c., and consequently (5.28) holds. Since the inequality
(5.18) in the case considered is also satisfied (see Example 4, Section 4), it is then
possible to conclude that for any θ ∈ Θ

n^{−1/2} d_n(θ)(θ̂_n − θ) →_{n→∞} 0  P_θ^N-a.c.,

i.e.,

The relation (5.28), in models with discrete and continuous time, can be used
for the solution of the problem of detecting hidden periodicities. In the language
of the model (0.1) the question is that of the estimation of the parameters of the
regression function

g(j, θ) = Σ_{i=1}^Q ( A_i sin ω_i j + B_i cos ω_i j ),   θ = (A₁, B₁, ω₁, …, A_Q, B_Q, ω_Q).

This problem has an extensive bibliography (see the Commentary).
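For the hidden-periodicities model, the classical computational first step is the periodogram of the observations: its dominant peaks sit near the unknown frequencies ω_i. A minimal sketch (frequencies, amplitudes and sample size are assumed values for illustration, not from the book):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4096
j = np.arange(n)
w1, w2 = 0.8, 2.1                         # assumed hidden frequencies
x = 1.5 * np.sin(w1 * j) + 1.0 * np.cos(w2 * j) + rng.normal(0.0, 1.0, n)

# Periodogram over the Fourier frequencies 2*pi*k/n, k = 0, ..., n/2
per = np.abs(np.fft.rfft(x)) ** 2 / n
freqs = 2.0 * np.pi * np.arange(per.size) / n

# Pick the two dominant peaks, suppressing leakage around the first one
k1 = int(np.argmax(per))
per_rest = per.copy()
per_rest[max(k1 - 3, 0): k1 + 4] = 0.0
k2 = int(np.argmax(per_rest))
est = np.sort(freqs[[k1, k2]])
print(est)                                # near [0.8, 2.1]
```

Periodogram maximisation gives frequency estimates accurate to the Fourier grid; refining them by local nonlinear least squares yields the fast n^{3/2}-rate of Example 4.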


Let us consider one more example of a contrast function, f_j(X_j, θ) = |X_j −
g(j, θ)|, assuming that the ε_j are symmetric r.v.-s. The corresponding m.c.e. is
here the l.m.e. θ̃_n, whose consistency was considered in Theorem
9. From Theorem 13 it is easy to deduce:
COROLLARY 14.2: Let us assume, for the contrast function f_j(X_j, θ) = |X_j −
g(j, θ)| and the model (0.1), that μ_{1+δ} < ∞ for some δ > 0, that the sequence of
functions g(j, θ), j ≥ 1, is compact in C(Θ^c), and that the inequality (5.2) holds:
for any θ ∈ Θ and any r > 0 there exists Δ(r) > 0 such that

inf_{u∈U_n^c(θ)∖v(r)} n^{−1} Σ E_θ^N |X_j − g(j, θ + n^{1/2}d_n^{−1}(θ)u)| ≥ μ₁ + Δ(r).  (5.29)

Then

for any θ ∈ Θ.
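Corollary 14.2 needs only μ_{1+δ} < ∞, so the l.m.e. tolerates error laws far heavier-tailed than those covered by the quadratic-contrast results above. A toy location-model sketch (error law and values assumed): for g(j, θ) ≡ θ the least-moduli criterion is minimised by the sample median, while least squares gives the sample mean.

```python
import numpy as np

rng = np.random.default_rng(2)
theta = 1.5                               # assumed true parameter
n = 5000
eps = rng.standard_t(2, n)                # symmetric, heavy tails: mu_2 = infinity
x = theta + eps                           # location model: g(j, theta) = theta

lme = np.median(x)                        # minimises sum |X_j - t|  (least moduli)
lse = x.mean()                            # minimises sum (X_j - t)^2 (least squares)
print(lme, lse)                           # both near 1.5
```

Here μ₂ = ∞ for the t₂ errors, so the least-squares theory of Theorem 14 does not apply, while μ_{1+δ} < ∞ for δ < 1 keeps the l.m.e. within the scope of Corollary 14.2.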
Let us formulate one more assertion which uses essentially the form of the function

Let us introduce a condition analogous to (5.20):

 (5.30)

where the function φ̃(θ₁, θ₂) ≥ 0 is continuous on Θ^c × Θ^c, and φ̃(θ₁, θ₂) = 0 if
and only if θ₁ = θ₂.

Let us notice that (5.30) is a corollary of (5.20). In fact, if (5.20) is true, then

n^{−1} Σ |g(j, θ₁) − g(j, θ₂)| ≤ ( n^{−1} φ_n(θ₁, θ₂) )^{1/2},

and consequently it is possible to take

φ̃(θ₁, θ₂) = φ^{1/2}(θ₁, θ₂).

THEOREM 15: In the model (0.1) let μ₁ < ∞, d_n(θ) ≡ n^{1/2} 1_q, and let the relations
(3.14), (3.27), (3.28) and (5.30) hold. Then for any θ ∈ Θ

θ̃_n →_{n→∞} θ  P_θ^N-a.c.

Proof: In the given case

H(τ) = n^{−1} Σ |X_j − g(j, τ)| − n^{−1} Σ E_θ^N |X_j − g(j, τ)|,

and for any τ₁, τ₂ ∈ Θ^c

Therefore, as in Theorem 14, using the condition (5.30) it is possible to show that

sup_{τ∈Θ^c} |H(τ)| →_{n→∞} 0  P_θ^N-a.c.,

if for any τ ∈ Θ^c

H(τ) →_{n→∞} 0  P_θ^N-a.c.

For the verification of this fact let us apply Theorem A.8 to the sequence of r.v.-s

ξ_j = |X_j − g(j, τ)| − E_θ^N |X_j − g(j, τ)|.

Using the notation of (3.27), we obtain sequentially

Σ_{j=1}^∞ P_θ^N { |ξ_j| ≥ j } ≤ Σ_{j=1}^∞ P { |ε₁| ≥ j − μ₁ − 2g₀ }

= Σ_{j=1}^∞ Σ_{m=j}^∞ P { m − μ₁ − 2g₀ ≤ |ε₁| < m + 1 − μ₁ − 2g₀ }

= Σ_{m=1}^∞ m P { m − μ₁ − 2g₀ ≤ |ε₁| < m + 1 − μ₁ − 2g₀ }

≤ 2(μ₁ + g₀) < ∞,  (5.31)



i.e., condition (1) of Theorem A.8 is satisfied.

Let us verify that condition (2) is satisfied. As in Section 3 we obtain

Let us estimate the sum of the series

≤ 2 Σ_{m=1}^∞ m P { m − 1 + μ₁ + 2g₀ ≤ |ε₁| < m + μ₁ + 2g₀ }  (5.32)

The series (5.32) is estimated similarly to the series (5.31).

The verification of condition (3) is no different from the verification of condition
(3) of Theorem A.6 in the proof of Theorem 9 for the case s = 1.
To complete the proof of the theorem we refer to the argument of Section 3
related to the conditions (3.27) and (3.28).  ∎

EXAMPLE 6: Let g(j, θ) = ⟨y_j, θ⟩, where y_j, j ≥ 1, is a bounded sequence in ℝ^q, the ε_j are
symmetric r.v.-s with positive probability density, and μ₁ < ∞. Let
us introduce the matrix Y_n by the equality

and let us assume that

Then it is possible to take φ̃(θ₁, θ₂) = (λ*)^{1/2}|θ₁ − θ₂| in the condition (5.30). Consequently,
since the other conditions of Theorem 15 are also satisfied, the l.m.e.
θ̃_n →_{n→∞} θ P_θ^N-a.c. for any θ ∈ Θ.

6 TAKING THE LOGARITHM OF NON-LINEAR MODELS

Let us consider the statistical experiment of the preceding section in which

g(j, θ) = exp{ a(j, θ) }.

The problem of the evaluation of an l.s.e. of an unknown parameter can be simplified
if (1) the logarithms of the observations X_j and of the function g(j, θ) are taken
(the observations X_j, it is understood, must be positive numbers), and (2)
the functional

L̃(θ) = Σ [log X_j − a(j, θ)]²

is used for the definition of the l.s.e. instead of the functional L(θ). We call the
l.s.e. θ̄_n obtained in this way the logarithmic l.s.e.
A particularly natural case for such an estimation procedure is that
of the log-linear model, i.e., when a(j, θ) = ⟨y_j, θ⟩ and Θ = ℝ^q; the logarithmic
l.s.e. is then calculated in explicit form. It is therefore of very great interest to
explain under which conditions on the function a(j, θ) and on the errors of observation
the logarithmic l.s.e. has satisfactory statistical properties. Unfortunately, as
was to be expected, the logarithmic l.s.e. for the observational model (0.1)
with additively entering errors 'rarely' happens to be consistent. This section is
dedicated to the explanation of this property. Up to the end of this section we
shall assume that Θ is a bounded set.
Let us assume that
(1) inf_{j≥1} inf_{θ∈Θ^c} a(j, θ) ≥ a₀ > −∞;

(2) ε_j ≥ −b₀ P-a.c. for some b₀ > 0, where the constants a₀ and b₀ are linked
by the relation b₀e^{−a₀} < 1.
The conditions introduced ensure the formal possibility of taking the
logarithms of the observations X_j for any realisation of the errors ε_j.
EXAMPLE 7: Let a(j, θ) = θ y_j, θ ∈ Θ = (a, b), a > 0, b < ∞. Let us assume
that the logarithmic l.s.e. θ̄_n is a strongly consistent estimator of the parameter θ.
Then for P_θ^N-almost all x ∈ ℝ^N, for n > n₀(x), θ̄_n = θ̄_n(x) satisfies the equality

or

Let us note that

if, in correspondence with Theorem A.7, for example,

Σ_{j=1}^∞ a_j^{−2} y_j² E_θ^N log²(1 + ε_j e^{−θy_j}) < ∞.  (6.1)

But even if the given series converges, the relation

 (6.2)

may not be satisfied, and consequently θ̄_n is not a consistent estimator of θ.
Let us mention one sufficient condition for (6.2) to hold. Assuming that μ₂ <
∞, by Taylor's theorem we find

| E_θ^N log(1 + ε_j e^{−θy_j}) | ≤ (μ₂/2)(e^{θy_j} − b₀)^{−2}.

Let us assume that y_j > 0 and that y_j →_{j→∞} ∞. Then

Consequently (6.2) holds, and if the series (6.1) converges then θ̄_n is a consistent
estimator of θ.

Let us write

φ̄_n(θ₁, θ₂) = Σ (a(j, θ₁) − a(j, θ₂))²,  θ₁, θ₂ ∈ Θ^c,

w̄(τ, θ) = Σ (a(j, τ) − a(j, θ)) log(1 + ε_j g^{−1}(j, θ)),  τ, θ ∈ Θ^c,

E_θ^N w̄(θ̄_n, θ) = E_θ^N w̄(τ, θ) |_{τ=θ̄_n}.

Let us assume the following:

(3) For some α > 0 and c₂ < ∞

sup_{θ₁,θ₂∈Θ^c} n^{−1} φ̄_n(θ₁, θ₂) |θ₁ − θ₂|^{−α} ≤ c₂;

LEMMA 16.1: If the conditions (1)–(4) are satisfied, then for any θ ∈ Θ

n^{−1} φ̄_n(θ̄_n, θ) ≤ 2n^{−1} E_θ^N w̄(θ̄_n, θ) + o(1)  P_θ^N-a.c.,  (6.3)

where o(1) →_{n→∞} 0 P_θ^N-a.c.

Proof: By the definition of the logarithmic l.s.e. θ̄_n,

n^{−1} L̃(θ̄_n) = n^{−1} L̃(θ) − 2n^{−1} ( w̄(θ̄_n, θ) − E_θ^N w̄(θ̄_n, θ) )
− 2n^{−1} E_θ^N w̄(θ̄_n, θ) + n^{−1} φ̄_n(θ̄_n, θ)
≤ n^{−1} L̃(θ)  P_θ^N-a.c.  (6.4)
Let us show that for any θ ∈ Θ

sup_{τ∈Θ^c} n^{−1} | w̄(τ, θ) − E_θ^N w̄(τ, θ) | →_{n→∞} 0  P_θ^N-a.c.  (6.5)

For fixed τ ∈ Θ^c, by Theorem A.7,

n^{−1} w̄(τ, θ) − n^{−1} E_θ^N w̄(τ, θ) →_{n→∞} 0  P_θ^N-a.c.

if

Σ_{j=1}^∞ j^{−2} E_θ^N log²(1 + ε_j g^{−1}(j, θ)) [a(j, τ) − a(j, θ)]² < ∞.  (6.6)

If ε_j ≤ 0 then

log²(1 + ε_j g^{−1}(j, θ)) ≤ log²(1 − b₀ e^{−a₀}).

Let us further note that if ε_j > 0 then

1 + ε_j g^{−1}(j, θ) ≤ 1 + ε_j e^{−a₀} ≤ ε_j² e^{−2a₀}

in the case where

ε_j ≥ ((1 + √5)/2) e^{a₀}.

Therefore

E_θ^N log²(1 + ε_j g^{−1}(j, θ)) ≤ log²(1 − b₀ e^{−a₀}) P{ −b₀ ≤ ε_j ≤ 0 }

+ log²((3 + √5)/2) P{ 0 < ε_j < ((1 + √5)/2) e^{a₀} } + 4 E_θ^N log²(ε_j e^{−a₀}) χ{ ε_j ≥ ((1 + √5)/2) e^{a₀} }

≤ c₃ < ∞  (6.7)

by the condition (4). Consequently (6.6) is valid if

Σ_{j=1}^∞ j^{−2} (a(j, τ) − a(j, θ))² < ∞.

The latter fact is a corollary of (3) and of the compactness of the set Θ^c.
On the other hand, for τ₁, τ₂ ∈ Θ^c we obtain

| n^{−1}(w̄(τ₁, θ) − E_θ^N w̄(τ₁, θ)) − n^{−1}(w̄(τ₂, θ) − E_θ^N w̄(τ₂, θ)) |

≤ ( n^{−1} Σ ( log(1 + ε_j g^{−1}(j, θ)) − E_θ^N log(1 + ε_j g^{−1}(j, θ)) )² )^{1/2} × ( n^{−1} φ̄_n(τ₁, τ₂) )^{1/2}.  (6.8)

Inequality (6.7) shows that

sup_{n≥1} sup_{θ∈Θ^c} n^{−1} Σ E_θ^N log²(1 + ε_j g^{−1}(j, θ)) ≤ c₃ < ∞.

Alternatively, by the same Theorem A.7,

→_{n→∞} 0  P_θ^N-a.c.,

if

Σ_{j=1}^∞ j^{−2} E_θ^N log⁴(1 + ε_j g^{−1}(j, θ)) < ∞.  (6.9)

But the convergence of the series (6.9) follows from (4); i.e., analogously to (6.7)
it is possible to obtain the uniform bound

Therefore (6.5) follows from (6.8) in the same way as the relation (5.21). In this way
we obtain (6.3) from (6.4).  ∎

The result of Lemma 16.1 shows that if the function φ̄_n(θ₁, θ₂) distinguishes
the parameters, for example in the sense of the inequality (3.14), then θ̄_n consistently
estimates θ if

sup_{τ∈Θ^c} | n^{−1} E_θ^N w̄(τ, θ) | →_{n→∞} 0.  (6.10)

It is hardly probable that one can mention a simple sufficient condition for the
fulfilment of (6.10) covering a more or less wide and interesting class of functions
a(j, θ) and errors ε_j. However, using the result of Lemma 16.1, in one case it is
easy to indicate a neighbourhood of θ which contains all the limit points of the
sequence θ̄_n. In addition to the assumptions introduced we shall assume that:
(5) ε_j ∈ (−c, c), j ≥ 1, for some c ∈ (0, b₀], and P{ε_j ≤ 0} = 1/2.
(6) For α > 0 from the condition (3) and some κ > 0

 (6.11)

THEOREM 16: If the conditions (1)–(6) are satisfied, then P_θ^N-a.c.

limsup_{n→∞} |θ̄_n − θ| ≤ κ^{−1/α} c^{2/α} e^{−2a₀/α} ( 1 + min( 1/(1−β), (π²/(6(1−β²)))^{1/2} ) )^{2/α},

where β = c e^{−a₀}.

Proof: Applying the Cauchy–Bunyakovsky inequality to the sum n^{−1} E_θ^N w̄(θ̄_n, θ),
from (6.3) we obtain

n^{−1} φ̄_n(θ̄_n, θ) ≤ 2 ( n^{−1} φ̄_n(θ̄_n, θ) )^{1/2} ( n^{−1} Σ ( E_θ^N log(1 + ε_j g^{−1}(j, θ)) )² )^{1/2} + o(1)  P_θ^N-a.c.,

or

n^{−1} φ̄_n(θ̄_n, θ) ≤ 4 n^{−1} Σ ( E_θ^N log(1 + ε_j g^{−1}(j, θ)) )² + o₁(1)  P_θ^N-a.c.,  (6.12)

where it is possible to take o₁(1) = o^{1/2}(1). Let us estimate

| E_θ^N log(1 + ε_j g^{−1}(j, θ)) | = | ∫ log(1 + x g^{−1}(j, θ)) P(dx) |

≤ | ∫_0^c log(1 + x g^{−1}(j, θ)) P(dx) | + ∫_{−c}^0 ( −log(1 + x g^{−1}(j, θ)) ) P(dx) = I₁ + I₂,

I₁ ≤ (1/2) c g^{−1}(j, θ),

I₂ ≤ −(1/2) log(1 − c g^{−1}(j, θ)) ≤ (1/2) c g^{−1}(j, θ) Σ̄,

Σ̄ = 1 + β/2 + β²/3 + …

The series Σ̄ can be estimated in the following way:

Σ̄ ≤ 1/(1 − β),   or   Σ̄ ≤ ( π²/(6(1 − β²)) )^{1/2}.

Therefore

n^{−1} Σ ( E_θ^N log(1 + ε_j g^{−1}(j, θ)) )² ≤ (c²/4) ( 1 + min( 1/(1−β), (π²/(6(1−β²)))^{1/2} ) )² n^{−1} Σ e^{−2a(j,θ)}.  (6.13)

The assertion of the theorem easily follows from (6.11)–(6.13).  ∎



It is self-evident that the intermediate inequality (6.13) may turn out to be sharper than the one used in the formulation of Theorem 16.

Clearly, in taking the logarithm of the initial statistical model we must distinguish two cases. The first was examined above; the second is related to a model similar to the regression model, but with a multiplicative occurrence of the error:

$$X_j = \exp\{a(j,\theta)\}\,\varepsilon_j,\qquad j = 1,\dots,n.$$

If $\varepsilon_j$ admits a representation of the form $\varepsilon_j = e^{\delta_j}$, where the $\delta_j$ are independent identically distributed r.v.-s with positive variance, then taking the logarithm is a perfectly correct operation, and problems analogous to those considered do not cause any trouble.
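As a quick numerical illustration (my own sketch, not from the book): the following simulates a multiplicative-error model with a hypothetical trend $a(j,\theta) = \theta\, j/n$ and recovers $\theta$ by ordinary least squares after taking logarithms. All names and the specific trend are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 500, 2.0
j = np.arange(1, n + 1)
a = theta * j / n                      # hypothetical trend a(j, theta) = theta * j / n
delta = rng.normal(0.0, 0.3, size=n)   # delta_j i.i.d. with positive variance
eps = np.exp(delta)                    # eps_j = e^{delta_j} > 0
X = np.exp(a) * eps                    # multiplicative model X_j = exp{a(j, theta)} eps_j

# After taking logarithms the model is an ordinary linear regression:
#   log X_j = theta * (j / n) + delta_j
y = np.log(X)
x = j / n
theta_hat = (x @ y) / (x @ x)          # least squares in the transformed model
print(theta_hat)
```

The log-transform turns a positive multiplicative noise into an additive one, which is exactly the "absolutely correct" case described above.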
Chapter 2

Approximation by a Normal Distribution

In this second chapter we find sufficient conditions that must be satisfied by the function $g(j,\theta)$ and the r.v.-s $\varepsilon_j$ in order to ensure that, as $n \to \infty$, the distributions of the normalised differences $\hat\theta_n - \theta$ and $\tilde\theta_n - \theta$ tend uniformly to Gaussian distributions in the proximity of the parameter. The basic result of the chapter consists in obtaining the asymptotic expansion (a.e.) of the distribution of the l.s.e.-s $\hat\theta_n$ (see Sections 10-11), significantly refining the usual statements about the asymptotic normality of statistical estimators.

7 STOCHASTIC ASYMPTOTIC EXPANSION OF LEAST SQUARES ESTIMATORS

In this section a stochastic asymptotic expansion (s.a.e.) is obtained for the normed l.s.e.-s $\hat\theta_n$, i.e., a result approximating $\hat\theta_n$ by a sum of vector polynomials in standardised sums of r.v.-s, with a stochastically small random remainder term. Theorems about s.a.e.-s of an l.s.e. reveal the structure of the l.s.e.-s and are the starting point for the further study of subtle properties of the estimator $\hat\theta_n$.

Let us assume that condition (3.10) is satisfied, that $\Theta \subseteq \mathbb{R}^q$ is an open convex set and that $T \subset \Theta$ is compact. Let us introduce some notation. Let $\alpha = (\alpha_1,\dots,\alpha_q)$ be a multi-index with $|\alpha| = \alpha_1 + \cdots + \alpha_q$. For a smooth function $c(\theta)$ we denote

$$c^{(\alpha)}(\theta) = \frac{\partial^{|\alpha|}}{\partial\theta_1^{\alpha_1}\cdots\partial\theta_q^{\alpha_q}}\, c(\theta).$$

We also make use of other notations for derivatives. Let $k \ge 2$ be an integer. Then for $r = 1,\dots,k$ and $i_1,\dots,i_r = 1,\dots,q$

$$c_{i_1\cdots i_r}(\theta) = \frac{\partial^r}{\partial\theta_{i_1}\cdots\partial\theta_{i_r}}\, c(\theta).$$

Thus, for example,

$$c_{ij}^{(\alpha)} = \Big(\frac{\partial^2}{\partial\theta_i\,\partial\theta_j}\, c\Big)^{(\alpha)},\qquad\text{etc.}$$

A. V. Ivanov, Asymptotic Theory of Nonlinear Regression. Springer Science+Business Media Dordrecht, 1997.

Throughout this chapter we shall assume that the functions $g(j,\theta)$ possess, for each $j$, continuous partial derivatives with respect to the variables $\theta = (\theta^1,\dots,\theta^q)$ in $\Theta$ up to order $k \ge 2$ inclusive.

Let us set

$$f^{(\alpha)}(j,u) = g^{(\alpha)}\big(j,\theta + n^{1/2}d_n^{-1}(\theta)u\big).$$

Analogously $f_i^{(\alpha)}$, $f_{il}^{(\alpha)}$ will denote the functions $g_i^{(\alpha)}$, $g_{il}^{(\alpha)}$ with the same complicated arguments.

Let us denote

$$\Phi_n^{(\alpha)}(u_1,u_2) = \sum\big(f^{(\alpha)}(j,u_1) - f^{(\alpha)}(j,u_2)\big)^2,\qquad u_1,u_2 \in U_n^c(\theta),$$

$$d_n^2(\alpha;\theta) = \sum\big(g^{(\alpha)}(j,\theta)\big)^2.$$

If $u = (u^1,\dots,u^q)$ and $\alpha$ is a multi-index, then

$$u^\alpha = (u^1)^{\alpha_1}\cdots(u^q)^{\alpha_q}.$$

In particular,

$$d_n^\alpha(\theta) = d_{1n}^{\alpha_1}(\theta)\cdots d_{qn}^{\alpha_q}(\theta).$$


Let $e_i \in \mathbb{R}^q$ be the vector whose $i$th coordinate is 1 and all the others are zero. Let us set

$$b(\alpha;u) = n^{(|\alpha|-1)/2}\big(d_n^\alpha(\theta)\big)^{-1}\sum \varepsilon_j\, f^{(\alpha)}(j,u),$$

$$b_i(\alpha;u) = n^{|\alpha|/2}\big(d_n^{\alpha+e_i}(\theta)\big)^{-1}\sum \varepsilon_j\, f_i^{(\alpha)}(j,u),$$

$$b_{il}(\alpha;u) = n^{(|\alpha|+1)/2}\big(d_n^{\alpha+e_i+e_l}(\theta)\big)^{-1}\sum \varepsilon_j\, f_{il}^{(\alpha)}(j,u),$$

$$b_i(\alpha) = b_i(\alpha;0),\qquad b_{il}(\alpha) = b_{il}(\alpha;0),\qquad\text{etc.},$$

$$a(\alpha;u) = n^{-1}\mathsf{E}_\theta L^{(\alpha)}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big).$$

Analogously we write $a_i(\alpha;u)$, $a_{ij}(\alpha;u)$, $a_{ijl}(\alpha;u)$ for the correspondingly normed mathematical expectations of the derivatives $L_i^{(\alpha)}$, $L_{ij}^{(\alpha)}$, $L_{ijl}^{(\alpha)}$, etc. We also write

$$A(\theta) = \big(A^{il}(\theta)\big)_{i,l=1}^q = I^{-1}(\theta),$$

the matrix $I(\theta)$ being defined in Section 4, and

$$G(j,\theta) = [X_j - g(j,\theta)]^2.$$



For the formulation of the assertions of Section 7 the following conditions will be needed.

II. For any $R > 0$ there exist constants $c_i(\alpha,R) < \infty$, $i = 1,2$, such that

(1) $\displaystyle\sup_{\theta\in T}\ \sup_{u\in v^c(R)\cap U_n^c(\theta)} n^{(|\alpha|-1)/2}\big(d_n^\alpha(\theta)\big)^{-1}\, d_n\big(\alpha;\theta + n^{1/2}d_n^{-1}(\theta)u\big) \le c_1(\alpha,R),\quad |\alpha| = 1,\dots,k;\qquad(7.1)$

(2) $\displaystyle\sup_{\theta\in T}\ \sup_{u_1,u_2\in v^c(R)\cap U_n^c(\theta)} n^{|\alpha|-1}\big(d_n^\alpha(\theta)\big)^{-2}\,\Phi_n^{(\alpha)}(u_1,u_2)\,|u_1 - u_2|^{-2} \le c_2(\alpha,R),\quad |\alpha| = k.\qquad(7.2)$

Let us note that in (7.1) for $|\alpha| = 1$

$$d_n(e_i;\theta) \equiv d_{in}(\theta),\qquad i = 1,\dots,q,$$

and (7.1) is an extension of conditions (4.1) and (4.8). In exactly the same way (7.2) generalises condition (4.3).

III. If $g^{(\alpha)}(j,\theta) \not\equiv 0$ then

$$\varliminf_{n\to\infty}\ \inf_{\theta\in T}\ n^{(|\alpha|-1)/2}\big(d_n^\alpha(\theta)\big)^{-1}\, d_n(\alpha;\theta) > 0,\qquad |\alpha| = 2,\dots,k.\qquad(7.3)$$

IV. For any integer $m \ge 3$

$$\varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ n^{m/2-1}\, d_n^{-m}(\alpha;\theta)\sum\big|g^{(\alpha)}(j,\theta)\big|^m < \infty.\qquad(7.4)$$

Clearly, for IV to be satisfied a sufficient condition is

IV$'$. $\displaystyle\varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ n^{1/2}\, d_n^{-1}(\alpha;\theta)\max_{1\le j\le n}\big|g^{(\alpha)}(j,\theta)\big| < \infty.\qquad(7.5)$

Let us introduce a sequence of numbers

$$\tau_n \ge c_*\log^{k/2} n,$$

$c_* > 0$ being a constant, the value of which will be defined below.


THEOREM 17: Let the conditions $I_3$ ($\mu_m < \infty$ for some integer $m \ge 3$), II, III, IV and V (condition (4.7)) be satisfied. Then for some $r_0 > 0$ and $c_0 > 0$

$$\sup_{\theta\in T} P_\theta^n\Big\{\Big|u_n(\theta) - \sum_{\nu=0}^{k-2} n^{-(\nu+1)/2} h_\nu(\theta)\Big| \ge c_0\,\tau_n\, n^{-k/2}\Big\} = O\big(n^{-(m-2)/2}\tau_n^{-m/k}\big),\qquad(7.6)$$

where $h_\nu(\theta)$, $\nu = 0,\dots,k-2$, are homogeneous vector polynomials of degree $\nu+1$ in the random variables $b(\alpha;0)$, $|\alpha| = 1,\dots,\nu+1$, with coefficients uniformly bounded in $n$ and $\theta \in T$.
The proof of Theorem 17 will be carried out according to the plan of the work [59]. Let us first prove a few lemmas.

LEMMA 17.1: Condition (7.2) for the vectors $\alpha$, $0 \le |\alpha| \le k-1$, follows from the relations (7.1) satisfied for $\alpha + e_i$, $i = 1,\dots,q$.

Proof: The proof consists of the application of the finite increments formula to the functions $n^{|\alpha|-1}\big(d_n^\alpha(\theta)\big)^{-2}\Phi_n^{(\alpha)}(u_1,u_2)$, $u_1,u_2 \in v^c(R)\cap U_n^c(\theta)$, and is analogous to the proof of Lemma 12.1 of Section 4.
Let us write the Maclaurin expansion in terms of the variable $u$ for the gradient of the function $n^{-1}L\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)$:

$$n^{-1}\nabla L\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) = -\,2n^{-1/2}B(0;\theta) + 2I(\theta)u - 2n^{-1/2}B^{(2)}(\theta)u + \sum_{2\le|\alpha|\le k-1}\frac{1}{\alpha!}\big(A(\alpha;\theta) - 2n^{-1/2}B(\alpha;\theta)\big)u^\alpha + \zeta(u),\qquad(7.7)$$

$$\zeta_i(u) = \sum_{|\alpha|=k-1}\frac{1}{\alpha!}\Big(n^{-1}\sum\big(G_i^{(\alpha)}\big(j,\theta + n^{1/2}d_n^{-1}(\theta)\bar u_i\big) - G_i^{(\alpha)}(j,\theta)\big)\Big)u^\alpha,\qquad |\bar u_i| \le |u|.\qquad(7.8)$$

The analogous expansion for the function $n^{-1}L_{il}$, $i,l = 1,\dots,q$, has the form

$$n^{-1}L_{il}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) = 2I_{il}(\theta) - 2n^{-1/2}b_{il}(\theta) + \sum_{1\le|\alpha|\le k-2}\frac{1}{\alpha!}\big(a_{il}(\alpha;\theta) - 2n^{-1/2}b_{il}(\alpha;\theta)\big)u^\alpha + \zeta_{il}(u),\qquad(7.9)$$

the remainder term $\zeta_{il}(u)$ being defined analogously to (7.8). (7.10)
If $k = 2$ then the sums in (7.7) and (7.9) are absent, and the remainder term in (7.10) is equal to

$$\zeta_{il}(u) = n^{-1}L_{il}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) - n^{-1}L_{il}(\theta).$$

LEMMA 17.2: Let $|u| \le \delta < 1$ and let the event $\{s^* \le \mu_2 + 1\}$ be realised. Then if condition II is satisfied we have

$$|\zeta_{il}(u)| \le c\,\delta,\qquad i,l = 1,\dots,q,$$

for some constant $c = c(T) < \infty$.

Proof: Let us show that the required bound holds for fixed $|\alpha| = k$. Let us note that the quantity to be estimated can be written as a sum $\zeta^{(1)}(\alpha;u) + \zeta^{(2)}(\alpha;u)$, where

$$\zeta^{(1)}(\alpha;u) = 2n^{(|\alpha|/2)-1}\big(d_n^\alpha(\theta)\big)^{-1}\sum\Big(\varepsilon_j\big(f^{(\alpha)}(j,0) - f^{(\alpha)}(j,u)\big) + f^{(\alpha)}(j,u)\big(f(j,u) - f(j,0)\big)\Big),$$

$$\zeta^{(2)}(\alpha;u) = n^{(|\alpha|/2)-1}\big(d_n^\alpha(\theta)\big)^{-1}\sum_{\substack{\beta+\gamma=\alpha\\ |\beta|,|\gamma|\ge 1}} c(\beta,\gamma)\sum_j\big(f^{(\gamma)}(j,u)f^{(\beta)}(j,u) - f^{(\gamma)}(j,0)f^{(\beta)}(j,0)\big),$$

where the $c(\beta,\gamma)$ are integer constants. Thanks to the conditions of the Lemma and the statement of Lemma 17.1,

$$|\zeta^{(1)}(\alpha;u)| \le 2(s^*)^{1/2}\, n^{(|\alpha|-1)/2}\big(d_n^\alpha(\theta)\big)^{-1}\big(\Phi_n^{(\alpha)}(u,0)\big)^{1/2} + 2\big(n^{-1}\Phi_n(u,0)\big)^{1/2}\, n^{(|\alpha|-1)/2}\big(d_n^\alpha(\theta)\big)^{-1}\, d_n\big(\alpha;\theta + n^{1/2}d_n^{-1}(\theta)u\big),\qquad(7.11)$$

in which it is possible to take

$$c_3 = 4\sum_{i=1}^q c_2(e_i,1).$$
Analogously, for fixed $\beta$ and $\gamma$,

$$n^{(|\alpha|/2)-1}\big(d_n^\alpha(\theta)\big)^{-1}\Big|\sum\big(f^{(\gamma)}(j,u)f^{(\beta)}(j,u) - f^{(\gamma)}(j,0)f^{(\beta)}(j,0)\big)\Big|$$
$$\le \Big(n^{(|\gamma|-1)/2}\big(d_n^\gamma(\theta)\big)^{-1}\big(\Phi_n^{(\gamma)}(u,0)\big)^{1/2}\Big)\Big(n^{(|\beta|-1)/2}\big(d_n^\beta(\theta)\big)^{-1}\, d_n\big(\beta;\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big)$$
$$+\ \Big(n^{(|\beta|-1)/2}\big(d_n^\beta(\theta)\big)^{-1}\big(\Phi_n^{(\beta)}(u,0)\big)^{1/2}\Big)\Big(n^{(|\gamma|-1)/2}\big(d_n^\gamma(\theta)\big)^{-1}\, d_n\big(\gamma;\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big)$$
$$\le \big(c_4^{1/2}\, c_1(\beta,1) + c_5^{1/2}\, c_1(\gamma,1)\big)\,\delta,\qquad(7.12)$$

where one can take

$$c_4 = 4\sum_{i=1}^q c_2(\gamma + e_i,1),\qquad c_5 = 4\sum_{i=1}^q c_2(\beta + e_i,1).$$

The assertion of the Lemma follows from (7.8), (7.10)-(7.12), and from the obvious inequality $|u^\alpha| \le |u|^{|\alpha|}$.
Let us write

$$L^{(2)} = (L_{il})_{i,l=1}^q$$

for the Hessian of the function $L$.

LEMMA 17.3: Let $\theta \in T$, let the events $\{s^* \le \mu_2 + 1\}$, $\{n^{-1/2}|b(\alpha;0)| \le \delta\}$, $|\alpha| = 2,\dots,k$, be realised, and let the conditions II and V (condition (4.7)) be satisfied. Then a number $r_0 = r_0(T) > 0$ can be found such that for $n > n_0$

$$\inf_{u\in v^c(r_0)\cap U_n^c(\theta)} \lambda_{\min}\Big(n^{-1}L^{(2)}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big) > \lambda_0.$$

Proof: Let us take advantage of the relation [212]:

$$\Big|\lambda_{\min}\Big(n^{-1}L^{(2)}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big) - \lambda_{\min}\big(2I(\theta)\big)\Big| \le q\max_{1\le i,l\le q}\Big|n^{-1}L_{il}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) - 2I_{il}(\theta)\Big|.\qquad(7.13)$$

Since $n^{-1/2}|b_{il}(\theta)| \le \delta$ in the right hand side of the expansion (7.9) by the condition of the Lemma, and the sum of the subsequent terms does not exceed $c_6\delta$, $c_6 = c_6(T) < \infty$, by the conditions of the Lemma and Lemma 17.2, the right hand side of (7.13) is not larger in value than $q(2+c_6)\delta$. Consequently the Lemma holds for $r_0 \le \delta \le \lambda_0 q^{-1}(2+c_6)^{-1}$.
If the event $\{|u_n(\theta)| \le r_0\}$ is realised and the conditions of Lemma 17.3 are satisfied, then the mapping

$$u \longrightarrow n^{-1}L\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)$$

is convex on the ball $v(r_0)$, and the system of equations

$$\nabla L\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) = 0\qquad(7.14)$$

has a unique solution coinciding with $\hat\theta_n$. In this way

$$0 = n^{-1}\nabla L(\hat\theta_n) = -\,2n^{-1/2}B(0;\theta) + n^{-1}L^{(2)}\big(\theta + n^{1/2}d_n^{-1}(\theta)u^*\big)\,u_n(\theta),\qquad |u^*| \le |u_n(\theta)|,\qquad(7.15)$$

and by Lemma 17.3

$$|u_n(\theta)| = 2n^{-1/2}\Big|\Big(n^{-1}L^{(2)}\big(\theta + n^{1/2}d_n^{-1}(\theta)u^*\big)\Big)^{-1}B(0;\theta)\Big| \le 2n^{-1/2}\lambda_0^{-1}\,|B(0;\theta)|.\qquad(7.16)$$


Let $n^{-1/2} = t$. Let us denote by $\mathcal{L}_{k-1}(u,t)$ the expansion (7.7) without the remainder term, and let $\mathcal{L}_\infty(u,t)$ be the series obtained from $\mathcal{L}_{k-1}(u,t)$ by the formal continuation of the summation to infinity:

$$\mathcal{L}_\infty(u,t) = \sum_{|\alpha|=0}^{\infty}\frac{1}{\alpha!}\big(A(\alpha;\theta) - 2tB(\alpha;\theta)\big)u^\alpha.\qquad(7.17)$$

Let

$$\bar u(t) = t\bar h_1 + \cdots + t^{k-1}\bar h_{k-1} + \cdots$$

be a formal expansion in a series of the solution of the equation $\mathcal{L}_\infty(u,t) = 0$. On substitution of the initial segment

$$\bar u^{(k-1)}(t) = t\bar h_1 + \cdots + t^{k-1}\bar h_{k-1}\qquad(7.18)$$

into $\mathcal{L}_\infty(u,t)$ the terms containing $t^i$, $i = 1,\dots,k-1$, vanish. Therefore it is possible to write

$$\mathcal{L}_\infty\big(\bar u^{(k-1)}(t),t\big) = \sum_{i\ge k} t^i\,\bar h_{i,k-1}.\qquad(7.19)$$
LEMMA 17.4: All

$$\bar h_i = \bar h_i(\theta),\quad i = 1,\dots,k-1,\qquad \bar h_{i,k-1} = \bar h_{i,k-1}(\theta),\quad i \ge k,$$

from the representations (7.18) and (7.19) are $q$-dimensional vectors, the coordinates of which are homogeneous polynomials of degree $i$ in $b(\alpha;0)$, $|\alpha| = 1,\dots,i$, with coefficients uniformly bounded in $n$ and $\theta \in T$.

Proof: The proof proceeds by induction on $k$. If $k = 2$ then

$$\bar h_1 = A(\theta)B(0;\theta)$$

and the assertion for $\bar h_{i,1}$, $i \ge 2$, is verified immediately.

Let the assertion be true for some $k \ge 2$. Then $\bar h_k$ is defined by the condition of the vanishing of the coefficient of $t^k$ in the expression (7.20), in which the discarded terms are of degree in $t$ larger than $k$. From (7.19) we find

$$\bar h_k = \tfrac12 A(\theta)\,\bar h_{k,k-1}$$

and the assertion about $\bar h_k$ follows from the induction hypothesis.

The quantity $b(\alpha;0)$ with $|\alpha| \ge k+1$ occurs in $\mathcal{L}_\infty(\bar u^{(k)}(t),t)$ only in terms of the form $tB(\beta;\theta)\big(\bar u^{(k)}(t)\big)^\beta/\beta!$, $|\beta| = |\alpha| - 1$, which contain $t$ in powers not less than $|\alpha|$. Therefore $b(\alpha;0)$ does not enter $\bar h_{i,k}$ if $|\alpha| > i$.

Let us note that on substitution into $\mathcal{L}_\infty(\bar u^{(k)}(t),t)$ of the quantities $t^{-1}b(\alpha;0)$ instead of $b(\alpha;0)$, the series $\mathcal{L}_\infty(\bar u^{(k)}(t),t)$ does not depend upon $t$; this follows from (7.17) and the property of $\bar u^{(k)}(t)$ established above. Therefore the $\bar h_{i,k}$ are homogeneous polynomials of degree $i$ in the variables $b(\alpha;0)$, $|\alpha| = 1,\dots,i$.

Clearly the function $\mathcal{L}_{k-1}(u,t)$ is obtained from $\mathcal{L}_\infty(u,t)$ when

$$a(\alpha;0) = b(\alpha;0) = 0,\qquad |\alpha| \ge k.$$

Let us keep the notations $\bar u^{(k-1)}(t)$, $\bar h_i$, $\bar h_{i,k-1}$ as applied to $\mathcal{L}_{k-1}(u,t)$. In particular, instead of (7.19) we obtain the relation

$$\mathcal{L}_{k-1}\big(\bar u^{(k-1)}(t),t\big) = \sum_{i\ge k} t^i\,\bar h_{i,k-1},\qquad(7.21)$$

in which the sum contains a finite number of terms.


Proof of Theorem 17: We show that if the inequalities

$$|u_n(\theta)| < r_0,\qquad s^* < \mu_2 + 1,\qquad |b(\alpha;0)| < c_7\tau_n^{1/k},\quad |\alpha| = 1,\dots,k,\qquad(7.22)$$

are satisfied, then

$$\big|u_n(\theta) - \bar u^{(k-1)}(t)\big| \le c_8\,\tau_n\, t^k.\qquad(7.23)$$

Hence the assertion of the theorem follows. In fact, let us set $c_*^{1/k} = c_7$. Then from (7.23) we obtain

$$\sup_{\theta\in T} P_\theta^n\big\{|u_n(\theta) - \bar u^{(k-1)}(t)| \ge \tau_n t^k\big\} \le \sum_{1\le|\alpha|\le k}\ \sup_{\theta\in T} P_\theta^n\big\{|b(\alpha;0)| \ge c_7\tau_n^{1/k}\big\} + o\big(n^{-(m-2)/2}\big).\qquad(7.24)$$

Let us, for a fixed $\alpha$, estimate the probability $P_\theta^n\{|b(\alpha;0)| \ge c_7\tau_n^{1/k}\}$, taking for the r.v.-s $\xi_{jn}$, $j = 1,\dots,n$, in the formulation of Theorem A.5 the normed summands of $b(\alpha;0)$. Thanks to (7.1), (7.3) and (7.4) the r.v.-s $\xi_{jn}$ satisfy the conditions of Theorem A.5 if

$$\tau_n^{1/k} \ge c_7^{-1}\mu_2^{1/2}\max_{1\le|\alpha|\le k} c_1(\alpha,0)\,(m-2+\Delta)^{1/2}\log^{1/2} n,\qquad \Delta > 0.$$

In this case it is possible to write

$$\sup_{\theta\in T} P_\theta^n\big\{|b(\alpha;0)| \ge c_7\tau_n^{1/k}\big\} \le \varkappa_n(T)\, n^{-(m-2)/2}\,\tau_n^{-m/k},\qquad(7.25)$$

where $\varkappa_n(T)$ is a bounded sequence.


Let us remark that if we replace IV by IV$'$ in the conditions of the theorem just proved, then the relation (7.26) is true for the r.v.-s $\xi_{jn}$ introduced (the quantity $\sigma_n(\theta)$ is defined in Theorem A.5). From (7.26) it follows that in (7.25)

$$\varkappa_n(T) \underset{n\to\infty}{\longrightarrow} 0$$

(see the Remark to Theorem A.5).

Passing on to the proof of (7.23), let us note that the inequalities (7.22) and the conditions of the Theorem allow advantage to be taken of Lemma 17.3, and we conclude that $\hat\theta_n$ is the solution of the system of equations (7.14), whence (7.16) is satisfied.
By hypothesis $y_n \to 0$; in addition, from (7.16) and (7.22) it follows that the relation (7.27) holds.

Since the terms $t^i\bar h_i$ and $t^i\bar h_{i,k-1}$ in (7.18) and (7.21) are homogeneous in $t\,b(\alpha;0)$, the inequalities (7.28) follow. From Lemma 17.2, (7.27), and the first of the inequalities (7.28) it follows that (7.29) holds. From the second of the inequalities (7.28) and the second of the inequalities (7.29) we obtain (7.30).

Using (7.7), (7.29), and the inequality (14.12) of the book [33], we estimate the quantity

$$t^2\,\big|\nabla L(\hat\theta_n) - \nabla L\big(\theta + t^{-1}d_n^{-1}(\theta)\,\bar u^{(k-1)}(t)\big)\big|$$
$$\ge \big|\mathcal{L}_{k-1}(u_n(\theta),t) - \mathcal{L}_{k-1}(\bar u^{(k-1)}(t),t)\big| - \big|\zeta(u_n(\theta))\big| - \big|\zeta(\bar u^{(k-1)}(t))\big|$$
$$\ge \big|u_n(\theta) - \bar u^{(k-1)}(t)\big|\Big(2\lambda_0 - 2q\max_{1\le i,l\le q}|b_{il}(\theta)| - \cdots\Big).\qquad(7.31)$$

Since $\nabla L(\hat\theta_n) = 0$, we obtain (7.23) from (7.30) and (7.31), whence we can take $c_8 = (c_{13} + c_{14} + 2c_{15})\lambda_0^{-1}$.
We find the exact form of the polynomials $h_\nu = \bar h_{\nu+1}$ occurring in the formulation of Theorem 17 by substituting $\bar u^{(k-1)}(t)$ in (7.17) and equating to zero the coefficients of the powers $t^i$ (see Lemma 17.4). In order to write the vector polynomials $h_0$, $h_1$ and $h_2$ obtained in this way let us adopt the tensor summation convention, as it is constantly used below: if in a product of two or more factors any index appears twice, then summation over all the values of this index from 1 to $q$ is understood. For example,

$$\sum_{i_1,i_2,i_3=1}^{q} A^{ii_1}(\theta)A^{i_2i_3}(\theta)\,b_{i_1i_2}(\theta)\,b_{i_3}(\theta) = A^{ii_1}(\theta)A^{i_2i_3}(\theta)\,b_{i_1i_2}(\theta)\,b_{i_3}(\theta),$$

etc. We shall also write $\Pi_{(\cdot)(\cdot)}(\theta)$ for the corresponding normed sums of products of derivatives of $g$. With these notations

$$a_{ij}(\theta) = 2\Pi_{(i)(j)}(\theta) = 2I_{ij}(\theta),\qquad(7.32)$$

$$a_{ijk}(\theta) = 2\big(\Pi_{(i)(jk)}(\theta) + \Pi_{(j)(ik)}(\theta) + \Pi_{(k)(ij)}(\theta)\big),\qquad(7.33)$$

$$a_{ijkl}(\theta) = 2\big(\Pi_{(ijk)(l)}(\theta) + \Pi_{(ijl)(k)}(\theta) + \Pi_{(ikl)(j)}(\theta) + \cdots\big).\qquad(7.34)$$
Omitting the argument $\theta$ we obtain

$$h_0 = \big(A^{ii_1}\, b_{i_1}\big)_{i=1}^q,\qquad(7.35)$$

with the expressions (7.36) and (7.37) for $h_1$ and $h_2$ written out analogously. Only the notation $b_{i_1i_2i_3}(\theta)$ needs clarification. In general, for $r = 1,\dots,k$, $i_1,\dots,i_r = 1,\dots,q$, we assume

$$b_{i_1\dots i_r}(\theta) = n^{(r-1)/2}\big(d_{i_1n}(\theta)\cdots d_{i_rn}(\theta)\big)^{-1}\sum \varepsilon_j\, g_{i_1\dots i_r}(j,\theta).\qquad(7.38)$$

Let us derive the general recursion relations for the calculation of the polynomials $h_\nu$ in the case where the normalisation $n^{1/2}\mathbb{1}_q$ is used instead of the normalisation $d_n(\theta)$. In this case the expressions for the functions $a_{i_1\dots i_r}(\theta)$ and the sums $b_{i_1\dots i_r}(\theta)$ are considerably simplified. For example, (7.38) turns into the expression

$$b_{i_1\dots i_r}(\theta) = n^{-1/2}\sum \varepsilon_j\, g_{i_1\dots i_r}(j,\theta).\qquad(7.39)$$

Assuming that the functions $g(j,\theta)$ are infinitely differentiable we can write (cf. (7.7) and (7.17)) the expansion (7.40) of $n^{-1}L_i$, $i = 1,\dots,q$. Substituting in (7.40) the formal expansion

$$\bar u = n^{1/2}(\hat\theta_n - \theta) = \sum_{\nu=0}^{\infty} h_\nu(\theta)\, n^{-\nu/2}\qquad(7.41)$$

we find (where $h_{\alpha_j}^{i_j}$ is the $i_j$th coordinate of the vector $h_{\alpha_j}$)

$$0 = n^{-1}L_i(\hat\theta_n) = \sum_{r=0}^{\infty}\frac{1}{r!}\big(a_{ii_1\dots i_r}(\theta) - 2n^{-1/2}b_{ii_1\dots i_r}(\theta)\big)\sum_{\alpha_1,\dots,\alpha_r=0}^{\infty} h_{\alpha_1}^{i_1}(\theta)\cdots h_{\alpha_r}^{i_r}(\theta)\, n^{-(r-1)/2-(\alpha_1+\cdots+\alpha_r)/2},\quad i = 1,\dots,q.\qquad(7.42)$$

Equating the coefficients of $n^{-\nu/2}$ in (7.42) to zero we obtain the desired recurrence relations (7.43), $i = 1,\dots,q$.

It is important to emphasise that in the sums (7.43) the integer vectors $\alpha^{(r)} = (\alpha_1,\dots,\alpha_r)$ contain $r$ coordinates for each $r$.
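For intuition (my own sketch, not the book's), the leading term of the expansion can be checked by simulation: in a one-parameter model the normalised l.s.e. $d_n(\theta)(\hat\theta_n - \theta)$ should be close to the linearisation term $b^{(1)}(\theta) = d_n^{-1}(\theta)\sum\varepsilon_j g'(j,\theta)$, the discrepancy being of higher order. The regression function $e^{\theta t_j}$ below is an assumption chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta, sigma = 2000, 0.7, 0.5
t = np.arange(1, n + 1) / n
g  = lambda th: np.exp(th * t)        # hypothetical scalar regression function
dg = lambda th: t * np.exp(th * t)    # derivative in theta

eps = rng.normal(0.0, sigma, size=n)
X = g(theta) + eps

# Gauss-Newton iterations for the one-dimensional least squares estimator
th = 0.0
for _ in range(50):
    r = X - g(th)
    th += (r @ dg(th)) / (dg(th) @ dg(th))

dn = np.sqrt(dg(theta) @ dg(theta))   # d_n(theta)
lead = (eps @ dg(theta)) / dn         # leading expansion term b^{(1)}(theta)
print(dn * (th - theta), lead)
```

The two printed numbers agree up to a stochastically small remainder, which is the content of the s.a.e. at first order.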
The next theorem is closely related to the one just proved.

THEOREM 18: Let the estimator $\hat\theta_n$ have the following property: for any $r > 0$

$$\sup_{\theta\in T} P_\theta^n\big\{|u_n(\theta)| \ge r\big\} = o\big(n^{-(m-2)/2}\big).\qquad(7.44)$$

Then under the conditions of Theorem 17

$$\sup_{\theta\in T} P_\theta^n\Big\{\Big|d_n(\theta)(\hat\theta_n - \theta) - \sum_{\nu=0}^{k-2} n^{-\nu/2} h_\nu(\theta)\Big| \ge c_*\, n^{-(k-1)/2}\log^{k/2} n\Big\} = o\big(n^{-(m-2)/2}\big).\qquad(7.45)$$

Proof: The relation (7.44) was established in Theorem 8 of Section 3. The conditions of Theorem 17 are sufficient for the l.s.e.-s $\hat\theta_n$ to have the property (7.44) if (3.11) and (3.12) are added to them. By virtue of Remark 8.2, for the normalisation $n^{1/2}\mathbb{1}_q$ and bounded $\Theta$ it follows that we should add only (3.14) instead of (3.11) and (3.12). Since $\pi(1) = o(n^{-(m-2)/2})$ in Theorem A.4, then (7.45) follows from (7.6) if we take $\tau_n = c_*\log^{k/2} n$.
In the process of proving Theorem 17 we obtained simultaneously:

THEOREM 19: Let the conditions of Theorem 17 be satisfied for $k = 2$, as well as (7.44). Then for any fixed $\Delta > 0$

$$\sup_{\theta\in T} P_\theta^n\Big\{|u_n(\theta)| \ge 2\lambda_0^{-1} q\,(m-2+\Delta)^{1/2}\mu_2^{1/2}\, n^{-1/2}\log^{1/2} n\Big\} = o\big(n^{-(m-2)/2}\big).\qquad(7.46)$$

Proof: The required estimate follows from (7.16) and Theorem A.5 applied to the sums of the r.v.-s $b_i(\theta)$, $i = 1,\dots,q$.

REMARK 19.1: The estimate (7.46) sharpens the estimate (4.11) of Theorem 12. However, Theorem 12 was obtained without the assumption of the existence of the second derivatives of the regression function $g(j,\theta)$.

EXAMPLE 8: (See Example 4 of Section 4.) Using Theorem 19, the bound (4.29) for the function $g(j,\theta) = \theta_1\cos\theta_2 j$ can be sharpened. Since now $m = 3$ and

$$I(\theta) = \begin{pmatrix} 1 & O(n^{-1})\\ O(n^{-1}) & 1\end{pmatrix},$$

one can take $\varkappa' > 4\mu_2^{1/2}$ in (4.29).


8 ASYMPTOTIC NORMALITY OF LEAST SQUARES ESTIMATORS: FIRST RESULTS

This Section contains statements about the rate of convergence of the distribution of the normalised l.s.e.-s $\hat\theta_n$ to a Gaussian distribution. For a scalar $\theta$ it is shown (Theorem 21) that the rate of the Gaussian approximation is of the order $O(n^{-1/2})$ as $n \to \infty$. This and other results of the Section are of a preliminary nature, and later on they will be considerably extended at the cost of additional requirements on $g(j,\theta)$ and $\varepsilon_j$.

Let us write $\mathfrak{C}^q \subset \mathfrak{B}^q$ for the class of all convex Borel subsets of $\mathbb{R}^q$. Let $\varphi_K(x)$, $x \in \mathbb{R}^q$, be the density of a Gaussian random vector ($q > 1$) with zero mean vector and correlation matrix $K$ (the density of the Gaussian r.v. ($q = 1$) with zero mean and variance $K$), and let $\Phi_K$ be the corresponding distribution;

$$\varphi_{I_q}(x) = \varphi(x),\qquad \Phi_{I_q} = \Phi.$$
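A Monte Carlo sketch (my own construction, under an assumed exponential regression function) of the kind of Gaussian approximation quantified below: the normalised scalar l.s.e. is simulated repeatedly and its empirical distribution is compared with the standard normal law via the Kolmogorov distance.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(4)
n, M, theta, sigma = 200, 2000, 0.7, 0.5
t = np.arange(1, n + 1) / n
g  = lambda th: np.exp(th * t)
dg = lambda th: t * np.exp(th * t)
dn = np.sqrt(dg(theta) @ dg(theta))

z = np.empty(M)
for m in range(M):
    X = g(theta) + rng.normal(0.0, sigma, size=n)
    th = theta                            # warm start; we only study the estimator's law
    for _ in range(20):
        r = X - g(th)
        th += (r @ dg(th)) / (dg(th) @ dg(th))
    z[m] = dn * (th - theta) / sigma      # mu_2^{-1/2} d_n(theta) (theta_hat - theta)

zs = np.sort(z)
normal_cdf = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in zs])
emp_cdf = np.arange(1, M + 1) / M
print(np.max(np.abs(emp_cdf - normal_cdf)))
```

The printed Kolmogorov distance is small, consistent with the $O(n^{-1/2})$ rates established in this Section (plus Monte Carlo noise of order $M^{-1/2}$).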

THEOREM 20: Let us assume that the conditions of Theorem 18 hold for $k = 2$ and some $m \ge 3$, and that (7.5) is satisfied instead of (7.4) for $\alpha = e_i$, $i = 1,\dots,q$. Then

$$\sup_{\theta\in T}\ \sup_{C\in\mathfrak{C}^q}\Big|P_\theta^n\big\{\mu_2^{-1/2} I^{1/2}(\theta)\, d_n(\theta)(\hat\theta_n - \theta) \in C\big\} - \Phi(C)\Big| = O\big(n^{-1/2}\log n\big).\qquad(8.1)$$

Proof: We shall assume that $m = 3$. From (7.45) and the positive definiteness of the matrix $I(\theta)$ follows the existence of a constant $c_1'$ such that the bound (8.2) holds. For $A \in \mathfrak{B}^q$ and $x > 0$ we denote by

$$A^x = \{x : x \in \mathbb{R}^q,\ \rho(x,A) < x\},\qquad \rho(x,A) = \inf_{y\in A}|x - y|,$$

an external set parallel to $A$, and by $A^{-x} = \mathbb{R}^q\setminus(\mathbb{R}^q\setminus A)^x$ an internal set parallel to $A$. Let us remark that the set $A^x$ is open and $A^{-x}$ is closed.

Let us set

$$x_n = c_1'\, n^{-1/2}\log n.$$

For any $C \in \mathfrak{C}^q$, from (8.2) there follows the inequality

$$P_\theta^n\big\{\mu_2^{-1/2} I^{1/2}(\theta)\, d_n(\theta)(\hat\theta_n - \theta) \in C\big\} \le P_\theta^n\big\{\mu_2^{-1/2} A^{1/2}(\theta)\, B(0;\theta) \in C^{x_n}\big\} + \gamma_n.\qquad(8.3)$$

The inequality '$\ge$' is already quite clear, as is the inequality '$\le$' if we take into consideration that $(C^{-x_n})^{x_n} \subset C$.
Let us apply Theorem A.9 to the sequence of random vectors

$$\xi_{jn} = n^{1/2}\varepsilon_j\big(g_i(j,\theta)\, d_{in}^{-1}(\theta)\big)_{i=1}^q,\qquad j = 1,\dots,n,\quad n \ge 1.$$

It is easy to see that

$$K_n(\theta) = \mu_2 I(\theta),\qquad n^{-1/2}\sum K_n^{-1/2}(\theta)\,\xi_{jn} = \mu_2^{-1/2} A^{1/2}(\theta)\, B(0;\theta),$$

$$\rho_{3,n}(\theta) = n^{-1}\sum \mathsf{E}_\theta|\xi_{jn}|^3 = \mu_3\, n^{1/2}\sum_j\Big(\sum_{i=1}^q g_i^2(j,\theta)\, d_{in}^{-2}(\theta)\Big)^{3/2} \le \mu_3\, q\Big(\sum_{i=1}^q\Big(n^{1/2} d_{in}^{-1}(\theta)\max_{1\le j\le n}|g_i(j,\theta)|\Big)^2\Big)^{1/2}.$$

Therefore

$$\varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ \rho_{3,n}(\theta) < \infty$$

by condition (7.5), and

$$\sup_{\theta\in T}\ \sup_{C\in\mathfrak{C}^q}\Big|P_\theta^n\big\{\mu_2^{-1/2} A^{1/2}(\theta)\, B(0;\theta) \in C^{x_n}\big\} - \Phi(C^{x_n})\Big| \le c_1 n^{-1/2}.\qquad(8.4)$$

On the other hand, by Theorem A.11 applied to the corresponding function we obtain the bound (8.5).

Thus (8.1) follows from the inequalities (8.2)-(8.5).


COROLLARY 20.1: Let us assume that

$$I(\theta) \underset{n\to\infty}{\longrightarrow} \bar I(\theta)$$

uniformly with respect to $\theta \in T$. Then under the conditions of Theorem 20

$$\sup_{C\in\mathfrak{C}^q}\Big|P_\theta^n\big\{d_n(\theta)(\hat\theta_n - \theta) \in C\big\} - \Phi_{\mu_2\bar A(\theta)}(C)\Big| \underset{n\to\infty}{\longrightarrow} 0$$

uniformly in $\theta \in T$, where

$$\bar A(\theta) = \bar I^{-1}(\theta).$$

Proof: Clearly

$$\sup_{C\in\mathfrak{C}^q}\Big|P_\theta^n\big\{\mu_2^{-1/2} I^{1/2}(\theta)\, d_n(\theta)(\hat\theta_n - \theta) \in C\big\} - \Phi(C)\Big| = \sup_{C\in\mathfrak{C}^q}\Big|P_\theta^n\big\{d_n(\theta)(\hat\theta_n - \theta) \in C\big\} - \Phi_{\mu_2 A(\theta)}(C)\Big|,$$

thanks to the positive definiteness of $I(\theta)$. Therefore it is sufficient to show that

$$\sup_{C\in\mathfrak{C}^q}\big|\Phi_{\mu_2 A(\theta)}(C) - \Phi_{\mu_2\bar A(\theta)}(C)\big| \underset{n\to\infty}{\longrightarrow} 0\qquad(8.6)$$

uniformly in $\theta \in T$.

When $\theta \in T$ is fixed, (8.6) is an immediate corollary of two facts: (1) the measure $\Phi_{\mu_2 A(\theta)}$ weakly converges as $n \to \infty$ to the measure $\Phi_{\mu_2\bar A(\theta)}$; (2) $\mathfrak{C}^q$ is a uniform class of sets for the Gaussian measure $\Phi_{\mu_2\bar A(\theta)}$ ([33], p. 36). However, both the direct proof of this and of the uniformity in $\theta \in T$ of the variant (8.6) are almost obvious.
For a symmetric positive definite matrix $A = (a_{ij})_{i,j=1}^q$,

$$\det A \le a_{11}\cdots a_{qq}.$$

Consequently

$$\det I(\theta) \le 1.$$

Therefore for any $B \in \mathfrak{B}^q$ and $\theta \in T$

$$\Phi_{\mu_2 A(\theta)}(B) \le (2\pi\mu_2)^{-q/2}\int_B e^{-\lambda_0|x|^2/2\mu_2}\,dx.$$

The same inequality is valid for $\Phi_{\mu_2\bar A(\theta)}(B)$. For an arbitrary $\varepsilon > 0$ let $r = r(\varepsilon) < \infty$ be a number such that the complement of the ball $v(r)$ has measure less than $\varepsilon$. Then for $C \in \mathfrak{C}^q$ and some constant $c_3(r) < \infty$

$$\big|\Phi_{\mu_2 A(\theta)}(C) - \Phi_{\mu_2\bar A(\theta)}(C)\big| \le \varepsilon + \Big|1 - \Big(\frac{\det I(\theta)}{\det\bar I(\theta)}\Big)^{1/2}\Big|\,\Phi_{\mu_2\bar A(\theta)}\big(C\cap v(r)\big) + c_3(r)\int_{C\cap v(r)}|x|^2\, e^{-(1/2\mu_2)(I(\theta)x,\,x)}\,dx.$$

COROLLARY 20.2: Let the l.s.e. $\hat\theta_n$ have the following property for some integer $m \ge 3$:

$$\sup_{\theta\in T} P_\theta^n\big\{|d_n(\theta)(\hat\theta_n - \theta)| \ge H\big\} \le \varkappa H^{-m}\qquad(8.7)$$

for sufficiently large $H$ and some constant $\varkappa < \infty$. Then under the conditions of Theorem 20 and Corollary 20.1:

(1) $$\mathsf{E}_\theta\big(d_n(\theta)(\hat\theta_n - \theta)\big)^a \underset{n\to\infty}{\longrightarrow} \mathsf{E}\,\xi^a\quad\text{uniformly in }\theta\in T,\qquad(8.8)$$

for integer vectors $a$ with non-negative coordinates and $|a| \le m-1$;

(2) $$\mathsf{E}_\theta\prod_{i=1}^q\big|d_{in}(\theta)(\hat\theta_n^i - \theta^i)\big|^{r_i} \underset{n\to\infty}{\longrightarrow} \mathsf{E}\prod_{i=1}^q|\xi^i|^{r_i}\quad\text{uniformly in }\theta\in T,\qquad(8.9)$$

for real $r_i \ge 0$, $r = \sum_{i=1}^q r_i < m$; the vector $\xi = (\xi^1,\dots,\xi^q)$ has the Gaussian distribution $\Phi_{\mu_2\bar A(\theta)}$.

Proof: The conditions under which the inequality (8.7) holds were discussed in Section 3. In Section 2 we obtained exponential bounds for the probability of large deviations of the r.v.-s $|d_n(\theta)(\hat\theta_n - \theta)|$, from which (8.7) also follows. If (8.7) is satisfied, then for any $r < m$ and $n > n_0$ the moments of order $r$ are uniformly bounded (see, for example, [80]). Since

$$\prod_{i=1}^q\big|d_{in}(\theta)(\hat\theta_n^i - \theta^i)\big|^{r_i} \le \big|d_n(\theta)(\hat\theta_n - \theta)\big|^r,$$

(8.8) and (8.9) are variants of the theorems about the convergence of moments [151], pp. 196-198.

From (8.8) it follows in particular that the mathematical expectation and the correlation matrix of the vector $d_n(\theta)(\hat\theta_n - \theta)$ converge uniformly in $\theta \in T$ to $0$ and $\mu_2\bar A(\theta)$ as $n \to \infty$, respectively. In Section 12 the relation (8.8) is investigated at greater length.
The estimate (8.1), which will be useful in the future, does not determine the exact order in $n$ of the rate of convergence of the distribution of the l.s.e.-s $\hat\theta_n$ to a Gaussian distribution. It follows that we should expect the exact bound to be $O(n^{-1/2})$. Below, just such a rate of approximation to the Gaussian law is obtained for a scalar $\theta$.

Up to the end of the Section we assume that $\theta \in \Theta$, that $\Theta$ is an open finite or infinite interval of the real axis $\mathbb{R}^1$, and that $T \subset \Theta$ is a finite closed interval of $\mathbb{R}^1$.

Let us denote

$$d_n^2(\theta) = \sum\big(g'(j,\theta)\big)^2,\qquad d_n^2(2;\theta) = \sum\big(g''(j,\theta)\big)^2,$$

$$f'(j,u) = g'\big(j,\theta + n^{1/2}d_n^{-1}(\theta)u\big),\qquad f''(j,u) = g''\big(j,\theta + n^{1/2}d_n^{-1}(\theta)u\big),$$

$$\Phi_{1n}(u_1,u_2) = \sum\big(f'(j,u_1) - f'(j,u_2)\big)^2,\qquad \Phi_{2n}(u_1,u_2) = \sum\big(f''(j,u_1) - f''(j,u_2)\big)^2,$$

$$b^{(1)}(\theta) = d_n^{-1}(\theta)\sum \varepsilon_j\, g'(j,\theta),\qquad b^{(2)}(\theta) = n^{1/2} d_n^{-2}(\theta)\sum \varepsilon_j\, g''(j,\theta).$$
The conditions that we shall use are the conditions of Section 7 written for $q = 1$ and $k = 2$.

II. For any $R > 0$ there exist constants $c_5(R)$, $c_6(R)$ such that

(1) $\displaystyle\sup_{\theta\in T}\ \sup_{u\in[-R,R]\cap U_n^c(\theta)} d_n^{-1}(\theta)\, d_n\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) \le c_5(R),$

(2) $\displaystyle\sup_{\theta\in T}\ \sup_{u_1,u_2\in[-R,R]\cap U_n^c(\theta)} n\, d_n^{-4}(\theta)\,\Phi_{2n}(u_1,u_2)\,|u_1 - u_2|^{-2} \le c_6(R).$

III. $\displaystyle\varliminf_{n\to\infty}\ \inf_{\theta\in T}\ n^{1/2} d_n^{-2}(\theta)\, d_n(2;\theta) > 0.$

IV. $\displaystyle\varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ n^{1/2} d_n^{-1}(\theta)\max_{1\le j\le n}|g'(j,\theta)| < \infty,\qquad \varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ n^{1/2} d_n^{-1}(2;\theta)\max_{1\le j\le n}|g''(j,\theta)| < \infty.$

THEOREM 21 (BERRY-ESSEEN INEQUALITY): Let the condition $I_3$ ($\mu_3 < \infty$) be satisfied, together with conditions II, III, IV. Also let the l.s.e. $\hat\theta_n$ have the property (7.44) for $m = 3$. Then

$$\sup_{\theta\in T}\ \sup_{x\in\mathbb{R}^1}\Big|P_\theta^n\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < x\big\} - \Phi(x)\Big| = O\big(n^{-1/2}\big).\qquad(8.10)$$

Proof: Let us consider two possibilities.

(1) $|x| \le 2(1+\delta)^{1/2}\log^{1/2} n$, where $\delta > 0$ is a fixed number. Let us introduce the event

$$X_1^{(m)} = \big\{|u_n(\theta)| < 2\mu_2^{1/2}(m-2+\delta)^{1/2}\, n^{-1/2}\log^{1/2} n\big\}.$$

By Theorem 19 (see relation (7.46)) for $q = 1$, $\lambda_0 = 1$ and integral $m \ge 3$

$$\sup_{\theta\in T} P_\theta^n\big\{\overline{X}_1^{(m)}(\theta)\big\} = o\big(n^{-(m-2)/2}\big).\qquad(8.11)$$

On the other hand, by Theorem A.5 applied to the r.v.-s

$$\xi_{jn} = \varepsilon_j\, n\, d_n^{-2}(\theta)\, g''(j,\theta)\qquad\text{for } s = 3,$$

for the event $X_2(\theta)$ we obtain

$$\sup_{\theta\in T} P_\theta^n\big\{\overline{X}_2(\theta)\big\} = o\big(n^{-1/2}\log^{-3/2} n\big).$$

If the event

$$X(\theta) = X_1^{(3)}\cap X_2(\theta)\cap\{s^* < \mu_2 + 1\}$$

is realised, then by Lemma 17.3 the expansion (7.15) is valid for the l.s.e. $\hat\theta_n$:

$$0 = -\,n^{-1/2}\, b^{(1)}(\theta) + t\big(u_n(\theta)\big)\, u_n(\theta),\qquad(8.12)$$

$$t(u) = d_n^{-2}(\theta)\, d_n^2\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) - d_n^{-2}(\theta)\sum\big(X_j - f(j,u)\big) f''(j,u),$$

whence, as follows from further arguments,

$$\inf_{|u|\le 2\mu_2^{1/2}(1+\delta)^{1/2} n^{-1/2}\log^{1/2} n} t(u) \ge \tfrac12.$$

Let us estimate the quantity

$$|t(u) - t(0)| \le d_n^{-2}(\theta)\big|d_n^2\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) - d_n^2(\theta)\big| + d_n^{-2}(\theta)\Big|\sum\big(f(j,u) - f(j,0)\big) f''(j,u)\Big| + d_n^{-2}(\theta)\Big|\sum \varepsilon_j\big(f''(j,u) - f''(j,0)\big)\Big|$$
$$\le d_n^{-1}(\theta)\,\Phi_{1n}^{1/2}(u,0)\Big(1 + d_n^{-1}(\theta)\, d_n\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big) + \big(n^{1/2} d_n(2;\theta)\, d_n^{-2}(\theta)\big)\big(n^{-1/2}\Phi_n^{1/2}(u,0)\big) + n^{1/2} d_n^{-2}(\theta)\,\Phi_{2n}^{1/2}(u,0)\,(s^*)^{1/2}\pmod{P_\theta}.$$

Consequently if $s^* \le \mu_2 + 1$ then from the Theorem's conditions there follows the existence of a constant $c_7 = c_7(T) < \infty$ such that

$$|t(u) - t(0)| \le c_7|u|.\qquad(8.13)$$

The inequalities (8.12) and (8.13) show that for the realisation of the event $X(\theta)$

$$\big|-n^{-1/2}\, b^{(1)}(\theta) + t(0)\, u_n(\theta)\big| \le c_7\, u_n^2(\theta).\qquad(8.14)$$

In particular,

$$0 < -\,n^{-1/2}\, b^{(1)}(\theta) + t(0)\, u_n(\theta) + c_7\, u_n^2(\theta).\qquad(8.15)$$

Then

$$X\cap\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < x\big\} \subseteq \big\{0 < -\,b^{(1)}(\theta) + \mu_2^{1/2}\, t(0)\, x + \mu_2\, n^{-1/2} c_7\, x^2\big\}.\qquad(8.16)$$

In fact, if the event $X_2(\theta)$ is realised, then for $n > n_0$

$$t(0) \ge \tfrac12 + \Delta,$$

where $\Delta \in (0,\tfrac12)$ is a fixed number. Therefore the points $u_n(\theta)$ and $\mu_2^{1/2} n^{-1/2} x$ lie on the ascending branch of the parabola

$$y_+(x) = t(0)\,x + c_7 x^2.$$

Consequently, from the inequality

$$u_n(\theta) < \mu_2^{1/2} n^{-1/2} x$$

and (8.15) we obtain

$$0 < -\,n^{-1/2}\, b^{(1)}(\theta) + t(0)\, u_n(\theta) + c_7\, u_n^2(\theta) < -\,n^{-1/2}\, b^{(1)}(\theta) + \mu_2^{1/2} n^{-1/2}\, t(0)\, x + \mu_2\, n^{-1} c_7\, x^2.$$

Let us show that

$$X(\theta)\cap\big\{0 < -\,b^{(1)}(\theta) + \mu_2^{1/2}\, t(0)\, x - \mu_2\, n^{-1/2} c_7\, x^2\big\} \subseteq \big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < x\big\}.\qquad(8.17)$$

For this it is sufficient to establish that

$$X(\theta)\cap\big\{0 < -\,b^{(1)}(\theta) + \mu_2^{1/2}\, t(0)\, x - \mu_2\, n^{-1/2} c_7\, x^2\big\}\cap\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) \ge x\big\} = \varnothing.$$

The points $u_n(\theta)$ and $\mu_2^{1/2} n^{-1/2} x$ lie on the ascending branch of the parabola

$$y_-(x) = t(0)\,x - c_7 x^2.$$

Therefore if

$$u_n(\theta) \ge \mu_2^{1/2} n^{-1/2} x,$$

then

$$0 < -\,n^{-1/2}\, b^{(1)}(\theta) + \mu_2^{1/2} n^{-1/2}\, t(0)\, x - \mu_2\, n^{-1} c_7\, x^2 \le -\,n^{-1/2}\, b^{(1)}(\theta) + t(0)\, u_n(\theta) - c_7\, u_n^2(\theta).$$

However, the inequality

$$-\,n^{-1/2}\, b^{(1)}(\theta) + t(0)\, u_n(\theta) > c_7\, u_n^2(\theta)$$

contradicts (8.14).

Let us denote

$$X(\theta,x) = \big\{0 < -\,b^{(1)}(\theta) + \mu_2^{1/2}\, t(0)\, x \mp \mu_2\, n^{-1/2} c_7\, x^2\big\}.$$

We show that

$$\sup_{\theta\in T}\ \sup_{|x|\le 2(1+\delta)^{1/2}\log^{1/2} n}\big|P_\theta^n\{X(\theta,x)\} - \Phi(x)\big| = O\big(n^{-1/2}\big).\qquad(8.18)$$

Then for

$$|x| \le 2(1+\delta)^{1/2}\log^{1/2} n$$

the assertion of the Theorem will follow from (8.16), (8.17), and the relation

$$\sup_{\theta\in T} P_\theta^n\big\{\overline{X}(\theta)\big\} = o\big(n^{-1/2}\big).$$
Let us denote

$$\eta_n(z) = n^{-1/2}\sum \xi_{jn}(z),\qquad \xi_{jn}(z) = \varepsilon_j\big(d_n^{-1}(\theta)\, g'(j,\theta) + \mu_2^{1/2} d_n^{-2}(\theta)\, g''(j,\theta)\, z\big)\, n^{1/2}.\qquad(8.19)$$

It is easy to see that

$$X(\theta,z) = \big\{\eta_n(z) < \mu_2^{1/2} z \mp \mu_2 c_7\, n^{-1/2} z^2\big\}$$

and

$$\big|P_\theta^n\{X(\theta,z)\} - \Phi(z)\big| \le \sup_{y\in\mathbb{R}^1}\Big|P_\theta^n\Big\{\big(\mathsf{E}_\theta\eta_n^2(z)\big)^{-1/2}\eta_n(z) < y\Big\} - \Phi(y)\Big| + \Big|\Phi\Big(\big(\mathsf{E}_\theta\eta_n^2(z)\big)^{-1/2}\big(\mu_2^{1/2} z \mp \mu_2 c_7\, n^{-1/2} z^2\big)\Big) - \Phi(z)\Big|.\qquad(8.20)$$

Let us apply Theorem A.10 to the sequence $\xi_{jn}(z)$, $z \in Z_n = \{|z| \le 2(1+\delta)^{1/2}\log^{1/2} n\}$, defined by the equality (8.19). Using the conditions of the Theorem being proved we find

$$\sigma_n^2(\theta,z) = \mathsf{E}_\theta\eta_n^2(z) = \mu_2\sum\big(d_n^{-1}(\theta)\, g'(j,\theta) + \mu_2^{1/2} d_n^{-2}(\theta)\, g''(j,\theta)\, z\big)^2$$
$$\ge \mu_2\Big(1 - \mu_2^{1/2}\big(n^{1/2} d_n^{-2}(\theta)\, d_n(2;\theta)\big)\, n^{-1/2}|z|\Big)^2 \ge \mu_2\Big(1 - 2\mu_2^{1/2}(1+\delta)^{1/2} c_5(0)\, n^{-1/2}\log^{1/2} n\Big)^2.$$

And so

$$\varliminf_{n\to\infty}\ \inf_{\theta\in T,\ z\in Z_n}\ \sigma_n^2(\theta,z) \ge \mu_2 > 0.\qquad(8.21)$$
On the other hand,

$$\rho_{3,n}(\theta,z) = n^{-1}\mu_3\sum\Big|\big(d_n^{-1}(\theta)\, g'(j,\theta) + \mu_2^{1/2} d_n^{-2}(\theta)\, g''(j,\theta)\, z\big)\, n^{1/2}\Big|^3$$
$$\le \mu_3\, n^{1/2}\max_{1\le j\le n}\big|d_n^{-1}(\theta)\, g'(j,\theta) + \mu_2^{1/2} d_n^{-2}(\theta)\, g''(j,\theta)\, z\big|\cdot\mu_2^{-1}\,\mathsf{E}_\theta\eta_n^2(\theta,z)$$
$$\le \mu_3\Big[n^{1/2} d_n^{-1}(\theta)\max_{1\le j\le n}|g'(j,\theta)| + \mu_2^{1/2}\big(n^{1/2} d_n^{-2}(\theta)\, d_n(2;\theta)\big)\, n^{1/2} d_n^{-1}(2;\theta)\max_{1\le j\le n}|g''(j,\theta)|\,\big(2(1+\delta)^{1/2} n^{-1/2}\log^{1/2} n\big)\Big]$$
$$\times\Big(1 + 2\mu_2^{1/2} c_5(0)(1+\delta)^{1/2} n^{-1/2}\log^{1/2} n\Big)^2.$$

From the conditions of the Theorem it follows that

$$\varlimsup_{n\to\infty}\ \sup_{\theta\in T,\ z\in Z_n}\ \rho_{3,n}(\theta,z) < \infty.\qquad(8.22)$$

With the aid of (8.21) and (8.22) we conclude that

$$\sup_{\theta\in T,\ z\in Z_n}\ \sup_{y\in\mathbb{R}^1}\Big|P_\theta^n\Big\{\big(\mathsf{E}_\theta\eta_n^2(z)\big)^{-1/2}\eta_n(z) < y\Big\} - \Phi(y)\Big| = O\big(n^{-1/2}\big).\qquad(8.23)$$

Let us estimate the second term of the right hand side of (8.20). Using Newton's binomial expansion for $(1+w)^{-1/2}$ and the conditions of the Theorem it is easy to establish that

$$\big(\mathsf{E}_\theta\eta_n^2(z)\big)^{-1/2}\big(\mu_2^{1/2} z \mp \mu_2 c_7\, n^{-1/2} z^2\big) = z\big(1 + n^{-1/2} z\, f_{\mp}(\theta, n^{-1/2} z)\big),$$

where $f_+$ and $f_-$ are functions having the following property:

$$\sup_{\theta\in T}\ \sup_{z\in Z_n}\big|f_{\mp}(\theta, n^{-1/2} z)\big| \le c_8 < \infty.$$

Therefore by the finite increments formula

$$\Big|\Phi\Big(\big(\mathsf{E}_\theta\eta_n^2(z)\big)^{-1/2}\big(\mu_2^{1/2} z \mp \mu_2 c_7\, n^{-1/2} z^2\big)\Big) - \Phi(z)\Big| = n^{-1/2} z^2\,\big|f_{\mp}(\theta, n^{-1/2} z)\big|\,\varphi(z^*),$$

where $z^*$ lies between $z$ and $z\big(1 + n^{-1/2} z f_{\mp}(\theta, n^{-1/2} z)\big)$. Since

$$|z| \le 2(1+\delta)^{1/2}\log^{1/2} n,$$

then for $n > n_0$

$$|z^*| > \frac{|z|}{2},$$

and consequently

$$\Big|\Phi\big(z(1 + n^{-1/2} z f_{\mp}(\theta, n^{-1/2} z))\big) - \Phi(z)\Big| \le c_8\,(2\pi)^{-1/2}\, z^2\, e^{-z^2/8}\, n^{-1/2}.\qquad(8.24)$$

The relations (8.23) and (8.24) prove (8.18), and the case $|x| \le 2(1+\delta)^{1/2}\log^{1/2} n$ is fully analysed.
Let us consider the second possibility:

(2) $|x| \ge 2(1+\delta)^{1/2}\log^{1/2} n$. Let

$$x \ge 2(1+\delta)^{1/2}\log^{1/2} n.$$

We have

$$\big|P_\theta^n\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < x\big\} - \Phi(x)\big| \le \Phi(-x) + P_\theta^n\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) \ge x\big\}$$
$$\le \Phi\big(-2(1+\delta)^{1/2}\log^{1/2} n\big) + P_\theta^n\big\{\overline{X}_1^{(3)}(\theta)\big\}.$$

Using the inequality

$$1 - \Phi(x) \le (2\pi)^{-1/2}\, x^{-1}\, e^{-x^2/2},\qquad x > 0,\qquad(8.25)$$

to estimate the first term, we obtain the bound (8.26). Together with (8.11) the bound (8.26) leads to the relation

$$\sup_{\theta\in T}\ \sup_{x\ge 2(1+\delta)^{1/2}\log^{1/2} n}\big|P_\theta^n\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < x\big\} - \Phi(x)\big| = o\big(n^{-1/2}\big).$$

The case

$$x \le -\,2(1+\delta)^{1/2}\log^{1/2} n$$

is analysed analogously.
Let us set

$$\hat\sigma_n^2 = n^{-1}L(\hat\theta_n),$$

and change the normalisation of $\hat\theta_n$ in (8.10), namely: instead of $\mu_2$ and $d_n(\theta)$ we substitute their statistical estimators $\hat\sigma_n^2$ and $d_n(\hat\theta_n)$. In this case there holds:

THEOREM 22: Let $\mu_6 < \infty$ and let the l.s.e. $\hat\theta_n$ satisfy (8.11) for $m = 6$. Then if conditions II, III, IV are satisfied,

$$\sup_{\theta\in T}\ \sup_{x\in\mathbb{R}^1}\Big|P_\theta^n\big\{\hat\sigma_n^{-1} d_n(\hat\theta_n)(\hat\theta_n - \theta) < x\big\} - \Phi(x)\Big| = O\big(n^{-1/2}\big).\qquad(8.27)$$
Proof: Let us introduce the event $X_3$, where $\delta > 0$ is a fixed number. By Theorem A.5

$$P_\theta^n\big\{\overline{X}_3\big\} = o\big(n^{-1/2}\log^{-3/2} n\big).$$

As above we distinguish two cases.

Let us assume that the events $X_1^{(6)}$ and $X_3$ are realised. By the finite increments formula we find

$$\hat\sigma_n\, d_n^{-1}(\hat\theta_n) = (s^*)^{1/2} d_n^{-1}(\theta) + \zeta(u^*)\, u_n(\theta),\qquad |u^*| \le |u_n(\theta)|,\qquad(8.28)$$

$$\zeta(u) = \Big(d_n^{-1}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\, n^{-1/2} L^{1/2}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big)'$$
$$= d_n^{-1}(\theta)\, d_n^{-1}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\, L^{-1/2}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\sum\big(X_j - f(j,u)\big) f'(j,u)$$
$$-\; d_n^{-1}(\theta)\, d_n^{-3}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\, L^{1/2}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\sum f'(j,u)\, f''(j,u) = \zeta_1 + \zeta_2.$$
Let us estimate $\zeta_1$, taking into consideration that

$$|u| \le 2\big(\mu_2(4+\delta)\big)^{1/2}\, n^{-1/2}\log^{1/2} n.$$

Firstly,

$$d_n^{-1}(\theta)\Big|\sum\big(X_j - f(j,u)\big) f'(j,u)\Big| \le \Big(d_n^{-1}(\theta)\, d_n\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big)\Big((s^*)^{1/2} + n^{-1/2}\Phi_n^{1/2}(u,0)\Big) n^{1/2} \le c_9\, n^{1/2}.$$

Secondly,

$$n^{-1/2} L^{1/2}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) \ge (s^*)^{1/2} - n^{-1/2}\Phi_n^{1/2}(u,0) \ge c_{10} > 0.\qquad(8.29)$$

Thirdly,

$$\big|d_n^2\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) - d_n^2(\theta)\big| \le d_n^{-1}(\theta)\,\Phi_{1n}^{1/2}(u,0)\Big(d_n^{-1}(\theta)\, d_n\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) + 1\Big)\, d_n^2(\theta) \le c_{11}\, n^{-1/2}\log^{1/2} n\; d_n^2(\theta).$$

Therefore

$$d_n\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) \ge \big(1 - c_{11}\, n^{-1/2}\log^{1/2} n\big)^{1/2} d_n(\theta),\qquad(8.30)$$

and consequently

$$\zeta_1 \le c_{12}\, d_n^{-1}(\theta).$$

Proceeding to the estimate of $\zeta_2$ we note that

$$L^{1/2}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big) \le (n s^*)^{1/2} + \Phi_n^{1/2}(u,0) \le c_{13}\, n^{1/2}.$$

Taking into account (8.30) we find next

$$n^{1/2} d_n^{-1}(\theta)\, d_n^{-3}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big|\sum f'(j,u)\, f''(j,u)\Big|$$
$$\le c_{14}\Big(d_n^{-2}\big(\theta + n^{1/2}d_n^{-1}(\theta)u\big)\, d_n(\theta)\Big)\Big(n^{1/2} d_n^{-2}(\theta)\, d_n\big(2;\theta + n^{1/2}d_n^{-1}(\theta)u\big)\Big) \le c_{15}\, d_n^{-1}(\theta),$$

or

$$\zeta_2 \le c_{16}\, d_n^{-1}(\theta).$$

And so, in the expansion (8.28)

$$|\zeta(u^*)| \le c_{17}\, d_n^{-1}(\theta),$$

and we obtain

$$\big\{\hat\sigma_n^{-1} d_n(\hat\theta_n)(\hat\theta_n - \theta) < x\big\}\cap X_1^{(6)}\cap X_3 = \Big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < x\,\mu_2^{-1/2}(s^*)^{1/2}\big(1 - x\, n^{-1/2} d_n(\theta)\,\zeta(u^*)\big)^{-1}\Big\}\cap X_1^{(6)}\cap X_3.$$
It is easy to see that if the events $X_1^{(6)}$ and $X_3$ are realised, then

$$\big(1 - x\, n^{-1/2} d_n(\theta)\,\zeta(u^*)\big)^{-1} = 1 + x\, n^{-1/2} d_n(\theta)\,\tilde\zeta(u^*,x),$$

with

$$\sup_{|u^*|\le 2\mu_2^{1/2}(4+\delta)^{1/2} n^{-1/2}\log^{1/2} n}\ \sup_{|x|\le 2(4+\delta)^{1/2}\varkappa\log^{1/2} n}\, d_n(\theta)\,\big|\tilde\zeta(u^*,x)\big| \le c_{18} < \infty.$$

Let us denote

$$X_4(\theta,x) = \Big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < \Big(1 \mp \big((4+\delta)\tfrac{\log n}{n}\big)^{1/2}\Big)x \mp \sqrt2\, c_{18}\, n^{-1/2} x^2\Big\},$$

$$X_5(\theta,x) = \Big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < \Big(1 \pm \big((4+\delta)\tfrac{\log n}{n}\big)^{1/2}\Big)x \pm \sqrt2\, c_{18}\, n^{-1/2} x^2\Big\},$$

where the upper sign is chosen if $x \ge 0$ and the lower one if $x < 0$. Clearly, for $n > n_0$

$$X_4(\theta,x)\cap X_1^{(6)}(\theta)\cap X_3 \subseteq \big\{\hat\sigma_n^{-1} d_n(\hat\theta_n)(\hat\theta_n - \theta) < x\big\}\cap X_1^{(6)}(\theta)\cap X_3 \subseteq X_5(\theta,x)\cap X_1^{(6)}(\theta)\cap X_3.$$

By analogy with (8.20) we obtain

$$\big|P_\theta^n\{X_4(\theta,x)\} - \Phi(x)\big| \le \sup_{y\in\mathbb{R}^1}\big|P_\theta^n\big\{\mu_2^{-1/2} d_n(\theta)(\hat\theta_n - \theta) < y\big\} - \Phi(y)\big|$$
$$+\ \Big|\Phi\Big(\Big(1 \mp \big((4+\delta)\tfrac{\log n}{n}\big)^{1/2}\Big)x \mp \sqrt2\, c_{18}\, x^2 n^{-1/2}\Big) - \Phi(x)\Big|.\qquad(8.31)$$

A similar inequality also holds for $X_5(\theta,x)$.

The first term of the right hand side of (8.31) is, by Theorem 21, a quantity $O(n^{-1/2})$ uniformly in $\theta \in T$. The second term, if it is remembered that $|x| \le 2(4+\delta)^{1/2}\varkappa\log^{1/2} n$, admits a bound of the same order. Since the same bound holds for $X_5(\theta,x)$, the assertion of the Theorem is established for $|x| \le 2(4+\delta)^{1/2}\varkappa\log^{1/2} n$.
8. LSE ASYMPTOTIC NORMALITY: FIRST RESULTS 107

Let us assume that

x ≥ 2(4+δ)^{1/2} log^{1/2} n

and that the events X_1^{(δ)}(θ) and X_3 are realised. Then by condition II

sup_{θ∈T} d_n(θ̂_n) d_n^{-1}(θ) ≤ c_4(1),

and by (8.29)

Therefore if
x ≥ μ_2^{1/2} c_4(1) c_{10}^{-1}
then

i.e.,

or

Consequently for

we have
|P_θ^n{ σ̂_n^{-1} d_n(θ̂_n)(θ̂_n − θ) < x } − Φ(x)|

≤ Φ(−x) + P_θ^n{ σ̂_n^{-1} d_n(θ̂_n)(θ̂_n − θ) ≥ x }

≤ Φ(−2(4+δ)^{1/2} log^{1/2} n) + P_θ^n{ X̄_3 } + P_θ^n{ X̄_1^{(δ)}(θ) }

= o(n^{-1/2})

uniformly in θ ∈ T. The case

x ≤ −2(4+δ)^{1/2} log^{1/2} n

is considered analogously.
The direct extension of Theorem 21 to vector parameters θ encounters considerable difficulties. In Section 10 there is one general result about the asymptotic
normality of the l.s.e. θ̂_n for the vector θ, from which follows a relation analogous to (8.10) for q > 1, but unfortunately with more severe constraints than in
Theorem 21.
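As an illustrative numerical check of the scalar asymptotic normality just discussed (the model below is a hypothetical choice, not one from the text: g(j,θ) = θ³ for every j is taken because it admits a closed-form l.s.e. θ̂_n = X̄^{1/3}, and then d_n²(θ) = Σ (∂g/∂θ)² = 9θ⁴n), one can simulate μ_2^{-1/2} d_n(θ)(θ̂_n − θ) and verify that it is approximately standard normal; here μ_2 = 1:

```python
import math
import random

def simulate_normed_lse(theta=1.0, n=400, reps=2000, seed=0):
    """Monte Carlo draws of d_n(theta)*(theta_hat - theta) for the model
    X_j = theta**3 + eps_j, eps_j ~ N(0,1), whose l.s.e. is (mean X)^(1/3)."""
    rng = random.Random(seed)
    d_n = 3 * theta**2 * math.sqrt(n)      # d_n(theta) = (sum g'(j,theta)^2)^(1/2)
    draws = []
    for _ in range(reps):
        xbar = sum(theta**3 + rng.gauss(0.0, 1.0) for _ in range(n)) / n
        theta_hat = math.copysign(abs(xbar) ** (1.0 / 3.0), xbar)
        draws.append(d_n * (theta_hat - theta))
    return draws

draws = simulate_normed_lse()
mean = sum(draws) / len(draws)
var = sum((z - mean) ** 2 for z in draws) / len(draws)
```

With μ_2 = 1 the empirical mean and variance of the draws come out near 0 and 1, in line with the normal approximation of this Section.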

9 ASYMPTOTIC NORMALITY OF LEAST MODULI ESTIMATORS

In this Section we prove one theorem about the asymptotic normality of the l.m.e.
θ̂_n, doing this by using a method of partitioning the parametric set due to Huber
[118,119]. Using the notation of the preceding sections let us assume the following.

(i) The set Θ is convex. The functions g(j,θ), j ≥ 1, are continuous on Θ
together with all their first order partial derivatives, the g_i(j,θ), i = 1, ..., q,
j ≥ 1, are continuously differentiable in Θ, and moreover for any R ≥ 0

(1) sup_{θ∈T} sup_{u∈v^c(R)∩U_n^c(θ)} n^{1/2} d_{in}^{-1}(θ) max_{1≤j≤n} |f_i(j,u)| ≤ C^{(i)}(R) < ∞, i = 1, ..., q,    (9.1)

(2) sup_{θ∈T} sup_{u∈v^c(R)∩U_n^c(θ)} n^{1/2} d_{in}^{-1}(θ) d_{ln}^{-1}(θ) d_{il,n}(θ + n^{1/2} d_n^{-1}(θ)u) ≤ C^{(il)}(R) < ∞, i, l = 1, ..., q.    (9.2)

(ii) The r.v. ε_j is symmetric and has a bounded density p(x) = P'(x) satisfying
the condition

|p(x) − p(0)| ≤ H|x|, p(0) > 0,

where H < ∞ is some constant.
From (9.1) there follow the inequalities (3.17) and (3.18) and the inequality
strengthening (3.16):

sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩U_n^c(θ)} n^{-1} Φ_n(u_1,u_2) |u_1 − u_2|^{-2} ≤ c(R) < ∞.    (9.3)

And so if, in addition to (9.1), it is assumed that μ_s < ∞ for some integer s ≥ 1
and that (3.15) holds, then by Theorem 9 of Section 3 for any r > 0

sup_{θ∈T} P_θ^n{ |n^{-1/2} d_n(θ)(θ̂_n − θ)| ≥ r } = z_n(s),    (9.4)

where

z_n(s) = O(n^{-s+1}), s ≥ 2,    z_n(1) → 0 as n → ∞.

From condition (9.2), which coincides with condition (4.8) as may be inferred
from Lemma 12.2, one can obtain the inequality

sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩U_n^c(θ)} d_{in}^{-2} Φ_n^{(i)}(u_1,u_2) |u_1 − u_2|^{-2} ≤ c^{(i)}(R) < ∞, i = 1, ..., q,    (9.5)



coinciding with (4.3).


Let l be an arbitrary direction in ℝ^q and τ ∈ Θ. Then

(∂/∂l) R(τ) = Σ (∇g(j,τ), l) (2χ{X_j * g(j,τ)} − 1),

where '*' denotes '≤' if (∇g(j,τ), l) ≥ 0 and '<' if (∇g(j,τ), l) < 0, and χ{A} is
the indicator of the event A. Let r_0 be the distance between T and ℝ^q \ Θ. If the
event {|θ̂_n − θ| < r} is realised for θ ∈ T and r < r_0, then for any direction l

(∂/∂l) R(θ̂_n) ≥ 0

holds. We shall use this remark in the proof of Theorem 23.


THEOREM 23: Let μ_s < ∞ for some integer s ≥ 1 (condition I_s) and let the
l.m.e. θ̂_n have the property (9.4). Then if condition V (condition (4.7)), (i), (ii)
are satisfied,

sup_{θ∈T} sup_{C∈ℭ_q} |P_θ^n{ 2p(0) I^{1/2}(θ) d_n(θ)(θ̂_n − θ) ∈ C } − Φ(C)| → 0 as n → ∞.    (9.6)
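Before entering the proof it may help to see (9.6) at work in the simplest case. The sketch below is illustrative, not from the text: in the location model g(j,θ) = θ the l.m.e. is the sample median, d_n(θ) = n^{1/2} and I(θ) = 1, so the theorem asserts that 2p(0) n^{1/2}(θ̂_n − θ) is approximately standard normal. Taking Laplace errors with density p(x) = e^{−|x|/b}/(2b), so p(0) = 1/(2b):

```python
import math
import random

def lme_draws(b=1.0, n=401, reps=2000, seed=1):
    """Draws of 2*p(0)*sqrt(n)*(median - theta) for X_j = theta + eps_j,
    where eps_j is Laplace with scale b and the l.m.e. is the sample median."""
    rng = random.Random(seed)
    p0 = 1.0 / (2.0 * b)
    out = []
    for _ in range(reps):
        eps = []
        for _ in range(n):
            u = rng.random() - 0.5           # inverse-transform Laplace sample
            eps.append(-b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u)))
        eps.sort()
        med = eps[n // 2]                    # (median of X) - theta = median of errors
        out.append(2.0 * p0 * math.sqrt(n) * med)
    return out

draws = lme_draws()
m = sum(draws) / len(draws)
v = sum((z - m) ** 2 for z in draws) / len(draws)
```

The empirical mean and variance of the normed l.m.e. come out near 0 and 1, as (9.6) predicts.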

Proof: We shall divide the proof into several steps. Let l_1, ..., l_q be the positive directions of the coordinate axes. Let us consider the vectors R^±(θ) with
coordinates

i = 1, ..., q,

and the vectors E_θ^n R^±(τ) with coordinates (i = 1, ..., q)

Clearly

by virtue of the symmetric nature of ε_j. Let us denote

z_n^±(θ,u) = |R^±(θ + n^{1/2} d_n^{-1}(θ)u) − R^±(θ) − E_θ^n R^±(θ + n^{1/2} d_n^{-1}(θ)u)|

× (1 + |E_θ^n R^±(θ + n^{1/2} d_n^{-1}(θ)u)|)^{-1}.

LEMMA 23.1: Under the conditions of Theorem 23, for any ε > 0 and sufficiently
small ρ > 0

sup_{θ∈T} P_θ^n{ sup_{u∈v^c(ρ)∩U_n^c(θ)} z_n^±(θ,u) > ε } → 0 as n → ∞.    (9.7)

Proof: We carry out the proof for the quantity z_n^+(θ,u). For simplicity we shall
assume that ρ = 1 and that the inner supremum in (9.7) is taken over the cube

C_0 = {u: |u|_0 = max_{1≤i≤q} |u^i| ≤ 1} ⊃ v(1).


Let us cover the cube C_0 with N_0 = O(log n) cubes C(1), ..., C(N_0) in the following
way. Let t ∈ (0,1) be a number. We construct a concentric system of sets

c^(m) = {u: |u|_0 ∈ [(1−t)^{m+1}, (1−t)^m]}, m = 0, ..., m_0 − 1,

c^(m_0) = {u: |u|_0 ≤ (1−t)^{m_0}}.

We cover each of the sets c^(m) with identical cubes of side t(1−t)^m and
enumerate these cubes. They form the required covering

C(1), ..., C(N_0 − 1), C(N_0) := c^(m_0).

Let us choose m_0 = m_0(n) from the condition (1−t)^{m̄_0} = n^{−γ}:

m_0 = [m̄_0], γ ∈ (½, 1).

We note that the |·|_0-distance from C(j) to 0 is equal to

ρ(j) = (1 − t) n^{−γm/m̄_0},

and the |·|_0-diameter of C(j) is equal to

a(j) = t n^{−γm/m̄_0}

for some m = m(j), j = 1, ..., N_0 − 1. In fact, let the cube C(j) be an element of
the covering of the set c^(m). Then

ρ(j) = t(1−t)^{m+1} + ... + t(1−t)^{m_0−1} + (1−t)^{m_0}.


The number of cubes C(j) covering each set c^(m) can be made not to depend upon
m, and consequently upon n. In order to be persuaded of this let us consider any
octant in ℝ^q. The volume occurring in its part of the set c^(m) is

(1−t)^{mq} − (1−t)^{(m+1)q},

and the volume of the cube C(j) is equal to

a^q(j) = t^q (1−t)^{mq}.

In this way there is no more 'room' in the given octant than for

((1−t)^{mq} − (1−t)^{(m+1)q}) / (t^q (1−t)^{mq}) = (1 − (1−t)^q) / t^q

cubes. Since m_0 = O(log n), then N_0 = O(log n) also. Let us choose θ ∈ T. Then

P_θ^n{ sup_{u∈C_0} z_n^+(θ,u) > ε } ≤ Σ_{j=1}^{N_0} P_θ^n{ sup_{u∈C(j)} z_n^+(θ,u) > ε }.    (9.8)
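The octant computation above can be confirmed directly: the ratio of shell volume to covering-cube volume is the same for every m. A small sketch (the values q = 3, t = 0.2 are illustrative):

```python
def cubes_per_octant_bound(q, t, m):
    """Volume of the part of the shell c^(m) in one octant divided by the
    volume t^q (1-t)^(mq) of one covering cube of side t(1-t)^m."""
    shell = (1 - t) ** (m * q) - (1 - t) ** ((m + 1) * q)
    cube = (t * (1 - t) ** m) ** q
    return shell / cube

q, t = 3, 0.2
ratios = [cubes_per_octant_bound(q, t, m) for m in range(12)]
closed_form = (1 - (1 - t) ** q) / t ** q   # the bound derived in the text
```

Every entry of `ratios` equals `closed_form`, independently of m, which is why the number of cubes per shell does not grow with n.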

Let us estimate each term in (9.8). The general element of the derivative matrix
D_n(u) of the mapping

u → E_θ^n R^+(θ + n^{1/2} d_n^{-1}(θ)u)

has the form

D_n^{ik}(u) = n^{1/2} d_{in}^{-1}(θ) d_{kn}^{-1}(θ) Σ f_{ik}(j,u) (2P(f(j,u) − f(j,0)) − 1)

+ 2 n^{1/2} d_{in}^{-1}(θ) d_{kn}^{-1}(θ) Σ f_i(j,u) f_k(j,u) p(f(j,u) − f(j,0))

= ¹D_n^{ik}(u) + ²D_n^{ik}(u).
Taking into account (9.2), (9.3) and the inequality

sup_{x∈ℝ¹} p(x) = p_0 < ∞,

for |u| < ρ we obtain

n^{-1/2} |¹D_n^{ik}(u)| ≤ 2 n^{1/2} d_{in}^{-1}(θ) d_{kn}^{-1}(θ) d_{ik,n}(θ + n^{1/2} d_n^{-1}(θ)u)

× (n^{-1} Σ (P(f(j,u) − f(j,0)) − P(0))^2)^{1/2}

≤ 2 C^{(ik)}(ρ) c^{1/2}(ρ) p_0 |u|.    (9.9)


On the other hand

|½ n^{-1/2} ²D_n^{ik}(u) − p(0) I_{ik}(θ)|

≤ p_0 [ d_{in}^{-1}(θ) d_{in}(θ + n^{1/2} d_n^{-1}(θ)u) d_{kn}^{-1}(θ) (Φ_n^{(k)}(u,0))^{1/2} + d_{in}^{-1}(θ) (Φ_n^{(i)}(u,0))^{1/2} ]

+ d_{in}^{-1}(θ) d_{kn}^{-1}(θ) |Σ g_i(j,θ) g_k(j,θ) (p(f(j,u) − f(j,0)) − p(0))|.    (9.10)



By (9.1) and (9.5) the terms in square brackets are bounded by the quantity

p_0 ((C^{(i)}(ρ))^{1/2} + C^{(i)}(ρ) c^{1/2}(ρ)) |u|.

For the last term of (9.10), using condition (ii) and (9.1) with u = 0, we find the
majorant

(9.11)

Since by condition V the matrix

n^{-1/2} D_n(0) = 2 p(0) I(θ)

is positive definite, the arguments sketched show that for sufficiently small u (for
simplicity we assume that u ∈ C_0) and some c_0 > 0

(9.12)

Let k ≠ N_0, and let v ∈ C(k) be an arbitrary point. Then with (9.12) one can
write

sup_{u∈C(k)} z_n^+(θ,u) ≤ ( sup_{u∈C(k)} W_n^{(k)}(θ,u,v) + Y_n^{(k)}(θ,v) ) (1 + c_0 n^{1/2} ρ(k))^{-1},

W_n^{(k)}(θ,u,v) = Σ_{λ=1}^{4} W_{λn}^{(k)}(θ,u,v)  (mod P_θ^n),

W_{1n}^{(k)}(θ,u,v) = 2 |d_n^{-1}(θ) Σ ∇f(j,u) (χ{X_j * f(j,u)} − χ{X_j < f(j,v)})|,

W_{2n}^{(k)}(θ,u,v) = |d_n^{-1}(θ) Σ (∇f(j,u) − ∇f(j,v)) (2χ{X_j < f(j,v)} − 1)|,

W_{3n}^{(k)}(θ,u,v) = 2 |d_n^{-1}(θ) Σ ∇f(j,u) (P(f(j,u) − f(j,0)) − P(f(j,v) − f(j,0)))|,

W_{4n}^{(k)}(θ,u,v) = |d_n^{-1}(θ) Σ (∇f(j,u) − ∇f(j,v)) (2P(f(j,v) − f(j,0)) − 1)|,

Y_n^{(k)}(θ,v) = |d_n^{-1}(θ) Σ (∇f(j,v)(2χ{X_j < f(j,v)} − 1) − ∇f(j,0)(2χ{ε_j * 0} − 1)

− ∇f(j,v)(2P(f(j,v) − f(j,0)) − 1))|  (mod P_θ^n).

By (9.5), for u, v ∈ C(k) we obtain

n^{-1/2} W_{2n}^{(k)}(θ,u,v) ≤ ( Σ_{i=1}^{q} d_{in}^{-2} Φ_n^{(i)}(u,v) )^{1/2}

≤ c_1 a(k).    (9.13)

Let us further note that in accordance with (9.1), (9.3) and (ii)

n^{-1/2} W_{3n}^{(k)}(θ,u,v) ≤ c_2 a(k).    (9.14)

Analogously

n^{-1/2} W_{4n}^{(k)}(θ,u,v) ≤ c_3 a(k).    (9.15)

Let us estimate W_{1n}^{(k)}(θ,u,v). For any u, v ∈ C(k)

|χ{X_j * f(j,u)} − χ{X_j < f(j,v)}|

≤ χ{ inf_{u∈C(k)} f(j,u) − f(j,0) ≤ ε_j ≤ sup_{u∈C(k)} f(j,u) − f(j,0) }

= χ̄_j  (mod P_θ^n).

Consequently by (9.1)

n^{-1/2} W_{1n}^{(k)}(θ,u,v) ≤ n^{-1/2} ( Σ_{i=1}^{q} ( d_{in}^{-1}(θ) max_{1≤j≤n} |f_i(j,u)| )^2 )^{1/2} Σ χ̄_j.    (9.16)

Using the finite increments formula we find

E_θ^n n^{-1} Σ χ̄_j = n^{-1} Σ ( P( sup_{u∈C(k)} f(j,u) − f(j,0) ) − P( inf_{u∈C(k)} f(j,u) − f(j,0) ) )

≤ p_0 ( Σ_{i=1}^{q} ( n^{1/2} d_{in}^{-1}(θ) sup_{u∈C(k)} max_{1≤j≤n} |f_i(j,u)| )^2 )^{1/2} a(k) q^{1/2}

≤ c_5 a(k).    (9.17)


The estimates (9.13)-(9.17) show that there exist constants c_6 and c_7 such that

P_θ^n{ sup_{u∈C(k)} W_n^{(k)}(θ,u,v) (1 + c_0 n^{1/2} ρ(k))^{-1} > ε/2 }

≤ P_θ^n{ c_6 n^{-1} Σ (χ̄_j − E_θ^n χ̄_j) > (ε/2) ρ(k) − c_7 a(k) }.    (9.18)

The quantity

(ε/2) ρ(k) − c_7 a(k) = ((ε/2)(1−t) − c_7 t) n^{−γm/m̄_0} > 0

if t is chosen sufficiently small. Therefore the probability (9.18) is, by Chebyshev's
inequality and (9.17), estimated by the quantity

(9.19)

Let us denote

y_{1i}(j) = (f_i(j,v) − f_i(j,0)) (2χ{X_j < f(j,v)} − 1),

y_{2i}(j) = 2 f_i(j,0) (χ{X_j < f(j,v)} − χ{ε_j * 0}),

i = 1, ..., q.

Then

P_1 = P_θ^n{ Y_n^{(k)}(θ,v) (1 + c_0 n^{1/2} ρ(k))^{-1} > ε/2 }

≤ 8 (c_0 ε)^{-2} n^{-1} ρ^{-2}(k) Σ_{i=1}^{q} d_{in}^{-2}(θ) ( D_θ^n(Σ y_{1i}(j)) + D_θ^n(Σ y_{2i}(j)) ),    (9.20)

D_θ^n( Σ y_{1i}(j) ) ≤ Φ_n^{(i)}(v,0),    (9.21)

D_θ^n( Σ y_{2i}(j) ) ≤ 4 Σ f_i^2(j,0) |P(f(j,v) − f(j,0)) − P(0)|

≤ 4 max_{1≤j≤n} |g_i(j,θ)| d_{in}(θ) p_0 Φ_n^{1/2}(v,0).    (9.22)

The relations (9.20)-(9.22) and the condition of the Theorem show that

P_1 ≤ c_9 n^{-1} [ (a(k) + ρ(k))^2 ρ^{-2}(k) + (a(k) + ρ(k)) ρ^{-2}(k) ]

= c_9 n^{-1} [ (1−t)^{-2} + (1−t)^{-2} n^{γm/m̄_0} ].    (9.23)

The bounds (9.19) and (9.23) show that, for k = 1, ..., N_0 − 1 and some m =
m(k) < m_0,

(9.24)

Let us consider the case k = N_0. Clearly,

P_θ^n{ sup_{u∈C(N_0)} z_n^+(θ,u) > ε }

≤ P_θ^n{ sup_{|u|_0 ≤ n^{−γm_0/m̄_0}} |R^+(θ + n^{1/2} d_n^{-1}(θ)u) − R^+(θ) − E_θ^n R^+(θ + n^{1/2} d_n^{-1}(θ)u)| > ε }.    (9.25)
Let us write the expression standing under the norm sign in (9.25) in the form of
a sum of vectors

β_1(θ,u) + β_2(θ,u) + β_3(θ,u),

where

β_1(θ,u) = d_n^{-1}(θ) Σ (∇f(j,u) − ∇f(j,0)) (2χ{X_j * f(j,u)} − 1),

β_2(θ,u) = 2 d_n^{-1}(θ) Σ ∇f(j,0) (χ{X_j * f(j,u)} − χ{ε_j * 0}),

β_3(θ,u) = d_n^{-1}(θ) Σ ∇f(j,u) (2P(f(j,u) − f(j,0)) − 1).

It is easy to show that, for |u|_0 ≤ n^{−γm_0/m̄_0},

(9.26)

|β_3(θ,u)| ≤ 2 p_0 Φ_n^{1/2}(u,0) ≤ c_{10} n^{1/2 − γm_0/m̄_0}.    (9.27)

If γ > ½, then for n > n_0 the exponents in (9.26) and (9.27) are negative. And
so it remains to estimate the probability (ε' < ε)

(9.28)

χ̄_j = χ{ inf_{|u|_0 ≤ n^{−γm_0/m̄_0}} f(j,u) − f(j,0) ≤ ε_j ≤ sup_{|u|_0 ≤ n^{−γm_0/m̄_0}} f(j,u) − f(j,0) }.

Since, by the Theorem's condition,

i = 1, ..., n,

instead of (9.28) it is sufficient to estimate, for any ε'' > 0, the probability

P_θ^n{ n^{-1/2} Σ (χ̄_j − E_θ^n χ̄_j) > ε'' } ≤ (ε'')^{-2} c_{12} n^{−γm_0/m̄_0}.

As all the bounds are uniform in θ ∈ T and the case of z_n^−(θ,u) is investigated
analogously, Lemma 23.1 is proved.
Let us set

LEMMA 23.2: Under the conditions of Theorem 23, for any ε > 0

sup_{θ∈T} P_θ^n{ |R^+(θ) + E_θ^n R^+(θ̂_n)| > ε } → 0 as n → ∞.    (9.29)

Proof: Let us introduce the events

i = 1, ..., q.

From (9.4) and the assertion of the preceding Lemma it follows that

inf_{θ∈T} P_θ^n{ A_i^±(θ) } → 1 as n → ∞, i = 1, ..., q.    (9.30)

For the event {|θ̂_n − θ| < ρ}, ρ < r_0,

therefore the relation (9.30) is true for the events

B_i^±(θ) = { R_i^±(θ) + E_θ^n R_i^±(θ̂_n) ≥ −ε (1 + |E_θ^n R^±(θ̂_n)|) } ⊃ A_i^±(θ)

as well.
On the other hand,

and the events B_i^−(θ) are equi-probable with the events

Further, for ε < q^{-1}

B_i^+(θ) ∩ C_i^+(θ) = D_i^+(θ)

= { |E_θ^n R_i^+(θ̂_n) + R_i^+(θ)| ≤ ε (1 + |E_θ^n R^+(θ̂_n)|) }, i = 1, ..., q,    (9.31)

∩_{i=1}^{q} D_i^+(θ) ⊂ { |E_θ^n R^+(θ̂_n) + R^+(θ)| ≤ qε (1 + |E_θ^n R^+(θ̂_n)|) }

⊂ { |E_θ^n R^+(θ̂_n)| ≤ (1 − qε)^{-1} (qε + |R^+(θ)|) } = X^+(θ),

i.e.,

(9.32)

We note that

P_θ^n{ |E_θ^n R^+(θ̂_n)| > M }

≤ P_θ^n{ X^+(θ) } + P_θ^n{ |R^+(θ)| > M(1 − qε) − qε }.    (9.33)

Let us denote

η_j = 2χ{ε_j < 0} − 1, j ≥ 1,

I_{in}(θ) = {1, ..., n} ∩ {j: g_i(j,θ) > 0}.

Then P_θ^n-a.c.,

R_i^+(θ) − d_{in}^{-1}(θ) Σ g_i(j,θ) η_j = 2 d_{in}^{-1}(θ) Σ_{j∈I_{in}(θ)} g_i(j,θ) χ{ε_j = 0}

= 0.

Therefore by the Chebyshev inequality

P_θ^n{ |R^+(θ)| > M(1 − qε) − qε } ≤ q (M(1 − qε) − qε)^{-2} → 0 as M → ∞,

i.e., the vector R^+(θ) is bounded in probability. From (9.32) and (9.33) it follows
that the vector E_θ^n R^+(θ̂_n) is also bounded in probability uniformly in θ ∈ T.
According to (9.31)

sup_{θ∈T} P_θ^n{ |R_i^+(θ) + E_θ^n R_i^+(θ̂_n)| > ε (1 + |E_θ^n R^+(θ̂_n)|) } → 0 as n → ∞.

Therefore (9.29) holds. We remark that the boundedness in probability of the
r.v. |E_θ^n R^+(θ̂_n)| can also be obtained immediately from (9.4), the explicit form
of E_θ^n R^+(θ̂_n), and the conditions of the Theorem.

LEMMA 23.3: Under the conditions of Theorem 23, for any ε > 0

sup_{θ∈T} P_θ^n{ |E_θ^n R^+(θ̂_n) − 2p(0) I(θ) d_n(θ)(θ̂_n − θ)| > ε } → 0 as n → ∞.    (9.34)

Proof: If the quantity n^{-1/2} |d_n(θ)(θ̂_n − θ)| is small, then from the inequality (9.12)
and the boundedness in probability of the r.v. |E_θ^n R^+(θ̂_n)| it follows that the norm
of the vector d_n(θ)(θ̂_n − θ) is bounded in probability. The assertion of the Lemma
follows from (9.4) and the inequalities (9.9)-(9.11).
Proof of Theorem 23: The relations (9.29) and (9.34) show that for any ε > 0

sup_{θ∈T} P_θ^n{ |(2p(0))^{-1} A(θ) R^+(θ) + d_n(θ)(θ̂_n − θ)| > ε } → 0 as n → ∞.    (9.35)

As was remarked above,

R^+(θ) = d_n^{-1}(θ) Σ ∇g(j,θ) η_j  (mod P_θ^n).

Let us apply Theorem A.9 to the latter sum, assuming

ξ_{jn} = n^{1/2} d_n^{-1}(θ) ∇g(j,θ) η_j, j = 1, ..., n.

The matrix I(θ) is the correlation matrix of the sum n^{-1/2} Σ ξ_{jn}. By condition
(9.1)

n^{-1} Σ E_θ^n |ξ_{jn}|^3 ≤ q^{1/2} Σ_{i=1}^{q} n^{-1} Σ_j d_{in}^{-3}(θ) |g_i(j,θ)|^3 n^{3/2}

≤ c_{13} < ∞


uniformly in θ ∈ T. Consequently

sup_{θ∈T} sup_{C∈ℭ_q} |P_θ^n{ I^{-1/2}(θ) R^+(θ) ∈ C } − Φ(C)| = O(n^{-1/2}).    (9.36)

From the relations (9.35) and (9.36) it follows that for any ε > 0 and C ∈ ℭ_q

−Δ_n + Φ(C^{−ε}) ≤ P_θ^n{ 2p(0) I^{1/2}(θ) d_n(θ)(θ̂_n − θ) ∈ C } ≤ Δ_n + Φ(C^{ε}),    (9.37)

where C^{ε} and C^{−ε} are exterior and interior sets parallel to C, and Δ_n → 0 as
n → ∞ uniformly in θ ∈ T and C ∈ ℭ_q. The assertion of the Theorem follows
from (9.37) and Theorem A.11 (see (8.5)).

10 ASYMPTOTIC EXPANSION OF THE DISTRIBUTION OF LEAST SQUARES ESTIMATORS

In this Section an a.e. is obtained for the distribution of the normed l.s.e. θ̂_n.
In Sections 10 and 11, instead of the normalisation d_n(θ), the normalisation
n^{1/2} 1_q is used. This simplification is not a fundamental one; it only makes it
easier to write the tedious formulae of those Sections. The normalisation n^{1/2} 1_q
leads to an alteration in the written forms of quantities introduced earlier, for
which the previous notations are kept. In particular, we immediately have

n^{1/2}(Θ − θ) = U(θ),

I(θ) = ( n^{-1} Σ g_i(j,θ) g_l(j,θ) )_{i,l=1}^{q},

b_{i_1...i_r}(θ),

etc.
We shall use the conditions of Section 7 in the following form.
II. For any R > 0 there exist constants c_i(a,R) < ∞, i = 1, 2, such that

(1) sup_{θ∈T} sup_{u∈v^c(R)∩U^c(θ)} n^{-1/2} d_n(a; θ + u) ≤ c_1(a,R), |a| = 1, ..., k,    (10.1)

(2) sup_{θ∈T} sup_{u_1,u_2∈v^c(R)∩U^c(θ)} n^{-1} Φ_n^{(a)}(u_1,u_2) |u_1 − u_2|^{-2} ≤ c_2(a,R), |a| = k.    (10.2)

III. lim inf_{n→∞} inf_{θ∈T} n^{-1/2} d_n(a; θ) > 0    (10.3)

for all |a| = 2, ..., k for which g^{(a)}(j,θ) ≢ 0.

IV. |a| = 1, ..., k.    (10.4)

V. lim inf_{n→∞} inf_{θ∈T} λ_min(I(θ)) > λ_0 > 0.    (10.5)
For r = 1, ..., k and i_1, ..., i_r = 1, ..., q let

v_{i_1...i_r}(θ) = A_{i_1 l}(θ) b_{l i_2...i_r}(θ).

If the function g(j,θ) is a polynomial (in particular a linear function) in any
variable θ_i of the set (θ^1, ..., θ^q) = θ, then it turns out that, for any set of indices
i_1, i_2, ..., i_r,

P_θ^n-a.c.

Omitting the r.v.-s v_{i_1...i_r}(θ) that vanish for this reason, let us consider the vector
V_r(θ) consisting of all different r.v.-s v_{i_1...i_r}(θ), ordered in the natural order, i.e.,

(v^1, ..., v^q, v^{11}, ..., v^{qq}, ..., v^{1...1}, ..., v^{q...q}),

where in the last group each r.v. carries r − 1 subordinate indices.

The dimension of the vector V_r is estimated by the quantity

dim(V_r) ≤ q + q Σ_{s=1}^{r−1} p_{s,q},

where p_{s,q} = C_{q+s−1}^{q−1} is the number of different partial derivatives of order s of a
function of q variables. Since

Σ_{s=0}^{r−1} C_{q+s−1}^{q−1} = C_{q+r−1}^{q},

then

It is easy to see that

where the vectors W_r(j,θ) are composed of the quantities

A_{i_1 l}(θ) g_{l i_2...i_s}(j,θ), s = 1, ..., r.
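The two counting facts used here — that p_{s,q} counts multisets of size s drawn from q symbols, and the 'hockey-stick' identity for their sum — can be verified by direct enumeration (an illustrative sketch; `math.comb` plays the role of C_{q+s−1}^{q−1}):

```python
from itertools import combinations_with_replacement
from math import comb

def p_sq(s, q):
    """Number of distinct partial derivatives of order s of a function of
    q variables = number of multisets of size s drawn from q symbols."""
    return sum(1 for _ in combinations_with_replacement(range(q), s))

for q_ in (2, 3, 4):
    for s_ in (1, 2, 3, 4):
        assert p_sq(s_, q_) == comb(q_ + s_ - 1, q_ - 1)

# the identity used for the dimension bound of V_r
q, r = 3, 5
lhs = sum(comb(q + s - 1, q - 1) for s in range(r))
rhs = comb(q + r - 1, q)
```

For q = 3, r = 5 both sides equal C_7^3 = 35.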


Let us assume that

where

is the dimension of this vector, and

W(θ) = Σ w_j(θ) w_j'(θ).

Then the p × p matrix

is the arithmetic mean of the correlation matrices of the vectors w_j(θ) ε_j. Let us
assume that

VI. lim inf_{n→∞} inf_{θ∈T} λ_min(K_n(θ)) > λ* > 0.    (10.6)

Let ψ(λ) be the characteristic function (c.f.) of the r.v. ε_j. Then

is the c.f. of the random vector w_j(θ) ε_j.

VII. There exists an integer u > 0 such that the function

w_m(θ,t) = Π_{j=m+1}^{m+u} |ψ_j(K_n^{-1/2}(θ) t)|, 0 ≤ m ≤ n − u, n ≥ u + 1,

satisfies the condition

sup_{0≤m≤n−u, n≥u+1} sup_{θ∈T} ∫_{ℝ^p} w_m(θ,t) dt < ∞,    (10.7)

and for any number b > 0

sup_{0≤m≤n−u, n≥u+1} sup_{|t|≥b} sup_{θ∈T} w_m(θ,t) < 1.    (10.8)

The condition VII imposes restrictions simultaneously upon the r.v. ε_j and the
function g(j,θ), and therefore the inequalities (10.7) and (10.8) are not at all
obvious. We shall find requirements that may be imposed individually on the r.v.
ε_j and the functions g(j,θ) in order to guarantee the fulfilment of VI and VII.

VIII. There exists an integer h ≥ p such that amongst any h vectors from the
totalities

{w_j(θ), j = m+1, ..., m+h}, 0 ≤ m ≤ n − h, n ≥ h + 1,

there can be found p vectors w_{j_1}, ..., w_{j_p} such that the matrix

W_m^{(p)}(θ) = p^{-1} Σ_{i=1}^{p} w_{j_i}(θ) w_{j_i}'(θ)

is uniformly positive definite, namely:

inf_{0≤m≤n−h, n≥h+1} inf_{θ∈T} λ_min(W_m^{(p)}(θ)) ≥ Δ > 0.    (10.9)

IX. ∫_{ℝ^1} |ψ(λ)|^ρ dλ < ∞    (10.10)

for some ρ ≥ 1.
Let us show that (10.6) follows from (10.9). We shall use the following facts,
which arise from the Courant-Fischer Theorem on the minimax representation of
eigenvalues ([29], pp. 143-144).
Let A and B be symmetric matrices, with the matrix B being non-negative
definite. Then:

(1) λ_min(A + B) ≥ λ_min(A).

If A and B are symmetric positive definite matrices, then

(2) λ_min(A + B) ≥ λ_min(A) + λ_min(B).
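Both monotonicity facts are easy to confirm numerically; in fact (2) holds for any symmetric A and non-negative definite B (Weyl's inequality), which is what the following sketch checks for random 2×2 symmetric matrices (eigenvalues in closed form; the trial counts are illustrative):

```python
import random

def eig_min_2x2(a11, a12, a22):
    """Smaller eigenvalue of the symmetric matrix [[a11, a12], [a12, a22]]."""
    tr, det = a11 + a22, a11 * a22 - a12 * a12
    return (tr - (tr * tr - 4.0 * det) ** 0.5) / 2.0

rng = random.Random(3)
checks = []
for _ in range(500):
    a11, a12, a22 = (rng.uniform(-2, 2) for _ in range(3))
    c11, c12, c21, c22 = (rng.uniform(-2, 2) for _ in range(4))
    # B = C'C is symmetric and non-negative definite by construction
    b11, b12, b22 = c11**2 + c21**2, c11*c12 + c21*c22, c12**2 + c22**2
    lam_a = eig_min_2x2(a11, a12, a22)
    lam_b = eig_min_2x2(b11, b12, b22)
    lam_ab = eig_min_2x2(a11 + b11, a12 + b12, a22 + b22)
    checks.append(lam_ab >= lam_a - 1e-12)           # fact (1)
    checks.append(lam_ab >= lam_a + lam_b - 1e-12)   # fact (2)
all_hold = all(checks)
```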

For n > n_0 let us write

K_n(θ) = A + B,

where

A = μ_2 p n^{-1} Σ_{s=0}^{[n/h]−1} W_{sh}^{(p)}(θ).

It is easy to see that the matrix B is non-negative definite. Therefore by the
inequality (1)

λ_min(K_n(θ)) ≥ λ_min(A).

Using next the inequality (2) first for two terms of the sum

Σ_{s=0}^{[n/h]−1} W_{sh}^{(p)}(θ),

then for three, etc., we obtain

λ_min(A) ≥ μ_2 p n^{-1} Σ_{s=0}^{[n/h]−1} λ_min(W_{sh}^{(p)}(θ)) ≥ μ_2 p [n/h] n^{-1} Δ ≥ μ_2 p (1/h − 1/n) Δ > 0.


Let us establish that VII follows from VIII, IX, and the conditions introduced
earlier. Let us assume that u = ([ρ]+1)h, where h and ρ are the numbers in conditions
VIII and IX.
Then by Hölder's inequality

∫_{ℝ^p} w_m(θ,t) dt ≤ Π_{s=1}^{[ρ]+1} ( ∫_{ℝ^p} Π_{i=1}^{p} |ψ_{j_i}(K_n^{-1/2}(θ) t)|^{[ρ]+1} dt )^{1/([ρ]+1)},

where the indices j_i correspond to the vectors w_{j_1}, ..., w_{j_p} of condition VIII. In
the last integral we substitute the variables

(w_{j_i}(θ), K_n^{-1/2}(θ) t) = u^i, i = 1, ..., p.

The Jacobian of this transformation is equal to

where D_n(θ) is the matrix with columns w_{j_i}(θ), i = 1, ..., p. Since by (10.9)

then

sup_{θ∈T} |det D_n(θ)|^{-1} ≤ (pΔ)^{-p/2}.

On the other hand,

det K_n(θ) ≤ Π_{i=1}^{p} K_{ii}(θ),

where the K_{ii}(θ) are the diagonal elements of the matrix K_n(θ), and by conditions
(10.1) and (10.5) we have, uniformly with respect to θ ∈ T,

K_{ii}(θ) = μ_2 n^{-1} Σ (A_{i_1 l}(θ) g_{l i_2...i_s}(j,θ))^2 < ∞

(see the proof of Lemma 24.1 below). And so

uniformly in θ ∈ T and (10.7) is satisfied.


Let us verify that (10.8) is satisfied. By condition IX the distribution P^{*([ρ]+1)}
is absolutely continuous and has a bounded density. Consequently, by Theorem A.12, for any b > 0,

sup_{|λ|≥b} |ψ(λ)| < 1.

On the other hand

w_m(θ,t) ≤ Π_{s=1}^{[ρ]+1} Π_{i=1}^{p} |ψ_{j_i}(K_n^{-1/2}(θ) t)|.

For the same collection of vectors w_{j_1}, ..., w_{j_p} we obtain for |t| > b

Σ_{i=1}^{p} ((K_n^{-1/2}(θ) w_{j_i}(θ), t))^2 = p (K_n^{-1/2}(θ) W_m^{(p)}(θ) K_n^{-1/2}(θ) t, t)

≥ p Δ (λ_max(K_n(θ)))^{-1} |t|^2 ≥ p κ,

κ = Δ ( sup_{θ∈T} λ_max(K_n(θ)) )^{-1} b^2 > 0,

since from the preceding argument it follows that

sup_{θ∈T} λ_max(K_n(θ)) ≤ c_3 < ∞.

Therefore among the numbers (K_n^{-1/2}(θ) w_{j_i}(θ), t), i = 1, ..., p, a number can be
found that has the property

Therefore by Theorem A.12

Π_{s=1}^{p} |ψ((K_n^{-1/2}(θ) w_{j_s}(θ), t))| < 1,

and consequently (10.8) holds if μ_2 < ∞.


THEOREM 24: Let conditions I_{k+1} be satisfied, and II-VII or II-V, VIII, IX. Also
let the l.s.e. θ̂_n have the following property: for any r > 0

sup_{θ∈T} P_θ^n{ |θ̂_n − θ| ≥ r } = o(n^{-(k-1)/2}).

Then

sup_{θ∈T} sup_{C∈ℭ_q} | P_θ^n{ n^{1/2}(θ̂_n − θ) ∈ C } − ∫_C φ_{μ_2 A(θ)}(y) ( 1 + Σ_{ν=1}^{k−2} M_ν(θ,y) n^{−ν/2} ) dy |

= O(n^{-(k-1)/2} log^{k/2} n),    (10.11)

where M_ν(θ,y) are polynomials of order 3ν in the variables y^1, ..., y^q with coefficients uniformly bounded in n and θ ∈ T.
The proof proceeds according to the plan of action of Pfanzagl [176] and Michel
[158]. The fundamental complexity of the proof of Theorem 24 separates itself into
three lemmas, the proofs of which we now enter upon.

Let Ĝ be the c.f. of the probability measure G on (ℝ^p, 𝔅^p), ν = (ν_1, ..., ν_p)
being a multi-index. Then the quantity

χ_ν = i^{-|ν|} (log Ĝ)^{(ν)}(0)

is called the cumulant of order ν of the measure G [33].


Let us consider the polynomials

χ̄_s(z) = s! Σ_{|ν|=s} (χ_ν / ν!) z^ν

with respect to the variables z^1, ..., z^p. Let us define the polynomials P_s(z; {χ_ν})
in the variables z^1, ..., z^p by equating two formal power series in the variable u:

1 + Σ_{s≥1} P_s(z; {χ_ν}) u^s = exp{ Σ_{s≥1} (χ̄_{s+2}(z) / (s+2)!) u^s }.

To obtain the general form of the polynomial P_s(z; {χ_ν}) we use one fact about
the exponential of a power series ([173] p. 169):

P_s(z; {χ_ν}) = Σ* Π_{m=1}^{s} (χ̄_{m+2}(z))^{k_m} / ( k_m! ((m+2)!)^{k_m} ),    (10.12)

where Σ* denotes the summation over all integral non-negative solutions k_1, ..., k_s
of the equation k_1 + 2k_2 + ... + s k_s = s.

From formula (10.12) it is seen that P_s(z; {χ_ν}) is a polynomial in z^1, ..., z^p of
order 3s and that its coefficients depend upon the cumulants χ_ν with |ν| ≤ s + 2.
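Formula (10.12) is the standard expansion of the exponential of a power series, and it can be checked mechanically: fixing z and treating the values χ̄_{m+2}(z) as plain numbers (the sample values below are arbitrary, purely illustrative), the coefficient of u^s produced by (10.12) must agree with the exponential series computed by its usual recurrence:

```python
from itertools import product
from math import factorial

chi = {3: 1.7, 4: -0.6, 5: 2.2, 6: 0.9, 7: -1.3}   # stand-ins for chi_{m+2}(z)

def P(s):
    """Right-hand side of (10.12): sum over k_1 + 2k_2 + ... + s*k_s = s."""
    total = 0.0
    for ks in product(*(range(s // m + 1) for m in range(1, s + 1))):
        if sum(m * k for m, k in enumerate(ks, start=1)) != s:
            continue
        term = 1.0
        for m, k in enumerate(ks, start=1):
            term *= chi[m + 2] ** k / (factorial(k) * factorial(m + 2) ** k)
        total += term
    return total

# coefficients e_s of exp(sum_{m>=1} c_m u^m), with c_m = chi_{m+2}/(m+2)!,
# via the recurrence e_0 = 1, n*e_n = sum_{m=1}^{n} m*c_m*e_{n-m}
c = {m: chi[m + 2] / factorial(m + 2) for m in range(1, 6)}
e = [1.0]
for n in range(1, 6):
    e.append(sum(m * c[m] * e[n - m] for m in range(1, n + 1)) / n)
```

Both routes give identical coefficients for s = 1, ..., 5, as (10.12) asserts.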
Let us consider a sequence of independent vectors ξ_j with values in ℝ^p and
zero means. Let G_j be the distribution of ξ_j. The c.f. Π_{j=1}^{n} Ĝ_j(t n^{-1/2}) corresponds
to the sum n^{-1/2} Σ ξ_j. Assuming that moments of any order exist for the vectors
ξ_j, j = 1, ..., n, we obtain formally

log Π_{j=1}^{n} Ĝ_j(t n^{-1/2}) = −½ (K_n t, t) + Σ_{s=1}^{∞} (χ̄_{s+2,n}(it) / (s+2)!) n^{-s/2}, t ∈ ℝ^p,

where K_n is the arithmetic mean of the correlation matrices of the distributions
G_1, ..., G_n,

χ̄_s(z) = s! Σ_{|ν|=s} (χ̄_ν / ν!) z^ν,

and the χ̄_ν are the arithmetic means of the cumulants of order ν of the distributions
G_1, ..., G_n. Consequently we formally obtain the a.e.

Π_{j=1}^{n} Ĝ_j(t n^{-1/2}) ≈ exp{ −½ (K_n t, t) } exp{ Σ_{s=1}^{∞} (χ̄_{s+2}(it) / (s+2)!) n^{-s/2} }.    (10.13)

The first term in the a.e. (10.13) is the c.f. of the Gaussian distribution Φ_{K_n}. The
function

P_s(it; {χ̄_ν}) exp{ −½ (K_n t, t) }, t ∈ ℝ^p,

is the Fourier transform of the function P_s(−φ_{K_n}; {χ̄_ν}) formally obtained by
substituting (−1)^{|ν|} φ_{K_n}^{(ν)} in place of (it)^ν for each ν in the polynomial P_s(it; {χ̄_ν}).
In other words, we have the equality

where the written form P_s(−∇; {χ̄_ν}) φ_{K_n} is understood as the application of the
differential operator P_s(−∇; {χ̄_ν}) to the function φ_{K_n}. In fact

φ̂_{K_n}^{(ν)}(t) = (−it)^ν φ̂_{K_n}(t), t ∈ ℝ^p,

which is obtained by taking the νth derivative with respect to x of both parts of
the inverse Fourier transform

φ_{K_n}(x) = (2π)^{-p} ∫_{ℝ^p} exp{ −i(t,x) } φ̂_{K_n}(t) dt, x ∈ ℝ^p.

We shall denote by P_s(−Φ_{K_n}; {χ̄_ν}) the signed measure with density
P_s(−φ_{K_n}; {χ̄_ν}).
The first problem we must solve consists in the construction of the a.e. for a
sum of random vectors of the special form

ξ_{jn}(θ) = w_j(θ) ε_j, j = 1, ..., n.

Let

γ_s = i^{-s} ( (d/dλ)^s log ψ )(0)

be the cumulant of order s of the r.v. ε_j. Since the c.f. of K_n^{-1/2}(θ) ξ_{jn}(θ) has the
form

then, provisionally assuming that ε_j has moments of any order, we find formally

log Π_{j=1}^{n} Ĝ_j(t n^{-1/2}, θ)    (10.14)

And so

χ̄_s(it) = γ_s B_{sn}(it, θ), s = 3, 4, ....

From formula (10.12) it follows that

P_s(it; {χ̄_ν(θ)}) = Σ* Π_{m=1}^{s} γ_{m+2}^{k_m} B_{m+2,n}^{k_m}(it, θ) / ( k_m! ((m+2)!)^{k_m} ), s ≥ 1,    (10.15)

where

In particular,

P_1(it; {χ̄_ν(θ)}) = χ̄_3(it) / 3!,    (10.16)

P_2(it; {χ̄_ν(θ)}) = χ̄_4(it) / 4! + ½ (χ̄_3(it) / 3!)^2

= γ_4 Σ_{|μ|=4} ((it)^μ / μ!) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^μ )

+ ½ ( P_1(it; {χ̄_ν(θ)}) )^2.    (10.17)
Let us remark that

γ_3 = E ε_j^3 = m_3,

In accordance with (10.14) and (10.15) the function e^{-|t|^2/2} P_s(it; {χ̄_ν(θ)}) is
the Fourier transform of a signed measure with density

P_s(−φ; {χ̄_ν(θ)})(x) = Σ* Π_{m=1}^{s} γ_{m+2}^{k_m} B_{m+2,n}^{k_m}(−∇, θ) / ( k_m! ((m+2)!)^{k_m} ) φ(x),    (10.18)

s = 1, 2, ....

From (10.16) and (10.17) we find

P_1(−φ; {χ̄_ν(θ)})(x)

= −γ_3 Σ_{|μ|=3} (1/μ!) φ^{(μ)}(x) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^μ ),    (10.19)

P_2(−φ; {χ̄_ν(θ)})(x)

= γ_4 Σ_{|μ|=4} (1/μ!) φ^{(μ)}(x) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^μ )    (10.20)

+ (γ_3^2 / 2) Σ_{|μ^{(1)}|, |μ^{(2)}|=3} (1 / (μ^{(1)}! μ^{(2)}!)) φ^{(μ^{(1)}+μ^{(2)})}(x)

× ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^{μ^{(1)}} ) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^{μ^{(2)}} ).


The equalities (10.19) and (10.20) can be written in another form. Let us
define the Chebyshev-Hermite polynomial of order s by the equality

H_s(z) = (−1)^s e^{z^2/2} (d^s/dz^s) e^{−z^2/2}, s = 0, 1, 2, ....

Then for x ∈ ℝ^p and a multi-index μ = (μ_1, ..., μ_p) let us set

H_μ(x) = H_{μ_1}(x^1) ⋯ H_{μ_p}(x^p).

Clearly

φ^{(μ)}(x) = (−1)^{|μ|} H_μ(x) φ(x).

Therefore

P_1(−φ; {χ̄_ν(θ)})(x)

= γ_3 ( Σ_{|μ|=3} (1/μ!) H_μ(x) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^μ ) ) φ(x),    (10.21)

P_2(−φ; {χ̄_ν(θ)})(x)

= ( γ_4 Σ_{|μ|=4} (1/μ!) H_μ(x) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^μ )    (10.22)

+ (γ_3^2 / 2) Σ_{|μ^{(1)}|=3, |μ^{(2)}|=3} (1 / (μ^{(1)}! μ^{(2)}!)) H_{μ^{(1)}+μ^{(2)}}(x)

× ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^{μ^{(1)}} ) ( n^{-1} Σ (K_n^{-1/2}(θ) w_j(θ))^{μ^{(2)}} ) ) φ(x).
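The Chebyshev-Hermite polynomials just defined satisfy the recurrence H_{s+1}(z) = z H_s(z) − s H_{s−1}(z) and the orthogonality relation ∫ H_m H_s φ dz = s! δ_{ms}; both are easy to confirm with a short sketch (coefficient lists plus crude quadrature, all parameters illustrative):

```python
import math

def hermite_coeffs(s):
    """Coefficient list (lowest degree first) of H_s via the recurrence
    H_0 = 1, H_1 = z, H_{s+1}(z) = z*H_s(z) - s*H_{s-1}(z)."""
    h_prev, h = [1.0], [0.0, 1.0]
    if s == 0:
        return h_prev
    for k in range(1, s):
        nxt = [0.0] + h                       # multiply H_k by z
        for i, a in enumerate(h_prev):
            nxt[i] -= k * a                   # subtract k * H_{k-1}
        h_prev, h = h, nxt
    return h

def H(s, z):
    return sum(a * z**i for i, a in enumerate(hermite_coeffs(s)))

def inner(m, s, lo=-8.0, hi=8.0, steps=16000):
    """Trapezoidal approximation of the integral of H_m * H_s * phi over R."""
    dz = (hi - lo) / steps
    tot = 0.0
    for k in range(steps + 1):
        z = lo + k * dz
        w = 0.5 if k in (0, steps) else 1.0
        tot += w * H(m, z) * H(s, z) * math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    return tot * dz
```

For example H_3(z) = z³ − 3z and H_4(z) = z⁴ − 6z² + 3, while inner(3,3) ≈ 3! = 6 and inner(2,3) ≈ 0.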

LEMMA 24.1: Let μ_{k+1} < ∞ and let the conditions IV-VII (or IV, V, VIII, IX)
be satisfied. Then for the distribution Q_n(θ) of the sum of vectors

n^{-1/2} Σ K_n^{-1/2}(θ) w_j(θ) ε_j

we have the a.e.

sup_{θ∈T} sup_{B∈𝔅^p} | ∫_B Q_n(θ)(dx) − ∫_B ( φ(x) + Σ_{r=1}^{k−2} n^{-r/2} P_r(−φ; {χ̄_ν(θ)})(x) ) dx |

= O(n^{-(k-1)/2}).    (10.23)

Proof: The proof consists in the verification that the conditions of Theorem A.13
are satisfied for the random vectors ξ_{jn} = w_j(θ) ε_j, j = 1, ..., n. However, conditions (1) and (2) of Theorem A.13 coincide with conditions VI and VII, and therefore only condition (3) needs to be verified. Let us remark that

For the square of each coordinate of the vector w_j(θ) we obtain

|A_{i_1 l}(θ) g_{l i_2...i_s}(j,θ)|^2 ≤ Σ_{l=1}^{q} (A_{i_1 l}(θ))^2 Σ_{l=1}^{q} g_{l i_2...i_s}^2(j,θ).

Therefore

n^{-1} Σ |w_j(θ)|^{k+1} ≤ q^{(k+1)/2} ( max_{1≤i,l≤q} |A_{il}(θ)| )^{k+1} n^{-1} Σ_j ( Σ_{l,i_2,...,i_s} g_{l i_2...i_s}^2(j,θ) )^{(k+1)/2}.

With regard to condition IV it remains to show the uniform boundedness of the
elements of the matrix A(θ). But by the condition V

(det I(θ))^{-1} < λ_0^{-q}.

On the other hand, by condition IV

|I_{il}(θ)| ≤ (n^{-1/2} d_{in}(θ)) (n^{-1/2} d_{ln}(θ))

≤ ( n^{-1} Σ |g_i(j,θ)|^{k+1} )^{1/(k+1)} ( n^{-1} Σ |g_l(j,θ)|^{k+1} )^{1/(k+1)}

< ∞.

Let us denote by Q_n^*(θ) the distribution of the sum n^{-1/2} Σ w_j(θ) ε_j. The
result of Lemma 24.1 remains true for Q_n^*(θ) as well if we bring into the expression
(10.18) for the polynomials P_r the following alterations. In the expression (10.14)
and subsequent formulae replace the sums B_{sn}(it, θ) by the sums

and let us consider the polynomials

P_s^*, s = 1, 2, ....    (10.24)

COROLLARY 24.1: Under the conditions of Lemma 24.1

sup_{θ∈T} sup_{B∈𝔅^p} | ∫_B Q_n^*(θ)(dx) − ∫_B ( φ_{K_n(θ)}(x) + Σ_{r=1}^{k−2} n^{-r/2} P_r^*(−φ_{K_n(θ)}; {χ̄_ν(θ)})(x) ) dx |

= O(n^{-(k-1)/2}).    (10.25)

Proof: Let us note that

sup_{B∈𝔅^p} | ∫_B Q_n(θ)(dx) − ∫_B ( φ(x) + Σ_{r=1}^{k−2} n^{-r/2} P_r(−φ; {χ̄_ν(θ)})(x) ) dx |

= sup_{A∈𝔅^p} | ∫_A Q_n^*(θ)(dx) − ∫_A (det K_n(θ))^{-1/2}

× ( φ(K_n^{-1/2}(θ)x) + Σ_{r=1}^{k−2} n^{-r/2} P_r(−φ; {χ̄_ν(θ)})(K_n^{-1/2}(θ)x) ) dx |,

and

(det K_n(θ))^{-1/2} ( φ(K_n^{-1/2}(θ)x) + Σ_{r=1}^{k−2} n^{-r/2} P_r(−φ; {χ̄_ν(θ)})(K_n^{-1/2}(θ)x) ) dx

= ( φ_{K_n(θ)}(x) + Σ_{r=1}^{k−2} n^{-r/2} P_r^*(−φ_{K_n(θ)}; {χ̄_ν(θ)})(x) ) dx.

Since

then to the signed measure

Φ_{K_n(θ)} + Σ_{r=1}^{k−2} n^{-r/2} P_r^*(−Φ_{K_n(θ)}; {χ̄_ν(θ)})

corresponds the c.f.

In particular we find

P_1^*(−it; {χ̄_ν(θ)})

= γ_3 Σ_{|μ|=3} (1/μ!) (it)^μ ( n^{-1} Σ w_j^μ(θ) ),    (10.26)

P_2^*(−it; {χ̄_ν(θ)})

= γ_4 Σ_{|μ|=4} (1/μ!) (it)^μ ( n^{-1} Σ w_j^μ(θ) ) + ½ ( P_1^*(−it; {χ̄_ν(θ)}) )^2,    (10.27)

P_1^*(−φ_{K_n(θ)}; {χ̄_ν(θ)})(x)

= −γ_3 Σ_{|μ|=3} (1/μ!) ( n^{-1} Σ w_j^μ(θ) ) φ_{K_n(θ)}^{(μ)}(x),    (10.28)

P_2^*(−φ_{K_n(θ)}; {χ̄_ν(θ)})(x)

= γ_4 Σ_{|μ|=4} (1/μ!) ( n^{-1} Σ w_j^μ(θ) ) φ_{K_n(θ)}^{(μ)}(x)

+ (γ_3^2 / 2) Σ_{|μ^{(1)}|, |μ^{(2)}|=3} (1 / (μ^{(1)}! μ^{(2)}!)) ( n^{-1} Σ w_j^{μ^{(1)}}(θ) ) ( n^{-1} Σ w_j^{μ^{(2)}}(θ) )

× φ_{K_n(θ)}^{(μ^{(1)}+μ^{(2)})}(x).    (10.29)

Let us define the polynomials P_r(θ,x), x ∈ ℝ^p, by the equalities

and let us denote

Q̄_n(θ,x) = ( 1 + Σ_{r=1}^{k−2} n^{-r/2} P_r(θ,x) ) φ_{K_n(θ)}(x).    (10.30)

Then the relation (10.25) assumes the form

sup_{θ∈T} sup_{B∈𝔅^p} | ∫_B Q_n^*(θ)(dx) − ∫_B Q̄_n(θ,x) dx | = O(n^{-(k-1)/2}).    (10.31)

From (10.28) and (10.29) it follows that

P_1(θ,x) φ_{K_n(θ)}(x)

= −(γ_3 / 3!) Σ_{i,l,k=1}^{p} ( n^{-1} Σ w_j^i(θ) w_j^l(θ) w_j^k(θ) ) ( φ_{K_n(θ)}(x) )_{ilk},    (10.32)

P_2(θ,x) φ_{K_n(θ)}(x)

= (γ_4 / 4!) Σ_{i,l,k,s=1}^{p} ( n^{-1} Σ w_j^i(θ) w_j^l(θ) w_j^k(θ) w_j^s(θ) ) ( φ_{K_n(θ)}(x) )_{ilks}

+ (γ_3^2 / 2(3!)^2) Σ_{i,l,k,s,r,t=1}^{p} ( n^{-1} Σ w_j^i(θ) w_j^l(θ) w_j^k(θ) ) ( n^{-1} Σ w_j^s(θ) w_j^r(θ) w_j^t(θ) )

× ( φ_{K_n(θ)}(x) )_{ilksrt},    (10.33)

where w_j^i(θ) is the ith coordinate of the vector w_j(θ).
REMARK 24.1: Let us assume that the r.v.-s

X_j = (g_j, θ) + ε_j, j = 1, ..., n,    (10.34)

are observed, where θ ∈ ℝ^p is an unknown parameter, and the vectors g_j ∈ ℝ^p
satisfy the same conditions as the w_j(θ) in Lemma 24.1. If θ̂_n is an l.s.e. of θ,
obtained from the observations X_j, j = 1, ..., n, then

G_n^{1/2} μ_2^{-1/2} n^{1/2} (θ̂_n − θ) = G_n^{-1/2} μ_2^{-1/2} n^{-1/2} Σ g_j ε_j.

Consequently Lemma 24.1 gives, in particular, the a.e. for the normed l.s.e. distribution of the vector parameter of the linear regression (10.34).
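The point of Remark 24.1 — that for a linear model the l.s.e. deviation is exactly the normalised weighted sum of the errors, with no remainder — can be checked directly for p = 2 (an illustrative sketch; Cramer's rule stands in for the inverse of Σ g_j g_j'):

```python
import random

rng = random.Random(7)
n, theta = 200, (0.8, -1.3)
g = [(1.0, rng.uniform(-1, 1)) for _ in range(n)]            # regressors g_j
eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
X = [g1 * theta[0] + g2 * theta[1] + e for (g1, g2), e in zip(g, eps)]

# normal equations (sum g_j g_j') b = sum g_j X_j, solved by Cramer's rule
s11 = sum(a * a for a, _ in g); s12 = sum(a * b for a, b in g)
s22 = sum(b * b for _, b in g)
r1 = sum(a * x for (a, _), x in zip(g, X)); r2 = sum(b * x for (_, b), x in zip(g, X))
det = s11 * s22 - s12 * s12
theta_hat = ((r1 * s22 - r2 * s12) / det, (r2 * s11 - r1 * s12) / det)

# right-hand side of the identity: (sum g_j g_j')^{-1} sum g_j eps_j
e1 = sum(a * e for (a, _), e in zip(g, eps)); e2 = sum(b * e for (_, b), e in zip(g, eps))
dev = ((e1 * s22 - e2 * s12) / det, (e2 * s11 - e1 * s12) / det)
```

Up to floating-point rounding, theta_hat − theta coincides with dev coordinate by coordinate, so the normed l.s.e. is exactly a normed sum of the w_j ε_j type treated in Lemma 24.1.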

The following Lemma is an extended variant of an assertion of Pfanzagl [176].

LEMMA 24.2: Let the mapping f_n(x): ℝ^p → ℝ^p be defined in the following
way:

y^i = f_n^i(x) = x^i + Σ_{r=1}^{k−2} n^{-r/2} h_r^i(θ,x), i = 1, ..., p,    (10.35)

where the h_r^i(θ,x) are polynomials in x = (x^1, ..., x^p) with coefficients uniformly
bounded in n and θ ∈ T. Then, if the conditions (10.1), V, and VI are satisfied,

where

(10.37)

and P̄_r(θ,y), r = 1, ..., k−2, are polynomials in y = (y^1, ..., y^p) with coefficients
uniformly bounded in n and θ ∈ T.
Proof: It is easy to see that the polynomials P̄_r(θ,y) are defined from the expansion
of the functions

Let us note that

(10.39)

with

thanks to the bounds for the diagonal elements K_{ii}(θ) of the matrix K_n(θ) obtained
above. Consequently for θ ∈ T

(10.40)

Therefore

(10.41)

Let us set

The restriction f_n|_{v^c(log n)} of the mapping f_n to the sphere v^c(log n) is one-to-one
if n > n_0. Let

be the inverse function of f_n|_{v^c(log n)}. Expanding the functions g_n^i(y), i = 1, ..., p,
into a Taylor series about y = (y^1, ..., y^p), y^i = f_n^i(x), we establish the existence
of the polynomials Q_0(θ,y) and Q_r^i(θ,y), i = 1, ..., p, r = 1, ..., k−2, with
coefficients bounded in n and θ ∈ T, such that

(10.42)

The first terms of the expansion (10.42) can be obtained in the following way.
Let us formally write

g_n(y) = y + Δy, Δy = Σ_{r≥1} n^{-r/2} Q_r(θ,y),

where

are vector polynomials, and let us consider the identity

y = f_n(g_n(y)) = g_n(y) + Σ_{r=1}^{k−2} n^{-r/2} h_r(g_n(y)),

where

or

0 = Δy + Σ_{r=1}^{k−2} n^{-r/2} h_r(y + Δy).    (10.43)

Equating to zero the coefficients of n^{-1/2} and n^{-1} in the identity (10.43) we find

Q_1 = −h_1,    (10.44)

Q_2^i = −h_2^i + Σ_{j=1}^{p} (h_1^i)_j h_1^j, i = 1, ..., p,

where

(h_1^i)_j = ∂h_1^i / ∂y^j.

From (10.42) follows the existence of the polynomials R_{−1}(θ,y), R_0(θ,y),
R̄_r(θ,y), r = 1, ..., k−2, and the functions R_n^i(θ,y), i = 1, ..., p, bounded by
polynomials, such that

n^{(k-1)/2} | Q̄_n(θ; g_n^1(y), ..., g_n^p(y)) − φ_{K_n(θ)}(y) ( 1 + Σ_{r=1}^{k−2} n^{-r/2} P̄_r(θ,y) ) |

≤ R_{−1}(θ,y) φ_{K_n(θ)}(y)

+ R_0(θ,y) φ_{K_n(θ)}( y^1 + n^{-1/2} R_n^1(θ,y), ..., y^p + n^{-1/2} R_n^p(θ,y) ).    (10.45)

For the proof of (10.45) it is sufficient to write the expansion of the quantity

using the expansion (10.42). In particular we obtain

P̄_1 = P_1 + Σ_{i=1}^{p} t_1^i Q_1^i,    (10.46)

P̄_2 = P_2 + Σ_{i=1}^{p} ( t_1^i Q_2^i + ( t_1^i P_1 + (P_1)_i ) Q_1^i )    (10.47)

where

t_1^i = −Σ_{l=1}^{p} (K_n^{-1}(θ))_{il} y^l,    (10.48)

t_2^i = Σ_{s,t=1}^{p} (K_n^{-1}(θ))_{is} (K_n^{-1}(θ))_{it} y^s y^t − (K_n^{-1}(θ))_{ii}.    (10.49)

Since the functions R~ (8, y), i = 1, ... , p, are bounded by polynomials, then for
n > no

sup sup n-l/2IR~(8, y)1 ~ ~.


IJET yEFn 2

Consequently

and according to (10.40)

φ_{K_n(θ)}(y¹ + n^{−1/2} R_n^1(θ, y), …, y^p + n^{−1/2} R_n^p(θ, y))

 ≤ (2πλ*)^{−p/2} exp{ −(1/(4λ*)) (|y|² − p) }.   (10.50)

And so for y ∈ F_n we obtain from (10.45)

(10.51)

where R*(θ, y) is a polynomial with coefficients uniformly bounded in θ ∈ T and n.
The Jacobian ∂f_n^{−1}(y)/∂y is the determinant of the p × p matrix with the general element

where δ_{ij} is the Kronecker symbol. In this way, for y ∈ F_n and n > n₀

∂f_n^{−1}(y)/∂y > 0.
We find the expansion of the Jacobian ∂f_n^{−1}(y)/∂y in powers of n^{−1/2}. The relation (10.42) shows that there exist polynomials Q̄_i(θ, y), i = 0, 1, …, k − 2, with coefficients uniformly bounded in n and θ ∈ T such that for y ∈ F_n

n^{(k−1)/2} | ∂f_n^{−1}(y)/∂y − (1 + Σ_{r=1}^{k−2} n^{−r/2} Q̄_r(θ, y)) | ≤ Q̄₀(θ, y).   (10.52)

The first polynomials of the expansion (10.52) can be found starting from the
following considerations. The polynomial Q̄₁ is the sum of the polynomials of order
n^{−1/2} of the diagonal elements of the Jacobian matrix of the mapping f_n^{−1}(y), i.e.,

(10.53)

The polynomial Q̄₂ consists of the terms

(1)  − Σ_{i=1}^{p} (h₂^i)_i + Σ_{i,j=1}^{p} ( (h₁^i)_{ij} h₁^j + (h₁^i)_j (h₁^j)_i ),

(2)  Σ_{i<j} (h₁^i)_i (h₁^j)_j,

(3)  − Σ_{i<j} (h₁^i)_j (h₁^j)_i.

The terms (1) are the sum of the polynomials of order n^{−1} standing on the
principal diagonal of the Jacobian matrix. The terms (2) appear as the result
of the pairwise multiplication of the polynomials of order n^{−1/2} lying on the
principal diagonal. The terms (3) appear in the multiplication of polynomials of
order n^{−1/2} placed symmetrically about the principal diagonal. They enter Q̄₂
with the sign 'minus', since the permutations corresponding to these elements
of the determinant ∂f_n^{−1}(y)/∂y contain exactly one inversion.

The polynomial Q̄₂, as is not difficult to verify, admits the more compact
representation

(10.54)
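The structure just described mirrors the classical finite expansion det(I + εA) = 1 + ε tr A + (ε²/2)((tr A)² − tr A²) + O(ε³): the trace collects the diagonal terms, while (tr A)² − tr A² is exactly the pairwise diagonal products minus the symmetric off-diagonal products. A quick numerical check of that determinant identity; the matrix is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
eps = 1e-3                       # stands in for n**(-1/2)

exact = np.linalg.det(np.eye(4) + eps * A)
# second-order expansion of the determinant in powers of eps
approx = 1 + eps * np.trace(A) \
           + 0.5 * eps ** 2 * (np.trace(A) ** 2 - np.trace(A @ A))

assert abs(exact - approx) < 1e-6   # the remainder is O(eps**3)
```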

The expansions (10.51) and (10.52) show that there exists a polynomial P̄₀(θ, y),
with coefficients that are uniformly bounded with respect to θ and n, for which
(see formula (10.38))

n^{(k−1)/2} | Q_n(θ, f_n^{−1}(y)) − Q̄_n(θ, y) |

 = n^{(k−1)/2} | Q_n(θ, f_n^{−1}(y)) − φ_{K_n(θ)}(y) (1 + Σ_{r=1}^{k−2} n^{−r/2} P̄_r(θ, y)) |

 ≤ P̄₀(θ, y),   (10.55)

with

P̄_r = Σ_{ν=0}^{r} P̃_ν Q̄_{r−ν},   r = 1, …, k − 2,

if we adopt

In particular,

P̄₁ = Q̄₁ + P̃₁ = P₁ − Σ_{i=1}^{p} ( ℓ_i h₁^i + (h₁^i)_i ),   (10.56)

or, in a more conveniently written form,

P̄₁ φ_{K_n(θ)} = P₁ φ_{K_n(θ)} − Σ_{i=1}^{p} ( h₁^i φ_{K_n(θ)} )_i.   (10.57)

We further find

P̄₂ = Q̄₂ + P̃₁ Q̄₁ + P̃₂,

whose terms include

− Σ_{i=1}^{p} ( ℓ_i P₁ + (P₁)_i ) h₁^i + (1/2) Σ_{i,j=1}^{p} ℓ_{ij} h₁^i h₁^j.

Regrouping terms, the expression for P̄₂ can be rewritten in the form

P̄₂ φ_{K_n} = P₂ φ_{K_n} − Σ_{i=1}^{p} ( (h₂^i)_i φ_{K_n} + h₂^i (φ_{K_n})_i )

 − Σ_{i=1}^{p} ( (P₁)_i h₁^i φ_{K_n} + P₁ (h₁^i)_i φ_{K_n} + P₁ h₁^i (φ_{K_n})_i )

 + (1/2) Σ_{i,j=1}^{p} ( h₁^i h₁^j φ_{K_n} )_{ij}

 = P₂ φ_{K_n} − Σ_{i=1}^{p} ( P₁ h₁^i φ_{K_n} + h₂^i φ_{K_n} )_i + (1/2) Σ_{i,j=1}^{p} ( h₁^i h₁^j φ_{K_n} )_{ij}.   (10.58)

Thanks to the bounds obtained in the course of the proof of the Lemma,

sup_{θ∈T} sup_{B∈𝔅^p} | ∫_{V^c(log n) ∩ f_n^{−1}(B)} Q_n(θ, x) dx − ∫_{F_n∩B} Q̄_n(θ, y) dy |

 = sup_{θ∈T} sup_{B∈𝔅^p} | ∫_{F_n∩B} Q_n(θ, f_n^{−1}(y)) | ∂f_n^{−1}(y)/∂y | dy − ∫_{F_n∩B} Q̄_n(θ, y) dy |

 = O(n^{−(k−1)/2}).   (10.59)
The relation (10.36) is now a consequence of (10.41), the inclusion

(ℝ^p \ F_n) ⊂ (ℝ^p \ V^c(½ log n)),

and a bound of the form (10.41) for the a.e. Q̄_n(θ, y).

The following Lemma is a sharpening of Theorem 18 of Section 7 which is
useful in the proof of Theorem 24.
LEMMA 24.3: Let the conditions I_{k+1}, II–V be satisfied, and let the l.s.e. θ̂_n have the
following property: for any r > 0

sup_{θ∈T} P_θ { |θ̂_n − θ| ≥ r } = o(n^{−(k−1)/2}).

Then for some constant c* > 0

sup_{θ∈T} P_θ^n { | n^{1/2}(θ̂_n − θ) − Σ_{ν=0}^{k−2} n^{−ν/2} h_ν(θ) | ≥ c* n^{−(k−1)/2} log^{k/2} n }
(10.60)

where the h_ν(θ) are vectors, the coordinates of which are polynomials in the coordinates
of the vectors v_{ν+1}(θ), ν = 0, …, k − 2, with coefficients that are uniformly bounded
in θ ∈ T and n. In particular,

h₀(θ) = v₁(θ),   (10.61)

h₁(θ) = H₁(v₁(θ)),   (10.62)

(10.63)

Proof: The relation (10.60) repeats (7.45). Clearly

h₀(θ) = v₁(θ).

Let the assertion of the Lemma hold for h_i(θ), i = 0, …, l − 1. Let us substitute

u^{(l+1)}(t) = u^{(l)}(t) + t^{l+1} h_l

(see (7.18)) into the equality (7.17):

L_∞(u^{(l+1)}(t), t)
 = −2t B(0; θ) + 2 I(θ) (u^{(l)}(t) + t^{l+1} h_l)
 − 2t B^{(2)}(θ) (u^{(l)}(t) + t^{l+1} h_l)
 + Σ_{|α|≥2} (1/α!) (A(α, θ) − 2t B(α, θ)) (u^{(l)}(t) + t^{l+1} h_l)^α.   (10.64)

For the definition of h_l we equate to zero in (10.64) the coefficients of t^{l+1}:

2 I(θ) h_l − 2 B^{(2)}(θ) h_{l−1}
 + Σ_{2≤|α|≤l+1} (1/α!) A(α, θ) Q_α^{[A]} + Σ_{2≤|α|≤l} (1/α!) B(α, θ) Q_α^{[B]} = 0,   (10.65)

where the Q_α^{[A]} and Q_α^{[B]} are polynomials in the coordinates of the vectors h₀, …, h_{l−1}.
The statement about the h_l is justified by the induction hypothesis and by the
presence in the expression for h_l in (10.65) of the vector

−(1/2) Σ_{|α|=1} Λ(θ) B(α, θ) Q_α^{[B]}.

It is also easy to establish the uniform boundedness of the coefficients of the
polynomials h_ν(θ) by induction, basing ourselves upon the equality (10.65). The
identity of the polynomials (7.35)–(7.37) and (10.61)–(10.63) is verified immediately.

Proof of Theorem 24: Let us consider the distribution of the sum of random vectors
n^{−1/2} Σ w_j(θ) ε_j:

Q_n^0(θ)(B) = (P_θ ∘ v_{k−1}(θ))(B),

where B ∈ 𝔅^p, and the mapping (10.35)

Let us introduce on (ℝ^p, 𝔅^p) the measure

(Q_n^0(θ) ∘ f_n(· ; θ))(B) = (P_θ ∘ v_{k−1}(θ) ∘ f_n(· ; θ))(B).

Lemma 24.1 and the inequality (10.31) show that

sup_{θ∈T} sup_{B∈𝔅^p} | ∫_B (Q_n^0(θ) ∘ f_n(· ; θ))(dx) − ∫_{f_n^{−1}(B; θ)} Q_n(θ, x) dx |
(10.66)
By Lemma 24.2,

sup_{θ∈T} sup_{B∈𝔅^p} | ∫_B (Q_n^0(θ) ∘ f_n(· ; θ))(dx) − ∫_B Q̄_n(θ, y) dy | = O(n^{−(k−1)/2}),   (10.67)

where the first polynomials P̄₁(θ) and P̄₂(θ) of the expansion Q̄_n(θ, y) are given
by the equalities (10.57) and (10.58).
By Lemma 24.3 there exist a constant c* and vector functions

h_ν(· , θ) : ℝ^p → ℝ^p,   ν = 0, …, k − 2,

such that

sup_{θ∈T} P_θ { | n^{1/2}(θ̂_n − θ) − H_n(v_{k−1}(θ); θ) | ≥ c* n^{−(k−1)/2} log^{k/2} n }

 = o(n^{−(k−1)/2}),

H_n(x; θ) = Σ_{ν=0}^{k−2} h_ν(x, θ) n^{−ν/2}.   (10.68)

Let us set

x = c* n^{−(k−1)/2} log^{k/2} n.

Then from (10.68) it follows that

P_θ^n { n^{1/2}(θ̂_n − θ) ∈ C } ≤ (Q_n^0(θ) ∘ H_n(· ; θ))(C_x) + o(n^{−(k−1)/2}),   (10.69)

(Q_n^0(θ) ∘ H_n(· ; θ))(C_{−x}) ≤ P_θ^n { n^{1/2}(θ̂_n − θ) ∈ C } + o(n^{−(k−1)/2}),   (10.70)

uniformly in θ ∈ T and C ∈ ℭ^q.


Let Z(C_x) be the cylinder on C_x in ℝ^p. In Lemma 24.2 let us set

f_n^i(x) = { H_n^i(x; θ) = Σ_{ν=0}^{k−2} h_ν^i(x, θ) n^{−ν/2},   i = 1, …, q,
           { x^i,   i = q + 1, …, p.   (10.71)
Then from Lemma 24.2, (10.67), (10.69) and (10.70) it follows that

P_θ { n^{1/2}(θ̂_n − θ) ∈ C } ≤ (Q_n^0(θ) ∘ f_n(· ; θ))(Z(C_x)) + o(n^{−(k−1)/2})

 ≤ ∫_{Z(C_x)} Q̄_n(θ, y) dy + O(n^{−(k−1)/2}),   (10.72)

P_θ { n^{1/2}(θ̂_n − θ) ∈ C } ≥ (Q_n^0(θ) ∘ f_n(· ; θ))(Z(C_{−x})) + o(n^{−(k−1)/2})

(10.73)

uniformly in θ ∈ T and C ∈ ℭ^q.


Later we shall use the following property of the multi-dimensional Gaussian
distribution. Let us denote

(10.74)

z = (y^{q+1}, …, y^p).

Then [6]

(10.75)

The positive definiteness of the matrix S follows, for example, from the equality

det(μ₂^{−1} K_n) = det Λ det S.
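Both facts used here — the Gaussian conditional mean and covariance that presumably underlie (10.75), and the positive definiteness of S via the determinant factorization — are the standard Schur-complement identities for a partitioned covariance matrix. A small numerical check; the covariance below is an arbitrary positive definite example of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 5, 2
M = rng.standard_normal((p, p))
Sigma = M @ M.T + p * np.eye(p)          # a generic covariance of y = (u, z)

S11, S12 = Sigma[:q, :q], Sigma[:q, q:]  # Cov(u), Cov(u, z)
S21, S22 = Sigma[q:, :q], Sigma[q:, q:]

B = S21 @ np.linalg.inv(S11)             # conditional mean of z given u is B @ u
S = S22 - B @ S12                        # conditional covariance (Schur complement)

# writing z = B @ u + w, with w ~ N(0, S) independent of u, reproduces Sigma:
recon = np.block([[S11, S11 @ B.T], [B @ S11, B @ S11 @ B.T + S]])
assert np.allclose(recon, Sigma)

# det Sigma = det S11 * det S, so S inherits positive definiteness from Sigma
assert np.isclose(np.linalg.det(Sigma), np.linalg.det(S11) * np.linalg.det(S))
assert np.all(np.linalg.eigvalsh(S) > 0)
```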



By the property (10.75) we obtain from (10.37), for the polynomials P̄_r,

∫_{Z(C_x)} φ_{K_n(θ)}(y) P̄_r(θ, y) dy = ∫_{C_x} φ_{μ₂Λ(θ)}(u) M_r(θ, u) du,   (10.76)

M_r(θ, u) = ∫_{ℝ^{p−q}} φ_{μ₂S(θ)}(z − Σ₂₁ I(θ) u) P̄_r(θ, y) dz,   r = 1, …, k − 2.   (10.77)

The functions M_r(θ, u) are polynomials in u; the degree of M_r coincides with the
degree of P̄_r, which is equal to 3r, and the coefficients of M_r are uniformly bounded
in θ ∈ T and n.
It is easy to verify the existence of constants a = a(T) < ∞ and
b = b(T) > 0 such that for r = 1, …, k − 2

sup_{θ∈T} sup_{C∈ℭ^q} ∫_{C_x\C} φ_{μ₂Λ(θ)}(u) |M_r(θ, u)| du ≤ a sup_{C∈ℭ^q} ∫_{C_x\C} φ_{bI_q}(u) du.   (10.78)

Applying Theorem A.11 to the function φ_{bI_q}(u) we find that the right hand side
of (10.78) is of order

O(x) = O(n^{−(k−1)/2} log^{k/2} n).

Consequently

+ O(n^{−(k−1)/2} log^{k/2} n).   (10.79)

The opposite inequality, with the same uniform bound for the remainder term as
in (10.79), can be obtained starting from the inequality (10.73).

COROLLARY 24.2: Let the conditions of Theorem 24 be satisfied for k = 3. Then

(10.80)

Proof: The relation (10.80) follows immediately from (10.11).



11 CALCULATION OF THE FIRST POLYNOMIALS OF AN
ASYMPTOTIC EXPANSION OF THE DISTRIBUTION OF A
LEAST SQUARES ESTIMATOR

This Section is closely related to the preceding one. Using the notations introduced
earlier, we shall not indicate dependence upon n and θ in the formulae.
11. FIRST POLYNOMIALS OF AN ASYMPTOTIC EXPANSION 145

Let us make some preliminary remarks. It is easy to see that

∫_{ℝ^{p−q}} z φ_{μ₂S}(z − Σ₂₁ I u) dz = Σ₂₁ I u,   (11.1)

∫_{ℝ^{p−q}} z z′ φ_{μ₂S}(z − Σ₂₁ I u) dz = μ₂ Σ₂₂ − μ₂ Σ₂₁ I Σ₁₂ + Σ₂₁ I u u′ I Σ₁₂.   (11.2)

From (11.1) it follows that for i, j = 1, …, q

∫_{ℝ^{p−q}} y^{iq+j} φ_{μ₂S}(z − Σ₂₁ I u) dz = Λ^{ir} Π_{(rj)(α)} u^α.   (11.3)

For

t = q² + q + 1, …, ½ q(q+1)(q+2)

let the indices i, j, s = 1, …, q be chosen so that the t-th coordinate of the vector
v_{k−1} is

v_t^{k−1} = Λ^{ir} b_{rjs}.

Then

(11.4)

From (11.2) it follows that

∫_{ℝ^{p−q}} y^{iq+j} y^{lq+m} φ_{μ₂S}(z − Σ₂₁ I u) dz

 = μ₂ Λ^{ir} Λ^{ls} Π_{(rj)(sm)} − μ₂ Λ^{ir} Λ^{ls} Λ^{αβ} Π_{(rj)(α)} Π_{(sm)(β)}

 + Λ^{ir} Λ^{ls} Π_{(rj)(α)} Π_{(sm)(β)} u^α u^β.   (11.5)


Let us find the polynomial M₁(u) of the a.e. (10.11).
From (10.57), (10.75) and (10.77) it follows that

M₁(u) φ_{μ₂Λ}(u) = ∫_{ℝ^{p−q}} P̄₁(y) φ_K(y) dz   (11.6)

 = ∫_{ℝ^{p−q}} P₁(y) φ_K(y) dz − Σ_{i=1}^{q} ∫_{ℝ^{p−q}} ( h₁^i(y) φ_K(y) )_i dz.


Taking advantage of the definition of the mapping (10.71) and the relations (10.61),
(10.62) we find

h₁^i(y) = { y^{iq+α} u^α − (1/4) Λ^{ir} a_{rαβ} u^α u^β,   i = 1, …, q,
          { 0,   i = q + 1, …, p.   (11.7)

Let us further note that from (11.3) and (11.7) it follows that

∫_{ℝ^{p−q}} ( h₁^i(y) φ_K(y) )_i dz = ( φ_{μ₂Λ}(u) ∫_{ℝ^{p−q}} h₁^i(y) φ_{μ₂S}(z − Σ₂₁ I u) dz )_i

 = Λ^{ir} ( Π_{(rα)(β)} − (1/4) a_{rαβ} ) ( u^α u^β φ_{μ₂Λ}(u) )_i.

Since, by formula (7.33),

a_{rαβ} = 2( Π_{(r)(αβ)} + Π_{(α)(rβ)} + Π_{(β)(rα)} ),

then

Σ_{i=1}^{q} ∫_{ℝ^{p−q}} ( h₁^i(y) φ_K(y) )_i dz

 = − Σ_i ( Λ^{ir} Π_{(r)(αβ)} u^α u^β φ_{μ₂Λ}(u) )_i

 = ( − Λ^{βr} Π_{(r)(αβ)} u^α + (1/(2μ₂)) Π_{(αβ)(γ)} u^α u^β u^γ ) φ_{μ₂Λ}(u).   (11.8)

On the other hand, using (10.32) we obtain

∫_{ℝ^{p−q}} P₁(y) φ_K(y) dz = −(γ₃/6) Λ^{iα} Λ^{jβ} Λ^{lγ} Π_{(α)(β)(γ)} ( φ_{μ₂Λ}(u) )_{ijl}   (11.9)

 = (γ₃/(6μ₂³)) Π_{(α)(β)(γ)} u^γ ( u^α u^β − 3μ₂ Λ^{αβ} ) φ_{μ₂Λ}(u).
Combining the equalities (11.8) and (11.9), from (11.6) we find

M₁(u) = ( (γ₃/(6μ₂³)) Π_{(α)(β)(γ)} − (1/(2μ₂)) Π_{(αβ)(γ)} ) u^α u^β u^γ

 + Λ^{βγ} ( Π_{(αβ)(γ)} − (γ₃/(2μ₂²)) Π_{(α)(β)(γ)} ) u^α.   (11.10)

Calculation of the polynomial M₂(u) is considerably more laborious. According
to (10.58) and the definition of the mapping (10.71),

M₂(u) φ_{μ₂Λ}(u) = ∫_{ℝ^{p−q}} P̄₂(y) φ_K(y) dz   (11.11)

 = ∫_{ℝ^{p−q}} P₂(y) φ_K(y) dz

 − Σ_{j=1}^{q} (∂/∂u^j) ∫_{ℝ^{p−q}} ( P₁(y) h₁^j(y) + h₂^j(y) ) φ_K(y) dz

 + (1/2) Σ_{i,j=1}^{q} (∂²/(∂u^i ∂u^j)) ∫_{ℝ^{p−q}} h₁^i(y) h₁^j(y) φ_K(y) dz.

The equality (10.33) and direct calculation show that

Lp-q P2(Y)'PK(Y) dz
1'4
24
AiO: Aj,8 Ak"Y A l6 II ( ( ))
(0:)(,8) (')')(6) 'P1J.2A U ijkl

2
+ 1'3
72 AiO:Aj,8Ak"YAI6AmeArvII (0:)(,8)("Y) II (6)(e)(v) ( ( ))
'P1J.2A U ijklmr

'P1J.2A (u) [7;~~ II{0:)(,8)("Y)II{6)(e)(v) uo:u,8u"Y u 6u e u v


1'4 1'~ i'
+ ( 24J.t~ II{0:)(,8)(')')(6) - 12J.t~ A JII{i)(j)(o:) II{,8) (')')(6)

1'3 2
ij kl 1'32 ij )
+ -4 4 A A II{i)(k)(o:)II(j)(I)(o:) + -84 A A
kl
II{i)(j){o:)II{k)(I)(,8)
0:
U U
,8
J.t2 J.t2
2
1'4 ij kl 1'3 ij kl mr
+( 8J.t~ A A II{i)(j)(k)(l) - 12J.t~ A A A II{i)(k)(m) II (j)(l)(r)

-
1'~
8J.t~
AijAklAmrII
(i)(j)(k)
II
(l)(m)(r)
)]
. (11.12)

From (11.3), (11.5) and (11.7), for i, j = 1, …, q we obtain

(11.13)

 + μ₂ Λ^{ir} Λ^{jk} ( Π_{(rα)(βk)} − Λ^{ls} Π_{(rα)(l)} Π_{(kβ)(s)} ) u^α u^β ] φ_{μ₂Λ}(u).

After differentiation, from (11.13) we find



= III
[ 8Jt~ II Ot{3"(6EV
(Ot{3)("() (6E)(V)U U U U U U

5 ArkII
- -8 II
(r)(Ot{3) (k)("(6) + -2
1 II
(Ot{3)("(6)
)
YOt Y{3 Y"( Y6
1'2 1'2

+ (- ~ A rk II(rOt)(k{3) - A rk II(rk)(Ot{3)

+A rk Aim (~ II(rOt)(I)II(k{3)(m) + II(rk)(/)II(Ot{3)(m) + ~ II(I)(m{3)II(k)(rOt)

+ ~ II(I)(mr)II(k)(Ot{3) + ~ II(I)(r{3)II(k)(mOt)) uOt u {3


+ ~2 Ark AlB (II(rk)(ls) + II(rs)(k/) - AmtII(rs)(m)II(kl)(t)

- AmtII(rk)(m)II(ls)(t) ] 'P1'2 A (u). (11.14)

From the formula (10.63) it follows that

(11.15)

In the second integral of (11.15) the variable y^t corresponds to the r.v.

v_{αβ}^j = Λ^{jr} b_{rαβ},   t = q² + q + 1, …, ½ q(q+1)(q+2).

In accordance with (11.4)

(11.16)

On the other hand, (7.34) shows that

(11.17)

And so

 = −Λ^{jr} ( (1/2) Π_{(rα)(βγ)} + (1/6) Π_{(αβγ)(r)} ) u^α u^β u^γ φ_{μ₂Λ}(u).   (11.18)

Let us further remark that

Lp-q hI (y) (hi (y)) i 'PK(Y) dz (11.19)

_!Ajrara:iUa:U{3
2
r
JRp-q
y iq +{3'PK(y)dz+ua:
JRP-q
r yiq+a:yjq+i'PK(y)dz.

Effecting the calculation of the integrals in (11.19) by the formulae (11.3) and
(11.5) and collecting similar terms, we obtain

Lp-q hI (y) (hi (y)) i 'PK(Y) dz


= (A irAjsII (r)(a:{3) II (i)(s'"Y)U a: U{3 U'"Y

+ J.t2Air Ajs (II(ra:)(si) - A{3'"YII(ra:)({3)II(Si)('"'t)) ua:) 'PJ.l2A(U). (11.20)


After differentiation, from (11.18) and (11.20) we obtain

[ ( - -I II
6 (a:)({3'"Yo) + -1 ArsII
(r)(a:{3)
II
(s)('"'to) - 1 II
- 2 (a:{3) ('"'to)
)
Ua: U{3 U'"Y U0
J.t2 J.t2 J.t2

+ ( A rs ( (2II(ra:)(s{3) + ~ II(rs)(a:{3) + ~ II(r)(sa:{3))


- 3A1k II(ra:)(I)II({3s)(k) - A 1k II(r)(a:{3)II(s)(lk)) ) ua: u{3

+ J.t2Ars Alk (A mtII(rk)(m)II(sl)(t) - II(rk)(sl)) ] 'P/l-2A (u). (11.21)

Let us consider

∫_{ℝ^{p−q}} h₁^i(y) P̄₁(y) φ_K(y) dz   (11.22)



The first term on the right hand side of (11.22) is calculated by formula (11.9).
Resorting to integration by parts we obtain

∫_{ℝ^{p−q}} y^{iq+α} P₁(y) φ_K(y) dz

 = −(γ₃/6) Λ^{iγ} Λ^{kδ} Λ^{sε} Π_{(γ)(δ)(ε)} ( ∫_{ℝ^{p−q}} y^{iq+α} φ_K(y) dz )_{iks}

(11.23)

From the equalities (11.3), (11.9) and (11.23) we find after differentiation

kp-q hi(y)Pl(Y)'Pk(Y) dz (11.24)

= [- 1'3 Airii II Q {3 'Y 6 e


12JL~ (r)(Q{3) (-y)(6)(e)U U U U U

1'3
+ JL~ Air (1 II
"2 (rQ)({3)('Y) + "41 AiBII (r)(Q{3)II(i)(B)('Y)

= - "21 AiB II (rQ)(i) II (B)({3)('Y) ) UQU{3 U'Y

+ 2~2 Air AiB (II(rQ)(i)(B) - AklII(rQ)(k)II(I)(i)(B) U Q] 'P1'2A(U).

From (11.24) we obtain

 − Σ_{i=1}^{q} (∂/∂u^i) ∫_{ℝ^{p−q}} h₁^i(y) P̄₁(y) φ_K(y) dz

1'3 Q{3'Y6ev
= [- 12JL~
II
(Q)({3)('Y)
II
(6e)(v)U U U U U U

+ (2~~ II(Q)({3)('Y6)

+ 1'3 Air (III


JL~ "4 (Q)(i)(r) II ({3'Y)(6) -"41 II (Q)({3)(i) II (-y6)(r)

+ ~ II(Q)({3)('Y)II(6i)(r)) UQU{3U'Y U6

+ (- :~ AiB ( II (Q)(i)({3B) + ~ II(Q)({3)(iB) + ~ II(Q{3)(i)(B)



')'3 Ais Ajr


+ 2"
J.L2
(1
4 II (i)(s)(j) II (a,B)(r) + II (a)(i)(j) II (,Bs)(r)

- ~ II(a)(i)(s)II(,Bj)(r) + ~ II(a)(,B)(i)II(jr)(s)) uau,B

+ 2~2 AklAis (II(kl)(i)(S) - AjrII(k)(I)(j)II(iS)(r)] CPj.l2A(U). (11.25)

From (11.11), (11.12), (11.14), (11.21) and (11.25), after collecting similar terms
in the expressions (11.14) and (11.21), we obtain at last the expression for the
polynomial M₂(u):

[7;!g II(a)(,B)(-y)II(6)(e-)(v) - 1;~~ II(a)(,B)('Y)II(6e-)(v)


+ 1
8J.L~
II
(a,B)('Y)
II
(6e-)(v)
] a,B'Y 6 e-v
U U U U U U

')'4 ')'3 1
+ [ 24 4 II(a)(,B)('Y)(6) + -23 II(a)(,B)('Y6) - -6 II(a,B'Y)(6)
J.L2 J.L2 J.L2

+Ajr ( - 1;!~ II (a)(,B)('Y) II(6) (j)(r) - 8~~ II(a)(,B)(j)II(-y)(6)(r)


')'3 ')'3
+ 4 3 II(a)(j)(r)II(,B'Y)(6) - 4 3 II(a)(,B)(j)II(-y6)(r)
J.L2 J.L2

')'3 1
+ -63 II(a)(,B)('Y)II(6j)(r) - - 8 II(a,B)(j)II('Y6)(r)
J.L2 J.L2

- 2~2 II(a,B)('Y)II(6j )(r) ] u a u,Bu'Yu 6


1 II 1 II 1 II
+ [ AiS ( - ')'4 II
4J.L~ (a)(,B)(i)(s) +2 (a,Bi)(s) - 2 (a,B)(is) + 2 (ai)(,Bs)

')'3 ')'3 ')'3 )


- J.L~ II(a)(i)(,Bs) - 2J.L~ II(a)(,B)(is) - 2J.L~ II(a,B)(i)(s)

+ AisAjr (')'~ TI TI ')'~


4J.L~ (a )(,B)( i) (s) (j)( r) + 8J.L~ TI( a)( i)( s) TI(,B)(j)( r)

I~ 13
+ 4J.L~ TI(a)(i)(j) TI(,B)(s)(r) + 4J.L~ TI(i)(s)(j) TI(a,B)(r)

13 ')'3
+ 1/2 TI(a)(i)(j)TI(,Bs)(r) - -2 2 TI(a)(i)(s)TI(,Bj)(r)
r2 J.L2

+ [Akl Ais (1'4 II


8J.l~ (k)(l)(i)(s) + 1'3 II
2J.l2 (kl)(i)(s)

+ "2
J.l2
II(kl)(is) -
J.l2
"2 II(ki)(ls)
)

2 2
kl is jr ( 1'3 1'3
+A A A - 8J.l~ II(k)(l)(i)II(s)(j)(r) - 12J.l~ II(k)(i)(j)II(l)(s)(r)

(11.26)

The polynomial M₂(u) contains 40 terms, each of which is in turn a sum. For a
symmetric r.v. ε_j the cumulant γ₃ = 0, and the written form of the polynomials
M₁(u) and M₂(u) becomes less cumbersome: for example, in M₂(u) there remain
18 terms. For a Gaussian r.v. ε_j, γ₃ = 0 and γ₄ = 3μ₂².
From the expressions (11.10) and (11.26) it is easy to obtain the form of the
polynomials M₁(u) and M₂(u) for q = 1. For this it is sufficient to note that

Π_{(1)(1)} = n^{−1} Σ g′(j, θ) g′(j, θ),

etc. Let us denote

Π̄₁₁₁ = n^{−1} Σ (g′(j, θ))³,   Π̄₁₂ = n^{−1} Σ g′(j, θ) g″(j, θ),   Π̄₁₃ = n^{−1} Σ g′(j, θ) g‴(j, θ).


Then for q = 1

M₁(u) = ( (γ₃/(6μ₂³)) Π̄₁₁₁ − (1/(2μ₂)) Π̄₁₂ ) u³ + Λ ( Π̄₁₂ − (γ₃/(2μ₂²)) Π̄₁₁₁ ) u,   (11.27)

M2(U) = [8~~ (3~~ Illll - Ill2 y] u6


1'4 1'3 1
+ [ 24J1.24 Illlll + -23 Illl2 - -6 Ill3
J1.2 J1.2

5'Y~
+ A ( - 24J1.~ (
Illll
)2 1'3
+ 6J1.~ IllllIll2 -
5 2)] 4
8J1.2 Ill2 U

+ [A ( - -1'4 Illlll + -1 Ill3 - -21'3 Illl2 )


4J1.~ 2 J1.~

(11.28)
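For a concrete one-parameter regression function the quantities Π̄₁₂ and Π̄₁₃ are simply averaged products of derivatives of g. The sketch below evaluates them for the illustrative model g(j, θ) = exp(θ x_j); this model and the design points are our own example, not taken from the text, and the analytic derivatives are cross-checked against finite differences:

```python
import numpy as np

theta = 0.5
x = np.linspace(0.0, 1.0, 101)           # illustrative design points

def g(th):
    return np.exp(th * x)

g1 = x * g(theta)                        # dg/dtheta
g2 = x ** 2 * g(theta)                   # d^2 g / dtheta^2
g3 = x ** 3 * g(theta)                   # d^3 g / dtheta^3

Pi_12 = np.mean(g1 * g2)                 # n^{-1} sum g' g''
Pi_13 = np.mean(g1 * g3)                 # n^{-1} sum g' g'''

# sanity check of the analytic derivative by central finite differences
h = 1e-5
g1_fd = (g(theta + h) - g(theta - h)) / (2 * h)
assert np.allclose(g1, g1_fd, rtol=1e-7, atol=1e-8)
assert Pi_12 > 0 and Pi_13 > 0
```

With these averages in hand, the coefficients of M₁(u) in (11.27) are direct arithmetic in γ₃, μ₂ and Λ.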
Chapter 3

Asymptotic Expansions Related to the Least Squares Estimator

In this Chapter we find the a.e. of the moments of the l.s.e. θ̂_n and the a.e. of the
distributions of a series of functionals of the l.s.e. used in mathematical statistics.
In this Chapter the assumptions of Chapter 2 about the smoothness of the regression
functions g(j, θ) are kept: for each j there exist derivatives with respect to the
variables θ = (θ¹, …, θ^q) up to some order k ≥ 4 inclusive that are continuous in
θ ∈ Θ^c, where Θ ⊆ ℝ^q is an open convex set. The assumption of Section 10 about the
normalisation n^{1/2} I_q instead of d_n(θ) is also used.

12 ASYMPTOTIC EXPANSION OF LEAST SQUARES ESTIMATOR MOMENTS

This Section contains the a.e. of mixed moments of coordinates of the normed
l.s.e. θ̂_n as n → ∞. In particular the first terms of the a.e. of the bias vector and
correlation matrix of θ̂_n are indicated.
Let

m ≥ max(3, k).

We shall assume that the l.s.e. θ̂_n has the property (3.4):

sup_{θ∈T} P_θ^n { n^{1/2} |θ̂_n − θ| ≥ H } ≤ c H^{−m}.   (12.1)

Sufficient conditions for (12.1) to be satisfied are contained in Sections 2 and 3.


Let us assume that

lim sup_{n→∞} sup_{θ∈T} n^{−1} Σ |g^{(α)}(j, θ)|^m < ∞,   |α| = 1, …, k.

LEMMA 25.1: Let the conditions II, III, V of Section 10, IV₁, and μ_{m+Δ} < ∞
for some Δ > 0 (the condition I_{m+Δ}) be satisfied. Then if the l.s.e. θ̂_n satisfies
the relation (12.1), then for some c* > 0

sup_{θ∈T} P_θ^n { | n^{1/2}(θ̂_n − θ) − Σ_{ν=0}^{k−2} n^{−ν/2} h_ν(θ) | ≥ c* n^{−(k−1)/2} log^{k/2} n }
155

A. V. Ivanov, Asymptotic Theory of Nonlinear Regression


Springer Science+Business Media Dordrecht 1997
156 CHAPTER 3. ASYMPTOTIC EXPANSIONS RELATED TO THE LSE

= O(n^{−(m−2)/2} log^{−m/2} n),   (12.2)


where h_ν, ν = 0, …, k − 2, are vector polynomials of degree ν + 1 in the random
variables b(α; θ), |α| = 1, …, ν + 1, with coefficients that are bounded uniformly
in θ ∈ T and n.
Proof: The Lemma is a corollary of the relation (7.6) of Theorem 17, of Theorem A.4, and of the inequality (12.1) for

H = r₀ n^{1/2}.

The first three polynomials h₀, h₁, h₂ are given by the equalities (7.35), (7.36), and
(7.37), using the normalisation n^{1/2} I_q instead of d_n(θ).
Clearly (12.2) means that

n^{1/2}(θ̂_n − θ) = Σ_{ν=0}^{k−2} n^{−ν/2} h_ν(θ) + e_{k−1}(θ) n^{−(k−1)/2},   (12.3)

where e_{k−1}(θ) is a vector having the following property:

sup_{θ∈T} P_θ^n { |e_{k−1}(θ)| ≥ c* log^{k/2} n } = O(n^{−(m−2)/2} log^{−m/2} n).   (12.4)

LEMMA 25.2: Let the conditions III of Section 10, IV₁, and μ_m < ∞ be satisfied.
Then for the r.v.

|α| = 1, …, k,

there hold the relations

sup_{θ∈T} P_θ^n { |b(α; θ)| ≥ a_n μ₂^{1/2} n^{−1} d_n(α; θ) } ≤ κ_T n^{−(m−2)/2} a_n^{−m},   (12.5)

where κ_T < ∞ is one and the same constant for any sequence

a_n ≥ (m − 2 + δ)^{1/2} log^{1/2} n,

in which δ > 0 is an arbitrary fixed number.
Proof: The Lemma is a rephrasing of Theorem A.5 for the r.v.

ξ_{jn} = ε_j g^{(α)}(j, θ).
For

let us assume
(hv(O), >.} = hv(>'}, v = 0, ... , k - 2,

(ek-l (0), >.) = ek-d>'} ,


(n l / 2(9 n - 0), >.} = On(>'}.
12. ASYMPTOTIC EXPANSION OF LSE MOMENTS 157

Let us fix the integer-valued vector

r = (r₁, …, r_q),   |r| = s ≥ 1.

Let us consider the set of integer-valued vectors with non-negative coordinates

and the set of matrices of dimension (l+1) × q with non-negative integer coefficients

i = 1, …, q, (i₀, …, i_l) ∈ a_{ls} }.

LEMMA 25.3: Let the conditions of Lemma 25.1 be satisfied, and let s ≥ 1 be an
integer. Then

(1)  θ̂_n^s⟨λ⟩ = Σ_{l=0}^{k−2} h_{ls}⟨λ⟩ n^{−l/2} + h_{k−1,s}⟨λ⟩ n^{−(k−1)/2},

h_{ls}⟨λ⟩ = Σ_{a_{ls}} s! Π_{ν=0}^{l} (1/i_ν!) h_ν^{i_ν}⟨λ⟩,   l = 0, 1, …, k − 2,

where Σ_{a_{ls}} is the sum over the set of vectors a_{ls};

(2) the coefficients h̄_{l,r}(θ), |r| = s, of the polynomial h_{ls}⟨λ⟩ of degree λ^r =
λ₁^{r₁} ⋯ λ_q^{r_q} have the form

h̄_{l,r} = Σ_{A_{ls}(r)} s! Π_{ν=0}^{l} Π_{j=1}^{q} (1/d_{νj}!) (h_ν^j)^{d_{νj}},   l = 0, …, k − 2,

where Σ_{A_{ls}(r)} is the sum over the set of matrices A_{ls}(r);

(3) the coefficients h̄_{k−1,r}(θ), |r| = s, of the polynomial h_{k−1,s}⟨λ⟩ have the
following property: a number b > 0 can be found such that for some constant
c₁ = c₁(T) < ∞ there holds

sup_{θ∈T} P_θ^n { max_{|r|=s} |h̄_{k−1,r}(θ)| ≥ c₁ log^b n } = O(n^{−(m−2)/2} log^{−m/2} n).   (12.6)

Proof: The proof of (1) is evident; (3) follows from Lemma 25.2 and the equality
(12.4). The assertion (2) follows from the equality

where Σ_{B_l} is the sum over the set of matrices

B_l = { (d_{νj}) : Σ_{j=1}^{q} d_{νj} = i_ν, ν = 0, …, l }.
Let us denote by M_{l+s}, l = 0, …, k − 2, the set of integer-valued vectors μ
with coordinates μ_α ≥ 0, |α| = 1, …, l + 1, for which

Σ_{|α|=1}^{l+1} μ_α = l + s.
The assertions (1) and (2) of the preceding Lemma show that for the coefficients
h̄_{l,r}, |r| = s, of degree λ^r = λ₁^{r₁} ⋯ λ_q^{r_q} of the polynomial h_{ls}⟨λ⟩ we have the
representation

h̄_{l,r} = Σ_{μ∈M_{l+s}} c_{μ,r}(θ) Π_{|α|=1}^{l+1} b^{μ_α}(α; θ),   (12.7)

where the coefficients c_{μ,r}(θ) are uniformly bounded in θ ∈ T and n, and some of
which may possibly be zero. Indeed, the quantities h_ν^j are polynomials of degree
ν + 1 in the variables b(α; θ), |α| ≤ ν + 1, and

Σ_{ν=0}^{l} Σ_{j=1}^{q} (ν + 1) d_{νj} = l + s

Let us introduce the following sets of matrices with non-negative integral elements

K^{(p)} = { (x_{αj}) : x_j = Σ_{|α|=1} x_{αj} ≥ 2, j = 1, …, p },

K₁^{(p)} = { (x_{αj}) : x_j ≠ 1, j = 1, …, p },

K^{(p)}(j₁, …, j_t) = { (x_{αj}) : x_j ≥ 2, j ∈ (j₁, …, j_t); x_j = 0, j ∉ (j₁, …, j_t) },

t ≤ p.

Let us note that

K_{μ,l}^{(n)} ∩ K₁^{(n)} = ∅

if l = 0, s = 1, and

K_{μ,l}^{(n)} ∩ K₁^{(n)} = ∪_{t=1}^{[(l+s)/2]} ∪_{1≤j₁<⋯<j_t≤n} ( K_{μ,l}^{(n)} ∩ K^{(n)}(j₁, …, j_t) ).

For t = 1, …, [½(l+s)] let us set

Ḡ_{μ,n}(θ; t, l, s) =

LEMMA 25.4: There holds the equality

Ḡ_{μ,n}(θ; t, l, s) = Σ_{p=0}^{t−1} Φ_{μ,n}^{(p)}(θ; t, l, s) n^{−p},

where the Φ_{μ,n}^{(p)} are polynomials in the moments of the r.v.-s ε_j and in the arithmetic
means over all indices j = 1, …, n of products of different powers of different
partial derivatives of the functions g(j, θ) (quantities of the types introduced earlier,
such as Π_{(i₁)(i₂)(i₃)}, Π_{(i₁)(i₂i₃)}, etc.).
Proof: The quantity

n^{−1} Σ_j Π_{|α|=1}^{l+1} (g^{(α)}(j, θ))^{μ_α}

is a polynomial of the stated form. Therefore for Ḡ_{μ,n}(θ; 1, l, s) the assertion of
the Lemma is justified. Let us consider the case t > 1. Let

be a fixed matrix, and

D_t^{(n)} = { 1 ≤ j_ε ≠ j_δ ≤ n, ε ≠ δ; ε, δ = 1, …, t }

a set of indices. Then

where δ is the Kronecker symbol. If t > 2 let us write the analogous equation
for the sums over the sets of indices D_{t−1}^{(n)} etc. In the result we obtain the required
representation for the sum corresponding to the matrix (x_{αj}).
Let us set

w̄_r^{(p)}(θ; t, l, s) =

where the c_{μ,r}(θ) are quantities bounded uniformly in θ ∈ T and n. Let us agree
that a sum over the empty set equals 0. Let us also introduce the set of indices

B_p(k, s) = { (l, u) : 2l + s − 2u = p; l = 0, …, k − 2; u = 1, …, [(l+s)/2] }.

Clearly

B_p(k, s) = ∅

if s and p are numbers of different parity, or l + s = 1.
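The parity claim for B_p(k, s) can be checked mechanically: since 2l − 2u is always even, any (l, u) ∈ B_p(k, s) forces p ≡ s (mod 2). A direct sketch of the definition:

```python
def B_p(k, s, p):
    """Index set {(l, u) : 2l + s - 2u = p, 0 <= l <= k-2, 1 <= u <= (l+s)//2}."""
    return {(l, u)
            for l in range(k - 1)                 # l = 0, ..., k-2
            for u in range(1, (l + s) // 2 + 1)   # u = 1, ..., [(l+s)/2]
            if 2 * l + s - 2 * u == p}

# empty when s and p have different parity ...
assert all(B_p(4, 1, p) == set() for p in (0, 2, 4))
# ... and when l + s = 1 (i.e. l = 0, s = 1) there is no admissible u at all
assert B_p(2, 1, 1) == set()
# non-empty examples
assert B_p(4, 1, 1) == {(1, 1)}
assert B_p(4, 2, 2) == {(1, 1), (2, 2)}
```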
THEOREM 25: Under the conditions of Lemma 25.1, for integer s ≥ 1

E_θ θ̂_n^s⟨λ⟩ = Σ_{l=0}^{k−2} E_θ h_{ls}⟨λ⟩ n^{−l/2} + h_{ns}^*⟨λ⟩,   (12.8)

with, moreover:
(1) The coefficients h_{n,r}^*(θ) of the polynomial h_{ns}^*⟨λ⟩ of degree

λ^r = λ₁^{r₁} ⋯ λ_q^{r_q},   |r| = s,

have the following property:

max_{|r|=s} sup_{θ∈T} |h_{n,r}^*(θ)| = o(n^{−(k−2)/2}),

for which
(a) s ≤ m − 1 for k = 2 and m ≥ 3,
(b) s ≤ m − k + 1 for k > 2 and m ≥ 2k − 2,
(c) s ≤ m − k for k > 2 and m < 2k − 2;

(2)  E_θ h̄_{l,r}(θ) = Σ_{t=1}^{[(l+s)/2]} Σ_{u=1}^{t} w̄_{r,n}^{(t−u)}(θ; t, l, s) n^{−((l+s)/2)+u},   (12.9)

l = 0, …, k − 2;

(3)  (s!/(r₁! ⋯ r_q!)) n^{s/2} E_θ Π_{i=1}^{q} (θ̂_n^i − θ^i)^{r_i}

 = Σ_{p=0}^{k−2} n^{−p/2} ( Σ_{B_p(k,s)} Σ_{t=u}^{[(l+s)/2]} w̄_{r,n}^{(t−u)}(θ; t, l, s) ) + h_{n,r}^{**}(θ),   (12.10)

where the function h_{n,r}^{**}(θ) has the property of the coefficient h_{n,r}^*(θ), and the
coefficients of n^{−p/2} in (12.10) are uniformly bounded in θ ∈ T and n. If s is an
even number then the sum in (12.10) is carried out over even p, and if s is odd
then over odd p.
Proof: For θ ∈ T let us introduce the event

W_n(θ) = { max_{|r|=s} |h̄_{k−1,r}(θ)| < c₁ log^b n },

where c₁ and b are the constants from (12.6). Let χ{A} be the indicator function
of the event A, and χ̄ = 1 − χ.
From Lemma 25.3 it follows that

E_θ χ{W_n(θ)} θ̂_n^s⟨λ⟩ = Σ_{l=0}^{k−2} E_θ χ{W_n(θ)} h_{ls}⟨λ⟩ n^{−l/2} + h_n^{(1)}⟨λ⟩,   (12.11)

where h_n^{(1)}⟨λ⟩ is a polynomial of which all the coefficients are quantities that are
O(n^{−(k−1)/2} log^b n) uniformly in θ ∈ T.
Let us estimate the mathematical expectation E_θ χ{W_n(θ)} h_{ls}⟨λ⟩, l = 0, …,
k − 2. Let l be fixed. The equality (12.7) shows that it is sufficient to estimate
quantities of the form

E_θ χ{W_n(θ)} Π_{|α|=1}^{l+1} (b(α; θ))^{μ_α},   μ ∈ M_{l+s}.

Since

Π_{|α|=1}^{l+1} (b(α; θ))^{μ_α} ≤ ( Σ_{|α|=1}^{l+1} (b(α; θ))² )^{(l+s)/2},

it follows that it suffices to estimate E_θ χ{W_n(θ)} |b(α; θ)|^{l+s} for fixed α, 1 ≤ |α| ≤ l + 1.
For some τ > 1 and δ > 0 let us set

γ_{jn} = τ^j (m − 2 + δ)^{1/2} log^{1/2} n,   j = 0, 1, …,

W_n^{(0)}(θ) = { |b(α; θ)| < γ_{0n} μ₂^{1/2} n^{−1} d_n(α; θ) },   (12.12)

W_n^{(j)}(θ) = { γ_{j−1,n} ≤ |b(α; θ)| μ₂^{−1/2} n d_n^{−1}(α; θ) < γ_{jn} },   j = 1, 2, ….

Clearly

(12.13)

E_θ χ{W_n(θ)} |b(α; θ)|^{l+s}

 ≤ E_θ χ{W_n(θ)} χ{W_n^{(0)}(θ)} |b(α; θ)|^{l+s} + Σ_{j=1}^{∞} E_θ χ{W_n^{(j)}(θ)} |b(α; θ)|^{l+s}.
From the conditions of the Theorem and Lemmas 25.2 and 25.3 it follows that

E_θ χ{W_n(θ)} χ{W_n^{(0)}(θ)} |b(α; θ)|^{l+s} = O(n^{−(m−2)/2} (log n)^{(l+s−m)/2}),   (12.14)

Σ_{j=1}^{∞} E_θ χ{W_n^{(j)}(θ)} |b(α; θ)|^{l+s}

 ≤ (μ₂^{1/2} n^{−1} d_n(α; θ))^{l+s} Σ_{j=1}^{∞} γ_{jn}^{l+s} P_θ^n { |b(α; θ)| ≥ γ_{j−1,n} μ₂^{1/2} n^{−1} d_n(α; θ) }

 ≤ c₂(T) ( Σ_{j=1}^{∞} τ^{j(l+s)−(j−1)m} ) n^{−(m−2)/2} (log n)^{(l+s−m)/2}   (12.15)

uniformly in θ ∈ T. The series on the right hand side of (12.15) is convergent if

s ≤ m − k + 1.
Let us estimate E_θ χ̄{W_n(θ)} |θ̂_n⟨λ⟩|^s, where s ≥ 1 is not obliged to be an
integer. Clearly it is sufficient to estimate the quantity n^{s/2} E_θ χ̄{W_n(θ)} |θ̂_n − θ|^s.
For some β ∈ (0, ½) and τ > 1 let us set

W_{0n}(θ) = { |θ̂_n − θ| < n^{−β} log^{1/2} n },

W_{jn}(θ) = { τ^{j−1} n^{−β} log^{1/2} n ≤ |θ̂_n − θ| < τ^j n^{−β} log^{1/2} n }.

Then, uniformly with respect to θ ∈ T, we have

n^{s/2} E_θ χ̄{W_n(θ)} |θ̂_n − θ|^s

 ≤ n^{s/2} E_θ χ̄{W_n(θ)} χ{W_{0n}(θ)} |θ̂_n − θ|^s + Σ_{j=1}^{∞} n^{s/2} E_θ χ{W_{jn}(θ)} |θ̂_n − θ|^s,   (12.16)

n^{s/2} E_θ χ̄{W_n(θ)} χ{W_{0n}(θ)} |θ̂_n − θ|^s = O(n^{−((m−2−s)/2)−βs} (log n)^{−(m−s)/2}),   (12.17)

Σ_{j=1}^{∞} n^{s/2} E_θ χ{W_{jn}(θ)} |θ̂_n − θ|^s

 ≤ n^{s/2−βs} log^{s/2} n Σ_{j=1}^{∞} τ^{js} P_θ { |θ̂_n − θ| ≥ τ^{j−1} n^{−β} log^{1/2} n }

 ≤ c ( Σ_{j=1}^{∞} τ^{js−m(j−1)} ) n^{−(1/2−β)(m−s)} (log n)^{−(m−s)/2}.   (12.18)

In the latter bound the inequality (12.1) was used for

H = τ^{j−1} n^{1/2−β} log^{1/2} n.

For the majorisation of the right hand side of the relations (12.17) and (12.18)
by a quantity O(n^{−(k−2)/2} (log n)^{−(m−s)/2}) it is sufficient to take an integer s and
a number β ∈ (0, ½) such that

(i)  1 ≤ s ≤ min(s*, m − 1),

(ii)  m − s*(1 − 2β) = k,

(iii)  (m − s*)(1 − 2β) = k − 2.

The system of equations (ii) and (iii) has the solution

β = 1/m   and   s* = m − k + 2(m − k)/(m − 2).

For k = 2 all s ≤ m − 1 satisfy (i)–(iii). Let

1 ≤ 2(m − k)/(m − 2) < 2,

i.e., k > 2 and m ≥ 2k − 2. Then

s ≤ m − k + 1.

Finally, let

2(m − k)/(m − 2) < 1,

i.e.,

m < 2k − 2.

In this case

s ≤ m − k,

and the assertion (1) is proved.
Let us fix μ ∈ M_{l+s}. We obtain successively

E_θ Π_{|α|=1}^{l+1} (b(α; θ))^{μ_α}

 = n^{−(l+s)/2} E_θ Π_{|α|=1}^{l+1} ( Σ_j g^{(α)}(j, θ) ε_j )^{μ_α}

 = n^{−(l+s)/2} Σ_{t=1}^{[(l+s)/2]} Σ_{1≤j₁<⋯<j_t≤n} Σ_{K_{μ,l}^{(n)} ∩ K^{(n)}(j₁,…,j_t)} E_θ Π_{|α|=1}^{l+1} μ_α! Π_{i=1}^{t} (1/x_{αj_i}!) (g^{(α)}(j_i, θ))^{x_{αj_i}} ε_{j_i}^{x_{αj_i}}

 = n^{−(l+s)/2} Σ_{t=1}^{[(l+s)/2]} Σ_{K_{μ,l}^{(t)} ∩ K^{(t)}} Σ_{1≤j₁<⋯<j_t≤n} E_θ Π_{|α|=1}^{l+1} μ_α! Π_{i=1}^{t} (1/x_{αi}!) (g^{(α)}(j_i, θ))^{x_{αi}} ε_{j_i}^{x_{αi}}

 = n^{−(l+s)/2} Σ_{t=1}^{[(l+s)/2]} n^t Ḡ_{μ,n}(θ; t, l, s).

Taking into account the equality obtained and bringing in the result of Lemma 25.4
we obtain assertion (2) of the Theorem.
To prove (3) let us note that

Σ_{l=0}^{k−2} E_θ h̄_{l,r}(θ) n^{−l/2}

 = Σ_{l=0}^{k−2} Σ_{t=1}^{[(l+s)/2]} Σ_{u=1}^{t} w̄_{r,n}^{(t−u)}(θ; t, l, s) n^{−l−(s/2)+u}

 = Σ_{l=0}^{k−2} Σ_{u=1}^{[(l+s)/2]} Σ_{t=u}^{[(l+s)/2]} w̄_{r,n}^{(t−u)}(θ; t, l, s) n^{−l−(s/2)+u}.
The cases s = 1, 2 are especially interesting for applications. For s = 1 Theorem 25 gives the a.e. for the bias vector of the l.s.e. θ̂_n. For s = 2 we obtain the
a.e. of the second-order moments of the coordinates of θ̂_n, in order to find the
a.e. of the correlation matrix of θ̂_n. Let us denote

Let us find the first terms of the a.e. of the bias vector of θ̂_n and of the matrix C_n(θ).
The first terms of the a.e. of the correlation matrix

can be found making use of the equality

COROLLARY 25.1: Let the conditions of Theorem 25 be satisfied for k ≥ 4. Then

E_θ θ̂_n⟨λ⟩ = −(μ₂/2) Λ^{ii₁}(θ) Λ^{i₂i₃}(θ) Π_{(i₁)(i₂i₃)}(θ) λ^i n^{−1/2} + l_n⟨λ⟩,   (12.19)

and the coefficients of the linear form l_n⟨λ⟩ are of order o(n^{−1}) uniformly in θ ∈ T.

Proof: For s = 1 in the equality (12.8)

Clearly

We further find

(12.20)

Lastly let us note that the coefficients of the form E_θ h₂⟨λ⟩ are O(n^{−1/2}). The
relation (12.19) now follows from (12.20) and (7.33).
The relation (12.19) can be rewritten in the form

(12.21)

i = 1, …, q.

For k = 3 the equalities (12.21) hold, with the remainder terms being o(n^{−1}). For
k = 2 it is only possible to state that

E_θ θ̂_n^i = θ^i + o(n^{−1/2}),   i = 1, …, q.   (12.22)
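A Monte Carlo illustration of the behaviour described by (12.22): for a one-parameter nonlinear model the l.s.e. is asymptotically unbiased, with a bias far smaller than the n^{−1/2}-order sampling spread. The model g(j, θ) = exp(θ x_j), the design, the noise level and the Newton solver below are all our own illustrative choices, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta0, sigma = 100, 1.0, 0.1
x = np.linspace(0.0, 1.0, n)

def lse(y, th, steps=8):
    # Newton iterations on the normal equation sum (y - g) g' = 0
    for _ in range(steps):
        g  = np.exp(th * x)
        g1 = x * g
        g2 = x ** 2 * g
        score = np.sum((y - g) * g1)
        hess  = np.sum((y - g) * g2 - g1 ** 2)   # derivative of the score
        th -= score / hess
    return th

reps = 400
est = np.array([lse(np.exp(theta0 * x) + sigma * rng.standard_normal(n), theta0)
                for _ in range(reps)])

bias = est.mean() - theta0
assert abs(bias) < 0.002      # bias is far below the per-sample sd
assert est.std() < 0.02       # spread is of order n**(-1/2)
```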

COROLLARY 25.2: Let the conditions of Theorem 25 be satisfied for k ≥ 4. Then

(12.23)

furthermore:
(1) the elements of the matrix A^{(2)}(θ) have values of order o(n^{−1}) uniformly
in θ ∈ T;

( Aiil Aiil Ai2h ( m3 [II(ili 2)Ud(h) + II(ili2)(id(h)]

+ J.t~ [II(il i2)(ilh) - II(hid(hh)



x ( - ~3 [IICit)(h)(is) (IICh)(i2 is) + II(i2)(ids) + IIUs)(ilh))

+ II(1)(h)( is) (II(id Ci2is) + II(i2)( ids) + II(3)( il i2)) ]

+ JL~ [IICh)(iSis) (~IICidCi2jd + ~IIUdCili2) +II(i2)(idl))


1 1
+ "4 II(idCi2h)IIUdCi3is) + "2 II(il)(i2is)IIUdCisi2)
+ II(il)(i2js)IICis)(ilh) + IIUdCi2is)II(is)(ilh)

- IICh)(jds)IICi2)(idS)]) q . (12.24)
',J=l

Proof: Evidently,

h₀₂⟨λ⟩ = h₀²⟨λ⟩,

h₁₂⟨λ⟩ = 2 h₀⟨λ⟩ h₁⟨λ⟩,

h₂₂⟨λ⟩ = h₁²⟨λ⟩ + 2 h₀⟨λ⟩ h₂⟨λ⟩.

It is easy to see that

E_θ h₀²⟨λ⟩ = μ₂ Λ^{ij} λ^i λ^j,   (12.25)

(12.26)

x (2II(ili2)(h)(h)

- AisisIIUdCh)(is) (ilCidCi2is) + II(i2)(ids) + IIUs)(ili2))) )..i)..jn- 1 / 2

Equally simple, but more cumbersome, calculations result in the equality

(12.27)

+ (II(idCi2is) + IICi2)Cids))

x (~IIU!)ChiS) + ~ IICis)(ilh) - IICh)(jl is))



Table 3.1: Minimal Values of m.

          k = 2   k = 3   k = 4
 s = 1      3       4       5
 s = 2      3       4       6

Taking advantage of relations (7.33) and (7.34), analogously to (12.27) we obtain

 + Π_{(i₂)(i₁i₃)} Π_{(j₁)(i₂i₃)} ) λ^i λ^j + O(n^{−1}).   (12.28)

The result of Corollary 25.2 follows from the relations (12.25)–(12.28) after the
collection of similar terms and the symmetrisation of the expressions obtained.

For s = 2 and k = 3, thanks to the relations (12.25) and (12.26), equality (12.8)
can be rewritten in the form

(12.29)

where by o(n^{−1/2}) we denote a matrix, the elements of which decrease with the
same degree of n.
For k = 2 it is possible to state that

(12.30)   i, j = 1, …, q.

In Table 3.1 the minimal values of m are shown which are determined by
Theorem 25 for k = 2, 3, 4, and for which the relations mentioned above for the bias
vector and the matrix of second moments of the l.s.e. θ̂_n already hold.

13 ASYMPTOTIC EXPANSIONS RELATED TO THE ESTIMATOR


OF THE VARIANCE OF ERRORS OF OBSERVATION

In applications a typical situation is that in which the variance

 μ₂ = σ² > 0

of the errors of observation ε_j in the model (0.1) is unknown. The rigorous statistical
treatment of the observations X_j, j = 1, ..., n, in the model (0.1) provides
a means of obtaining estimators both for θ and for σ². Therefore, along with the
problem of the estimation of the parameter θ there arises the problem of estimating
the parameter σ². As an estimator of the variance σ² of the observations X_j,
j = 1, ..., n, let us take the statistic

 σ̂²_n = n^{−1} L(θ̂_n),

where θ̂_n is the l.s.e. of the parameter θ. The estimator σ̂²_n already appears in
the formulation of Theorem 22 of Section 8.
In this Section the a.e.-s will be obtained for the normed estimator σ̂²_n and
its first two moments, and also the a.e. of the distribution of the estimator σ̂²_n.
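In this notation L(θ) = Σ_j (X_j − g(j, θ))² is the residual sum of squares, so σ̂²_n is the mean squared residual at the least squares point. A minimal numerical sketch (the exponential regression function, the Gauss–Newton iteration and all names below are illustrative assumptions, not taken from the book):

```python
import numpy as np

def g(t, theta):
    # hypothetical regression function g(t, theta) = theta_1 * exp(-theta_2 * t)
    return theta[0] * np.exp(-theta[1] * t)

def lse_and_variance(x, t, theta_start, steps=50):
    """Gauss-Newton least squares fit, then sigma2_hat = n^{-1} L(theta_hat)."""
    theta = np.asarray(theta_start, dtype=float)
    for _ in range(steps):
        r = x - g(t, theta)                        # residuals X_j - g(j, theta)
        # Jacobian of g with respect to theta (analytic for the model above)
        J = np.column_stack([np.exp(-theta[1] * t),
                             -theta[0] * t * np.exp(-theta[1] * t)])
        theta = theta + np.linalg.lstsq(J, r, rcond=None)[0]
    L = np.sum((x - g(t, theta)) ** 2)             # L(theta_hat)
    return theta, L / len(x)                       # theta_hat, sigma2_hat

rng = np.random.default_rng(0)
n, sigma = 400, 0.1
t = np.linspace(0.0, 1.0, n)
x = g(t, np.array([2.0, 1.5])) + rng.normal(0.0, sigma, n)
theta_hat, s2 = lse_and_variance(x, t, [1.0, 1.0])
print(theta_hat, s2)
```

For large n the mean squared residual is close to σ²; the expansions of this Section quantify the lower-order bias and fluctuations of this estimator.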
Let us assume that

 μ_{m+Δ} < ∞

for some m ≥ max(3, k) and Δ > 0. Assuming the conditions of Lemma 25.1 of
Section 12 to be satisfied, let us consider the quantity

(13.1)

and let us write the Taylor expansion in the powers

 u^α = (u¹)^{α₁} ⋯ (u^q)^{α_q}

of the functions n^{−1/2} φ_n(θ + n^{−1/2}u, θ) and n^{−1/2}(b(θ + n^{−1/2}u) − b(θ)). We obtain

n^{−1/2} φ_n(θ + n^{−1/2}u, θ)

 + n^{−1/2} Σ_{|α|=k−1} (1/α!) ( φ_n^{(α)}(θ + n^{−1/2}u*, θ) − φ_n^{(α)}(θ, θ) ) u^α,   (13.2)

where |u*| < |u|. The derivative φ_n^{(α)}, |α| ≥ 2, has the following form:

 φ_n^{(α)}(θ + n^{−1/2}u, θ)
 = n^{−|α|/2} Σ_{β+γ=α} c(β, γ) Σ_j g^{(β)}(j, θ + n^{−1/2}u) g^{(γ)}(j, θ + n^{−1/2}u),

where β ≠ 0, γ ≠ 0 are multi-indices, and the c(β, γ) are constants. Therefore the
remainder term in the expansion (13.2) can be written in the form

R_{k−1}(θ)

 = n^{−k/2} Σ_{|α|=k−1} (1/α!) ( Σ_{β+γ=α} c(β, γ) Σ_j ( g^{(β)}(j, θ + n^{−1/2}u*) g^{(γ)}(j, θ + n^{−1/2}u*)

 − g^{(β)}(j, θ) g^{(γ)}(j, θ) )

 + 2 Σ_j ( g(j, θ + n^{−1/2}u*) − g(j, θ) ) g^{(α)}(j, θ + n^{−1/2}u*) ).


If condition II of Section 10 is satisfied, then R_{k−1}(θ) admits the bound

 |R_{k−1}(θ)| ≤ c₁ n^{−(k−1)/2} |u|^k.   (13.3)
From Lemma 25.1 of Section 12 it is not difficult to deduce that there exists a
constant c₂ > 0 such that

 sup_{θ∈T} P_θ^n{ n^{1/2} |θ̂_n − θ| ≥ c₂ log^{1/2} n } = O(n^{−(m−2)/2} log^{−m/2} n).   (13.4)

A relation close to (13.4) was mentioned earlier in the statement of Theorem 19
of Section 7. And so, under the conditions of Lemma 25.1, for

 u = n^{1/2}(θ̂_n − θ)

from (13.3) and (13.4) we obtain


R_{k−1}(θ) = n^{−(k−1)/2} η^{(1)}_{k−1}(θ),

with

 sup_{θ∈T} P_θ^n{ |η^{(1)}_{k−1}(θ)| ≥ c₃ log^{k/2} n } = O(n^{−(m−2)/2}(log n)^{−m/2}).   (13.5)

Consequently

 + η^{(1)}_{k−1}(θ) n^{−(k−1)/2},   (13.6)

where

 Π^{(β)(γ)}(θ) = n^{−1} Σ_j g^{(β)}(j, θ) g^{(γ)}(j, θ).
Analogously we find

 Σ_{|α|=1}^{k−2} (1/α!) b(α; θ) (n^{1/2}(θ̂_n − θ))^α n^{−|α|/2}

 + η^{(2)}_{k−1}(θ) n^{−(k−1)/2},   (13.7)

where the r.v. η^{(2)}_{k−1}(θ) has the property (13.5) with some constant c₄. Therefore

 n^{1/2}(σ̂²_n − σ²)

 = n^{−1/2} Σ_j (ε_j² − σ²)

 + Σ_{|α|=ν+1} (1/α!) Σ_{β+γ=α} c(β, γ) Π^{(β)(γ)}(θ) (n^{1/2}(θ̂_n − θ))^α n^{−ν/2}

 + η_{k−1}(θ) n^{−(k−1)/2},   (13.8)

where the r.v. η_{k−1}(θ) has the property (13.5) with some constant c₅.
Instead of the quantities (n^{1/2}(θ̂_n − θ))^α let us substitute in (13.8) their a.e.-s
obtained in Lemma 25.3 of the preceding Section. For this, with fixed α, let us
set r = α and s = ν, ν + 1. Then for |α| = ν, for example,

 (13.9)

When s = ν + 1 we obtain an analogous formula. The substitution of (13.9) in
(13.8) after some simple transformations leads to the a.e.
n^{1/2}(σ̂²_n − σ²) = n^{−1/2} Σ_j (ε_j² − σ²)

 + Σ_{|α|=ν+1} (1/α!) Σ_{β+γ=α} c(β, γ) Π^{(β)(γ)}(θ) (ν + 1)!

 (13.10)

where the r.v. ζ_{k−1}(θ) has the property (13.5) with the constant c₆.
Let us denote by P_ν the polynomials in the sums b(α; θ) which in the a.e.
(13.10) are the coefficients of the powers n^{−ν/2}, ν ≥ 1, and let us set

The a.e. (13.10) determines the order of decrease of the remainder term, but
is of little use in the calculation of the polynomials P_ν(θ). Let us determine
formulae giving a visible form for the polynomials P_ν(θ) of the a.e. (13.10).
Assuming that the functions g(j, θ) are infinitely differentiable, we find formally

(13.11)

where the functions a_{i₁⋯i_r}(θ) and the sums of the r.v.-s b_{i₁⋯i_r}(θ) are defined in
Section 7. Substituting in (13.11) the formal expansion

 = Σ_{α_t=0}^{∞} h^{i_t}_{α_t}(θ) n^{−α_t/2},  t = 1, …, r,   (13.12)
we find
n^{1/2}(σ̂²_n − σ²)   (13.13)

 = P₀(θ) + Σ_{ν=1}^{∞} Σ_{r+|α(r)|=ν+1} (1/r!) a_{i₁⋯i_r}(θ) h^{i₁}_{α₁}(θ) ⋯ h^{i_r}_{α_r}(θ)

The summation in Σ_{r+|α(r)|=ν} is carried out over the integer-valued r-dimensional
vectors α(r) = (α₁, …, α_r) with non-negative coordinates.
For ν = 1, 2, … let us set

 (13.14)

 B_ν(θ) = Σ_{r+|α(r)|=ν} (1/r!) b_{i₁⋯i_r}(θ) h^{i₁}_{α₁}(θ) ⋯ h^{i_r}_{α_r}(θ),   (13.15)

and we may state as proved the following assertion:



THEOREM 26: Under the conditions of Lemma 25.1 of Section 12 there exists a
constant c₇ > 0 such that

 sup_{θ∈T} P_θ^n{ | n^{1/2}(σ̂²_n − σ²) − Σ_{ν=0}^{k−2} P_ν(θ) n^{−ν/2} | ≥ c₇ n^{−(k−1)/2} log^{k/2} n }

 (13.16)

where

 P_ν(θ) = A_ν(θ) − 2B_ν(θ),  ν = 1, …, k − 2,   (13.17)

are homogeneous polynomials of degree ν + 1 with respect to the quantities b(α; θ),
|α| = 1, …, ν, with coefficients uniformly bounded in θ ∈ T and n.

REMARK 26.1: If in the conditions of Lemma 25.1 the condition (12.1) is replaced
by a weaker condition, for example, for any r > 0

 sup_{θ∈T} P_θ{ |θ̂_n − θ| > r } = o(n^{−(m−2)/2}),   (13.18)

then the conclusion of Theorem 26 remains true with the right hand side of (13.16)
replaced by a quantity that is O(n^{−(m−2)/2}). For this it would be sufficient for the
moment μ_m to be finite.

Let us find the first polynomials of the a.e. (13.10), or, what is the same thing,
the polynomials of the a.e. (13.16). Using the relations (7.32), (7.33) and (7.35),
(7.36), from (13.14) and (13.15) we obtain

 A₁ = (1/2!) a_{i₁i₂} h₀^{i₁} h₀^{i₂} = A^{ij} b_i b_j,

 B₁ = (1/1!) b_i h₀^{i} = A^{ij} b_i b_j,

 P₁ = A₁ − 2B₁ = −A^{ij} b_i b_j,   (13.19)

A₂ = (1/3!) a_{i₁i₂i₃} h₀^{i₁} h₀^{i₂} h₀^{i₃} + (2/2!) a_{i₁i₂} h₀^{i₁} h₁^{i₂}

 (13.20)

More cumbersome, but equally simple, computations invoking the equalities
(7.34) and (7.37) lead to the expression

x [(~II(id(i2iSi4) + ~II(ili2)(iSi4) bhbhbjsbj4


+ 2II(j4)(jds)bhia bh bi2 bi4 + II(j4)(h3a)bili4bh bi 2 bi a]

(13.21)

Having available the a.e. (13.16) and using the method of Section 12, it is
possible to obtain the a.e. for the moments of any order of the r.v. n^{1/2}(σ̂²_n − σ²).
However, we concentrate on the considerably more special problem (but the most
interesting for applications) of the determination of the initial terms of the a.e. of
the first two moments E_θ n^{1/2}(σ̂²_n − σ²) and E_θ n(σ̂²_n − σ²)². Thus we shall start
from the expansion (13.16) for k = 4:

 n^{1/2}(σ̂²_n − σ²) = Σ_{ν=0}^{2} P_ν(θ) n^{−ν/2} + ζ₃(θ) n^{−3/2},   (13.22)

 sup_{θ∈T} P_θ^n{ |ζ₃(θ)| ≥ c₈ log² n } = O(n^{−(m−2)/2}(log n)^{−m/2}).   (13.23)

THEOREM 27: Let the conditions of Lemma 25.1 be satisfied for k = 4 and m ≥ 6.
Then

m=6,7,
(13.24)
m~8.

Proof: The proof is close to the proof of Theorem 25. Let us introduce the event

Then we have

 E_θ^n n^{1/2}(σ̂²_n − σ²) χ{Ω_n(θ)} = Σ_{ν=0}^{2} E_θ^n P_ν(θ) χ{Ω_n(θ)} n^{−ν/2}

 + O(n^{−3/2} log² n)   (13.25)

uniformly in θ ∈ T.
Let us estimate E_θ^n P_ν(θ) χ{Ω_n(θ)}, ν = 0, 1, 2. Let us denote by M_ν the
collection of integer-valued vectors μ with coordinates μ_α ≥ 0, |α| = 1, …, ν, such
that

 Σ_{|α|=1}^{ν} μ_α = ν + 1.

Then, in correspondence with Theorem 26, the polynomials P_ν, ν ≥ 1, admit the
representation

 P_ν(θ) = Σ_{μ∈M_ν} c_μ(θ) Π_{|α|=1}^{ν} b^{μ_α}(α; θ),

where the c_μ(θ) are coefficients (some of which may be zero) that are bounded
uniformly in θ ∈ T and n. Therefore for the estimation of E_θ^n P_ν χ{Ω_n} it is
sufficient to estimate the quantities E_θ^n |b(α; θ)|^{ν+1} χ{Ω_n(θ)}, |α| = 1, …, ν. Fixing
α and using the notation (12.12), by analogy with (12.13)–(12.15) we obtain

E_θ^n |b(α; θ)|^{ν+1} χ{Ω_n(θ)}

 ≤ E_θ^n χ{Ω_n(θ)} χ{W_{0n}(θ)} |b(α; θ)|^{ν+1} + Σ_{j=1}^{∞} E_θ^n χ{W_{jn}(θ)} |b(α; θ)|^{ν+1}

 ≤ c₉ n^{−(m−2)/2} (log n)^{(ν+1−m)/2} Σ_{j=0}^{∞} 2^{j(ν+1−m)}.   (13.26)

The bound (13.26) is non-trivial, since m > ν + 1 in the conditions of the Theorem
being proved. Let us further observe that the r.v.-s (μ₄ − σ⁴)^{−1/2}(ε_j² − σ²), j =
1, …, n, have finite moments of order [m/2] ≥ 3. Therefore the application of
Theorem A.5 to the sum of the r.v.-s (μ₄ − σ⁴)^{−1/2} P₀ analogously to (13.26) gives
the bound

 E_θ^n |P₀| χ{Ω_n(θ)} ≤ c₁₀ n^{−(m−2)/2} (log n)^{−(m−1)/2}

 + c₁₁ n^{−([m/2]−2)/2} (log n)^{−([m/2]−1)/2}.   (13.27)


Let us further note that, thanks to (13.1),

 n^{1/2} |σ̂²_n − σ²| ≤ 2|P₀| + 2n^{−1/2} φ_n(θ̂_n, θ) + n^{1/2} σ²

 ≤ 2|P₀| + c₁₂ n^{1/2} |θ̂_n − θ|² + n^{1/2} σ².   (13.28)

Therefore in order to estimate n^{1/2} E_θ^n |σ̂²_n − σ²| χ{Ω_n(θ)} it is sufficient to estimate
the quantity n^{1/2} E_θ^n |θ̂_n − θ|² χ{Ω_n(θ)}. By analogy with (12.16)–(12.18) we find
(with β = 1/m)

 n^{1/2} E_θ^n |θ̂_n − θ|² χ{Ω_n(θ)}

 ≤ n^{1/2} E_θ^n χ{Ω_n(θ)} χ{W_{0n}(θ)} |θ̂_n − θ|² + Σ_{j=1}^{∞} n^{1/2} E_θ^n χ{W_{jn}(θ)} |θ̂_n − θ|²

 ≤ ( c₁₃ + c₁₄ Σ_{j=1}^{∞} 2^{2j−m(j−1)} ) n^{−(m−3)/2−2/m} (log n)^{−(m−2)/2}.   (13.29)

The estimates (13.25)–(13.29) show that we have, uniformly in θ ∈ T,

 n^{1/2} E_θ^n (σ̂²_n − σ²)

 = n^{−1/2} E_θ^n P₁(θ) + n^{−1} E_θ^n P₂(θ) + r_n(θ) + r_{n,m}(θ),   (13.30)

where

Let us further note that

uniformly in 0 E T, and

In this manner we have

 sup_{θ∈T} | E_θ^n n^{1/2}(σ̂²_n − σ²) + q σ² n^{−1/2} |

   = O(n^{−1/2}(log n)^{−1}),   m = 6, 7,
   = O(n^{−1}(log n)^{−3/2}),   m = 8, 9,   (13.31)
   = O(n^{−3/2} log² n),        m ≥ 10.

The relations (13.31) are a more precise form of (13.24).


If k = 3 then instead of (13.31), using a similar argument, we obtain

(13.32)

For k = 2 it is possible only to state that

(13.33)

THEOREM 28: Let the conditions of Lemma 25.1 be satisfied for k = 4 and m ≥ 8.
Then

 sup_{θ∈T} | E_θ^n n(σ̂²_n − σ²)² − μ₄ + σ⁴

 − [ σ⁴(q² + 4q) − 2qμ₄

 + 2m₃σ² A^{i₁j₁}(θ) ( A^{i₂j₂}(θ) Π_{(i₁j₁)(i₂)}(θ) Π_{(j₂)}(θ) − Π_{(i₁j₁)}(θ) ) ] n^{−1} |

 = o(n^{−1}).   (13.34)

Proof: From the expansion (13.22), on squaring we obtain

 n(σ̂²_n − σ²)² = P₀²(θ) + 2P₀(θ)P₁(θ) n^{−1/2} + ( 2P₀(θ)P₂(θ) + P₁²(θ) ) n^{−1}

 + ζ̃₃(θ) n^{−3/2},   (13.35)

where

 ζ̃₃ = 2P₀ζ₃ + 2P₁P₂ + 2P₁ζ₃ n^{−1/2} + 2P₂ζ₃ n^{−1} + ζ₃² n^{−3/2}.
The r.v. ζ̃₃ has the following property: a number c₁₅ > 0 can be found such that

 sup_{θ∈T} P_θ^n{ |ζ̃₃(θ)| ≥ c₁₅ (log n)^{5/2} } = O(n^{−(m−2)/2} (log n)^{−m/2}).   (13.36)

The bounds analogous to (13.25)–(13.29) lead to the conclusion that we have,
uniformly in θ ∈ T,

 E_θ^n n(σ̂²_n − σ²)² = E_θ^n P₀² + 2E_θ^n P₀ P₁(θ) n^{−1/2}

 + E_θ^n ( 2P₀(θ)P₂(θ) + P₁²(θ) ) n^{−1} + o(n^{−1}).   (13.37)


Consequently it remains to find the mathematical expectations entering into
(13.37). Simple calculations show that

 E_θ^n P₀²(θ) = μ₄ − σ⁴,

 E_θ^n P₀(θ) P₁(θ) = −q(μ₄ − σ⁴) n^{−1/2},   (13.38)

 E_θ^n P₁²(θ) = σ⁴(q² + 2q) + O(n^{−1}).
We dwell a little longer on the calculation of the expectation

It is not difficult to see that

 E₁ = m₃σ² A^{i₁j₁} A^{i₂j₂} ( 2Π_{(i₁i₂)(j₁)} Π_{(j₂)} + Π_{(i₁j₁)(i₂)} Π_{(j₂)} ) + O(n^{−1}),

 E₂ = m₃σ² ( 2A^{i₁j₁} A^{i₂j₂} Π_{(i₁i₂)(j₁)} Π_{(j₂)} + A^{i₁j₁} Π_{(i₁j₁)} ) + O(n^{−1}).

And so,

 + O(n^{−1}).   (13.39)

Substituting (13.38) and (13.39) in (13.37) we obtain (13.34).


COROLLARY 28.1: If k = 4 and m ≥ 8 then we have, uniformly in θ ∈ T,

 D_θ( n^{1/2}(σ̂²_n − σ²) )

 = E_θ n(σ̂²_n − σ²)² − ( E_θ n^{1/2}(σ̂²_n − σ²) )²

 = μ₄ − σ⁴ + ( (4σ⁴ − 2μ₄)q

 + 2m₃σ² A^{i₁j₁}(θ) ( A^{i₂j₂}(θ) Π_{(i₁j₁)(i₂)}(θ) Π_{(j₂)}(θ) − Π_{(i₁j₁)}(θ) ) ) n^{−1}

 (13.40)

Proof: Equality (13.40) follows from (13.24) and (13.34).



COROLLARY 28.2: Let k = 4 and the ε_j be Gaussian r.v.-s. Then we have, uniformly
in θ ∈ T,

 E_θ^n n(σ̂²_n − σ²)² = σ⁴( 2 + (q² − 2q) n^{−1} ) + o(n^{−1}),   (13.41)

 D_θ( n^{1/2}(σ̂²_n − σ²) ) = 2σ⁴( 1 − q n^{−1} ) + o(n^{−1}).   (13.42)

Let us prove one assertion about the a.e. of the distribution of the estimator σ̂²_n.
We assume that:

X. The r.v. ε_j has a density p(x) of bounded variation on ℝ¹.

Let [a, b], a > 0, b < ∞, be an arbitrary, but fixed, interval,

 T* = [a, b] × T,  θ* = (σ², θ),

the coefficient of excess of the r.v. ε_j.

THEOREM 29: Let us assume satisfied the conditions II–V, VIII of Section 10, X,
and μ_{2(k+1)} < ∞. Also let the l.s.e. θ̂_n have the following property: for any r > 0

 (13.43)

Then

 sup_{θ*∈T*} sup_{z∈ℝ¹}

 = O(n^{−(k−1)/2} log^{k/2} n),   (13.44)

where the R_ν(θ*, z) are polynomials in z of degree 3ν with coefficients that are
uniformly bounded in θ* ∈ T* and n.
The property (13.43) is the property (13.18) for m = k + 1.
Let G be the d.f. of the vector (ε_j² − σ², ε_j) and Ĝ be its c.f.
LEMMA 29.1: If μ₁ < ∞ and condition X is satisfied, then for any δ > 0

 |Ĝ(λ₁, λ₂)|^{5(1+δ)} ≤ c₁₆ ( 1 + |λ₁|^{1+δ} )^{−1} ( 1 + |λ₂|^{1+δ} )^{−1}.   (13.45)

Proof: Let us consider the c.f. of the vector (ε_j, ε_j²),

(13.46)

1/12(A1, A2)

= e -i,xV 2,x2121T d<p (Xl pe i ,x2p 2p (p cos <p _ A1) P (p sin <p _ A1) dp.
o Jo 2A2 2A2
The inner integral in (13.46) is equal to

-1
.-
2tA2
1 0
00
P ( pcos<p - -A1) p (psm<p
2A2
. - -A1)',x
2A2
de' 2P 2

(13.47)

It = 1 o
00 e'">.2P 2 p ( pcos<p - -AI) dp ( psin<p - -Al )
2A2 2A2

+1 00
ei >'2p2 p (p sin <p - 2;J dp (p cos <p - 2;2)

= I₂ + I₃.
Let us estimate the integral I₂; the integral I₃ is estimated in just the same
way:
 I₂ ≤ p₀ ∫₀^∞ | d_ρ p( ρ cos φ − λ₁/(2λ₂) ) | ≤ p₀ ∫_{ℝ¹} |dp(ρ)|,   (13.48)

 p₀ = sup_{x∈ℝ¹} p(x).

From (13.46)–(13.48) it follows that

 |ψ(λ₁, λ₂)|² ≤ π ( p₀² + 2p₀ ∫_{ℝ¹} |dp(ρ)| ) |λ₂|^{−1}.   (13.49)

On the other hand,

 (13.50)

Multiplying the inequality (13.49), raised to the second power, by the inequality
(13.50), for |λ₂| ≥ 1 we obtain

 |Ψ(λ₁, λ₂)|⁵ ≤ π² ( p₀² + 2p₀ ∫_{ℝ¹} |dp(ρ)| )² ( 2μ₁ + ∫_{ℝ¹} |dp(ρ)| ) |λ₁λ₂|^{−1}.   (13.51)

The relation (13.45) now follows from (13.51).


Let us introduce the vector

 V̄(θ*) = ( v₀, V_{k−1}(θ) ),  v₀ = P₀,

where V_{k−1}(θ) is the vector introduced in Section 10,

 dim( V̄(θ*) ) = 1 + p,

The correlation matrix of the vector V(8*) is

Let Qn(8*) be the distribution of the sum of the random vectors

LEMMA 29.2: Under the conditions of Theorem 29, for the distribution Q̄_n(θ*)
we have the a.e.

 sup_{θ*∈T*} sup_{B∈𝔅^{p+1}}

 = O(n^{−(k−1)/2}),

where the polynomials P_r( −Φ; {χ̄_ν(θ*)} ) were introduced in Section 10, and the χ̄_ν(θ*)
are the arithmetic means of the cumulants of order ν of the vectors ξ_{jn}(θ*), j =
1, …, n.
Proof: We show that the conditions of the Theorem being proved guarantee that
the conditions of Theorem A.13 are satisfied.
Let us show that

 lim inf_{n→∞} inf_{θ*∈T*} λ_min( B_n(θ*) ) > 0.   (13.52)

From condition VIII, as demonstrated in Section 10, (10.6) follows. Let us introduce
the (p + 1) × (p + 1)-dimensional matrices

1 0 8
o
R

o

~ E(K;;-1/2(8)wj(8)1

o
o
~ E(K;;-1/2(8)wj(8P 1
Then

det B.(8) = det K.(8) (I'< - ~ - min-, .~, (K;;'(8)Wj(8), W'(8)) ,


L (K;;1(9)Wj(9),Wi(9)
n
n- 2 = n- 2(W(9)K;;1(9)W'(9)e n , en},
i,j=l

where e_n is an n-dimensional vector all the coordinates of which are equal to
unity, and W(θ) is an n × p-dimensional matrix composed of the row vectors w_j′(θ),
j = 1, …, n. It is not difficult to see that the matrix μ₂ n^{−1} W(θ) K_n^{−1}(θ) W′(θ) is
idempotent, and consequently that

 μ₂ n^{−1} ⟨ W(θ) K_n^{−1}(θ) W′(θ) e_n, e_n ⟩ ≤ n.

And so,

 μ₄ − μ₂² − m₃² n^{−2} Σ_{i,j=1}^{n} ⟨ K_n^{−1}(θ) w_i(θ), w_j(θ) ⟩ ≥ μ₂^{−1} ( μ₄μ₂ − m₃² − μ₂³ )

 > 0,

since μ₄μ₂ − m₃² − μ₂³ is the determinant of the correlation matrix of the vector
(1, ε_j, ε_j²). Consequently (13.52) is true.
Let us set u = rh, where r ≥ 6 is an integer and h ≥ p is taken from
condition VIII of Section 10, and let τ = (t⁰; t), t⁰ ∈ ℝ¹, t ∈ ℝᵖ. Then for
0 ≤ m ≤ n − u, n ≥ u + 1, and

 Ψ_m(θ*, B_n^{1/2}(θ*)τ) = Π_{j=m+1}^{m+u} | Ĝ( t⁰, ⟨t, w_j(θ)⟩ ) |,

we obtain
r
= II a s 1/ r ,
s=1

(13.53)

where the w_{j_i(s)}(θ), i = 1, …, p, are the p vectors from condition VIII. Let us make
the substitution of variables

 i = 1, …, p

in the integral (13.53). The Jacobian of this transformation is equal to det W_s,
where W_s is the matrix with columns w_{j_i(s)}(θ), i = 1, …, p. From condition VIII
it follows that

 det( W_s W_s′ ) ≥ (p_Δ)^p > 0

uniformly in m, n, and θ ∈ T.
Therefore

as < (P~)-P/2 kp+1 glo(xO,xi)r dx

= (P~)-P/2 L1 [L1 IO(xO,yO)l dyOr dxo. r


(13.54)

From (13.53), (13.54), Lemma 29.1, and the conditions of Theorem 29 there follows
the finiteness of the integral a and the validity of the relation

 sup_{0≤m≤n−u, n≥u+1} sup_{θ∈T} ∫_{ℝ^{p+1}} Ψ_m(θ*, τ) dτ < ∞.   (13.55)

Let us write, further,

 Ψ_m(θ*, B_n^{1/2}(θ*)τ) ≤ Π_s ( Π_{i=1}^{p} | Ĝ( t⁰, ⟨t, w_{j_i(s)}(θ)⟩ ) | ).   (13.56)

Let s be fixed. Then if |τ| ≥ b > 0,


P
p(tO)2 + ~)(t,Wj;(s)(9)))2 > p(tO)2 +p~ltI2
i=1

> pmin (1,~)ITI2


> pmin(1,~)b2.

Then an index i can be found such that



and consequently, by Lemma 29.1,

This implies

 sup_{0≤m≤n−u, n≥u+1} sup_{|τ|≥b} sup_{θ∈T} Ψ_m(θ*, B_n^{1/2}(θ*)τ) < 1.   (13.57)

It is easy to pass from (13.57) to the required relation

 sup_{0≤m≤n−u, n≥u+1} sup_{|τ|≥b} sup_{θ∈T} Ψ_m(θ*, τ) < 1,   (13.58)

relying on the condition of Theorem 29.


The relations (13.52), (13.55), and (13.58) mean that the conditions (1) and (2)
of Theorem A.13 are satisfied. Condition (3) easily follows from the condition of
Theorem 29.

Let

 Q_n(θ*) = P_θ^n ∘ V̄(θ*)

be the distribution of the vector V̄(θ*). Then by the change of variables

 x → B_n^{−1/2}(θ*) x

in the a.e. of Lemma 29.2, we obtain

 sup_{θ*∈T*} sup_{A∈𝔅^{p+1}}

 = O(n^{−(k−1)/2}),   (13.59)

the polynomials P_r( −φ_{B_n(θ*)}; {χ̄_ν(θ*)} ) being analogous to those introduced in
Section 10.
Let us denote

 Q_k(θ*, x) = φ_{B_n(θ*)}(x) ( 1 + Σ_{r=1}^{k−2} n^{−r/2} P_r(θ*, x) ),  x ∈ ℝ^{p+1}.

Let us define the mapping

in the following way:

 f_n^i(x) = x^i + Σ_{r=1}^{k−2} n^{−r/2} P_r^i(θ*, x),  i = 0, 1, …, p,

where the P_r^i are polynomials in x with coefficients that are uniformly bounded in
θ* ∈ T* and n.
LEMMA 29.3: If the conditions (10.1), V, and VIII are satisfied, then

 (13.60)

where the P̄_r(θ*, y) are polynomials in y with coefficients uniformly bounded in θ* ∈ T*
and n.
In particular
p

P1'PBn - L (PI'PBn)i ' (13.61)


i=O

P 2 'PB n - L (P1PI'PBn + P4'PB..)i


i=O

(13.62)

Proof: The proof of the Lemma is identical to the proof of Lemma 24.2 of Sec-
tion 10.

Proof of Theorem 29: From (13.59) and (13.60) there follows the relation

 sup_{θ*∈T*} sup_{A∈𝔅^{p+1}} | ∫_A ( Q_k(θ*) ∘ f_n^{−1} )(dx) − ∫_A Q_k(θ*, y) dy | = O(n^{−(k−1)/2}).   (13.63)

We further note that from (13.14), (13.15), (13.17), and Lemma 24.3 it follows
that the polynomials P_ν(θ), ν = 1, …, k − 2, of the a.e. (13.16) of Theorem 26 are
polynomials in the coordinates of the vectors V_ν(θ), ν = 1, …, k − 2, i.e., the P_ν(θ)
are polynomials in the coordinates of the vector V̄(θ*). Let us introduce the mapping

f̄_n(x) of the special form:

 f̄_n^0(x) = x⁰ + Σ_{r=1}^{k−2} P_r(θ; x¹, …, x^p) n^{−r/2},

 f̄_n^i(x) = x^i,  i = 1, …, p,   (13.64)

and let us write

 ρ = c₇ n^{−(k−1)/2} log^{k/2} n,  Z(ρ) = (−∞, z − ρ) × ℝ^p.


Then from Theorem 26, Remark 26.1, and (13.63) it follows that

 = ∫_{Z(0)} Q_k(θ*, y) dy + O(ρ)   (13.65)

uniformly in θ* ∈ T* and z ∈ ℝ¹.


Let

 y = (y⁰, y′),  y′ = (y¹, …, y^p),

 Δ_n = K_n(θ) − ( m₃² / (μ₄ − μ₂²) ) ( n^{−1} Σ_j w_j(θ) ) ( n^{−1} Σ_j w_j(θ) )′.

Then by the property of the multi-dimensional Gaussian distribution (10.74)

where (cf. (10.76))

 R_ν*(θ*, y⁰) = ∫_{ℝᵖ} P̄_ν(θ*, y) φ_{Δ_n}( y′ − ( m₃ / (μ₄ − μ₂²) ) n^{−1} Σ_j w_j(θ) y⁰ ) dy′,   (13.66)

ν = 1, …, k − 2, are polynomials of degree 3ν in y⁰ with coefficients that are
uniformly bounded in θ* ∈ T* and n. The Theorem is proved, because

 ν = 1, …, k − 2.


The calculation of the first polynomials of the a.e. (13.44) can be carried out,
for example, by the following method. For the mapping (13.64) it follows from
the formulae (13.61) and (13.62) that

 (13.67)

 (13.68)

It then follows that we should integrate (13.66) taking into account the equalities
(13.67) and (13.68). No new difficulties arise in these calculations in comparison
with the calculations of Section 11. Therefore we present only the final result.
Let H_s(z), s ≥ 1, be the Chebyshev–Hermite polynomials, and let

 ρ₀ = (μ₄ − μ₂²)^{1/2},

 ρ₁ = A^{ij} Π_{(i)} Π_{(j)},

P2 = Aitil Ai2h (2II(ili2)(h)II(h) + II(idd(i2)II(h)


i
- 2A ljl Ailh II (hia)(h) II (js) - Aidl II (ith) ,

Ps = Aidl Ai2h A i siaII(iIi2)(is)II(h)II(h)II(js)

- A idl Aith II (h) II (hjs) II (js)'


Then

 R₁(θ*, z) = ρ₀^{−3} ( μ₆/6 − μ₄σ²/2 + σ⁶/3 − m₃² ρ₁ ) H₃(z) − ρ₀^{−1} q σ² H₁(z),   (13.69)

R2(8*, z) = 712 pC;6(J.t6 - 3J.t4U2 + 20'6 - 6m~pl)2 H6(Z)

{ -4 2(J.t6
+ Po qu
J.t4 u2
-"'6 + -2- -
0'6
"3 2)
+ m sPl
+ pC;4[(6u2m~ - mSmS)pl + mgps]
+ Po-4 (J.t8 J.t6u2
- - -
24 6
- J.t4u4
+- 4
- - -0'8)
8
- -8 I} H4(Z)
+ Po
-2
(q(2u 4
- J.t4) + msu2 P2 +
24)
q 0'
-2- H2(Z). (13.70)

COROLLARY 29.1: If the conditions of Theorem 29 are satisfied and the ε_j are
symmetric r.v.-s, then the first polynomials R₁ and R₂ of the a.e. (13.44) are
independent of the parameter θ.
The polynomial (13.69) was, in fact, obtained in [217].

14 ASYMPTOTIC EXPANSION OF THE DISTRIBUTION OF THE


VARIANCE ESTIMATOR OF OBSERVATIONAL ERROR IN
GAUSSIAN REGRESSION

In this Section a result (Theorem 30) is obtained that coincides with Theorem 29
of the preceding Section for Gaussian (0, σ²) errors of observation ε_j. The point of
separating out such a fact as an independent statement is that the specifically
Gaussian character of the errors also enables the third polynomial R₃(θ*, z) of the
a.e. (13.44), which has a remarkable property considered in Chapter 4, to be found.
THEOREM 30: Let us assume the conditions II–V, VIII of Section 10, that the ε_j are
Gaussian (0, σ²) r.v.-s, and that the l.s.e. θ̂_n has the property (13.43). Then
 sup_{θ*∈T*} sup_{z∈ℝ¹} | P_θ^n{ (n/2)^{1/2} ( σ̂²_n/σ² − 1 ) < z }

 − ∫_{−∞}^{z} φ(x) ( 1 + Σ_{ν=1}^{k−2} R_ν(θ*, x) n^{−ν/2} ) dx |

 = O(n^{−(k−1)/2} log^{k/2} n),   (14.1)

where the R_ν(θ*, z) are polynomials of degree 3ν in the variable z, with

 R₁(θ*, z) = R₁(z) = (√2/3) z³ − ((q + 2)/√2) z,   (14.2)

R₂(θ*, z) = R₂(z)   (14.3)

 = (1/9) z⁶ − ( q/3 + 7/6 ) z⁴ + ( q²/4 + (3/2)q + 2 ) z² − ( q²/4 + q/2 + 1/6 ),

R₃(θ*, z) = √2 ( z⁹/81 − ( q/18 + 5/18 ) z⁷ + ( q²/12 + (3/4)q + 47/30 ) z⁵

 − ( q³/24 + (7/12)q² + 2q + 37/18 ) z³

 + ( (1/8)q³ + (1/2)q² + (7/12)q + 1/6 + (1/8) nY(θ) ) z ),   (14.4)

nY(8) = (12 Aidl (8)Ai2h (8) (2II(hh)(hh) (8) - II(hh)(i2h) (8)) (14.5)

- Aigis(8) (2II(ig)(ili2) (8)II(jg)(hh) (8) - II(ig)(i2h)II(jg)(hh)) .
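In the linear Gaussian special case (an illustrative reduction, not the book's general setting) n σ̂²_n/σ² is exactly χ²_{n−q}-distributed, which gives a quick numerical check of the one-term expansion with R₁ from (14.2); the concrete n, q, z below are arbitrary:

```python
import numpy as np
from scipy.stats import chi2, norm

n, q, z = 50, 3, 0.8
# exact d.f. of (n/2)^{1/2}(sigma2_hat/sigma^2 - 1) when n*sigma2_hat/sigma^2 ~ chi2(n - q)
exact = chi2.cdf(n + z * np.sqrt(2 * n), df=n - q)
# one-term Edgeworth correction, obtained by integrating phi(x)R_1(x):
# F(z) ~ Phi(z) - n^{-1/2} phi(z) * ((sqrt(2)/3) H_2(z) - q/sqrt(2)),  H_2(z) = z^2 - 1
H2 = z ** 2 - 1
edge = norm.cdf(z) - norm.pdf(z) * (np.sqrt(2) / 3 * H2 - q / np.sqrt(2)) / np.sqrt(n)
print(exact, edge)
```

Already at n = 50 the corrected value is much closer to the exact probability than the plain normal approximation Φ(z).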

Proof: The relation (14.1) was obtained in Theorem 29 and coincides with (13.44).
Therefore we shall turn our attention to those details of the proof related to the
method of obtaining in an explicit form the initial terms of the a.e. (14.1).

Let

 τ = (t⁰; t),  t = (t¹, …, t^p).

Then the c.f. of the vector

has the form

 Ĝ_j(τ) = Ĝ( t⁰, ⟨t, w_j(θ)⟩ ).

Using the normality of the r.v. ε_j it is possible to show (see, for example, [142]
p. 381) that

 Ĝ_j(τ) = ( 1 − 2it⁰σ² )^{−1/2} e^{−it⁰σ²} exp{ −(σ²/2) ⟨t, w_j(θ)⟩² / ( 1 − 2it⁰σ² ) }.   (14.6)

Consequently the c.f. of the sum of the random vectors B_n^{−1/2}(θ*) V̄(θ*) is

 Ψ_n(τ) = Π_{j=1}^{n} Ĝ_j( t⁰ / (√(2n) σ²),  n^{−1/2} K_n^{−1/2}(θ) t ).   (14.7)

Let us observe in passing that for the function

 Ψ_m(θ*, τ) = Π_{j=m+1}^{m+u} | E_θ^n exp{ i ⟨ B_n^{−1/2}(θ*) ξ_{jn}(θ*), τ ⟩ } |

it follows from (14.6) that

 ∫_{ℝ^{p+1}} Ψ_m(θ*, τ) dτ

 = ∫_{ℝ¹} dt⁰ / ( 1 + 2(t⁰)² )^{u/4}

 × ∫_{ℝᵖ} exp{ − ( 1 / (2(1 + 2(t⁰)²)) ) Σ_{j=m+1}^{m+u} ⟨ σ K_n^{−1/2}(θ) w_j(θ), t ⟩² } dt

 = (2π)^{p/2} det K_n(θ) / ( σ^{2p} det W_m^{(u)}(θ) ) ∫_{ℝ¹} dt⁰ / ( 1 + 2(t⁰)² )^{u/4−p/2},   (14.8)

where

 W_m^{(u)}(θ) = Σ_{j=m+1}^{m+u} w_j(θ) w_j′(θ).

The integral (14.8) is convergent if u ≥ 2p + 3. From the conditions of the Theorem
it also follows that for u ≥ p there exists a constant c > 0 such that

 Ψ_m(τ) ≤ ( 1 + 2(t⁰)² )^{−u/4} exp{ − c|t|² / (2(1 + 2(t⁰)²)) } < 1   (14.9)

if |τ| ≥ b > 0.
Let us find the first terms of the a.e. (13.51). Taking (14.7) into account we
obtain for the polynomials P_r( iτ; {χ̄_ν} ), introduced in Section 10, the following
expressions:

P₁( iτ; {χ̄_ν} )

 = (√2/3) (it⁰)³ − (√2/2) |t|² (it⁰),   (14.10)

P₂( iτ; {χ̄_ν} )

 = (1/9) (it⁰)⁶ + (1/2) (it⁰)⁴ − (1/3) (it⁰)⁴ |t|² + (1/4) (it⁰)² |t|⁴ − (it⁰)² |t|²,   (14.11)

P₃( iτ; {χ̄_ν} )

 = √2 ( (it⁰)⁹/81 − (it⁰)⁷|t|²/18 + (it⁰)⁵|t|⁴/12 − (it⁰)³|t|⁶/24 + (it⁰)⁷/6

 − (7/12) (it⁰)⁵|t|² + (1/2) (it⁰)³|t|⁴ + (2/5) (it⁰)⁵ − (it⁰)³|t|² ).   (14.12)
Since

 |t|² = − Σ_{j=1}^{p} (it^j)²,

the polynomials P_r( −φ; {χ̄_ν} )(x), r = 1, 2, 3, of the a.e. (13.51) are immediately
defined from (14.10)–(14.12). The passage to the polynomials of the a.e. (13.59)
is obtained by the substitution of variables x → B_n^{1/2}(θ*)x in the polynomials
P_r( −φ; {χ̄_ν} )(x):

 P̄₁(θ*, x) = (x⁰)³/(6σ⁶) − ( (p + 2)/(2σ²) ) x⁰ + ( x⁰/(2σ²) ) ⟨ K_n^{−1}(θ) x′, x′ ⟩,   (14.13)

(14.14)

+ (121a 8 (XO)4 - 2~4 (X O)2) ((K;1(8)X', x') - p)

+~ C;;~2 - 1) (( (K;I(8)X', x') - p)2 - 4(K;1(8)X', x') + 2p) ,



1 09 5 (0)7 47 (0)5 37 ( 0)3


P3(8*, X) = 12960'18 (X) - 1440'14 X + 1200'10 X - 360'6 x

37 0)3 1 0
- 360'6 (X + 60'2 X
1 07 7 05 1 03 1 0)
+ ( 1440'14 (X) - 480'10 (x ) + 20'6 (X) - 120'2 X

X ((K;I(8)x',x') - p)

1 (0)5 1 (0)3 1 0)
+ ( 480'10 X - 60'6 X - 80'2 .x

1 (0)3 1 0)
+ ( 480'6 X - 80'2 X

x (((K;I(8)x',x') _ p)3 -12 ((K;I(O)X',x') _ p)2

+ (-6p+ 24) ((K;I(O)x',x') - p) + 16p). (14.15)

For the transformation (13.64) the first polynomials of the a.e. (13.63) are
given by the formulae (13.67) and (13.68). We must find the polynomial P̄₃. For
this it is necessary to turn to the details of the proof of Lemma 24.2 of Section 10
for the special transformation (13.64). From the reasoning of Lemma 24.2 it follows
that

 P̄₃ = p̄₃ + p̄₂ Q̄₁ + p̄₁ Q̄₂ + Q̄₃,   (14.16)


where the p̄_ν are the polynomials of the a.e. (10.51), and the Q̄_ν the polynomials
of the a.e. (10.52).
From the identity (10.43) it is easy to find

 Q̄_r^i = 0,  i = 1, …, p.   (14.17)

Since, as follows from (14.17),

aJ;;I(y)
ay = det ((8" + ~
'3 ~
n- 1/ 2 (Qi)
r j
)P ) -- 1,
r=1 i,j=1

in the a.e. (10.52)


r = 1,2, ....

From (14.16) it follows in such a_case that

Since now (in the notation of Sections 10 and 13)

 Q_k(θ*, g_n^0(y), …, g_n^p(y)) = Q_k(θ*, g_n^0(y), y¹, …, y^p),

 g_n^0(y) = y⁰ − Σ_{r≥1} n^{−r/2} P_r(θ*, y¹, …, y^p),

then the polynomials p̄_r of the a.e. (10.45) are the polynomials of the a.e.

CP2tr4 (
~
Y0 - L..Jn -r/2p,r (1I*
17 ,y 1 , ... ,yP)) (14.18)
r~l

x ( 1+ Ln
k-2 -r/2- * 0
P r (8,y - Ln -r/2 Pr(8,y
* , ... ,y ),y , ... ,y) )
1 P 1 P .
r=l r~l

From (14.18) we find

Pl CP2tr4 =
= (14.19)

P2CP2tr4 =
= (14.20)

P3CP2tr4 =
= (p 3 - (P l )OP2 - (P2)OPl ) CP2tr 4 + ~ Pf(Pl CP2tr4)OO + PlP2CP~tr4

P 2 CP2tr4 - 6 P l CP2tr4.
+ P2 + Pl - )' 13111
- ( P3 (14.21)

Formulae (14.19) and (14.20) coincide with (13.67) and (13.68).


Let us illustrate an application of the formulae obtained by calculating
the first polynomials of the a.e. (14.1). It is easy to notice (see the formulae
(13.19), (13.20)) that

 P₁(θ, y′) = −A_{ij} y^i y^j,   (14.22)

 P₂(θ, y′)   (14.23)

By the formulae (14.13), (14.19), and (14.22)

 P̄₁ = (y⁰)³/(6σ⁶) − ( (p + 2)/(2σ²) ) y⁰ + ( y⁰/(2σ²) ) ⟨ K_n^{−1} y′, y′ ⟩ − ( y⁰/(2σ⁴) ) A_{ij} y^i y^j.   (14.24)
Since

(14.25)

then from (13.66), in view of m₃ = 0, we obtain, in agreement with (14.2), the
expression

 R₁(θ*, z) = R₁*(θ*, √2 σ² z) = (√2/3) z³ − ((q + 2)/√2) z.

Let us integrate the polynomial P̄₂. Since

 (14.26)

then from (14.14), (14.25) and (14.26) we obtain

 ∫_{ℝᵖ} φ_{K_n}(y′) P̄₂(θ*, y) dy′ = (1/72) (y⁰)⁶/σ¹² − (7/24) (y⁰)⁴/σ⁸ + (y⁰)²/σ⁴ − 1/6.   (14.27)

Let us find the integral

 − ∫_{ℝᵖ} P̄₁ ( P₁ φ_{2σ⁴} ) φ_{K_n}(y′) dy′.

We observe that

 i₁, j₁ = 1, …, q,

and the calculation of the integral (14.28) amounts to the calculation of the sum

 S = ( K_n^{−1} )_{i₁j₁} ( K_n^{−1} )_{i₂j₂} ∫_{ℝᵖ} y^{i₁} y^{i₂} y^{j₁} y^{j₂} φ_{K_n}(y′) dy′,

where the summation over the indices i₁, j₁ is carried out from 1 to q, and over
the indices i₂, j₂ from 1 to p. By the formula for the mixed moments of the
fourth order of the centralised Gaussian vector (see, for example, [89] Chapter 1)
we obtain

 S = σ²( pq + 2q ).   (14.28)

Consequently the integral (14.28) is equal to

 q ( − (y⁰)⁴/(12σ⁸) + (y⁰)²/(2σ⁴) ) φ_{2σ⁴}(y⁰).

Let us further note that from (14.23) there follows the equality

 ∫_{ℝᵖ} P₂ φ_{K_n}(y′) dy′ = 0.
By analogy with (14.28) we obtain

 (1/2) ∫_{ℝᵖ} P₁² φ_{K_n}(y′) dy′ · φ_{2σ⁴}(y⁰) = ( q²/2 + q ) ( (y⁰)²/(4σ⁴) − 1/2 ) φ_{2σ⁴}(y⁰).

And so

 R₂*(θ*, y⁰) = (1/72) (y⁰)⁶/σ¹² − ( q/12 + 7/24 ) (y⁰)⁴/σ⁸

 + ( q²/8 + (3/4)q + 1 ) (y⁰)²/σ⁴ − ( q²/4 + q/2 + 1/6 ),   (14.29)

whence we obtain (14.3) by the formula

 R₂(θ*, z) = R₂*(θ*, √2 σ² z).
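The fourth-order mixed-moment formula for a centred Gaussian vector used above, E z_a z_b z_c z_d = K_{ab}K_{cd} + K_{ac}K_{bd} + K_{ad}K_{bc} (Isserlis' theorem), is easy to verify by simulation; the covariance matrix and index choice below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)
K = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 2.0]])          # an arbitrary positive definite covariance

a, b, c, d = 0, 1, 2, 1
# sum over the three pairings of {a, b, c, d}
pairing = K[a, b] * K[c, d] + K[a, c] * K[b, d] + K[a, d] * K[b, c]

z = rng.multivariate_normal(np.zeros(3), K, size=2_000_000)
mc = np.mean(z[:, a] * z[:, b] * z[:, c] * z[:, d])
print(pairing, mc)   # the Monte Carlo mean approaches the pairing value 3.5
```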


The calculation of the polynomial R₃(θ*, z) is analogous, but the complexity
of its calculation is commensurate with the complexity of the calculation in
Section 11. Therefore we do not carry out this unwieldy calculation.
There exists another, but alas no less unwieldy, method of calculating the
first polynomials of the a.e. (14.1), which consists in the use of the so-called
δ-method (see, for example, [32]) of obtaining a.e.-s. The δ-method is suitable
for obtaining the a.e. of the d.f. of a scalar statistic (our statistic is
(n/2)^{1/2}((σ̂²_n/σ²) − 1)). Its essence consists in the calculation of the a.e. of the
cumulants of the asymptotically normal statistic and the subsequent reconstruction
from them of the Edgeworth expansion of its d.f.
The calculation by the δ-method of the initial terms of the a.e. of the d.f. of
the normed estimator σ̂²_n is carried out in the following way. Let us set (see
Theorem 26)

 M = P₀ + P₁ n^{−1/2} + P₂ n^{−1} + P₃ n^{−3/2}.

Let us find the cumulants k_j, j = 1, …, 5, of the quantity M/(√2 σ²) without
specifying terms of the order o(n^{−3/2}). We shall thus look for a representation

 k_j = k_{j0} + k_{j1} n^{−1/2} + k_{j2} n^{−1} + k_{j3} n^{−3/2}.   (14.30)

For this one should find the mixed moments E_θ^n P₀^{j₀} P₁^{j₁} P₂^{j₂} P₃^{j₃} which enter into
the expressions for the moments of M up to the fifth order inclusive (terms o(n^{−3/2})
not being taken into account), then pass by the standard formulae (see the
Appendix, Subsidiary Facts) from the moments to the cumulants. Following this
plan we arrive at the matrix of coefficients k_{jl}, j = 1, 2, 3, 4, 5, l = 0, 1, 2, 3, from
the representation (14.30):

             ⎛ 0     −q/√2    0     (√2/8) nY(θ) ⎞
             ⎜ 1      0      −q      0           ⎟
 (k_{jl}) =  ⎜ 0     2√2      0     −2√2 q       ⎟   (14.31)
             ⎜ 0      0      12      0           ⎟
             ⎝ 0      0       0     48√2         ⎠

The large number of zeros in the matrix (k_{jl}) is a consequence of the normality of
the errors of observation ε_j.
Having the matrix (14.31) available it is not difficult to obtain the a.e. of the
c.f. of the r.v. (n/2)^{1/2}((σ̂²_n/σ²) − 1), from which in turn, using the inverse Fourier
transform, we find R₁, R₂, and R₃.
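The passage from moments to cumulants mentioned here uses the standard identities κ₁ = m₁, κ₂ = m₂ − m₁², κ₃ = m₃ − 3m₁m₂ + 2m₁³, and so on; a small sketch of this step (a generic routine, not the book's formulae):

```python
from math import comb

def cumulants_from_moments(moments):
    """Cumulants k_1..k_r from raw moments m_1..m_r via the recursion
    k_n = m_n - sum_{j=1}^{n-1} C(n-1, j-1) k_j m_{n-j}."""
    m = [1.0] + list(moments)          # m[0] = 1 by convention
    k = [0.0] * len(m)
    for n in range(1, len(m)):
        k[n] = m[n] - sum(comb(n - 1, j - 1) * k[j] * m[n - j]
                          for j in range(1, n))
    return k[1:]

# sanity check on N(mu, s2): the cumulants are (mu, s2, 0, 0, 0)
mu, s2 = 0.7, 2.0
m = [mu,
     mu**2 + s2,
     mu**3 + 3*mu*s2,
     mu**4 + 6*mu**2*s2 + 3*s2**2,
     mu**5 + 10*mu**3*s2 + 15*mu*s2**2]
print(cumulants_from_moments(m))   # ~ [0.7, 2.0, 0.0, 0.0, 0.0]
```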
To conclude the Section we shall indicate the representation of the polynomials
R₁, R₂ and R₃ in terms of the Chebyshev–Hermite polynomials:

 R₁(θ*, z) = (√2/3) H₃(z) − (q/√2) H₁(z),   (14.32)

 R₂(θ*, z) = (1/9) H₆(z) + ( 1/2 − q/3 ) H₄(z) + (1/4)( q² − 2q ) H₂(z),   (14.33)

 R₃(θ*, z) = (√2/8) nY(θ) H₁(z) + (√2/24)( −q³ + 6q² − 8q ) H₃(z)

 + (√2/12)( q² − 5q + 24/5 ) H₅(z) + √2( −q/18 + 1/6 ) H₇(z) + (√2/81) H₉(z).   (14.34)
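The Chebyshev–Hermite polynomials H_s appearing here are the probabilists' Hermite polynomials, with H₀ = 1, H₁(z) = z and the recurrence H_{s+1}(z) = z H_s(z) − s H_{s−1}(z); a short sketch:

```python
def hermite(s, z):
    """Probabilists' (Chebyshev-)Hermite polynomial H_s(z) computed by the
    three-term recurrence H_{s+1} = z*H_s - s*H_{s-1}."""
    h_prev, h = 1.0, z
    if s == 0:
        return h_prev
    for k in range(1, s):
        h_prev, h = h, z * h - k * h_prev
    return h

print(hermite(3, 2.0))   # H_3(z) = z^3 - 3z, so H_3(2) = 2
print(hermite(6, 1.0))   # H_6(z) = z^6 - 15z^4 + 45z^2 - 15, so H_6(1) = 16
```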

15 JACK KNIFE AND CROSS-VALIDATION METHODS


OF ESTIMATION OF THE VARIANCE OF ERRORS
OF OBSERVATION

The 'jack knife' and 'cross-validation' methods have long been widely used
in applied statistics and have been thoroughly investigated theoretically (see, for
example, [82, 215]). Both methods belong to the methods of statistical estimation
linked to resampling. The basic idea of these methods consists in using a special
method of treating the experimental data to obtain an estimator of an unknown
parameter, namely: some function (most often the arithmetic mean) is calculated
of the estimators obtained from reduced samples. As a result the probabilistic
characteristics of the estimator are altered in comparison with the standard
estimators: for example, the bias of the estimator decreases.
In this Section of the book the 'jack knife' and 'cross-validation' methods are
used for the estimation of the variance of the errors of observation in the non-linear
regression model: first the stochastic a.e.-s of these estimators are obtained, then
the initial terms of the a.e. are found and their first two moments.
If, for the estimation of the variance of the errors of observation, the statistic

$\hat\sigma_n^2 = n^{-1}L(\hat\theta_n)$

is usually used, then in the 'jack knife' method it is replaced by the statistic

(15.1)

and in the 'cross-validation' method by the statistic

(15.2)

where $\hat\theta_{(-j)}$ are the l.s.e.-s of the parameter obtained from the sample from which the observation $X_j$ has been removed, and $\hat\sigma^2_{(-j)}$ is the estimator analogous to $\hat\sigma_n^2$ for such a reduced sample. We shall further mark by the index $(-t)$ the quantities relating to the truncated sample.
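As a numerical illustration of the two constructions, the following sketch computes both estimators by brute-force leave-one-out refitting for a simple model. The exponential regression function, the data, and the combination rules used for $J_n$ and $C_n$ (the standard jackknife combination and the mean leave-one-out squared prediction error) are assumptions of the sketch; the book's exact formulas (15.1), (15.2) are not reproduced above.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical regression function; any smooth g(j, theta) works the same way.
def g(x, th):
    return th[0] * np.exp(-th[1] * x)

def lse(x, y, start):
    # Least squares estimator of theta on the sample (x, y).
    return least_squares(lambda th: y - g(x, th), start).x

rng = np.random.default_rng(0)
n = 40
x = np.linspace(0.0, 5.0, n)
y = g(x, np.array([2.0, 0.7])) + rng.normal(0.0, 0.2, n)   # true sigma^2 = 0.04

th_hat = lse(x, y, np.array([1.0, 1.0]))
s2 = np.mean((y - g(x, th_hat)) ** 2)          # the usual estimator sigma_n^2

s2_loo = np.empty(n)   # variance estimates on the reduced samples
cv_err = np.empty(n)   # squared prediction errors at the deleted points
for t in range(n):
    keep = np.arange(n) != t
    th_t = lse(x[keep], y[keep], th_hat)       # l.s.e. with observation t removed
    s2_loo[t] = np.mean((y[keep] - g(x[keep], th_t)) ** 2)
    cv_err[t] = (y[t] - g(x[t], th_t)) ** 2

J_n = n * s2 - (n - 1) * np.mean(s2_loo)       # jackknife combination (assumed form)
C_n = np.mean(cv_err)                          # cross-validation estimate (assumed form)
```

All three quantities estimate $\sigma^2$; the content of the theorems below is that their biases differ at the $n^{-1/2}$ and $n^{-1}$ levels.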
Let us assume that for each $j$ the functions $g(j,\theta)$ possess in $\Theta^c$ all partial derivatives with respect to the variables $\theta = (\theta^1,\dots,\theta^q)$ up to order $k+4$ inclusive, $k \ge 2$.
For the proof of theorems on the stochastic a.e.-s of the functionals $J_n$ and $C_n$ we require conditions which are modifications of the conditions for obtaining the stochastic a.e.-s of the l.s.e. $\hat\theta_n$ and of the variance estimator $\hat\sigma_n^2$ of the errors of observation $\varepsilon_j$.

II(l). For an arbitrary $R > 0$ there exist constants $c_i = c_i(R) < \infty$, $i = 1,2$, such that
(1) $\sup_{\theta\in T}\ \sup_{u\in V^c(R)\cap U^c(\theta)} n^{-1}d^2(\alpha;\theta+u) \le c_1$, $|\alpha| = 1,\dots,l$;
(2) $\sup_{\theta\in T}\ \sup_{u_1,u_2\in V^c(R)\cap U^c(\theta)} n^{-1}\Delta^{(\alpha)}(u_1,u_2)\,|u_1-u_2|^{-2} \le c_2$, $|\alpha| = 1$.

III(l). (1)

for all $|\alpha_r| = 1,\dots,s$ for which $g^{(\alpha_r)}(j,\theta) \not\equiv 0$, and $s = 1,\dots,l$;
(2)

for all $|\alpha| = 1,\dots,l+2$ for which $g^{(\alpha)}(j,\theta) \not\equiv 0$.

IV(l, m). (1) $\varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ n^{-1}\sum_{j=1}^{n} |g^{(\alpha)}(j,\theta)|^{m(l-1)} < \infty$, $|\alpha| = 1,\dots,l-2$;
(2) $\varlimsup_{n\to\infty}\ \sup_{\theta\in T}\ n^{-1}\sum_{j=1}^{n} |g^{(\alpha)}(j,\theta)|^{m} < \infty$, $|\alpha| = l-1,\,l$.
V. $\liminf_{n\to\infty}\ \inf_{\theta\in T}\ \min_{1\le j\le n}\ \lambda_{\min}\bigl(J(\theta) - n^{-1}J(j,\theta)\bigr) > \lambda_0 > 0$,
where
$J(j,\theta) = \bigl(g_i(j,\theta)\,g_r(j,\theta)\bigr)_{i,r=1}^{q}.$
For certain sets of indices

$k_s = \bigl(i_1^{(s)},\dots,i_{r_s}^{(s)}\bigr), \qquad s = 1,\dots,l, \quad l = 1,2,\dots,$

we shall denote

$_r b_{(k_1)\dots(k_l)}(\theta) = n^{-1/2}\sum_{j=1}^{n}\varepsilon_j^{\,r}\prod_{s=1}^{l} g^{(k_s)}(j,\theta),$

$_r \bar b_{(k_1)\dots(k_l)}(\theta) = n^{-1/2}\sum_{j=1}^{n}\bigl(\varepsilon_j^{\,r} - m_r\bigr)\prod_{s=1}^{l} g^{(k_s)}(j,\theta), \qquad {}_0 b = 1.$

Let us first formulate a result about the functional $J_n$.

THEOREM 31: For some integer $m \ge \max(6, k+2)$ let there be satisfied the conditions $\mathrm{I}_{m([(k-1)/2]+1)}$ ($\mu_{m([(k-1)/2]+1)} < \infty$), II(k+2), III(k), IV(k+2, m), V, and

$\sup_{\theta\in T} P_\theta^n\bigl\{n^{1/2}|\tilde\theta - \theta| \ge H\bigr\} \le c_3 H^{-m}$  (15.3)

with one and the same constant $c_3 < \infty$ for $\tilde\theta = \hat\theta_n$ and $\tilde\theta = \hat\theta_{(-t)}$, $t = 1,\dots,n$. Then

$= O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr),$  (15.4)

where
$G_0^J = n^{-1/2}\sum_{j=1}^{n}\bigl(\varepsilon_j^2 - \sigma^2\bigr),$
and the $G_v^J$, $v = 1,\dots,k-2$, are polynomials of degree $v+1$ with respect to the quantities $_l b_{(k_1)\dots(k_l)}$, $l = 1,\dots,[v/2]+1$, with coefficients uniformly bounded in $\theta\in T$ and $n$.
In particular,

$G_1^J$  (15.5)

$G_2^J = A^{i_1j_1}A^{i_2j_2}A^{i_3j_3}\,\Pi_{(i_1i_2)(i_3)}\,b_{j_1}b_{j_2}b_{j_3}.$

We should stress that, in contrast to the s.a.e. of the estimator $\hat\sigma_n^2$ (see Section 13), the polynomials $G_v$ are now not homogeneous with respect to the sums of r.v.-s $_l b$. Conditions under which (15.3) holds are indicated in Section 2 of Chapter 1.
Proof: The regularity conditions of the Theorem being proved ensure that Theorem 26 of Section 13 holds not only for the original but also for the truncated samples. Therefore the application of this Theorem to the 'jack knife' functional $J_n$ results in the s.a.e.

$= G_0^J + \sum_{v=1}^{k}\bigl\{A_v(\theta)n^{-v/2+1} - 2B_v(\theta)n^{-v/2+1} - C_v(\theta)\,n^{1/2}(n-1)^{-(v-1)/2}\cdots\bigr\}$
$\quad + n^{-(k-1)/2}R_{k+1}(\theta) - n^{-1}\sum_{t=1}^{n} n^{1/2}(n-1)^{-k/2}R_{k+1,(-t)}(\theta),$  (15.7)

where
(1) $G_0^J = P_0$, $A_v - 2B_v = P_v$ are the polynomials (13.17) of the expansion (13.16) of the functional $\hat\sigma_n^2$;
(2) $C_v = n^{-1}\sum_{t=1}^{n} A_{v,(-t)}$, $D_v = n^{-1}\sum_{t=1}^{n} B_{v,(-t)}$;
(3) the r.v.-s $R_{k+1}$ and $R_{k+1,(-t)}$ have the properties

$\sup_{\theta\in T} P_\theta^n\bigl\{|R_{k+1}(\theta)| \ge c_5(\log n)^{(k+2)/2}\bigr\} = O\bigl(n^{-(m-2)/2}\log^{-m/2} n\bigr),$  (15.8)

$\sup_{\theta\in T} P_\theta^n\bigl\{|R_{k+1,(-t)}(\theta)| \ge c_6(\log n)^{(k+2)/2}\bigr\} \le c_7\,n^{-(m-2)/2}\log^{-m/2} n,$  (15.9)

with the constants $c_6$ and $c_7$ not depending upon $t = 1,\dots,n$.
The next statement gives important information about the structure of the polynomials of the expansion (13.16).
LEMMA 31.1: The polynomials $P_v(\theta)$, $v = 1,\dots,k-2$, are linear combinations of quantities of the form

$\Bigl(\prod A^{i_rj_r}\Bigr)\Bigl(\prod_{r}\Pi_{(k_r')(k_r'')}\Bigr)\Bigl(\prod_{r} b_{(k_r)}\Bigr), \qquad 0 \le \mu \le v-1,$  (15.10)

where $(k_r')$, $(k_r'')$, $(k_r)$ are sets of indices from $\{i_1,\dots,i_{v+\mu}\}\cup\{j_1,\dots,j_{v+\mu}\}$, and

$\bigcup_{r=1}^{\mu}\bigl((k_r')\cup(k_r'')\bigr)\cup\bigcup_{r=1}^{v+1}(k_r) = \{i_1,\dots,i_{v+\mu}\}\cup\{j_1,\dots,j_{v+\mu}\}.$

A similar structure is possessed by the polynomials of the expansions obtained from the truncated samples, with the quantities $A$, $\Pi$ and $b$ replaced by their truncated versions.
Proof: It is possible to show by induction on $v$ that the polynomials $h_v$, $v = 0,\dots,k-2$, of the s.a.e. of the l.s.e. (12.2) are linear combinations of quantities of the form

(15.11)

with
$\bigcup_{r=1}^{\mu}\bigl((k_r')\cup(k_r'')\bigr)\cup\bigcup_{r=1}^{v+1}(k_r) = \{i_1,\dots,i_{v+\mu+1}\}\cup\{j_1,\dots,j_{v+\mu+1}\}.$
In doing so, the recurrence relations (7.43) are used. Then (15.10) is also obtained by induction, using the relations (13.14), (13.15) and (15.11).
To obtain the general formula for the polynomial $G_v^J$ of the stochastic a.e. (15.4), the quantities containing the index $(-t)$ should first of all be eliminated. Let us carry out the following substitutions:

$\Pi^{(-t)}_{(k_1)(k_2)} = \dfrac{n}{n-1}\Bigl[\Pi_{(k_1)(k_2)} - n^{-1}g^{(k_1)}(t,\theta)\,g^{(k_2)}(t,\theta)\Bigr],$  (15.12)

$b^{(-t)}_{(\alpha)} = \Bigl(\dfrac{n}{n-1}\Bigr)^{1/2}\Bigl[b_{(\alpha)} - n^{-1/2}\varepsilon_t\,g^{(\alpha)}(t,\theta)\Bigr],$  (15.13)

$A_{(-t)} = \bigl(1 - n^{-1}\bigr)\bigl[J - n^{-1}J(t,\theta)\bigr]^{-1}.$  (15.14)

Representing the last inverse in the form of a series, we obtain

$A_{(-t)} = \cdots + n^{-2}A^{i_1j_1}A^{i_2j_2}A^{i_3j_3}\,g_{j_1}(t,\theta)\,g_{i_2}(t,\theta)\,g_{j_2}(t,\theta)\,g_{i_3}(t,\theta) + \cdots.$  (15.15)
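The substitution (15.12) is just the exact identity for removing one term from an average; it can be checked numerically, with arbitrary vectors standing in for the derivative values $g^{(k_1)}(\cdot,\theta)$, $g^{(k_2)}(\cdot,\theta)$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, t = 12, 4
g1 = rng.normal(size=n)   # stand-ins for g^(k1)(j, theta), j = 1..n
g2 = rng.normal(size=n)   # stand-ins for g^(k2)(j, theta)

Pi = np.mean(g1 * g2)                                   # Pi_(k1)(k2)
Pi_loo = np.mean(np.delete(g1, t) * np.delete(g2, t))   # the average over j != t

# (15.12): Pi^(-t) = n/(n-1) * (Pi - g1[t]*g2[t]/n)
assert np.isclose(Pi_loo, n / (n - 1) * (Pi - g1[t] * g2[t] / n))
```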
As follows from (15.10), on substituting the expressions (15.12)-(15.15) (retaining in the series (15.15) only a finite number of terms) into the terms of formula (15.7) containing $C_v$ and $D_v$, we obtain certain polynomials in $n^{-1/2}$. The coefficients of these polynomials at the powers $n^{-v/2}$ are polynomials in the sums of r.v.-s $_l b_{(k_1)\dots(k_l)}$, $l = 1,\dots,[v/2]+1$, $v = 1,\dots,2k$, and each monomial of the latter polynomials contains no more than one factor $_l b$ with $l \ge 2$. Centring the quantities $_l b$, $l \ge 2$, at the power $n^{-v/2}$, i.e., converting to the sums $_l\bar b$, leads to the appearance of additional terms in the coefficients of $n^{-(v+1)/2}$.
Having performed the centring, let us gather together all the coefficients of the powers $n^{-v/2}$, $v = 1,\dots,k$. These are just the polynomials $G_v^J$, which have the form

(15.16)

where $\bar G_v$ is a polynomial of degree $v-1$ in the variables $_l\bar b$, $l = 1,\dots,[v/2]+1$.

Let us clarify the way in which the polynomials $P_v$ make their appearance in (15.16). Upon substitution of the expressions (15.12)-(15.15) into $C_v$ and $D_v$ of formula (15.7), from the terms containing $\Pi_{(k_1)(k_2)}$, $b_{(\alpha)}$, and $A^{i_1i_2}$, after averaging over $t$ we obtain precisely the quantity $P_v n^{-(v-2)/2}$, which is cancelled by $A_v n^{-v/2+1} - 2B_v n^{-v/2+1}$. In its turn, from the terms containing exactly one of the quantities

$-\,n^{-1}g^{(k_1)}(t,\theta)g^{(k_2)}(t,\theta), \qquad -\,n^{-1}\varepsilon_t g^{(\alpha)}(t,\theta), \qquad n^{-1}A^{i_1j_1}g_{j_1}(t,\theta)g_{j_2}(t,\theta)A^{i_2j_2},$

after averaging with respect to $t$ we obtain $-P_v$, whence, in view of the signs of $C_v$ and $D_v$, we now obtain $P_v$. The quantity $P_v$ emerges here thanks to a property of the polynomial $P_v$: each term entering $P_v$ has one more factor of $\Pi$ and $b$ than factors of $A$.
We can now rewrite the expression (15.7) in the form

$n^{1/2}(J_n - \sigma^2) = \sum_{v=0}^{k} G_v^J(\theta)n^{-v/2} + G_{k+1}(\theta)n^{-(k+1)/2} + n^{-(k-1)/2}R_{k+1}(\theta) - n^{-1}\sum_{t=1}^{n} n^{1/2}(n-1)^{-k/2}R_{k+1,(-t)}(\theta),$  (15.17)

where $G_{k+1}(\theta)$ is a polynomial in the variables $A$, $\Pi$ and $b$; moreover, the maximal degree of this polynomial in $_l b$ and the maximal value of $l$ are both equal to $k+1$.
Let us estimate the remainder terms of the s.a.e. (15.17). The remainder term $R_{k+1}(\theta)$ is estimated by the formula (15.8). On the other hand, by using (15.9) we find

$\le \sum_{t=1}^{n}\ \sup_{\theta\in T} P_\theta^n\bigl\{|R_{k+1,(-t)}(\theta)| \ge c_6(\log n)^{(k+2)/2}\bigr\} = O\bigl(n^{-(m-4)/2}(\log n)^{-m/2}\bigr).$  (15.18)
Let us further observe that

$n^{-(k+1)/2}G_{k+1} = n^{-(k-1)/2}\bigl(n^{-1}G^1 + n^{-3/2}G^2 + \cdots\bigr),$  (15.19)

and each term $G^i$ of the finite sum (15.19) has the following property: there exists a constant $c_8$ such that

(15.20)

$i = 1,2,\dots$. In fact, $G^i$ is a linear combination of products with

$r = 0,\dots,k, \qquad l = 0,\dots,k+1.$

In the worst case

k
~ L P; { Ib( kf) I ~ C~h(k+l) logl/2 n}
i=l

+P;{lk+lb(kn ... (k~+l)1 ~ c~h(k+l)nIOgn} .

In this way (15.20) is a corollary of the conditions of the Theorem being proved
and Theorem A.5. We obtain similar bounds also for the polynomials Gf-l and
n- 1 / 2Gf
Close to the Theorem just proved is the following.
THEOREM 32: For some integer $m \ge k+4$ let the conditions $\mathrm{I}_{(k+3)([m/3]+1)}$ ($\mu_{(k+3)([m/3]+1)} < \infty$), II(k+4), III(k+2), IV(k+4, m), V, and (15.3) be satisfied. Then the variance estimator $C_n$ of the errors of observation obtained by the 'cross-validation' method admits the s.a.e.

$= O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr),$  (15.21)

where $G_0^C = G_0 = P_0$ and the $G_v^C$, $v = 1,\dots,k-2$, are polynomials of degree $v+1$ with respect to the quantities $_l b_{(k_1)\dots(k_l)}$, $l = 1,\dots,[v/2]+1$, with coefficients that are uniformly bounded in $\theta\in T$ and $n$. In particular,

$G_1^C = -A^{ij}b_ib_j + 2\sigma^2 q,$  (15.22)

$G_2^C = A^{i_1j_1}A^{i_2j_2}A^{i_3j_3}\,\Pi_{(i_1i_2)(i_3)}\,b_{j_1}b_{j_2}b_{j_3}\;\cdots$  (15.23)

Proof: Let us outline the proof of the Theorem as formulated. From the technical point of view it is expedient to represent the functional (15.2) in the form

$C_n = nQ_n - \dfrac{n-1}{n}\sum_{t=1}^{n}\hat\sigma^2_{(-t)},$  (15.24)

where

$Q_n = n^{-1}\sum_{t=1}^{n}\,n^{-1}\sum_{j=1}^{n}\bigl[X_j - g(j,\hat\theta_{(-t)})\bigr]^2,$  (15.25)

i.e., the statistic (15.25) plays for $C_n$ the same role as the statistic $\hat\sigma_n^2$ plays for $J_n$. This gives grounds for using $Q_n$ as an estimator of the variance $\sigma^2$ of the errors of observation. In fact there holds:
LEMMA 32.1: Under the conditions of Theorem 31,

$\sup_{\theta\in T} P_\theta^n\Bigl\{\Bigl|n^{1/2}(Q_n - \sigma^2) - \sum_{v=0}^{k} n^{-v/2}G_v^Q(\theta)\Bigr| \ge c_{12}\,n^{-(k-1)/2}\log^{(k+2)/2} n\Bigr\} = O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr),$  (15.26)

where the polynomials $G_v^Q$ have the properties of the polynomials $G_v^J$ and $G_v^C$, and furthermore

$G_0^Q = P_0.$  (15.27)

Proof: The Lemma is proved analogously to Theorem 31.


On the basis of (15.1), (15.24) and (15.25) let us recast the functional (15.2) in the form

(15.28)

The s.a.e.-s of all the statistics on the right hand side of (15.28) have already been obtained. Taking advantage of these expansions, let us equate in (15.28) the polynomials at the same powers of $n^{-v/2}$:

$v = 0,\dots,k-2.$  (15.29)

Analogously, for the remainder terms of the s.a.e.-s (13.16), (15.4), (15.21) and (15.24) the relation

is satisfied; moreover, for some constants $c_{13}$, $c_{14}$, and $c_{15}$,

$\sup_{\theta\in T} P_\theta^n\bigl\{|R_{k+1}| \ge c_{13}\log^{(k+2)/2} n\bigr\} = O\bigl(n^{-(m-2)/2}\log^{-m/2} n\bigr),$

$\sup_{\theta\in T} P_\theta^n\bigl\{|R_{k+1}^C| \ge c_{14}\log^{(k+4)/2} n\bigr\} = O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr),$

$\sup_{\theta\in T} P_\theta^n\bigl\{|R_{k+1}^Q| \ge c_{15}\log^{(k+2)/2} n\bigr\} = O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr),$

i.e., (15.21) holds.


Let us observe that from the equality (15.29) it follows that for $v = 1$ the difference between the s.a.e.-s of the functionals $Q_n$ and $\hat\sigma_n^2$ becomes apparent starting with the third term:
$G_3^Q = P_3 + \sigma^2 q\,n^{-1}.$
Let us consider the question of the a.e.-s of the moments of the first two orders of the r.v.-s $n^{1/2}(J_n - \sigma^2)$ and $n^{1/2}(C_n - \sigma^2)$, starting from (15.4) and (15.21) for $k = 4$:

$n^{1/2}(J_n - \sigma^2) = \sum_{v=0}^{2} n^{-v/2}G_v^J(\theta) + n^{-3/2}R^J(\theta),$  (15.30)

$n^{1/2}(C_n - \sigma^2) = \sum_{v=0}^{2} n^{-v/2}G_v^C(\theta) + n^{-3/2}R^C(\theta),$  (15.31)

$\sup_{\theta\in T} P_\theta^n\bigl\{|R^J(\theta)| \ge c_{16}\log^3 n\bigr\} = O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr),$  (15.32)

$\sup_{\theta\in T} P_\theta^n\bigl\{|R^C(\theta)| \ge c_{17}\log^4 n\bigr\} = O\bigl(n^{-(m-4)/2}\log^{-m/2} n\bigr).$  (15.33)

We shall write
$B(J_n) = E_\theta^n\,n^{1/2}(J_n - \sigma^2), \qquad S(J_n) = E_\theta^n\,n(J_n - \sigma^2)^2, \qquad D(J_n) = D_\theta^n\,n^{1/2}(J_n - \sigma^2).$
Analogously, $B(C_n)$, $S(C_n)$, $D(C_n)$ denote the bias, the mean square deviation, and the variance of the normed estimator $n^{1/2}(C_n - \sigma^2)$.
THEOREM 33: Let the conditions of Theorem 32 be satisfied for $k = 4$. Then we have, uniformly in $\theta\in T$:

(1) $B(J_n) = \begin{cases} O(n^{-1/2}\log^{-3} n), & m = 8,\\ O(n^{-1}\log^{-7/2} n), & m = 9,\\ O(n^{-3/2}\log^{3} n), & m \ge 10; \end{cases}$  (15.34)

(2) $S(J_n),\ D(J_n) = \sigma^4(\beta_2 + 2) + 2q\sigma^4 n^{-1} + o(n^{-1})$, $m \ge 9$;  (15.35)

(3) $B(C_n) = q\sigma^2 n^{-1/2} + \begin{cases} O(n^{-1/2}\log^{-3} n), & m = 8,\\ O(n^{-1}\log^{-7/2} n), & m = 9,\\ O(n^{-3/2}\log^{4} n), & m \ge 10; \end{cases}$  (15.36)

(4) $S(C_n) = \sigma^4(\beta_2 + 2) + n^{-1}\sigma^4\bigl(q^2 + 2q(\beta_2 + 3) - 2\beta_1\sigma Z(\theta)\bigr) + o(n^{-1})$, $m \ge 9$;  (15.37)

(5) $D(C_n) = \sigma^4(\beta_2 + 2) + n^{-1}\sigma^4\bigl(2q(\beta_2 + 3) - 2\beta_1\sigma Z(\theta)\bigr) + o(n^{-1})$, $m \ge 9$,  (15.38)

where
$\beta_1 = \dfrac{m_3}{\sigma^3} \qquad\text{and}\qquad \beta_2 = \dfrac{\mu_4}{\sigma^4} - 3$
are the coefficients of skewness and excess of the distribution of the r.v. $\varepsilon_j$, and

(15.39)

(cf. (19.94)).
Proof: The proof is close to the proofs of Theorems 27 and 28; therefore let us direct our attention only to certain details. Let us first consider the estimator $J_n$. Instead of the event $\Omega_n(\theta)$ of Theorem 27 let there be introduced the event

Since $7([m/3]+1) > 2m$, instead of bounds of the order
$O\bigl(n^{-([m/2]-2)/2}(\log n)^{-([m/2]-1)/2}\bigr)$
we obtain bounds whose powers of $n$ and $\log n$ contain the exponent $m$ instead of $[m/2]$. Instead of the inequality (13.28) the inequality

$n^{1/2}|J_n - \sigma^2| \le 4n|P_0| + c_{18}n^{3/2}|\hat\theta_n - \theta|^2 + c_{19}n^{-1}\sum_{t=1}^{n}|\hat\theta_{(-t)} - \theta|^2\,\cdots$  (15.40)

is used. Obvious calculations show that

$E_\theta^n G_0^J = 0, \qquad E_\theta^n G_1^J = O(n^{-1/2}),$
$E_\theta^n (G_0^J)^2 = \mu_4 - \sigma^4, \qquad E_\theta^n G_0^J G_1^J = -q(\mu_4 - \sigma^4)n^{-1/2},$  (15.41)
$E_\theta^n G_0^J G_2^J = q(\mu_4 - \sigma^4) + O(n^{-1}).$

These equalities are used for obtaining (15.34) and (15.35).


Passing to the estimator $C_n$, let us introduce the event

and instead of (15.40) let us use an analogous inequality for $C_n$. The calculations of the mathematical expectations analogous to (15.41) are straightforward.
To within $o(n^{-1})$ the estimator $J_n$ has the least bias. The biases of $\hat\sigma_n^2$ and $C_n$ are identical in modulus but differ in sign. From (13.34) and (15.35) it follows that

The sign of the difference depends upon the sign of the expression on the right hand side of (15.42). Let us note that $\beta_2 \ge -2$, and the case $\beta_2 = -2$ corresponds to a degenerate r.v. $\varepsilon_j$. Let $\beta_1 = 0$; then for $q > 2(2+\beta_2)$ and $n > n_0$

But, for example, for Gaussian $(0,\sigma^2)$ r.v.-s $\varepsilon_j$ ($\beta_2 = 0$), for dimensions $q = 1,2,3$ and $n > n_0$

Analogously

and for $\beta_1 = 0$ and $n > n_0$

In exactly the same way

and for $\beta_1 = 0$ and $n > n_0$

Let us compare the variances. Firstly,

and for $\beta_1 = 0$ and $n > n_0$

Secondly,

$D(C_n) - D(J_n) = \sigma^4\bigl(2q(\beta_2+2) - 2\beta_1\sigma Z(\theta)\bigr)n^{-1} + o(n^{-1}),$
$D(C_n) - D(\hat\sigma_n^2) = \sigma^4\bigl(4q(\beta_2+2) - 4\beta_1\sigma Z(\theta)\bigr)n^{-1} + o(n^{-1}),$

and under the same conditions

In this way, for $\beta_1 = 0$ and $n > n_0$ the variance and mean square deviation of the functional $\hat\sigma_n^2$ are the smallest (with the exception of the case $q > 2(2+\beta_2)$, when $S(\hat\sigma_n^2) > S(J_n)$). By these indicators $J_n$ occupies second place, and the functional $C_n$ in this case has the worst characteristics. If, in addition, the r.v. $\varepsilon_j$ has non-zero skewness ($\beta_1 \ne 0$), then the properties of the regression function reflected in the term $Z(\theta)$ will influence the relations between the variances and mean square deviations (see (15.39)).
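The ordering of the biases just described is easy to observe numerically. The sketch below uses the linear special case of the regression model, where the leave-one-out quantities have closed forms through the leverages $h_t$ of the hat matrix ($\mathrm{RSS}_{(-t)} = \mathrm{RSS} - e_t^2/(1-h_t)$); the combination rules taken for $J_n$ and $C_n$ are the standard jackknife and leave-one-out cross-validation forms, an assumption of the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, reps = 40, 3, 2000
X = rng.normal(size=(n, q))
H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix of the linear model
h = np.diag(H)                              # leverages h_t

s2_mean = J_mean = C_mean = 0.0
for _ in range(reps):
    y = rng.normal(size=n)                  # true theta = 0, true sigma^2 = 1
    e = y - H @ y                           # residuals
    rss = e @ e
    s2 = rss / n                            # sigma_n^2: downward bias of order q/n
    s2_loo = (rss - e**2 / (1.0 - h)) / (n - 1)
    J = n * s2 - (n - 1) * np.mean(s2_loo)  # jackknife: bias essentially removed
    C = np.mean((e / (1.0 - h)) ** 2)       # cross-validation: upward bias of order q/n
    s2_mean += s2 / reps
    J_mean += J / reps
    C_mean += C / reps

print(s2_mean, J_mean, C_mean)   # roughly 0.93, 1.0, 1.08 for n = 40, q = 3
```

In this special case the ordering $\hat\sigma_n^2 < J_n < C_n$ even holds for every realisation, since $J_n - \hat\sigma_n^2 = n^{-1}\sum e_t^2 h_t/(1-h_t) \ge 0$ and $C_n - J_n = n^{-1}\sum e_t^2 h_t/(1-h_t)^2 \ge 0$.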

16 ASYMPTOTIC EXPANSIONS OF DISTRIBUTIONS OF QUADRATIC FUNCTIONALS OF THE LEAST SQUARES ESTIMATOR

Setting
$u(\theta) = n^{1/2}(\hat\theta_n - \theta),$
let us consider the following functionals of $\hat\theta_n$:

$\tau^{(1)}(\theta) = \sigma^{-2}\bigl(L(\theta) - L(\hat\theta_n)\bigr),$  (16.1)

$\tau^{(2)}(\theta) = \sigma^{-2}\bigl\langle I(\hat\theta_n)u(\theta),\,u(\theta)\bigr\rangle,$  (16.2)

$\tau^{(3)}(\theta) = \sigma^{-2}\bigl\langle I(\theta)u(\theta),\,u(\theta)\bigr\rangle,$  (16.3)

$\tau^{(4)}(\theta) = \sigma^{-2}\varphi_n(\hat\theta_n,\theta).$  (16.4)

For Gaussian $(0,\sigma^2)$ r.v.-s $\varepsilon_j$ the functionals (16.1) and (16.2) are the statistics of the Neyman-Pearson criterion (with the coefficient $\sigma^2/2$) and of the Wald criterion for testing the hypothesis that the value of the unknown parameter equals $\theta$ ([189], Section 6e.2). The functional (16.1) is widely used in regression analysis to construct confidence regions for the unknown parameter $\theta$. The functional (16.4) is naturally called the Kullback-Leibler statistic, since for Gaussian $(0,\sigma^2)$ r.v.-s $\varepsilon_j$ the quantity $\sigma^{-2}\varphi_n(\theta_1,\theta_2)$ is twice the Kullback-Leibler distance [39] between the Gaussian measures $P_{\theta_1}^n$ and $P_{\theta_2}^n$. And, finally, the functional (16.3) is a modification of the statistic (16.2) of Wald's criterion.
The functionals (16.2)-(16.4) are quadratic in the sense that they converge weakly to the $\chi^2_q$ distribution as $n \to \infty$.
This Section contains theorems about the a.e.-s of the distributions of the functionals (16.1)-(16.4). Our goal is to obtain and analyse the initial terms of the a.e.-s: they are the most important ones for applications. Therefore the assertions of this Section are stated only in the generality necessary for this purpose, although they remain true in more general formulations.
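Before turning to the expansions, note that the Wald-type statistic (16.2) is directly computable from data. A hedged sketch (the exponential model, the plug-in estimate of $\sigma^2$, and the normalized information matrix $I = n^{-1}\sum \nabla g\,\nabla g^{\mathsf T}$ are assumptions of the sketch):

```python
import numpy as np
from scipy.optimize import least_squares

def g(x, th):
    return th[0] * np.exp(-th[1] * x)

def jac(x, th):
    # n x q matrix of gradients of g with respect to theta
    e = np.exp(-th[1] * x)
    return np.stack([e, -th[0] * x * e], axis=1)

rng = np.random.default_rng(2)
n = 200
x = np.linspace(0.0, 4.0, n)
theta = np.array([1.5, 0.8])
y = g(x, theta) + rng.normal(0.0, 0.2, n)

th_hat = least_squares(lambda th: y - g(x, th), np.array([1.0, 1.0])).x
u = np.sqrt(n) * (th_hat - theta)           # u(theta) = n^(1/2)(theta_hat - theta)
G = jac(x, th_hat)
I_hat = G.T @ G / n                         # I(theta_hat)
s2 = np.mean((y - g(x, th_hat)) ** 2)       # plug-in for sigma^2

tau2 = (u @ I_hat @ u) / s2                 # Wald statistic of the type (16.2)
```

Under the hypothesis, `tau2` is asymptotically $\chi^2$ with $q = 2$ degrees of freedom here.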
The central place in the Section is occupied by the concept of a virtual vector. We say that a random vector is virtual if its s.a.e. is similar to the s.a.e. of an l.s.e., while, generally speaking, it is not an a.e. of any estimator. As will be seen later, the concept of a virtual vector is technically convenient for obtaining the a.e. of the distribution of functionals of $\hat\theta_n$ of Neyman-Pearson type, which do not admit the expansions (16.7)-(16.9).
We shall need the following special case of Theorem 18 of Section 7 (see also Lemma 25.1 of Section 12 and Remark 26.1 of Section 13).
LEMMA 34.1: Let $\mu_5 < \infty$ (condition $\mathrm{I}_5$) and let conditions II, III, V of Section 10 be satisfied for $k = 4$, and IV$_1$ of Section 12 for $m = 5$. Then, if for any $r > 0$
$\sup_{\theta\in T} P_\theta^n\bigl\{|\hat\theta_n - \theta| \ge r\bigr\} = o(n^{-3/2}),$
there then holds

moreover, for some constant $c_1 = c_1(T) > 0$,
$\sup_{\theta\in T} P_\theta^n\bigl\{|h_3^*| \ge c_1\log^2 n\bigr\} = o(n^{-3/2}),$
and $h_v$, $v = 0,1,2$, are the vector polynomials of Section 7 (taking into account the normalisation $n^{1/2}\mathbf{1}_q$ instead of $d_n(\theta)$).

It is now convenient for us to write the coordinates of these polynomials in the following form ($i = 1,\dots,q$):

$h_0^i = V^i,$

$h_1^i = \pi_1 V_j^i V^j + \pi_2 A^{i\alpha}a_{\alpha jk}V^jV^k,$

$h_2^i = \rho_1 V_{jk}^i V^jV^k + \rho_2 V_j^i V_k^j V^k + \rho_3 A^{i\alpha}a_{\alpha jkl}V^jV^kV^l + \rho_4 A^{\alpha\beta}a_{\beta jk}V_\alpha^i V^jV^k + \rho_5 A^{i\alpha}a_{\alpha jk}V_l^j V^kV^l + \rho_6 A^{i\alpha}A^{\beta\gamma}a_{\alpha\gamma l}a_{\beta jk}V^jV^kV^l,$  (16.5)

where

$\pi_1 = 1, \quad \pi_2 = -\tfrac{1}{4}, \quad \rho_1 = \tfrac{1}{2}, \quad \rho_2 = 1, \quad \rho_3 = -\tfrac{1}{12}, \quad \rho_4 = -\tfrac{1}{4}, \quad \rho_5 = -\tfrac{1}{2}, \quad \rho_6 = \tfrac{1}{8},$  (16.6)

and the quantities $a_{\alpha jk}$ and $a_{\alpha jkl}$ are those introduced in Section 7.
LEMMA 34.2: Under the conditions of Lemma 34.1 the functionals $\gamma^{(m)}$, $m = 2,3,4$, admit the s.a.e.

(16.7)

moreover:
(1) $\gamma^{(m)} = \sigma^{-2}\bigl\{I_{ij}u^iu^j + \bigl(c^{(m)}\Pi_{(i)(jk)}u^iu^ju^k\bigr)n^{-1/2} + \bigl(\bigl(d^{(m)}\Pi_{(ij)(kl)} + e^{(m)}\Pi_{(i)(jkl)}\bigr)u^iu^ju^ku^l\bigr)n^{-1}\bigr\};$  (16.8)
(2) the $t^{(m)}$ are r.v.-s that have the following property: for some constants $c_4^{(m)} = c_4^{(m)}(T) > 0$,
$\sup_{\theta\in T} P_\theta^n\bigl\{|t^{(m)}| \ge c_4^{(m)}\log^{2.5} n\bigr\} = o(n^{-3/2});$  (16.9)
(3) $c^{(2)} = 2$, $d^{(2)} = e^{(2)} = 1$, $c^{(3)} = d^{(3)} = e^{(3)} = 0$, $c^{(4)} = 1$, $d^{(4)} = \tfrac{1}{4}$, $e^{(4)} = \tfrac{1}{3}$.  (16.10)

Proof: For the quantities $\gamma^{(m)}(\theta + n^{-1/2}u)$, $m = 2,3,4$, let us write the Taylor expansions in $u$ up to the fourth order derivatives inclusive, with the remainder term in Lagrange form, and let us rewrite them in the form (16.7). Using the conditions of the Lemma, for $t^{(m)}$, $m = 2,3,4$, we obtain the bound

(16.11)

From the conditions of the Lemma it is not difficult to deduce (cf. (13.4)) that there exists a constant $c_4 = c_4(T) > 0$ such that

$\sup_{\theta\in T} P_\theta^n\bigl\{|u(\theta)| \ge c_4\log^{1/2} n\bigr\} = o(n^{-3/2}).$  (16.12)

In fact, for the sums of the r.v.-s

and some constants
$c^{(i)}(T),\quad c^{(i_1i_2)}(T),\quad c^{(i_1i_2i_3)}(T)$
there hold the relations (Theorem A.5)

Therefore it is possible to determine constants $c_5 = c_5(T)$, $c_5' = c_5'(T)$ such that for

we have

$P_\theta^n\bigl\{|u(\theta)| \ge c_4\log^{1/2} n\bigr\} \le P_\theta^n\bigl\{|h_0| \ge c_4\log^{1/2} n - a_n\bigr\} + \sum_{i=1}^{q} P(b^i) + \sum_{i_1,i_2=1}^{q} P(b^{i_1i_2}) + \cdots = o(n^{-3/2})$

uniformly in $\theta\in T$. (16.7) is then evident from (16.11) and (16.12).


LEMMA 34.3: Under the conditions of Lemma 34.1 the functionals $\tau^{(m)}$, $m = 2,3,4$, admit the s.a.e.

$\tau^{(m)} = \sigma^{-2}\Bigl\{I_{ij}V^iV^j + \Bigl(\sum_{i=1}^{2}\alpha_i^{(m)}A_i\Bigr)n^{-1/2} + \Bigl(\sum_{i=1}^{6}\beta_i^{(m)}B_i\Bigr)n^{-1}\Bigr\} + \epsilon^{(m)}n^{-3/2},$  (16.13)

where:
(1) the $\epsilon^{(m)}$ are r.v.-s having the following property: there exist constants $c_6^{(m)} = c_6^{(m)}(T) > 0$ such that
$\sup_{\theta\in T} P_\theta^n\bigl\{|\epsilon^{(m)}| \ge c_6^{(m)}\log^{2.5} n\bigr\} = o(n^{-3/2});$
(2) $A_2 = \Pi_{(i)(jk)}V^iV^jV^k;$
(3) $B_1 = I_{i\alpha}V_j^{\alpha}V_k^{j}V^iV^k,$
$B_2 = \bigl(\Pi_{(\alpha)(jk)} + 2\Pi_{(j)(\alpha k)}\bigr)V_i^{\alpha}V^iV^jV^k,$
$B_3 = A^{rs}\bigl(\Pi_{(r)(kl)}\Pi_{(s)(ij)} + 4\Pi_{(r)(kl)}\Pi_{(i)(js)} + 4\Pi_{(k)(rl)}\Pi_{(i)(js)}\bigr)V^iV^jV^kV^l,$
$B_4 = \Pi_{(ij)(kl)}V^iV^jV^kV^l,$
$B_5 = \Pi_{(i)(jkl)}V^iV^jV^kV^l,$
$B_6 = I_{i\alpha}V_{jk}^{\alpha}V^iV^jV^k;$
(4) the coefficients $\alpha_i^{(m)}$ and $\beta_i^{(m)}$ satisfy the following relations:
{1} $2\pi_1 = \alpha_1^{(m)}$,
{2} $12\pi_2 + c^{(m)} = \alpha_2^{(m)}$,
{3} $2\rho_2 + \pi_1^2 = \beta_1^{(m)}$,
{4} $4(\rho_4 + \rho_5) + 4\pi_1\pi_2 + \pi_1 c^{(m)} = \beta_2^{(m)}$,
{5} $8\rho_6 + 4\pi_2^2 + 2\pi_2 c^{(m)} = \beta_3^{(m)}$,
{6} $12\rho_3 + d^{(m)} = \beta_4^{(m)}$,
{7} $16\rho_3 + e^{(m)} = \beta_5^{(m)}$,
{8} $2\rho_1 = \beta_6^{(m)}$.  (16.14)

Proof: The proof consists of substituting the polynomials (16.5) into (16.8).

It is not difficult to notice that the functional $\tau^{(4)}$ is not representable in the form (16.7), (16.8). Nevertheless, for $\tau^{(4)}$ a result holds that is analogous to Lemma 34.3.
LEMMA 34.4: Under the conditions of Lemma 34.1 the functional $\tau^{(4)}$ may be represented in the form (16.13).
Proof: We have $P_\theta^n$-a.s.

Clearly,

$\sum \varepsilon_j\bigl(g(j,\hat\theta_n) - g(j,\theta)\bigr) = \sum_{|\alpha|=1}^{3}\dfrac{1}{\alpha!}\,b_{(\alpha)}\hat u^{\alpha}\,n^{-(|\alpha|-1)/2} + l_n n^{-3/2},$

where for some constant $c_7 = c_7(T) > 0$

$\sup_{\theta\in T} P_\theta^n\bigl\{|l_n| \ge c_7\log^{2.5} n\bigr\} = o(n^{-3/2}).$

We note, further, that

$b_i = I_{i\delta}A^{\delta\beta}b_\beta = I_{i\delta}V^{\delta}, \qquad b_{ij} = I_{i\delta}A^{\delta\beta}b_{\beta j} = I_{i\delta}V_j^{\delta}, \qquad b_{ijk} = I_{i\delta}A^{\delta\beta}b_{\beta jk} = I_{i\delta}V_{jk}^{\delta}.$


Instead of a direct proof of Lemma 34.4 we could refer to the result of Theorem 26 of Section 13 and the formal expansion (13.13) preceding its formulation, which can be rewritten in the form

$\tau^{(1)}(\theta) = -\sigma^{-2}\sum_{v=0}^{\infty} P_{v+1}(\theta)n^{-v/2},$

where the $P_v$, $v = 1,2,\dots$, are polynomials of the a.e. (13.16). Analogously we have, formally,

$\tau^{(4)}(\theta) = \sigma^{-2}\sum_{v=0}^{\infty} \Lambda_{v+1}(\theta)n^{-v/2},$

where the quantities $\Lambda_v(\theta)$ are given by (13.14). The first terms of the expansions mentioned are given by the expressions (13.19)-(13.21).
Table 3.2 lists the values of the coefficients $\alpha_i^{(m)}$ and $\beta_i^{(m)}$ for the various criteria $\tau^{(m)}$. For $m = 2,3,4$ the values of $\alpha_i^{(m)}$ and $\beta_i^{(m)}$ are obtained from (16.6), (16.10), and (16.14), while $\alpha_i^{(1)}$ and $\beta_i^{(1)}$ are taken immediately from (16.13).

Table 3.2: The coefficients $\alpha_i^{(m)}$, $\beta_i^{(m)}$.

m    a1    a2    b1    b2    b3     b4     b5     b6
1    1     -1    1     -1    1/4    -1/4   -1/3   1/3
2    2     -1    3     -2    1/4    0      -1/3   1
3    2     -3    3     -4    5/4    -1     -4/3   1
4    2     -2    3     -3    3/4    -3/4   -1     1

(Here $a_i$ stands for $\alpha_i^{(m)}$ and $b_i$ for $\beta_i^{(m)}$.)

The standard method of obtaining the a.e.-s of the distributions of the functionals $\tau^{(m)}$, $m = 2,3,4$, consists of the following. Let us write
$\delta^{(m)} = c_4^{(m)}n^{-3/2}\log^{2.5} n$
(see (16.9)). Then we have, uniformly in $\theta\in T$,

$P_\theta^n\bigl\{\tau^{(m)} < z\bigr\} \gtrless P\bigl\{\tau^{[m]} < z \mp \delta^{(m)}\bigr\} + o(n^{-3/2}).$  (16.15)

Let $F_n(x)$ be the d.f. of the vector $u(\theta)$. Then

(16.16)

Let us approximate the d.f. $F_n$ in (16.16) by its a.e. We arrive at the final result after the necessary changes of variables in the approximating integral, taking into account the magnitudes of the remainder terms in the expressions so obtained.
Unfortunately we cannot apply this method to the important functional $\tau^{(1)}$, since it lacks a representation (16.7), (16.8). Nevertheless, the result of Lemma 34.3 permits us to unify the method of obtaining the a.e.-s of the distributions of all four functionals. Let

(16.17)

where $h_3(\theta)$, $\theta\in\Theta$, is a bounded non-random function on $T$, $\bar h_0 = h_0$, and $\bar h_1$ and $\bar h_2$ are vector polynomials of the form (16.5) whose coefficients $\pi_1$, $\pi_2$, $\rho_1$-$\rho_6$ are arbitrary and are not obliged to coincide with the values (16.6) corresponding to $\hat\theta_n$. Also let

and
r[1] = (1-2 {IiiUiUi + (c(1)II(i)(ik)UiUiuk) n- 1/ 2

+ ((d(1)II(ii)(kl) + e(1)II(i)(i kl) uiuiuku l ) n- 1 }, (16.18)


16. AEs OF QUADRATIC FUNCTIONALS' DISTRIBUTIONS 213

where C(l) , d(l) , e(l) are some coefficients


Let us assume that the coefficients $\pi_1$, $\pi_2$, $\rho_1$-$\rho_6$ are the coefficients of the representation of the vector $\bar u$, and that $c^{(1)}$, $d^{(1)}$, $e^{(1)}$ are related to them by the system of equations (16.14) with right hand sides $\alpha_1^{(1)}$, $\alpha_2^{(1)}$, $\beta_1^{(1)}$-$\beta_6^{(1)}$ from the expansion (16.13) for $\tau^{(1)}$. From Table 3.2 we find at once that
$\pi_1 = \tfrac{1}{2}, \qquad \rho_2 = \tfrac{3}{8}.$
From equations {2}, {4}, and {5}, using Table 3.2, we obtain

(9) $c^{(1)} = -1 - 12\pi_2$,
(10) $\rho_4 + \rho_5 = \pi_2 - \tfrac{1}{8}$,
(11) $10\pi_2^2 + \pi_2 + \tfrac{1}{8} - 4\rho_6 = 0$.

From {6} and {7} it follows that

(12) $d^{(1)} = -\tfrac{1}{4} - 12\rho_3$,
(13) $e^{(1)} = -\tfrac{1}{3} - 16\rho_3$,

and, consequently,
(14) $d^{(1)} = \tfrac{3}{4}\,e^{(1)}$.
Equations (9)-(14) do not allow us to determine the coefficients $\pi_2$ and $\rho_3$-$\rho_6$ uniquely. If, in order to give $\tau^{[1]}$ a simple form, we choose in (16.18)
$c^{(1)} = d^{(1)} = e^{(1)} = 0$
(cf. $\tau^{(3)}$), then the coefficients $\pi_1$, $\pi_2$, $\rho_1$-$\rho_6$ of the representation of $\bar u$ are given by the equalities

$\pi_1 = \tfrac{1}{2}, \quad \pi_2 = -\tfrac{1}{12}, \quad \rho_2 = \tfrac{3}{8}, \quad \rho_3 = -\tfrac{1}{48}, \quad \rho_4 + \rho_5 = -\tfrac{5}{24}, \quad \rho_6 = \tfrac{1}{36}.$  (16.19)

Substituting (16.17) into (16.18), with the help of (16.13) we can convince ourselves that for $m = 1$

(16.20)

where the r.v. $t^{(1)}$ has the property (16.9) with a constant $c_4^{(1)} = c_4^{(1)}(T) > 0$.

In this way (16.20) is analogous to (16.7).
We shall call $\bar u(\theta)$ a virtual vector, and the representation (16.18) a virtual s.a.e. The vector with the coefficients (16.19) and
$\tau^{[1]} = \sigma^{-2}I_{ij}\bar u^i\bar u^j$
are one realisation of a virtual vector and a virtual s.a.e., respectively. Let us keep the notation $F_n(x)$ for the d.f. of the virtual vector $\bar u$. Then for the functional $\tau^{(1)}$ the relations (16.15) and (16.16) hold. Consequently the a.e. of its distribution can be obtained once the a.e. of the d.f. of the virtual vector $\bar u$ is available.
In the work [16] an assertion about the a.e. of the d.f. of the functional $\tau^{(1)}$ is proved which uses the a.e. of the d.f. of the vector $V(\theta)$ (see Section 10) and the a.e. (16.13). In spite of the greater naturalness of such an approach in comparison with the virtual approach just described, it turns out to be unsuccessful from the computational point of view. As we have already seen above, the proofs of theorems about a.e.-s usually also contain a computational scheme, following which it is possible to find the initial terms of the a.e. that are important in applications. The proof of the theorem in the work [16] is no exception to this rule. However, an attempt to calculate the second term of the asymptotic d.f. of $\tau^{(1)}$ relying on [16] proved unsuccessful: one arrives at a complete halt owing to the extraordinary tedium of the calculation required.
In accordance with a distinctive law of conservation of calculational difficulty, it becomes clear in solving the problem under consideration that the use of the virtual approach does not free us from a huge volume of processing. But here the fundamental computational difficulty is absorbed into Theorem 34 about the a.e. of the d.f. of the virtual vector $\bar u$, which is close to Theorem 24 of Section 10 about the a.e. of the d.f. of the vector $\hat\theta_n$.
THEOREM 34: Let the conditions of Theorem 24 of Section 10 be satisfied for $k = 4$. Then

$\sup_{\theta\in T}\ \sup_{C\in\mathfrak{E}^q}$  (16.21)

where $\bar M_v$, $v = 1,2$, are polynomials of degree $3v$ in the variables $y = (y^1,\dots,y^q)$ with coefficients uniformly bounded in $\theta\in T$ and $n$.

We do not give the proof of Theorem 34, since it coincides with the proof of Theorem 24. The calculation of the polynomials $\bar M_1$ and $\bar M_2$ is carried out analogously to Section 11, accompanied by still more tedious manipulation of the literal coefficients. The final result is cumbersome and is postponed to the end of the Section. The polynomial $\bar M_2$ contains three terms that are sums with sixth power variables, sixteen with fourth power variables, twenty-eight with second power variables, and nine constant terms; in all, $\bar M_2$ contains fifty-six terms that are sums. To obtain the polynomials $M_1$ and $M_2$ corresponding to $\hat\theta_n$ it is sufficient to substitute into $\bar M_1$ and $\bar M_2$ the set of coefficients (16.6). When this is done sixteen terms of $\bar M_2$ vanish, and in total only forty terms remain.
Let
$g_r(z) = \dfrac{e^{-z/2}z^{r/2-1}}{2^{r/2}\Gamma(r/2)}, \qquad z \ge 0, \quad r = 1,2,\dots,$
be the density of the $\chi^2$ distribution with $r$ degrees of freedom.
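The density $g_r$ is elementary to check against a library implementation; a quick numerical comparison with SciPy:

```python
import numpy as np
from math import gamma
from scipy.stats import chi2

def g_r(z, r):
    # chi-square density with r degrees of freedom, as in the formula above
    return np.exp(-z / 2.0) * z ** (r / 2.0 - 1.0) / (2.0 ** (r / 2.0) * gamma(r / 2.0))

z = np.linspace(0.1, 20.0, 200)
for r in (1, 2, 5, 10):
    assert np.allclose(g_r(z, r), chi2.pdf(z, r))
```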

THEOREM 35: Under the conditions of the preceding Theorem, for any $z_0 > 0$ ($q = 1$), $z_0 = 0$ ($q > 1$), and $m = 1,2,3,4$,

$\sup_{\theta\in T}\ \sup_{z\ge z_0} \quad\cdots\quad = O\bigl(n^{-3/2}\log^{2.5} n\bigr),$  (16.22)

where
$\lambda_{jk}^{(m)} = \lambda_{jk}^{(m)}\bigl(\alpha_1^{(m)},\,\alpha_2^{(m)},\,\beta_1^{(m)},\dots,\beta_6^{(m)}\bigr)$
are the numerical coefficients characterising the functionals $\tau^{(m)}$, and the quantities $p_k(\theta)$ do not depend upon $m$ and are given by the expressions

$p_1 = \dfrac{\gamma_4}{\sigma^4}\,A^{is}A^{\alpha\beta}\,\Pi_{(\alpha)(\beta)(i)(s)},$

$p_2 = \dfrac{\gamma_3^2}{\sigma^6}\,A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(\alpha)(\beta)(i)}\Pi_{(s)(j)(r)},$

$p_3 = \dfrac{\gamma_3^2}{\sigma^6}\,A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(\alpha)(j)(i)}\Pi_{(\beta)(s)(r)},$

$p_4 = \dfrac{\gamma_3}{\sigma^2}\,A^{is}A^{\alpha\beta}\,\Pi_{(is)(\alpha)(\beta)},$

$p_5 = \dfrac{\gamma_3}{\sigma^2}\,A^{is}A^{\alpha\beta}\,\Pi_{(i\alpha)(\beta)(s)},$

$p_6 = \dfrac{\gamma_3}{\sigma^2}\,A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(i)(s)(j)}\Pi_{(\alpha\beta)(r)},$

$p_7 = \dfrac{\gamma_3}{\sigma^2}\,A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(\alpha)(i)(j)}\Pi_{(\beta s)(r)},$

$p_8 = \dfrac{\gamma_3}{\sigma^2}\,A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(\alpha)(i)(s)}\Pi_{(\beta j)(r)},$

$p_9 = \sigma^2 A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(\alpha\beta)(j)}\Pi_{(ir)(s)},$

$p_{10} = \sigma^2 A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(\alpha i)(j)}\Pi_{(\beta r)(s)},$

$p_{11} = \sigma^2 A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(i)(js)}\Pi_{(\alpha)(\beta r)},$

$p_{12} = \sigma^2 A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(i)(jr)}\Pi_{(s)(\alpha\beta)},$

$p_{13} = \sigma^2 A^{is}A^{jr}A^{\alpha\beta}\,\Pi_{(i)(j\alpha)}\Pi_{(s)(r\beta)},$

$p_{14} = \sigma^2 A^{ij}A^{kl}\,\Pi_{(ij)(kl)},$

$p_{15} = \sigma^2 A^{ij}A^{kl}\,\Pi_{(ik)(jl)},$

$p_{16} = \sigma^2 A^{ij}A^{kl}\,\Pi_{(i)(jkl)}.$

Proof: We shall carry out the proof for the virtual vector $\bar u$ and its d.f. $F_n(x)$; in particular, this includes the case $\bar u = u$. The coefficients $\lambda_{jk}^{(m)}$ are contained in Table 3.3.
Let us denote
$X_n^{\mp} = \bigl\{x: \tau^{[m]}(\theta + n^{-1/2}x) < z \mp \delta^{(m)}\bigr\},$
$s_n(\theta,\varkappa) = \bigl\{u: \langle J(\theta)u,u\rangle \le \varkappa^2\log n\bigr\},$
where $\varkappa > 0$ is some constant. Thanks to (16.15), (16.16), the Theorem will be proved if the required expansion can be obtained for the integrals $\int_{X_n^{\mp}}dF_n$. The sets $X_n^{\mp}\cap s_n(\theta,\varkappa)$ are convex for $n > n_0$. On the other hand the constant $\varkappa$ can be chosen such that

$\le P_\theta^n\bigl\{|\bar u| > \varkappa\,\lambda_{\min}^{-1/2}(J)\log^{1/2} n\bigr\} = o(n^{-3/2}),$

Table 3.3: The coefficients $\lambda_{jk}^{(m)}$.


k j=O j=1 j=2 j=3
1 1 1 1 0
8 -4 8
2 1 3 3 1
-8 8 -8 8
3 1 1 1 1
- 12 4 -4 12
4 i 0:1 -! 0:1 i 0:1 0
5 0 -! 0:1 ! 0:1 0
6 - i 0:1 ~ 0:1 + i 0:2 - ~ 0:1 - ! 0:2 i (0:1 + 0:2)
7 0 ! 0:1 -0:1 - ! 0:2 ! (0:1 + 0:2)
8 0 ! (0:1 + 0:2) - (0:1 + 0:2) ! (0:1 + 0:2)
! (0:1 + 0:2)2- - (0:1 + 0:2)2+
9 0 ! (0:1 + 0:2)2
!,82 - 2,83 !,82 + 2,83
! (0:1 + 0:2)2 - !,81- - (0:1 + 0:2)2 + !,81 +
10 0 ! (0:1 + 0:2)2
2,82 - 6,83 2,82 + 6,83
! (0:1 + 0:2)2 - !,81- - (0:1 + 0:2)2 + !,81 +
11 0 ! (0:1 + 0:2)2
,82 - 2,83 ,82 + 2,83

2
- !8 0:1
i (0:1 + 0:2)2 + i o:~ - - i (0:1 + 0:2)2 - i o:~ +
12 ~ (0:1 + 0:2)2
! ,83 ! ,83
- i o:~ + i (0:1 + 0:2)2 + o:~- - ! (0:1 + 0:2)2 - i o:~ +
13 i (0:1 + 0:2)2
! ,81 ,81 - ,82 - 3,83 ! ,81 + ,82 + 3,83
14 !8 0: 2
1 - i o:~ - !,84 1 2 + 21 ,84
80:1 0
1 2-
40:1
15 - ! o:~ + !,81 -,84 12(3
40:1 + 4 0
! ,81
16 0 - ~ (,85 + ,86) ~ (,85 + ,86) 0

uniformly in 0 E T, since clearly the virtual vector u has the property (16.12).
Therefore it is sufficient to restrict ourselves to the consideration of the integrals
Jx~nSn dFn
By Theorem 34

sup sup
(JET z~zo

= O{n- 3 / 2 Iog2 n).

Consequently the task is reduced to the study of the integrals

Let us consider the integral Y:. The integral Yn- is considered analogously. In
Y: let us carry out the substitution of variables u -+ UA1/2U, and then the polar
substitution of variables u -+ (r, cp), cp = (cp1, .. . , cpq-1):
i-1
ui = r II sin cpo. cos cpi, i = 1, .. . ,q,
01.=1
cpo. E [0,11"), a: = 1, ... , q - 2,
cpq-1 E [0,211"), CPq == 0, r ~ o.
Then the function \gamma^{[m]} is transformed into the form

\cdots

where a_1 and a_2 are trigonometric polynomials in the variables \varphi^1, \dots, \varphi^{q-1}. For
example,

a_1 = c(m)\,\Pi_{(i_1)(i_2 i_3)}\,(A^{1/2})_{i_1 j_1}(A^{1/2})_{i_2 j_2}(A^{1/2})_{i_3 j_3}
\times \prod_{\alpha_1=1}^{j_1-1}\sin\varphi^{\alpha_1}\cos\varphi^{j_1}\ \prod_{\alpha_2=1}^{j_2-1}\sin\varphi^{\alpha_2}\cos\varphi^{j_2}\ \prod_{\alpha_3=1}^{j_3-1}\sin\varphi^{\alpha_3}\cos\varphi^{j_3}.

The polynomial a_2 is written analogously. The set X_n^+ \cap S_n is transformed into
the set

\cdots, \qquad \Pi^{(q)} = [0,\pi)^{q-2} \times [0,2\pi).



The integrand in the integral Y_n^+ takes the form

\cdots

where

I(r, \varphi) = r^{q-1} \prod_{i=1}^{q-2} (\sin\varphi^{i})^{q-i-1}

is the Jacobian of the polar coordinate transformation, and

M_1 \to m_1^{(3)}\sigma^3 r^3 + m_1^{(1)}\sigma r,

M_2 \to m_2^{(6)}\sigma^6 r^6 + m_2^{(4)}\sigma^4 r^4 + m_2^{(2)}\sigma^2 r^2 + m_2^{(0)},

where the m_i^{(j)} are trigonometric polynomials in \varphi^1, \dots, \varphi^{q-1} determined by the
substitutions, mentioned above, of variables from the formulae for the polynomials
M_1 and M_2.
By Fubini's Theorem

(16.23)

Let us denote by H_n(\varphi) the image of the interval i_n under the mapping
r \to r^2 \tilde m_1(r,\varphi). It is clear that for n > n_0 and some constant c_8 \in (0,\infty)

\big[0,\ \sigma^{-2}x^2\log n - c_8\, n^{-1/2}\log^{3/2} n\big] \subset H_n \subset \big[0,\ \sigma^{-2}x^2\log n + c_8\, n^{-1/2}\log^{3/2} n\big]

uniformly in \theta \in T. Let

\Psi_n(\rho, \varphi)\colon H_n \to \mathbb{R}^1

be the inverse function of r^2\tilde m_1(r,\varphi)\big|_{i_n}. The function \Psi_n can be found formally
in the form of a series in half-integer powers of \rho, the coefficients of which are
trigonometric polynomials in \varphi^1, \dots, \varphi^{q-1}, with coefficients uniformly bounded in
\theta \in T and n:

r = \Psi_n(\rho, \varphi) = \rho^{1/2} + \sum_{\nu=1}^{\infty} n^{-\nu/2}\,\rho^{(\nu+1)/2}\,\sigma^{\nu}\,\Delta_{\nu}(\varphi).   (16.24)

The coefficients \Delta_{\nu} are calculated by substituting r^2\tilde m_1(r,\varphi) into the series
(16.24) and setting equal to zero the coefficients of all powers higher than the first.
Let us find the quantities \Delta_1 and \Delta_2. Let us denote

\rho = r^2\tilde m_1(r,\varphi).

Then

\tau = \sigma r n^{-1/2}.

Since \cdots for any c > 0 and n > n_0, \cdots. Consequently

(16.25)

Substituting (16.25) into (16.24) we obtain recurrence relations for the calculation
of the coefficients \Delta_{\nu}:

\tfrac12\hat\Delta_1 + \Delta_1 = 0,

\tfrac12\hat\Delta_2 - \tfrac12\Delta_1^2 + \hat\Delta_1\Delta_1 + \Delta_2 = 0,

\dots

In particular we find

\Delta_1 = -\tfrac12\hat\Delta_1, \qquad \Delta_2 = \tfrac58\hat\Delta_1^2 - \tfrac12\hat\Delta_2.   (16.26)
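As an illustrative numerical check (the specific form of m̃₁ and all numerical values below are hypothetical, not from the book): take m̃₁(r) = 1 + Δ̂₁σr n^{-1/2} + Δ̂₂σ²r² n^{-1}, so that ρ = r²m̃₁(r); the truncated reversion r ≈ ρ^{1/2} + σΔ₁ρ n^{-1/2} + σ²Δ₂ρ^{3/2} n^{-1} with the coefficients (16.26) then agrees with the numerically inverted map up to the O(n^{-3/2}) remainder.

```python
# Sketch (hypothetical coefficients): verify the series reversion (16.24)-(16.26)
# for rho = r^2 * (1 + d1h*(s*r)/sqrt(n) + d2h*(s*r)^2/n) against bisection.
import math

d1h, d2h, s, n = 0.7, -0.4, 1.3, 1e6     # \hat\Delta_1, \hat\Delta_2, sigma, n
D1 = -d1h / 2.0                          # Delta_1 = -\hat\Delta_1 / 2
D2 = 5.0 * d1h ** 2 / 8.0 - d2h / 2.0    # Delta_2 = 5\hat\Delta_1^2/8 - \hat\Delta_2/2

def rho_of_r(r):
    t = s * r / math.sqrt(n)
    return r * r * (1.0 + d1h * t + d2h * t * t)

def r_of_rho(rho):
    lo, hi = 0.0, 10.0                   # rho_of_r is monotone on this range here
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if rho_of_r(mid) < rho:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rho = 2.5
r_exact = r_of_rho(rho)
r_series = math.sqrt(rho) + s * D1 * rho / math.sqrt(n) + s ** 2 * D2 * rho ** 1.5 / n
assert abs(rho_of_r(r_exact) - rho) < 1e-9
assert abs(r_exact - r_series) < 1e-7    # remainder is of higher order in n^{-1/2}
```

The coefficients D1, D2 are exactly the recurrence solutions above; a wrong sign in either makes the final assertion fail.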

From arguments linked to the solution of the problem of the inversion of a
power series ([81], pp. 498-502) it follows that for small values of

t = \sigma\rho^{1/2} n^{-1/2}

the series (16.24) is convergent. For small t

\Psi_n = \rho^{1/2} + \sigma\Delta_1\rho\, n^{-1/2} + \sigma^2\Delta_2\rho^{3/2}\, n^{-1} + r_n^{(1)} n^{-3/2},   (16.27)

with

r_n^{(1)} = O(\log^2 n)

uniformly in \theta \in T, \varphi \in \Pi^{(q)} and \rho < c_9 \log n\ (c_9 > \sigma^{-2}x^2).



For \rho > 0 and small t, from (16.24) we obtain

\frac{\partial\Psi_n}{\partial\rho} = \cdots,   (16.28)

where

r_n^{(2)} = O(\log n)

uniformly in \theta \in T and \varphi \in \Pi^{(q)}, and for \rho < c_8\log n.

Let us denote by Z_n^+ the integral in (16.23) over the set X_n^+, and let us carry
out in it the change of variable

r \to \rho = r^2\tilde m_1(r,\varphi).

For n > n_0 we obtain

Z_n^+ = \int_{H_n \cap [0,\, z+\delta(m)]} \Psi_n^{q-1}\, e^{-\Psi_n^2/2}\left(1 + \sum_{\nu=1}^{2} M_\nu(\Psi_n, \varphi)\, n^{-\nu/2}\right)\frac{\partial\Psi_n}{\partial\rho}\, d\rho.   (16.29)

In (16.29) let us substitute the representations (16.27) and (16.28) for \Psi_n and
\partial\Psi_n/\partial\rho, expressing the quantities \Delta_1 and \Delta_2 occurring in them by means of
(16.26). Simple transformations show that

Z_n^+ = \frac12\int_{H_n \cap [0,\, z+\delta(m)]} e^{-\rho/2}\,\rho^{q/2-1}\left(1 + \sum_{\nu=1}^{2} M_\nu(\rho, \varphi)\, n^{-\nu/2}\right) d\rho + r_n^{(3)} n^{-3/2}.   (16.30)
In the representation (16.30)

r_n^{(3)} = O(\log^2 n)

uniformly in \theta \in T and \varphi \in \Pi^{(q)}; the M_\nu are polynomials in \rho^{1/2} of degree 3\nu;
moreover M_1 contains only odd powers of \rho^{1/2}, and M_2 only even powers of \rho^{1/2}
(i.e., integer powers of \rho):

M_1 = \left(\sigma^3 m_1^{(3)} + \tfrac12\,\sigma\hat\Delta_1\right)\rho^{3/2} + \left(\sigma m_1^{(1)} - \tfrac{q+1}{2}\,\sigma\hat\Delta_1\right)\rho^{1/2},   (16.31)

(16.32)

The coefficients of the polynomials M_\nu are trigonometric polynomials in \varphi^1, \dots,
\varphi^{q-1} with coefficients uniformly bounded in \theta \in T and n. For \nu = 1 each term of
these trigonometric polynomials contains either one odd power of \cos\varphi^i, i = 1, \dots,
q-1, or an odd power of \sin\varphi^{q-1}; for \nu = 2 it contains terms with even powers
of the functions stated.
It is easy to understand that for a proper x one can obtain

\frac12\int_{\mathbb{R}_+^1\setminus H_n} e^{-\rho/2}\,\rho^{q/2-1}\left(1 + \sum_{\nu=1}^{2} M_\nu\, n^{-\nu/2}\right) d\rho = O(n^{-3/2})

uniformly in \theta \in T and \varphi \in \Pi^{(q)}. Therefore (16.30) can be rewritten in the form

Z_n^+ = \frac12\int_{0}^{z+\delta(m)} e^{-\rho/2}\,\rho^{q/2-1}\left(1 + \sum_{\nu=1}^{2} M_\nu\, n^{-\nu/2}\right) d\rho + r_n^{(4)} n^{-3/2},

where r_n^{(4)} has the property of r_n^{(3)}.

Returning to the integral (16.23) we obtain

Y_n^+ = \frac12\,(2\pi)^{-q/2}\int_{0}^{z+\delta(m)}\!\int_{\Pi^{(q)}} e^{-\rho/2}\,\rho^{q/2-1}\, I(1,\varphi)\left(1 + \sum_{\nu=1}^{2} M_\nu\, n^{-\nu/2}\right) d\varphi\, d\rho + \cdots   (16.33)

and

\cdots

uniformly in \theta \in T.
Let us integrate (16.33) with respect to \varphi. Thanks to the properties of the
trigonometric coefficients of the polynomial (16.31), the integral with respect to \varphi
of the polynomial I(1, \varphi)M_1 is equal to zero. In fact each coefficient of I(1, \varphi)M_1
contains:

(1) either one integral of the form \int_0^{\pi}\sin^a\varphi^j\cos^b\varphi^j\, d\varphi^j, j = 1, \dots, q-1, where
b = 1 or 3;

or (2) the integral \int_0^{2\pi}\sin^a\varphi^{q-1}\cos^b\varphi^{q-1}\, d\varphi^{q-1} with a = 1 or 3 and even b.

But such integrals vanish.
Let each

u^i = \prod_{\alpha=1}^{i-1}\sin\varphi^{\alpha}\,\cos\varphi^{i}, \qquad i = 1, \dots, q,

appear in M_2 in degree a_i. As has just been established, the integral with
respect to \varphi differs from zero if and only if all the a_i are even numbers (in partic-
ular, zero). In such a case, when the integration with respect to \varphi is performed, in the
expression (16.33) there appear the integrals

2\prod_{j=1}^{q-1}\frac{\Gamma\!\Big(\tfrac12\big(q-j-1+\sum_{i=j+1}^{q} a_i + 1\big)\Big)\,\Gamma\!\big(\tfrac12(a_j+1)\big)}{\Gamma\!\Big(\tfrac12\big(q-j+\sum_{i=j}^{q} a_i + 1\big)\Big)}
= 2\prod_{j=1}^{q}\Gamma\!\big(\tfrac12(a_j+1)\big)\,\Gamma^{-1}\!\big(\tfrac12(q+|a|)\big),
\qquad |a| = a_1 + \dots + a_q.
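The right-hand side equals, for even a_i, the integral of u_1^{a_1}···u_q^{a_q} over the unit sphere S^{q-1} (a standard fact consistent with the telescoping product above). A quick Monte Carlo confirmation, with an arbitrary choice of q and exponents (an illustration, not from the book):

```python
# Monte Carlo check of the identity above:
# for even a_i, int_{S^{q-1}} u1^{a1}...uq^{aq} dS
#             = 2 * prod_j Gamma((a_j+1)/2) / Gamma((q+|a|)/2).
import math, random

random.seed(1)
q, a = 3, (2, 2, 0)
exact = 2.0 * math.prod(math.gamma((ai + 1) / 2.0) for ai in a) \
        / math.gamma((q + sum(a)) / 2.0)

area = 2.0 * math.pi ** (q / 2.0) / math.gamma(q / 2.0)   # surface area |S^{q-1}|
N, acc = 200000, 0.0
for _ in range(N):
    z = [random.gauss(0.0, 1.0) for _ in range(q)]
    r = math.sqrt(sum(x * x for x in z))
    u = [x / r for x in z]                 # uniform point on the sphere
    acc += math.prod(ui ** ai for ui, ai in zip(u, a))
mc = area * acc / N

assert abs(exact - 4.0 * math.pi / 15.0) < 1e-12   # closed form for this case
assert abs(mc - exact) < 0.02
```

For q = 3 and a = (2, 2, 0) both sides equal 4π/15 ≈ 0.8378.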
In this way, after integrating with respect to the variables \varphi the trigonometric
coefficients of the polynomial (16.32), and simple transformations of the expressions
obtained, we obtain

Y_n^+ = \frac{1}{2^{q/2}\,\Gamma(q/2)}\int_0^{z+\delta(m)} e^{-\rho/2}\,\rho^{q/2-1}\big(1 + n^{-1}P_n(\rho)\big)\, d\rho + O(n^{-3/2}\log^2 n),

P_n(\rho) = p_0(\theta)\rho^3 + p_1(\theta)\rho^2 + p_2(\theta)\rho + p_3(\theta),

where

p_j(\theta) = p_j\big(\theta;\, c(m), d(m), e(m);\, \pi_1, \pi_2, p_1, \dots, p_6\big), \qquad j = 0,1,2,3,

are coefficients uniformly bounded with respect to \theta \in T and n; this holds uniformly
with respect to z \ge 0 for q > 1, and uniformly with respect to z \ge z_0 > 0 for q = 1.
Moreover,

Y_n^+ = \frac{1}{2^{q/2}\,\Gamma(q/2)}\int_0^{z} e^{-\rho/2}\,\rho^{q/2-1}\big(1 + n^{-1}P_n(\rho)\big)\, d\rho + O(\delta(m)).

An analogous representation also holds for Y_n^-. Analysing the coefficients p_j(\theta),
j = 0,1,2,3, of the polynomial P_n(\rho) it is not difficult to establish that

(16.34)

where

\lambda_{jk}^{(m)} = \lambda_{jk}^{(m)}\big(c(m), d(m), e(m);\, \pi_1, \pi_2, p_1, \dots, p_6\big)

are the coefficients characterising the functional T^{(m)}. Let us express, with the help
of the system of equations (16.14), the quantities d(m), \pi_1, \pi_2, p_1, \dots, p_6 through
the variables c(m), e(m), \alpha_1, \alpha_2, \beta_1-\beta_6 in the following way:

(16.35)
/36 = 3e(m) + 4/34 - 3/35,
P1 = 2' 4d(m)

P2
1
= 2 /31 - -t,
0: 2
P3 =
/35 - e(m)
16
Let us substitute the relations (16.35) into the coefficients \lambda_{jk}^{(m)} from the equality
(16.34). Collecting similar terms gives the coefficients \lambda_{jk}^{(m)} the
form indicated in Table 3.3.

REMARK 35.1: For m = 2,3,4 the quantities \pi_1, \pi_2, p_1-p_6 are given by (16.6).
Therefore in (16.34)

\lambda_{jk}^{(m)} = \lambda_{jk}^{(m)}\big(c(m), d(m), e(m)\big)

and their values as functions of c(m), d(m) and e(m) can be set out in a table
analogous to Table 3.3.

Let us consider the question of Bartlett's corrections for the a.e. (16.22) of
Theorem 35. Bartlett's correction is understood here as a perturbation of the
argument of the d.f. of the functionals T^{(m)} for which the term of order n^{-1} in
their a.e. vanishes.

Let us write the a.e. for P_\theta^n\{T^{(m)} < z\} in a form that is more suitable for
applications. Let us denote

w_j^{(m)} = \sum_{k=1}^{16} \lambda_{jk}^{(m)} P_k, \qquad j = 0,1,2,3.

The quantities w_j^{(m)} differ from the coefficients p_j of the polynomial P_n(\rho) in
(16.34) only by constant factors. There holds the remarkable equality

\sum_{j=0}^{3} w_j^{(m)} = 0,   (16.36)

which was discussed in another context in [47]. We have (see Table 3.3)

\sum_{j=0}^{3} \lambda_{jk}^{(m)} = 0, \qquad k = 1, \dots, 16,

and the validity of (16.36) is ascertained immediately. Therefore, in correspondence
with (16.22), we have, uniformly in \theta \in T,

P_\theta^n\{T^{(m)} \ge z\} = \bar G_q(z) + n^{-1}\sum_{j=0}^{3} w_j^{(m)}\,\bar G_{q+2j}(z) + o(n^{-1}),   (16.37)

where \bar G = 1 - G. By [1], p. 735,

\bar G_{q+2j} = \bar G_q + 2\sum_{k=1}^{j} g_{q+2k}, \qquad j \ge 0.   (16.38)
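Here G_k and g_k are the d.f. and density of the χ²_k distribution. Identity (16.38) is easy to verify directly; an illustrative check, restricted to even degrees of freedom so that the survival function has a closed form:

```python
# Check of (16.38): Gbar_{q+2j}(z) = Gbar_q(z) + 2 * sum_{k=1}^{j} g_{q+2k}(z)
# for the chi-square distribution (even q, closed-form tail).
import math

def chi2_sf_even(df, z):
    # survival function of chi^2_df for even df: e^{-z/2} * sum_{i<df/2} (z/2)^i / i!
    return math.exp(-z / 2.0) * sum((z / 2.0) ** i / math.factorial(i)
                                    for i in range(df // 2))

def chi2_pdf(df, z):
    return z ** (df / 2.0 - 1.0) * math.exp(-z / 2.0) \
        / (2.0 ** (df / 2.0) * math.gamma(df / 2.0))

q, z = 2, 3.7
for j in (1, 2, 3):
    lhs = chi2_sf_even(q + 2 * j, z)
    rhs = chi2_sf_even(q, z) + 2.0 * sum(chi2_pdf(q + 2 * k, z)
                                         for k in range(1, j + 1))
    assert abs(lhs - rhs) < 1e-12
```

For q = 2, j = 1 this reduces to e^{-z/2}(1 + z/2) = e^{-z/2} + z e^{-z/2}/2, which is immediate.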

In view of (16.36) and (16.38) one can write

\cdots

Denoting

b_k^{(m)} = \sum_{j=0}^{k-1} w_j^{(m)}

and once again taking (16.36) into account, we finally obtain

\sum_{j=0}^{3} w_j^{(m)}\,\bar G_{q+2j} = -2\sum_{k=1}^{3} b_k^{(m)}\, g_{q+2k}.   (16.39)

Let z = a^2 and let

\varphi(a) = (2\pi)^{-1/2} e^{-a^2/2}

be the standard Gaussian density. Then, combining (16.37) and (16.39), we obtain

P_\theta^n\{T^{(m)} \ge a^2\} = \bar G_q(a^2) - \frac{\sqrt{2\pi}}{n}\,\varphi(a)\sum_{k=1}^{3}\frac{b_k^{(m)}\, a^{q+2(k-1)}}{2^{(q/2)+k-1}\,\Gamma\big(\tfrac12(q+2k)\big)} + o(n^{-1})   (16.40)
uniformly in \theta \in T. Let us set z = a^2 - \delta n^{-1} in (16.37). Then

\bar G_q(a^2 - \delta n^{-1}) = \bar G_q(a^2) + g_q(a^2)\,\delta n^{-1} + o(n^{-1}),

\bar G_{q+2j}(a^2 - \delta n^{-1}) = \bar G_{q+2j}(a^2) + o(1), \qquad j = 1,2,3.

With regard to (16.40) we obtain

P_\theta^n\{T^{(m)} \ge a^2 - \delta n^{-1}\} = \bar G_q(a^2) + h\,\varphi(a)\left\{\delta - \sum_{k=1}^{3}\frac{b_k^{(m)} a^{2k}\,\Gamma(q/2)}{2^{k-1}\,\Gamma\big(\tfrac12(q+2k)\big)}\right\} n^{-1} + o(n^{-1}),   (16.41)

where

h = \frac{\sqrt{2\pi}\, a^{q-2}}{2^{q/2}\,\Gamma(q/2)}.

Choosing \delta by equating to zero the term of order n^{-1} on the right-hand side of
(16.41), we arrive at the expression

\delta = \sum_{k=1}^{3}\frac{b_k^{(m)} a^{2k}\,\Gamma(q/2)}{2^{k-1}\,\Gamma\big(\tfrac12(q+2k)\big)}.   (16.42)
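A direct transcription of (16.42); the b_k values used below are placeholders, not from the book:

```python
# Bartlett-type correction (16.42):
# delta = sum_{k=1}^{3} b_k * a^{2k} * Gamma(q/2) / (2^{k-1} * Gamma((q+2k)/2)).
import math

def bartlett_delta(b, a, q):
    # b = (b_1, b_2, b_3); a^2 is the argument being corrected
    return sum(
        bk * a ** (2 * k) * math.gamma(q / 2.0)
        / (2.0 ** (k - 1) * math.gamma((q + 2 * k) / 2.0))
        for k, bk in enumerate(b, start=1)
    )

# For q = 2 the Gamma ratios are 1, 1/4 and 1/24, so
# delta = b1*a^2 + b2*a^4/4 + b3*a^6/24.
a = 1.5
assert abs(bartlett_delta((1.0, 0.0, 0.0), a, 2) - a ** 2) < 1e-12
assert abs(bartlett_delta((0.0, 1.0, 0.0), a, 2) - a ** 4 / 4.0) < 1e-12
assert abs(bartlett_delta((0.0, 0.0, 1.0), a, 2) - a ** 6 / 24.0) < 1e-12
```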

Finally, let us set

a^2 - \delta n^{-1} = a^2\big(1 - \delta_q n^{-1}\big),

where a^2 = z_{1-\alpha} is the quantile of the \chi_q^2 distribution. Then from (16.41) and
(16.42) we find that

P_\theta^n\Big\{T^{(m)} \le z_{1-\alpha}\Big(1 - \frac{\delta_q}{n}\Big)\Big\} = 1 - \alpha + o(n^{-1}),   (16.43)

uniformly in \theta \in T, where

\delta_q = \delta / z_{1-\alpha}.   (16.44)

To conclude the Section we shall give explicit expressions for the polynomials
M_1 and M_2:

M_1(y) = \left\{\frac{\gamma_3}{6\sigma^6}\,\Pi_{(\alpha)(\beta)(\gamma)} + \frac{1}{\sigma^2}(\pi_1 + 6\pi_2)\,\Pi_{(\alpha\beta)(\gamma)}\right\} y^{\alpha} y^{\beta} y^{\gamma}

- A^{\alpha\beta}\left\{(\pi_1 + 8\pi_2)\,\Pi_{(\alpha)(\beta\gamma)} + (\pi_1 + 4\pi_2)\,\Pi_{(\gamma)(\alpha\beta)} + \frac{\gamma_3}{2\sigma^4}\,\Pi_{(\alpha)(\beta)(\gamma)}\right\} y^{\gamma},

13
- 20'6 (11'1 + 61r2)A i'JII(a,B)(')')II(i)(j)(o)
13 ..
- 20'6 (11'1 + 21r2)AtJII(a,B)(i)II(i)(j)(0)
13
+ 20'6 1r1II(a)(,B)(')'0)

-
13 (
60'6 11'1 + 411'2 ) A i'JII(ij)(a) II(,B)(')')(o)
13 ..
- 20'6 (11'1 + 41r2)AtJ II(a)(,Bi) II (j)(')')(o)

- ~
a
(1r~ + 1211'111'2 + 401r~)ArsII(r)(sa)II(,B)(')'0)
+ ~[-
a
(1r~ + 1011'111'2 + 321r~) + 2(p4 + P5 + 8p6)]A rs II(r)(a,B)II(s')')(0)

+ :2 [- ~ (1r~ + 41rD + 4 P6] ArsII(r)(a,B)II(s)(')'o)


+ :2 [- ~ (11'1 + 411'2)(311'1 + 2811'2) + (P2 + 4P4 + 4P5 + 16P6)]
x ArsII(ra)(,B)II(s')')(o)

+ :2 (~1r~ + 6P3) II( a,B)( ')'0)

- ~
a
(11'1 + 411'2)(11'1 + 81r2)ArsII(rs)(a)IICB')')(0)

+ { - 4~6 A ijII(i)(j) (a)(,B)

I .. kl
+ 40'8 AtJ A II(i)(j)(k)II(I)(a)(,B)

2
13 .. kl
+ 40'8 AtJ A II(i)(k)(a)II(j)(I)(,B)

2
13 .. kl
+ 80'8 AtJ A II(i)(j)(a)II(k)(I)(,B)

+ 2~4 (11'1 + 81r2)A ab ArsII(ar)(s)II(,B)(a)(b)

+ 2~4 (11'1 + 21r2)A ab ArsII(a,B)(r)II(s)(a)(b)



+ "/3 71"1 AabArsII (o:r)(a) II (b)(s)(,B)


4"
a

+ "/3
2a 4 71"1
A ab ArsII
(rs)(a)
II
(b)(o:)(,B)

"/3 AabII
- a 4 71"1 (ao:)(b)(,B)

-
"/3
2a 4 71"1
AabII (o:,B) (a)(b)

"/3 AabII
- 2a 4 71"1 (ab) (o:)(,B)

+ 2~4 (71"1 + 471"2) A ab ArsII(rs)(o:)II(,B)(a)(b)

+ 2~4 (71"1 + 471"2)A ab ArsII(o:)(,Br)II(s)(a)(b)

+ [- ~ 7I"~ + P2 - 12P 3 ] AirII(rO:)(i,B)

- (71"~ + 6p3)A ir II(ri)(O:,B)

+ [~(371"~ + 871"171"2 + 3271"~) - (P2 + 2ps + 16P6 )] Air AjSII(s)(io:)II(j)(r.B)

+ [(71"~ + 271"171"2 + 871"~) - 4p6]A ir AjSII(ir)(j) II(s) (o:,B)

+ ~ (71"1 + 871"2)2 Air AjSII(r)(io:)II(s)(j,B)

+ [~(71"~ + 871"171"2 + 3271"~) - 2(ps + 8 P6 )] Air AiSII(r) (jo:) II (s)(i,B)

+ [271"2(71"1 + 871"2) - 2(p4 + 4p6)]A ir Ai 8


II(r)(ii)II(s)(o:,B)

+ (71"1 + 471"2)(71"1 + 871"2) Air AjsII(s)(io:)II(ir)(,B)


+ [(71"1 + 471"2)(71"1 + 871"2) - (P2 + 8P4 + 6ps + 32p6)]
X Air Aj 8 II(s)(iO:) II(rj) (,B)

+ [(71"1 + 471"2)(71"1 + 871"2) - (P2 + 4P4 + 4ps + 16p6)]


X Air AiSII(s)(ij)II(ro:)(,B)

+ ~ (71"1 + 471"2)2 Air AiBII(ir)(o:)II(js)(,B)


+ [(71"1 + 471"2)2 - 2(ps + 4p6)]A ir AjsII(ir)(j) II(so:) (,B)

+ [~(1I'1 + 411'2)2 - (P2 + 4P4 + 2P5 + 8 P6 )] Air AjSII(rj)(a)II(si)(,B)

- (P1 + 12p3)Air II(i)(ra,B)


- 2(P1 + 6p3)A irII(a)(ir,B) } yay,B
14 .. kl
+ { 8(14 A'J A II(i)(j)(k)(l)

2
13 AijAklArsII II
- 12(16 (i)(k)(r) (j)(l)(s)

2
13
- 8(16 AijAklArsII
(i)(j)(k)
II (l)(r)(s)

13 11'1 AirAjsAabII (ir)(j) II (a)(b)(s)


- 2(12

13 AirAjsII
+ 2(12 11'1 (ir)(j)(s)

2
(1 2A ir Aj s II
+ 211'1 (ir)(js)

+ (12 (~11'~ - P2) Air AjSII(rj)(is)

2
(1 2AirAjsAabII
- 211'1 II
(ir)(a) (js)(b)

+ (1
2(P2 - 1 2) AirAjsAabII (rj)(a) II(si)(b) } .
"211'1

17 COMPARISONS OF POWERS OF A CLASS OF TESTS OF HYPOTHESES ON A NON-LINEAR REGRESSION PARAMETER

In this Section the model (0.1) is considered with a regression function g(j,\theta)
depending on a scalar parameter \theta \in \Theta, where \Theta is an open interval of the real
line. Let us assume that the function g(j,\theta) possesses continuous derivatives
with respect to \theta up to the fourth order inclusive.

Let us write

\Pi_{i_1 \dots i_k}(\theta) = n^{-1}\sum_{j=1}^{n}\prod_{\nu=1}^{k} g_{i_\nu}(j,\theta), \qquad \text{etc.}

It is clear that these notations correspond to the notations for q > 1 introduced
earlier.

Let us consider three functionals of the observations (0.1):

T^{(0)}(\theta) = \frac{1}{4n}\,\sigma^{-2}\,\Lambda(\theta)\left(\frac{d}{d\theta}L(\theta)\right)^{2},   (17.1)

T^{(1)}(\theta) = \sigma^{-2}\big(L(\theta) - L(\hat\theta_n)\big),   (17.2)

T^{(2)}(\theta) = \sigma^{-2}\, I(\hat\theta_n)\, u^2(\theta),   (17.3)

where

u(\theta) = n^{1/2}(\hat\theta_n - \theta).

The Neyman-Pearson functional T^{(1)} and the Wald functional T^{(2)} have already
been considered in the preceding Section. For Gaussian (0, \sigma^2) r.v.-s \varepsilon_j the
functional T^{(0)} is Rao's statistical criterion for testing the simple hypothesis H_0: \theta =
\theta_0, where \theta_0 is the prescribed value of the parameter \theta ([189], Section 6e.2). The
important difference between the functional T^{(0)} and the two other functionals is
that it does not depend upon \hat\theta_n, and therefore it seems to be preferable
from a practical point of view. In fact, using the functional T^{(0)} we economise on
the procedure of calculation of the l.s.e. \hat\theta_n.
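A minimal simulation sketch of the three statistics for a toy model g(j,θ) = exp(θx_j); the model, sample size, grid minimiser, and the identification Λ(θ) = I^{-1}(θ) are illustrative assumptions, not from the book. It checks only the structural facts that T^{(1)} ≥ 0 by definition of the l.s.e. and that all three statistics are finite and nonnegative:

```python
# Illustrative computation of T^(0) (Rao-type), T^(1) (Neyman-Pearson-type) and
# T^(2) (Wald-type) for X_j = g(j, theta0) + eps_j with g(j, theta) = exp(theta*x_j).
import math, random

random.seed(7)
n, theta0, sigma = 200, 0.5, 1.0
x = [j / n for j in range(1, n + 1)]
X = [math.exp(theta0 * xi) + random.gauss(0.0, sigma) for xi in x]

def L(theta):                       # sum of squares
    return sum((Xj - math.exp(theta * xi)) ** 2 for Xj, xi in zip(X, x))

def I(theta):                       # n^{-1} * sum of squared derivatives g'
    return sum((xi * math.exp(theta * xi)) ** 2 for xi in x) / n

# grid search for the l.s.e.; the grid contains theta0, so L(hat) <= L(theta0)
grid = [theta0 + k / 1000.0 for k in range(-500, 501)]
theta_hat = min(grid, key=L)

dL = -2.0 * sum((Xj - math.exp(theta0 * xi)) * xi * math.exp(theta0 * xi)
                for Xj, xi in zip(X, x))
T0 = dL ** 2 / (4.0 * n * sigma ** 2 * I(theta0))   # Rao: uses theta0 only
T1 = (L(theta0) - L(theta_hat)) / sigma ** 2        # Neyman-Pearson
T2 = I(theta_hat) * n * (theta_hat - theta0) ** 2 / sigma ** 2   # Wald

assert T0 >= 0.0 and T1 >= 0.0 and T2 >= 0.0
assert all(math.isfinite(t) for t in (T0, T1, T2))
```

Note that, as in the text, only T^{(0)} avoids computing the estimator θ̂_n.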
No longer assuming that the \varepsilon_j are Gaussian r.v.-s, we shall use the functionals
T^{(m)}, m = 0,1,2, for the construction of the criteria \Psi_n^{(m)} for the solution of the
following problem of hypothesis testing.

Let us fix a number \delta \ne 0 and consider a sequence of criteria \Psi_n^{(m)} for
testing, by the observations X = (X_1, \dots, X_n), the simple hypothesis

H_0: \theta = \theta_0, \qquad \theta_0 \in \Theta,

against the simple alternative

H_\delta: \theta = \theta_0 + \delta n^{-1/2}.

The question is: which of the criteria \Psi_n^{(m)} generated by the statistics (17.1)-
(17.3) are asymptotically locally more powerful, as n \to \infty, than their competitors,
for properly equated errors of the first kind?
In this Section an answer is given to this question under the conditions of
regularity assembled below. For this we shall not assume that the r.v.-s \varepsilon_j
possess a (smooth) density p = P'. In this way the solution of the problem
cannot be reduced to a direct application of the theory developed in the works
[46,48-51,65,66,143], for two reasons.

Firstly, in the model (0.1) the distributions of the X_j differ by shifts, whereas
the investigations cited were conducted for a repeated sample.

Secondly, in the works on statistical criteria mentioned, the conditions of regu-
larity and the expressions for the deficiencies of the criteria calculated under
these conditions (see, for example, [65]) contain a density of the observations and its
first derivative. We cannot, however, use the concept of Fisher information
within the framework of our regularity conditions.

A method of comparison of powers for a class of tests is stated below, and two-
sided modifications of the \Psi_n^{(m)}-tests with the statistics (17.1)-(17.3) are representatives
of this class. The essence of the results consists in the following.

The critical regions of the criteria \Psi_n^{(m)} are chosen such that the errors of the
first kind coincide up to a magnitude o(n^{-1}). It is shown that the powers of these
criteria begin to differ only in terms of order n^{-1}, and that these differences are
determined only by the terms of order n^{-1/2} of the s.a.e. of the generating functionals
T^{(m)}. The choice of the more powerful criterion essentially depends upon the symmetry
of the distribution of the r.v.-s \varepsilon_j. If the cumulant of the third order, together with
the coefficient of skewness of the r.v. \varepsilon_j, is equal to zero, then the answer to
the problem coincides with the answer in the works [49-51], namely: the two-sided
modification of Rao's criterion is the most powerful of the three under consider-
ation. If \gamma_3 \ne 0, then the answer turns out to be considerably more complicated,
and any of the three criteria \Psi_n^{(m)}, m = 0,1,2, can be the most powerful.

The conditions of regularity in this Section repeat the conditions introduced
earlier in Section 10 for q = 1, k = 4, and retain their previous numbering.
Let us write

V'(\theta) = \big(b_1(\theta), b_2(\theta), b_3(\theta)\big), \qquad \text{i.e.,} \quad V(\theta) = n^{-1/2}\sum_{j=1}^{n} Y_j(\theta)\,\varepsilon_j,

where the vectors Y_j(\theta) \in \mathbb{R}^3 have the coordinates g_1(j,\theta), g_2(j,\theta), g_3(j,\theta).

Let us set

Y(\theta) = \sum_{j=1}^{n} Y_j(\theta)\, Y_j'(\theta).

Then the third order matrix

\cdots

is the arithmetic mean of the correlation matrices of the vectors Y_j(\theta)\varepsilon_j.

Let \psi(\lambda) be the c.f. of the r.v. \varepsilon_j. Then

\psi_j(t) = \psi\big(\langle Y_j(\theta), t\rangle\big), \qquad t \in \mathbb{R}^3,

is the c.f. of the random vector Y_j(\theta)\varepsilon_j. Let us assume that for any compact T \subset \Theta
the following conditions are satisfied:
I. \gamma_5 < \infty.

II. For any R > 0 there exist constants c_i = c_i(R,T) < \infty, i = 1,2, such that

(1) \sup_{\theta \in T}\ \sup_{|u| \le R,\ u \in U^{c}(\theta)} n^{-1/2}\, d_r(\theta + u) \le c_1, \qquad r = 1,2,3,4;

(2) \sup_{\theta \in T}\ \sup_{|u_1|, |u_2| \le R,\ u \in U^{c}(\theta)} n^{-1}\,\Delta_4(u_1, u_2)\,|u_1 - u_2|^{-2} \le c_2.

III. \liminf_{n \to \infty}\ \inf_{\theta \in T}\ n^{-1/2} d_r(\theta) > 0, \qquad r = 1,2,3,4.

VIII. There exists an integer h \ge 3 such that amongst any h vectors from the
collections \{Y_j, j = m+1, \dots, m+h\}, 0 \le m \le n-h, n \ge h+1, three vectors
Y_{j_1}, Y_{j_2}, Y_{j_3} can be found such that the matrix

Y_m^{(h)} = \frac13\sum_{i=1}^{3} Y_{j_i}(\theta)\, Y_{j_i}'(\theta)

is uniformly positive definite:

\inf_{0 \le m \le n-h,\ n \ge h+1}\ \inf_{\theta \in T}\ \lambda_{\min}\big(Y_m^{(h)}(\theta)\big) \ge \lambda_0 > 0.

IX. \int_{-\infty}^{\infty} |\psi(\lambda)|^{p}\, d\lambda < \infty for some p \ge 1.

Let us further assume that T \subset \Theta is a fixed compact set containing the point
\theta_0 together with some neighbourhood of it.

The functionals T^{(m)}, m = 0,1,2, admit a s.a.e. analogous to Lemma 34.3 of the
preceding Section.

LEMMA 36.1: Let us assume that conditions I-IV are satisfied, and that for any
p > 0

(17.4)

Then for m = 0,1,2,

T^{(m)} = \sigma^{-2}\left\{\Lambda b_1^2 + n^{-1/2}\sum_{i=1}^{2}\alpha_i^{(m)} A_i + n^{-1}\sum_{i=1}^{6}\beta_i^{(m)} B_i\right\} + \epsilon^{(m)}\, n^{-3/2},   (17.5)

where

(1) the \epsilon^{(m)} are r.v.-s having the following property:

\sup_{\theta \in T} P_\theta^n\big\{|\epsilon^{(m)}| \ge c_3^{(m)}\log^{5/2} n\big\} = o(n^{-3/2});   (17.6)

(2) A_1 = \Lambda^2 b_1^2 b_2, \quad A_2 = \Lambda^3\Pi_{12}\, b_1^3,

B_1 = \Lambda^3 b_1^2 b_2^2, \qquad B_2 = \Lambda^4\Pi_{12}\, b_1^3 b_2,

B_3 = \Lambda^5\Pi_{12}^2\, b_1^4, \qquad B_4 = \Lambda^4\Pi_{22}\, b_1^4,

B_5 = \Lambda^4\Pi_{13}\, b_1^4, \qquad B_6 = \Lambda^3 b_1^3 b_3;   (17.7)

(3) the coefficients \alpha_i^{(m)}, i = 1,2, and \beta_i^{(m)}, i = 1, \dots, 6, characterising the func-
tionals T^{(m)} do not depend upon n and \theta, and are listed in Table 4.9.
Proof: The proof uses the s.a.e. of Lemma 34.1:

(17.8)

where for q = 1

h_0 = \Lambda b_1,

h_1 = \Lambda^2 b_1 b_2 - \tfrac32\,\Lambda^3\Pi_{12}\, b_1^2,

h_2 = \Lambda^3 b_1 b_2^2 - \tfrac92\,\Lambda^4\Pi_{12}\, b_1^2 b_2 + \tfrac32\,\Lambda^3 b_1^2 b_3 + \left(\tfrac92\,\Lambda^5\Pi_{12}^2 - 3\Lambda^4\Pi_{13} - \tfrac12\,\Lambda^4\Pi_{22}\right) b_1^3.   (17.9)

Let us note further that under the conditions of the Lemma the functional T^{(2)}
admits a representation analogous to (16.7)-(16.9):

T^{(2)} = T^{[2]} + \epsilon^{(2)} n^{-3/2},   (17.10)

T^{[2]} = \sigma^{-2}\big\{ I u^2 + 2\Pi_{12}\, u^3 n^{-1/2} + (\Pi_{22} + \Pi_{13})\, u^4 n^{-1}\big\},   (17.11)

where \epsilon^{(2)} is a r.v. having the property (16.9).

We obtain the a.e. (17.5) for T^{(2)} by substituting (17.8), (17.9) into (17.11).
We obtain the a.e. (17.5) for T^{(0)} immediately from (17.1), taking into account
that

\frac{d}{d\tau} L(\tau)\Big|_{\tau=0} = -2 n^{1/2}\, b_1(\theta), \qquad P_\theta^n\text{-a.s.}

A corresponding result for the functional T^{(1)} can be obtained by expanding it in a
Taylor series and substituting into this expansion the a.e. (17.8) (see Lemma 34.3,
Section 16).

Without specifying the dependence upon m, let us rewrite the principal part
of the s.a.e. (17.5) in the form

\sigma^2 T' = \Lambda b_1^2\big(1 + A n^{-1/2} + B n^{-1}\big),   (17.12)

where

A = \alpha_1\Lambda b_2 + \alpha_2\Lambda^2\Pi_{12}\, b_1,   (17.13)

B = \beta_1\Lambda^2 b_2^2 + \beta_2\Lambda^3\Pi_{12}\, b_1 b_2 + \beta_3\Lambda^4\Pi_{12}^2\, b_1^2
+ \beta_4\Lambda^3\Pi_{22}\, b_1^2 + \beta_5\Lambda^3\Pi_{13}\, b_1^2 + \beta_6\Lambda^2 b_1 b_3.   (17.14)

LEMMA 36.2: Let the conditions (17.4) and I-IV be satisfied. Then for T = T^{(m)},
m = 0,1,2,

T = W^2 + D n^{-3/2},   (17.15)

and

(1) W = \sigma^{-1}\Lambda^{1/2}\, b_1\left\{1 + \tfrac12 A n^{-1/2} + \left(\tfrac12 B - \tfrac18 A^2\right) n^{-1}\right\};   (17.16)

(2) there exists a constant c_4 = c_4(T) < \infty such that

\sup_{\theta \in T} P_\theta^n\big\{|D| \ge c_4\log^{5/2} n\big\} = o(n^{-3/2}).   (17.17)

Proof: Let us consider the r.v.

\widetilde W = \sigma^{-1}\Lambda^{1/2}\, b_1\big(1 + A n^{-1/2} + B n^{-1}\big)^{1/2}.

By Lemma 36.1

\sup_{\theta \in T} P_\theta^n\big\{|T - \widetilde W^2| \ge c_3\,\log^{5/2} n\cdot n^{-3/2}\big\} = o(n^{-3/2}).

Let the event

X_1 = \bigcap_{r=1}^{3}\big\{|b_r(\theta)| \le c_5^{(r)}\log^{1/2} n\big\}

be realised, where

c_5^{(r)} = c_5^{(r)}(T) < \infty

are certain constants. Then by the binomial expansion

\widetilde W = W + E n^{-3/2},

where E is the r.v. for which

where

c_6 = c_6(T) < \infty

is some constant.

Consequently, if the event X_1 is realised then the representation (17.15) holds;
however, instead of (17.17) it is possible only to assert that

\sup_{\theta \in T} P_\theta^n\big\{|D| \ge c_4\log^{5/2} n;\ X_1\big\} = o(n^{-3/2}).

For the completion of the proof we appeal to Theorem A.5.


The distributions of the functionals T^{(m)} converge as n \to \infty to the \chi_1^2 distrib-
ution (cf. Section 16 of this book). The meaning of the representation of T^{(m)} in
the form (17.15) is that we obtain the possibility of passing from criteria exploit-
ing the \chi^2-approximation of the distribution of the functionals T^{(m)}, with critical
regions \{T^{(m)} > z^2\}, to the modified criteria \Psi_n^{(m)} having a series of technical
conveniences (see below), with critical regions

\cdots

and asymptotically normal W.

Let us introduce a series of quantities in terms of which we can formulate
the answer to the problem of the comparison of the powers of the tests under
consideration.
consideration.
Let us denote

\mu_{i_1 \dots i_k}(\theta) = \sigma^{i_1+\dots+i_k}\,\Lambda^{\frac12(i_1+2i_2+\dots+k i_k)}\; n^{-1}\sum_{j=1}^{n}\prod_{\nu=1}^{k} g_{\nu}^{\,i_\nu}(j,\theta),   (17.18)

where \gamma_s is the cumulant of the s-th order of the r.v. \varepsilon_j; the key role in the
presentation later on is played by the quantities

l_k = \gamma_k\,\sigma^{-k}\,\Lambda^{k/2}\,\Pi_{\underbrace{1\cdots 1}_{k}},   (17.19)

and the products of the quantities (17.18), (17.19) of the form

l_{02} = \sigma^2\Lambda^2\,\Pi_{22}, \qquad l_{101} = \sigma^2\Lambda^2\,\Pi_{13}, \qquad l_{11} = \sigma\Lambda^{3/2}\,\Pi_{12},

l_{21} = \gamma_3\,\sigma^{-2}\Lambda^2\,\Pi_{112}, \qquad l_3 l_{11} = \gamma_3\,\sigma^{-2}\Lambda^3\,\Pi_{111}\Pi_{12},

l_3^2 = \gamma_3^2\,\sigma^{-6}\Lambda^3\,\Pi_{111}^2, \qquad l_4 = \gamma_4\,\sigma^{-4}\Lambda^2\,\Pi_{1111}.

Let us set

B_1 = l_{02} - l_{11}^2 \ge 0,   (17.20)

B_2 = l_{21} - l_3 l_{11},   (17.21)

B_3(\alpha, \beta) = \alpha\, l_4 + \beta\, l_3^2, \qquad \alpha, \beta \in \mathbb{R}^1.   (17.22)

For the comparison of the powers of the modified criteria \Psi_n^{(m)} it is necessary to
equate their errors of the first kind. Let us consider the critical regions

X(z_n^+, z_n^-) = \{W(\theta_0, X) > z_n^+\} \cup \{W(\theta_0, X) < z_n^-\},   (17.23)

where

(17.24)

in which a = u_{1-\alpha/2} is the quantile of the standard normal distribution, and b, c
depend upon the number m and are selected subject to the condition

(17.25)

For obtaining the a.e. of the probability of the error of the first kind and of the
power of the modified criterion it is necessary to have available the a.e. of the
statistic

W = W(\theta_0, X)

under the null hypothesis \theta_0 and under the alternative

\theta_\delta = \theta_0 + \delta n^{-1/2}.

We obtain the expansion of the r.v. W for \theta_\delta by writing the expression (17.16) in
orders of n^{-1/2}. Let us denote

From (17.16) it follows that for an arbitrary alternative \theta \in T the critical
region (17.23) is given by the statistic

W = \sigma^{-1}\Lambda^{1/2}(\theta_0)\, b_1(\theta_0; X)

+ \tfrac12\,\sigma^{-1}\big\{\alpha_1\Lambda^{3/2}(\theta_0)\, b_1(\theta_0; X)\, b_2(\theta_0; X) + \alpha_2\Lambda^{5/2}(\theta_0)\,\Pi_{12}(\theta_0)\, b_1^2(\theta_0; X)\big\}\, n^{-1/2}

+ \big\{\cdots + \tfrac12\,\beta_4\Lambda^{7/2}(\theta_0)\,\Pi_{22}(\theta_0)\, b_1^3(\theta_0; X)

+ \tfrac12\,\beta_5\Lambda^{7/2}(\theta_0)\,\Pi_{13}(\theta_0)\, b_1^3(\theta_0; X) + \cdots\big\}\, n^{-1}.   (17.26)

Let us call the reader's attention to the fact that criteria using the statistic W,
in spite of their specific awkwardness, are simpler than the generating criteria (except
Rao's criterion) in the sense that the r.v. W depends on \theta_0 and the observations
X, and does not depend upon the estimator \hat\theta_n. This last property brings them
together with Rao's criterion.

The expansion of W (mod P_{\theta_\delta}^n) is not difficult to obtain from (17.26) in the
following way. Clearly

\cdots \pmod{P_{\theta_\delta}^n}.   (17.27)

On the other hand, it follows easily from condition II that

\cdots = \delta\,\Pi_{1r}(\theta_0) + \tfrac12\,\delta^2\,\Pi_{2r}(\theta_0)\, n^{-1/2} + \tfrac16\,\delta^3\,\Pi_{3r}(\theta_0)\, n^{-1} + O(n^{-3/2}\delta^4).   (17.28)

Let us denote

\lambda = \sigma^{-1} I^{1/2}(\theta_0)\,\delta.

Substituting (17.27) and (17.28) in (17.26), and collecting similar terms, we obtain
P_{\theta_\delta}^n-a.s.

(17.29)

where

\widetilde W = \sum_{\nu=0}^{2} W_\nu^{*}\, n^{-\nu/2}

and, (1):

W_0^{*} = \lambda + \sigma^{-1}\Lambda^{1/2}\, b_1,   (17.30)

W_1^{*} = \sigma^{-1}\big(\tfrac12\alpha_1 + \alpha_2\big)\Lambda^{1/2}\, l_{11}\, b_1\lambda + \tfrac12\,\alpha_1\Lambda\, b_2\lambda
+ \tfrac12(\alpha_1 + \alpha_2 + 1)\, l_{11}\,\lambda^2 + \tfrac{1}{2\sigma}\,\alpha_1\Lambda^{3/2}\, b_1 b_2 + \tfrac{1}{2\sigma^2}\,\alpha_2\Lambda\, l_{11}\, b_1^2,   (17.31)

Wi = 0'-2 A {~,84102 + (~,86 + ~,85) llOl

+ {~(a1 - ai - a1 ( 2) +,81 + ~,82 } Alub2A2


+ ~ 0',86A3/2b3A2
2

+{ (~a1 + ~ ,84)102 + ~ (,86 +,85 + ~)1101


1 1 1 2 1 1 2
+ ( 4 a1 + '2 a2 - 8" a 1 - 4 a1 a2 - 8" a2

+ 0'-2 (- ~ a1 a 2 + ~.82 )A2 Illb~b2


+ ! 0'-1.86 A 5/2b~b3
2

+ ~ 0'-1 A3/2 {.84I02 + .85hOl + ( - ~ a~ + .83) I;l} j (17.32)

(2) W_2' is a r.v. having the following property: there exists a constant c_7 < \infty
such that

(17.33)

If we omit the terms in (17.30)-(17.32) containing \lambda^r, r = 1,2,3, then we
obtain the expansion of the r.v. W under the null hypothesis. Let us underline
that all the functions of \theta in the formulae (17.30)-(17.32) are calculated at the
point \theta = \theta_0.

From further arguments it follows that

\sup_{u \ge 0}\big|P_{\theta_\delta}^n\{W > u\} - P_{\theta_\delta}^n\{\widetilde W > u\}\big| = O(n^{-3/2}\log^{3/2} n),

and, analogously,

\sup_{u \ge 0}\big|P_{\theta_\delta}^n\{W < -u\} - P_{\theta_\delta}^n\{\widetilde W < -u\}\big| = O(n^{-3/2}\log^{3/2} n).

Let us denote by

k_j = k_{j0} + k_{j1}\, n^{-1/2} + k_{j2}\, n^{-1} + o(n^{-1}), \qquad j = 1,2,3,4,   (17.34)

the a.e. of the cumulants of the j-th order of the r.v. \widetilde W in the measure P_{\theta_\delta}^n. Then,
using the s.a.e. (17.29)-(17.32) of the statistic \widetilde W, one can formally calculate the
quantities required later on,

k_{j\nu} = \sum_{r=0}^{m_\nu} k_{j\nu}(r)\,\lambda^{r}, \qquad \nu = 0,1,2,   (17.35)

which are presented below:

klO(l) = 1,
1
ku(O) = '2 (a1 + (2)lu ,

1
k12(2) = '2 (a1 + a2 + 1)lu,
ku (1) = (-18 a12 + 1 3
'2.81 + ) 102 + '23 (.86
'2.84 + .85)1101

+ (-1-4 a12 - -43 al a2 - -83 a22 + /31 + -3


2
/32 3
+ -2)
/3s 2
111
'
k20(0) = 1,

k21(1) = 2(a1 + a2)ln ,


k 22 (0) = (/31 + 3(34)102 + 3(/36 + (35)1101

+ {- ~ (al + a2)2 + 2/31 + 3(/32 + /3s) } lrl


+ a1121 + a21s I n ,

k22(2) = (1'2 a1 + 4
1 a12 + '2/34
3 ) 102 + 3(/36 + (35)1101

{III
+ '2 a1 + a2 + '2 a1 a 2 + 4 a22 + 3(/31 + /32 + /3s) } In,
2

kS1(0) = 3(a1 + a2)111 + Is,

kS2(1) = 3 (~ a~ + 3(34) 102 + 9(/36 + /35 )1101

+ 3 {5'2 a1a2 + 4
3 a12 + 5 2
4 a2 + 3(/31 + /32 } 2
+ /3s) 111

+~a1121 + (~al +a2) Isln,


k 42 (0) = 12 { (~a~ + (34) 102 + (/36 + (35)1101

'2 a 1 + 43 a22 + '23 al a2 + /31 + /32 + /3s )2


+ C2 III

+ ~ a1 121 + (~al + a2) Isl11} + 14


The remaining k_{j\nu}(r) are equal to zero.
Let us denote

P_\delta^{+} = P_{\theta_\delta}^n\{W > z_n^+\}, \qquad P_\delta^{-} = P_{\theta_\delta}^n\{W < z_n^-\}.

Clearly the power of the modified criterion is equal to

P_\delta = P_\delta^{+} + P_\delta^{-}.   (17.36)

Let us remark in passing that for \delta = 0, (17.36) turns into the equality

P_0 = P_0^{+} + P_0^{-},   (17.37)

where P_0 is the error of the first kind of the modified criterion.
THEOREM 36: Let the conditions I-IV, VIII, and IX be satisfied. Then

P_\delta = \sum_{\nu=0}^{2}\pi_\nu\, n^{-\nu/2} + o(n^{-1}),   (17.38)

where

(1) \pi_0 = I(a;\, 1, \lambda^2) = \int_{a^2}^{\infty} f(x;\, 1, \lambda^2)\, dx,   (17.39)

and where f(x; 1, \lambda^2) is the density of the non-central \chi^2 distribution with one
degree of freedom and non-centrality parameter \lambda^2;

(2) \pi_\nu = \varphi(a - \lambda)\sum_{j=1}^{r_\nu}\lambda^{j} Q_{\nu j}(a) + \varphi(a + \lambda)\sum_{j=1}^{r_\nu}(-\lambda)^{j} Q_{\nu j}(a),   (17.40)

where \nu = 1,2, r_1 = 2, r_2 = 5, and where the Q_{\nu j}(a) are polynomials in a with coefficients
depending upon the cumulants k_j, j = 1,2,3,4, of the r.v. \widetilde W.
Proof: Let us outline the proof of the Theorem. We use the Edgeworth expan-
sion of the d.f. of the vector V(\theta) \in \mathbb{R}^3 and the a.e. of the statistic W, understood
as a function of the coordinates of the vector V(\theta), to write the desired
probabilities P_\delta^{\pm} in the form of integrals, with subsequent transformation and
estimation of these integrals. Such an approach was fruitfully used in the preceding
part of the book.

The calculation of P_\delta^{\pm} can be carried out by at least two methods. The first
method consists of calculating the integrals arising in the process of proving the
Theorem by the method indicated above. The second method is the application
of the \delta-method, described in Section 14, to the asymptotically normal statistic
W.

Both methods, unfortunately, are equally laborious, although this is an intrinsic
peculiarity of the problem under consideration. In this Section the calculations
are carried out by the \delta-method. Following [49-51] we find

P_\delta^{+} = P_{\theta_\delta}^n\{W - \lambda > z_n^+ - \lambda\}

= 1 - \Phi(x) + \varphi(x)\sum_{\nu=1}^{2} V_\nu^{+}(x)\, n^{-\nu/2} + o(n^{-1}),   (17.41)

where

\Phi(x) = \int_{-\infty}^{x}\varphi(t)\, dt, \qquad x = a - \lambda,

and

V_\nu^{+} = \sum_{r=1}^{3\nu}\beta_{r\nu}^{+}\, H_{r-1}(x), \qquad \nu = 1,2,   (17.42)

where the H_r(x) are the Chebyshev-Hermite polynomials of order r. The coeffi-
cients \beta_{r\nu}^{+} in (17.42) have the form

\beta_{11}^{+} = -b^{+} + k_{11},

\beta_{21}^{+} = \tfrac12\, k_{21},

\beta_{31}^{+} = \tfrac16\, k_{31},

\beta_{12}^{+} = -c^{+} + k_{12},

\beta_{22}^{+} = \tfrac12\big[k_{22} + k_{11}(k_{11} - 2b^{+}) + (b^{+})^2\big],

\beta_{32}^{+} = \tfrac16\big[k_{32} + 3k_{21}(k_{11} - b^{+})\big],

\beta_{42}^{+} = \tfrac1{24}\big[k_{42} + 3k_{21}^2 + 4k_{31}(k_{11} - b^{+})\big],

\beta_{52}^{+} = \tfrac1{12}\, k_{21} k_{31},

\beta_{62}^{+} = \tfrac1{72}\, k_{31}^2.   (17.43)

It is clear that if we set \delta = 0 in (17.41) (consequently \lambda = 0 also) then

P_{\theta_0}^n\{W > z_n^+\} = P_0^{+}.

In (17.41) let us choose b^{+} and c^{+} from the condition V_1^{+} = V_2^{+} = 0 for \lambda = 0.
Then

(17.44)

Since

k_{12}(0) = k_{21}(0) = k_{32}(0) = 0,

then from (17.42) and (17.43) we obtain

(17.45)

(17.46)

i.e., b^{+} is an even and c^{+} an odd function of a. Substituting (17.45) and (17.46)
into (17.43), we bring V_\nu^{+}(x) into the form

V_\nu^{+}(x) = \sum_{j=1}^{r_\nu}\lambda^{j} Q_{\nu j}(a), \qquad x = a - \lambda, \quad \nu = 1,2.   (17.47)

The calculation of P_\delta^{-} is realised completely analogously and leads to the re-
lations

P_0^{-} = P_{\theta_0}^n\{W < z_n^-\} = \frac{\alpha}{2} + o(n^{-1}),   (17.48)

P_\delta^{-} = P_{\theta_\delta}^n\{W - \lambda < z_n^- - \lambda\}

= \Phi(x) + \varphi(x)\sum_{\nu=1}^{2} V_\nu^{-}(x)\, n^{-\nu/2} + o(n^{-1}),   (17.49)

where

x = -a - \lambda

and

V_\nu^{-}(x) = \sum_{j=1}^{r_\nu}(-\lambda)^{j} Q_{\nu j}(a), \qquad \nu = 1,2.   (17.50)

The relations (17.44) and (17.48) show that the errors of the first kind of all
the modified criteria are equal:

P_0^{+} + P_0^{-} = \alpha + o(n^{-1}).   (17.51)

The relation (17.40) follows from (17.47) and (17.50). Finally,

\pi_0 = \int_{a-\lambda}^{\infty}\varphi(x)\, dx + \int_{a+\lambda}^{\infty}\varphi(x)\, dx.   (17.52)

For the completion of the proof of Theorem 36 it remains to write out the poly-
nomials Q_{\nu j}(a):

Q_{11}(a) = \left\{\tfrac12\, k_{21}(1) - \tfrac13\, k_{31}(0)\right\} a,

Q_{12}(a) = k_{11}(2) - \tfrac12\, k_{21}(1) + \tfrac13\, k_{31}(0),

Q21(a) = {-~k42(0)+ 158k~1(0)+ ~k32(1)- ~k31(0)k21(1)}a2

+{ + (3 k31 (0)k21 (1) - 21 k22 (0)


k12 (1)I

5 k 2 (O) - (3I
- 36 I
k32(1) + S}
k42 (0) ,
31

Q22(a) = 2 (1) - (31 k31 (0)k21 (1)


{IS k21 + 18 2 (0) } a3
1 k31

+{~k22(2)- ~kll(2)k31(0)- ~k~1(1)


5 1 2
- "31 k32 (1) + (3k21 1 }
(I)k31 (0)- "3 k31 (0) + Sk42(0) a,

Q23(a) = {I2 k21 (1)kll (2) - "31 k31 (O)kll (2) - S 2 (1 )


3 k21

5 k21 (l)k31 (0) - 9"1 k31


+ 12 2 (0) } (a 2 - 1)

+ { k12(3) - 2
1 k22(2) + (3I
k32(1)I
- 24}
k42 (0) ,

Q24(a) = {~ k~1 (2) - k21 (l)kll (2) + ~ k~1 (1)

+ 21 kll (2)k31 (0) - "31 k21 (l)k31 (0) + 5k31


72 2(0)
} a,

Q25(a) = - 21 kll2 (2) + 21 k21 (l)kll (2) - 1 2


S k21 (1)

1 1 1 2
- (3 k31 (O)kll (2) + 12 k21 (l)k31 (0) - 72 k31 (0).
Let us consider the problem of the comparison of the powers of the criteria \Psi_n^{(m)}
based upon the statistics W. In the preceding part of the Section it was shown, in
particular, that under the appropriate conditions of regularity these statistics have
identical limiting Gaussian (0,1) distributions under the null hypothesis and identical
accompanying Gaussian distributions (\lambda, 1) for a sequence of close alternatives.
(We can speak of the limiting law (\lambda_0, 1) only in the event that there exists

\lim_{n \to \infty} I(\theta_0) = I_0.

Then

\lambda_0 = \sigma^{-1} I_0^{1/2}\,\delta.)

The quantity

e_1 = \pi_0 = I(a;\, 1, \lambda^2)

gives an idea of the local behaviour of the power curves of the modified criteria in
a neighbourhood of the point \theta_0. From (17.39) it follows that e_1 does not depend
upon m. In particular, if there exists

\lim_{n \to \infty} e_1 = I(a;\, 1, \lambda_0^2),

then the preceding assertion means that all the criteria considered have one and the
same Pitman efficiency, i.e., they are asymptotically equivalent.
In order to establish the distinguishability of the criteria it is necessary to
investigate their equivalence at a higher order. It is known [179,34] that for a
broad class of one-sided criteria the efficiency of the first order implies the efficiency
of the second order. Below we come up against a phenomenon, similar in form
(the values π_1 in (17.38) do not depend upon m), which, however, has a quite
different nature, and is associated essentially with the property of the modified
criteria under consideration of being two-sided. A corresponding fact does not
hold for the original one-sided variants of the criteria under consideration.
And so our goal is the calculation and comparison of the quantities π_2 = π_2^{(m)}.
THEOREM 37: Let the conditions of Theorem 36 be satisfied. Then the polynomials
Q_{νj} and the quantities b^+, c^+ defining the powers P_δ and the errors of the first kind
P_0 of the modified criteria Ψ_n^{(m)} can be represented in the form:

Q_{11} = −(1/3) I_3 a,

Q_{12} = (1/6) I_3 + (1/2) I_{11},    (17.53)

Q_{21} = { −(1/8) α_1^2 B_1 − (1/2) α_1 B_2 + B_3(−1/8, −5/18) } a^2 + B_3(1/8, −5/36),

Q_{23} = { B_3(0, −1/9) − (1/3) I_3 I_{11} } (a^2 − 1) + B_3(−1/24, 0) + (1/2) I_{101},

Q_{24} = { B_3(0, 5/72) + (3/8) I_{11}^2 + (1/3) I_3 I_{11} } a,

Q_{25} = B_3(0, −1/72) − (1/8) I_{11}^2 − (1/12) I_3 I_{11},    (17.54)

b^+ = { (1/2)(α_1 + α_2) I_{11} + (1/6) I_3 } a^2 − (1/6) I_3,

c^+ = { (1/16) α_1(α_1 − 1) B_1 + (1/4) α_1 B_2 + B_3(1/24, −1/18)
    + (1/2)(α_1 + α_2) [ (1/4)(α_2 + 4) I_{02} + (1/6)(α_2 + 5) I_{101}
    + (1/4)(α_1^2 − α_2 − 10) I_{11} + (1/3) I_3 I_{11} ] } a^3
    + { −(1/8) α_1(α_1 − 2) B_1 − (1/4) α_1 B_2 + B_3(−1/8, −5/36)
    − (1/2)(α_1 + α_2) I_3 I_{11} } a,    (17.55)

where α_i = α_i^{(m)}, i = 1, 2, are the coefficients from (17.5).


Proof: We obtain the expressions (17.53)-(17.55) with the help of the formulae
(17.35), the result of Theorem 36, and the relations (17.45), (17.46).
Let us also take note that the expression for π_1 does not depend upon the form
of the criterion, and that in the expressions for π_2 only the polynomials Q_{21} and
Q_{22} depend upon the criterion.
Let us return to the direct calculation of the difference

Δ_rs = π_2^{(r)} − π_2^{(s)}

for the r-th and s-th criteria, and determine the condition under which the r-th criterion
is more powerful than the other two.
Expanding φ(a ± λ) by Taylor's formula we obtain from (17.40), (17.53), (17.54)

Δ_rs = (1/2) σ^{-2} a φ(a) I(θ_0) δ^2 D_rs(θ_0) + o(δ^4),  δ → 0,    (17.56)

where

D_rs(θ_0) = d_{1,rs} B_1 + d_{2,rs} B_2,    (17.57)

d_{1,rs} = (α_1^{(s)} − α_1^{(r)}) { (α_1^{(s)} + α_1^{(r)}) a^2 − 2 },    (17.58)

d_{2,rs} = 2(2a^2 − 1)(α_1^{(s)} − α_1^{(r)}).    (17.59)

And so the quantity Δ_rs, to within o(δ^4), is a linear combination of the quantities
B_1 and B_2.
Let d_{2,rs} > 0 (this condition is satisfied for a^2 > 1/2 and the proper enumeration
of the criteria, for which α_1^{(s)} > α_1^{(r)}). Then the inequality D_rs ≥ 0 is equivalent
to the inequality

B_2/B_1 > −d_{1,rs}/d_{2,rs} = −[(α_1^{(s)} + α_1^{(r)}) a^2 − 2] / [2(2a^2 − 1)].    (17.60)

Hence, knowing the coefficients α_1^{(m)} of the compared criteria, we find in the half-
plane OB_1B_2 the regions of preference of the criteria Ψ_n^{(m)}, m = 0, 1, 2 (see
Figure 3.1). A more subtle interpretation of the results obtained is presented in
the final section of Chapter 4.
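The boundary slopes in (17.60) are easy to tabulate. In the sketch below the values α_1 = 0, 1, 2 for the Rao, Neyman-Pearson and Wald criteria are those of Figure 3.1; the particular value of a is illustrative:

```python
def threshold(a, a1_s, a1_r):
    # boundary value of B2/B1 in (17.60): -d_{1,rs}/d_{2,rs}
    d1 = (a1_s - a1_r) * ((a1_s + a1_r) * a ** 2 - 2.0)
    d2 = 2.0 * (2.0 * a ** 2 - 1.0) * (a1_s - a1_r)
    return -d1 / d2

a = 2.0  # a^2 > 2, i.e. a two-sided size below 0.1576
# pairwise boundaries: NP/Rao, Wald/Rao, Wald/NP (alpha_1 = 1,0 / 2,0 / 2,1)
t10 = threshold(a, 1, 0)
t20 = threshold(a, 2, 0)
t21 = threshold(a, 2, 1)
# for a^2 > 2 all three boundary slopes are negative and strictly ordered
assert t21 < t20 < t10 < 0.0
```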
The method of comparison of the criteria Ψ_n^{(m)}, m = 0, 1, 2, as n → ∞ admits
a generalisation to a broad class of two-sided criteria, the critical regions of
which are assigned by statistics of the form (3.4) and different sets of coefficients
{α_1, α_2, β_1, …, β_6}. In particular, the criteria with generating statistics T repre-
sentable in the form (17.5) belong to this class. Let us denote by K(α) the class
of two-sided criteria Ψ_n of size α_n = α + o(n^{-1}), the critical regions of which are
given by the statistics (17.26).
DEFINITION: We shall say that the criteria Ψ_n^{(r)}, Ψ_n^{(s)} ∈ K(α) are asymptotically
equivalent if

lim_{δ→0} lim_{n→∞} (n/δ^2) (P_δ^{(r)} − P_δ^{(s)}) = 0.    (17.61)

Since under the conditions of Theorem 36

P_δ^{(r)} − P_δ^{(s)} = Δ_rs n^{-1} + o(n^{-1}),    (17.62)

the following assertion follows from (17.56)-(17.59).


THEOREM 38: If the regularity conditions I-IV, VIII, and IX are satisfied, then the
criteria Ψ_n^{(r)}, Ψ_n^{(s)} ∈ K(α) are asymptotically equivalent if and only if α_1^{(r)} = α_1^{(s)}.

Since the quantities b, c and the polynomials Q_{νj} defining the errors of the
first kind P_0 and the powers P_δ of the criteria Ψ_n ∈ K(α) depend only upon α_1 and
α_2, those criteria in K(α) are of special interest for which β_1 = … = β_6 = 0
in the expression (17.26).
Figure 3.1: Regions of preference of the criteria Ψ_n^{(m)} in the half-plane OB_1B_2: Rao criterion (α_1 = 0), Neyman-Pearson criterion (α_1 = 1), Wald criterion (α_1 = 2). (a^2 > 2, i.e. α < 0.1576.)

Let us assume that the error of observation ε_i has the property γ_3 = 0. Then
the difference of the powers of the criteria Ψ_n^{(r)}, Ψ_n^{(s)} ∈ K(α) is determined by the
quantity d_{1,rs}. Let the quantity α_1^{(s)} = α_1 be given. Let us indicate the
criterion Ψ_n^{(r)} which has the maximal power in the class K(α). For this let us
find the point of maximum of the function

d_{1,rs}(x) = (α_1 − x){(α_1 + x) a^2 − 2}.    (17.63)

Clearly the maximum of (17.63) is attained for x = a^{-2}, and

d_{1,rs}(a^{-2}) = a^2 (α_1 − a^{-2})^2.    (17.64)

And so the maximal power in the class K(α) is possessed by the criteria for which
α_1 = a^{-2} in the representation of statistics (17.26). The simplest criterion of such
a form is obtained for α_2 = β_1 = … = β_6 = 0 and is given by the statistic

(17.65)
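The maximisation of (17.63) is elementary (the function is concave in x). A quick numerical confirmation; the particular values of a and α_1 are illustrative:

```python
# d_{1,rs}(x) = (alpha1 - x)((alpha1 + x) a^2 - 2) is maximised at x = a^{-2},
# with maximal value a^2 (alpha1 - a^{-2})^2 (expand and complete the square).

def d1(x, alpha1, a):
    return (alpha1 - x) * ((alpha1 + x) * a * a - 2.0)

a, alpha1 = 1.7, 2.0
xs = [i / 1000.0 for i in range(-2000, 4001)]
x_best = max(xs, key=lambda x: d1(x, alpha1, a))
assert abs(x_best - a ** -2) < 1e-2
assert abs(d1(a ** -2, alpha1, a) - a * a * (alpha1 - a ** -2) ** 2) < 1e-12
```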

DEFINITION: We shall say that a criterion Ψ_n ∈ K(α) is u-representable if it has
a generating statistic T which admits a s.a.e. of the form

T = T^I + t n^{-3/2},

where the r.v. t satisfies (16.9), and T^I has the form

T^I = σ^{-2} { I u^2 + c_1 Π_{12} u^3 n^{-1/2} + (c_2 Π_{22} + c_3 Π_{13}) u^4 n^{-1} }.    (17.66)

Formula (17.66) evidently generalises (17.11). Formally, criteria with t = 0 and
T = T^I are, along with others, u-representable. The coefficients {α_1, α_2, β_1,
…, β_6} and c_k, k = 1, 2, 3, of the u-representable criteria satisfy the relations

β_1 = 1,  β_2 = 3,
β_4 = (9/4)(5 − 2c_1),  β_5 = c_2 − 1.    (17.67)

From (17.67) and the preceding Theorem there follows:


COROLLARY 38.1: Under the conditions of Theorem 38, u-representable criteria
are asymptotically equivalent.
In the case of Gaussian errors ε_j (γ_3 = 0 and, consequently, B_2 = 0) the results
of this Section are consistent with the results of the works [49-51] on the greater
power of the criterion Ψ_n^{(0)} with respect to the Neyman-Pearson and Wald criteria,
since for these criteria d_{1,rs} > 0 if only a > √2.
However, in the general case (γ_3 ≠ 0) it does not follow from (17.56)-(17.59)
that Rao's criterion is more powerful than the criteria of Neyman-Pearson and
Wald.
Chapter 4

Geometric Properties of Asymptotic Expansions

The linear theory of estimation by the method of least squares uses the language
of algebra and plane geometry. In the non-linear theory planes yield place to
surfaces and inference acquires a local character. Therefore the natural geometrical
language in non-linear regression analysis is the language of differential geometry
and tensor calculus. Nowadays an intensive geometric reinterpretation of the basic
concepts of mathematical statistics is under way. One of the goals pursued in this
consists in the move from geometric invariants of statistical objects, allotted a
geometric structure, to invariant statistical inference.
In this Chapter we consider a series of questions about the differential geometry
of non-linear regression models, and we suggest a geometric interpretation of the
results about a.e.-s in Chapter 3. In the following the tensor sum notation is
extended also to summation over indices from 1 to n, etc.

18 CERTAIN ASPECTS OF THE DIFFERENTIAL GEOMETRY OF MODELS OF NON-LINEAR REGRESSION

18.1 EMBEDDED RIEMANNIAN MANIFOLDS AND STATISTICAL CONNECTEDNESS

Let M be the basic model (0.1) of observations. The model M is embedded in the
'free model' S

X = g + ε,    (18.1)

where

g = (g(1), …, g(n)),

and g takes any value. Let us denote

s(X, g) = −(1/(2σ^2)) Σ_{a=1}^n [X_a − g(a)]^2.    (18.2)


A. V. Ivanov, Asymptotic Theory of Nonlinear Regression


Springer Science+Business Media Dordrecht 1997

For Gaussian ε_j, s(X, g) is, clearly, the logarithm of the probability density of
the vector X. Otherwise (18.2) is the initial formula of the geometric theory.
We shall consider S as a parametric family of functions S = {s(X, g)} which
forms an n-dimensional manifold with coordinate system g. The model M corre-
sponds to the family of functions

M = {m(X, θ) = s(X, g(θ)),  g(θ) = (g(1, θ), …, g(n, θ)),  θ ∈ Θ^c}

and it occupies a q-dimensional part of S defined in ℝ^n by a curve for q = 1, and
for q > 1 by a surface

S^q = {g ∈ ℝ^n : g(j) = g(j, θ), θ ∈ Θ}.    (18.3)

In this way we consider the model M as a q-dimensional statistical manifold em-
bedded in S.
Let us set

∂_a = ∂s(X, g)/∂g(a) = σ^{-2}(X_a − g(a)),  a = 1, …, n.    (18.4)

The space of r.v.-s T_s(S) with basis {∂_a}, a = 1, …, n, is tangential to S at
the point s.
Let us give the metric τ_ab on S by setting

τ_ab = E ∂_a ∂_b = σ^{-2} δ_ab    (18.5)

for each T_s(S), the expectation being taken with respect to P_g^n, the shift by the
vector g of the measure P^n.


Let T_m(M) be the tangent space of M at the point m ∈ M. The space T_m(M)
is spanned by the vectors

∂_i = ∂m(X, θ)/∂θ^i = g_i(a, θ) ∂_a,  i = 1, …, q.    (18.6)

According to (18.4)-(18.6) the metric tensor τ_ij induced on M has the form

τ_ij = E ∂_i ∂_j = σ^{-2} n Π_{(i)(j)}(θ);    (18.7)

the tensor associated with τ_ij is

τ^{ij} = σ^2 n^{-1} Λ^{ij}(θ).    (18.8)

The metric τ_ab of the enveloping manifold S does not depend upon s ∈ S, and the
induced metric τ_ij of the embedded manifold M depends on the local coordinates
θ of the point m ∈ M.
Let us denote

∂_ij = ∂^2 m(X, θ)/∂θ^i ∂θ^j,

where

∂/∂θ^j ( ∂s(X, g)/∂g(a) )|_{g=g(θ)} = −σ^{-2} g_j(a, θ),  P_θ^n-a.c.,

i.e., P_θ^n-a.c.

∂_ij = σ^{-2} n^{1/2} b_ij(θ) − σ^{-2} n Π_{(i)(j)}(θ)
     = σ^{-2} n^{1/2} b_ij(θ) − τ_ij(θ).    (18.9)

Using the formulae (18.6)-(18.9) let us introduce the Christoffel symbols of the
first and second kind (the coefficients of the affine connection) of the Riemannian
space M by the relations

Γ_{ij,p} = σ^{-2} n Π_{(ij)(p)},    (18.10)

Γ^k_{ij} = τ^{kp} Γ_{ij,p} = Λ^{kp} Π_{(ij)(p)}.    (18.11)
The statistical connection ∇ defined by (18.10) and (18.11) is said to be ex-
ponential. A more general type of statistical connection on M, the so-called α-
connection ∇^{(α)} [3-5], uses the concept of the tensor of skewness T_ijk(θ), which in
our case has the form

T_ijk = E ∂_i ∂_j ∂_k = γ_3 σ^{-6} n Π_{(i)(j)(k)}.    (18.12)

The meaning of the α-connection being considered,

Γ^{(α)}_{ij,k} = σ^{-2} n Π_{(ij)(k)} − ((1 − α)/2) γ_3 σ^{-6} n Π_{(i)(j)(k)},    (18.13)

consists in the connection taking into account, for α ≠ 1, γ_3 ≠ 0, the skewness of
the errors of observation ε_j. Clearly

Γ^{(1)}_{ij,k} = Γ_{ij,k},

and for all α

Γ^{(α)}_{ij,k} = Γ_{ij,k}

if

γ_3 = m_3 = 0

(in particular if the ε_j are symmetric r.v.-s).
In Section 19 of this book the first terms of the a.e.-s of Chapter 3 are in-
vestigated in detail from the viewpoint of the exponential connection. A fuller
geometric theory of the a.e. in non-linear regression analysis, at any rate in the
spirit of Section 18, is not available to date.
The exponential connection ∇ = ∇^{(1)} is expressed in Riemannian manifolds
through the metric tensor, namely:

Γ_{ij,k} = (1/2) ( ∂τ_jk/∂θ^i + ∂τ_ik/∂θ^j − ∂τ_ij/∂θ^k ).    (18.14)

Since the symbols Γ_{ij,k} are expressed in terms of τ_ij, the connection ∇ reflects the
intrinsic geometry of the manifold M. However, for an embedded manifold the
connection can also be defined externally.
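For a metric of the form τ_ij = σ^{-2} Σ_a g_i(a, θ) g_j(a, θ), the Levi-Civita expression (18.14) reduces, by the product rule, to σ^{-2} Σ_a g_ij(a, θ) g_k(a, θ). A finite-difference sketch of this identity; the two-parameter exponential regression function and all numbers are illustrative, not from the text:

```python
# Check: (1/2)(d tau_jk / d th_i + d tau_ik / d th_j - d tau_ij / d th_k)
#        equals sigma^{-2} sum_a g_ij(a) g_k(a)
# for the illustrative model g(a, theta) = th1 * exp(-th2 * x_a).
import math

x = [0.5, 1.0, 1.5, 2.0]
sigma2 = 1.0

def grad(a, th):
    e = math.exp(-th[1] * x[a])
    return [e, -th[0] * x[a] * e]                    # g_1, g_2

def hess(a, th):
    e = math.exp(-th[1] * x[a])
    return [[0.0, -x[a] * e],
            [-x[a] * e, th[0] * x[a] ** 2 * e]]      # g_ij

def tau(th):
    G = [grad(a, th) for a in range(len(x))]
    return [[sum(G[a][i] * G[a][j] for a in range(len(x))) / sigma2
             for j in range(2)] for i in range(2)]

def dtau(i, j, k, th, h=1e-6):
    tp = th[:]; tp[k] += h
    tm = th[:]; tm[k] -= h
    return (tau(tp)[i][j] - tau(tm)[i][j]) / (2 * h)

th = [1.3, 0.7]
for i in range(2):
    for j in range(2):
        for k in range(2):
            lhs = 0.5 * (dtau(j, k, i, th) + dtau(i, k, j, th) - dtau(i, j, k, th))
            rhs = sum(hess(a, th)[i][j] * grad(a, th)[k]
                      for a in range(len(x))) / sigma2
            assert abs(lhs - rhs) < 1e-5
```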

18.2 STATISTICAL CURVATURE

Let a covariant tensor field V_i = V_i(θ) be given. Then its absolute (covariant)
derivative is given by the second rank tensor [191]

∇_j V_i = V_{ij} = ∂V_i/∂θ^j − Γ^k_{ij} V_k.    (18.15)

Let

N_α = N_α^a ∂_a,  α = 1, …, n − q,

be a basis of the space T_m^⊥(M) orthogonal to the space T_m(M). The derivatives
of the vectors ∂_i are written via the derivation formulae

∂∂_i/∂θ^j = g_ij(a, θ) ∂_a = Γ^k_{ij} ∂_k + B^α_{ij} N_α,    (18.16)

where the B^α_{ij} are the matrix elements of the second quadratic form with the
corresponding direction N_α [155].
The relations (18.16) characterise infinitesimally small alterations of the vectors
of the moving frame referred to itself [155, 191].
If in (18.15) we set V_i = ∂_i, then from (18.16) it follows that the covariant
differentiation (18.15) means geometrically the projection of

g_ij(a, θ) ∂_a = σ^{-2} n^{1/2} b_ij(θ)

onto T_m^⊥(M).
Let there further be given a contravariant tensor field V^i = V^i(θ). Then the
covariant derivative

∇_j V^i = V^i_{,j} = ∂V^i/∂θ^j + Γ^i_{jk} V^k    (18.17)

represents the tensor with one covariant and one contravariant component [191].
The operation of covariant differentiation is not commutative, namely: differ-
entiating equation (18.17) covariantly and then interchanging the indices j and k
we obtain

V^i_{,jk} − V^i_{,kj} = V^l R^i_{l,jk},

where

R^i_{l,jk} = ∂Γ^i_{lk}/∂θ^j − ∂Γ^i_{lj}/∂θ^k + Γ^i_{jp} Γ^p_{lk} − Γ^i_{kp} Γ^p_{lj}    (18.18)

is the rank four tensor called the curvature tensor (or the Riemann-Christoffel
tensor). Lowering the upper index of the tensor (18.18) we obtain the covariant
curvature tensor

R_{lk,ij} = ( ∂Γ^s_{li}/∂θ^k + Γ^s_{kp} Γ^p_{li} ) τ_{sj} − ( ∂Γ^s_{ki}/∂θ^l + Γ^s_{lp} Γ^p_{ki} ) τ_{sj}.    (18.19)

Using the properties of the covariant differentiation operation, this tensor can
be written in a form more convenient for calculation [191]:

R_{lk,ij} = ∂Γ_{li,j}/∂θ^k − Γ^p_{kj} Γ_{li,p} − ∂Γ_{ki,j}/∂θ^l + Γ^p_{lj} Γ_{ki,p}.    (18.20)

The symmetric tensor obtained by contraction,

R_{ki} = τ^{lj} R_{lk,ij},    (18.21)

is called the Ricci tensor, and its contraction

R = τ^{ki} R_{ki} = τ^{ki} τ^{lj} R_{lk,ij}    (18.22)

is the scalar curvature of M, or the Ricci curvature.
With the aid of formulae (18.8), (18.10) and (18.12) it is easy to convince
oneself that the tensor (18.20) has the form

R_{lk,ij} = n σ^{-2} { Π_{(li)(jk)} − Π_{(ki)(jl)}
    + Λ^{pr} ( Π_{(lj)(p)} Π_{(ki)(r)} − Π_{(li)(p)} Π_{(kj)(r)} ) },    (18.23)

and the scalar curvature is

R = n^{-1} { (A − B) + (C − D) },    (18.24)

where

A = σ^2 Λ^{lj} Λ^{ik} Π_{(li)(jk)},

B = σ^2 Λ^{lj} Λ^{ik} Π_{(lj)(ik)},

C = σ^2 Λ^{pr} Λ^{lj} Λ^{ik} Π_{(ik)(p)} Π_{(lj)(r)},

D = σ^2 Λ^{pr} Λ^{lj} Λ^{ik} Π_{(li)(p)} Π_{(jk)(r)}.    (18.25)

For

q = dim M = 1

the curvature tensor R_{lk,ij}, together with the Ricci tensor R_{ki} and the scalar curv-
ature R, becomes zero. In this connection let us consider one more concept of
curvature, defining the 'extrinsic' geometry of the manifold M.
Let

N_α = N_α^a ∂_a,  α = 1, …, n − q,

be an orthogonal basis of T_m^⊥(M). Associated with the direction N_α are the
principal curvatures k_1^α, …, k_q^α of the manifold M at the point m ∈ M, defined
as the eigenvalues of the pencil of forms B^α_{ij} − k τ_ij, i.e., as roots of the equation

det(B^α − kτ) = 0,

where

B^α = (B^α_{ij}) and τ = (τ_ij)

are the matrices of the second quadratic form corresponding to N_α and of the metric
tensor. The mean curvature in the direction N_α is the name given to the quantity
[161]

k^α = Σ_{i=1}^q k_i^α = tr τ^{-1} B^α = τ^{ij} B^α_{ij}.    (18.26)

Let us introduce the mean curvature vector [155, 156] N̄ = k^α N_α. Using (18.16)
we obtain

(18.27)

i.e., the vector N̄ does not depend upon the choice of the basis of the space T_m^⊥(M).
We find the square of its length. The quantity

(18.28)

is called the Efron curvature. In contrast to R, the curvature H does not become
zero when q = 1. By reason of its definition it is non-negative, i.e., H ≥ 0, at the
same time as the curvature R can take values of both signs.

18.3 MEASURES OF THE NON-LINEARITY OF REGRESSION MODELS

Measures of non-linearity are characteristic numbers defining the extent of the
divergence of the non-linear regression model from its linear approximation, and
the possibility of using this approximation in statistical inference. The Ricci and
Efron curvatures could serve as examples of measures of non-linearity. However,
the immediate use of the curvature R proves to be inconvenient, because it can
admit not only positive but also negative values, and consequently cannot be
used as an index of the non-linearity of a model M.
In the theory expounded there exists a clear correspondence between the basic
concepts of differential geometry of an embedded statistical manifold M and a

q-dimensional surface S^q in ℝ^n given in (18.3). This fact is easy to explain in the
following way. Let

g = (g(a))_{a=1}^n,  f = (f(a))_{a=1}^n,

ĝ = σ g(a) ∂_a,  f̂ = σ f(a) ∂_a.

Then the mapping φ(ĝ) = g is an isometry (φ: T_s(S) → ℝ^n):

E ĝ f̂ = ⟨g, f⟩_n,

where ⟨·, ·⟩_n is the scalar product in ℝ^n. Thus, for example, the vector

(σ^{-1} g_k(a, θ))_{a=1}^n

corresponds to the basis vector ∂_k ∈ T_m(M). Let us introduce the (n × q)-matrix

F = F(θ) = (g_i(a, θ))_{a,i=1}^{n,q}.

Then

σ^{-2} F'F = τ = (τ_ij)

is the matrix consisting of the coordinates of the metric tensor of M, which coincides
with the metric tensor of the surface S^q to within the factor σ^{-2}.
The material of this Section can be conveniently presented using the geometry
of S^q. Let us consider the tangent plane T^q(θ) of the surface S^q at the point g(θ).
We shall give the name of normal plane to the surface S^q at the point g(θ) to
the orthogonal complement N^{n−q}(θ) of the tangent plane T^q(θ). Evidently T^q(θ)
corresponds to T_m(M) and N^{n−q}(θ) corresponds to T_m^⊥(M). Let us denote

H^a = (g_ik(a, θ))_{i,k=1}^q,

and let

P = F(F'F)^{-1} F',  P^⊥ = I_n − P    (18.29)

be the orthogonal projection operators onto T^q(θ) and N^{n−q}(θ). Let us set

Q = (Q_ab)_{a,b=1}^n,

where

Q_ab = tr ((F'F)^{-1} H^a (F'F)^{-1} H^b) − tr ((F'F)^{-1} H^a) tr ((F'F)^{-1} H^b).    (18.30)

LEMMA 39.1: The scalar curvature R admits a representation of the form

(18.31)

Proof: Using the expressions (18.24) and (18.25) for the scalar curvature, we obtain
sequentially

σ^{-2} R = n^{-1} Λ^{ik} Λ^{lj} { Π_{(ij)(lk)} − Π_{(ik)(lj)}
    − Λ^{pr} ( Π_{(ik)(p)} Π_{(lj)(r)} − Π_{(ij)(p)} Π_{(lk)(r)} ) }

= Σ_{a=1}^n (F'F)^{-1}_{ik} (F'F)^{-1}_{lj} ( H^a_{ij} H^a_{lk} − H^a_{ik} H^a_{lj} )

− Σ_{a,b=1}^n (F'F)^{-1}_{ik} (F'F)^{-1}_{lj} (F'F)^{-1}_{pr} F^a_p F^b_r ( H^a_{ik} H^b_{jl} − H^a_{ij} H^b_{lk} ),

which is what was required to be proved.
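The projectors in (18.29) are easy to form numerically. A small numpy-based illustration with an arbitrary design matrix F (the dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 6, 2
F = rng.standard_normal((n, q))        # n x q Jacobian of the regression function

P = F @ np.linalg.inv(F.T @ F) @ F.T   # projector onto the tangent plane T^q
P_perp = np.eye(n) - P                 # projector onto the normal plane N^{n-q}

assert np.allclose(P @ P, P)           # idempotency
assert np.allclose(P @ F, F)           # tangent vectors are fixed by P
assert np.allclose(P_perp @ F, 0.0)    # ... and annihilated by P_perp
assert np.isclose(np.trace(P), q)      # P has rank q
```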



Let us denote

K_T = tr PQ,    (18.32)

K_N = tr P^⊥ Q,  K = K_T + K_N = tr Q.    (18.33)

We call the quantities K_T, K_N and K the tangential (geodesic), normal, and total
curvatures of the surface S^q (of the manifold M). The tangential component K_T
of the decomposition (18.33) is defined by the parametrisation of the model M
and becomes zero in a geodesic system of coordinates, i.e., for a parametrisation
of the regression function with Γ^k_{ij}(θ) = 0 at the given point θ. In order that the
latter assertion be more obvious it is convenient to carry out a certain orthogonal
transformation of the sample space ℝ^n.
Let us consider the orthogonal transformation

U = U(g(θ))

of the space ℝ^n defined by the matrix

U = (T', N')',

in which

T = D'F'

is a (q × n)-matrix such that

(F'F)^{-1} = DD'.

In particular, if D is a triangular matrix then this transformation of the matrix
F follows Cholesky's scheme [189]. The matrix N is the ((n − q) × n)-matrix
composed of vectors of the orthonormal basis of N^{n−q}(θ).
Let us set

c^a = D'H^a D,  a = 1, …, n,    (18.34)

and let

A_j = Σ_{a=1}^n T_{ja} c^a,  j = 1, …, q,    (18.35)

A_j = Σ_{a=1}^n N_{ja} c^a,  j = q + 1, …, n.    (18.36)
By reason of the orthogonality of the transformation U we have

tr Q = tr (UQU') = tr A,    (18.37)

where

(18.38)

The projection P = T'T has, in the new basis, the form P_U = UPU'. Therefore
by reason of (18.32) and (18.37)

K_T = tr (P_U UQU') = Σ_{i=1}^q { tr A_i^2 − tr^2 A_i },    (18.39)

K_N = Σ_{i=q+1}^n { tr A_i^2 − tr^2 A_i }.    (18.40)

Let us consider the Christoffel symbols

Γ^k_{i_1 i_2} = Σ_{a=1}^n ((F'F)^{-1} F')_{ka} H^a_{i_1 i_2} = Σ_{a=1}^n (DT)_{ka} H^a_{i_1 i_2}.    (18.41)
Let us denote

Γ^i = Γ^i(θ) = (Γ^i_{i_1 i_2}(θ))_{i_1, i_2 = 1}^q.

Then

A_i = (D^{-1})_{ij} D' Γ^j D,  i = 1, …, q.    (18.42)

Since Γ^i(θ) = 0, i = 1, …, q, for a geodesic parametrisation, the equalities

A_k = 0,  k = 1, …, q,

are a consequence of (18.42) and the non-degeneracy of the matrix D.
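The matrices D and T and the orthogonal transformation U can be realised numerically, e.g. via the Cholesky factorisation of (F'F)^{-1} and a full QR decomposition of F. A numpy sketch (the matrix F is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 5, 2
F = rng.standard_normal((n, q))

# D with (F'F)^{-1} = DD': take the (lower triangular) Cholesky factor
D = np.linalg.cholesky(np.linalg.inv(F.T @ F))
T = D.T @ F.T                          # q x n, with orthonormal rows

assert np.allclose(T @ T.T, np.eye(q))

# orthonormal basis N of the normal plane from the full QR decomposition of F
Qfull, _ = np.linalg.qr(F, mode="complete")
N = Qfull[:, q:].T                     # (n - q) x n
U = np.vstack([T, N])                  # the orthogonal transformation U
assert np.allclose(U @ U.T, np.eye(n))
```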


Let us consider the question of the impact of a reparametrisation of the model
M on the form of the matrices A_j. Let

θ = θ(θ̃)

be a twice-differentiable one-to-one mapping of Θ̃ into Θ, and let

Φ = (∂θ^i/∂θ̃^j)_{i,j=1}^q,  Ψ = (∂θ̃^i/∂θ^j)_{i,j=1}^q

be the Jacobi matrices of the mappings φ and φ^{-1}. Also let

Φ^k = (∂^2 θ^k/∂θ̃^i ∂θ̃^j)_{i,j=1}^q,  Ψ^k = (∂^2 θ̃^k/∂θ^i ∂θ^j)_{i,j=1}^q,  k = 1, …, q.

Let us set

F̃(θ̃) = (∂g(j, θ(θ̃))/∂θ̃^i)_{j,i=1}^{n,q} = F(θ)Φ,

H̃^j(θ̃) = (∂^2 g(j, θ(θ̃))/∂θ̃^i ∂θ̃^k)_{i,k=1}^q.

Then

(F̃'(θ̃) F̃(θ̃))^{-1} = (Φ^{-1} D(θ))(Φ^{-1} D(θ))',

D̃(θ̃) = Φ^{-1} D(θ),



and

T̃(θ̃) = D̃'(θ̃) F̃'(θ̃) = D'(θ) F'(θ) = T(θ).

Therefore, for j = 1, …, q,

Ã_j(θ̃) = Σ_{k=1}^n T_{jk} (D̃' H̃^k D̃)
       = Σ_{k=1}^n T_{jk} D' (Φ^{-1})' (Φ' H^k Φ + F_{km} Φ^m) Φ^{-1} D,    (18.43)

where, by the chain rule, we have written

H̃^k = Φ' H^k Φ + F_{km} Φ^m.

To obtain (18.43) the equalities

T = D'F',  F'F = (D')^{-1} D^{-1}

were used. From the identity

(∂^2 θ̃^i/∂θ^m ∂θ^s)(∂θ^s/∂θ̃^k)(∂θ^m/∂θ̃^j) + (∂θ̃^i/∂θ^m)(∂^2 θ^m/∂θ̃^j ∂θ̃^k) = 0

we find

Φ' Ψ^i Φ + (Φ^{-1})^i_m Φ^m = 0,  i = 1, …, q.

Consequently

(Φ^{-1})^i_m Φ^m = −Φ' Ψ^i Φ.    (18.44)

Substituting (18.44) in (18.43) and using the equality

D̃^{-1} = D^{-1} Φ,

finally we obtain

(18.45)

To the geodesic coordinate system

θ̃ = (θ̃^1, …, θ̃^q)

corresponds the condition

Ã_j(θ̃) = 0,  j = 1, …, q,

i.e., at the point θ = θ(θ̃)

(18.46)

From (18.46), with the aid of (18.42), we obtain the relation

Γ^i = Φ_{ir} Ψ^r,  i = 1, …, q,    (18.47)

connecting the Christoffel symbols of the surface S^q with the required transform-
ation of the parameters. The equality (18.47) corresponds to the formulae (91.4)
of the book [191] and is suitable when the mapping θ̃ = θ̃(θ) is given: then
the right hand side of (18.47) can be written in terms of the original parameter θ.
For this it is sufficient to calculate

Measures of non-linearity must have a structure connected with the curvatures K_T and
K_N, having the additional property of non-negativity. Let l ∈ ℝ^q be some direc-
tion. On the surface S^q let us consider the curve

L^q = { g(θ + λl), λ ∈ ℝ^1 }.

Differentiating the curve L^q twice at the point λ = 0 we define the vectors a_l, b_l ∈
ℝ^n with the coordinates

a_l = Fl,  b_l = (l' H^j l)_{j=1}^n.

Let us represent the vector b_l in the form

b_l = b_l^T + b_l^N,

where

b_l^T = P b_l,  b_l^N = P^⊥ b_l.

The quantity

K_l^N = |b_l^N| / |a_l|^2    (18.48)

can be called the normal curvature of the surface sq


at the point g(O) in the
direction 1, since it is the analogue of the normal curvature of a regular surface in
IR3 , which is defined in the form of the ratio of the second and first quadratic forms
of the surface (182). The distinction consists in this, that Kf 2: 0 at the same
time as the second quadratic form can also take negative values. The quantity

KT _ IbTl (18.49)
I - lad 2
we call the tangential (geodesic) curvature of the surface sq
at the point g(O) in
the direction l.
Let d E IRq be a vector of unit length: Idl = 1. Let us consider the direction
1 = Dd. Then

al T'd,
bl = (d,cjdf3=1 '
Ual = (d 1 , .. ,dq,0, ... ,0)',
Ubi = (d'A jd);=l . (18.50)

Since U is an isometry, in the definition of the curvatures K_l^T and K_l^N it is
possible, by formulae (18.48) and (18.49), to replace the vectors a_l and b_l by their
images. Since

|a_l| = |U a_{Dd}| = |d| = 1

and the projection P in the new basis has the form P_U, it follows that

K^T_{Dd} = |b^T_{Dd}| = |(d' A_j d)_{j=1}^q|,    (18.51)

K^N_{Dd} = |b^N_{Dd}| = |(d' A_j d)_{j=q+1}^n|.    (18.52)

Let the vector

l = (F'F)^{-1} F' ε,

where ε is a Gaussian (0, σ^2 I_n) vector, be chosen as the direction l. The meaning
of the choice of such a direction becomes clear if one considers that for a lineari-
sation at the point θ of the regression functions g(j, θ) and the l.s.e. θ̂_n of the
parameters θ^i, i = 1, …, q,

θ̂_n − θ = (F'F)^{-1} F' ε.

In this way, for the given l the curve L^q is the image on the surface S^q of the
straight line passing through the point θ and the l.s.e. θ̂_n of the parameters of
the linearised initial model. Let us note that

where h_0 is the first term of the a.e. of the l.s.e. obtained in Section 7.
In the quantities K_l^T and K_l^N let us substitute the random direction l = (F'F)^{-1} F' ε.
Beale [28] proposed considering as a measure of the non-linearity of the initial
Gaussian model M quantities of the form

B̄ = E_θ^n |b_l|^2 / E_θ^n |a_l|^4
  = E_θ^n |b_l^T|^2 / E_θ^n |a_l|^4 + E_θ^n |b_l^N|^2 / E_θ^n |a_l|^4
  = B̄^T + B̄^N.    (18.53)

Since, now, σ^{-2} |a_l|^2 = |P σ^{-1} ε|^2, thanks to the idempotency of the matrix P,
has a χ_q^2 distribution, we have

E_θ^n |a_l|^4 = σ^4 q(q + 2).

On the other hand,

(18.54)

Therefore

B̄ = (1/(q(q + 2))) tr G,

G = (2 tr G_i G_j + tr G_i tr G_j)_{i,j=1},    (18.55)

B̄^T = (1/(q(q + 2))) tr PG,

B̄^N = (1/(q(q + 2))) tr P^⊥ G.    (18.56)

Let us set

r = Tε and l = Dr.

Then in Beale's definition of the measure of non-linearity (18.53)

U a_l = (r_1, …, r_q, 0, …, 0)',

U b_l = (r' A_j r)_{j=1}^n.    (18.57)

Substituting (18.57) in (18.53) we obtain (taking into account that r is a Gaussian
(0, σ^2 I_q)-vector)

B̄ = (1/(q(q + 2))) Σ_{j=1}^n (2 tr A_j^2 + tr^2 A_j) = (1/(q(q + 2))) tr A,

B̄^T = (1/(q(q + 2))) Σ_{j=1}^q (2 tr A_j^2 + tr^2 A_j),    (18.58)

B̄^N = (1/(q(q + 2))) Σ_{j=q+1}^n (2 tr A_j^2 + tr^2 A_j).    (18.59)

Let S_q(1) be the sphere of unit radius in the space ℝ^q, ds the Lebesgue measure on
the surface of the sphere S_q(1), and |S_q(1)| the area of S_q(1). As Bates and Watts
[25] showed, Beale's measure of non-linearity (18.56) is linked to the curvatures
K^T_{Dd} and K^N_{Dd} by the relations

B̄^T =

B̄^N =

Let us set

B^T = tr PG,  B^N = tr P^⊥ G.    (18.60)

The quantities (18.60) differ from the measures of non-linearity introduced
by Beale only by numerical factors, and have a structure analogous to the curv-
atures K_T and K_N. The latter circumstance allows us to introduce the following
generalisation of the relations (18.32) and (18.60).
Let δ_1, δ_2 ∈ ℝ^1 and

G(δ) = (G_{ab}(δ))_{a,b=1}^n,

where

G_{ab}(δ) = δ_1 tr (c^a c^b) + δ_2 tr c^a tr c^b.    (18.61)

The quantities

C^T = tr PG(δ) = Σ_{i=1}^q (δ_1 tr A_i^2 + δ_2 tr^2 A_i),    (18.62)

C^N = tr P^⊥ G(δ) = Σ_{i=q+1}^n (δ_1 tr A_i^2 + δ_2 tr^2 A_i)    (18.63)

we call the generalised curvatures of the surface S^q (of the manifold M). For
δ_1, δ_2 ≥ 0 the generalised curvatures C^T and C^N are measures of non-linearity.
According to (18.24) and (18.28), R and H can be written as

R = n^{-1} {(A − D) − (B − C)},  H = n^{-1} (B − C).

Hence, with regard to the equalities

n tr G(δ) = δ_1 A + δ_2 B
          = (δ_1 D + δ_2 C) + (δ_1 (A − D) + δ_2 (B − C)),

we obtain

C^T = n^{-1} (δ_1 D + δ_2 C),    (18.64)

C^N = δ_1 (R + H) + δ_2 H.    (18.65)

From (18.63) and (18.65) there now follows:

THEOREM 39: There hold the equalities

H = Σ_{i=q+1}^n tr^2 A_i,  H + R = Σ_{i=q+1}^n tr A_i^2.

COROLLARY 39.2: The Efron and Ricci curvatures of the manifold M are related
by the inequality

H + R ≥ 0.    (18.66)

The geometric interpretation of the measures C^N follows from (18.65), namely:
C^N is a linear combination of the 'extrinsic' and 'intrinsic' curvatures H and R of
the surface S^q (of the manifold M) of the form

C^N = (δ_1 + δ_2) H + δ_1 R.    (18.67)

For δ_1 = 2, δ_2 = 1 we obtain Beale's measure of non-linearity (see (18.60)):

B^N = 3H + 2R.    (18.68)
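The formulas of Theorem 39 can be exercised numerically for an embedded surface, using the matrices A_j of (18.34)-(18.36). The sketch below takes an illustrative two-parameter surface in ℝ^3 (not from the text) and checks the non-negativity claims of Corollary 39.2:

```python
# For g(theta) = (th1, th2, th1^2 + 3*th1*th2), form F (Jacobian), H^a
# (Hessians), then c^a = D'H^aD, A_j = sum_a U_{ja} c^a, and finally
# H = sum_{j>q} tr^2 A_j, H + R = sum_{j>q} tr A_j^2.
import numpy as np

th = np.array([0.4, -0.2])
n, q = 3, 2

F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0 * th[0] + 3.0 * th[1], 3.0 * th[0]]])   # Jacobian of g
Hs = [np.zeros((2, 2)), np.zeros((2, 2)),
      np.array([[2.0, 3.0], [3.0, 0.0]])]                  # Hessians H^a

D = np.linalg.cholesky(np.linalg.inv(F.T @ F))
T = D.T @ F.T
Qfull, _ = np.linalg.qr(F, mode="complete")
N = Qfull[:, q:].T                                         # normal basis
U = np.vstack([T, N])

c = [D.T @ Ha @ D for Ha in Hs]
A = [sum(U[j, a] * c[a] for a in range(n)) for j in range(n)]

H_efron = sum(np.trace(A[j]) ** 2 for j in range(q, n))
H_plus_R = sum(np.trace(A[j] @ A[j]) for j in range(q, n))

# Corollary 39.2: H >= 0 and H + R >= 0 (tr A^2 >= 0 for symmetric A)
assert H_efron >= 0.0
assert H_plus_R >= -1e-12
```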

18.4 STATISTICAL INVARIANTS

An object of any sort which is not changed under a local coordinate transformation
is said to be invariant in differential geometry. For example, any point m E M
is invariant because a transformation changes its coordinates () = (}1, ... ,(}q) but
does not change the point.
Every system of points is an invariant, every functions of points is also an
invariant. If such a function in one coordinate system () has the form T(}), then in
a new coordinate system iJ linked to the system () by the relation (}i = <pi (iJi , ... ,iJq )
it is represented in the form

In this way a function of points is defined through the specification in each


coordinate system () of some function I(}) of q variables. A function I is said to
an (absolute) scalar, and its value I(}) is a scalar component in the coordinate
system (). In this way a scalar is the invariant that in each coordinate system has
an unique component.
Let w_θ be the components of a quantity w under consideration, w being a con-
crete structure on the manifold M. Let us fix the local coordinates θ = (θ^1, …, θ^q).
Then

Let us consider the function

(18.69)

depending on a finite number of formal variables w_θ, ∂w_θ/∂θ^i, etc. Then T(w, θ)
is some function of the variables θ^i.

DEFINITION ([2], p. 207): If, for a structure w and any coordinate system θ̃ =
(θ̃^1, …, θ̃^q),

T(w, θ) = T(w, θ̃),    (18.70)

then T is called a (scalar) differential invariant.

Equality (18.70) means that T(w, θ) and T(w, θ̃) are the representations of the
same function I(θ) on M in the coordinate systems θ and θ̃, respectively.
Thus the Ricci curvature R is a scalar differential invariant of the metric tensor
τ_ij.
The problem before us does not lie in the determination of different
invariants, since their construction does not present any particular difficulty. The
problem that unfolds in the theory below is, starting from a given
system of basic variables, to construct from them a system of invariants
which completely describes the solution of the problems of non-linear regression
analysis.

Such a system of basic variables is naturally called statistically complete, and
the scalar invariants constructed from them are called statistical invariants of the
model M.
Let us consider the tensors

w_{i_1 … i_k} = E ∂_{i_1} ⋯ ∂_{i_k}.    (18.71)

In particular,

w_ij = E ∂_i ∂_j = τ_ij

is the metric tensor, or the tensor of covariance;

w_ijk = E ∂_i ∂_j ∂_k = T_ijk

is the tensor of skewness (18.12); w_ijkl is the tensor of the excess of the model M.
In the geometric theory of asymptotic expansions the tensors w_ij, w_ijk, w_ijkl
play an important role in the construction of statistical invariants.
It is convenient to examine the theory of statistical invariants in the two cases
q = 1 and q ≥ 2 separately.

18.5 INVARIANTS OF NON-LINEAR REGRESSION WITH A SCALAR PARAMETER

Let

S^1 = {g ∈ ℝ^n : g(j) = g(j, θ), θ ∈ Θ}    (18.72)

be a parametrised curve of general type in the Euclidean space ℝ^n, and

⟨u, v⟩_n = Σ_{a=1}^n u_a v_a

the metric given in ℝ^n. Let us denote

F_a^α = d^α g(a)/dθ^α,  F_α = (F_a^α)_{a=1}^n.

The matrix G = (G_{αβ}) is the Gram matrix of the system of vectors F_α, α =
1, …, n.
Let us introduce the parameter

s = ∫_{θ_0}^θ G_{11}^{1/2}(u) du,    (18.73)

corresponding to the length of arc of the curve S^1 between the points g_0, g ∈ S^1
that have local coordinates θ_0 and θ. Let us write {e_i(s)}_{i=1,…,n} for the Frenet
frame of the curve S^1 at the point g ∈ S^1. The rate of its change as g moves along
S^1 is given by the Frenet formulae [191]

de_i(s)/ds = −k_{i−1}(s) e_{i−1}(s) + k_i(s) e_{i+1}(s),

where it is assumed that

k_0(s) = k_n(s) = 0.

The functions k_i(s), i = 1, …, n, are called the curvatures of the curve S^1. The
quantity σ^2 k_1^2(s) coincides with the Efron curvature (18.28).
Let us denote by Δ_k, k = 0, …, n, the principal minors of order k of the Gram
matrix G. By definition we set Δ_0 = 1.
THEOREM 40: The equalities

k_i^2(s) = Δ_{i−1}(s) Δ_{i+1}(s) / Δ_i^2(s),  i = 1, …, n − 1,    (18.74)

hold (for simplification of the equations we also omit the argument s below).

Proof: According to [182], p. 259, we can write

k_i = β_{ii} / β_{i+1,i+1},  i = 1, …, n − 2,    (18.75)

where the β_{ii} are the last coefficients in the decompositions

e_i = Σ_{j=1}^i β_{ij} F_j,  i = 1, …, n − 1,    (18.76)

corresponding to the Gram-Schmidt orthogonalisation procedure applied to the system of
vectors F_j, j = 1, …, n − 1. Following [88], let us set

E_i = F_i − Σ_{k=1}^{i−1} α_{ik} E_k,  i = 1, …, n − 1,    (18.77)

where the α_{ik} are chosen from the condition of orthogonality of the vector E_i =
(E_i^a) to the vectors E_1, …, E_{i−1}. Then from (18.75)-(18.77), with regard to the
normalisation e_i = E_i/|E_i|, we obtain

k_i = |E_{i+1}| / |E_i|,  i = 1, …, n − 2.    (18.78)


Let us denote by E and A the n × n matrices with elements E_ij = ⟨E_i, E_j⟩ and

A_ij = α_ij if i > j,  A_ij = 1 if i = j,  A_ij = 0 otherwise.

Then from (18.77) we obtain

F_i = Σ_j A_ij E_j,

i.e.,

G = A E A'.

Rewriting the latter equality in the form

det G = det A · det E · det A'

and taking into account the condition det A = 1, we conclude that

det E = det G.    (18.79)

It is not difficult to see that a relation of the type (18.79) can be extended to
all the principal minors of the matrices E and G. Since E is a diagonal matrix we
have

Δ_k = Π_{j=1}^k ⟨E_j, E_j⟩,

which together with (18.78) gives the required result.


COROLLARY 40.1: Efron's curvature σ²k_1² for a naturally and an arbitrarily parametrised
curve admits a representation in the form

k_1²(s) = Δ_2(s) = G_22(s),   (18.80)

k_1²(θ) = Δ_2(θ) / Δ_1³(θ).   (18.81)

Proof: Equations (18.80) follow from (18.74) taking into account that Δ_1(s) = 1,
G_12(s) = 0. The relation (18.81) follows from the equalities

Δ_i(s) = Δ_i(θ) / Δ_1^{i(i+1)/2}(θ),   i ≥ 2,

which, in turn, follow from (18.73).
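To illustrate (18.81) numerically, consider a hypothetical example not from the text: a circle of radius r in ℝ², whose first curvature is 1/r.

```python
import numpy as np

# Check of (18.81) on a circle of radius r: g(t) = (r cos t, r sin t).
# For this curve k_1 = 1/r, so Delta_2 / Delta_1^3 must equal 1/r^2.
def principal_minors(G):
    """Delta_0 = 1, Delta_k = det of the leading k x k block of G."""
    return [1.0] + [np.linalg.det(G[:k, :k]) for k in range(1, G.shape[0] + 1)]

r, t = 2.5, 0.7
F = np.array([
    [-r * np.sin(t),  r * np.cos(t)],    # F_1 = dg/dt
    [-r * np.cos(t), -r * np.sin(t)],    # F_2 = d^2 g/dt^2
])
d = principal_minors(F @ F.T)            # minors of the Gram matrix
print(d[2] / d[1] ** 3)                  # ~ 1/r^2 = 0.16
```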




Let us consider the quantities B_1, B_2, B_3 which are given by equations (17.20)–(17.22) and appear in the formulations of Section 17.
THEOREM 41: The quantities

B_1 = I_{02} − I²_{11},

are statistical scalar invariants of the model M.


Proof: The statistical invariance of B1 is clear, since

is the Efron curvature.


On the other hand the quantities I_k of formula (17.19) are products of the
quantities γ_k σ^{−2k} Π_{1…1}, analogous to the tensors w_{i_1…i_k} for q = 1, and powers of
the quantity σ²Λ, analogous to the contravariant tensor

Let

θ̄ : Θ → Θ̄

be a diffeomorphism defining a reparametrisation of the curve S¹. From the relations

Λ̄(θ̄) = (dθ̄/dθ)² Λ(θ),

Π̄_{1…1}(θ̄) = (dθ/dθ̄)^k Π_{1…1}(θ)

it follows that the quantities I_k are invariants, i.e., B_3 is invariant.


Since the set of invariants is closed with respect to the differentiation

D = Λ^{1/2} d/dθ,

the invariance of B_2 is a consequence of the equality

B_2 = (1/3) Λ^{1/2} dI_3/dθ.

Let us observe that I_3 is the coefficient of skewness of the r.v. b_1(θ). The
invariant B_1 is always non-negative, and the invariant B_2, describing the rate of
change of I_3 as a function of θ, can take any value, with B_2 = 0 if γ_3 = 0.

The proof of Theorem 41 is not complete, since there remain some unexplained
questions about the system of basis quantities from which the statistical invariants
of the regression model with a scalar parameter are constructed. Clearly, the
collection of quantities I_{i_1…i_k} given in (17.18) can serve as such a basis. A more
precise answer is contained in the Subsection that follows.

18.6 INVARIANTS OF NON-LINEAR REGRESSION WITH A VECTOR


PARAMETER

Using the definitions of the affine connection Γ^k_{ij} and the tensors w_{i_1…i_k} according
to (18.11) and (18.71), let us rewrite the system of quantities P_k, k = 1, …, 16,
from the formulation of Theorem 35 of Section 16 (see equation (16.22)) in the
following form:
P_1 = n Γ^{is} Γ^{jr} w_{ijrs},

P_2 = n Γ^{is} Γ^{jr} Γ^{kt} w_{ikt} w_{sjr},

P_3 = n Γ^{is} Γ^{jr} Γ^{kt} w_{ijk} w_{srt},

P_4 = n² Γ^{is} Γ^{jr} (σ^{−6} γ_3 Π_{(is)(j)(r)}),

P_5 = n² Γ^{is} Γ^{jr} (σ^{−6} γ_3 Π_{(ij)(r)(s)}),

P_6 = n Γ^{is} Γ^{jr} w_{isk} Γ^k_{jr},

P_7 = n Γ^{is} Γ^{jr} w_{ijk} Γ^k_{rs},

P_8 = n Γ^{is} Γ^{jr} w_{ijs} Γ^k_{rk},

P_9 = n Γ^{ij} Γ^u_{ij} Γ^v_{uv},

P_10 = n Γ^{ij} Γ^u_{iv} Γ^v_{ju},

P_11 = n Γ^{ij} Γ^u_{iu} Γ^v_{jv},

P_12 = n Γ^{is} Γ^{jr} Γ_{is,u} Γ^u_{jr},

P_13 = n Γ^{is} Γ^{jr} Γ_{rs,u} Γ^u_{ij},

P_14 = n² Γ^{is} Γ^{jr} (σ^{−2} Π_{(is)(jr)}),

P_15 = n² Γ^{is} Γ^{jr} (σ^{−2} Π_{(ij)(rs)}),

P_16 = n² Γ^{is} Γ^{jr} (σ^{−2} Π_{(i)(jrs)}).   (18.82)

The quantities (18.82) are multi-dimensional generalisations of the quantities
(17.19) and are basic in the sense of Subsection 18.4 for the series of problems with

q > 1. The correspondence between the quantities (17.19) and (18.82) is listed in
Table 4.1. Consequently the quantities I_{i_1…i_k} contained in it are a basis for q = 1.

Table 4.1: The correspondence between I_{i_1…i_k} and P_j.

q = 1        q ≥ 2
I_4          P_1
I²_3         P_2, P_3
I_{21}       P_4, P_5
I_3 I_{11}   P_6, P_7, P_8
I²_{11}      P_9–P_13
I_{02}       P_14, P_15
I_{101}      P_16
The quantities P_1, P_2, P_3 are scalar invariants of the model M, since they
represent contractions of the tensors Γ^{ij} and w_{i_1…i_k}, k = 3, 4. Indeed, an arbitrary
tensor under transfer from the coordinate system θ to θ̄ is transformed according
to the law

(18.83)

The linear combination αP_1 + βP_2 + γP_3 is the multi-dimensional analogue of the
invariant B_3.
Further, it is not difficult to see that

P_15 = A,   P_14 = B,   P_12 = C,   P_13 = D,

where A, B, C, D are defined in (18.25). Consequently

R = n^{−1}{(P_15 − P_13) − (P_14 − P_12)},   (18.84)

H = n^{−1}(P_14 − P_12)   (18.85)
are the Ricci and Efron curvatures, respectively. We obtain analogously that (see
equation (18.68))

(18.86)
In the theory considered below the normal McCullagh curvature [155]

Y = H + 2R = n^{−1}{(P_12 + 2P_15) − (P_14 + 2P_13)}   (18.87)

is encountered. Let us note in passing that it is not a measure of non-linearity of


the regression model, since it follows from (18.65) that

THEOREM 42: The quantity

Q = (P_4 + 2P_5) − (P_6 + 2P_7)   (18.88)

is a scalar differential invariant of the model M. If q = 1 then

Proof: Let

θ = θ(θ̄) : Θ̄ → Θ

be a diffeomorphism specifying the reparametrisation of the model M. Then

Λ̄^{αβ} = Λ^{ij} (Φ^{−1})^α_i (Φ^{−1})^β_j,

Π̄_{(α)(βγ)} = Π_{(i)(jk)} Φ^i_α Φ^j_β Φ^k_γ + Π_{(i)(j)} Φ^i_α Φ^j_{βγ},

Π̄_{(α)(β)(γ)} = Π_{(i)(j)(k)} Φ^i_α Φ^j_β Φ^k_γ,

Π̄_{(α)(β)(γδ)} = Π_{(i)(j)(kl)} Φ^i_α Φ^j_β Φ^k_γ Φ^l_δ + Π_{(i)(j)(k)} Φ^i_α Φ^j_β Φ^k_{γδ}.   (18.89)

Let us substitute the corresponding expressions (18.89) in P4 and P6. Then

Hence we obtain

(18.90)

Analogously we can write

(18.91)

Table 4.2: The correspondence between B_i, i = 1, 2, 3, and their multi-dimensional generalisations.

q = 1        q ≥ 2

From the equations

= n^{−1}(P_4 + 2P_5),

= n^{−1}(P_6 + P_8),

Γ^{is} Γ^{jr} Γ^{kt} w_{ijk} ∂w_{rs}/∂θ^t = 2n^{−1} P_7,

Γ^{is} Γ^{jr} Γ^{kt} w_{ijs} ∂w_{kt}/∂θ^r = 2n^{−1} P_8,

eliminating P_8 we find that

(18.92)

is some function of the tensors w_{ij} and w_{ijk}. Since, according to (18.90) and (18.91),
we have

Q̄ = Q,

the validity of the assertion being proved follows from (18.69), (18.70), and
(18.92).
The scalar differential invariant Q is a joint invariant of the tensors w_{ij} and
w_{ijk}. The quantities

(18.93)

are merely scalar invariants with respect to the reparametrisation, with

(18.94)

Table 4.2 shows the correspondence between the one-dimensional statistical


invariants Bi, i = 1,2,3, and their multi-dimensional generalisations.

19 THE GEOMETRIC INTERPRETATION OF ASYMPTOTIC


EXPANSIONS

19.1 GEOMETRY OF THE AE OF THE LSE MOMENTS

We shall use the notations and results of the preceding Section.


For the coordinates of the bias vector of the l.s.e. θ̂_n,

t_n = E_θ n^{1/2}(θ̂_n − θ),

according to (12.19) we obtain

t_n^k = −(n^{−1/2}/2) Γ^{ij} Γ^k_{ij} + o(n^{−1}),   k = 1, …, q.   (19.1)

In its turn, from (19.1) and (18.42) it follows that, in the language of the 'tangential'
matrices A_i, i = 1, …, q,

t_n^k = −(1/2) n^{−1/2} Λ^{kl} tr A_l + o(n^{−1}),   k = 1, …, q.   (19.2)

The relations (19.1) and (19.2) show that the bias t_n depends upon the parametrisation
of the model and can be made equal to zero within terms of order o(n^{−1})
upon passage to a geodesic coordinate system.
Turning to the a.e. (12.24), for the correlation matrix of the l.s.e. θ̂_n we obtain
the expression

and

(19.3)

where R_{i_1 j_1} is the Ricci tensor (18.21),

T^{(ij)} = ½ (T^{ij} + T^{ji}),

and the matrix T^{ij} has the form

× (2Π_{(i_1 i_2)(j_1)(j_2)} − Γ^{i_3 j_3} Π_{(j_1)(j_2)(j_3)} (Γ_{i_2 i_3, i_1} + Γ_{i_1 i_3, i_2} + Γ_{i_1 i_2, j_3}))

+ n² Γ^{j j_1} Γ^{i_2 j_2} (Γ^i_{i_1 j_1} Γ^{i_1}_{i_2 j_2} + 2 Γ^i_{i_1 i_2} Γ^{i_1}_{j_1 j_2})   (19.4)

From (18.82), (18.84), and (19.2)-(19.4) we obtain


σ^{−2} tr Γ D_n = q + ζ n^{−1} + o(n^{−1}),   (19.5)

where

ζ = nR + 2Q_2 − P_7 + P_9 + 2P_10 + ½ P_13 − P_16.   (19.6)

In order to make the expression (19.6) more intuitive let us take advantage of the
equation

σ^{−2} n Γ^{ij} ∂Γ^k_{ij}/∂θ^k = −P_10 − P_13 + P_15 + P_16.   (19.7)

Substituting P_16 from (19.7) into (19.6) we can write

ζ = ζ_N + ζ_T + ζ_*,   (19.8)

where

ζ_N = n(H + 2R) + 2Q_2 = nY + 2Q_2,   (19.9)

ζ_T = −P_7 + P_9 + P_10 + ½ P_13,   (19.10)

ζ_* = −σ^{−2} n Γ^{ij} ∂Γ^k_{ij}/∂θ^k   (19.11)

are the invariant, tangential and uninterpretable components of the quantity ζ.
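The decomposition can be checked directly: reading nR and nY through the basis quantities via (18.84) and (18.87) and eliminating P_16 via (19.7), the identity ζ = ζ_N + ζ_T + ζ_* holds for arbitrary values of the remaining quantities. The random numbers below merely stand in for P_7, …, P_16, Q_2 and the derivative term of (19.11); the coefficient readings used are assumptions consistent with (19.6)–(19.11).

```python
import random

# Consistency check of (19.6)-(19.11): with nR, nY from (18.84), (18.87)
# and P_16 eliminated via (19.7), zeta = zeta_N + zeta_T + zeta_* holds
# identically in P_7..P_15, Q_2 and X = sigma^{-2} n Gamma^{ij} dGamma^k_{ij}/dtheta^k.
random.seed(1)
P = {k: random.random() for k in range(7, 16)}
Q2, X = random.random(), random.random()
P[16] = X + P[10] + P[13] - P[15]                    # equation (19.7)

nR = (P[15] - P[13]) - (P[14] - P[12])               # n times (18.84)
nY = (P[12] + 2 * P[15]) - (P[14] + 2 * P[13])       # n times (18.87)

zeta   = nR + 2 * Q2 - P[7] + P[9] + 2 * P[10] + P[13] / 2 - P[16]  # (19.6)
zeta_N = nY + 2 * Q2                                                # (19.9)
zeta_T = -P[7] + P[9] + P[10] + P[13] / 2                           # (19.10)
zeta_s = -X                                                         # (19.11)
print(abs(zeta - (zeta_N + zeta_T + zeta_s)) < 1e-12)   # -> True
```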


For regression with a scalar parameter the values of the bias and variance have
the form

(19.12)

(19.13)
Let us introduce the normed bias

t̄_n = t_n D_n^{−1/2} = −½ Λ^{3/2} Π_{112} n^{−1/2} + o(n^{−1}).   (19.14)

Hence from the definition (18.64) of the measure of non-linearity c_T in the one-dimensional
case it also follows that

c_T = n^{−1}(δ_1 + δ_2) σ² Λ³ m² = 4(δ_1 + δ_2) t̄_n² + o(n^{−3/2}).   (19.15)



Equalities (19.14) and (19.15) give the statistical interpretation of the tangential
measure of non-linearity c_T. For q ≥ 2 there exist no simple relations of the type
(19.15).
Let γ_3 = 0, and let us turn to the expressions ζ_N and ζ_T in the language of the matrices A_i,
i = 1, …, n.
By Theorem 39 of the preceding Section
ζ_N = nY = nσ² Σ_{i=q+1}^{n} (2 tr A_i² − tr² A_i).   (19.16)

Taking into account that

by equation (18.41) we next obtain

n
(DD')i2h 2: (DT)haHi;h(DT)itbHLl
a,b=l
n
2: Tpa (D~i2 Hi;h Dhk) Tmb (D~il HLl Dit m)
a,b=l
q n n
= 2: 2: Tpa C~k 2: TmbC!m
k=la=l b=l
q

= 2: Ap,kkAm,pm
k,p,m=l
q

= tr Ap 2: Am,pm
m=l

Am,pm trAp. (19.17)

We establish analogously that

2: Ap,kmAm,pk
k,p,m=l

= L {AmAp}pm
m,p=1
(19.18)

We find the quantity ½P_13 from equations (18.62) and (18.64). Finally we obtain

ζ_T = nσ² (A_{m,pm} tr A_p + (A_m A_p)_{pm} + ½ Σ tr A_p²).   (19.19)

19.2 THE GEOMETRY OF THE AEs ASSOCIATED WITH THE ESTIMATOR


OF THE VARIANCE σ²

Let us comment from the geometric viewpoint on the form of the coefficients of the
polynomials (13.69) and (13.70) of the a.e. (13.44) of Theorem 29 of Section 13.
First of all, all quantities which contain ρ_1, ρ_2 and ρ_3 do not enter the set of basis
variables P_1–P_16. Consequently the complete interpretation of the a.e. (13.44) in
the spirit of Subsection 19.1 is not possible, and we restrict ourselves only to some
remarks. Clearly,

= L (DD\j FaiFbj
a,b=1
n

= L D:niFlaD:njFjl
a,b=1
n

= L T~mTmb
a,b=1
n
= L Pab. (19.20)
a,b=1
is the sum of the elements of the matrix of the orthoprojection onto the tangent
space T_m(M) (or, in terms of the surface S^q, onto the tangent plane T_q(θ)).
On the other hand,

P2 = 20'-2nri2hr~l. II( ) + 0'- 2 nri dl r~2. II( )


'1'2 32 'PI 32

- 20' -2 nr hhril II
i2ia (ia) -
Ai lillI (hil) , (19.21)

P3
= 0'
-4 2 idl i2h
n r r riaili2 II (M II(h) II (ia)
- Ai dl Ahh II (M II (Ma) II (is). (19.22)

And so in the geodesic coordinate system

ρ_2 = −Λ^{i_1 j_1} Π_{(i_1 j_1)},

ρ_3   (19.23)

In respect of the quantity ρ_2 let us observe that

−Λ^{i_1 j_1} Π_{(i_1 j_1)} = −Σ_{a=1}^{n} (DD′)_{ij} H^a_{ij}
= −Σ_{a=1}^{n} D′_{mi} H^a_{ij} D_{jm}
= −Σ_{a=1}^{n} tr C_a.   (19.24)

Turning to the a.e. (13.34) of Theorem 28 of Section 13 and the a.e.-s (15.37)
and (15.38) of Theorem 33 of Section 15, let us note that the function z(θ), defined
by (15.39), is the difference of two quantities z_1 − z_2, where z_2 coincides with
(19.24), and

Zl Ai d1 Ai2h II (i131)(i2) II (h)

= a- 2 nr i131 r~2.
'131
II(.32 )
n
L D i1m D:nil Di2rD~hH~il Fai2 Fbh
a,b=l
n
L (D~hF;2b) (D~i2F:2a) (D:nil H~il D j1m )
a,b=l
n q

L L TrbTraC!m
a,b=l m=l
n
LTrb tr Ar
b=l
n
L T~rTra tr C a
a,b=l
n
L Pab trC a . (19.25)
a,b=l

Let us consider the a.e. (14.1) of Theorem 30 of Section 14. The polynomials R_1
and R_2 of this expansion do not depend upon the parameter θ, and the polynomial
R_3 contains the function Y(θ). Rewriting equation (14.5) for Y(θ) through
the basis quantities (18.82), we see that Y(θ) coincides with the McCullagh
curvature (18.87). Since R = 0 when q = 1, in this case

Y = H = n^{−1} B_1 = σ² k_1²(θ)

is the Efron curvature, or to within a factor σ² is the square of the first curvature
of the curve (18.72).

19.3 THE GEOMETRY OF THE AE OF DISTRIBUTIONS OF QUADRATIC


FUNCTIONALS OF THE LSE

From the viewpoint of Subsection 18.4 the functional of Section 16

τ^{(1)} = σ^{−2}(|X − g(θ)|² − |X − g(θ̂_n)|²)

is invariant, since it is a function of the three points X ∈ ℝⁿ, g(θ), g(θ̂_n) ∈ S^q. τ^{(4)}
is invariant analogously. On the other hand τ^{(2)} and τ^{(3)} are not invariants, and
this is reflected in the properties of the a.e.-s of the d.f.-s of the functionals τ^{(1)}–τ^{(4)}.
For the formulation of the geometric results let us rewrite (16.37) in the form

P_θⁿ{τ^{(m)} ≥ z_{1−α}} = α + n^{−1} Σ_{j=1}^{3} c_j^{(m)} g_{q+2j}(z_{1−α}) + o(n^{−1}),   (19.26)

where

c_j^{(m)} = Σ_{k=1}^{16} μ_{jk}^{(m)} P_k,   j = 1, 2, 3,

μ_{1k}^{(m)} = 2λ_{0k}^{(m)},

μ_{2k}^{(m)} = −2(λ_{0k}^{(m)} + λ_{1k}^{(m)}),   (19.27)

μ_{3k}^{(m)} = 2λ_{3k}^{(m)}.

The numerical coefficients μ_{jk}^{(m)} obtained from Tables 3.2 and 3.3 are given in
Table 4.3.
Let us set

S3 =
~
00
~

Table 4.3: The coefficients μ_{jk}^{(m)}.

Also let B, Y, Q be invariants defined by formulae (18.86), (18.87), and (18.94).
The following Theorem gives the geometric interpretation of the a.e. (16.22) of
Theorem 35 of Section 16.
THEOREM 43: Under the conditions of Theorem 24 of Section 10, for k = 4 there
hold

P_θⁿ{τ^{(4)}(θ) ≥ z_{1−α}}
= α + n^{−1}{(S_1 − Q_1 + R) g_{q+2}(z_{1−α}) + (S_2 + Q + ½ B) g_{q+4}(z_{1−α})
+ S_3 g_{q+6}(z_{1−α})} + o(n^{−1}),   (19.28)

P_θⁿ{τ^{(1)}(θ) ≥ z_{1−α}}
= α + n^{−1}{(S_1 − ½ Q_1 + ½ Y) g_{q+2}(z_{1−α}) + (S_2 + ½ Q) g_{q+4}(z_{1−α})
+ S_3 g_{q+6}(z_{1−α})} + o(n^{−1})   (19.29)

uniformly in θ ∈ T.
The proof is obvious.

COROLLARY 43.1: Let the ε_j be Gaussian r.v.-s. Then

P_θⁿ{τ^{(4)}(θ) ≥ z_{1−α}} = α + R g_{q+2}(z_{1−α}) + ½ B g_{q+4}(z_{1−α}) + o(n^{−1}),   (19.30)

P_θⁿ{τ^{(1)}(θ) ≥ z_{1−α}} = α + ½ Y g_{q+2}(z_{1−α}) + o(n^{−1})   (19.31)

(since P_1–P_9 vanish for Gaussian ε_j).

If we return from (19.31) to the base a.e. (16.22) of Theorem 35 it is easy to
understand that for a Gaussian regression

(19.32)

Let s_n²/σ² be a statistic that does not depend upon τ^{(1)} and has a χ²_r distribution.
Then from (19.32) it follows that

ν(z) = P_θⁿ{ (τ^{(1)}(θ)/q) / (s_n² σ^{−2}/r) < z }
= S_{q,r}(z) − (1/8) Y Δ_{q,r}(z) + o(n^{−1}),   (19.33)

uniformly in z ∈ ℝ¹ and θ ∈ T, where S_{q,r} is the Fisher-Snedecor distribution
with q and r degrees of freedom,

Δ_{q,r}(z) = [(q/r)^{q/2} Γ(½(q+r)) / (Γ(½(q+2)) Γ(r/2))] z^{q/2} (1 + (q/r) z)^{−(q+r)/2}.

Let u_α be the quantile of the distribution S_{q,r}. Then from (19.33) there follows a
formula generalising (19.31):

1 − ν(u_α) = α + (1/8) Y Δ_{q,r}(u_α) + o(n^{−1}).   (19.34)

The relation (19.34) only differs notationally from equation (A1.26) of Beale's work
[28].
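The function Δ_{q,r} admits a simple expression through the Fisher-Snedecor density: since Γ(½(q+2)) = (q/2)Γ(q/2), one has Δ_{q,r}(z) = (2z/q) s_{q,r}(z), where s_{q,r} denotes the density of S_{q,r}. A small numerical check of this identity (the sample values of q, r, z are arbitrary):

```python
from math import gamma

# Delta_{q,r}(z) versus (2z/q) times the F(q, r) density s_{q,r}(z):
# the two agree because Gamma((q+2)/2) = (q/2) Gamma(q/2).
def delta(q, r, z):
    c = (q / r) ** (q / 2) * gamma((q + r) / 2) / (gamma((q + 2) / 2) * gamma(r / 2))
    return c * z ** (q / 2) * (1 + q * z / r) ** (-(q + r) / 2)

def f_density(q, r, z):
    B = gamma(q / 2) * gamma(r / 2) / gamma((q + r) / 2)   # Beta(q/2, r/2)
    return (q / r) ** (q / 2) * z ** (q / 2 - 1) * (1 + q * z / r) ** (-(q + r) / 2) / B

q, r, z = 3, 10, 1.7
print(abs(delta(q, r, z) - (2 * z / q) * f_density(q, r, z)) < 1e-12)   # -> True
```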
Let us note that the quantities P_6–P_13 contain the Christoffel symbols of the
second kind Γ^k_{ij}, which vanish in a geodesic coordinate system.
THEOREM 44: Let the ε_j be Gaussian r.v.-s. Then in a geodesic coordinate system,
under the conditions of Theorem 43,

P_θⁿ{τ^{(2)}(θ) ≥ z_{1−α}}
= α + R g_{q+2}(z_{1−α}) + (B + 2n^{−1} P_16) g_{q+4}(z_{1−α}) + o(n^{−1}),   (19.35)

P_θⁿ{τ^{(3)}(θ) ≥ z_{1−α}}

(19.36)

and the tails of the distributions of τ^{(1)} and τ^{(4)} satisfy (19.31) and (19.30), respectively.
Proof: The proof of this assertion is also not difficult.

19.4 GEOMETRY OF THE STATISTICAL CRITERIA FOR TESTING
HYPOTHESES ABOUT NON-LINEAR REGRESSION PARAMETERS

Let θ ∈ Θ, and let Θ be an open interval of ℝ¹. We call a reparametrisation

θ̄ = θ̄(θ) : Θ → Θ̄

of a regression function g(j, θ) regular if for the observational model

X_j = ḡ(j, θ̄) + ε_j,   j = 1, …, n,   (19.37)

where

ḡ(j, θ̄) = g(j, θ(θ̄)),

the conditions II, III, IV, VIII, and (17.4) of Section 17 are satisfied.

Here it is important to remark that not all the enumerated conditions for the
reparametrised model (19.37) automatically follow from the analogous conditions
for the initial model (0.1). This remark relates to every result of Chapter 4
associated with the reparametrisation of the model (0.1), i.e., with the passage to
another local coordinate system. Rigorously speaking, these results are true only
for regular reparametrisations.
Let us consider the statistical experiment {ℝⁿ, ℬⁿ, P̄_θ̄ⁿ, θ̄ ∈ Θ̄} generated by
the observations (19.37), and let us introduce for the reparametrised model (19.37)
the class of criteria K̄(α) analogous to the class K(α) of Section 17. It is easy
to establish a one-to-one correspondence between K(α) and K̄(α) if the criterion
Ψ_n ∈ K(α) with the statistic W and set of coefficients {a_1, a_2, β_1, …, β_6} is set in
correspondence with the criterion Ψ̄_n ∈ K̄(α) with statistic W̄ and the same set
of coefficients.
We shall say that Ψ_n ∈ K(α) is a criterion with the statistic W that is
invariant under a regular reparametrisation θ̄ if, for any θ_0 ∈ Θ and θ̄_0 = θ̄(θ_0),

(19.38)

as n → ∞. Henceforward, for simplicity we shall call such a criterion Ψ_n ∈ K(α)
invariant.
THEOREM 45: The criterion \[In E K(a) is invariant if and only if

(35 + (36 = 0, (19.39)

(31 + (32 + (33 + (34 = O.

Proof: For distributions of the statistics W and W one can write the Edgeworth
expansion with remainder term o(n- 1 ) and compare the first terms of the expan-
sions. Since they depend upon thecumulants kjv(O) of the statistics Wand the
cumulants kjv (0) of the statistics W, j = 1,2,3,4, v = 0,1,2, respectively, then
the invariance condition is the condition that kl1 (O), k20 (0), k22 (0), k31 (0), and
k42 (0) (the remaining kjv(O) = 0) depend only on statistical invariants. The rela-
tions (19.39) ensure that this conditions is satisfied. Let us note that b, c are
statistical invariants if a1 + a2 = O.

COROLLARY 45.1: There exists a unique invariant u-representable criterion, which is defined by the
coefficients

β_2 = −9,   β_3 = 27/4,   β_4 = −3/4,   β_5 = −1,   β_6 = 1,   c_1 = 1,   c_3 = 1/3.

Proof: The Corollary follows from (17.67) and (19.39).


COROLLARY 45.2: Let γ_3 = 0. Then the most powerful invariant criterion of the
class K(α) can be given by the statistic

W = σ^{−1} Λ^{1/2}(θ_0) b_1(θ_0; X)

+ (1/2) σ^{−3} Λ^{3/2}(θ_0) b_1(θ_0; X)
× {b_2(θ_0; X) − Λ(θ_0) Π_{112}(θ_0) b_1(θ_0; X)} n^{−1/2}

− (1/8) σ^{−5} Λ^{5/2}(θ_0) b_1(θ_0; X)
× {b_2²(θ_0; X) − 2Λ(θ_0) Π_{112}(θ_0) b_1(θ_0; X) b_2(θ_0; X)
+ Λ²(θ_0) Π²_{112}(θ_0) b_1²(θ_0; X)} n^{−1}.   (19.40)

Proof: In fact such a criterion can be defined by the set of coefficients

β_1 = … = β_6 = 0.

Let us denote by K^{(m)} the classes of criteria asymptotically equivalent to Ψ_n^{(m)},
m = 0, 1, 2, by U the class of u-representable criteria, and by J the class of invariant
criteria.
THEOREM 46: The following assertions hold:
(1) The criteria Ψ_n^{(m)}, m = 0, 1, 2, are not asymptotically equivalent;

(2) K^{(m)} ∩ J ≠ ∅, m = 0, 1, 2;

(3) K^{(2)} \ U ≠ ∅.

Proof: The assertion (1) follows from the relations a_1^{(m)} = m, m = 0, 1, 2, and
Theorem 38. For the proof of the remaining assertions it is sufficient to mention

examples of the relevant criteria. Let us enlarge the list (17.1)–(17.3) by the
functionals:

τ̃^{(0)} = σ^{−2} Λ(θ̂_n) b_1²(θ; X),   (19.41)

τ̃^{(1)} = σ^{−2} {J(θ) − b_2(θ; X) n^{−1/2}}^{−1} b_1²(θ; X),   (19.42)

τ̂^{(1)} = σ^{−2} {J(θ) + [Λ(θ) Π_{112}(θ) b_1(θ; X) − b_2(θ; X)] n^{−1/2}}^{−1} b_1²(θ; X),   (19.43)

(19.44)

τ̃^{(2)} = σ^{−2} φ_n(θ̂_n, θ) = σ^{−2} Σ (g(j, θ̂_n) − g(j, θ))²,   (19.45)

τ̂^{(2)} = σ^{−2} Λ(θ̂_n) b_1²(θ̂_n; X).   (19.46)

Here τ̃^{(0)} and τ̂^{(2)} are modifications of Rao's criterion, and τ̃^{(2)} = τ^{(3)} of Section 16
for q = 1. τ̃^{(1)} and τ̂^{(1)} are modifications of the Neyman-Pearson criterion presented
in the works [27, 24] (they are also a form of modification of the criteria of
Rao and Wald); τ̂^{(2)} = τ^{(4)} of Section 16 for q = 1.
For uniformity of notation we set

τ_0^{(m)} = τ^{(m)},   m = 0, 1, 2.
In Table 4.4 we have set out the coefficients a_i, β_i of the criteria Ψ_{nv}^{(m)} with the
generating statistics (17.1)–(17.3), (19.41)–(19.46) (empty cells correspond to zero
values of the coefficients).
Analysing Table 4.4 it is not difficult to observe that

Ψ_{n0}^{(0)}, Ψ_{n0}^{(1)}, Ψ_{n2}^{(1)}, Ψ_{n2}^{(2)}, Ψ_{n3}^{(2)} ∈ J,

i.e., assertion (2) of the Theorem is proved.
The inclusion

U ⊂ K^{(2)}

follows from (17.67). On the other hand,

Ψ_{n3}^{(2)} ∈ K^{(2)} \ U,

and this proves (3).

It is easy to notice that

Ψ_{n0}^{(2)}, Ψ_{n1}^{(2)}, Ψ_{n2}^{(2)} ∈ U,   i.e.   Ψ_{n2}^{(2)} ∈ J ∩ U.

As Corollary 45.1 shows, such a criterion is unique, and we have described it earlier
not knowing the generating statistic τ̂^{(2)} (see the penultimate row of Table 4.4).

Table 4.4: Coefficients of the criteria Ψ_{nv}^{(m)}.
Appendix

I SUBSIDIARY FACTS

For ease of reference, in this Appendix a series of results is included on which the
presentation of the principal material is based. In many cases an assertion is given
in a 'uniform' form.
Let us consider the independent r.v.-s ξ_1, …, ξ_n and let us set S_n = Σ ξ_j.
The following assertion is due to Petrov ([172] pp. 52-54) and strengthens
Bernstein's inequalities.
THEOREM A.1: Let there exist positive constants r_1, …, r_n and R such that

j = 1, …, n,   |t| ≤ R.

Let us set

Then

P{|S_n| ≥ x} ≤ e^{−Rx/2}   if x ≥ GR.

Lemma 5 on p. 54 of the book [172] clarifies the probabilistic sense of the
conditions of Theorem A.1.
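As an illustration (assuming, as in Petrov, that G = r_1 + … + r_n): for Rademacher summands one has E e^{tξ_j} = cosh t ≤ e^{t²/2} for all t, so one may take r_j = 1 and any R > 0, and the bound can be checked by exact enumeration.

```python
from itertools import product
from math import exp

# Theorem A.1 for n Rademacher variables: r_j = 1, so G = n (assuming
# G = r_1 + ... + r_n as in Petrov); take R = 1 and x = n >= GR.
n, R, x = 10, 1.0, 10.0
G = float(n)
prob = sum(1 for eps in product([-1, 1], repeat=n) if sum(eps) >= x) / 2 ** n
print(prob, exp(-R * x / 2))   # 0.0009765625 <= 0.006737946999085467
```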

THEOREM A.2: Let

j = 1, …, n.

Let us set

Then

E|S_n|^s ≤ χ(s)(M_{s,n} + B_n^{s/2}),   s ≥ 2,

E|S_n|^s ≤ (2 − n^{−1}) M_{s,n},   1 ≤ s ≤ 2.

The first inequality is due to Rosenthal ([173] p. 86), the second to Berry
and Esseen ([173] p. 98). In particular, for the r.v.-s ξ_j = g_j ε_j, j ≥ 1, where g_j
is a sequence of numbers and ε_j a sequence of independent identically distributed
r.v.-s, we obtain

E|S_n|^s ≤ χ(s) (μ_s Σ |g_j|^s + μ_2^{s/2} (Σ g_j²)^{s/2})

≤ χ(s) (μ_s + μ_2^{s/2}) (Σ g_j²)^{s/2},   s ≥ 2,

E|S_n|^s ≤ μ_2^{s/2} (Σ g_j²)^{s/2},   1 ≤ s ≤ 2.
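A small deterministic check of the last inequality (for mean-zero summands it is Jensen's inequality): take i.i.d. Rademacher ε_j, so that μ_2 = 1, and enumerate all sign patterns. The weights g_j are arbitrary illustrative values.

```python
from itertools import product

# E|S_n|^s <= mu_2^{s/2} (sum g_j^2)^{s/2} for S_n = sum g_j eps_j with
# Rademacher eps_j (mu_2 = 1) and 1 <= s <= 2, by exact enumeration.
g, s = [1.0, 2.0, 3.0, 4.0], 1.5
patterns = list(product([-1, 1], repeat=len(g)))
moment = sum(abs(sum(e * gj for e, gj in zip(eps, g))) ** s
             for eps in patterns) / len(patterns)
bound = sum(gj ** 2 for gj in g) ** (s / 2)
print(moment <= bound)   # -> True
```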



THEOREM A.3: Let η(u) be a separable and measurable random field defined on
the closed set F ⊆ ℝ^q, and for any u, u + h ∈ F,

for some s ≥ m > q and a locally bounded function

l(u) : ℝ^q → ℝ¹.

Then for any Q, h and ε > 0

P{ sup_{u′,u″ ∈ F∩v^0(Q), |u′−u″|_0 ≤ h} |η(u′) − η(u″)| > ε } ≤ χ_0 ( sup_{u ∈ F∩v^0(Q)} l(u) ) Q^q h^{m−q} ε^{−s},

where the constant χ_0 depends upon s, m and q and does not depend upon Q, h, ε
and the set F.
In particular, when the conditions outlined above are satisfied,

P{ sup_{u′,u″ ∈ F∩v^0(Q)} |η(u′) − η(u″)| > ε } ≤ χ̄_0 ( sup_{u ∈ F∩v^0(Q)} l(u) ) Q^m ε^{−s},

where χ̄_0 does not depend on Q, ε or F.


Theorem A.3 is close to Theorem 19 of the Appendix of the book [120] by
Ibragimov and Has'minskii and to the theorems of Section 1 of Chapter 2 of the
book [218] by Yadrenko.

THEOREM A.4: Let ξ_n, n ≥ 1, be a sequence of independent identically distributed
r.v.-s

for some s ≥ 1. Then

(1) for any r > 0,

where

ε_n = o(n^{−s+1})

and is independent of r, and m(n^{−1}S_n) is the median of the r.v. n^{−1}S_n.

The assertion (1) of Theorem A.4 coincides with Theorem 28 of the book [172]
p. 286. Assertion (2) is proved in the same way as Theorem 27 on p. 283 of the
same book (Billinger, Baum and Katz). See also the work of Nagaev and Fuk.

THEOREM A.5: On the statistical experiment {ℝⁿ, ℬⁿ, P_θⁿ, θ ∈ Θ} let there be
given a triangular array ξ_{jn}, j = 1, …, n, n ≥ 1, of r.v.-s independent in each row
and having finite absolute moments of order s for some integer s ≥ 3,

θ ∈ Θ, j = 1, …, n, n ≥ 1.

Let us assume that the quantities

σ_n²(θ) = n^{−1} Σ D_θ ξ_{jn}   and   ρ_{s,n}(θ) = n^{−1} Σ E_θ |ξ_{jn}|^s

for some set T ⊂ Θ satisfy the relations

lim inf_{n→∞} inf_{θ∈T} σ_n²(θ) > 0,   lim sup_{n→∞} sup_{θ∈T} ρ_{s,n}(θ) < ∞.

Then

sup_{θ∈T} P_θⁿ { |n^{−1/2} Σ (ξ_{jn} − E_θ ξ_{jn})| > a_n σ_n(θ) } ≤ χ_n(T) n^{−(s−2)/2} a_n^{−s},

where

χ_n(T) ≤ χ(T) < ∞

is a bounded sequence, and a_n is any sequence of numbers satisfying the condition

a_n ≥ (s − 2 + δ)^{1/2} log^{1/2} n

for any given δ > 0.



Theorem A.5 is a one-dimensional variant, uniform in θ ∈ T, of Corollary 17.13
on pp. 179-180 of the book [33] and generalises the assertions about the probabilities
of moderate deviations (Amosova, Rubin and Sethuraman [172], p. 254).

Also, as in [33], it is possible to strengthen the result stated, namely: if it is


known that

then

THEOREM A.6: On the statistical experiment {ℝⁿ, ℬⁿ, P_θⁿ, θ ∈ Θ} let there be
given a triangular array η_{jn}, j = 1, …, n, n ≥ 1, of r.v.-s independent in each row
with d.f.-s P_{jn}(θ, x), and let T ⊂ Θ be a set of parameters.
In order that for any ε > 0

sup_{θ∈T} P_θⁿ { n^{−1} |Σ η_{jn}| > ε } → 0 as n → ∞,

it is sufficient that for any ε > 0 and some τ > 0

(1) sup_{θ∈T} Σ ∫_{|x|≥τn} P_{jn}(θ, dx) → 0 as n → ∞,

(2) → 0 as n → ∞,

(3) sup_{θ∈T} n^{−1} Σ ∫_{|x|<τn} x P_{jn}(θ, dx) → 0 as n → ∞.

Theorem A.6 is a modification of the theorem of Section 1 of Chapter 1 of the


book [172].

THEOREM A.7 ([172] p. 272): Let ξ_n, n = 1, 2, …, be a sequence of independent
r.v.-s with zero mathematical expectation, and for some p, 1 ≤ p ≤ 2, and a
sequence a_n ↑ ∞ as n → ∞, let

Σ_{j=1}^{∞} E|ξ_j|^p / a_j^p < ∞.

Then

a_n^{−1} S_n → 0 a.c. as n → ∞.

THEOREM A.8 ([172] p. 994): Let ξ_n, n = 1, 2, …, be a sequence of independent
r.v.-s with d.f.-s P_j(x). Let the conditions

(1) Σ_{j=1}^{∞} P{|ξ_j| ≥ j} < ∞,

(3) n^{−1} Σ ∫_{|x|<n} x P_j(dx) → 0 as n → ∞

be satisfied. Then

n^{−1} S_n → 0 a.c. as n → ∞.

If z ∈ ℝ^q, then the notation zz′ means the product of the (q × 1) matrix z (z
being a column vector) with the (1 × q) matrix z′ (z′ being a row vector).
THEOREM A.9: Let ξ_{jn}, j = 1, …, n, n ≥ 1, be a triangular array of q-dimensional
(q ≥ 1) random vectors, independent in each row, given
on the statistical experiment {ℝⁿ, ℬⁿ, P_θⁿ, θ ∈ Θ}. Let

j = 1, …, n, n ≥ 1,

θ ∈ T,

be the (uniformly in θ ∈ T) positive definite matrix;

ρ_3 = lim_{n→∞} sup_{θ∈T} ρ_{3,n}(θ) < ∞.

Then there exists a constant c(q) < ∞ such that

Theorem A.9 is a variant, uniform in θ, of Corollary 17.2 on p. 165 of
the book [33].
The following assertion is closely related to Theorem A.9.
THEOREM A.10: Let ξ_{jn}(z), z ∈ Z_n, j = 1, …, n, n ≥ 1, be a triangular array of
random processes, independent in each row, defined on the statistical
experiment {ℝⁿ, ℬⁿ, P_θⁿ, θ ∈ Θ}, and let Z_n be a monotonically expanding sequence of
subsets of ℝ¹. Let

j = 1, …, n, n ≥ 1, z ∈ Z_n.

Let us denote

and let us assume that

Λ_* = lim_{n→∞} inf_{θ∈T, z∈Z_n} σ_n²(θ, z) > 0,

ρ_3 = lim sup_{n→∞} sup_{θ∈T, z∈Z_n} ρ_{3,n}(θ, z) < ∞.

Then there exists a constant c > 0 such that

sup_{θ∈T, z∈Z_n} sup_{y∈ℝ¹} | P_θⁿ { n^{−1/2} Σ ξ_{jn}(z) < y σ_n(θ, z) } − Φ(y) | ≤ c Λ_*^{−3/2} ρ_3 n^{−1/2}.

THEOREM A.11: Let g be a non-negative differentiable function on [0, ∞) such
that

(1) b = ∫_0^∞ |g′(t)| t^{q−1} dt < ∞,

(2) lim_{t→∞} g(t) = 0.

Then for any convex set C ∈ ℭ^q and any ε > 0 there holds the inequality

∫_{C^ε \ C} g(|x|) dx ≤ b (2π^{q/2} / Γ(q/2)) ε.

This bound also holds for the integral

∫_{C \ C^{−ε}} g(|x|) dx.

The full proof of Theorem A.ll is contained in Section 3 of the book [33]
and is associated with the names of Ranga Rao [190], Sazonov [221,222] and von
Bahr [10].

THEOREM A.12: Let a r.v. ε_j with c.f. ψ(λ) have density p(x),

sup_{x∈ℝ¹} p(x) = p_0 < ∞,

and let μ_2 < ∞. Then the following inequalities hold:

(1) |ψ(λ)| ≤ exp{ −λ² / (A p_0² μ_2) },

where A is an absolute constant;

(2)

The assertion (1) was obtained by Survila [207], and assertion (2) is due to
Statulevičius [206].
Let μ be a finite signed measure on (ℝ^p, ℬ^p). With the signed measure μ there
are associated three set functions μ_+, μ_− and |μ|, which are called the positive,
negative, and absolute variations of the signed measure μ, and

From the Hahn-Jordan decomposition [89] it follows that μ_+, μ_−, |μ| are finite
measures on (ℝ^p, ℬ^p) and that

Clearly, for any B ∈ ℬ^p

The following Theorem indicates one subtle property of the absolute variation
|μ| of the signed measure

μ = (def) Q_n(θ) − Σ_{r=0}^{k−2} n^{−r/2} P_r(−Φ; {χ̄_ν(θ)}),   P_0 = Φ.

THEOREM A.13: On the statistical experiment {ℝⁿ, ℬⁿ, P_θⁿ, θ ∈ Θ} let there be
given a triangular array ξ_{jn}, j = 1, …, n, n ≥ 1, of random vectors, independent
in each row, with values in ℝ^p, having zero means. Let us assume that:

(1)

(2) There exists an integer u > 0 such that the functions

Ψ_{m,n}(θ, t) = Π_{j=m+1}^{m+u} | E_θ exp{ i (K_n^{−1/2}(θ) ξ_{jn}, t) } |,

0 ≤ m ≤ n − u,   n ≥ u + 1,

satisfy the condition

sup_{0≤m≤n−u, n≥u+1} sup_{θ∈T} ∫_{ℝ^p} Ψ_{m,n}(θ, t) dt < ∞,

and for any number b > 0

sup_{0≤m≤n−u, n≥u+1} sup_{|t|≥b, θ∈T} Ψ_{m,n}(θ, t) < 1,

for some integer k ≥ 2.

Then for the distribution Q_n(θ) of the sum n^{−1/2} K_n^{−1/2}(θ) Σ ξ_{jn} there holds
the relation

where χ̄_ν(θ) is the arithmetic mean of the cumulants of νth order (|ν| = 3, …, k + 1)
of the vectors K_n^{−1/2}(θ) ξ_{jn}, j = 1, …, n.
Theorem A.13 is a uniform variant of one of the assertions in Section 19 of the
book [33] (cf. p. 206, equation (19.100)).

COROLLARY A.13.1: Under the conditions of Theorem A.13

sup_{θ∈T} sup_{B∈ℬ^p} | ∫_B Q_n(θ)(dx) − ∫_B ( φ(x) + Σ_{r=1}^{k−2} n^{−r/2} P_r(−φ; {χ̄_ν(θ)})(x) ) dx |
= O(n^{−(k−1)/2}).

In fact, let us denote by

‖μ‖ = ∫_{ℝ^p} |μ|(dx)

the variational norm of the signed measure μ. Then by Theorem A.13

On the other hand, for any B ∈ ℬ^p
In many assertions in the book the Chebyshev-Hermite polynomials

H_s(z) = (−1)^s e^{z²/2} (d^s/dz^s) e^{−z²/2},   s = 0, 1, 2, …,

have been used. The first ten polynomials H_s have the forms:

H_0(z) = 1,
H_1(z) = z,
H_2(z) = z² − 1,
H_3(z) = z³ − 3z,
H_4(z) = z⁴ − 6z² + 3,
H_5(z) = z⁵ − 10z³ + 15z,
H_6(z) = z⁶ − 15z⁴ + 45z² − 15,
H_7(z) = z⁷ − 21z⁵ + 105z³ − 105z,
H_8(z) = z⁸ − 28z⁶ + 210z⁴ − 420z² + 105,
H_9(z) = z⁹ − 36z⁷ + 378z⁵ − 1260z³ + 945z.
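The listed polynomials can be checked against the recurrence H_{s+1}(z) = z H_s(z) − s H_{s−1}(z), which follows from the definition above; a short sketch:

```python
# Chebyshev-Hermite polynomials via the recurrence
# H_{s+1}(z) = z H_s(z) - s H_{s-1}(z), with H_0 = 1, H_1 = z.
def hermite(s, z):
    h_prev, h = 1.0, z
    if s == 0:
        return h_prev
    for k in range(1, s):
        h_prev, h = h, z * h - k * h_prev
    return h

z = 1.5
print(abs(hermite(6, z) - (z**6 - 15*z**4 + 45*z**2 - 15)) < 1e-9)   # -> True
```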
The cumulant γ_k of arbitrary order of some r.v. ε is expressed through the
moments m_1, …, m_k of the r.v. considered by the formula [172]

the summation Σ* being performed over all integral non-negative solutions of the
equation

In particular,

and conversely

If

ε = ε_j and

then the formulae presented are simplified:

γ_3 = m_3,
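In the general (non-centred) case the first three relations read κ_1 = m_1, κ_2 = m_2 − m_1², κ_3 = m_3 − 3m_1m_2 + 2m_1³. A quick check on the Poisson(λ) law, whose cumulants of every order equal λ (its raw moments m_1 = λ, m_2 = λ + λ², m_3 = λ + 3λ² + λ³ are standard facts assumed here):

```python
# Cumulants from raw moments, checked on Poisson(lam), whose cumulants
# of every order equal lam.
def cumulants(m1, m2, m3):
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    return k1, k2, k3

lam = 0.7
k1, k2, k3 = cumulants(lam, lam + lam**2, lam + 3*lam**2 + lam**3)
print(max(abs(k1 - lam), abs(k2 - lam), abs(k3 - lam)) < 1e-12)   # -> True
```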

II LIST OF PRINCIPAL NOTATIONS

ℝ^m, m ≥ 1, is the m-dimensional Euclidean space.

ℬ^m is the σ-algebra of Borel subsets of ℝ^m.
ℭ^m is the class of all convex Borel subsets of ℝ^m.
If

then

(x, y) is the scalar product in ℝ^m.

v(r) = {u ∈ ℝ^q : |u| < r}.
v^0(r) = {u ∈ ℝ^q : |u|_0 < r}.
ℕ is the set of natural numbers.
q is the number of unknown parameters.
n is the size of a sample.
A^c ⊆ ℝ^m is the closure of the set A in ℝ^m.
Ā is the complement of the set (the event) A.
A^δ = ∪_{|p|≤1} (A + pδ) is an exterior set parallel to A ∈ ℬ^m, p ∈ ℝ^m, δ > 0.
A^{−δ} = ℝ^m \ (ℝ^m \ A)^δ is an interior set parallel to A.
Θ ∈ ℬ^q is a parametric set.
T ⊂ Θ is a compact set.
d_n(θ) = diag(d_{in}(θ), i = 1, …, q), θ ∈ Θ.
U_n(θ) = d_n(θ)(Θ − θ), Ū_n(θ) = n^{−1/2} U_n(θ), U(θ) = Θ − θ.
χ{A} is the indicator of the event A.
r.v. - random variable.

d.f. - distribution function.

c.f. - characteristic function.
a.c. - almost certainly.
l.s.e. - least squares estimator; θ̂_n are l.s.e.-s.
l.m.e. - least moduli estimator; θ̂_n are l.m.e.-s.
m.c.e. - minimal contrast estimator.
a.e. - asymptotic expansion.
s.a.e. - stochastic a.e.
P(x) is the d.f. of the r.v. ε_j.
p(x) is the density of the r.v. ε_j.
ψ(λ) is the c.f. of the r.v. ε_j.
X_1, …, X_n are observations.
ε_1, …, ε_n are the random errors of observations.

P_θ, P_θⁿ are measures induced by a sequence of observations X_1, …, X_n, … of the
form (0.1).
E_θ (E_θⁿ) is the mathematical expectation under the measure P_θ (P_θⁿ).
D_θ ξ = E_θ ξ² − (E_θ ξ)² is the variance of the r.v. ξ.
E is the mathematical expectation under the measure P.
m_s = E ε_j^s.
μ_s = E |ε_j|^s.
γ_s is the cumulant of degree s of the r.v. ε_j.

Σ = Σ_{j=1}^{n}.

s_n² = n^{−1} Σ ε_j².

g(j, θ), j ∈ ℕ, θ = (θ¹, …, θ^q) ∈ Θ^c, is the regression function.

If g(j, θ) is differentiable, then

d_{in}²(θ) = Σ (g_i(j, θ))²,   i = 1, …, q,

g_i = ∂g/∂θ^i.

L(θ) = Σ [X_j − g(j, θ)]².

R(θ) = Σ |X_j − g(j, θ)|; R is also the Ricci curvature.

φ_n(θ_1, θ_2) = Σ (g(j, θ_1) − g(j, θ_2))².

Φ_n(u_1, u_2) = φ_n(θ + d_n^{−1}(θ)u_1, θ + d_n^{−1}(θ)u_2),
300 APPENDIX

u₁, u₂ ∈ U_n^c(θ).
f(j, u) = g(j, θ + n^{1/2} d_n^{-1}(θ) u), u ∈ Ū_n^c(θ).

g_{i₁…i_r} = (∂^r / ∂θ_{i₁} ⋯ ∂θ_{i_r}) g, r ≥ 1.
f_{i₁…i_r}(j, u) = g_{i₁…i_r}(j, θ + n^{1/2} d_n^{-1}(θ) u).
α = (α₁, …, α_q) is a multi-index, with |α| = α₁ + ⋯ + α_q,
f^{(α)}(j, u) = g^{(α)}(j, θ + n^{1/2} d_n^{-1}(θ) u),
g^{(α)} = (∂^{|α|} / (∂θ₁)^{α₁} ⋯ (∂θ_q)^{α_q}) g.
Φ_n^{(α)}(u₁, u₂) = ∑ (f^{(α)}(j, u₁) − f^{(α)}(j, u₂))², u₁, u₂ ∈ Ū_n^c(θ).

d_n(θ) ≡ n^{1/2} I_q is the standard normalisation.


I_m is the identity matrix of order m.
I(θ) = (n^{-1} ∑ g_i(j, θ) g_l(j, θ))_{i,l=1}^{q} if the standard normalisation is used.
A(θ) = (A_{il}(θ))_{i,l=1}^{q} = I^{-1}(θ).
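To make the matrices I(θ) and A(θ) = I^{-1}(θ) concrete, here is a small numerical sketch (hypothetical model, not from the book) with g(j, θ) = θ₁j + θ₂j², whose derivatives g₁(j) = j and g₂(j) = j² do not depend on θ:

```python
# Illustration (hypothetical model): the normalised matrix
# I(theta) = ( n^{-1} * sum_j g_i(j, theta) * g_l(j, theta) )_{i,l=1}^q
# and A(theta) = I^{-1}(theta), for g(j, theta) = theta_1*j + theta_2*j**2.

n = 4
derivs = [lambda j: j, lambda j: j ** 2]   # g_1 and g_2
q = len(derivs)

I = [[sum(derivs[i](j) * derivs[l](j) for j in range(1, n + 1)) / n
      for l in range(q)]
     for i in range(q)]

# A(theta) = I^{-1}(theta); for q = 2 the inverse can be written out directly.
det = I[0][0] * I[1][1] - I[0][1] * I[1][0]
A = [[ I[1][1] / det, -I[0][1] / det],
     [-I[1][0] / det,  I[0][0] / det]]
print(I, det)
```

Positivity of det (here 38.75) reflects the positive definiteness of I(θ) that the eigenvalue notation λ_min(A), λ_max(A) below presupposes.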
λ_min(A) (λ_max(A)) is the minimal (maximal) eigenvalue of the real symmetric
positive definite matrix A.
Φ_K(x), x ∈ ℝ^m, is the d.f. of the Gaussian random vector (m > 1) with zero mean
vector and correlation matrix K (the d.f. of the Gaussian r.v. (m = 1) with zero
mathematical expectation and variance K).
φ_K(x), x ∈ ℝ^m, is the Gaussian density corresponding to the d.f. Φ_K(x).
Φ(x) = Φ_{I_m}(x), φ(x) = φ_{I_m}(x), m ≥ 1.

Φ(C) = ∫_C φ(x) dx.

etc.

P_s(it; {χ_ν(θ)}), P_s(−φ_K; {χ_ν(θ)}) are polynomials of the a.e. of Section 10 (no-
tations are taken from [33]).
■ - end of proof, end of example, end of remark.
Commentary

CHAPTER 1
1. A lot of attention was paid to questions of the existence of the l.s.e. in the works
of Demidenko [70,71]; see also the work of Gallant [83]. The inequality (1.6) was
first made use of by Malinvaud [152] for the proof of the consistency of the l.s.e.
2. The results of Section 2 are basically new. A result close to Theorem 2
is contained in the work of Dzhaparidze and Sieders [75]. Prakasa Rao [184]
considered the case of Gaussian regression. Exponential bounds for the probability
of large deviations of the l.s.e. for an infinite dimensional model of non-linear
regression were presented in the work of Kukush [146].
3. Power bounds of the probability of large deviations for the l.s.e. of the form
(3.4) for q = 1 were first obtained by Dorogovtsev [72] and then by the author
[124]. Theorem 8 was published in the author's work [128] and generalises a result
of Malinvaud [152]. Theorem 9 is owed to the author [129]. The property of
consistency of the l.m.e. was discussed in the works of Oberhofer [167], Kozlov
and the author [132,133] and others. Theorems 10 and 11 were published in a
note by the author [130].
4. This Section contains a revised form of the results of the author's work
[128]. A series of assertions about the consistency of statistical estimators of
the amplitude and angular frequency of harmonic oscillations is contained in the
book by Ibragimov and Has'minskii [120]. Other references may be found in the
commentary on Section 5.
5. In this Section the generalisation to the case of non-identical distributions
of observations of the concept of minimal contrast defined in the work of Pfanzagl
[174] referred to by Kozlov and the author [133] is considered. The method of the
proof of the consistency, used in Theorem 13, goes back to Wald [208] and Le Cam
[216], and in conformity with various statistical estimates has been developed by
many authors: see the bibliography in the article by Wolfowitz [213] and the book
of Zacks [216], also the works of Pfanzagl [174], Dorogovtsev [73], Knopov [144] and
others. The problem of the detection of hidden periodicities in a setting similar
to that of our book was solved in the works of Whittle [211], Walker [209,210],
Hannan [109,110], Has'minskii [120], Knopov [144], Kutoyants [147], the author
[126], and others.


The theme of Sections 2-5 is close to the work of Prakasa Rao [185-187],
van de Geer [86], Bunke [43], Jennrich [138], Hartley and Baker [112], and Birge
and Massart [35].
6. One example, considered at length and related to taking the logarithm of a
non-linear regression model, can be found in Rao [189]. Other examples may be
found in the works of Demidenko [70,71] and in Christensen's monograph [67].

CHAPTER 2
7. The first s.a.e.-s of maximum likelihood estimators in a scheme of identically
distributed observations were obtained by Linnik and Mitrofanova [149,150,162].
Then, under weakened restrictions, such results for more general m.c.e.-s were ob-
tained by Chibisov [55-64] and Pfanzagl [176,177]. Michel [158,159] proposed
an approach, related to the Newton-Raphson computational procedure, for obtaining
the s.a.e. of statistical estimators. The s.a.e.-s of different estimators are
obtained in the works of Burnashev [44,45], Gusev [102], and many other authors.
In the work of Bhattacharya and Ghosh [32] broad regularity conditions ensuring
the existence of a s.a.e. of statistical estimators are given. The s.a.e. of the l.s.e.
for the model (0.1) was first obtained by Zwanzig and the author [136], and in
a more general form by Bardadym and the author [14]. Theorem 17 is a revised
variant of the results [136] and uses the scheme of proof in the work of Chibisov
[59]. Theorem 19 has not been previously published.
8. The bound in Theorem 20 for q = 1 for the m.c.e. in a scheme of independent
identically distributed observations was obtained by Michel and Pfanzagl [160]. For
the l.s.e. (q = 1) this result is contained in the work of the author [125]. The first
results on the approximation of the distributions of statistical estimators by a
Gaussian distribution with a rate O(n^{-1/2}) (the Berry-Esseen inequality) are due
to Pfanzagl [175,178]. One sharpening of such results, related to the computation
of the constants in the Berry-Esseen inequality for the m.c.e., was obtained by
Michel [157]. The Berry-Esseen inequalities were also obtained by Prakasa Rao
[183,187], and Bhattacharya and Ghosh [32]. For the distribution of the l.s.e. of a
scalar parameter, using the method of [175] the author obtained [123] the Berry-
Esseen inequality. Theorems 21 and 22 substantially strengthen the results of [123]
and have not been published before. The property of asymptotic normality of the
l.s.e. of an infinite dimensional parameter was studied by Dorogovtsev [74] and
Kukush [146].
9. The asymptotic normality of the l.m.e. for linear and non-linear regression
was studied by Bloomfield and Steiger [37,38], Bassett and Koenker [23], Gure-
vich [101], and van de Geer [87]. Theorem 23 was published in [129]. Various
generalisations of it in the case of correlated observations are contained in the
articles of Borshchevsky and the author [40,41]; cf. also the book of Leonenko
and the author [134]. The work [15] of Bardadym and the author is devoted to
the asymptotic normality of the l_α-estimators.

10. In recent years the theory of the a.e. of the probabilistic characteristics of statis-
tical estimators has been largely constructed. In addition to the source literature,
mentioned in the commentary on Section 7, we refer to the important theoretical
investigations of Bhattacharya [30,31], Chibisov [58], Hayakawa [113], Chandra
and Ghosh [47,48], Pfanzagl [180], Goetze [91], Skovgaard [205], and Pfanzagl and
Wefelmeyer [181].
An alternative to the Edgeworth expansions of the distributions of statistical
estimators is given by the expansions obtained by the saddlepoint approximation
method, which have a series of useful statistical properties. From the large amount
of current literature on this problem we point to the publications of Robinson et
al. [139], Shuteev [204], and Jing and Robinson [139], which are close to the theme
of our book.
The application of the saddlepoint approximation to non-linear regression analysis
will be carried out in the near future.
The first variants of Theorem 24 were obtained by the author [124] for q = 1
and by Zwanzig and the author [135,137] for q ≥ 1. A full proof of Theorem 24
on the a.e. of a distribution of the l.s.e. with weakened regularity conditions is
published for the first time.
11. The calculation of the initial terms of the a.e. of Theorem 24 and others
is a subject of much applied interest, and is a laborious task. In the calculations
of Section 11 we follow the scheme of Michel's work [158]. Very helpful machinery
for a.e.-s in statistics is developed in the book by Barndorff-Nielsen and Cox [223].

CHAPTER 3
12. The a.e. of the moments of some statistical estimators of a scalar parameter
were obtained by Gusev [102] in connection with a problem of Linnik, see also the
work of Burnashev [45]. For the model (0.1) with Gaussian errors of observation
the first terms of the a.e. of the correlation matrix of the normed l.s.e. without
accounting for the remainder terms was found by Clarke [68]. The first terms of
the a.e. of the bias vector of an l.s.e. were obtained earlier by Box [42]. The Section
contains a revised version of results of the author [127].
13. Theorem 26 generalises a result of Bardadym and the author [13]. Theo-
rem 29 was published in a note by Ivanitskaya and the author [122] and is an essen-
tial strengthening of the results of Schmidt and Zwanzig [199,217]. Lemma 29.1
is a special case of more general assertions of Sadikova [197] and Yurinsky [219],
based on a lemma of van der Corput. Nevertheless, the proof of Lemma 29.1 uses
other conditions and is not based on van der Corput's lemma. One general fact
about the c.f. of singular distributions is contained in Bhattacharya [30]. Our
Lemma 29.2 is close to one of the assertions of Qumasiyeh [188].
14. The polynomials Rl and R2 can, in fact, be obtained within the framework
of the linear theory, see also the article by Schmidt and Zwanzig [199]. The
polynomial R3 was found by the δ-method by Grigor'ev [92]. The results of the
Section were published in the work [97] by Grigor'ev and the author.

15. This Section contains, in a revised form, the results of the work of Bar-
dadym and the author [17-19,11]. Theorems 31 and 32 generalise and sharpen
the results of Zwanzig [217]. The work of Shao [220] borders on the theme of this
Section, and we also point to the monograph of Hall [105].
16. Theorem 35 is a basic result of the work of Grigor'ev and the author [99].
The preliminary assertions about the a.e. of the distributions of the functionals
T(4) and T(1) were obtained by Bardadym and the author [12,16]. The first a.e. of
the distribution of T(1) was in fact found by Beale [28]. The χ² a.e. for the likelihood
ratio and other statistics is contained in the principal work of Chandra and Ghosh
[47], see also the work of Hayakawa [113].
The results of the section together with the corresponding results of Chapter 4
immediately border upon the problem of the construction of intervals and regions
of confidence for the non-linear regression parameter. At different times this prob-
lem was considered by Beale [28], Halperin [106], Hamilton, Watts and Bates [24],
Khorosani and Milliken [143], Pazman [168], Bates and Watts [26], Hamilton [107],
and others.
17. We follow the account in the article of Grigor'ev and the author [100].
The extension of the results of this Section to the case q ≥ 2 is more likely to be
associated with difficulties that are calculational rather than of principle.

CHAPTER 4
18. Many publications have been dedicated to the development of geometric
methods in mathematical statistics. We must mention the works of Chentsov [53],
Amari [3-5], Barndorff-Nielsen and Cox [21], Barndorff-Nielsen, Cox and Reid [22],
McCullagh [155], McCullagh and Cox [156], Efron [76,77], Eguchi [78], Saville and
Wood [198], Murray and Rice [164], to name but a few, see the surveys by Kass
[141], and Morozova and Chentsov [54].
The presentation of Section 18 is based on the texts of the theses of Grigor'ev
[93] and the author [131] and their joint publications [94,95]. Subsections 1 and 2
are presented in the spirit of the approach of Amari [4,5], see also [156]. The
measures of non-linearity of a regression model were first studied by Beale [28]
and then by Guttman and Meeter [103], Bates and Watts [25,26], Bates, Watts
and Hamilton [24], and Hamilton and Wiens [108]. The reparametrisation of non-
linear regression models was studied by Hougaard [116,117], Kass [140], Ross [195],
see also the works [25,26] cited above. A series of geometric questions in the theory
of non-linear regression was considered by Clarke [69], Pazman [168,169] (see also
his works [170, 171]). The proof of Theorem 40 is borrowed from [93].
19. Box [42] indicated the possibility of decreasing the bias of the l.s.e. with the
help of the reparametrisation of a model. The representation (19.8) was obtained
by Grigor'ev and the author [95]. The presentation used here for the material of
Subsection 1 is based on the theses of Grigor'ev [93] and the author [131]. The
result of Subsection 2 on the invariance with respect to the reparametrisation

of the first three polynomials of the a.e. of a distribution of an estimator of the


variance of errors of observation in non-linear Gaussian regression is the principal
result of Grigor'ev and the author in [97]. The presentation of the material in
Subsections 4 and 5 follows the works of Grigor'ev and the author [98-100].
Bibliography

[1] Abramowitz, M. and Stegun, I.A. (1964) Handbook of Mathematical


Functions, (National Bureau of Standards: Washington, DC)
[2] Alekseevskii, D.V., Vinogradov, A.M. and Lychagin, V.V. (1989) The Ba-
sic Concepts and Notations of Differential Geometry, The Modern Prob-
lems of Mathematics. Fundamental Directions 28, Geometry-I, (VINITI
(Vsesoyuznyi Institut Nauchnoi i Tekhnicheskoi Informatsii): Moscow) (in
Russian)
[3] Amari, S.-I. (1982) Differential Geometry of Curved Exponential Families -
Curvatures and Information Loss, Ann. Statist., 10, 357-385
[4] Amari, S.-I. (1985) Differential Geometrical Methods in Statistics, Lecture
Notes in Statistics, (Springer)
[5] Amari, S.-I. (1986) Differential Geometrical Theory of Statistics - Towards
New Developments, Differential Geometry in Statistical Inference, IMS
Monographs, (Institute of Mathematical Statistics: Hayward, CA)
[6] Anderson, T.W. (1958) An Introduction to Multivariate Statistical Analysis,
(Wiley)
[7] Aitken, A.C. (1935) On Least Squares and Linear Combination of Observa-
tions, Proc. Roy. Soc. Edin. A, 35, 42-48
[8] Arkin, V.I. and Evstigneev, I.V. (1979) Probabilistic Models of Control and
the Dynamics of Economics, (Nauka: Moscow), (in Russian)
[9] Bard, Y. (1974) Nonlinear Parameter Estimation, (Academic Press)
[10] von Bahr, B. (1967) Multi-Dimensional Integral Limit Theorems, Arkiv
Math., 7, 71-88
[11] Bardadym, T.A. (1996) On the Structure of Asymptotic Expansions of Least
Squares Estimator and Variance Estimator, Teor. Veroyatnost. Mat. Statist.,
(to appear), (in Ukrainian)
[12] Bardadym, T.A. and Ivanov, A.V. (1985) An Asymptotic Expansion Related
to an Empirical Regression Function, Theor. Prob. Math. Statist., 30, 7-13


[13] Bardadym, T.A. and Ivanov, A.V. (1986) Asymptotic Expansions Related
to the Estimation of the Error Variance of Observations in the 'Signal Plus
Noise' Model, Theor. Prob. Math. Statist., 33, 11-20
[14] Bardadym, T.A. and Ivanov, A.V. (1985) On a Polynomial Approximation
of Least Squares Estimator Using Non-Standard Normalisation, Dokl. Akad.
Nauk Ukrain. A, 7, 59-61 (in Russian)

[15] Bardadym, T.A. and Ivanov, A.V. (1988) Asymptotic Normality of l_α-
Estimator of Non-Linear Regression Model Parameters, Dokl. Akad. Nauk
Ukrain. A, 8, 68-70 (in Russian)

[16] Bardadym, T.A. and Ivanov, A.V. (1993) An Asymptotic Expansion of the
Distribution of Some Functional of the Least Squares Estimator, Theor.
Prob. Math. Statist., 47, 1-8
[17] Bardadym, T.A. and Ivanov, A.V. (1995) Asymptotic Expansions Connected
with Jack-Knife Estimator of the Error Variance of the Observation in Non-
Linear Regression Model. I. Ukrain. Mat. Zh., 47, 443-451 (in Russian)
[18] Bardadym, T.A. and Ivanov, A.V. (1995) Asymptotic Expansions Connected
with Jack-Knife Estimator of the Error Variance of the Observation in Non-
Linear Regression Model. II. Ukrain. Mat. Zh., 47, 731-736 (in Russian)
[19] Bardadym, T.A. and Ivanov, A.V. (1995) Asymptotic Properties of a Func-
tional Used as an Estimator of Observational Error Variance in the 'Signal
Plus Noise' Model, Dokl. Akad. Nauk Ukrain. A, 5, 69-72 (in Ukrainian)
[20] Bardadym, T.A. and Ivanov, A.V. Cross-Validation Functional Asymptotic
Expansion, Theor. Prob. Math. Statist., (to appear) (in Ukrainian)
[21] Barndorff-Nielsen, O.E. and Cox, D.R. (1986) Differential and Integral Geo-
metry in Statistical Inference, Some Aspects of Differential Geometry in
Statistical Inference, IMS Monographs, (Institute of Mathematical Statis-
tics: Hayward, CA)

[22] Barndorff-Nielsen, O.E., Cox, D.R. and Reid, N. (1986) The Role of Differ-
ential Geometry in Statistical Theory, Int. Statist. Rev., 54, 83-96

[23] Bassett, G. and Koenker, R. (1978) Asymptotic Theory of Least Absolute


Error Regression, J. Amer. Stat. Assocn., 73, 618-622

[24] Bates, D.M., Hamilton, D.G. and Watts, D.G. (1982) Accounting for Intrin-
sic Nonlinearity in Nonlinear Regression Parameter Inference Regions, Ann.
Statist., 10, 386-393

[25] Bates, D.M. and Watts, D.G. (1980) Relative Curvature Measures of Non-
linearity, J. Roy. Statist. Soc. B, 42, 1-25

[26] Bates, D.M. and Watts, D.G. (1981) Parameter Transformations for Im-
proved Approximate Confidence Regions in Nonlinear Least Squares, Ann.
Statist., 9, 1152-1167
[27] Bates, D.M. and Watts, D.G. (1988) Nonlinear Regression Analysis and Its
Applications, (Wiley)
[28] Beale, E.M.L. (1960) Confidence Regions in Non-Linear Estimation (with
Discussion), J. Roy. Statist. Soc. B, 22, 41-88
[29] Bellman, R. (1960) Introduction to Matrix Analysis, (McGraw-Hill)

[30] Bhattacharya, R.N. (1977) Refinements of the Multidimensional Central


Limit Theorem and Applications, Ann. Prob., 5, 1-27
[31] Bhattacharya, R.N. (1985) Some Recent Results on Cramer-Edgeworth Ex-
pansions with Applications, Multi-Variate Analysis, IV, (ed. Krishnaiah,
P.R.), (Elsevier) 57-75
[32] Bhattacharya, R.N. and Ghosh, J.K. (1978) On the Validity of the Formal
Edgeworth Expansion, Ann. Statist., 6, 434-451
[33] Bhattacharya, R.N. and Ranga Rao, R. (1976) Normal Approximation and
Asymptotic Expansions, (Wiley)
[34] Bickel, P.J., Chibisov, D.M. and van Zwet, W.R. (1981) On Efficiency of
First and Second Order, Int. Statist. Rev., 49, 169-175
[35] Birge, L. and Massart, P. (1993) Rates of Convergence for Minimum Contrast
Estimators, Prob. Theory Rei. Fields, 97, 113-150
[36] Birkes, D. and Dodge, Y. (1993) Alternative Methods in Regression, (Wiley)

[37] Bloomfield, P. and Steiger, W. (1980) Least Absolute Deviations Curve-


Fitting, SIAM J. Sci. Statist. Comput., 1, 290-301
[38] Bloomfield, P. and Steiger, W. (1984) Least Absolute Deviations Theory,
Applications and Algorithms, Progress in Probability and Statistics, (ed. Hu-
ber, P. and Rosenblatt, M.), (Birkhauser)
[39] Borovkov, A.A. (1984) Mathematical Statistics, (Nauka: Moscow) (in Rus-
sian)

[40] Borshchevsky, A.V. and Ivanov, A.V. (1985) A Property of the Optimum
Point in a Problem of Data Processing by the Least Deviations Method,
Dokl. Akad. Nauk Ukrain. A, 1, 53-56 (in Russian)

[41] Borshchevsky, A.V. and Ivanov, A.V. (1985) On the Normal Approximation
of the Distribution of the Optimum Point in a Problem of Data Processing
by the Least Moduli Method, Kibernetika, 6, 86-92 (in Russian)

[42] Box, M.J. (1971) Bias in Nonlinear Estimation (with Discussion), J. Roy.
Statist. Soc. B, 32, 171-201
[43] Bunke, H. (1977) Parameter Estimation in Nonlinear Regression Models,
Math. Oper.forsch. Statist.: Ser. Statist., 8, 23-40
[44] Burnashev, M.V. (1977) Asymptotic Expansions of Signal Parameter Es-
timators in White Gaussian Noise, Mat. Sbornik 104(146), 179-206 (in
Russian)
[45] Burnashev, M.V. (1981) The Investigation of the Second Order Properties of
Statistical Estimates in the Scheme of Independent Observations, Izv. Akad.
Nauk SSSR: Ser. Mat., 45, 509-539 (in Russian)
[46] Chandra, T.K. (1980) Asymptotic Expansions and Deficiency, (Doctoral
Thesis), (Indian Statistical Institute: Calcutta)
[47] Chandra, T.K. and Ghosh, J.K. (1979) Valid Asymptotic Expansions for
the Likelihood Ratio Statistic and Other Perturbed Chi-Square Variables,
Sankhya A, 41, 22-47
[48] Chandra, T.K. and Ghosh, J.K. (1980) Valid Asymptotic Expansions for
the Likelihood Ratio and Other Statistics Under Contiguous Alternatives,
Sankhya A, 42, 170-184
[49] Chandra, T.K. and Joshi, S.N. (1983) Comparison of the Likelihood Ratio,
Rao's and Wald's Tests and Conjecture of C.R. Rao, Sankhya A, 41, 226-
246
[50] Chandra, T.K. and Mukerjee, R. (1984) On the Optimality of Rao's Statis-
tics, Commun. Statist. - Theor. Meth., 13, 1507-1515
[51] Chandra, T.K. and Mukerjee, R. (1985) Comparison of the Likelihood Ratio,
Wald's and Rao's Tests, Sankhya A, 47, 271-284
[52] Chandra, T.K. and Mukerjee, R. (1991) Bartlett Type Modification for Rao's
Efficient Score Statistics, J. Multivar. Anal. 36, 101-112
[53] Chentsov, N.N. (1972) Statistical Decision Rules and Optimal Inference,
(Nauka: Moscow) (in Russian)
[54] Chentsov, N.N. and Morozova, E.A. (1991) Natural Geometry of the Families
of Probabilistic Laws, The Modern Problems of Mathematics. Fundamental
Directions, (VINITI: Moscow), 83, 133-265 (in Russian)
[55] Chibisov, D.M. (1972) Asymptotic Methods in Mathematical Statistics, (Doc-
tor of Sciences Thesis), (Moscow) (in Russian)
[56) Chibisov, D.M. (1972) An Asymptotic Expansion for Maximum Likelihood
Estimators, Teor. Veroyatnost. Primen., 17, 387-388 (in Russian)

[57] Chibisov, D.M. (1972) An Asymptotic Expansion for Distribution of Statis-


tics Allowing for the Asymptotic Expansion, Teor. Veroyatnost. Primen.,
17, 658-668 (in Russian)
[58] Chibisov, D.M. (1972) On the Normal Approximation for a Certain Class of
Statistics, Proc. 6th Berkeley Symp. Mathematical Statistics and Probabil-
ity, I, 153-174
[59] Chibisov, D.M. (1973) An Asymptotic Expansion for a Class of Estimators
Containing Maximum Likelihood Estimators, Teor. Veroyatnost. Primen.,
18, 303-311 (in Russian)
[60] Chibisov, D.M. (1973) An Asymptotic Expansion for Distributions of Sums
of Special Type with Application to Minimum Contrast Estimators, Teor.
Veroyatnost. Primen., 18, 689-702 (in Russian)
[61] Chibisov, D.M. (1974) Asymptotic Expansions for Some Asymptotically
Optimal Tests, Proc. Prague Symp. on Asymptotic Statistics, Prague, II,
37-68
[62] Chibisov, D.M. (1979) Asymptotic Expansion for the Distribution of Statis-
tics Admitting a Stochastic Expansion, Preprints in Statistics, 47, (Univer-
sity of Cologne)
[63] Chibisov, D.M. (1980) Asymptotic Expansion for Distributions of C(α) Test
Statistics, Proc. 6th Int. Conf. Wisla, 1978, Lecture Notes in Statistics,
(Springer: New York, NY) 63-96
[64] Chibisov, D.M. (1980) Asymptotic Expansion for Distribution of Statistics
Admitting a Stochastic Expansion. I, Teor. Veroyatnost. Primen., 25, 745-
756; (1981) Asymptotic Expansion for Distribution of Statistics Admitting a
Stochastic Expansion. II, Teor. Veroyatnost. Primen., 26, 3-14 (in Russian)

[65] Chibisov, D.M. (1982) An Asymptotic Expansion in Hypothesis Testing


Problems. I, Izv. Akad. Nauk Uzbek., 5, 18-26; An Asymptotic Expansion
in Hypothesis Testing Problems. II, Izv. Akad. Nauk Uzbek., 6, 23-30
(in Russian)
[66] Chibisov, D.M. (1982) Power and Deficiency of Asymptotically Normal Crit-
eria, Teor. Veroyatnost. Primen., 27, 812-813 (in Russian)

[67] Christensen, R. (1990) Log-Linear Models, Springer Texts in Statistics,


(Springer)

[68] Clarke, G.P.Y. (1980) Moments of the Least Squares Estimators in a Non-
Linear Regression Model, J. Roy. Statist. Soc. B, 42, 227-237

[69] Clarke, G.P.Y. (1987) Marginal Curvatures and Their Usefulness in the Ana-
lysis of Nonlinear Regression Models, J. Amer. Stat. Assocn., 82, 844-850

[70] Demidenko, E.Z. (1981) Linear and Non-Linear Regression, (Finansy i


Statististika: Moscow) (in Russian)

[71] Demidenko, E.Z. (1989) Optimisation and Regression, (Nauka: Moscow) (in
Russian)

[72] Dorogovtsev, A.Ya. (1982) The Theory of Estimation of Random Processes'
Parameters, (Vishcha Shkola: Kiev) (in Russian)

[73] Dorogovtsev, A.Ya. (1992) Consistency of Least Squares Estimator of an


Infinite Dimensional Parameter, Siber. Math. J., 33, 65-69 (in Russian)

[74] Dorogovtsev, A.Ya. (1993) On Asymptotic Normality of Least Squares Es-


timator of Infinite-Dimensional Parameter, Ukrain. Math. J., 45, 44-53 (in
Russian)

[75] Dzhaparidze, K. and Sieders, A. (1987) A Large Deviation Result for Para-
meter Estimators and Its Applications to Non-Linear Regression Analysis,
Ann. Statist., 15, 1031-1049

[76] Efron B. (1975) Defining the Curvature of a Statistical Problem (with Ap-
plications to Second Order Efficiency) (with Discussion), Ann. Statist., 3,
1199-1242

[77] Efron, B. (1978) The Geometry of Exponential Families, Ann. Statist. 6,


362-376

[78] Eguchi, S. (1985) A Differential Geometric Approach to Statistical Inference


on the Basis of Contrast Functionals, Hiroshima Math. J., 15, 341-391

[79] Eisenhart, C. (1961) Boscovich and the Combination of Observations, Roger


Joseph Boscovich, (ed. White, L.L.), (Fordham University Press).

[80] Feller, W. (1971) An Introduction to Probability Theory and Its Applications.


Vol. II, (2nd edn), (Wiley)

[81] Fikhtengolts, G.M. (1966) The Course of Differential and Integral Calculus.
Vol. II, (Nauka: Moscow) (in Russian)

[82] Frangos, C.C. (1987) An Updated Bibliography on the Jack-Knife Method,


Commun. Statist. - Theory Meth., 16, 1543-1584

[83] Gallant, A.R. (1977) Testing a Nonlinear Regression Specification. A Non-
regular Case, J. Amer. Statist. Assoc., 72, 359

[84] Gallant, A.R. (1987) Nonlinear Statistical Models, (Wiley)

[85] Gauss, K.F. (1809) Werke. Vol. I, (Gottingen)



[86] van de Geer, S.A. (1986) On Rates of Convergence in Least Squares Esti-
mation, Report, (Centre for Mathematics and Computational Science: Am-
sterdam)
[87] van de Geer, S.A. (1988) Asymptotic Normality of Minimum L1-Norm Es-
timators in Linear Regression, (Centre for Mathematics and Computational
Science: Amsterdam), MS-R8806
[88] Gelfand, I.M. (1966) Lecture Notes in Linear Algebra, (Nauka: Moscow) (in
Russian)
[89] Gikhman, I.I. and Skorokhod, A.V. (1977) Introduction to the Theory of
Random Processes, (Nauka: Moscow) (in Russian)

[90] Ghosh, J.K. (1991) Higher Order Asymptotics for the Likelihood Ratio, Rao's
and Wald's Tests, Statist. Prob. Lett., 12, 505-509
[91] Goetze, F. (1981) On Edgeworth Expansions in Banach Spaces, Ann. Prob.,
9, 852-859
[92] Grigor'ev, Yu.D. (1992) Asymptotic Distribution of Observation Variance
Estimator in Non-Linear Regression Model, Avtomat. Telemekh., 4, 37-43
(in Russian)
[93] Grigor'ev, Yu.D. (1994) Development and Investigation of Algorithms of
Non-Linear Regression Models Analysis, (Doctor of Sciences Thesis) (State
Technical University: Novosibirsk)
[94] Grigor'ev, Yu.D. and Ivanov, A.V. (1987) Asymptotic Expansions in Non-
Linear Regression Analysis, Zavod. Labor., 53, 48-51 (in Russian)
[95] Grigor'ev, Yu.D. and Ivanov, A.V. On Measures of the Non-Linearity of
Regression Models, Zavod. Labor., 53, 57-61 (in Russian)
[96] Grigor'ev, Yu.D. and Ivanov, A.V. (1991) Asymptotic Expansion of Power
of Criteria for Hypothesis Testing on Nonlinear Regression Parameter under
Contiguous Alternatives, Dokl. Akad. Nauk Ukrain., 1, 7-10 (in Russian)
[97] Grigor'ev, Yu.D. and Ivanov, A.V. (1992) Asymptotic Expansions Associ-
ated with an Estimator of the Observation Error Variance in Non-Linear
Gaussian Regression, Cybernet. Sys. Anal., 28, 62-71

[98] Grigor'ev, Yu.D. and Ivanov, A.V. (1992) On Asymptotic Expansion of the
Distribution of Some Functional of Least Squares Estimator, Dokl. Akad.
Nauk Ukrain., 7, 26-30 (in Russian)

[99] Grigor'ev, Yu.D. and Ivanov, A.V. (1993) Asymptotic Expansions for
Quadratic Functionals of the Least Squares Estimator of a Nonlinear Re-
gression Parameter, Math. Meth. Statist., 2, 269-294

[100] Grigor'ev, Yu.D. and Ivanov, A.V. (1995) Comparison of Powers of a Certain
Class of Criteria for Hypothesis Testing for Non-Linear Regression Paramet-
ers, Siber. Adv. Math., 5, N2, 68-98
[101] Gurevich, V.A. (1983) The Least Moduli Method for the Non-Linear Re-
gression Model, Applied Statistics, (Nauka: Moscow) (in Russian)
[102] Gusev, S.I. (1975) Asymptotic Expansions Associated with Some Statistical
Estimators in the Smooth Case. I, Teor. Veroyatnost. Primen., 20, 488-514;
Asymptotic Expansions Associated with Some Statistical Estimators in the
Smooth Case. II, Teor. Veroyatnost. Primen., 21, 16-33 (in Russian)
[103] Guttman, I. and Meeter, D. (1965) On Beale's Measures of Nonlinearity,
Technometrics, 7, 623-637
[104] Haines, L.M. (1994) A Note on the Differential Geometry of Least Squares
Estimator for Nonlinear Regression Models, S. Afric. Statist. J., 28, 73-91
[105] Hall, P. (1992) The Bootstrap and Edgeworth Expansion, Springer Series in
Statistics, (Springer)
[106] Halperin, M. (1963) Confidence Interval Estimation in Nonlinear Regression,
J. Roy. Statist. Soc. B, 25, 330-333

[107] Hamilton, D. (1986) Confidence Regions for Parameter Subsets in Nonlinear


Regression, Biometrika, 73, 57-64
[108] Hamilton, D. and Wiens, D. (1987) Correction Factors for F-Ratios in Non-
linear Regression, Biometrika, 74, 423-425
[109] Hannan, E.J. (1971) Non-Linear Time Series Regression, J. Appl. Prob., 8,
767-780
[110] Hannan, E.J. (1973) The Estimation of Frequency, J. Appl. Prob., 10, 510-
519
[111] Hartley, H.O. (1964) Exact Confidence Regions for the Parameters in Non-
linear Regression Laws, Biometrika, 51, 437-453
[112] Hartley, H.O. and Booker, A. (1965) Non-Linear Least Squares Estimation,
Ann. Math. Statist., 36, 638-650
[113] Hayakawa, T. (1977) The Likelihood Ratio Criterion and the Asymptotic
Expansion of its Distribution, Ann. Inst. Statist. Math. A, 29, 359-378
[114] Hodges, J.L., Jr. and Lehmann, E.L. (1970) Deficiency, Ann. Math. Statist.,
41, 783-801
[115] Hoeffding, W. and Wolfowitz, J. (1958) Distinguishability of Sets of Distrib-
utions, Ann. Math. Statist., 29, 700-718

[116] Hougaard, P. (1982) Parametrisations of Non-Linear Models, J. Roy. Statist.


Soc. B, 44, 244-252
[117] Hougaard, P. (1984) Parameter Transformations in Multi-Parameter Non-
Linear Regression Models, (Preprint), 2, (Institute of Mathematical Statis-
tics: University of Copenhagen)
[118] Huber, P.J. (1967) The Behaviour of Maximum Likelihood Estimates under
Nonstandard Conditions, Proc. 5th Berkeley Symp. on Mathematical Statis-
tics and Probability. I, (University of California Press: Berkeley) 221-234
[119] Huber, P.J. (1981) Robust Statistics, (Wiley)
[120] Ibragimov, I.A. and Has'minskii, R.Z. (1979) Asymptotic Theory of Estima-
tion, (Nauka: Moscow) (in Russian)
[121] Ibramhalilov, I.S. and Skorokhod, A.V. (1980) Consistent Estimators of
Random Processes Parameters, (Naukova Dumka: Kiev) (in Russian)
[122] Ivanitskaya, L.S. and Ivanov, A.V. (1992) Asymptotic Expansion of Dis-
tribution of the Observational Error Variance in the Nonlinear Regression
Model, Ukrain. Math. J., 43, 648-655
[123] Ivanov, A.V. (1976) Berry-Esseen Inequality for the Distribution of Least
Squares Estimates of Parameters of the Nonlinear Regression Function, Mat.
Zam., 20, 721-727
[124] Ivanov, A.V. (1976) An Asymptotic Expansion for the Distribution of Least
Squares Estimator of a Parameter of the Nonlinear Regression Function,
Theor. Prob. Applns., 21, 557-570
[125] Ivanov, A.V. (1977) On the Rate of Convergence of Least Squares Estim-
ator Distribution to a Normal Law, Theory of Random Processes, (Naukova
Dumka: Kiev), 5, 39-44 (in Russian)
[126] Ivanov, A.V. (1980) A Solution of the Problem of Detection of Hidden Peri-
odicities, Theor. Prob. Math. Statist., 20, 51-68
[127] Ivanov, A.V. (1982) An Asymptotic Expansion of Moments of Least Squares
Estimator for a Vector Parameter of Nonlinear Regression, Ukrain. Math. J.,
34, 134-139
[128] Ivanov, A.V. (1984) Two Theorems on Consistency of Least Squares Estim-
ator, Theor. Prob. Math. Statist., 28, 25-34
[129] Ivanov, A.V. (1984) On Consistency and Asymptotic Normality of Least
Moduli Estimator, Ukrain. Math. J., 36, 267-272
[130] Ivanov, A.V. (1991) On Consistency of lα-Estimators of the Regression
Function Parameters, Theor. Prob. Math. Statist., 42, 47-53
[131] Ivanov, A.V. (1991) Estimation Theory of Nonlinear Regression Model
Parameters, (Doctor of Sciences Thesis: Institute of Mathematics of the
Ukrainian Academy of Sciences: Kiev) (in Russian)

[132] Ivanov, A.V. and Kozlov, O.M. (1980) On Properties of the Regression Es-
timators for Nonlinear Subjects, Kibernetika, 5, 113-119 (in Russian)

[133] Ivanov, A.V. and Kozlov, O.M. (1981) On Consistency of Minimum Contrast
Estimators in the Case of Non-Identically Distributed Observations, Theor.
Prob. Math. Statist., 23, 63-72

[134] Ivanov, A.V. and Leonenko, N.N. (1989) Statistical Analysis of Random
Fields, (Kluwer Academic Publishers: Dordrecht)

[135] Ivanov, A.V. and Zwanzig, S. (1981) An Asymptotic Expansion for the Dis-
tribution of Least Squares Estimator of a Vector Parameter in Non-Linear
Regression, Sov. Math. Dokl., 23, 118-121

[136] Ivanov, A.V. and Zwanzig, S. (1983) An Asymptotic Expansion of the Least
Squares Estimator of a Non-Linear Regression Vector Parameter, Theor.
Prob. Math. Statist., 26, 45-52

[137] Ivanov, A.V. and Zwanzig, S. (1983) An Asymptotic Expansion of the Dis-
tribution of Least Squares Estimators in the Nonlinear Regression Model,
Math. Oper.forsch. Statist.: Ser. Statist., 14, 7-27

[138] Jennrich, R.I. (1969) Asymptotic Properties of Non-Linear Least Squares
Estimators, Ann. Math. Statist., 40, 633-643

[139] Jing, B. and Robinson, J. (1994) Saddlepoint Approximation for Marginal
and Conditional Probabilities of Transformed Variables, Ann. Statist., 22,
1115-1132

[140] Kass, R.E. (1984) Canonical Parametrizations and Zero Parameter-Effect
Curvature, J. Roy. Statist. Soc. B, 46, 86-92

[141] Kass, R.E. (1989) The Geometry of Asymptotic Inference, Statist. Sci., 4,
188-234

[142] Kendall, M., Stuart, A. and Ord, J. Keith (1987) Kendall's Advanced Theory
of Statistics, Vol. I, (5th edn), (Charles Griffin & Co.: London)

[143] Khorasani, F. and Milliken, G.A. (1982) Simultaneous Confidence Bands for
Nonlinear Regression Models, Commun. Statist.-Theor. Meth., 11, 1241-
1253

[144] Knopov, P.S. (1981) Optimal Estimators of Parameters of Stochastic Sys-


tems, (Naukova Dumka: Kiev) (in Russian)
[145] Kukush, A.G. (1989) Asymptotic Properties of the Estimator of a Nonlin-
ear Regression Infinite-Dimensional Parameter, Math. Today, 5, 84-105 (in
Russian)

[146] Kukush, A.G. (1995) Asymptotic Properties of Estimator of Infinite-
Dimensional Parameters of Random Processes, (Doctor of Sciences Thesis:
Kiev Mathematical Institute) (in Russian)

[147] Kutoyants, Yu.A. (1980) Random Processes Parameter Estimation,
(Akademiya Nauk Armyanskoi S.S.R.: Yerevan) (in Russian)

[148] Linnik, Yu.V. (1962) Least Squares Method and Foundations of Observation
Theory, (Fizmatgiz: Moscow) (in Russian)

[149] Linnik, Yu.V. and Mitrofanova, N.M. (1963) On Asymptotic Distribution of
Maximum Likelihood Estimators, Sov. Math., 4, 421-423

[150] Linnik, Yu.V. and Mitrofanova, N.M. (1965) Some Asymptotic Expansions
for the Distribution of the Maximum Likelihood Estimate, Sankhya. A, 27,
73-82

[151] Loeve, M. (1963) Probability Theory, (Van Nostrand: Princeton, NJ)

[152] Malinvaud, E. (1970) The Consistency of Nonlinear Regression, Ann. Math.
Statist., 41, 953-969

[153] Markoff, A.A. (1900) Wahrscheinlichkeitsrechnung, (Teubner: Leipzig)

[154] Markov, A.A. (1898) The Law of Large Numbers and the Least Squares
Methods, Selected Works, (1951) (Izdatel'stvo Akademiya Nauk S.S.S.R:
Moscow) 233-251 (in Russian)

[155] McCullagh, P. (1987) Tensor Methods in Statistics, Monographs on Statistics


and Applied Probability, (Chapman and Hall: London)

[156] McCullagh, P. and Cox, D.R. (1986) Invariants and Likelihood Ratio Statis-
tics, Ann. Statist., 14, 1419-1430

[157] Michel, R. (1973) The Bound in the Berry-Esseen Result for Minimum Con-
trast Estimates, Metrika, 20, 148-155

[158] Michel, R. (1975) An Asymptotic Expansion for the Distribution of Asymp-
totic Maximum Likelihood Estimators of Vector Parameters, J. Multivar.
Anal., 5, 67-82

[159] Michel, R. (1977) A Multidimensional Newton-Raphson Method and Its Ap-
plications to the Existence of Asymptotic Fn-Estimators and Their Stoch-
astic Expansions, J. Multivar. Anal., 7, 235-248
[160] Michel, R. and Pfanzagl, J. (1971) The Accuracy of the Normal Approx-
imation for Minimum Contrast Estimates, Z. Wahr.theor. verw. Geb., 18,
73-84

[161] Mishchenko, A.S. and Fomenko, A.T. (1980) Lecture Notes on Differential
Geometry and Topology, (Moscow University Press)

[162] Mitrofanova, N.M. (1967) On Asymptotics of Maximum Likelihood Estim-
ator of Vector Parameter Distribution, Teor. Veroyatnost. Primen., 12, 418-
425 (in Russian)

[163] Mukerjee, R. (1988) Comparison of Tests in the Multiparameter Case. I.
Second Order Power, J. Multivar. Anal., 33, 17-30; (1988) Comparison of
Tests in the Multiparameter Case. II. A Third Order Optimality Property
of Rao's Test, J. Multivar. Anal., 33, 31-48

[164] Murray, M.K. and Rice, J.W. (1993) Differential Geometry in Statistics,
Monographs in Statistics and Applied Probability, 48, (Chapman and Hall:
London and New York)

[165] Nagaev, S.V. and Fuk, D.Kh. (1971) Probability Inequalities for the Sums
of Independent Random Variables, Theor. Prob. Appl., 16, 660-675

[166] Neyman, J. and David, F.N. (1938) Extension of the Markoff Theorem on
Least Squares, Statist. Res. Mem., 2, 105-116

[167] Oberhofer, W. (1982) The Consistency of Non-Linear Regression Minimizing
the L1 Norm, Ann. Statist., 10, 316-319

[168] Pazman, A. (1982) Geometry of Gaussian Non-Linear Regression - Parallel
Curves and Confidence Intervals, Kybernetika, 18, 376-396

[169] Pazman, A. (1990) Small-Sample Distributional Properties of Nonlinear Re-
gression Estimators (A Geometric Approach), Statistics, 21, 323-367

[170] Pazman, A. (1990) Almost Exact Distributions of Estimators. I. Low-
Dimensional Nonlinear Regression, Statistics, 21, 9-19

[171] Pazman, A. (1990) Almost Exact Distributions of Estimators. II. Flat Non-
linear Regression Models, Statistics, 21, 21-33

[172] Petrov, V.V. (1975) Sums of Independent Random Variables, (Springer:
Berlin)

[173] Petrov, V.V. (1987) Limit Theorems for Sums of Independent Random Vari-
ables, (Nauka: Moscow) (in Russian)

[174] Pfanzagl, J. (1969) On the Measurability and Consistency of Minimum Con-
trast Estimates, Metrika, 14, 249-272
[175] Pfanzagl, J. (1971) The Berry-Esseen Bound for Minimum Contrast Estim-
ates, Metrika, 17, 82-91
[176] Pfanzagl, J. (1973) Asymptotically Optimum Estimation and Test Proced-
ures, Proc. Prague Symp. on Asymptotic Statistics (1973) Vol. I, (Charles
University: Prague) 201-272
[177] Pfanzagl, J. (1973) Asymptotic Expansions Related to Minimum Contrast
Estimators, Ann. Statist., 1, 993-1026
[178] Pfanzagl, J. (1973) The Accuracy of the Normal Approximation for Estim-
ates of Vector Parameters, Z. Wahr.theor. verw. Geb., 25, 171-198
[179] Pfanzagl, J. (1979) First Order Efficiency Implies Second Order Efficiency,
Contributions to Statistics. J. Hajek Memorial Volume, (Academia: Prague)
167-196
[180] Pfanzagl, J. (1980) Asymptotic Expansions in Parametric Decision Theory,
Developments in Statistics, 3, (Academic Press: New York) 1-97
[181] Pfanzagl, J. and Wefelmeyer, W. (1985) Asymptotic Expansions for General
Statistical Models, Lecture Notes in Statistics, 31, (Springer-Verlag: Berlin)

[182] Postnikov, M.M. (1979) Linear Algebra and Differential Geometry.
Semester 2, (Nauka: Moscow) (in Russian)
[183] Prakasa Rao, B.L.S. (1975) The Berry-Esseen Bound for Minimum Contrast
Estimators in the Independent Not Identically Distributed Case, Metrika,
22, 225-239

[184] Prakasa Rao, B.L.S. (1984) On the Exponential Rate of Convergence of the
Least Squares Estimator in the Nonlinear Regression Model with Gaussian
Errors, Statist. Prob. Lett., 2, 139-142
[185] Prakasa Rao, B.L.S. (1984) The Rate of Convergence of the Least Squares
Estimator in a Non-Linear Regression Model with Dependent Errors, J. Mul-
tivar. Anal., 14, 315-322
[186] Prakasa Rao, B.L.S. (1984) The Rate of Convergence of the Least Squares
Estimator in the Nonlinear Regression Model for Multiparameter, (Preprint),
(Indian Statistical Institute)

[187] Prakasa Rao, B.L.S. (1987) Asymptotic Theory of Statistical Inference,
(Wiley)

[188] Qumsiyeh, M.B. (1990) Edgeworth Expansion in Regression Models,
J. Multivar. Anal., 35, 86-101

[189] Rao, C.R. (1965) Linear Statistical Inference and Its Applications, (Wiley)
[190] Ranga Rao, R. (1960) Some Problems in Probability Theory, (D.Phil. Thesis:
University of Calcutta)
[191] Rashevsky, P.C. (1967) Riemannian Geometry and Tensor Analysis, (Nauka:
Moscow) (in Russian)
[192] Ratkowsky, D.A. (1983) Nonlinear Regression Modelling, (Marcel Dekker:
New York)
[193] Ratkowsky, D.A. (1983) Handbook of Nonlinear Regression Models, (Marcel
Dekker: New York)
[194] Robinson, J., Höglund, T., Holst, L. and Quine, M.P. (1990) On Approx-
imating Probabilities for Small and Large Deviations in R^d, Ann. Prob., 18,
727-753
[195] Ross, G.J.S. (1982) Nonlinear Models, Math. Oper.forsch. Statist.: Ser.
Statist., 13, 445-453
[196] Ross, G.J.S. (1990) Nonlinear Estimation, (Springer-Verlag)
[197] Sadikova, S.M. (1966) Some Inequalities for Characteristic Functions, Teor.
Veroyatn. Primen., 11, 500-506 (in Russian)
[198] Saville, D.J. and Wood, G.R. (1991) Statistical Methods: The Geometric
Approach, (Springer)
[199] Schmidt, W.H. and Zwanzig, S. (1986) Second Order Asymptotics in Non-
linear Regression, J. Multivar. Anal., 18, 187-215
[200] Seber, G.A.F. (1977) Linear Regression Analysis, (Wiley)
[201] Seber, G.A.F. and Wild, C.J. (1989) Nonlinear Regression, (Wiley)
[202] Sen, A. and Srivastava, M. (1990) Regression Analysis: Theory, Methods,
and Applications, Springer Texts in Statistics, (Springer)

[203] Shepp, L.A. (1965) Distinguishing a Sequence of Random Variables from a
Translate of Itself, Ann. Math. Statist., 36, 1107-1112
[204] Shuteev, G.E. (1990) Saddlepoint Approximation of a Distribution of a
Statistic, Theory Prob. Appl., 35, 607-612

[205] Skovgaard, I.M. (1981) Edgeworth Expansion of the Distribution of Max-
imum Likelihood Estimators in the General (non-i.i.d.) Case, Scand. J.
Statist., 8, 227-236

[206] Statulevicius, V.A. (1965) Limit Theorems for Densities and Asymptotic Ex-
pansions for Distributions of the Sums of Independent Random Variables,
Theory Prob. Appl., 10, 645-659 (in Russian)
[207] Survila, P. (1964) One Local Limit Theorem for Densities, Lit. Mat. Sbornik,
4, 535-540 (in Russian)
[208] Wald, A. (1949) Note on the Consistency of the Maximum Likelihood Es-
timate, Ann. Math. Statist., 20, 595-601
[209] Walker, A.M. (1971) On the Estimation of a Harmonic Component in a Time
Series with Stationary Independent Residuals, Biometrika, 58, 21-36

[210] Walker, A.M. (1973) On the Estimation of a Harmonic Component in a Time


Series with Stationary Dependent Residuals, Adv. Appl. Prob., 5, 217-241
[211] Whittle, P. (1951) The Simultaneous Estimation of a Time Series' Harmonic
Components and Covariance Structure, Trab. Estad., 3, 43-57

[212] Wilkinson, J.H. (1965) The Algebraic Eigenvalue Problem, Monographs on
Numerical Analysis, (Clarendon Press: Oxford)
[213] Wolfowitz, J. (1965) Asymptotic Efficiency of the Maximum Likelihood Es-
timator, Teor. Veroyatnost. Primen., 10, (in Russian)
[214] Wu, C.F.J. (1981) Asymptotic Theory of Nonlinear Least Squares Estima-
tion, Ann. Statist., 9, 501-513
[215] Wu, C.F.J. (1986) Jackknifing, Bootstrap and Other Resampling Methods
in Regression Analysis (with Discussion), Ann. Statist., 14, 1261-1296

[216] Zacks, S. (1971) The Theory of Statistical Inference, (Wiley)


[217] Zwanzig, S. (1985) A Third Order Asymptotic Comparison of Least
Squares, Jackknifing and Cross-Validation for Error Variance Estimation in
Nonlinear Regression, Math. Oper.forsch. Statist.: Ser. Statist., 16, 47-54
[218] Yadrenko, M.I. (1980) The Spectral Theory of Random Fields, (Vishcha
Shkola: Kiev) (in Russian)
[219] Yurinsky, V.V. (1972) Bounds for Characteristic Functions of Certain De-
generate Multi-Dimensional Distributions, Teor. Veroyatnost. Primen., 17,
99-110 (in Russian)

[220] Shao, J. (1992) Consistency of Least Squares Estimator and Its Jackknife
Variance Estimator in the Non-Linear Model, Can. J. Statist., 20, 415-428

[221] Sazonov, V.V. (1968) On the Multi-Dimensional Central Limit Theorem,
Sankhya A, 30, 181-204

[222] Sazonov, V.V. (1972) On a Bound for the Rate of Convergence in the Mul-
tidimensional Central Limit Theorem, Proc. 6th Berkeley Symposium on
Mathematical Statistics and Probability, Vol. II, (University of California
Press: Berkeley), 563-581
[223] Barndorff-Nielsen, O.E. and Cox, D.R. (1989) Asymptotic Techniques for Use
in Statistics, Monographs on Statistics and Applied Probability, (Chapman
and Hall: London and New York)
Index

δ-method 194, 241
χ²-distribution 215
  non-central 241
affine and α-connectedness 253
asymptotic expansion of
  cumulants 195, 239-240
  criterion power 241
  least squares estimator
    bias 165, 276, 277
    correlation matrix 276-279
    distribution 125
    distribution, initial terms 146, 151-153
    mean square deviation matrix 166
    moments 160
    stochastic 79, 81-82, 91, 140, 155-156, 207-208
  error variance estimator
    bias 174-176
    distribution 179, 188
    distribution, initial terms 187, 188, 195
    mean square deviation 177, 178
    stochastic 172-174
asymptotic normality of
  least squares estimator 92, 94, 103
  least moduli estimator 109
asymptotically equivalent criteria 247, 249
Bartlett correction 224-226
basic statistical model 1
basic variables 215-216, 235, 272
Berry-Esseen inequality 97, 142, 290
boundedness of least squares estimator moments 18
characteristic function 121, 179, 189, 231, 232, 295
  of the sum of random vectors 127, 190
Chebyshev-Hermite polynomials 129, 187, 195, 242, 297
Christoffel symbols 253, 259
close alternative 230, 236
consistency of
  lα-estimator 40, 43
  least moduli estimator 33
  least squares estimator 17, 27, 30, 49, 91
contrast function 59, 63, 70
convergence of least squares estimator moments 95
coordinates
  geodesic 260, 262, 284
  local 267
  polar 218
criterion
  u-representable 249
  invariant 285
  most powerful 249
  one-sided 235
  two-sided (modified) 235
critical region 235, 236
cross-validation estimator of error variance 196
  asymptotic expansion of the moments 204-206
  stochastic asymptotic expansion 203
curvature
  Efron 256, 271
  generalised 266
  geodesic 258, 263
  mean 256
  McCullagh 273, 281
  normal 258
  principal 255
  Ricci (scalar) 255
  statistical 254
  tangential 258, 263
  tensor 254
  total 258
embedded Riemannian manifold 251, 253
error of the first kind 241
estimator
  lα- 39
  consistent 6
  inconsistent 6, 7
  least moduli 3
  least squares 2-3
  logarithmic 73
  of error variance 168
  minimum contrast 59
  strongly consistent 58
  uniformly consistent 6
excess 178, 205
Fisher information 7
  matrix 47, 119
Fisher-Snedecor distribution 284
Frenet formulae 269
Gaussian distribution 92-95, 142, 189, 194, 300
invariant 267
  scalar differential 267, 274-275
  statistical 236, 271, 281
Jackknife estimator of error variance 196
  asymptotic expansion of moments 204-206
  stochastic asymptotic expansion 198
Kullback-Leibler statistic 207, 287
laws of large numbers 292, 293
large deviations of least squares estimator 17, 25, 27
measure of non-linearity 266
  Beale 264, 266
moderate deviations 52, 87, 91, 103, 209, 291
moment inequalities 290
Neyman-Pearson statistic 207, 230
normal plane 257
orthoprojectors 257
Pfanzagl lemma 134
power of test 240
quadratic functionals 207
  asymptotic expansion of the distributions 215
  stochastic asymptotic expansion 208, 210-211, 232-233
Rao conjecture 250
Rao statistic 230
regression
  analysis 1
  function 1
  model 1
    linear 1, 73, 131-132
    log-linear 73
    non-linear 1
regular reparametrisation 284
reparametrisation 260
signed measure 125, 126, 130, 295
simple hypothesis 230
skewness 205, 206, 253, 271
statistical manifold 252
strong consistency of
  least moduli estimator 70, 71
  least squares estimator 64, 65, 67
  minimum contrast estimator 62
tangential space 252
tensor
  excess 268
  metric 252, 257
    associated 252
  Ricci 255
  Riemann-Christoffel 254
  skewness 268
virtual stochastic asymptotic expansion 214
virtual vector 214
  asymptotic expansion of the distribution 214
  initial terms 226-229
Wald statistic 207


Other Mathematics and Its Applications titles of interest:

P.M. Alberti and A. Uhlmann: Stochasticity and Partial Order. Doubly Stochastic
Maps and Unitary Mixing. 1982,128 pp. ISBN 90-277-1350-2
A.V. Skorohod: Random Linear Operators. 1983,216 pp. ISBN 90-277-1669-2
I.M. Stancu-Minasian: Stochastic Programming with Multiple Objective Functions.
1985,352 pp. ISBN 90-277-1714-1
L. Arnold and P. Kotelenez (eds.): Stochastic Space-Time Models and Limit
Theorems. 1985,280 pp. ISBN 90-277-2038-X
Y. Ben-Haim: The Assay of Spatially Random Material. 1985,336 pp.
ISBN 90-277-2066-5
A. Pazman: Foundations of Optimum Experimental Design. 1986, 248 pp.
ISBN 90-277-1865-2
P. Kree and C. Soize: Mathematics of Random Phenomena. Random Vibrations of
Mechanical Structures. 1986,456 pp. ISBN 90-277-2355-9
Y. Sakamoto, M. Ishiguro and G. Kitagawa: Akaike Information Criterion Statis-
tics. 1986,312 pp. ISBN 90-277-2253-6
G.J. Szekely: Paradoxes in Probability Theory and Mathematical Statistics. 1987,
264 pp. ISBN 90-277-1899-7
0.1. Aven, E.G. Coffman (Jr.) and Y.A. Kogan: Stochastic Analysis of Computer
Storage. 1987,264 pp. ISBN 90-277-2515-2
N.N. Vakhania, V.I. Tarieladze and S.A. Chobanyan: Probability Distributions on
Banach Spaces. 1987,512 pp. ISBN 90-277-2496-2
A.V. Skorohod: Stochastic Equations for Complex Systems. 1987,196 pp.
ISBN 90-277-2408-3
S. Albeverio, Ph. Blanchard, M. Hazewinkel and L. Streit (eds.): Stochastic
Processes in Physics and Engineering. 1988,430 pp. ISBN 90-277-2659-0
A. Liemant, K. Matthes and A. Wakolbinger: Equilibrium Distributions of
Branching Processes. 1988,240 pp. ISBN 90-277-2774-0
G. Adomian: Nonlinear Stochastic Systems Theory and Applications to Physics.
1988,244 pp. ISBN 90-277-2525-X
J. Stoyanov, O. Mirazchiiski, Z. Ignatov and M. Tanushev: Exercise Manual in
Probability Theory. 1988,368 pp. ISBN 90-277-2687-6
E.A. Nadaraya: Nonparametric Estimation of Probability Densities and Regression
Curves. 1988,224 pp. ISBN 90-277-2757-0
H. Akaike and T. Nakagawa: Statistical Analysis and Control of Dynamic Systems.
1988, 224 pp. ISBN 90-277-2786-4
A.V. Ivanov and N.N. Leonenko: Statistical Analysis of Random Fields. 1989, 256
pp. ISBN 90-277-2800-3
V. Paulauskas and A. Rackauskas: Approximation Theory in the Central Limit
Theorem. Exact Results in Banach Spaces. 1989, 176 pp. ISBN 90-277-2825-9
R.Sh. Liptser and A.N. Shiryayev: Theory of Martingales. 1989,808 pp.
ISBN 0-7923-0395-4
S.M. Ermakov, V.V. Nekrutkin and A.S. Sipin: Random Processes for Classical
Equations of Mathematical Physics. 1989, 304 pp. ISBN 0-7923-0036-X
G. Constantin and I. Istratescu: Elements of Probabilistic Analysis and Applica-
tions. 1989,488 pp. ISBN 90-277-2838-0
S. Albeverio, Ph. Blanchard and D. Testard (eds.): Stochastics, Algebra and
Analysis in Classical and Quantum Dynamics. 1990,264 pp. ISBN 0-7923-0637-6
Ya.I. Belopolskaya and Yu.L. Dalecky: Stochastic Equations and Differential
Geometry. 1990,288 pp. ISBN 90-277-2807-0
A.V. Gheorghe: Decision Processes in Dynamic Probabilistic Systems. 1990,372
pp. ISBN 0-7923-0544-2
V.L. Girko: Theory of Random Determinants. 1990, 702 pp. ISBN 0-7923-0233-8
S. Albeverio, Ph. Blanchard and L. Streit: Stochastic Processes and their Applica-
tions in Mathematics and Physics. 1990, 416 pp. ISBN 0-7923-0894-8
B.L. Rozovskii: Stochastic Evolution Systems. Linear Theory and Applications to
Non-linear Filtering. 1990,330 pp. ISBN 0-7923-0037-8
A.D. Wentzell: Limit Theorems on Large Deviations for Markov Stochastic
Process. 1990,192 pp. ISBN 0-7923-0143-9
K. Sobczyk: Stochastic Differential Equations. Applications in Physics, Engineer-
ing and Mechanics. 1991,410 pp. ISBN 0-7923-0339-3
G. Dallaglio, S. Kotz and G. Salinetti: Distributions with Given Marginals. 1991,
300 pp. ISBN 0-7923-1156-6
A.V. Skorohod: Random Processes with Independent Increments. 1991,280 pp.
ISBN 0-7923-0340-7
L. Saulis and V.A. Statulevicius: Limit Theorems for Large Deviations. 1991,232
pp. ISBN 0-7923-1475-1
A.N. Shiryaev (ed.): Selected Works of A.N. Kolmogorov, Vol. 2: Probability
Theory and Mathematical Statistics. 1992,598 pp. ISBN 90-277-2795-X
Yu.I. Neimark and P.S. Landa: Stochastic and Chaotic Oscillations. 1992,502 pp.
ISBN 0-7923-1530-8
Y. Sakamoto: Categorical Data Analysis by AlC. 1992,260 pp.


ISBN 0-7923-1429-8
Lin Zhengyan and Lu Zhuarong: Strong Limit Theorems. 1992,200 pp.
ISBN 0-7923-1798-0
J. Galambos and I. Katai (eds.): Probability Theory and Applications. 1992, 350
pp. ISBN 0-7923-1922-2
N. Bellomo, Z. Brzezniak and L.M. de Socio: Nonlinear Stochastic Evolution
Problems in Applied Sciences. 1992,220 pp. ISBN 0-7923-2042-5
A.K. Gupta and T. Varga: Elliptically Contoured Models in Statistics. 1993, 328
pp. ISBN 0-7923-2115-4
B.E. Brodsky and B.S. Darkhovsky: Nonparametric Methods in Change-Point
Problems. 1993,210 pp. ISBN 0-7923-2122-7
V.G. Voinov and M.S. Nikulin: Unbiased Estimators and Their Applications.
Volume 1: Univariate Case. 1993,522 pp. ISBN 0-7923-2382-3
V.S. Koroljuk and Yu.V. Borovskich: Theory of U-Statistics. 1993, 552 pp.
ISBN 0-7923-2608-3
A.P. Godbole and S.G. Papastavridis (eds.): Runs and Patterns in Probability:
Selected Papers. 1994,358 pp. ISBN 0-7923-2834-5
Yu. Kutoyants: Identification of Dynamical Systems with Small Noise. 1994, 298
pp. ISBN 0-7923-3053-6
M.A. Lifshits: Gaussian Random Functions. 1995, 346 pp. ISBN 0-7923-3385-3
M.M. Rao: Stochastic Processes: General Theory. 1995,635 pp.
ISBN 0-7923-3725-5
Yu.A. Rozanov: Probability Theory, Random Processes and Mathematical Statis-
tics. 1995,267 pp. ISBN 0-7923-3764-6
L. Zhengyan and L. Chuanrong: Limit Theory for Mixing Dependent Random
Variables. 1996,352 pp. ISBN 0-7923-4219-4
A.V. Ivanov: Asymptotic Theory of Nonlinear Regression. 1997, 333 pp.
ISBN 0-7923-4335-2
