To cite this article: Gottfried E. Noether (1984) Nonparametrics: The Early Years - Impressions and Recollections, The American
Statistician, 38:3, 173-178, DOI: 10.1080/00031305.1984.10483194
Nonparametrics: The Early Years - Impressions and Recollections
GOTTFRIED E. NOETHER*
Bibliography of Nonparametric Statistics (Savage 1962) contains an entry dated 1710. In that year John Arbuthnot, scholar, scientist, mathematician, literary figure, and physician to Queen Anne, had a paper published entitled "An Argument for Divine Providence, Taken from the Constant Regularity Observed in the Births of Both Sexes." In the paper, Arbuthnot performed what surely must have been the first sign test and likely the first test ever of a statistical hypothesis. From records of christenings of infants in the city of London, Arbuthnot had noted that for each of the 82 years from 1629 to 1710, the number of male births exceeded the number of female births. On the basis of these observations, he rejected what he called the "hypothesis of chance," which assigned probability 1/2 to an excess of male over female births in any one year. Rather, Arbuthnot concluded, divine providence had ordained an excess of male births over female births to offset a higher male death rate and thus ensure equal proportions of adult males and females. The idea of "an omnipresent activating deity who maintains mean statistical values" formed the foundation of much of the statistics of the 18th century (Eisenhart and Birnbaum 1967).

We shall see that there are additional isolated instances of the use of nonparametric methods during the closing years of the 19th century and the early years of the 20th century. But many present-day writers on nonparametrics consider the Hotelling and Pabst (1936) paper on rank correlation to be the real start of the discipline now generally known as nonparametric statistics. There are even some statisticians who would postpone the start of nonparametrics until 1945, the year

the distributions are unknown, as the non-parametric case. (p. 264)

Wolfowitz then went on to explain:

The literature of theoretical statistics deals principally with the parametric case. The reasons for this are perhaps partly historic and partly the fact that interesting results could more readily be expected to follow from the assumption of normality. Another reason is that, while the parametric case was for long developed on an intuitive basis, progress in the nonparametric case requires the use of modern notions. However, the needs of theoretical completeness and of practical research require the development of the theory of the non-parametric case. (p. 265)

It seems clear from this quotation that Wolfowitz's main reason for adding the term nonparametric to the statistical vocabulary was to call attention to the need for research in a new field. Indeed, the second part of Wolfowitz's paper constituted an attempt to apply the likelihood ratio principle, which Neyman and Pearson had proposed 10 years earlier for the solution of parametric problems, to the nonparametric case. In retrospect the attempt was not very successful. Seven years later, while reviewing nonparametric inference at the first Berkeley Symposium, Wolfowitz (1949) observed that a small beginning had been made but that as yet no general theory of nonparametric tests existed.

General acceptance of the term nonparametric was rather slow. During the 1940's only a few mathematical statisticians at Columbia and Princeton Universities used it in papers published almost exclusively in the Annals of Mathematical Statistics. The first use of the term in the Journal of the American Statistical Association seems to be in Noether (1949a).

Soon after Wolfowitz called attention to nonparametrics, Scheffé (1943) responded with a paper that not only proposed a theoretical framework for the development of nonparametric theory but also provided a fairly complete list of then-existing procedures that could be called nonparametric.

This is how Scheffé viewed nonparametric hypothesis testing and estimation in 1943. He discussed hypothesis

* Gottfried E. Noether is Professor, Department of Statistics, University of Connecticut, Storrs, CT 06268. This article is based on the Pfizer Colloquium Lecture at the University of Connecticut, presented by the author on April 28, 1983. The Pfizer Colloquium series is under the sponsorship of Pfizer Central Research.
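
Arbuthnot's argument, as described above, can be restated as a one-sided sign test. The following minimal sketch assumes the modern formulation (independent years, each with probability 1/2 of a male excess under the "hypothesis of chance"); the function name is hypothetical, and the code illustrates the arithmetic rather than Arbuthnot's own calculation.

```python
from fractions import Fraction
from math import comb


def sign_test_upper_tail(successes: int, trials: int) -> Fraction:
    """Exact P(X >= successes) for X ~ Binomial(trials, 1/2)."""
    favourable = sum(comb(trials, k) for k in range(successes, trials + 1))
    return Fraction(favourable, 2 ** trials)


# 82 years (1629 through 1710), each showing an excess of male births.
YEARS = 82
p_value = sign_test_upper_tail(YEARS, YEARS)

# Under the hypothesis of chance the probability is (1/2)**82,
# which is on the order of 2e-25.
print(p_value == Fraction(1, 2 ** YEARS))
print(float(p_value))
```

A probability this small is what makes the rejection of the "hypothesis of chance" so decisive, whatever one thinks of the alternative Arbuthnot offered.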
Scheffé's discussion of nonparametric estimation was considerably shorter than the discussion of hypothesis testing. Before getting down to details, Scheffé found it necessary to define estimation in the nonparametric case. I may be reading too much into Scheffé's words, but I sense some uneasiness on his part in using the term nonparametric in connection with estimation. This is how he led up to estimation:

Let θ be a real number determined by a distribution F (a functional of F). Thus θ might be the mean of the distribution, in which case θ would be defined for all distributions possessing a first moment. We shall not call θ a parameter in order to avoid confusion with the parametric case. (p. 320)

Scheffé clearly recognized a potential problem with the term nonparametric. I shall say more about this problem later on.

By 1943 the subject of point estimation had received practically no attention from the nonparametric viewpoint. Scheffé simply mentioned that the ideas of unbiasedness and consistency of point estimates carried over from the parametric to the nonparametric case without change. He added that under very general conditions, the sample median was a consistent estimate of the population median.

Confidence intervals had fared only slightly better than point estimation. Scheffé mentioned only three problems: confidence intervals for the median of a population, confidence intervals for the difference of two medians, and confidence limits for an unknown distribution function.

The confidence interval for a population median bounded by two order statistics was proposed by Thompson (1936). For the difference of two medians, Scheffé proposed an interval based on separate confidence intervals for the two medians; but he added that the procedure was not very efficient. Alternatively, in the case of a shift model, Pitman's randomization test for the two-sample problem furnished a confidence interval for the median difference. Ten years later Lincoln

the problem of distribution-free tolerance intervals. The underlying problem had been posed by Shewhart, an engineer at the Bell Telephone Laboratories. For the one-dimensional case, the problem was solved by Wilks (1942) with the help of order statistics. An utterly simple but ingenious idea by Wald (1943) made it possible to use the Wilks approach with multivariate data as well. It is interesting to note that most present-day writers on nonparametrics do not even mention tolerance intervals.

The bibliography attached to Scheffé's survey paper included 58 titles. We can gain an idea of the rapid growth of nonparametrics during subsequent years by looking at the bibliography of Savage (1953, 1962). The first edition of this bibliography, published 10 years after the Scheffé survey, had 999 entries; a second edition, published nine years after the first edition, contained about 3,000 titles. No further editions of this useful bibliography were published, and I am unaware of any other comprehensive literature counts.

My personal initiation into nonparametrics occurred in the fall of 1946, when Wolfowitz offered at Columbia University what must have been one of the first courses (if not the first ever) in nonparametric statistics. Two years later I had the additional good fortune to be able to attend the guest lectures that Pitman gave at Columbia University. My lifelong interest in nonparametrics dates from these Wolfowitz and Pitman lectures.

Let me look at these two sets of lectures in some detail. On the whole, Wolfowitz covered the material mentioned in the Scheffé survey, supplementing it with new results that he and Wald had obtained in the meantime. One particular such result stands out in my mind: Condition W for the asymptotic normality of linear forms under randomization (Wald and Wolfowitz 1944). Asymptotic normality of relevant test statistics under the null hypothesis had been taken for granted all along. But earlier arguments had usually relied merely on the study of the first four moments of the test statis-
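
The order-statistic interval for a population median, attributed above to Thompson (1936), rests on a binomial argument that is easy to check numerically. The sketch below is an illustration under stated assumptions, not a reproduction of Thompson's derivation: it assumes a continuous population, so that the number of observations falling below the median is binomial with success probability 1/2. The function names and the sample values are hypothetical.

```python
from math import comb


def median_ci_coverage(n: int, r: int) -> float:
    """Coverage probability of the interval (X_(r), X_(n-r+1)) for the median.

    The interval misses the median only if fewer than r observations fall
    below it or fewer than r fall above it; each event has probability
    P(B <= r - 1) with B ~ Binomial(n, 1/2).
    """
    if not 1 <= r <= n // 2:
        raise ValueError("r must satisfy 1 <= r <= n // 2")
    one_tail = sum(comb(n, k) for k in range(r)) / 2 ** n
    return 1.0 - 2.0 * one_tail


def median_interval(sample: list[float], r: int) -> tuple[float, float]:
    """Return the r-th smallest and r-th largest observations."""
    ordered = sorted(sample)
    return ordered[r - 1], ordered[len(ordered) - r]


# Made-up data, for illustration only.
sample = [4.1, 2.7, 5.9, 3.3, 6.8, 1.2, 7.4, 3.9, 5.0, 4.6]

print(median_interval(sample, 2))   # (2.7, 6.8)
print(median_ci_coverage(10, 2))    # 0.978515625
```

The coverage depends only on n and r, not on the shape of the population, which is what makes the interval distribution-free.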
followed immediately from the fact that the normal distribution had finite moments of all orders.

In Noether (1949b) I further weakened Condition W. This weaker condition, after further simplification, now plays an important role in the theory of linear rank statistics (Hájek 1961).

Whereas Wolfowitz provided a broad survey of what was known in 1946 about nonparametrics, Pitman's lectures in 1948 focused on how to compare nonparametric methods with corresponding normal theory methods and with each other. Most statisticians of that time took it for granted that nonparametric methods were what they liked to call wasteful of information. As we shall see later, even Wilcoxon, who in 1945 proposed what has become the most widely used nonparametric procedure in existence, was convinced that his method did not use all the information available. But he was willing to make a sacrifice for the sake of simplicity.

Wolfowitz's response to criticism implying the wastefulness of nonparametric methods was straightforward and to the point: The only kind of information a nonparametric procedure is likely to waste is information that is unavailable anyway. In his Columbia University lectures, Pitman went a step further by proposing a quantitative measure for the comparison of two competing tests of the same hypothesis based on the comparison of respective sample sizes that produce equal power for equal alternatives. This quantity is now known as Pitman relative efficiency.

For the comparison of the sign test with the t test, the Pitman relative efficiency turned out to be a disappointingly low 2/π (or .64) if observations came from a normal population. A statistician who under such circumstances uses the sign test in place of the t test indeed wastes one observation out of every three. But this is not the complete picture, as Pitman pointed out as early as 1948. Since statisticians use the t test not only when sampling normal populations, supplementary information is needed. For populations with sufficiently long

tool I had needed in my thesis work on the effect of transformations of the observations in the Wald-Wolfowitz test of randomness. Due to circumstances, Pitman never published his work on relative efficiency beyond its inclusion in a restricted and highly prized set of lecture notes (Pitman 1948). As a result, Noether (1950), which resulted from my thesis, remained for several years the only generally available reference to Pitman efficiency.

Noether (1955) extended Pitman's results and called attention to the close connection between Pitman test efficiency and classical estimation efficiency. As early as 1936, Hotelling and Pabst, in dealing with the Spearman rank correlation coefficient, had used estimating efficiency as a measure of testing efficiency; and one year later Cochran (1937) used the same approach for comparing the sign test with the t test. But none of these statisticians ever elaborated on what they meant by testing efficiency, nor did they go beyond normal populations for their comparisons. It was Pitman's approach that permitted the first realistic evaluation of nonparametric tests of hypotheses. Fifteen years later, Lehmann (1963) completed the picture by showing that Pitman efficiency also provided the information required for the comparison of competing confidence interval procedures.

Thus starting in the late 1930's, a relatively small group of theoretical statisticians were seriously interested in establishing firm foundations for a new branch of statistics, which became known as nonparametric statistics. Unrelated to these efforts, and in some cases preceding them by many years, applied scientists in various fields of research attempted to get away from the restraints of the normal distribution. Some of these researchers were motivated by the conscious realization that methods based on the normality assumption might lead to misleading conclusions or might otherwise be inappropriate, and others, simply by a desire to reduce computational drudgery. Galton's preference for the
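
The efficiency figure quoted above for the sign test relative to the t test under normality can be checked with a few lines of arithmetic. The sketch below assumes the standard textbook expression for this Pitman efficiency, 4σ²f(m)², where f(m) is the population density at the median and σ² is the variance; the function name is hypothetical. The double-exponential (Laplace) case is added here purely as an illustration of a longer-tailed population and is not taken from the article.

```python
import math


def sign_vs_t_efficiency(density_at_median: float, variance: float) -> float:
    """Pitman efficiency of the sign test relative to the t test: 4 * var * f(m)**2."""
    return 4.0 * variance * density_at_median ** 2


# Normal population with standard deviation sigma: f(m) = 1 / (sigma * sqrt(2*pi)).
sigma = 1.0
normal_density = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
are_normal = sign_vs_t_efficiency(normal_density, sigma ** 2)

# Laplace population with scale b (illustrative assumption):
# f(m) = 1 / (2*b) and variance = 2 * b**2.
b = 1.0
are_laplace = sign_vs_t_efficiency(1.0 / (2.0 * b), 2.0 * b ** 2)

print(round(are_normal, 2))       # 0.64, i.e. 2/pi
print(round(1.0 - are_normal, 2)) # 0.36, roughly one observation in three
print(are_laplace)                # 2.0
```

Under normality the sign test needs about three observations for every two used by the t test to reach the same power, which is the "one observation out of every three" of the text.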
--- (1955), "On a Theorem of Pitman," The Annals of Mathematical Statistics, 26, 64-68.
--- (1967), "Needed-A New Name" (Letter to the Editor), The American Statistician, 21, 41.
--- (1974), "W.U. Behrens and Nonparametric Statistics," Biometrische Zeitschrift, 16, 97-101.
--- (1979), Comments on Professor J. Neyman's Pfizer Colloquium Lecture, "Clustering: Reminiscences of Some Episodes in My Research Activities," Washington, D.C.: American Statistical Association, Continuing Education Videotapes.
--- (1982), "Discussion of Koopmans' New Introductory Course," in Teaching of Statistics and Statistical Consulting, eds. J.S. Rustagi and D.A. Wolfe, New York: Academic Press.
PITMAN, E.J.G. (1948), Lecture Notes on Non-Parametric Statistics (dittoed), New York: Columbia University.
WALSH, JOHN E. (1968), Handbook of Nonparametric Statistics (Vols. 1-3), Princeton, N.J.: Van Nostrand.
WILCOXON, FRANK (1949), Some Rapid Approximate Statistical Procedures, New York: American Cyanamid Company.
WILKS, S.S. (1942), "Statistical Prediction With Special Reference to the Problem of Tolerance Limits," The Annals of Mathematical Statistics, 13, 400-409.
WOLFOWITZ, J. (1942), "Additive Partition Functions and a Class of Statistical Hypotheses," The Annals of Mathematical Statistics, 13, 247-279.
--- (1949), "Non-Parametric Statistical Inference," Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, Calif.: University of California Press, 93-113.