Test 4

The Single-Sample Test for Evaluating Population Skewness
(Parametric Test Employed with Interval/Ratio Data)

I. Hypothesis Evaluated with Test and Relevant Background Information
Hypothesis evaluated with test Does a sample of n subjects (or objects) come from a population distribution that is symmetrical (i.e., not skewed)?
Relevant background information on test Prior to reading this section the reader should
review the discussion of skewness in the Introduction. As noted in the Introduction, skewness
is a measure reflecting the degree to which a distribution is asymmetrical. From a statistical
perspective, the skewness of a distribution represents the third moment about the mean ($m_3$),
which is represented by Equation 4.1 (which is identical to Equation 1.16).

$$m_3 = \frac{\sum (X - \bar{X})^3}{n}$$ (Equation 4.1)

Skewness can be employed as a criterion for determining the goodness-of-fit of data with
respect to a normal distribution. Various sources (e.g., D'Agostino (1970, 1986), D'Agostino
and Stephens (1986), D'Agostino et al. (1990)) state that in spite of the fact that it is not as
commonly employed as certain alternative goodness-of-fit tests, the single-sample test for
evaluating population skewness provides an excellent test for evaluating a hypothesis of
goodness-of-fit for normality, when it is employed in conjunction with the result of the single-sample
test for evaluating population kurtosis (Test 5). The results of the latter two tests are
employed in the D'Agostino-Pearson test of normality (Test 5a), which is described in Section
VI of the single-sample test for evaluating population kurtosis. D'Agostino (1986),
D'Agostino et al. (1990), and Zar (1999) state that the D'Agostino-Pearson test of normality
provides a more powerful test of the normality hypothesis than does either the Kolmogorov-Smirnov
goodness-of-fit test for a single sample (Test 7) or the chi-square goodness-of-fit
test (Test 8) (both of which are described later in the book). D'Agostino et al. (1990) state that
because of their lack of power, the latter two tests should not be employed for assessing
normality. Other sources, however, take a more favorable attitude towards the
Kolmogorov-Smirnov goodness-of-fit test for a single sample and the chi-square goodness-of-fit
test as tests of goodness-of-fit for normality (e.g., Conover (1980, 1999), Daniel (1990),
Hollander and Wolfe (1999), Marascuilo and McSweeney (1977), Siegel and Castellan (1988),
and Sprent (1993)).
In the Introduction it was noted that since the value computed for $m_3$ is in cubed units,
it is often converted into the unitless statistic $g_1$. The latter, which is an estimate of the
population parameter $\gamma_1$ (where $\gamma$ represents the lower case Greek letter gamma), is commonly
employed to express skewness. When a distribution is symmetrical (about the mean), the value
of $g_1$ will equal 0. When the value of $g_1$ is significantly above 0, a distribution will be
positively skewed, and when it is significantly below 0, a distribution will be negatively skewed.

Copyright 2004 by Chapman & Hall/CRC
Handbook of Parametric and Nonparametric Statistical Procedures


Although the normal distribution is symmetrical (with $g_1 = 0$), as noted earlier, not all
symmetrical distributions are normal. Examples of nonnormal distributions that are
symmetrical are the t distribution and the binomial distribution when $\pi_1 = .5$. (The meaning
of the notation $\pi_1 = .5$ is explained in Section I of the binomial sign test for a single sample
(Test 9).)
It was also noted in the Introduction that some sources (e.g., D'Agostino (1970, 1986) and
D'Agostino et al. (1990)) convert the value of $g_1$ into the statistic $\sqrt{b_1}$. The latter is an estimate
of a population parameter designated $\sqrt{\beta_1}$ (where $\beta$ represents the lower case Greek letter beta),
which is also employed to represent skewness. When a distribution is symmetrical (such as in
the case of a normal distribution), the value of $\sqrt{b_1}$ will equal 0. When the value of $\sqrt{b_1}$
is significantly above 0, a distribution will be positively skewed, and when it is significantly below
0, a distribution will be negatively skewed.
The single-sample test for evaluating population skewness is the procedure for
determining whether or not a $g_1$ and/or $\sqrt{b_1}$ value deviates significantly from 0. The normal
distribution is employed to provide an approximation of the exact sampling distribution for the
statistics $g_1$ and $\sqrt{b_1}$. Thus, the test statistic computed for the single-sample test for
evaluating population skewness is a z value.¹

II. Example
Example 4.1 A researcher wishes to evaluate the data in three samples (comprised of 10
scores per sample) for skewness. Specifically, the researcher wants to determine whether or not
the samples meet the criteria for symmetry as opposed to positive versus negative skewness.
The three samples will be designated Sample E, Sample F, and Sample G. The researcher has
reason to believe that Sample E is derived from a symmetrical population distribution, Sample
F from a negatively skewed population distribution, and Sample G from a positively skewed
population distribution. The data for the three distributions are presented below.
Distribution E: 0, 0, 0, 5, 5, 5, 5, 10, 10, 10
Distribution F: 0, 1, 1, 9, 9, 10, 10, 10, 10, 10
Distribution G: 0, 0, 0, 0, 0, 1, 1, 9, 9, 10
Are the data consistent with what the researcher believes to be true regarding the
underlying population distributions?

III. Null versus Alternative Hypotheses

Null hypothesis

$H_0: \gamma_1 = 0$ or $H_0: \sqrt{\beta_1} = 0$

(The underlying population distribution the sample represents is symmetrical, in which case
the population parameters $\gamma_1$ and $\sqrt{\beta_1}$ are equal to 0.)

Alternative hypothesis

$H_1: \gamma_1 \neq 0$ or $H_1: \sqrt{\beta_1} \neq 0$

(The underlying population distribution the sample represents is not symmetrical, in which
case the population parameters $\gamma_1$ and $\sqrt{\beta_1}$ are not equal to 0. This is a nondirectional
alternative hypothesis, and it is evaluated with a two-tailed test. In order to be supported, the
absolute value of z must be equal to or greater than the tabled critical two-tailed z value at the
prespecified level of significance. Thus, either a significant positive z value or a significant
negative z value will provide support for this alternative hypothesis.)

or

$H_1: \gamma_1 > 0$ or $H_1: \sqrt{\beta_1} > 0$

(The underlying population distribution the sample represents is positively skewed, in which
case the population parameters $\gamma_1$ and $\sqrt{\beta_1}$ are greater than 0. This is a directional alternative
hypothesis, and it is evaluated with a one-tailed test. It will only be supported if the sign of z
is positive, and the absolute value of z is equal to or greater than the tabled critical one-tailed
z value at the prespecified level of significance.)

or

$H_1: \gamma_1 < 0$ or $H_1: \sqrt{\beta_1} < 0$

(The underlying population distribution the sample represents is negatively skewed, in which
case the population parameters $\gamma_1$ and $\sqrt{\beta_1}$ are less than 0. This is a directional alternative
hypothesis, and it is evaluated with a one-tailed test. It will only be supported if the sign of z
is negative, and the absolute value of z is equal to or greater than the tabled critical one-tailed
z value at the prespecified level of significance.)
Note: Only one of the above noted alternative hypotheses is employed. If the alternative
hypothesis the researcher selects is supported, the null hypothesis is rejected.

IV. Test Computations


The three distributions presented in Example 4.1 are identical to Distributions E, F, and G
employed in the Introduction to demonstrate the computation of the values $m_3$, $g_1$, and $\sqrt{b_1}$.
Employing Equations 1.19/1.20, 1.21, and 1.22, the following values were previously computed
for $m_3$, $g_1$, and $\sqrt{b_1}$ for Distributions E, F, and G: $m_{3_E} = 0$, $m_{3_F} = -86.67$, $m_{3_G} = 86.67$,
$g_{1_E} = 0$, $g_{1_F} = -1.02$, $g_{1_G} = 1.02$, and $\sqrt{b_{1_E}} = 0$, $\sqrt{b_{1_F}} = -.86$, $\sqrt{b_{1_G}} = .86$
(the computation of the latter values is summarized in Tables 1.3-1.5).
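
The values quoted above can be checked with a short script. The following is a minimal sketch, not a reproduction of Equations 1.19-1.22 (which are not shown in this chapter): it assumes the unbiased-estimate forms m3 = nΣ(X - X̄)³/((n - 1)(n - 2)), g1 = m3/s³ (with s computed on n - 1), and √b1 = (n - 2)g1/√(n(n - 1)), which were chosen because they reproduce the reported values. The names `skewness_stats` and `DISTRIBUTIONS` are illustrative only.

```python
import math


def skewness_stats(scores):
    """Return (m3, g1, sqrt_b1) for a list of scores.

    Assumed formulas (selected because they reproduce the values in the text):
      m3      = n * sum((X - mean)^3) / ((n - 1) * (n - 2))
      g1      = m3 / s^3, where s is the standard deviation with n - 1 in the denominator
      sqrt_b1 = (n - 2) * g1 / sqrt(n * (n - 1))
    """
    n = len(scores)
    mean = sum(scores) / n
    sum_squares = sum((x - mean) ** 2 for x in scores)
    sum_cubes = sum((x - mean) ** 3 for x in scores)
    m3 = n * sum_cubes / ((n - 1) * (n - 2))
    s = math.sqrt(sum_squares / (n - 1))
    g1 = m3 / s ** 3
    sqrt_b1 = (n - 2) * g1 / math.sqrt(n * (n - 1))
    return m3, g1, sqrt_b1


# Data from Example 4.1.
DISTRIBUTIONS = {
    "E": [0, 0, 0, 5, 5, 5, 5, 10, 10, 10],
    "F": [0, 1, 1, 9, 9, 10, 10, 10, 10, 10],
    "G": [0, 0, 0, 0, 0, 1, 1, 9, 9, 10],
}

for name, scores in DISTRIBUTIONS.items():
    m3, g1, sqrt_b1 = skewness_stats(scores)
    print(f"Distribution {name}: m3 = {m3:.2f}, g1 = {g1:.2f}, sqrt_b1 = {sqrt_b1:.2f}")
```

Distributions F and G are mirror images of one another about their means, which is why their statistics differ only in sign.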
Equations 4.2-4.8 (which are presented in Zar (1999, pp. 115-116)) summarize the steps
that are involved in computing the test statistic (which, as is noted above, is a z value) for the
single-sample test for evaluating population skewness. Zar (1999) states that Equation 4.8
provides a good approximation of the exact probabilities for the sampling distribution of $g_1$
(which is employed to compute the value of $\sqrt{b_1}$ that is used in Equation 4.2), when $n \geq 9$.
Note that in Equations 4.6 and 4.8, the notation ln represents the natural logarithm of a number
(which is defined in Endnote 13 in the Introduction).

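$$A = \sqrt{b_1}\,\sqrt{\frac{(n + 1)(n + 3)}{6(n - 2)}}$$ (Equation 4.2)

$$B = \frac{3(n^2 + 27n - 70)(n + 1)(n + 3)}{(n - 2)(n + 5)(n + 7)(n + 9)}$$ (Equation 4.3)

$$C = \sqrt{2(B - 1)} - 1$$ (Equation 4.4)

$$D = \sqrt{C}$$ (Equation 4.5)

$$E = \frac{1}{\sqrt{\ln D}}$$ (Equation 4.6)

$$F = \frac{A}{\sqrt{\dfrac{2}{C - 1}}}$$ (Equation 4.7)

$$z = E \ln\left(F + \sqrt{F^2 + 1}\right)$$ (Equation 4.8)

Employing Equations 4.2-4.8, the values $z_E = 0$, $z_F = -1.53$, and $z_G = 1.53$ are
computed for Distributions E, F, and G.

Distribution E ($\sqrt{b_1} = 0$, $n = 10$): $z_E = 0$
Distribution F ($\sqrt{b_1} = -.86$, $n = 10$): $z_F = -1.53$
Distribution G ($\sqrt{b_1} = .86$, $n = 10$): $z_G = 1.53$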
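
The chain of intermediate values leading to z can be sketched in code. The version below is a hedged illustration: it assumes that Equations 4.2-4.8 express the transformation of D'Agostino (1970) in its usual published form, with the intermediate quantities named A through F; the function name `skewness_z` is hypothetical. Because the code carries full precision rather than rounding each intermediate value to two decimal places, its results for Distributions F and G may differ slightly in the second decimal place from the hand-computed values of -1.53 and 1.53.

```python
import math


def skewness_z(sqrt_b1, n):
    """Normal approximation for the skewness statistic (assumed form of Equations 4.2-4.8).

    The text describes the approximation as good when n >= 9.
    """
    a = sqrt_b1 * math.sqrt((n + 1) * (n + 3) / (6 * (n - 2)))   # Equation 4.2
    b = (3 * (n ** 2 + 27 * n - 70) * (n + 1) * (n + 3)
         / ((n - 2) * (n + 5) * (n + 7) * (n + 9)))              # Equation 4.3
    c = math.sqrt(2 * (b - 1)) - 1                               # Equation 4.4
    d = math.sqrt(c)                                             # Equation 4.5
    e = 1 / math.sqrt(math.log(d))                               # Equation 4.6
    f = a / math.sqrt(2 / (c - 1))                               # Equation 4.7
    return e * math.log(f + math.sqrt(f ** 2 + 1))               # Equation 4.8


# sqrt_b1 values reported in the text for Distributions E, F, and G (n = 10 each).
for name, sqrt_b1 in (("E", 0.0), ("F", -0.86), ("G", 0.86)):
    print(f"Distribution {name}: z = {skewness_z(sqrt_b1, 10):.2f}")
```

The last step is the inverse hyperbolic sine of F scaled by E, so the sign of z always matches the sign of √b1, and a perfectly symmetrical sample yields z = 0.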

V. Interpretation of the Test Results


The obtained values $z_E = 0$, $z_F = -1.53$, and $z_G = 1.53$ are evaluated with Table A1 (Table
of the Normal Distribution) in the Appendix. In Table A1 the tabled critical two-tailed .05
and .01 values are $z_{.05} = 1.96$ and $z_{.01} = 2.58$, and the tabled critical one-tailed .05 and .01
values are $z_{.05} = 1.65$ and $z_{.01} = 2.33$. Since the computed absolute values $|z_E| = 0$,
$|z_F| = 1.53$, and $|z_G| = 1.53$ are all less than the tabled critical two-tailed value $z_{.05} = 1.96$ and
the tabled critical one-tailed value $z_{.05} = 1.65$, the null hypothesis cannot be rejected,
regardless of which alternative hypothesis is employed.
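
The decision rule for the three alternative hypotheses can be written out explicitly. The sketch below uses only the tabled critical values quoted above; the function name `reject_null` and the labels "nondirectional", "positive", and "negative" are illustrative choices rather than notation from the text.

```python
# Tabled critical z values quoted in the text (Table A1).
CRITICAL_Z = {
    ("two-tailed", 0.05): 1.96,
    ("two-tailed", 0.01): 2.58,
    ("one-tailed", 0.05): 1.65,
    ("one-tailed", 0.01): 2.33,
}


def reject_null(z, alternative="nondirectional", alpha=0.05):
    """Return True if the null hypothesis of symmetry is rejected.

    alternative: "nondirectional" (skewness != 0), "positive" (skewness > 0),
    or "negative" (skewness < 0).
    """
    if alternative == "nondirectional":
        # Two-tailed test: only the absolute value of z matters.
        return abs(z) >= CRITICAL_Z[("two-tailed", alpha)]
    critical = CRITICAL_Z[("one-tailed", alpha)]
    if alternative == "positive":
        # One-tailed test: z must be positive and at least the critical value.
        return z >= critical
    if alternative == "negative":
        # One-tailed test: z must be negative and at most minus the critical value.
        return z <= -critical
    raise ValueError(f"unknown alternative: {alternative!r}")


for name, z in (("E", 0.0), ("F", -1.53), ("G", 1.53)):
    for alternative in ("nondirectional", "positive", "negative"):
        print(name, alternative, reject_null(z, alternative))
```

For the three z values of Example 4.1 every call returns False, which matches the conclusion that the null hypothesis cannot be rejected under any of the alternatives.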
The computation of the value $z_E = 0$ for Distribution E is consistent with the fact that the
latter distribution is employed to represent a symmetrical distribution. Thus, the nondirectional
alternative hypothesis $H_1: \gamma_1 \neq 0$ (or $H_1: \sqrt{\beta_1} \neq 0$) is not supported. Whenever a distribution
has perfect symmetry, $g_1$ (as well as $\sqrt{b_1}$) will equal 0, and consequently the value computed
for z will also equal 0.
Although not statistically significant, the data for Distribution F are consistent with the
directional alternative hypothesis $H_1: \gamma_1 < 0$ (or $H_1: \sqrt{\beta_1} < 0$). Similarly, although not
statistically significant, the data for Distribution G are consistent with the directional
alternative hypothesis $H_1: \gamma_1 > 0$ (or $H_1: \sqrt{\beta_1} > 0$). Note that because $g_{1_F} = -1.02$ and
$\sqrt{b_{1_F}} = -.86$ are negative numbers, a negative z value is obtained for Distribution F (which is
hypothesized to represent a negatively skewed distribution). In the same respect, since
$g_{1_G} = 1.02$ and $\sqrt{b_{1_G}} = .86$ are positive numbers, a positive z value is obtained for Distribution
G (which is hypothesized to represent a positively skewed distribution). Whenever a
distribution is negatively skewed, the computed value of z will be a negative number, and
whenever a distribution is positively skewed, the computed value of z will be a positive number.
The fact that the values $z_F = -1.53$ and $z_G = 1.53$ are not statistically significant (although
they are not that far removed from the tabled critical one-tailed .05 value $z_{.05} = 1.65$) may in large
part be attributed to the fact that a small sample size (i.e., n = 10) is employed to
represent each distribution. A small sample size severely reduces the power of a statistical test,
thus making it more difficult to obtain a statistically significant result (i.e., in this case, a
significant deviation from symmetry).
It should be noted that in most instances when a researcher has reason to evaluate a
distribution with regard to skewness, the researcher will employ a sample size that is much larger than
the value n = 10 employed in Example 4.1. Section VII discusses tables that document the exact
sampling distribution for the $g_1$ statistic, and contrasts the results obtained with the latter tables
with the results obtained in this section.

VI. Additional Analytical Procedures for the Single-Sample Test for Evaluating Population Skewness and/or Related Tests
1. Note on the D'Agostino-Pearson test of normality (Test 5a) Most researchers would not
consider the result of the single-sample test for evaluating population skewness, in and of
itself, as sufficient evidence for establishing goodness-of-fit for normality. As noted in Section
I, a procedure is presented in Section VI of the single-sample test for evaluating population
kurtosis, which employs the z value based on the computed value of $g_1$ (which is employed to
compute the value of $\sqrt{b_1}$ that is used in Equation 4.2) and a z value based on the computed value
of $g_2$ (which is a measure of kurtosis that is discussed in the Introduction and in the single-sample
test for evaluating population kurtosis) to evaluate whether or not a set of data is
derived from a normal distribution. The latter procedure is referred to as the D'Agostino-Pearson
test of normality.

VII. Additional Discussion of the Single-Sample Test for Evaluating Population Skewness
1. Exact tables for the single-sample test for evaluating population skewness Zar (1999)
has derived exact tables for the absolute value of the $g_1$ statistic for sample sizes in the range
$9 \leq n \leq 1000$. By employing the exact tables, one can avoid the tedious computations that are
described in Section IV for the single-sample test for evaluating population skewness (which
employs the normal distribution to approximate the exact sampling distribution). In Zar's
(1999) tables, the tabled critical two-tailed .05 and .01 values for $g_1$ are $g_{1_{.05}} = 1.359$ and
$g_{1_{.01}} = 1.846$, and the tabled critical one-tailed .05 and .01 values are $g_{1_{.05}} = 1.125$ and
$g_{1_{.01}} = 1.643$. In order to reject the null hypothesis, the computed absolute value of $g_1$ must
be equal to or greater than the tabled critical value (and if a directional alternative hypothesis
is evaluated, the sign of $g_1$ must be in the predicted direction). The probabilities derived for
Example 4.1 (through use of Equation 4.8) are extremely close to the exact probabilities listed
in Zar (1999). The probability for Distribution E is identical to Zar's (1999) exact probability.
With respect to Distributions F and G, the probabilities listed in Zar's (1999) tables for the
values $g_{1_F} = -1.02$ and $g_{1_G} = 1.02$ are very close to the tabled probabilities in Table A1 for
the computed values $z_F = -1.53$ and $z_G = 1.53$. Note that the absolute value $|g_1| = 1.02$ just
falls short of being significant at the .05 level if a one-tailed analysis is conducted. In the case
of Example 4.1, the same conclusions regarding the null hypothesis will be reached, regardless
of whether one employs the normal approximation or Zar's (1999) tables.
2. Note on a nonparametric test for evaluating skewness Zar (1999, pp. 119-120) describes
a nonparametric procedure for evaluating skewness/symmetry around the median of a distribution
(as opposed to the mean). The latter test is based on the Wilcoxon signed-ranks test
(Test 6), which is one of the nonparametric procedures described in this book.

VIII. Additional Examples Illustrating the Use of the Single-Sample Test for Evaluating Population Skewness
No additional examples will be presented in this section.

References
Conover, W. J. (1980). Practical nonparametric statistics (2nd ed.). New York: John Wiley
& Sons.
Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York: John Wiley
& Sons.
D'Agostino, R. B. (1970). Transformation to normality of the null distribution of $g_1$.
Biometrika, 57, 679-681.
D'Agostino, R. B. (1986). Tests for the normal distribution. In D'Agostino, R. B. and
Stephens, M. A. (Eds.), Goodness-of-fit techniques (pp. 367-419). New York: Marcel
Dekker.
D'Agostino, R. B. and Stephens, M. A. (Eds.) (1986). Goodness-of-fit techniques. New York:
Marcel Dekker.
D'Agostino, R. B., Belanger, A. and D'Agostino Jr., R. B. (1990). A suggestion for using
powerful and informative tests of normality. American Statistician, 44, 316-321.
Daniel, W. W. (1990). Applied nonparametric statistics (2nd ed.). Boston: PWS-Kent Publishing Company.
Hollander, M. and Wolfe, D. A. (1999). Nonparametric statistical methods. New York: John
Wiley & Sons.
Marascuilo, L. A. and McSweeney, M. (1977). Nonparametric and distribution-free methods
for the social sciences. Monterey, CA: Brooks/Cole Publishing Company.
Siegel, S. and Castellan, N. J., Jr. (1988). Nonparametric statistics for the behavioral
sciences (2nd ed.). New York: McGraw-Hill Book Company.
Sprent, P. (1993). Applied nonparametric statistical methods (2nd ed.). London: Chapman
& Hall.
Tabachnick, B. G. and Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Boston:
Allyn & Bacon.
Zar, J. H. (1999). Biostatistical analysis (4th ed.). Upper Saddle River, NJ: Prentice Hall.

Endnotes
1. The reader should take note of the fact that the test for evaluating population skewness
described in this chapter is a large sample approximation. In point of fact, sources are not
in agreement with respect to what equation provides for the best test of the hypothesis of
whether or not a $g_1$ and/or $\sqrt{b_1}$ value deviates significantly from 0. The general format
of alternative equations for evaluating skewness which are employed in other sources (e.g.,
statistical software packages such as SPSS, SAS, S-Plus) involves the computation of a z
value by dividing the value computed for skewness (represented by $g_1$) by the estimated
population standard error. The fact that a different z value can result from use of one or
more of these alternative equations derives from the fact that sources are not in agreement
with respect to what statistic provides the best estimate of the population standard error
(SE). (As an example, Tabachnick and Fidell (2001) note that the value of the standard
error can be approximated with the equation $SE = \sqrt{6/n}$, and thus $z = g_1 / SE$.) In view of
the latter, use of exact tables of the underlying sampling distribution (discussed in Section
VII) allows for the optimal analysis of the hypothesis of whether or not a $g_1$ and/or $\sqrt{b_1}$
value deviates significantly from 0.
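
As a brief illustration of how the alternative standard-error approach can yield a different z value, the sketch below applies the approximation SE = √(6/n) quoted in this endnote to the g1 value reported for Distribution F. The function name `z_from_standard_error` is hypothetical, and the example is only an arithmetic check of the quoted formula, not a description of how any particular software package computes its standard error.

```python
import math


def z_from_standard_error(g1, n):
    """Compute z = g1 / SE using the approximation SE = sqrt(6 / n)."""
    standard_error = math.sqrt(6 / n)
    return g1 / standard_error


# g1 reported in the text for Distribution F, with n = 10.
print(round(z_from_standard_error(-1.02, 10), 2))  # prints -1.32
```

The result is smaller in absolute value than the z value of -1.53 obtained in Section IV, which illustrates why the choice of equation can matter when a result lies near a critical value.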
