
UNIVERSITY OF DAR ES SALAAM

MBEYA COLLEGE OF HEALTH AND ALLIED SCIENCES

ER 200: EPIDEMIOLOGY AND BIOSTATISTICS


MODULE 2: BASIC BIOSTATISTICS AND
DEMOGRAPHY

Topic 2.2: The Normal Distribution


Normal (Gaussian) Distribution

• It is defined as a continuous frequency distribution of infinite range: the variable can take any real value, not just integers as in the binomial and Poisson distributions.
• It is the most important probability distribution in statistics and an important tool in the analysis of epidemiological data and in management science.
• It links frequency distributions to probability distributions.
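• If a random variable X follows a Normal distribution with mean $\mu$ and variance $\sigma^2$, written $X \sim N(\mu, \sigma^2)$, then its probability density function is given by:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad -\infty < x < \infty$$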
• The mean of a random variable X that follows the Normal distribution is $\mu$ and its variance is $\sigma^2$.
• Since the Normal probability density function (pdf) depends on $\mu$ and $\sigma$, different values of $\mu$ and $\sigma$ give different curves and different probabilities.
• For this reason, probabilities under the Normal pdf are usually obtained from a special statistical table.
• These tables give probabilities for the STANDARD NORMAL DISTRIBUTION, which has mean 0 and standard deviation 1.
• Therefore, to obtain probabilities under any Normal pdf, we convert the original variable X to the standardized variable Z given by:
$$Z = \frac{X - \mu}{\sigma}$$
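To illustrate why a single table for the standard Normal distribution is enough, the following minimal Python sketch standardizes a few values and compares the probability computed on the original scale with the one computed on the Z scale. The helper name `standardize` and the parameter values in `cases` are assumptions chosen for illustration; `statistics.NormalDist` from the Python standard library stands in for the printed table.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """Return the z-score of x for a Normal(mu, sigma^2) variable."""
    return (x - mu) / sigma


# Illustrative (assumed) values of x, mu and sigma; none come from the notes.
cases = [(120.0, 100.0, 15.0), (5.5, 4.0, 0.5), (-2.0, 1.0, 3.0)]

standard_normal = NormalDist()  # mean 0, standard deviation 1

results = []
for x, mu, sigma in cases:
    z = standardize(x, mu, sigma)
    direct = NormalDist(mu, sigma).cdf(x)  # P(X <= x) on the original scale
    via_z = standard_normal.cdf(z)         # P(Z <= z) on the standard scale
    results.append((z, direct, via_z))
    print(f"x={x}, mu={mu}, sigma={sigma}: z={z:.2f}, "
          f"direct={direct:.4f}, via_z={via_z:.4f}")
```

In each case the two probabilities agree, which is the reason only the standard Normal distribution needs to be tabulated.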

• Z indicates how many standard deviations the point x lies from the mean.
• The standardized Normal pdf is given by:
$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \qquad -\infty < z < \infty$$
• The mean of the standardized Normal variable is 0 and its variance is 1.
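The statement that the standardized variable has mean 0 and variance 1 can be checked numerically. The sketch below is only an illustration: it writes the standard Normal density $\frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ as a hypothetical helper `standard_normal_pdf` and approximates the relevant integrals with a simple midpoint sum over the assumed range [-10, 10].

```python
import math


def standard_normal_pdf(z):
    """Density of the standard Normal distribution at z."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# Midpoint rule on [-10, 10]; the range and step size are assumptions
# chosen so that the neglected tails are negligible.
step = 0.001
grid = [-10 + (i + 0.5) * step for i in range(20000)]

area = sum(standard_normal_pdf(z) for z in grid) * step
mean = sum(z * standard_normal_pdf(z) for z in grid) * step
variance = sum((z - mean) ** 2 * standard_normal_pdf(z) for z in grid) * step

print(f"area={area:.6f}, mean={mean:.6f}, variance={variance:.6f}")
```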
Properties of the Normal Distribution
1. The curve of the Normal distribution is bell-shaped and symmetric about the line $X = \mu$ (the two halves of the curve are mirror images).
2. Under the Normal distribution, the mean, median, and mode coincide, i.e. mean = median = mode.
3. No portion of the curve lies below the horizontal (abscissa) axis, because the curve is a probability density and a density cannot be negative.
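4. The total area under the Normal curve is 1. In particular:
• The area within ± 1 standard deviation of the mean is about 68.27% of the area under the curve (68.26% when read from four-decimal tables). That is:
$$P(\mu - \sigma < X < \mu + \sigma) \approx 0.6827$$
• The area within ± 2 standard deviations of the mean is about 95.45% of the area under the curve (roughly 95%; exactly 95% corresponds to ± 1.96 standard deviations), i.e.
$$P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.9545$$
• The area within ± 3 standard deviations of the mean is about 99.7% of the area under the curve, i.e.
$$P(\mu - 3\sigma < X < \mu + 3\sigma) \approx 0.9973$$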
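The areas within 1, 2 and 3 standard deviations of the mean can be checked without a table. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1


def area_within(k):
    """Area under the standard Normal curve between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(f"within +/- {k} SD: {area_within(k):.4f}")

# Number of standard deviations that encloses exactly 95% of the area.
central_95 = Z.inv_cdf(0.975)
print(f"exactly 95% lies within +/- {central_95:.2f} SD")
```

Because the areas are stated in units of standard deviations, the same values hold for any Normal distribution, whatever its mean and standard deviation.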
Use of Statistical Tables to Obtain Probabilities for Normal Random Variables
• This involves finding the area under the Normal curve for variables that are normally distributed.
• Several kinds of statistical tables can be used to find the area under the Normal curve.
• The most important things to understand here are properties 1 and 4 of the Normal curve: the total area under the curve is 1, and the curve is symmetric about $X = \mu$ (equivalently $Z = 0$).
• Together, these two properties imply that the line $X = \mu$ (or $Z = 0$) divides the curve into two equal halves, each with area 0.5, making a total area of 1.
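Different tables report different areas (for example, the area to the left of z, or the area between the mean and z), and the two properties above are what allow conversion between them. The sketch below illustrates this under the assumption that `statistics.NormalDist().cdf` plays the role of a table of left-hand areas; the helper names `upper_tail` and `area_mean_to_z` are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1


def upper_tail(z):
    """Area to the right of z: total area (1) minus the area to the left."""
    return 1 - Z.cdf(z)


def area_mean_to_z(z):
    """Area between the mean (Z = 0) and a positive z: left area minus 0.5."""
    return Z.cdf(z) - 0.5


z = 1.0  # illustrative value, not taken from the notes
print(f"left of {z}:           {Z.cdf(z):.4f}")
print(f"right of {z}:          {upper_tail(z):.4f}")
print(f"left of {-z} (symmetry): {Z.cdf(-z):.4f}")
print(f"between 0 and {z}:     {area_mean_to_z(z):.4f}")
```

By symmetry, the area to the right of z equals the area to the left of -z, so a table that lists only positive z values is sufficient.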
Examples

Example 1:
Using statistical tables, evaluate the following:

Example 2:
i. What proportion of the area under the normal curve falls beyond a z-score of 1.29?
ii. What proportion of the area under the normal curve falls between the mean and a z-score of 1.29?
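One way to check the table lookups for Example 2 is sketched below. It assumes that "beyond" means the area to the right of z = 1.29 and uses `statistics.NormalDist` in place of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution
z = 1.29

left_area = Z.cdf(z)             # area to the left of z = 1.29
beyond = 1 - left_area           # (i) area beyond z = 1.29
mean_to_z = left_area - 0.5      # (ii) area between the mean and z = 1.29

print(f"(i)  beyond z = 1.29:           {beyond:.4f}")
print(f"(ii) between mean and z = 1.29: {mean_to_z:.4f}")
```

The two answers add up to 0.5, as expected for the half of the curve to the right of the mean.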
Example 3
Use the table to find the following areas under the standard normal curve.
i. The area that lies to the left of Z = -0.58.
ii. The area that lies between Z = -1.16 and Z = 2.71.
iii. The area that lies to the right of Z = 0.31.
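The three areas in Example 3 can be verified with a short sketch, again using `statistics.NormalDist` as a stand-in for the table. The variable names are assumptions made for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution

area_left = Z.cdf(-0.58)                   # (i)   left of Z = -0.58
area_between = Z.cdf(2.71) - Z.cdf(-1.16)  # (ii)  between Z = -1.16 and Z = 2.71
area_right = 1 - Z.cdf(0.31)               # (iii) right of Z = 0.31

print(f"(i)   {area_left:.4f}")
print(f"(ii)  {area_between:.4f}")
print(f"(iii) {area_right:.4f}")
```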
Example 4
Suppose that the random variable X is normally distributed with a mean of 100 and a standard deviation of 5. What is the probability of obtaining an X value of 110 or more?
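A minimal sketch of the calculation for Example 4: standardize x = 110, then take the upper-tail area of the standard Normal distribution. `statistics.NormalDist` replaces the table lookup; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 100.0, 5.0
x = 110.0

z = (x - mu) / sigma                    # standardized value
p_at_least = 1 - NormalDist().cdf(z)    # P(X >= 110) = P(Z >= z)

print(f"z = {z:.2f}, P(X >= {x:.0f}) = {p_at_least:.4f}")
```

Here z = 2, so the required probability is the area to the right of two standard deviations above the mean, about 0.0228.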

Example 5
Compute the value of $Z_1$ that gives each of the following probabilities.
Example 6
Potassium blood levels in healthy humans are normally
distributed with a mean of 17.0 mg/100 ml, and standard
deviation of 1.0 mg/100 ml. Elevated levels of potassium
indicate an electrolyte balance problem, such as may be caused
by Addison’s disease. However, a test for potassium level
should not cause too many “false positives”. What level of
potassium should we use so that only 2.5 % of healthy
individuals are classified as “abnormally high”?
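Example 6 is an inverse problem: the probability is given and the cut-off value is required. A possible sketch, assuming the mean and standard deviation stated in the example and using `NormalDist.inv_cdf` in place of a reverse table lookup:

```python
from statistics import NormalDist

mu, sigma = 17.0, 1.0      # values stated in the example (mg/100 ml)
upper_tail = 0.025         # proportion of healthy people allowed above the cut-off

# z-value with 97.5% of the area to its left
z_cut = NormalDist().inv_cdf(1 - upper_tail)

# Convert back to the original scale: x = mu + z * sigma
cutoff = mu + z_cut * sigma

print(f"z = {z_cut:.2f}, cut-off = {cutoff:.2f} mg/100 ml")
```

With these inputs the cut-off is about 18.96 mg/100 ml, which is 1.96 standard deviations above the mean.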
Example 7
The duration of normal human pregnancy follows an approximately Normal distribution with a mean of 280 days and a standard deviation of 10 days. What percentage of healthy pregnant women are expected to give birth at least one week before the average duration of pregnancy?
Consider as particularly late-born the 10% of healthy women's babies that are born latest. At least how many days after the average duration of pregnancy are these babies born?
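Both parts of Example 7 can be sketched as follows. The sketch assumes that "at least one week before the average" means a duration of 273 days or less, and that the latest 10% are the births beyond the 90th percentile; the variable names are illustrative.

```python
from statistics import NormalDist

pregnancy = NormalDist(mu=280.0, sigma=10.0)  # duration in days

# Part 1: proportion of births at least 7 days before the mean.
p_early = pregnancy.cdf(280.0 - 7.0)

# Part 2: the latest 10% of births lie above the 90th percentile.
late_threshold = pregnancy.inv_cdf(0.90)
days_after_mean = late_threshold - 280.0

print(f"born at least one week early: {100 * p_early:.1f}%")
print(f"latest 10% are born at least {days_after_mean:.1f} days after the mean")
```

Under these assumptions roughly 24% of births occur at least one week early, and the latest 10% occur about 13 days or more after the mean.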
Distributions Related to the Normal Distribution
Three important distributions:
1. The Chi-square ($\chi^2$) distribution
2. The t distribution
3. The F distribution

1. The Chi-square ($\chi^2$) Distribution


Theorem 1:
Let $Z \sim N(0,1)$. If $X = Z^2$, then X follows the Chi-square distribution with 1 degree of freedom, i.e. $X \sim \chi^2_1$.
Theorem 2:
Let $Z_1, Z_2, \ldots, Z_n$ be independent random variables with $Z_i \sim N(0,1)$. If $Y = \sum_{i=1}^{n} Z_i^2$, then Y follows the Chi-square distribution with n degrees of freedom. We write $Y \sim \chi^2_n$.
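Theorem 2 can be illustrated by simulation. The sketch below draws sums of n squared standard Normal values and compares their sample mean and variance with n and 2n, which are the standard mean and variance of the Chi-square distribution with n degrees of freedom (a known fact, not stated in these notes). The choices n = 5, 20000 repetitions and the seed are arbitrary assumptions.

```python
import random
from statistics import fmean, variance

random.seed(0)   # fixed seed so the run is repeatable (arbitrary choice)

n = 5            # degrees of freedom (illustrative)
reps = 20000     # number of simulated values of Y (illustrative)

# Each Y is the sum of n squared independent N(0, 1) draws.
samples = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))
           for _ in range(reps)]

sample_mean = fmean(samples)
sample_var = variance(samples)

print(f"sample mean     = {sample_mean:.2f}  (theory: {n})")
print(f"sample variance = {sample_var:.2f}  (theory: {2 * n})")
```

The simulated values should be close to, but not exactly equal to, the theoretical ones, since they are estimates from a finite sample.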

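Theorem 3:
Let $X_1, X_2, \ldots, X_n$ be independent random variables with $X_i \sim N(\mu, \sigma^2)$. It follows directly from the previous theorem that if:
$$Y = \sum_{i=1}^{n} \left(\frac{X_i - \mu}{\sigma}\right)^2$$
then $Y \sim \chi^2_n$.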
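Theorem 4:
Let $X_1, X_2, \ldots, X_n$ be independent random variables with $X_i \sim N(\mu, \sigma^2)$. Define the sample variance and sample mean as:
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
Then,
$$\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$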
2. The t (Student’s t) Distribution
Theorem 5:
Let $Z \sim N(0,1)$ and $Y \sim \chi^2_n$. If Z and Y are independent, then the ratio:
$$X = \frac{Z}{\sqrt{Y/n}}$$
follows the t (or Student’s t) distribution with n degrees of freedom. We write $X \sim t_n$.
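The construction in Theorem 5 can also be illustrated by simulation. The sketch below builds each t value from an independent standard Normal draw and a Chi-square draw (itself a sum of n squared standard Normal draws), then compares the sample variance with n/(n-2), the standard variance of the t distribution for n > 2 (a known fact, not stated in these notes). The helper name `draw_t`, n = 10, the number of repetitions and the seed are assumptions.

```python
import math
import random
from statistics import variance

random.seed(1)   # fixed seed so the run is repeatable (arbitrary choice)

n = 10           # degrees of freedom (illustrative)
reps = 20000     # number of simulated t values (illustrative)


def draw_t(df):
    """Draw one value of Z / sqrt(Y / df) with Z ~ N(0,1), Y ~ chi-square(df)."""
    z = random.gauss(0.0, 1.0)
    y = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df))  # independent of z
    return z / math.sqrt(y / df)


draws = [draw_t(n) for _ in range(reps)]

sample_var = variance(draws)
tail_share = sum(abs(t) > 1.96 for t in draws) / reps

print(f"sample variance = {sample_var:.3f}  (theory: {n / (n - 2):.3f})")
print(f"share with |t| > 1.96 = {tail_share:.3f}  (standard Normal: 0.050)")
```

The share of values beyond ± 1.96 comes out larger than the 5% seen under the standard Normal distribution, reflecting the heavier tails of the t distribution at small degrees of freedom.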

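Theorem 6:
Let $X_1, X_2, \ldots, X_n$ be independent random variables with $X_i \sim N(\mu, \sigma^2)$, and let $\bar{X}$ and $S^2$ denote the sample mean and the sample variance (with divisor $n-1$). Then:
$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1}$$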
3. The F Distribution
Theorem 7:
Let $U \sim \chi^2_m$ and $V \sim \chi^2_n$. If U and V are independent, then:
$$F = \frac{U/m}{V/n} \sim F_{m,n}$$

Other Useful Relationships


1.

2.
