
Biostatistics Reading Notes

Chapter 2

When a population follows a normal distribution, the mean and standard deviation (sd)
are enough to describe the population and the amount of variability among its members.
When values are not distributed symmetrically about the mean, so that a value is more
likely to fall on one side of the mean than the other, one should report the median and
the values of at least two other percentiles.
After collecting data and plotting the distribution, we look for the parameters of the
distribution:
Mean:

$\mu = \frac{\sum X}{N}$

Measures of variability:
To measure how far the data vary about the mean, note that deviations fall on both
sides of the mean (above and below it), so they can be positive or negative. Squaring
each deviation gives a positive number. The variance is the average squared deviation
from the mean:

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

Units are the square of the units of observation.

There is also the standard deviation, $\sigma$, which is the square root of the variance:

$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

Units are the units of observation.
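
The population mean, variance, and standard deviation described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name `population_summary` and the height values are invented for the example, and the data are treated as the whole population (division by N).

```python
import math

def population_summary(values):
    """Return (mean, variance, sd), treating `values` as the entire population."""
    n = len(values)
    mean = sum(values) / n
    # Variance: average squared deviation from the mean (divide by N).
    variance = sum((x - mean) ** 2 for x in values) / n
    # Standard deviation: square root of the variance, in the original units.
    sd = math.sqrt(variance)
    return mean, variance, sd

# Hypothetical heights in cm, invented for illustration.
heights_cm = [30, 35, 40, 45, 50]
mean, variance, sd = population_summary(heights_cm)
print(f"mean = {mean} cm, variance = {variance} cm^2, sd = {sd:.2f} cm")
```

Note how the variance comes out in squared units (cm^2) while the standard deviation is back in cm.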

Normal Distribution
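The normal distribution describes many populations, with about 68% of the data falling
within 1 sd of the mean and about 95% within 2 sd of the mean. The height of the normal
distribution curve at any given value of X is

$f(X) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2\right]$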
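
As a rough check on the 68% and 95% figures, the sketch below evaluates the standard normal density formula and the fraction of a normal population lying within k standard deviations of the mean. The function names `normal_height` and `fraction_within` are hypothetical, chosen only for this example.

```python
import math

def normal_height(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def fraction_within(k):
    """Fraction of a normal population within k sd of the mean."""
    return math.erf(k / math.sqrt(2))

print(round(normal_height(0, 0, 1), 4))  # peak height of the standard normal curve
print(round(fraction_within(1), 4))      # about 0.68
print(round(fraction_within(2), 4))      # about 0.95
```

The exact fractions are roughly 68.3% and 95.4%, so "68% and 95%" are convenient roundings.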

Median
The middle value when observations are listed in order; 50% of the data fall below the
median. It is the (n+1)/2 observation when observations are listed in order.
Other percentiles
25th percentile - the value below which the lowest quarter of observations fall; the
(n+1)/4 observation.
Any other percentile p - the (n+1)/(100/p) observation.
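
The position rule for percentiles can be sketched as follows. The notes do not say what to do when (n+1)/(100/p) is not a whole number, so the linear interpolation between neighbouring observations used here is an assumption, as are the function name `percentile_point` and the sample data.

```python
def percentile_point(values, p):
    """Value at the p-th percentile using the (n + 1) / (100 / p) position rule."""
    ordered = sorted(values)
    n = len(ordered)
    # 1-based position; (n + 1) * p / 100 is the same as (n + 1) / (100 / p).
    position = (n + 1) * p / 100
    # Keep the position inside the list for extreme percentiles.
    position = min(max(position, 1), n)
    lower = int(position)          # whole part of the position
    fraction = position - lower    # fractional part, if any
    if fraction == 0:
        return ordered[lower - 1]
    # Assumption: interpolate linearly between the two neighbouring observations.
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

# Hypothetical observations, invented for illustration.
observations = [80, 10, 50, 30, 70, 20, 60, 40]
print(percentile_point(observations, 50))  # median: position 4.5
print(percentile_point(observations, 25))  # 25th percentile: position 2.25
```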
Calculating the percentile points of a population is a good way to see how close it is
to a normal distribution. If the percentiles are not spaced symmetrically about the
median, you know that the distribution may be skewed.
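For a normal distribution, the percentile points are approximately as shown:

2.5th percentile: mean - 2 sd
16th percentile: mean - 1 sd
50th percentile: mean
84th percentile: mean + 1 sd
97.5th percentile: mean + 2 sd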
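
The percentile points of a normal distribution can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `percentile_at` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def percentile_at(k):
    """Percentile (0-100) of the point k standard deviations from the mean."""
    return 100 * standard_normal.cdf(k)

for k in (-2, -1, 0, 1, 2):
    print(f"mean {k:+d} sd -> {percentile_at(k):.1f}th percentile")
```

The values at 1 sd and 2 sd from the mean follow directly from the "68% within 1 sd, 95% within 2 sd" rule, which is itself a rounding of the exact figures.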

How to make estimates from a limited sample


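Since it is rarely possible to study a whole population, you usually have to work with a
sample of the population.
The mean is calculated the same way, except using the data from the sample.
The standard deviation of a sample is slightly different: the sum of squared deviations
is divided by n - 1 rather than n.
Mean

$\bar{X} = \frac{\sum X}{n}$

Standard Deviation

$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$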
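
A minimal sketch of the sample estimates, assuming the usual n - 1 divisor for the sample standard deviation. The function name `sample_mean_sd` and the data are invented for illustration; the result is compared against `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

def sample_mean_sd(sample):
    """Return (mean, sd) of a sample, dividing by n - 1 for the sd."""
    n = len(sample)
    mean = sum(sample) / n
    # Sum of squared deviations about the sample mean, divided by n - 1.
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return mean, sd

# Hypothetical sample, invented for illustration.
sample = [30, 35, 40, 45, 50]
sample_mean, sample_sd = sample_mean_sd(sample)
print(sample_mean, round(sample_sd, 3))
print(round(statistics.stdev(sample), 3))  # same value from the standard library
```

Dividing by n - 1 instead of n makes the sample standard deviation slightly larger than the population formula would give for the same numbers.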

How good are these estimates?


Samples give us only an estimate for the entire population of interest. To see how
precise a statistic such as the sample mean is, you calculate the standard error.
The larger the sample, the smaller the standard error.
The mean of the means of all possible samples equals the mean of the population.
Standard error for a population:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Standard error for a sample:

$s_{\bar{X}} = \frac{s}{\sqrt{n}}$

Note that the sigma or s with the small subscript of x with a bar over it refers to the
standard error! It does not refer to the standard deviation. The standard deviation
describes the variability about the mean within a population or within a sample. The
standard error describes how precisely the sample mean estimates the mean of the
population from which the sample was drawn: about 95% of all possible sample means fall
within 2 standard errors of the population mean.
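
To make the distinction between standard deviation and standard error concrete, here is a small sketch that estimates the standard error of the mean from a single sample as s divided by the square root of n. The function name `standard_error` and the data are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: sample sd divided by sqrt(n)."""
    s = statistics.stdev(sample)  # sample sd, n - 1 divisor
    return s / math.sqrt(len(sample))

# Hypothetical sample, invented for illustration.
sample = [30, 35, 40, 45, 50]
sd = statistics.stdev(sample)
se = standard_error(sample)
print(f"sd = {sd:.2f} (spread of the observations)")
print(f"se = {se:.2f} (precision of the sample mean)")
```

The standard error is always smaller than the standard deviation for n > 1, and it shrinks as the sample size grows.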
By considering what happens when we collect many samples from a population and
calculate their means and standard deviations, we come up with the Central Limit
Theorem:

The distribution of sample means will be approximately normal regardless of the
distribution of values in the original population from which the samples were drawn.

The mean value of the collection of all possible sample means will equal the mean
of the original population.

The standard deviation of the collection of all possible means of samples of a given
size, called the standard error of the mean, depends on both the standard deviation
of the original population and the size of the sample.
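
The three statements of the Central Limit Theorem can be explored with a small simulation. This is only a sketch under stated assumptions: the population is a hypothetical set of 10,000 exponentially distributed values (deliberately skewed), samples of size 25 are drawn with replacement, and the seed is fixed so the run is repeatable.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical skewed population: exponential values, clearly not normal.
population = [rng.expovariate(1.0) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 25  # size of each sample
# Draw many samples (with replacement) and record each sample's mean.
sample_means = [
    statistics.fmean(rng.choices(population, k=n)) for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_se = sigma / n ** 0.5  # standard error of the mean

print(f"population mean {mu:.3f}, mean of sample means {mean_of_means:.3f}")
print(f"predicted se {predicted_se:.3f}, sd of sample means {sd_of_means:.3f}")
```

The mean of the sample means should land close to the population mean, and their standard deviation close to the population sd divided by the square root of n, even though the population itself is skewed.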

Chapter 3 - How to test differences between groups
