Descriptive Statistics
Statistical procedures used to summarise, organise, and simplify data. This process should be carried out in a way that reflects the overall findings:
- Raw data is made more manageable
- Raw data is presented in a logical form
- Patterns can be seen from organised data
Descriptive techniques include frequency tables, graphical techniques, measures of central tendency, and measures of spread (variability).
We can describe our data by using a Frequency Distribution, which can be presented as a table or a graph. It always presents:
- The set of categories that make up the original measurement scale
- The frequency of each score/category
A distribution has three important characteristics: shape, central tendency, and variability.
X    f   fX
11   1   11
10   2   20
 9   1    9
 8   2   16
 7   2   14
 6   4   24
 5   3   15
 4   3   12
 3   2    6
- The highest score is placed at the top
- All observed scores are listed
- Gives information about distribution, variability, and centrality
- X = score value
- f = frequency
- fX = total value associated with each frequency
- $\sum f = N$
- $\sum X = \sum fX$
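The two sums above can be checked with a few lines of Python. This is only a minimal sketch: the dictionary `freq` and the variable names are illustrative choices, with the values copied from the frequency table.

```python
# Frequency table from the slide: score value X -> frequency f
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}

# N is the sum of the frequencies, i.e. the number of scores
n = sum(freq.values())

# The sum of all scores equals the sum of the fX column
sum_x = sum(x * f for x, f in freq.items())

print(n, sum_x)  # 20 127
```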
Frequency tables can display more detailed information about the distribution, such as percentages and proportions:
- p = fraction of the total group associated with each score (relative frequency): $p = f/N$
- As a percentage: $p(100) = 100(f/N)$
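As a rough illustration, the proportion and percentage columns could be computed from the frequency table like this. The names `freq`, `proportion`, and `percent` are assumptions made for the sketch.

```python
# Frequency table from the slide: score value X -> frequency f
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}
n = sum(freq.values())

# Relative frequency p = f/N, and the same value as a percentage
proportion = {x: f / n for x, f in freq.items()}
percent = {x: 100 * f / n for x, f in freq.items()}

# Print the table with the highest score at the top
for x in sorted(freq, reverse=True):
    print(f"{x:>2}  f={freq[x]}  p={proportion[x]:.2f}  %={percent[x]:.0f}")
```

For example, the score 6 occurs 4 times out of 20, so p = 0.20, or 20%.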
X       f
95-99   1
90-94   1
85-89   0
80-84   1
75-79   2
70-74   4
65-69   7
60-64   0
55-59   6
50-54   3
Class intervals represent a continuous variable X: e.g. a score of 51 is bounded by the real limits 50.5-51.5. If X is 8 and f is 3, this does not mean that all three individuals have exactly the same score: they all fell somewhere between 7.5 and 8.5.
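The same idea extends to whole class intervals. The sketch below assumes that scores are whole numbers, so the real limits of an interval lie half a unit below its lowest score and half a unit above its highest score; the variable names are illustrative.

```python
# Apparent limits of the class intervals in the grouped table (95-99 down to 50-54)
intervals = [(low, low + 4) for low in range(95, 45, -5)]
freqs = [1, 1, 0, 1, 2, 4, 7, 0, 6, 3]

# Assumption: whole-number scores, so real limits extend 0.5 beyond each end
real_limits = [(low - 0.5, high + 0.5) for low, high in intervals]

# Interval width measured between the real limits
widths = [upper - lower for lower, upper in real_limits]

n = sum(freqs)
for (low, high), (lower, upper), f in zip(intervals, real_limits, freqs):
    print(f"{low}-{high}: real limits {lower}-{upper}, f={f}")
```

Under this assumption the interval 50-54 has real limits 49.5-54.5 and a width of 5.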
X    f   cf
11   1   20
10   2   19
 9   1   17
 8   2   16
 7   2   14
 6   4   12
 5   3    8
 4   3    5
 3   2    2
X values are raw scores, without context. The percentile rank of a score is the percentage of the sample with scores at or below that value. This can be represented by a cumulative frequency (cf) column. The cumulative percentage is obtained by:
$c\% = (cf/N)(100)$
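One possible way to build the cf and c% columns is a running total over the scores in ascending order. This is a sketch only; `freq`, `cf`, and `c_pct` are names chosen for illustration.

```python
from itertools import accumulate

# Frequency table from the slide: score value X -> frequency f
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}
n = sum(freq.values())

# Cumulative frequency: running total of f from the lowest score upwards
scores_ascending = sorted(freq)
cf = dict(zip(scores_ascending, accumulate(freq[x] for x in scores_ascending)))

# Cumulative percentage: c% = (cf / N) * 100
c_pct = {x: 100 * cf[x] / n for x in scores_ascending}

for x in sorted(freq, reverse=True):
    print(f"{x:>2}  f={freq[x]}  cf={cf[x]:>2}  c%={c_pct[x]:.0f}")
```

For instance, 12 of the 20 scores are at or below 6, so a score of 6 has a percentile rank of 60%.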
A Frequency Distribution Graph presents all the information available in a Frequency Table (and can be fitted to a grouped frequency table).
- Uses histograms
- Bar width corresponds to the real limits of the intervals
- Histograms can be modified to include blocks representing individual scores
[Figure: histograms; vertical axis: frequency; horizontal axes: memory score, score]
[Figure: distribution curve; vertical axis: frequency]
The tails of the perfect curve are asymptotic: they never quite meet the horizontal axis. A normal distribution is an assumption of parametric testing.
Mode = 6; Median = 6; Mean = 6.35
The mean is the preferred measure of central tendency, except when there are:
- Extreme scores or skewed distributions
- Non-interval data
- Discrete variables
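The three values can be reproduced with Python's standard `statistics` module, assuming the scores are those summarised in the frequency table (X from 11 down to 3). The extra score of 60 in the second half is a hypothetical value, invented only to show why the mean is less suitable when extreme scores are present.

```python
import statistics

# Expand the frequency table (X -> f) back into a list of raw scores
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}
scores = [x for x, f in freq.items() for _ in range(f)]

mode = statistics.mode(scores)
median = statistics.median(scores)
mean = statistics.mean(scores)
print(mode, median, mean)

# Hypothetical extreme score (60 is an invented value, not part of the data)
with_outlier = scores + [60]
median_out = statistics.median(with_outlier)
mean_out = statistics.mean(with_outlier)
print(median_out, round(mean_out, 2))
```

In this sketch the single extreme score pulls the mean up from 6.35 to roughly 8.9, while the median stays at 6.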
Describing Variability
Variability describes, in an exact quantitative measure, how spread out or clustered together the scores are. It is usually defined in terms of distance:
- How far apart scores are from each other
- How far apart scores are from the mean
- How representative a score is of the data set as a whole
A more sophisticated measure of variability is one that shows how scores cluster around the mean. Deviation is the distance of a score from the mean $\bar{X}$:
$X - \bar{X}$, e.g. $11 - 6.35 = 4.65$ and $3 - 6.35 = -3.35$
A measure representative of the variability of all the scores would be the mean of the deviation scores (add all the deviations and divide by n):
$\frac{\sum (X - \bar{X})}{n}$
However, the deviation scores always add up to zero (as the mean serves as the balance point for the scores).
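The balance-point property can be checked numerically. The sketch below uses exact fractions rather than floats, so that the sum of the deviations comes out as exactly zero instead of a tiny rounding error; the scores are assumed to be those in the frequency table.

```python
from fractions import Fraction

# Expand the frequency table (X -> f) back into a list of raw scores
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}
scores = [x for x, f in freq.items() for _ in range(f)]

# Exact mean: 127/20, i.e. 6.35
mean = Fraction(sum(scores), len(scores))

# Deviation of every score from the mean
deviations = [x - mean for x in scores]
total = sum(deviations)

# Largest and smallest deviations, shown as decimals
highest_dev = float(max(scores) - mean)
lowest_dev = float(min(scores) - mean)
print(highest_dev, lowest_dev)
print(total)
```

The positive and negative deviations cancel, which is why the plain mean deviation cannot serve as a measure of variability.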
To remove the +/- signs, we simply square each deviation from the mean $\bar{X}$ before finding the average. This is called the Variance:
$\frac{\sum (X - \bar{X})^2}{n} = \frac{106.55}{20} = 5.33$
The numerator is referred to as the Sum of Squares (SS), as it is the sum of the squared deviations around the mean.
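A minimal sketch of the SS and variance calculation follows, assuming the scores are those in the frequency table and dividing by n as in the formula above. `statistics.pvariance` from the standard library also divides by n, so it is used here as a cross-check.

```python
import statistics

# Expand the frequency table (X -> f) back into a list of raw scores
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}
scores = [x for x, f in freq.items() for _ in range(f)]

n = len(scores)
mean = sum(scores) / n

# Sum of Squares: the sum of the squared deviations around the mean
ss = sum((x - mean) ** 2 for x in scores)

# Variance: the mean of the squared deviations (dividing by n)
variance = ss / n

print(round(ss, 2), round(variance, 2))
print(statistics.pvariance(scores))
```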
The standard deviation (SD), the square root of the variance, is the most common measure of variability, but the others can be used. A good measure of variability must be stable and reliable, i.e. not greatly affected by little details in the data:
- Extreme scores
- Multiple sampling from the same population
- Open-ended distributions
Both the variance and SD are related to other statistical techniques.
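For the 20 scores in the frequency table, the SD could be obtained as follows. This sketch takes the square root of the variance computed with n in the denominator; `statistics.pstdev` does the same, whereas `statistics.stdev` divides by n - 1 and would give a slightly larger value.

```python
import math
import statistics

# Expand the frequency table (X -> f) back into a list of raw scores
freq = {11: 1, 10: 2, 9: 1, 8: 2, 7: 2, 6: 4, 5: 3, 4: 3, 3: 2}
scores = [x for x, f in freq.items() for _ in range(f)]

n = len(scores)
mean = sum(scores) / n

# Variance with n in the denominator, then its square root
variance = sum((x - mean) ** 2 for x in scores) / n
sd = math.sqrt(variance)

print(round(sd, 2))
print(round(statistics.pstdev(scores), 2))
```

Both lines print about 2.31, meaning that a typical score lies a little over two symbols away from the mean of 6.35.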
Descriptive statistics
A researcher is investigating short-term memory capacity. The number of symbols remembered is recorded for 20 participants:
4, 6, 3, 7, 5, 7, 8, 4, 5, 10, 10, 6, 8, 9, 3, 5, 6, 4, 11, 6
What statistics can we display about this data, and what do they mean?
- Frequency table: shows how often different scores occur
- Frequency graph: gives information about the shape of the distribution
- Measures of central tendency and variability
X    f   fX
11   1   11
10   2   20
 9   1    9
 8   2   16
 7   2   14
 6   4   24
 5   3   15
 4   3   12
 3   2    6
[Figure: frequency graph of the scores; horizontal axis from 0 to 12]
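As a rough sketch, the frequency table for the 20 memory scores could be tallied directly from the raw data with `collections.Counter` from Python's standard library. The variable names are illustrative.

```python
from collections import Counter

# Raw data: number of symbols remembered by each of the 20 participants
scores = [4, 6, 3, 7, 5, 7, 8, 4, 5, 10,
          10, 6, 8, 9, 3, 5, 6, 4, 11, 6]

# Tally how often each score occurs
counts = Counter(scores)

# Print X, f and fX with the highest score at the top
print(" X   f  fX")
for x in sorted(counts, reverse=True):
    print(f"{x:>2}  {counts[x]:>2}  {x * counts[x]:>2}")

n = sum(counts.values())
sum_x = sum(x * f for x, f in counts.items())
print(n, sum_x)
```

The tallies match the X and f columns of the table, with N = 20 and a total of 127 symbols remembered.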