Page 2
Statistics
Descriptive statistics:
Summarizing and describing characteristics
of collections of numbers precisely and
accurately;
Presenting information in a convenient,
usable, and understandable form.
Inferential statistics (Analytical):
Making calculated interpretations or
judgments about properties of large groups
on the basis of samples, with designated
confidence levels.
Page 3
Types of Data
Quantitative
Numerical
Can be measured
Continuous
Qualitative
Categorical or Nominal
Page 4
Types of Variables
Continuous
Pulse
Blood Glucose Level
Discrete or Categorical
Ordinal
Order in the Family
Severity of Pain
Nominal
Sex
Race
Binary (Dichotomous)
No or Yes
Absent or Present
Page 5
STATISTICAL HANDLING OF NUMBERS
1. Ordering (Organizing): Descriptive
2. Averaging (Generalization): Descriptive
3. Finding Variability (Measuring Relationship): Descriptive
4. Comparing (Differences and Effects): Inferential
Page 6
1. Ordering
Frequency distribution
Relative frequency/percentage
Ratio
Page 7
1. Ordering
Simple Frequency Distribution Tables
Page 8
1. Ordering
Length of Stay (LOS) (Stroke), N = 50
[Figure: dot plot of the number of patients at each length of stay; x-axis: LOS (Days), 1 to 15]
Page 9
1. Ordering
Frequency Terms
Page 10
1. Ordering
Frequency Distribution Table
Test Scores    Frequency
56-65          42
66-75          70
76-85          99
86-95          74
96-105         52
106-115        40
116-125        22
i = 10         N = 399
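As a minimal sketch of how relative and cumulative frequencies can be derived from a table like the one above, the following Python snippet reuses the class intervals and counts from the slide. The function name `frequency_table` and the output layout are illustrative assumptions, not part of the original material.

```python
# Class intervals and frequencies copied from the test-score table above.
intervals = ["56-65", "66-75", "76-85", "86-95", "96-105", "106-115", "116-125"]
frequencies = [42, 70, 99, 74, 52, 40, 22]


def frequency_table(labels, counts):
    """Return rows of (label, frequency, relative %, cumulative frequency)."""
    total = sum(counts)
    rows = []
    running = 0
    for label, f in zip(labels, counts):
        running += f
        rows.append((label, f, 100 * f / total, running))
    return rows


table = frequency_table(intervals, frequencies)
for label, f, pct, cum in table:
    print(f"{label:>8}  f={f:3d}  {pct:5.1f}%  cum={cum:3d}")
```

The last cumulative frequency should equal N, which is a quick check that no interval was dropped.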
Page 11
1. Ordering
Frequency Distribution Table
Page 12
2. AVERAGING:
Measuring Central Tendency
Page 13
2. AVERAGING:
Measuring Central Tendency
Page 14
2. AVERAGING:
Measuring Central Tendency
Page 17
2. AVERAGING:
Measuring Central Tendency
Mode (Mo.)
What is
When to use
Limitations
Page 18
2. AVERAGING:
Measuring Central Tendency
The Mode
The mode is simply the most frequently occurring
value in a data set.
When data are grouped into class intervals, Mo. is the
midpoint of the interval with the greatest frequency.
Some things to remember about the mode:
The mode is the least utilized measure of
central tendency
For some types of data it is the most useful.
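Both uses of the mode described above can be sketched in a few lines of Python. The raw values are the Group (1) pulse readings from the exercise later in the deck, and the class intervals are the test-score table from the Ordering section. The helper name `grouped_mode` is a hypothetical choice for illustration.

```python
from statistics import multimode


def grouped_mode(class_intervals):
    """Midpoint of the class interval with the greatest frequency.

    class_intervals is a list of (lower, upper, frequency) tuples.
    """
    lower, upper, _ = max(class_intervals, key=lambda row: row[2])
    return (lower + upper) / 2


# Raw data: the mode is the most frequently occurring value.
pulse = [60, 65, 70, 75, 75, 60, 80, 90, 65, 60, 85]
raw_modes = multimode(pulse)  # a list, because ties are possible

# Grouped data: the mode is the midpoint of the modal class.
test_scores = [
    (56, 65, 42), (66, 75, 70), (76, 85, 99), (86, 95, 74),
    (96, 105, 52), (106, 115, 40), (116, 125, 22),
]
modal_midpoint = grouped_mode(test_scores)

print(raw_modes, modal_midpoint)
```

`multimode` returns every value tied for the highest count, which makes the "no single mode" case visible instead of hiding it.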
Page 19
2. AVERAGING:
Measuring Central Tendency
Median (Mdn.)
What is
Calculate
Odd n: the value at position (n + 1) / 2
Even n: the mean of the values at positions
n / 2 and (n / 2) + 1
When to use
Limitations
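The position rule for odd and even n can be sketched as follows. This is an illustrative implementation, and the sample values in the usage lines are made up for demonstration only.

```python
def median(values):
    """Median using the (n + 1) / 2 rule for odd n and the
    mean of positions n / 2 and (n / 2) + 1 for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data")
    if n % 2 == 1:
        # Positions are 1-based in the rule, so subtract 1 for the index.
        return ordered[(n + 1) // 2 - 1]
    lower_middle = ordered[n // 2 - 1]   # position n / 2
    upper_middle = ordered[n // 2]       # position (n / 2) + 1
    return (lower_middle + upper_middle) / 2


# Hypothetical values, chosen only to exercise both branches.
print(median([3, 1, 2]))         # odd n
print(median([4, 1, 3, 2]))      # even n
print(median([1, 2, 3, 1000]))   # an extreme value does not move the median
```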
Page 20
2. AVERAGING:
Measuring Central Tendency
The Median
The median is the middle item in a set of numbers.
To calculate the Median:
Rank the numbers from largest to smallest.
Locate the middle item.
Use with ranked order measures.
Some things to remember about the median:
The median is not affected by extreme, or
outlying, numbers.
The median usually takes on a realistic,
meaningful value.
Page 21
Comparison between Mean, Mode, Median
Page 22
Exercise
Pulse (Beats/Min)
Group (1)
60, 65, 70, 75, 75, 60, 80, 90,
65, 60, 85
Group (2)
55, 65, 60, 70, 85, 100, 75, 90,
95, 105
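One way to work through the exercise is with Python's standard `statistics` module. The sketch below computes the mean, median, and mode(s) for each group; the `summarize` helper is a hypothetical name introduced here for illustration.

```python
from statistics import mean, median, multimode

# Pulse readings (beats/min) copied from the exercise.
group_1 = [60, 65, 70, 75, 75, 60, 80, 90, 65, 60, 85]
group_2 = [55, 65, 60, 70, 85, 100, 75, 90, 95, 105]


def summarize(values):
    """Collect the three measures of central tendency for one group."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),  # every value tied for most frequent
    }


summary_1 = summarize(group_1)
summary_2 = summarize(group_2)
for name, summary in (("Group (1)", summary_1), ("Group (2)", summary_2)):
    print(name, summary)
```

When every value occurs once, as in Group (2), `multimode` returns all of them, which is a signal that the mode is not informative for that group.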
Page 23
3. Variability Indices
Measures of Dispersion
Definition:
The variation or scattering of data around
an average value, usually the arithmetic
mean (M).
The greater the spread of a distribution,
the greater the dispersion or variability (the
more heterogeneous).
Page 24
3. Variability Indices
Measures of Dispersion
Range
Average Deviation
Standard Deviation (SD)
Percentiles
Page 25
3. Variability Indices
Measures of Dispersion
Range (R)
Page 28
3. Variability Indices
Measures of Dispersion
Standard Deviation (SD)
sd =
Page 29
3. Variability Indices
Measures of Dispersion
Page 30
3. Variability Indices
Measures of Dispersion
Standard Deviation (SD)
A measure of the spread of a distribution;
a computed value describing the amount of
variability in a particular distribution.
The more the values cluster around the mean, the
smaller the amount of variability or deviation.
A way to recognize deviation from the normal range
of variation.
Page 31
3. Variability Indices
Measures of Dispersion
Standard Deviation (SD)
Page 33
How To Calculate Standard Deviation (SD)
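The calculation steps themselves are not present in the extracted text, so the following is a sketch under stated assumptions rather than the slide's own method. It computes the range, the average deviation, and the standard deviation around the mean (M) for the Group (2) pulse readings from the exercise. Because the source does not say whether the divisor is N or N - 1, both the population and the sample forms are shown.

```python
from math import sqrt


def dispersion(values):
    """Range, average deviation, and SD of the values around their mean."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    m = sum(values) / n                        # arithmetic mean (M)
    deviations = [x - m for x in values]       # X - M for each value
    sum_sq = sum(d * d for d in deviations)    # sum of squared deviations
    return {
        "range": max(values) - min(values),
        "average_deviation": sum(abs(d) for d in deviations) / n,
        "sd_population": sqrt(sum_sq / n),       # assumption: divide by N
        "sd_sample": sqrt(sum_sq / (n - 1)),     # assumption: divide by N - 1
    }


group_2 = [55, 65, 60, 70, 85, 100, 75, 90, 95, 105]
result = dispersion(group_2)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The sample form is always slightly larger than the population form, and the gap shrinks as N grows.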
Page 34
4. Data Comparison Techniques
Tests of Statistical Significance
p-value
Page 35
4. Data Comparison Techniques
Tests of Statistical Significance
p-value
Page 36
4. Data Comparison Techniques
Tests of Statistical Significance
In scientific and medical applications, the null
hypothesis plays a major role in testing the
significance of differences in treatment and
control groups.
The assumption at the outset of the experiment
is that no difference exists between the two
groups (for the variable being compared): this
is the null hypothesis in this instance.
p-value
Page 37
4. Data Comparison Techniques
Tests of Statistical Significance
Confidence interval
Level of significance:
p-value
Page 38
Concepts related to tests of Significance
Confidence Level for Change
The confidence level ranges from 0% to
100%. The higher the confidence level, the
greater the certainty that a change took place.
Page 42
4. Data Comparison Techniques
Tests of Statistical Significance
Page 43
4. Data Comparison Techniques
Tests of Statistical Significance
The t-Test
Comparing Means Based on Variances
The t-Test compares two sets of like things, using
averages to see if they indicate real difference, or
a difference likely to have occurred by chance.
Ex.: Comparing the average number of packs of
cigarettes smoked per month by smokers who had
a heart attack (MI) before age 50 vs. smokers who
did not have a heart attack.
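To make the idea of comparing means based on variances concrete, here is a minimal sketch of a two-sample t statistic. The slides do not say which variant of the t-test is intended, so Welch's unequal-variance form is an assumption made here. The input reuses the two pulse groups from the earlier exercise purely as illustrative data, not the cigarette example. Turning t into a p-value requires the t distribution and is left out of this sketch.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Sample variances (divisor n - 1), each scaled by its group size.
    a = variance(sample_a) / n_a
    b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(a + b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n_a - 1) + b ** 2 / (n_b - 1))
    return t, df


group_1 = [60, 65, 70, 75, 75, 60, 80, 90, 65, 60, 85]
group_2 = [55, 65, 60, 70, 85, 100, 75, 90, 95, 105]
t_stat, dof = welch_t(group_1, group_2)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A t value near zero means the difference in means is small relative to the variability within the groups, which is the situation in which a chance difference is plausible.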
Page 44
4. Data Comparison Techniques
Tests of Statistical Significance
Regression Analysis
Regression analysis is a statistical technique that
allows one to compare the entire distribution of
observations of one measurement (or variable) with
the entire distribution of another measure, in order
to determine how strongly the two variables are
interrelated (correlated).
Regression analysis is a way of evaluating the
kinds of data found in scatter diagrams.
Page 45
4. Data Comparison Techniques
Tests of Statistical Significance
Regression Analysis
Page 46
4. Data Comparison Techniques
Tests of Statistical Significance
An r approaching +1.0 indicates a strong positive
relationship between the measures, with both sets
of measure numbers increasing or both decreasing
together.
An r approaching -1.0 indicates a strong negative
relationship, with the numbers of one of the
measures increasing, and the numbers of the other
measure decreasing.
Measures with no significant relationship will have
an r of approximately zero (0).
Regression Analysis
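The behavior of r described above can be checked with a short sketch of the Pearson correlation coefficient. The source does not name the coefficient explicitly, so treating r as Pearson's r is an assumption, and all data values below are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical measurements, chosen to show the three cases.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # increases with x
falling = [10, 8, 6, 4, 2]    # decreases as x increases
unrelated = [2, 1, 3, 1, 2]   # no linear trend with x

r_positive = pearson_r(x, rising)
r_negative = pearson_r(x, falling)
r_none = pearson_r(x, unrelated)
print(r_positive, r_negative, r_none)
```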
Page 47
4. Data Comparison Techniques
Tests of Statistical Significance
Correlation Examples
Regression Analysis
Page 48
Page 49
Page 50
Page 51
Page 52
4. Data Comparison Techniques
Tests of Statistical Significance
Regression Analysis
Issues to remember
Comparing Two Distributions
Correlation
Positive Correlation
Negative Correlation
No Correlation
Correlation Coefficient r
Regression Analysis
Page 53
Important Notes
Page 54
Graphs and Curves
Bar graphs usually present categorical and
numeric variables grouped in class intervals.
Page 56
Pictographs
A pictograph uses picture symbols to convey the meaning of
statistical information. Pictographs should be used carefully
because the graphs may, either accidentally or deliberately,
misrepresent the data. This is why a graph should be visually
accurate.
Page 57
A pie chart is a way of summarizing a set of
categorical data or displaying the different values of
a given variable (e.g., percentage distribution).
Page 58
Line graphs
Line graphs compare two variables: one is plotted along the
x-axis (horizontal) and the other along the y-axis (vertical).
Line graphs can also depict multiple series, and they are usually
the best candidate for time-series data and frequency
distributions.
Page 59
Scatterplots
In science, the scatterplot is widely used to present
measurements of two or more related variables. It is
particularly useful when the values of the y-axis variable are
thought to depend on the values of the x-axis variable
(usually an independent variable).
Page 60
Normal Distribution curves
Symmetrical Curves
The two sides of the curve are identical if the polygon is
folded in half perpendicular to the baseline (bell-shaped or
rectangular).
Page 61
Skew. The skew of a distribution refers to how the curve
leans. When a curve has extreme scores on the right-hand side of
the distribution, it is said to be positively skewed. In other words,
when high numbers are added to an otherwise normal distribution,
the curve gets pulled in an upward or positive direction. When the
curve is pulled downward by extreme low scores, it is said to be
negatively skewed (J Shape). The more skewed a distribution is,
the more difficult it is to interpret.
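The direction of skew can also be expressed as a number. The slide gives no formula, so the sketch below uses the moment coefficient of skewness (third central moment divided by the 1.5 power of the second) as an assumed definition. The data sets are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: positive for a right tail,
    negative for a left tail, zero for a symmetrical distribution."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Hypothetical scores.
symmetrical = [1, 2, 3, 4, 5]
high_outlier = [1, 2, 2, 3, 3, 3, 4, 20]     # extreme score on the right
low_outlier = [-20, -4, -3, -3, -3, -2, -2, -1]  # mirror image, extreme on the left

print(skewness(symmetrical), skewness(high_outlier), skewness(low_outlier))
```

Adding one extreme high score is enough to make the coefficient clearly positive, which matches the description of the curve being pulled in a positive direction.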
Page 62
Histograms
The histogram is a popular graphing tool. It is used
to summarize discrete or continuous data that are
measured on an interval scale.
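Before a histogram can be drawn, the raw values have to be counted into class intervals. The sketch below does this for the Group (2) pulse readings from the earlier exercise. The starting point of 50 and the interval width of 10 are assumptions chosen for illustration, as is the helper name `bin_counts`.

```python
def bin_counts(values, start, width):
    """Count values per class interval; keys are interval lower bounds."""
    counts = {}
    for v in values:
        index = (v - start) // width       # which interval v falls into
        lower = start + index * width      # lower bound of that interval
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


group_2 = [55, 65, 60, 70, 85, 100, 75, 90, 95, 105]
bins = bin_counts(group_2, start=50, width=10)

# Text rendering: one X per observation in each interval.
for lower, count in bins.items():
    print(f"{lower:3d}-{lower + 9:3d} | {'X' * count}")
```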
Page 63
Histographs
A histograph, or frequency polygon, is a graph
formed by joining the midpoints of histogram column
tops. These graphs are used only when depicting
data from the continuous variables shown on a
histogram.
Page 64
Graphic representation of Relationship
Scatter Diagram
Flowchart
Cause and Effect Diagram
Page 65