A Tutorial of Chapters 1-4
c. 2009 by Dr. Donald F. DeMoulin and Dr. William Allen Kritsonis
These Slides May Not Be Altered or Modified
1) Introduction
2) Review of the Basics
3) Research Rules
4) Statistical Symbols
5) Statistical Terms
6) Data Strength
7) Measures of Central Tendency
   1) Mean
   2) Median
   3) Mode
8) Measures of Variability
   1) Range
   2) Variance
   3) Standard Deviation
9) Distribution Types
10) Putting it all together
Introduction
Many wonder why statistical concepts are so hard to grasp.
Have you ever tried to understand a native of Italy, France, Russia, China, or Japan? It is hard because they speak a foreign language, one that is unfamiliar to your vocabulary. In this realm, statisticians, most of the time, also speak a foreign language: a language we will call Statonese (stat-n-eaze).
For example, have you ever heard:
Four out of five dentists recommend Brand X toothpaste to help fight cavities
Nine out of 10 doctors stranded on a desert island recommend Brand Y aspirin for headache pain
Five out of six farmers reported significant increases in yield from using Brand Z fertilizer
These are just a few of the many thousands of examples of statistical applications
But have you ever thought:
1) Who makes up the four out of five dentists? Are the four employed by Brand X toothpaste?
2) Who are the nine out of ten doctors, and why are they stranded on a desert island?
3) What constitutes a significant increase in yield?
Deciphering what the numbers are, and gaining an understanding of statistical procedures and concepts in order to make a somewhat accurate, independent judgment of reports, statements, and claims, is what statistics is all about.
Our discussions will minimize the guesswork about statistics and maximize the WHEN, the WHY, and the HOW: the basis for statistical applications.
The goal of these PowerPoint slides is to bring statistical concepts, applications, and explanations to you in a language that can be understood, showing how statistical procedures are developed, analyzed, and interpreted.
Which participating turtle in race two logged a whopping 51 seconds for completing the race?
Turtle Y4
So, $\sum X$ would simply mean to sum all values of the variable X:
X1 = 45
X2 = 41
X3 = 30
X4 = 28
X5 = 59
$\sum X = 203$
$\sum X^2 = 8{,}871$
All we do is multiply X and Y and put the products in a column labeled XY. Fill in the missing spaces and add the column XY.
Algebraically, it would be:
$\sum XY = (45)(25) + (41)(36) + (30)(56) + (28)(51) + (59)(43)$
$\sum XY = 1{,}125 + 1{,}476 + 1{,}680 + 1{,}428 + 2{,}537 = 8{,}246$
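The summation steps above can be checked with a few lines of Python. This is only a minimal sketch: the X and Y values come from the slides, while the variable names are illustrative assumptions.

```python
# Scores taken from the slides; the variable names are illustrative.
x = [45, 41, 30, 28, 59]
y = [25, 36, 56, 51, 43]

sum_x = sum(x)                                  # sum of X: add every X value
sum_x_squared = sum(value ** 2 for value in x)  # sum of X squared: square each value, then add
sum_xy = sum(a * b for a, b in zip(x, y))       # sum of XY: multiply each X by its Y, then add

print(sum_x, sum_x_squared, sum_xy)  # 203 8871 8246
```

Note that $\sum X^2$ (square first, then sum) is not the same as $(\sum X)^2$ (sum first, then square), which here would be 203 squared, or 41,209.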
Research Rules
There are two rules in research
Rule #1 is that credibility and believability are vital components in research. In essence, the researcher must be credible by conducting his/her research with integrity, honesty, and proper research etiquette. This leads to the next component of Rule #1: the results (data input, data analysis, and statistical interpretation) must be believable, which involves proper coding and the use of the appropriate statistical procedure for analysis. Credibility and believability are the two critical aspects of any research, for without them the entire research process becomes an insignificant exercise.
Cowboy Proverb
These are critical rules in research because if you do not have credibility as a researcher, the results that are produced lack believability
Cowboy Proverb
In other words, don't expect believability and credibility with data that are polluted, tainted, or contaminated
By following the rules of research, you maximize your credibility as a researcher and the believability of your results
Statistical Symbols
$S^2$: symbol for the variance of a population
$S$: symbol for the standard deviation of a population
$s^2$: symbol for the variance of a sample
$s$: symbol for the standard deviation of a sample
A caret (^) on top of a symbol denotes a sample
Statistical Terms
… descriptive procedures of the raw data and subjecting them to a higher-order statistical procedure to reasonably infer results to a corresponding population by following certain rules and assumptions
… conform to any stringent assumptions, and therefore have more latitude in procedures. Because stringent assumptions are not strictly adhered to, we cannot confidently generalize the results to a population.
A variable is defined as a property of an event or item that can be changed or can take on different values.
A dependent variable is called the measured, outcome, or criterion variable.
An independent variable is the variable that is changed, altered, or manipulated by the experimenter during research.
A qualitative variable refers to nonnumerical qualities or attributes, such as gender or eye color.
A quantitative variable is concerned with numerical qualities, such as the number of items falling into various categories, or measurable data.
Data Strength
Data are considered nominal strength if the assignment of numbers to objects does no more than identify the objects.
Data considered ordinal strength contain the elements of the nominal scale of measurement plus an ordering of objects, thereby implying magnitude: the objects are labeled, and they are also ranked in accordance with importance.
Military rank would be an example of ordinal data, as would lining up people according to height, with 1 being the shortest and 10 being the tallest.
Data considered interval strength contain all the elements of nominal and ordinal scales (labeling and ordering) plus equal intervals between each item.
A thermometer would be an example of interval strength data, since the distance between 20 and 30 degrees is the same as the distance between 50 and 60 degrees. However, 60 degrees is not twice as warm as 30 degrees, since the zero point is arbitrary and temperatures can be negative.
Data considered ratio strength contain all elements of nominal, ordinal, and interval strength (labeling, ordering, equal distance between items) plus the inclusion of an absolute zero.
Height and weight are examples of ratio strength data, since there is no negative weight or height.
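The thermometer example can be made concrete by converting to a scale that does have an absolute zero. The sketch below assumes the temperatures are in degrees Fahrenheit (the slides do not say which scale is meant) and uses a hypothetical helper, fahrenheit_to_kelvin, to compare 60 and 30 degrees on the kelvin scale, which is ratio strength.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert Fahrenheit (interval strength) to kelvin (ratio strength)."""
    return (deg_f - 32) * 5 / 9 + 273.15

# Equal intervals hold on the interval scale: both gaps are 10 degrees.
gap_low = 30 - 20
gap_high = 60 - 50

# The naive ratio treats 0 degrees as "no heat", which it is not.
naive_ratio = 60 / 30

# On a scale with an absolute zero, the ratio is meaningful.
true_ratio = fahrenheit_to_kelvin(60) / fahrenheit_to_kelvin(30)

print(gap_low == gap_high)   # True
print(naive_ratio)           # 2.0
print(round(true_ratio, 3))  # 1.061
```

Under this assumption, 60 degrees is only about 6 percent warmer than 30 degrees in absolute terms, not 100 percent warmer.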
Measures of Variability
Range
Variance
$\sigma^2 = \dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n}$
Standard Deviation
$\sigma = \sqrt{\dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n}}$
Group 1                 Group 2
(X)     $X^2$           (Y)     $Y^2$
72      5,184           67      4,489
73      5,329           72      5,184
76      5,776           76      5,776
76      5,776           76      5,776
78      6,084           84      7,056
$\sum X = 375$, $\sum X^2 = 28{,}149$, N = 5
$\sum Y = 375$, $\sum Y^2 = 28{,}281$, N = 5
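Group 1
$\sigma^2 = \dfrac{\sum X^2 - \dfrac{(\sum X)^2}{n}}{n}$
$\sigma^2 = \dfrac{28{,}149 - \dfrac{(375)^2}{5}}{5}$
$\sigma^2 = \dfrac{28{,}149 - \dfrac{140{,}625}{5}}{5}$
$\sigma^2 = \dfrac{28{,}149 - 28{,}125}{5}$
$\sigma^2 = \dfrac{24}{5}$
$\sigma^2 = 4.8$
$\sigma = \sqrt{4.8} = 2.19$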
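Group 2
$\sigma^2 = \dfrac{\sum Y^2 - \dfrac{(\sum Y)^2}{n}}{n}$
$\sigma^2 = \dfrac{28{,}281 - \dfrac{(375)^2}{5}}{5}$
$\sigma^2 = \dfrac{28{,}281 - \dfrac{140{,}625}{5}}{5}$
$\sigma^2 = \dfrac{28{,}281 - 28{,}125}{5}$
$\sigma^2 = \dfrac{156}{5}$
$\sigma^2 = 31.2$
$\sigma = \sqrt{31.2} = 5.59$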
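The two hand calculations above can be reproduced in a few lines. This is a minimal sketch of the computational formula as the slides apply it (dividing by n); the function name population_variance is an illustrative assumption.

```python
import math

def population_variance(scores):
    """Computational formula: (sum of squares - (sum)^2 / n) / n."""
    n = len(scores)
    total = sum(scores)
    total_of_squares = sum(score ** 2 for score in scores)
    return (total_of_squares - total ** 2 / n) / n

group_1 = [72, 73, 76, 76, 78]
group_2 = [67, 72, 76, 76, 84]

for name, scores in (("Group 1", group_1), ("Group 2", group_2)):
    variance = population_variance(scores)
    std_dev = math.sqrt(variance)
    print(name, round(variance, 2), round(std_dev, 2))
```

Both groups have the same sum (375) and therefore the same mean (75), yet Group 2's scores are spread much more widely, which is exactly what the larger variance and standard deviation capture.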
Distribution Types
Normal
Skewed
Normal Distribution
Also known as symmetrical, standard normal, and z-normal distributions
Right half is mirror image of left half
Leptokurtic: high-peaked
Mesokurtic: middle-peaked
Platykurtic: low-peaked
If a distribution is not symmetrical, then it is asymmetrical or skewed: the right half is not the mirror image of the left half.
Negatively skewed: the tail points to the negative end of the number line.
Roughly 68% of all scores fall within one standard deviation above or below the mean (± 1 SD)
Roughly 95% of all scores fall within two standard deviations above or below the mean (± 2 SD)
Roughly 99.7% of all scores fall within three standard deviations above or below the mean (± 3 SD)
The remaining 0.3 percent (a proportion of .003) of scores fall more than three standard deviations above or below the mean; they are considered outliers that do not conform to the standard normal population distribution.
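These rounded percentages can be checked against the standard normal curve. The following is a minimal sketch using NormalDist from Python's standard statistics module (available in Python 3.8 and later); the helper name share_within is an illustrative assumption.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    inside = share_within(k)
    print(f"within {k} SD: {inside:.4f}   outside: {1 - inside:.4f}")
```

The share outside three standard deviations comes out near .003 as a proportion, which is about 0.3 percent of all scores.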
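Standard normal distribution: $\mu = 0$, $\sigma = 1$
Wider distribution: $\mu = 50$, $\sigma = 10$; axis values 20, 30, 40, 50, 60, 70, 80; Range = 60 (80 − 20 = 60); moving more towards a platykurtic shape
Narrower distribution: $\mu = 50$, $\sigma = 2$; axis values 44, 46, 48, 50, 52, 54, 56; Range = 12 (56 − 44 = 12); moving more towards a leptokurtic shape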
The mean and standard deviation help determine the height (kurtosis) of a distribution through the variability of scores dispersed throughout the data set.
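The link between the standard deviation and the spread of the curve can be sketched numerically. The example below assumes, as the axis values on the slide suggest, that each curve is drawn from three standard deviations below the mean to three above it; the function name three_sd_span is hypothetical.

```python
def three_sd_span(mean, std_dev):
    """Return (lowest, highest, range) for a curve drawn out to +/- 3 SD."""
    lowest = mean - 3 * std_dev
    highest = mean + 3 * std_dev
    return lowest, highest, highest - lowest

# Same mean, different standard deviations (values from the slide).
print(three_sd_span(50, 10))  # (20, 80, 60)
print(three_sd_span(50, 2))   # (44, 56, 12)
```

With the mean held at 50, the larger standard deviation stretches the scores over a range of 60 and flattens the curve, while the smaller one squeezes them into a range of 12 and raises the peak.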
Data Strength    Central Tendency    Data Tests
Ratio            Mean                Parametric
Interval         Mean                Parametric
Ordinal          Median              Non Parametric
Nominal          Mode                Non Parametric

Parametric tests: One Sample z-test, One Sample t-test, Independent t-test, Dependent t-test, ANOVA, Pearson Correlation, Repeated Measures ANOVA
Non Parametric tests: Mann-Whitney U, Wilcoxon T, Spearman Rho, Kruskal-Wallis H, Friedman ANOVA (ranks), Chi-Square Goodness-of-Fit, Chi-Square Test of Independence
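One way to read the chart above is that the data strength decides which measure of central tendency is meaningful. The sketch below expresses that lookup with Python's standard statistics module. The dictionary and function names are hypothetical, the ratio scores are Group 1 from the earlier slides, and the ordinal and nominal values are made up purely for illustration.

```python
from statistics import mean, median, mode

# Hypothetical lookup mirroring the chart: data strength -> measure.
MEASURE_FOR_STRENGTH = {
    "ratio": mean,
    "interval": mean,
    "ordinal": median,
    "nominal": mode,
}

def central_tendency(values, strength):
    """Apply the measure of central tendency suited to the data strength."""
    return MEASURE_FOR_STRENGTH[strength](values)

print(central_tendency([72, 73, 76, 76, 78], "ratio"))                 # mean
print(central_tendency([1, 2, 2, 3, 5], "ordinal"))                    # median
print(central_tendency(["brown", "blue", "brown", "green"], "nominal"))  # mode
```

The same idea extends to test selection: ratio and interval data point towards the parametric tests, ordinal and nominal data towards the nonparametric ones.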
Descriptive Statistics
Computer-Generated Analysis
(X) 75 76 76 77 78
(Y) 62 75 76 76 85
Computer-Generated Results
Descriptive Statistics

Statistic        Column 1     Column 2
Mean             76.400       74.800
Std. Dev.        1.140        8.228
Std. Error       .510         3.680
Count            5            5
Minimum          75.000       62.000
Maximum          78.000       85.000
# Missing        0            0
Variance         1.300        67.700
Coef. Var.       .015         .110
Range            3.000        23.000
Sum              382.000      374.000
Sum Squares      29190.000    28246.000
Geom. Mean       76.393       74.422
Harm. Mean       76.386       74.027
Skewness         .272         -.518
Kurtosis         -1.044       -.431
Median           76.000       76.000
IQR              1.500        6.500
Mode             76.000       76.000
10% Tr. Mean     76.400       74.800
MAD              1.000        1.000

Skewness measures the asymmetry of a distribution. If skewness is positive (negative), the data are skewed to the right (left); the larger the number, the greater the skew. Notice that the mean, median, and mode are almost identical, giving an almost perfect normal distribution.
Kurtosis refers to how peaked the distribution is. When kurtosis = 3, it is a normal-height distribution (mesokurtic); kurtosis > 3 is a high-peaked distribution (leptokurtic); kurtosis < 3 is a low-peaked distribution (platykurtic).
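Several rows of the computer-generated results can be reproduced with Python's standard statistics module. This is a minimal sketch, not the program that produced the slides' output; the describe function and its dictionary keys are illustrative assumptions.

```python
import statistics

x = [75, 76, 76, 77, 78]   # Column 1
y = [62, 75, 76, 76, 85]   # Column 2

def describe(scores):
    """Return a few of the descriptive statistics shown in the output."""
    return {
        "mean": statistics.mean(scores),
        "variance": statistics.variance(scores),  # sample variance: divides by n - 1
        "std_dev": statistics.stdev(scores),      # sample standard deviation
        "median": statistics.median(scores),
        "mode": statistics.mode(scores),
        "range": max(scores) - min(scores),
        "sum": sum(scores),
        "sum_squares": sum(score ** 2 for score in scores),
    }

for name, scores in (("Column 1", x), ("Column 2", y)):
    print(name, describe(scores))

# Population variance (divides by n), for comparison with the hand formula.
print(statistics.pvariance(x))
```

The reported variances (1.300 and 67.700) match the sample formula, which divides by n − 1, whereas the earlier hand calculations divide by n. For Column 1, dividing by n would give 1.04 instead of 1.300.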