Data Analysis (1) Qual Qual

Data Analysis (1/3)
Typical steps for a Statistical study
(How this lecture fits in up to this point...)

1. Define the goals
2. Collect the data
3. Organize the data
4. Present the data
5. Describe the data
6. Analyze the data
7. Interpret results
Is there a correlation/association/relationship/interaction between the variables?
Independence:
o No relationship exists between variables
o Change in one variable is NOT accompanied by change in the other variable.
Dependence:
o Relationship exists between variables
o Change in one variable is accompanied by change in the other variable (these two things seem to happen at the same time).
No relationship: As X increases, Y doesn't change
Positive relationship: As X increases, Y increases
Negative relationship: As X increases, Y decreases
To study a relationship, one variable is called the DEPENDENT variable and the other is called the INDEPENDENT variable.
Which statement makes more sense:
1) The age of a bus can influence the maintenance cost.
2) The maintenance cost can influence the age of the bus.
The dependent variable may be explained/predicted/influenced/understood by the independent variable.
Y-axis is the DEPENDENT variable (Cost), which is influenced.
X-axis is the INDEPENDENT variable (Age), which influences the dependent variable.
Data Analysis and Tools
The data and the type of research question you want answered will determine the most appropriate analytical procedure to select.
[Figure: grid of analysis tools by Independent Variable (X): Numerical or Categorical, and Dependent Variable (Y): Numerical or Categorical. Highlighted tool: Chi-Square. Let's go!]
There are two contingency table tests:
(1) Two-way table contingency test (also called test of independence)
(2) One-way table contingency test (also called goodness of fit test)

Two-way table contingency test
1. Is there a relationship between the variables?
o Visually (Stacked Bar Graph)
o Mathematically (statistical test)
2. If there is a relationship, how STRONG is the relationship?
Format convention
The independent variable is on the horizontal axis (X)
The dependent variable is on the vertical axis (Y).
Ex. Variables GENDER and VIEW OF LIFE
Which sentence makes more sense?
Does gender have an effect on view of life (Is life exciting, routine, or dull?)
Does view of life (Is life exciting, routine, or dull?) have an effect on gender?
Independent variable: Gender
Dependent variable: View of Life
We will study the differences in outcomes of the dependent variable (view of life) across the independent variable (gender). We will compare the responses of two groups (male and female).
[Figure: two charts of the same data, labeled "Bad Presentation" and "Good Presentation"]
Is there a relationship/difference between the variables?
No Relationship: The 100% stacked bar chart (%f) does NOT significantly CHANGE for different categories of the IV.
Relationship: The 100% stacked bar chart (%f) DOES significantly CHANGE for different categories of the IV (at least one has to change for some relationship to be detected).
Example: Dependent variable: Political Party. Independent variable: Area of residence.
Is there a relationship between the variables?
A sample is taken and organized into a two-way contingency table.

                 House Location
House Style      Urban   Rural   Total
Split-Level      63      49      112
Ranch            15      33      48
Total            78      82      160

Research Question: Is there a difference in house styles (DV) at different locations (IV)?
Is there a significant difference?
[Figure: stacked bar chart of house style (Split, Ranch) by location (Urban, Rural); vertical axis from 20 to 100]
How do we test this claim?

1. Form hypothesis.
Ho null hypothesis: No relationship exists (variables are Independent)
Ha alternative hypothesis: Relationship exists (variables are Dependent)

2. Calculate the chi-square (χ², pronounced "kai square") statistic.
$$\text{Expected cell count} = \frac{\text{Row total} \times \text{Column total}}{\text{Total sample size } (n)}$$
$$\chi^2_{\text{statistic}} = \sum_{\text{each cell}} \frac{(\text{observed} - \text{Expected})^2}{\text{Expected}}$$

3. Find the χ² significant value in the table.
$$\chi^2_{\text{significant}} = \chi^2_{\alpha,\,df}$$
α = .05 (default value used in any statistical program); we will discuss this in detail soon!
df = (# rows - 1) x (# columns - 1)

4. Compare χ² statistic to χ² significant.
χ² Statistic > χ² Significant: REJECT Ho. Accept Ha.
χ² Statistic < χ² Significant: FAIL TO REJECT Ho.
Remember to state your result in the context of the specific problem!
Solution
Step 1: Form Hypothesis
Ho null hypothesis: No relationship exists (Variables are Independent)
Ha alternative hypothesis: Variables are related (Dependent)

Step 2: Calculate chi-squared (χ²) statistic
The chi-square statistic compares the observed count in each cell of the table to the count that would be expected under the assumption of no association between the variables.
Expected count for each cell = (Row total x Column total) / n, e.g. for Split/Urban: (112 x 78) / 160 = 55 (rounded)

Observed
              Location
Style     Urban   Rural   Total
Split     63      49      112
Ranch     15      33      48
Total     78      82      160

Expected
              Location
Style     Urban   Rural   Total
Split     55      57      112
Ranch     23      25      48
Total     78      82      160

$$\chi^2_{\text{statistic}} = \frac{(63-55)^2}{55} + \frac{(49-57)^2}{57} + \frac{(15-23)^2}{23} + \frac{(33-25)^2}{25} = 7.62$$
(using expected counts rounded to whole numbers)

Step 3: Find χ² significant
df = (2 - 1)(2 - 1) = 1
α = .05 (default value)
$$\chi^2_{\text{significant}} = \chi^2_{\alpha,\,df} = \chi^2_{.05,\,1} = 3.841$$

Step 4: Compare χ² statistic to χ² significant
χ² Statistic (7.62) > χ² Significant (3.841)
Reject Ho. There is evidence of a relationship between the variables.
Remember to state your result in the context of the specific problem! The style of house differs depending on the location.
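The four steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the lecture's own tool (the slides use a printed table and Excel); the function names are made up for the example. The slides' 7.62 comes from expected counts rounded to whole numbers; with unrounded expected counts the statistic is about 8.41, and the conclusion is the same.

```python
# Observed counts from the house-style example (rows: Split, Ranch; columns: Urban, Rural).
observed = [[63, 49],
            [15, 33]]

def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square(table, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e
               for o_row, e_row in zip(table, expected)
               for o, e in zip(o_row, e_row))

expected = expected_counts(observed)
rounded = [[round(e) for e in row] for row in expected]  # whole-number counts, as on the slide

chi2_exact = chi_square(observed, expected)
chi2_rounded = chi_square(observed, rounded)
df = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_05_DF1 = 3.841  # table value quoted in the lecture for alpha = .05, df = 1

print(rounded)
print(round(chi2_exact, 2), round(chi2_rounded, 2), df)
print("Reject Ho" if chi2_exact > CRITICAL_05_DF1 else "Fail to reject Ho")
```

Either way the statistic is well above 3.841, so Ho is rejected.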
Your turn!
The market research group for Albers Brewery of Tucson, AZ, wants to know whether preferences of beer type (light, regular, dark) differ by gender (male, female). If beer preference is independent of gender, one advertising campaign will be initiated. However, if beer preference depends on the gender of the beer drinker, the firm will tailor its promotions to different target markets.
Are the variables related?
A survey was conducted and the following data were collected:

Beer      Male   Female   Total
Light     20     30       50
Regular   40     30       70
Dark      20     10       30
Total     80     70       150

[Figure: 100% stacked bar graph of beer preference by gender. Male: 25% Light, 50% Regular, 25% Dark. Female: 43% Light, 43% Regular, 14% Dark.]
At the .05 level of significance, is there a statistically significant difference between beer preference for males and females? What about at the .01 level of significance?
Your turn! Solution
Step 1: Form Hypothesis
Ho null hypothesis: No relationship exists (Variables are Independent)
Ha alternative hypothesis: Variables are related (Dependent)

Step 2: Calculate chi-squared (χ²) statistic
Expected count for each cell = (Row total x Column total) / n
$$\chi^2_{\text{statistic}} = \frac{(20-27)^2}{27} + \frac{(40-37)^2}{37} + \frac{(20-16)^2}{16} + \frac{(30-23)^2}{23} + \frac{(30-33)^2}{33} + \frac{(10-14)^2}{14} = 6.604$$

Step 3: Find χ² significant
df = (3 - 1)(2 - 1) = 2
$$\chi^2_{.05,\,2} = 5.991 \qquad \chi^2_{.01,\,2} = 9.210$$

Step 4: Compare χ² statistic to χ² significant
REJECT Ho at the .05 level of significance (χ² Statistic (6.604) > χ² Significant (5.991)). There is a difference in beer preferences for men and women. More females prefer light beer than men. Men prefer regular beer over light/dark and females prefer light/regular over dark beer.
FAIL TO REJECT Ho at the .01 level of significance (χ² Statistic (6.604) < χ² Significant (9.210)). There is not enough evidence to reject Ho. Any differences in cell frequencies could be explained by chance.
What is Significance Level?
The significance level (α) is how often you are wrong when you reject Ho (also called a type 1 error).
The most common value is α = .05. This means there is a 5% chance of rejecting Ho when Ho is actually true. Conversely, when Ho is true there is a 95% chance that we correctly fail to reject it.
What should α be? It is subjective. Other common values for α are .01 (99% confident in our results) and .10 (90% sure of our results).
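One way to see what the significance level means is a small simulation. The sketch below is an illustration, not part of the lecture: it draws many samples in which the two variables really are independent (Ho is true), runs the chi-square test on each, and counts how often Ho is wrongly rejected. The probabilities 0.7 and 0.5, the sample size, and the seed are arbitrary assumptions; 3.841 is the table value for α = .05, df = 1.

```python
import random

def chi_square_2x2(table):
    """Chi-square statistic for a 2x2 table of counts."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            e = row_totals[i] * col_totals[j] / n
            if e == 0:
                return 0.0  # degenerate sample: treat as no evidence against Ho
            stat += (table[i][j] - e) ** 2 / e
    return stat

def false_rejection_rate(trials=2000, n=200, critical=3.841, seed=1):
    """Share of samples drawn under a true Ho in which the test still rejects Ho."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        table = [[0, 0], [0, 0]]
        for _ in range(n):
            row = 0 if rng.random() < 0.7 else 1  # first variable
            col = 0 if rng.random() < 0.5 else 1  # second variable, drawn independently
            table[row][col] += 1
        if chi_square_2x2(table) > critical:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(rate)  # should land near 0.05
```

The printed rate should be close to .05: about 1 sample in 20 rejects Ho even though no relationship exists. That is the Type I error rate.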
Important wording of conclusion
REJECT vs FAIL TO REJECT
Let's use the legal system as an example. The defendant is on trial for murder. A person is presumed innocent until proven guilty.
Ho = Innocent
Ha = Guilty
We assume Ho to be true. If there is enough evidence to support Ha, then we REJECT Ho and accept Ha.
Verdict: GUILTY
REJECT Ho. There is enough evidence to convict (guilty).
Verdict: NOT GUILTY
FAIL TO REJECT Ho. We do not say "Accept Ho". There is NOT enough evidence to convict, but we are not proving innocence.
How much evidence we need is related to how confident we want to be in our results. α (the level of significance) is how often we are wrong (also called type 1 error).
Small Claims Court for endangerment of a child: Less evidence needed to convict. α = .05 means there is a 5% chance you are wrong. Casey Anthony verdict is Reject Ho (GUILTY).
Jury for 1st degree murder: More evidence needed to convict. α = .01 means there is a 1% chance you are wrong. Casey Anthony verdict is Fail to Reject Ho (NOT GUILTY).
Statistically Significant
The value of α used depends on how confident you want to be in your results.
What is statistically significant to one person might not be to another.
Statisticians: Has to have at least a 95% chance of being true to be considered worth telling people about (why α = .05 is the default for any statistical program).
Manager: If something has a 90% chance of being true (α = .1), it is probably better to act as if it were true rather than false!
Type I and Type II errors
Unfortunately, neither the legal system nor statistical testing is perfect.
Remember that Ho = Innocent and Ha = Guilty.

Type I error: The jury finds the defendant GUILTY, but we are WRONG and an innocent person goes to jail!!!
If α = 5%, there is a 5% chance that this error will occur. That is, we are wrong to REJECT Ho.

Type II error: The jury finds the defendant NOT GUILTY, but we are WRONG and a guilty person is set free (ex. OJ Simpson, Casey Anthony)!!!
If β = 5%, there is a 5% chance that this error will occur. That is, we are wrong to FAIL TO REJECT Ho.

Which is worse? How you feel about it depends on the level of α that you choose. α & β have an Inverse Relationship. If we increase α, we decrease β, and vice versa!
[Figure: "Maybe this will help you remember Type I and Type II errors"]
We will learn how to calculate Type II error in another lecture.
Chi-Square (χ²) Distribution
The chi-square distribution is defined by the degrees of freedom (df).
df = (# outcomes of row - 1)(# outcomes of column - 1)
As the number of possible outcomes of a variable increases, the curve approaches a normal distribution.
Compare χ² statistic to χ² significant, or p-value to α
REJECT Ho: χ² statistic > χ² significant; p-value < α
FAIL TO REJECT Ho: χ² statistic < χ² significant; p-value > α
The CHISQ.TEST Excel function returns the p-value.
Since .04 (p-value) < .05 (α): Reject Ho
Chi-square Assumptions
The sample size is large
(expected frequency of each cell is > 5)
Your turn!
Yes! We satisfy the assumption. If we did not, we could not trust the results of this test!
What if Assumptions are not met?
Possibly combine categories to increase values in each cell!
Here, there are substantially fewer older adults than any other group. We could combine the "middle age" and "older adult" categories into a "not young" category. Then we would have a 2x3 cross tab with larger n values.

          Young   Not Young
Music     14      12
News      46      23
Sports    7       12

Fisher's exact test can be used if E(x) < 5... but only for 2x2 tables
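The large-sample assumption can be checked by computing every expected count before running the test. The sketch below is a minimal illustration using the combined Young / Not Young table; the helper names are made up for this example.

```python
def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def meets_large_sample_rule(table, minimum=5):
    """True when every expected count is greater than `minimum`."""
    return all(e > minimum for row in expected_counts(table) for e in row)

# Combined table from the slide (rows: Music, News, Sports; columns: Young, Not Young).
combined = [[14, 12],
            [46, 23],
            [7, 12]]

smallest = min(e for row in expected_counts(combined) for e in row)
print(round(smallest, 2), meets_large_sample_rule(combined))
```

For the combined table the smallest expected count is about 7.8 (Sports / Not Young), so the assumption holds after combining categories.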


Effect size is a measure of the strength of a relationship
There are two families of effect sizes (r and d)
r family: Measuring the association (CORRELATION) between the variables. How much can the change (variance) in one variable be explained by the other?
d family: Quantifying the size of the difference between two groups. How BIG is the difference?
[Figure: stacked bar chart of house style (Split, Ranch) by location (Urban, Rural)]
Some important points

1. We are only concerned with effect size if the result
of the (chi-square) test was statistically significant.
2. The size of the p-value is no indication of the
strength of the association (ex. small P-value does not
imply strong association)
3. We are covering only the most widely used

statistical tools (there are still many more but this is a
basic course in statistics and those tools are for
another course).
Some common measures in the r family
o Phi (φ, pronounced "fi"): 2x2 tables
o Cramér's V: tables that are not 2x2
General guidelines for interpreting strength: a value of .1 is considered a weak (small) effect, .3 a moderate effect, and .5 a strong (large) effect.
The common measures in the r family: Phi (house style example)
$$\chi^2_{\text{statistic}} = \frac{(63-55)^2}{55} + \frac{(49-57)^2}{57} + \frac{(15-23)^2}{23} + \frac{(33-25)^2}{25} = 7.62$$
$$\phi = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{7.62}{160}} = 0.22$$
The relationship is WEAK!
Squaring phi will give you the variance that can be explained. Whether the house location is urban or rural explains (.22 x .22 = .05) 5% of the variance in the style of house built.
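The common measures in the r family: Cramér's V (beer preference example)
$$\chi^2_{\text{statistic}} = \frac{(20-27)^2}{27} + \frac{(40-37)^2}{37} + \frac{(20-16)^2}{16} + \frac{(30-23)^2}{23} + \frac{(30-33)^2}{33} + \frac{(10-14)^2}{14} = 6.604$$
$$V = \sqrt{\frac{\chi^2}{n \times df}} = \sqrt{\frac{6.604}{150 \times 2}} = 0.15$$
The relationship is WEAK!
Squaring Cramér's V will give you the variance that can be explained. The gender of a person explains (.15 x .15 = .023) 2.3% of the variance in beer preference.
Note: many textbooks define Cramér's V with min(# rows - 1, # columns - 1) in place of df. For this 3x2 table that gives sqrt(6.604 / 150) = 0.21, which is still a weak relationship.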
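The two r-family measures are one-line formulas once the chi-square statistic is known. The sketch below reproduces the numbers from the two worked examples. The function names are made up for the example. `cramers_v_lecture` follows the formula as written on the slide (dividing by n x df), while `cramers_v_min_dim` uses the common textbook definition with min(rows - 1, columns - 1), shown only for comparison.

```python
import math

def phi(chi2_stat, n):
    """Phi for a 2x2 table: sqrt(chi-square / n)."""
    return math.sqrt(chi2_stat / n)

def cramers_v_lecture(chi2_stat, n, df):
    """Cramer's V as written on the slide: sqrt(chi-square / (n * df))."""
    return math.sqrt(chi2_stat / (n * df))

def cramers_v_min_dim(chi2_stat, n, n_rows, n_cols):
    """Common textbook form: sqrt(chi-square / (n * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi2_stat / (n * min(n_rows - 1, n_cols - 1)))

house_phi = phi(7.62, 160)
beer_v = cramers_v_lecture(6.604, 150, df=2)
beer_v_alt = cramers_v_min_dim(6.604, 150, n_rows=3, n_cols=2)

print(round(house_phi, 2), round(house_phi ** 2, 2))  # strength and variance explained
print(round(beer_v, 2), round(beer_v_alt, 2))
```

All three values fall between the .1 (weak) and .3 (moderate) guidelines.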
The d family (amount of difference)
How BIG is the difference? The chi-squared test shows a relationship exists... but where does the relationship (difference) occur?
Measure of effect size:
o 2x2 table: Odds ratio (OR)
o Not 2x2 table: Adjusted standardized residuals
[Figure: stacked bar chart of house style by location (2x2 table) and 100% stacked bar graph of beer preference by gender (not 2x2 table)]
Odds Ratio

            Group 1   Group 2   Total
Outcome 1   a         c         a+c
Outcome 2   b         d         b+d
Total       a+b       c+d       a+b+c+d

                 House Location
House Style      Urban   Rural   Total
Split-Level      63      49      112
Ranch            15      33      48
Total            78      82      160

$$OR = \frac{ad}{bc} = \frac{63 \times 33}{15 \times 49} = \frac{2079}{735} = 2.83$$
Group 1 has odds of having outcome 1 that are OR times the odds for group 2 (higher if OR > 1; lower if OR < 1).
The odds of a split-level house style are 2.83 times higher in urban locations than in rural locations.
There is no universal agreement regarding what constitutes a strong or weak association: OR > 2.0 is moderately strong; OR > 5.0 is strong.
Weak associations are more likely to be explained by undetected biases or confounders.
OR used to COMPARE STUDIES
[Figure: forest plot, "Oral Contraceptive Use and Ovarian Cancer". Odds ratios (axis from 0.0 to 3.5, labeled -ve Association / +ve Association) for hospital-based case-control, community-based case-control, and cohort studies: Hildreth et al, 1981; Rosenberg et al, 1982; La Vecchia et al, 1984; Tzonou et al, 1984; Booth et al, 1989; Hartge et al, 1989; WHO, 1989; Wu et al, 1988; Prazzini et al, 1991; Newhouse et al, 1977; Casagrande et al, 1979; Cramer et al, 1982; Willet et al, 1981; Weiss, 1981; Risch et al, 1983; CASH, 1987; Harlow et al, 1988; Shu et al, 1989; Walnut Creek, 1981; Vessey et al, 1987; Beral et al, 1988.]
Source: Hankinson SE et al. Obstet Gynecol. 1991;80:708-714. www.contraceptiononline.org
Measure of effect sizes for medical studies
Question of interest: Is smoking related to cancer?

                        Group 1 (smoke)   Group 2 (don't smoke)   Total
Outcome 1 (cancer)      a = 30            c = 10                  a+c = 40
Outcome 2 (no cancer)   b = 70            d = 90                  b+d = 160
Total                   a+b = 100         c+d = 100               a+b+c+d = 200

Odds ratio (OR)
$$OR = \frac{ad}{bc} = \frac{30 \times 90}{70 \times 10} = 3.9$$
People that smoke have odds of developing cancer that are 3.9 times the odds for those that don't smoke.

Relative Risk (RR)
$$RR = \frac{a/(a+b)}{c/(c+d)} = \frac{30/100}{10/100} = 3$$
Those that smoke are 3 times as likely (300% of the risk) to develop lung cancer as those that don't smoke.

Relative risk improvement/reduction (RRR)
$$RRR = \frac{a/(a+b) - c/(c+d)}{c/(c+d)} = \frac{(30/100) - (10/100)}{10/100} = 2$$
There is a 200% increase in lung cancer cases for smokers compared to those that don't smoke.

Absolute risk improvement/reduction (ARR)
$$ARR = \frac{a}{a+b} - \frac{c}{c+d} = (30/100) - (10/100) = .2$$
Out of every 10 people that smoke, on average 2 more will get cancer than out of every 10 people that don't smoke.

Number needed to treat/harm (NNT/NNH)
$$NNT = \frac{1}{ARR} = \frac{1}{.2} = 5$$
Five people would need to be "harmed" (by smoking) to see one additional case of cancer develop.
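All five measures come from the same four cell counts, so they are easy to compute together. The sketch below is a minimal illustration (the function name and dictionary keys are made up for the example); it uses the smoking counts a = 30, b = 70, c = 10, d = 90 and then reuses the odds ratio for the house style table.

```python
def risk_measures(a, b, c, d):
    """a, b = group 1 with / without the outcome; c, d = group 2 with / without the outcome."""
    risk_1 = a / (a + b)
    risk_2 = c / (c + d)
    arr = risk_1 - risk_2
    return {
        "OR": (a * d) / (b * c),   # odds ratio
        "RR": risk_1 / risk_2,     # relative risk
        "RRR": arr / risk_2,       # relative risk increase/reduction
        "ARR": arr,                # absolute risk increase/reduction
        "NNT": 1 / arr,            # number needed to treat/harm
    }

smoking = risk_measures(a=30, b=70, c=10, d=90)
for name, value in smoking.items():
    print(name, round(value, 2))

# House example: a = urban split-level, b = urban ranch, c = rural split-level, d = rural ranch.
house_or = risk_measures(a=63, b=15, c=49, d=33)["OR"]
print(round(house_or, 2))
```

The same data give a relative figure of 200% and an absolute figure of 20%.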
ABSOLUTE vs RELATIVE Risk
Remember our calculation for the smoking example:
Those that don't smoke and got cancer: 10/100
Those that smoke and got cancer: 30/100
o Those that smoke are 3 times as likely (x3, or 300% of the risk) to develop lung cancer as those that don't smoke (RR).
o There is a 200% increase in lung cancer cases for smokers compared to those that don't smoke (RRR).
o Out of every 10 people that smoke, on average 2 more will get cancer, 20% (ARI).
There are different ways of describing the same risk, which can profoundly affect how we perceive it. Ultimately, when deciding whether to take a treatment, ideally you should decide with your doctor if the reduction in the ACTUAL (absolute) risk outweighs the risks, side-effects and costs of treatment.

Group A   Group B   RRI    ARI
10%       30%       200%   20%
1%        3%        200%   2%
.1%       .3%       200%   .2%

The RRR sounds better for marketing purposes!...
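The d family for not 2x2 tables
To determine which of the categories are major contributors to the statistical significance, the adjusted standardized residual is computed for each cell:
$$\text{adjusted standardized residual} = \frac{O - E}{\sqrt{E\left(1 - \frac{n_{\text{row}}}{n_{\text{total}}}\right)\left(1 - \frac{n_{\text{column}}}{n_{\text{total}}}\right)}}$$
For males who chose Light beer (O = 20, E = 50 x 80 / 150 = 26.67, shown rounded to 27 in the chi-square calculation):
$$\frac{20 - 26.67}{\sqrt{26.67\left(1 - \frac{50}{150}\right)\left(1 - \frac{80}{150}\right)}} = -2.3$$

Adjusted standardized residuals
          male    female
Light     -2.3     2.3
Regular    0.9    -0.9
Dark       1.6    -1.6

With only two columns, the two residuals in each row are equal in size and opposite in sign.
A statistically significant relationship was found (chi-square test rejected Ho). SPECIFICALLY, there is a statistical difference in male and female responses for those that chose Light beer (look for values that are above +2 or below -2).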
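The residual formula can be applied to every cell at once. The sketch below is a minimal illustration that assumes the observed beer counts implied by the worked solution (rows Light, Regular, Dark; columns male, female) and uses unrounded expected counts; the function name is made up for the example.

```python
import math

# Observed counts implied by the worked solution (rows: Light, Regular, Dark; columns: male, female).
observed = [[20, 30],
            [40, 30],
            [20, 10]]

def adjusted_residuals(table):
    """(O - E) / sqrt(E * (1 - row share) * (1 - column share)) for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            scale = e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((o - e) / math.sqrt(scale))
        result.append(out_row)
    return result

residuals = adjusted_residuals(observed)
for label, row in zip(["Light", "Regular", "Dark"], residuals):
    print(label, [round(v, 1) for v in row])
```

Only the Light row has residuals beyond +2 or -2, which is where the male and female responses differ.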
There are two contingency table tests:
(1) Two-way table contingency test (also called test of independence)
(2) One-way table contingency test (also called goodness of fit test)

One-way table contingency test
1. Is there a relationship between the variable and a specific distribution?
o Visually (Bar Graph)
2. What is the effect size?
One-way contingency table test is called a Goodness of Fit test
A hypothesis is a belief about the results of a statistical study.
The Goodness of Fit test checks whether the outcomes of a variable follow a hypothesized distribution (or, put another way, the relationship of a variable to a specific distribution).
Examples of hypothesized distributions:
o The Republican party is significantly larger than any other.
o There is an equal number of individuals in each party.
o The Democrat party is significantly larger than any other.
Example
Is it safer to fly in the front, middle, or back of the airplane?
Matt McCormick, a survival expert for the National Transportation Safety Board, told Travel Magazine that "There is no one safe place to sit."
In an effort to test this claim, United Airlines recorded the seat position for 87 fatalities.
Collected raw data must be organized into a frequency table.

Seat     f
Back     23
Middle   35
Front    29
Total    87

United Airlines compared these actual results to hypothesized results, which is the belief that fatality is the same whether one sits in the front, back, or middle of an airplane.
87/3 = 29
Front is 29
Middle is 29
Back is 29
This is a uniform distribution!
How do we test this claim?
1. Form hypothesis.
Ho null hypothesis: The data are consistent with a specified distribution.
Ha alternative hypothesis: The data are NOT consistent with a specified distribution.

2. Calculate chi-squared (χ²) statistic.
$$\chi^2_{\text{statistic}} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

3. Find χ² significant in table.
$$\chi^2_{\text{significant}} = \chi^2_{\alpha,\,df}$$
df = # outcomes - 1

4. Compare χ² statistic to χ² significant.
χ² statistic > χ² significant: Reject Ho, Accept Ha.
χ² statistic < χ² significant: Fail to Reject Ho.
Solution
Step 1: Form Hypothesis
Ho null hypothesis: The data are consistent with a specified distribution. There is no one safe place to sit.
Ha alternative hypothesis: The data are NOT consistent with a specified distribution. There is a safe place to sit.

Step 2: Calculate chi-squared (χ²) statistic

Outcome   f (observed)   Hypothesized distribution f (expected)
Back      23             29
Middle    35             29
Front     29             29
Total     87             87

$$\chi^2_{\text{statistic}} = \frac{(23-29)^2}{29} + \frac{(35-29)^2}{29} + \frac{(29-29)^2}{29} = 1.24 + 1.24 + 0 = 2.48$$

Step 3: Find χ² significant
df = # outcomes - 1 = 3 - 1 = 2
$$\chi^2_{\alpha,\,df} = \chi^2_{.05,\,2} = 5.991$$

Step 4: Compare χ² statistic to χ² significant
2.48 < 5.991
χ² statistic < χ² significant
Fail to Reject Ho.
There is not enough evidence to refute the claim that there is no one safe place to sit!
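The goodness-of-fit calculation is the same sum as before, applied to a single row of counts. The sketch below is a minimal illustration; the function name is made up for the example, and the hypothesized distribution defaults to uniform as in the airplane example. The critical value 5.991 is the table value quoted above for α = .05, df = 2.

```python
def goodness_of_fit(observed, expected=None):
    """Chi-square goodness-of-fit statistic; defaults to a uniform hypothesized distribution."""
    n = sum(observed)
    if expected is None:
        expected = [n / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df

seats = [23, 35, 29]  # Back, Middle, Front
stat, df = goodness_of_fit(seats)
CRITICAL_05_DF2 = 5.991  # table value for alpha = .05, df = 2

print(round(stat, 2), df)
print("Reject Ho" if stat > CRITICAL_05_DF2 else "Fail to reject Ho")
```

Passing a different `expected` list tests any other hypothesized distribution, as long as the expected counts sum to the same total as the observed counts.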
Your turn! Goodness of Fit
Is Sudden Infant Death Syndrome (SIDS) Seasonal? Data from King County, Washington regarding the number of deaths from SIDS for each season:

Season   fo    fe
Winter   78    322/4 = 80.5
Spring   71    80.5
Summer   87    80.5
Fall     86    80.5
Total    322   322

Step 1: Form Hypothesis
Ho null hypothesis: Data follow the hypothesized distribution (uniform: SIDS deaths for all seasons are the same).
Ha alternative hypothesis: Data don't follow the hypothesized distribution.

Step 2: Calculate chi-squared (χ²) statistic
$$\chi^2_{\text{statistic}} = \frac{(78-80.5)^2}{80.5} + \frac{(71-80.5)^2}{80.5} + \frac{(87-80.5)^2}{80.5} + \frac{(86-80.5)^2}{80.5} = 2.10$$

Step 3: Find χ² significant
df = # outcomes - 1 = 4 - 1 = 3
$$\chi^2_{\alpha,\,df} = \chi^2_{.05,\,3} = 7.815$$

Step 4: Compare χ² statistic to χ² significant
χ² statistic (2.10) < χ² significant (7.815)
Fail to Reject Ho.
Conclusion: Sudden infant death syndrome proportions across seasons are not statistically different from what's expected by chance (i.e. all seasons being equal).
CHISQ.TEST Excel function
The output is a p-value, which is compared to α.
p-value = .55
χ² statistic (2.10) < χ² significant (7.815)
p-value (.55) > α (.05)
Fail to Reject Ho
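The p-value is the area under the chi-square curve to the right of the statistic. The sketch below computes it with the standard library only, using the closed-form expressions that exist for whole-number degrees of freedom. It is an illustration of the idea, not Excel's implementation, and the function name is made up for the example.

```python
import math

def chi2_p_value(stat, df):
    """Upper-tail probability P(X >= stat) for a chi-square variable with integer df >= 1."""
    half = stat / 2.0
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # exact result for df = 1
    else:
        q, k = math.exp(-half), 2              # exact result for df = 2
    # Step up two degrees of freedom at a time until df is reached.
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q

# SIDS example: observed deaths per season against a uniform expectation of 80.5.
observed = [78, 71, 87, 86]
stat = sum((o - 80.5) ** 2 / 80.5 for o in observed)
p_value = chi2_p_value(stat, df=3)

print(round(stat, 2), round(p_value, 2))
print("Reject Ho" if p_value < 0.05 else "Fail to reject Ho")
```

As a cross-check, feeding in the critical values from the table (3.841 with df = 1, 5.991 with df = 2, 7.815 with df = 3) returns a p-value of about .05 each time. This is why comparing the statistic to the table value and comparing the p-value to α always give the same decision.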
Goodness of Fit Test Assumptions

Test                                             Assumptions
Exact Binomial test (we didn't learn by hand)    2 outcomes; samples up to n = 1000
Chi-square test                                  Large sample: E(x) > 5
Goodness of fit effect size
Just like the test of independence, use Cramér's V (more than 2 outcomes) or Phi (2 outcomes).
Unlike the test of independence, compute the effect size whether you Reject or Fail to Reject Ho.
The interpretation is different from the test of independence!
A value of .1 is considered a close to perfect fit, .3 a moderate effect, and .5 a weak fit.
Goodness of fit effect size
Is Sudden Infant Death Syndrome (SIDS) Seasonal? Data from King County, Washington regarding the number of deaths from SIDS for each season:

Season   f
Winter   78
Spring   71
Summer   87
Fall     86
Total    322

$$\chi^2_{\text{statistic}} = \frac{(78-80.5)^2}{80.5} + \frac{(71-80.5)^2}{80.5} + \frac{(87-80.5)^2}{80.5} + \frac{(86-80.5)^2}{80.5} = 2.10$$
Cramér's V
$$V = \sqrt{\frac{\chi^2}{n \times df}} = \sqrt{\frac{2.10}{322 \times 3}} = .05$$
Interpretation:
A value of 0 indicates that the sample proportions are exactly equal (a perfect fit) to the hypothesized proportions (i.e., O = E). As V increases, the degree of departure from a perfect fit increases.
Since V = .05, there is a small effect, or small departure from fit.
Why are we learning this?
To test if a sample of data came from a population with a specific distribution is called a GOODNESS-OF-FIT TEST.
We can use the GOODNESS-OF-FIT TEST to validate the use of a specific distribution for:
o SIMULATION
o PROBABILITY
Typical steps for a Statistical study
(Reminder... How this lecture fits in with everything we have learned so far...)

1. Define the goals
o Research Question
2. Collect the data
o Research Designs
3. Organize the data
o Tables for each variable (f, %f, cf, %cf)
o Table for 2 qualitative variables (f contingency table, %f total, %f independent (column) variable)
4. Present the data
o Graphs of each variable (pareto, pie, bar, ogive, histogram, boxplot)
o Graphs for 2 qualitative variables (Stacked & Clustered bar graph)
o Graph for 1 quantitative variable and 1 qualitative variable (Comparative Boxplot)
5. Describe the data
o Statistics and Parameters
6. Analyze the data
o Statistical tests (chi-square, Fisher's Exact)
o Effect size (OR, Adjusted Standardized Residual, Phi, Cramér's V)
7. Interpret results

End of the Lecture!
Remember: If you need help... call me, see me, or email me.
