Data Analysis: Analyzing Individual Variables and Basics of Hypothesis Testing

Chapter 19
Dr S.L Gupta
Data Analysis: Two Key Considerations

(1) Is the variable to be analyzed by itself (univariate analysis) or in relationship to other variables (multivariate analysis)? (2) What level of measurement was used?
If you can answer these two questions, data analysis is easy...
CATEGORICAL MEASURES: A commonly used expression for nominal and ordinal measures.
CONTINUOUS MEASURES: A commonly used expression for interval and ratio measures.
Basic Univariate Statistics: Categorical Measures

FREQUENCY ANALYSIS: A count of the number of cases that fall into each of the response categories.
Use of Percentages
Percentages are very useful for interpreting the results of categorical analyses and should be included whenever possible.
Unless your sample size is VERY large, however, report percentages as whole numbers (i.e., no decimals).
Frequency Analysis
Researchers almost always work with valid percentages, which are simply percentages calculated after excluding cases with missing data on the variable being analyzed.
Note: In the example, there were no missing cases. As a result, the Percent column entries were identical to the Valid Percent column entries.
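As a minimal sketch of how the Percent and Valid Percent columns differ, the Python below tallies a small hypothetical set of responses. The `responses` list and the function name `frequency_table` are illustrative assumptions, not data from the exhibit; `None` marks a case with missing data.

```python
from collections import Counter

def frequency_table(responses):
    """Return {category: (count, percent, valid_percent)} for a categorical variable.

    None is treated as a missing response: it counts toward the total number
    of cases (Percent) but is excluded from the base for Valid Percent.
    """
    total = len(responses)
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    table = {}
    for category, count in sorted(counts.items()):
        percent = 100 * count / total
        valid_percent = 100 * count / len(valid)
        # Report whole-number percentages, as recommended for small samples.
        table[category] = (count, round(percent), round(valid_percent))
    return table

# Hypothetical data: 10 cases, 2 of them missing.
responses = ["yes", "no", "no", None, "yes", "no", "no", None, "no", "no"]
for category, (count, pct, valid_pct) in frequency_table(responses).items():
    print(f"{category:<5} n={count}  percent={pct}%  valid percent={valid_pct}%")
```

With no missing cases the two percentage columns would be identical, as in the example noted above.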
Uses of Frequency Analysis

Univariate categorical analysis
Identify blunders and cases with excessive item nonresponse
Identify outliers
Determine the empirical distribution of a variable
Confidence Interval
A projection of the range within which a population parameter will lie at a given level of confidence based on a statistic obtained from a probabilistic sample.
This is why you need to draw a probability sample!
Confidence Intervals for Proportions
CONFIDENCE INTERVAL: $p \pm z \sqrt{\dfrac{p(1-p)}{n}}$
where z = z score associated with the desired level of confidence; p = the proportion obtained from the sample; and n = the number of valid cases overall on which the proportion was based.

EXAMPLE: In Exhibit 19.2, we saw that 30% of the people in the sample had financed their most recent car purchase. Assuming that the 100 respondents had been secured using a probability sampling plan, what is the 95% confidence interval for the population parameter?
Therefore, we can be 95% confident that the proportion of people in the population who would respond that they had financed their most recent car purchase is between .21 and .39, inclusive.
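The arithmetic behind this interval can be sketched in Python as follows. This is a minimal illustration assuming the usual large-sample formula p ± z·sqrt(p(1 − p)/n); the function name `proportion_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(p, n, confidence=0.95):
    """Confidence interval for a proportion: p +/- z * sqrt(p * (1 - p) / n)."""
    # Two-sided z score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Values from the example: 30% of 100 respondents financed their purchase.
low, high = proportion_confidence_interval(p=0.30, n=100)
print(f"95% CI: {low:.2f} to {high:.2f}")  # 95% CI: 0.21 to 0.39
```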
CAUTION in Interpreting Confidence Intervals

The confidence interval only takes sampling error into account. It DOES NOT account for other common types of error (e.g., response error, nonresponse error). The goal is to reduce TOTAL error, not just one type of error.
Basic Univariate Statistics: Continuous Measures

DESCRIPTIVE STATISTICS: Statistics that describe the distribution of responses on a variable. The most commonly used descriptive statistics are the mean and standard deviation.
Converting Continuous Measures to Categorical Measures

Sometimes it is useful to convert continuous measures to categorical measures.
This is legitimate, because measures at higher levels of measurement (in this case, continuous measures) have all the properties of measures at lower levels of measurement (categorical measures).
Why do this? Ease of interpretation for managers


TWO-BOX TECHNIQUE: A technique for converting an interval-level rating scale into a categorical measure usually used for presentation purposes. The percentage of respondents choosing one of the top two positions on a rating scale is reported.

Please rate the quality of service provided by Better Smiles Dental Office on the following scales:

                     very poor   poor   neutral   good   very good
Dental technicians      (2)      (6)     (36)     (32)     (24)
Receptionist           (10)     (16)     (18)     (36)     (20)
Dentist                (17)     (17)     (35)     (21)     (10)

Frequency count of respondents selecting each response category shown in parentheses
                     two-box   mean (s.d.)
Dental technicians     56%     3.70 (0.97)
Receptionist           56%     3.40 (1.25)
Dentist                31%     2.90 (1.21)
(n=100)
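The two-box percentages and means can be reproduced from the frequency counts. The sketch below assumes the scale is coded 1 (very poor) through 5 (very good) and that the reported s.d. is the sample standard deviation; the names `SCALE`, `counts`, and `summarize` are illustrative.

```python
from statistics import mean, stdev

# Assumed coding of the five-point rating scale.
SCALE = {"very poor": 1, "poor": 2, "neutral": 3, "good": 4, "very good": 5}

# Frequency counts from the rating table (n = 100 per item).
counts = {
    "Dental technicians": [2, 6, 36, 32, 24],
    "Receptionist": [10, 16, 18, 36, 20],
    "Dentist": [17, 17, 35, 21, 10],
}

def summarize(freqs):
    """Return (two-box percentage, mean, sample standard deviation)."""
    n = sum(freqs)
    # Two-box: share of respondents choosing one of the top two positions.
    two_box = 100 * (freqs[-2] + freqs[-1]) / n
    # Expand the counts back into individual 1-5 scores.
    scores = [score for score, f in zip(SCALE.values(), freqs) for _ in range(f)]
    return two_box, mean(scores), stdev(scores)

for item, freqs in counts.items():
    two_box, avg, sd = summarize(freqs)
    print(f"{item:<20} two-box={two_box:.0f}%  mean={avg:.2f}  s.d.={sd:.2f}")
```

Note how the dental technicians and the receptionist share the same two-box score (56%) even though their means differ, which is the information lost when a continuous measure is collapsed into categories. Small last-digit differences from the tabled standard deviations can arise from rounding.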
Confidence Intervals for Means
CONFIDENCE INTERVAL: $\bar{x} \pm z \dfrac{s}{\sqrt{n}}$
where $\bar{x}$ = the sample mean; z = z score associated with the desired level of confidence; s = the sample standard deviation; and n = the total number of cases used to calculate the mean.

EXAMPLE: A sample of 100 car owners revealed that the mean number of family members was 4.0, with a sample standard deviation of 1.9 family members. Assuming that the 100 respondents had been secured using a probability sampling plan, what is the 95% confidence interval for the mean number of family members in the population?
Therefore, we can be 95% confident that the mean number of family members in the population lies somewhere between 3.6 and 4.4, inclusive.
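The same calculation can be sketched in Python. This is a minimal illustration assuming the formula x̄ ± z·s/sqrt(n); the function name `mean_confidence_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(x_bar, s, n, confidence=0.95):
    """Confidence interval for a mean: x_bar +/- z * s / sqrt(n)."""
    # Two-sided z score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Values from the example: mean 4.0, s.d. 1.9, n = 100.
low, high = mean_confidence_interval(x_bar=4.0, s=1.9, n=100)
print(f"95% CI: {low:.1f} to {high:.1f}")  # 95% CI: 3.6 to 4.4
```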
Hypothesis Testing
THE ISSUE: How can we tell if a particular result in the sample represents the true situation in the population or simply occurred by chance?
HYPOTHESES: Unproven propositions about some phenomenon of interest.
Hypothesis Testing
Null Hypothesis (H0): The hypothesis that a proposed result is not true for the population. Researchers typically attempt to reject the null hypothesis in favor of some alternative hypothesis.
Alternative Hypothesis (HA): The hypothesis that a proposed result is true for the population.
Typical Hypothesis Testing Procedure

(1) Specify Null and Alternative Hypotheses after Analyzing the Research Problem
(2) Choose an Appropriate Statistical Test Considering the Research Design and after Determining the Sampling Distribution That Applies Given the Chosen Test Statistic
(3) Specify the Significance Level (Alpha) for the Problem Being Investigated
(4) Collect the Data and Compute the Value of the Test Statistic Appropriate for the Sampling Distribution
(5) Determine the Probability of the Test Statistic under the Null Hypothesis Using the Sampling Distribution Specified in Step 2
(6) Compare the Obtained Probability with the Specified Significance Level and Then Reject or Do Not Reject the Null Hypothesis on the Basis of the Comparison
Significance Level (α)
The acceptable level of Type I error selected by the researcher, usually set at 0.05. A Type I error is rejecting the null hypothesis when it is actually true for the population; α is the probability of making that error.
p-value
The probability of obtaining a given result if in fact the null hypothesis were true in the population. A result is regarded as statistically significant if the p-value is less than the chosen significance level of the test.
Common Misinterpretations of What Statistically Significant Means

Viewing p-values as if they represent the probability that the results occurred because of sampling error (e.g., p=.05 implies that there is only a .05 probability that the results were caused by chance).
Assuming that statistical significance is the same thing as managerial significance.
Viewing the α or p levels as if they are somehow related to the probability that the research hypothesis is true (e.g., a p-value such as p<.001 is highly significant and therefore more valid than p<.05).
Testing Hypotheses about Individual Variables

Chi-square Goodness-of-Fit Test for Frequencies: A statistical test to determine whether some observed pattern of frequencies corresponds to an expected pattern.
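As a rough sketch of the goodness-of-fit computation, the Python below compares hypothetical observed counts with the counts expected under equal proportions. The data, the function name `chi_square_gof`, and the choice of an equal-proportions null are illustrative assumptions; the critical value 5.991 is the standard chi-square table value for 2 degrees of freedom at α = 0.05.

```python
def chi_square_gof(observed, expected_proportions):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    n = sum(observed)
    expected = [n * prop for prop in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 respondents choosing among three options,
# tested against the expectation that each option is equally popular.
observed = [40, 35, 25]
chi2 = chi_square_gof(observed, [1 / 3, 1 / 3, 1 / 3])

critical_value = 5.991  # chi-square table, df = 3 - 1 = 2, alpha = 0.05
decision = "reject H0" if chi2 > critical_value else "do not reject H0"
print(f"chi-square = {chi2:.2f}, {decision}")
```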

Kolmogorov-Smirnov Test: A statistical test used with ordinal data to determine whether some observed pattern of frequencies corresponds to some expected pattern; also used to determine whether two independent samples have been drawn from the same population or from populations with the same distribution.

Z-test for Comparing Sample Proportion against a Standard
$z = \dfrac{p - \pi}{\sigma_p}$, with $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$
where p = proportion from the sample, $\pi$ = the proportion standard to be achieved, $\sigma_p$ = the standard error of the proportion, and n = number of respondents in the sample.
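A minimal Python sketch of this test follows. It reuses the sample proportion from the car-financing example (p = 0.30, n = 100); the standard of 0.25, the two-tailed p-value, and the function name `z_test_proportion` are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def z_test_proportion(p, standard, n):
    """z test of a sample proportion against a standard proportion.

    The standard error is computed from the standard (the value under H0).
    Returns the z statistic and a two-tailed p-value.
    """
    sigma_p = math.sqrt(standard * (1 - standard) / n)
    z = (p - standard) / sigma_p
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.05
z, p_value = z_test_proportion(p=0.30, standard=0.25, n=100)  # 0.25 is hypothetical
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"z = {z:.2f}, p = {p_value:.3f}, {decision}")
```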

t-test for Comparing Sample Mean against a Standard (Small Sample, n ≤ 30)
$t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, with $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
where $\bar{x}$ = sample mean, $\mu$ = the population standard, $s_{\bar{x}}$ = the standard error of the mean, s = sample standard deviation, and n = sample size.
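The small-sample test can be sketched in Python as below. The eight ratings, the standard of 3.5, and the function name `t_statistic` are hypothetical; the critical value 2.365 is the standard t-table value for 7 degrees of freedom in a two-tailed test at α = 0.05.

```python
import math
from statistics import mean, stdev

def t_statistic(sample, standard):
    """t statistic for a sample mean against a standard: (x_bar - mu) / (s / sqrt(n))."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    return (mean(sample) - standard) / standard_error

# Hypothetical small sample (n = 8) tested against a standard of 3.5.
sample = [4.1, 3.8, 4.5, 4.0, 3.6, 4.4, 4.2, 3.9]
t = t_statistic(sample, standard=3.5)

critical_value = 2.365  # t table, df = n - 1 = 7, two-tailed, alpha = 0.05
decision = "reject H0" if abs(t) > critical_value else "do not reject H0"
print(f"t = {t:.2f}, {decision}")
```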

z-test for Comparing Sample Mean against a Standard (Large Sample, n > 30)
$z = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, with $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
where $\bar{x}$ = sample mean, $\mu$ = the population standard, $s_{\bar{x}}$ = the standard error of the mean, s = sample standard deviation, and n = sample size.
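A minimal Python sketch of the large-sample test follows. It reuses the family-size example (mean 4.0, s.d. 1.9, n = 100); the standard of 3.5, the two-tailed p-value, and the function name `z_test_mean` are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def z_test_mean(x_bar, standard, s, n):
    """z test of a sample mean against a standard: (x_bar - mu) / (s / sqrt(n)).

    Returns the z statistic and a two-tailed p-value.
    """
    standard_error = s / math.sqrt(n)
    z = (x_bar - standard) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.05
z, p_value = z_test_mean(x_bar=4.0, standard=3.5, s=1.9, n=100)  # 3.5 is hypothetical
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"z = {z:.2f}, p = {p_value:.4f}, {decision}")
```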
