
Data Analysis: Analyzing Individual Variables and Basics of Hypothesis Testing

Chapter 19

Dr S.L Gupta

Data Analysis: Two Key Considerations


(1) Is the variable to be analyzed by itself (univariate analysis) or in relationship to other variables (multivariate analysis)?
(2) What level of measurement was used?
If you can answer these two questions, data analysis is easy...

CATEGORICAL MEASURES: A commonly used expression for nominal and ordinal measures.
CONTINUOUS MEASURES: A commonly used expression for interval and ratio measures.


Basic Univariate Statistics: Categorical Measures


FREQUENCY ANALYSIS: A count of the number of cases that fall into each of the response categories.



Use of Percentages
Percentages are very useful for interpreting the results of categorical analyses and should be included whenever possible.
Unless your sample size is VERY large, however, report percentages as whole numbers (i.e., no decimals).


Frequency Analysis
Researchers almost always work with valid percentages, which are simply percentages calculated after removing cases with missing data on the variable being analyzed.
Note: In the example, there were no missing cases. As a result, the Percent column entries were identical to the Valid Percent column entries.
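A minimal Python sketch of a frequency analysis follows. The response data are hypothetical (not taken from the exhibit), and the function name `frequency_table` and the use of `None` to mark missing answers are assumptions made for illustration. It shows how Percent and Valid Percent diverge once some cases are missing.

```python
from collections import Counter

def frequency_table(responses):
    """Return {category: (count, percent, valid_percent)}.

    Missing answers are represented by None (an assumption of this sketch).
    Percentages are rounded to whole numbers.
    """
    total = len(responses)
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    table = {}
    for category, count in counts.items():
        percent = round(100 * count / total)             # base: all cases
        valid_percent = round(100 * count / len(valid))  # base: non-missing cases
        table[category] = (count, percent, valid_percent)
    return table

# Hypothetical data: 30 "yes", 60 "no", 10 missing answers.
responses = ["yes"] * 30 + ["no"] * 60 + [None] * 10
table = frequency_table(responses)
for category, (count, pct, valid_pct) in table.items():
    print(category, count, f"{pct}%", f"{valid_pct}%")
```

With no missing cases the two percentage columns would be identical, as in the note above.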

Uses of Frequency Analysis


Univariate categorical analysis
Identify blunders and cases with excessive item nonresponse
Identify outliers
Determine empirical distribution of a variable



Confidence Interval
A projection of the range within which a population parameter will lie at a given level of confidence based on a statistic obtained from a probabilistic sample.
This is why you need to draw a probability sample!


Confidence Intervals for Proportions

CONFIDENCE INTERVAL: $p \pm z\sqrt{\dfrac{p(1-p)}{n}}$
where z = z score associated with the desired level of confidence; p = the proportion obtained from the sample; and n = the number of valid cases overall on which the proportion was based.


Confidence Intervals for Proportions


EXAMPLE: In Exhibit 19.2, we saw that 30% of the people in the sample had financed the most recent car purchase. Assuming that the 100 respondents had been secured using a probability sampling plan, what is the 95% confidence interval for the population parameter?


Confidence Intervals for Proportions

Therefore, we can be 95% confident that the proportion of people in the population who would respond that they had financed their most recent car purchase is between .21 and .39, inclusive.
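The arithmetic behind this interval can be checked with a short Python sketch. The function name `proportion_ci` is hypothetical, and z = 1.96 is the usual z score for 95% confidence; the inputs (p = .30, n = 100) come from the example.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Confidence interval for a proportion: p +/- z * sqrt(p(1-p)/n)."""
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Values from the car-financing example: p = .30, n = 100, 95% confidence.
low, high = proportion_ci(0.30, 100)
print(f"{low:.2f} to {high:.2f}")  # 0.21 to 0.39
```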

CAUTION in Interpreting Confidence Intervals


The confidence interval only takes sampling error into account.
It DOES NOT account for other common types of error (e.g., response error, nonresponse error).
The goal is to reduce TOTAL error, not just one type of error.


Basic Univariate Statistics: Continuous Measures


DESCRIPTIVE STATISTICS: Statistics that describe the distribution of responses on a variable. The most commonly used descriptive statistics are the mean and standard deviation.


Converting Continuous Measures to Categorical Measures


Sometimes it is useful to convert continuous measures to categorical measures.
This is legitimate, because measures at higher levels of measurement (in this case, continuous measures) have all the properties of measures at lower levels of measurement (categorical measures).

Why do this? Ease of interpretation for managers



Converting Continuous Measures to Categorical Measures


TWO-BOX TECHNIQUE: A technique for converting an interval-level rating scale into a categorical measure usually used for presentation purposes. The percentage of respondents choosing one of the top two positions on a rating scale is reported.


Converting Continuous Measures to Categorical Measures


Please rate the quality of service provided by Better Smiles Dental Office on the following scales:

                      very poor   poor   neutral   good   very good
Dental technicians       (2)       (6)     (36)    (32)     (24)
Receptionist            (10)      (16)     (18)    (36)     (20)
Dentist                 (17)      (17)     (35)    (21)     (10)

Frequency count of respondents selecting each response category shown in parentheses



Converting Continuous Measures to Categorical Measures

                      two-box   mean   (s.d.)
Dental technicians      56%     3.70   (0.97)
Receptionist            56%     3.40   (1.25)
Dentist                 31%     2.90   (1.21)
(n=100)
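The two-box percentages and means can be reproduced from the frequency counts in the Better Smiles rating example. The Python sketch below assumes the scale is scored 1 ("very poor") through 5 ("very good") and uses the n - 1 divisor for the standard deviation; both are assumptions, since the slides do not state them, so the last digit of a standard deviation may differ from the figure reported. The function name `summarize` is hypothetical.

```python
import math

# Frequency counts per response category, ordered from
# "very poor" (scored 1) to "very good" (scored 5).
counts = {
    "Dental technicians": [2, 6, 36, 32, 24],
    "Receptionist": [10, 16, 18, 36, 20],
    "Dentist": [17, 17, 35, 21, 10],
}

def summarize(freqs):
    """Return (two-box percentage, mean, sample standard deviation)."""
    n = sum(freqs)
    # Two-box: share of respondents choosing the top two scale positions.
    two_box = 100 * (freqs[-2] + freqs[-1]) / n
    scored = list(enumerate(freqs, start=1))  # (score, frequency) pairs
    mean = sum(score * f for score, f in scored) / n
    squared_dev = sum(f * (score - mean) ** 2 for score, f in scored)
    sd = math.sqrt(squared_dev / (n - 1))     # n - 1 divisor (assumed)
    return two_box, mean, sd

for item, freqs in counts.items():
    two_box, mean, sd = summarize(freqs)
    print(f"{item}: two-box {two_box:.0f}%, mean {mean:.2f}")
```

Note that dental technicians and the receptionist share the same two-box score (56%) but have different means, which is the information lost when a continuous measure is collapsed into categories.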


Confidence Intervals for Means

CONFIDENCE INTERVAL: $\bar{x} \pm z\dfrac{s}{\sqrt{n}}$
where z = z score associated with the desired level of confidence; s = the sample standard deviation; and n = the total number of cases used to calculate the mean.


Confidence Intervals for Means


EXAMPLE: A sample of 100 car owners revealed that the mean number of family members was 4.0, with a sample standard deviation of 1.9 family members. Assuming that the 100 respondents had been secured using a probability sampling plan, what is the 95% confidence interval for the mean number of family members in the population?


Confidence Intervals for Means

Therefore, we can be 95% confident that the mean number of family members in the population lies somewhere between 3.6 and 4.4, inclusive.
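The calculation behind this interval can be sketched in Python as follows. The function name `mean_ci` is hypothetical, and z = 1.96 is the usual z score for 95% confidence; the inputs (mean 4.0, s = 1.9, n = 100) come from the example.

```python
import math

def mean_ci(mean, s, n, z=1.96):
    """Confidence interval for a mean: mean +/- z * s / sqrt(n)."""
    margin = z * s / math.sqrt(n)
    return mean - margin, mean + margin

# Values from the family-size example: mean = 4.0, s = 1.9, n = 100.
low, high = mean_ci(4.0, 1.9, 100)
print(f"{low:.1f} to {high:.1f}")  # 3.6 to 4.4
```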


Hypothesis Testing

THE ISSUE: How can we tell if a particular result in the sample represents the true situation in the population or simply occurred by chance?

Hypotheses
Unproven propositions about some phenomenon of interest.


Hypothesis Testing
Null Hypothesis (H0): The hypothesis that a proposed result is not true for the population. Researchers typically attempt to reject the null hypothesis in favor of some alternative hypothesis.

Alternative Hypothesis (HA): The hypothesis that a proposed result is true for the population.

Typical Hypothesis Testing Procedure


Step 1: Specify Null and Alternative Hypotheses after Analyzing the Research Problem
Step 2: Choose an Appropriate Statistical Test Considering the Research Design and after Determining the Sampling Distribution That Applies Given the Chosen Test Statistic
Step 3: Specify the Significance Level (Alpha) for the Problem Being Investigated
Step 4: Collect the Data and Compute the Value of the Test Statistic Appropriate for the Sampling Distribution
Step 5: Determine the Probability of the Test Statistic under the Null Hypothesis Using the Sampling Distribution Specified in Step 2
Step 6: Compare the Obtained Probability with the Specified Significance Level and Then Reject or Do Not Reject the Null Hypothesis on the Basis of the Comparison

Significance Level (α)
The acceptable level of Type I error selected by the researcher, usually set at 0.05. A Type I error is rejecting the null hypothesis when it is actually true for the population; α is the probability of making that error.


p-value
The probability of obtaining a result at least as extreme as the one observed if in fact the null hypothesis were true in the population. A result is regarded as statistically significant if the p-value is less than the chosen significance level of the test.
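As an illustration of the comparison between a p-value and the significance level, the Python sketch below converts a z statistic into a p-value using the standard normal distribution. It assumes a two-sided test, which the slides do not specify, and the function names are hypothetical.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z (two-sided test assumed)."""
    return math.erfc(abs(z) / math.sqrt(2))

def is_significant(p_value, alpha=0.05):
    """A result is statistically significant when p-value < alpha."""
    return p_value < alpha

for z in (1.0, 1.96, 2.58):
    p = two_sided_p_value(z)
    print(f"z = {z}: p = {p:.4f}, significant at .05: {is_significant(p)}")
```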


Common Misinterpretations of What Statistically Significant Means


Viewing p-values as if they represent the probability that the results occurred because of sampling error (e.g., p=.05 implies that there is only a .05 probability that the results were caused by chance).

Assuming that statistical significance is the same thing as managerial significance.

Viewing the α or p levels as if they are somehow related to the probability that the research hypothesis is true (e.g., a p-value such as p<.001 is highly significant and therefore more valid than p<.05).


Testing Hypotheses about Individual Variables


Chi-square Goodness-of-Fit Test for Frequencies: A statistical test to determine whether some observed pattern of frequencies corresponds to an expected pattern.
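A minimal Python sketch of the goodness-of-fit statistic follows. The observed counts are hypothetical (100 respondents choosing among three options, with equal preference expected under the null hypothesis), and the function name is an assumption. The statistic is compared with 5.991, the chi-square critical value for 2 degrees of freedom at α = .05.

```python
def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: 100 respondents, three categories, equal split expected.
observed = [45, 30, 25]
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_statistic(observed, expected)
critical_value = 5.991  # chi-square, df = 3 - 1 = 2, alpha = .05
reject_null = statistic > critical_value
print(f"chi-square = {statistic:.2f}, reject H0: {reject_null}")
```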


Testing Hypotheses about Individual Variables


Kolmogorov-Smirnov Test: A statistical test used with ordinal data to determine whether some observed pattern of frequencies corresponds to some expected pattern; also used to determine whether two independent samples have been drawn from the same population or from populations with the same distribution.


Testing Hypotheses about Individual Variables


Z-test for Comparing Sample Proportion against a Standard

$z = \dfrac{p - \pi}{\sigma_p}$, with $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$
where p = the proportion from the sample, π = the proportion standard to be achieved, σ_p = the standard error of the proportion, and n = number of respondents in the sample.
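The test statistic can be sketched in Python as below. It assumes the standard error is computed from the standard (the hypothesized proportion) as sqrt(π(1 - π)/n), which is the usual form for this test. The standard of .25 is hypothetical; the sample values (p = .30, n = 100) reuse the earlier car-financing figures purely for illustration.

```python
import math

def z_test_proportion(p, pi_0, n):
    """z = (p - pi_0) / sigma_p, with sigma_p = sqrt(pi_0 * (1 - pi_0) / n)."""
    sigma_p = math.sqrt(pi_0 * (1 - pi_0) / n)
    return (p - pi_0) / sigma_p

# Sample proportion .30 from n = 100, tested against a hypothetical standard of .25.
z = z_test_proportion(0.30, 0.25, 100)
print(f"z = {z:.2f}")
```

The resulting z is then compared with the critical z for the chosen significance level (for example, 1.96 for a two-sided test at α = .05).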

Testing Hypotheses about Individual Variables


t-test for Comparing Sample Mean against a Standard (Small Sample, n ≤ 30)

$t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, with $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
where x̄ = sample mean, μ = the population standard, s_x̄ = the standard error of the mean, s = sample standard deviation, and n = sample size.


Testing Hypotheses about Individual Variables


z-test for Comparing Sample Mean against a Standard (Large Sample, n > 30)

$z = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, with $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
where x̄ = sample mean, μ = the population standard, s_x̄ = the standard error of the mean, s = sample standard deviation, and n = sample size.
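A short Python sketch of this test statistic follows. The function name is hypothetical, and the standard of 3.5 is an invented value for illustration; the sample figures (mean 4.0, s = 1.9, n = 100) reuse the earlier family-size example.

```python
import math

def mean_test_statistic(x_bar, mu_0, s, n):
    """(x_bar - mu_0) / s_x_bar, where s_x_bar = s / sqrt(n)."""
    s_x_bar = s / math.sqrt(n)  # standard error of the mean
    return (x_bar - mu_0) / s_x_bar

# Sample mean 4.0, s = 1.9, n = 100, tested against a hypothetical standard of 3.5.
z = mean_test_statistic(4.0, 3.5, 1.9, 100)
print(f"z = {z:.2f}")
```

The same ratio serves for the small-sample case; the difference is that with n ≤ 30 it is referred to the t distribution with n - 1 degrees of freedom rather than to the standard normal.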

