The major use of inferential statistics is to use information from a sample to infer
something about a population.
The methods of inference used to support or reject claims based on sample data are known as tests of significance.

A population is a collection of data whose properties are analyzed. It is the complete collection to be studied and contains all subjects of interest. A sample is a part of the population of interest, a sub-collection selected from a population.

A parameter is a numerical measurement that describes a characteristic of a population, while a statistic is a numerical measurement that describes a characteristic of a sample. In general, we will use a statistic to infer something about a parameter.

A confidence interval gives an estimated range of values which is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data.

The significance level α for a given hypothesis test is a value for which a P-value less than or equal to α is considered statistically significant. Typical values for α are 0.1, 0.05, and 0.01. These values correspond to the probability of observing such an extreme value by chance alone.

Every test of significance begins with a null hypothesis H0. H0 represents a theory that has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved. For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug is no better, on average, than the current drug. We would write H0: there is no difference between the two drugs on average.

The alternative hypothesis, Ha, is a statement of what a statistical hypothesis test is set up to establish. For example, in a clinical trial of a new drug, the alternative hypothesis might be that the new drug has a different effect, on average, compared to that of the current drug. We would write Ha: the two drugs have different effects, on average. The alternative hypothesis might also be that the new drug is better, on average, than the current drug.
In this case we would write Ha: the new drug is better than the current drug, on average.

Type I error: A type I error occurs when one rejects the null hypothesis when it is true. The probability of a type I error is the level of significance of the test of hypothesis, and is denoted by α.

Type II error: A type II error occurs when one rejects the alternative hypothesis (fails to reject the null hypothesis) when the alternative hypothesis is true. The probability of a type II error is denoted by β.

The power of a test is the probability that a fixed level significance test will reject the null hypothesis H0 when a particular alternative value of the parameter is true. For that alternative, the power equals 1 - β.

The Standard Error of the Mean (SE(M)) is analogous to the Standard Deviation (SD), in that it is an estimate of variability. It is different, however, in that while the Standard Deviation gives one a sense of how much variability there is in the individual values that make up one single sample, the Standard Error of the Mean gives one a sense of how much variability there is in the means of small samples (of n individual values) of a larger population.

Standard Error of the Mean: SE(M) = SD / √n, where n is the number of individual values in the sample.

Significance Level: A Type I error occurs when the researcher rejects a null hypothesis when it is true. The probability of committing a Type I error is called the significance level, and is often denoted by α.

Degrees of freedom (df) of an estimate is the number of independent pieces of information on which the estimate is based.

How to Conduct Hypothesis Tests

All hypothesis tests are conducted the same way. The researcher states a hypothesis to be tested, formulates an analysis plan, analyzes sample data according to the plan, and accepts or rejects the null hypothesis, based on results of the analysis.

State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually exclusive.
That is, if one is true, the other must be false; and vice versa.

Formulate an analysis plan. The analysis plan describes how to use sample data to accept or reject the null hypothesis. It should specify the following elements.

  o Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or 0.10; but any value between 0 and 1 can be used.

  o Test method. Typically, the test method involves a test statistic and a sampling distribution. Computed from sample data, the test statistic might be a mean score, proportion, difference between means, difference between proportions, z-score, t-score, chi-square, etc. Given a test statistic and its sampling distribution, a researcher can assess probabilities associated with the test statistic. If the test statistic probability is less than the significance level, the null hypothesis is rejected.

Analyze sample data. Using sample data, perform computations called for in the analysis plan.

  o Test statistic. When the null hypothesis involves a mean or proportion, use either of the following equations to compute the test statistic.

      Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)
      Test statistic = (Statistic - Parameter) / (Standard error of statistic)

    where Parameter is the value appearing in the null hypothesis, and Statistic is the point estimate of Parameter. As part of the analysis, you may need to compute the standard deviation or standard error of the statistic. Previously, we presented common formulas for the standard deviation and standard error.

    When the parameter in the null hypothesis involves categorical data, you may use a chi-square statistic as the test statistic. Instructions for computing a chi-square test statistic are presented in the lesson on the chi-square goodness of fit test.

  o P-value. The P-value is the probability of observing a sample statistic as extreme as the test statistic, assuming the null hypothesis is true.

Interpret the results.
If the sample findings are unlikely, given the null hypothesis, the researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the significance level, and rejecting the null hypothesis when the P-value is less than the significance level.

Parametric Test: A statistical test that involves data that are assumed to be distributed according to some distribution, usually normal, whose parameters are known, such as data from interval or ratio scales.

Nonparametric tests are also referred to as distribution-free tests. These tests have the obvious advantage of not requiring the assumption of normality or the assumption of homogeneity of variance. They compare medians rather than means and, as a result, if the data have one or two outliers, their influence is negated.
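
As a rough illustration of the four steps described earlier (state the hypotheses, formulate an analysis plan, analyze sample data, interpret the results), the following Python sketch runs a two-sided test of a population mean. The function name one_sample_z_test, the simulated sample, and the hypothesized mean of 5.0 are assumptions made for this example and do not come from the notes. The sketch also assumes that the sampling distribution of the mean is approximately normal, which is reasonable for large samples; with a small sample, a t distribution with n - 1 degrees of freedom would normally be used instead.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample, mu0, alpha=0.05):
    """Two-sided test of H0: population mean == mu0 against Ha: mean != mu0.

    Assumption: a normal approximation to the sampling distribution of the
    mean (large-sample case). Requires at least two values with nonzero spread.
    """
    n = len(sample)
    x_bar = mean(sample)          # Statistic: point estimate of the parameter
    sd = stdev(sample)            # sample standard deviation (n - 1 denominator)
    sem = sd / math.sqrt(n)       # standard error of the mean
    # Test statistic = (Statistic - Parameter) / (Standard error of statistic)
    z = (x_bar - mu0) / sem
    # Two-sided P-value: chance of a statistic at least this extreme under H0
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return {
        "mean": x_bar,
        "sem": sem,
        "z": z,
        "p_value": p_value,
        # Reject H0 when the P-value is less than or equal to alpha
        "reject_h0": p_value <= alpha,
    }


# Hypothetical data: 50 simulated measurements (not real observations).
rng = random.Random(42)
sample = [rng.gauss(5.3, 0.5) for _ in range(50)]

result = one_sample_z_test(sample, mu0=5.0, alpha=0.05)
print(f"mean = {result['mean']:.3f}, SE(M) = {result['sem']:.3f}")
print(f"z = {result['z']:.3f}, P-value = {result['p_value']:.4f}")
print("reject H0" if result["reject_h0"] else "fail to reject H0")
```

Returning the intermediate quantities (mean, standard error, test statistic, P-value) alongside the decision keeps each step of the analysis plan visible, so the comparison of the P-value with the significance level can be checked by hand.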