Uploaded by Fanny Sylvia C.

Chapter 3

Robustness: A statistical procedure is robust to departures from an assumption if it is valid even when the assumption is not met. Valid means that the uncertainty measures, such as confidence intervals and p-values, are very nearly equal to those under the assumption. The t procedures assume that the data are normal, that the two populations have equal variances, and that all the observations are independent of one another.

Departures from the Normality Assumption: The t procedures are robust to departures from normality. Data depart from normality when their distribution is not symmetrical and bell- or mound-shaped, or when the tails are either too long or too short. While this is subjective to some extent, assessing how severe the departure from normality is forms an important part of your training.

Departures from the Equal Variances Assumption: These departures can be more serious. This condition is best checked by looking at histograms of both samples as well as the sample standard deviations. Often, so long as the sample sizes are similar, the uncertainty measures will still be reasonably close to the true ones.

Departures from Independence: These can be caused by serial or spatial correlation or by cluster effects. This assumption can usually be checked by considering the experimental design and the data collection procedures. Data that fail to meet the independence assumption require methods other than those presented here.

What are some examples of data sets that violate the independence assumption?

Resistance: A statistical procedure is resistant if it does not change very much when a small part of the data changes. The t procedures are not resistant to outliers. Outliers should be identified, the analysis should be performed both with and without them, and both sets of results should be reported when the experiment is published.
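
As a rough illustration of the with-and-without comparison, the sketch below computes the pooled two-sample t statistic twice, once for clean data and once after a single wild observation is added. The two samples, the value 40.0, and the helper name `pooled_t` are all hypothetical and are not taken from these notes.

```python
import math
import statistics


def pooled_t(sample1, sample2):
    """Pooled-variance two-sample t statistic for mean(sample2) - mean(sample1)."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * statistics.variance(sample1)
           + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (statistics.mean(sample2) - statistics.mean(sample1)) / se


# Hypothetical data for illustration only.
group1 = [10.1, 11.4, 9.8, 10.9, 10.5, 11.0]
group2 = [12.0, 12.6, 11.8, 12.9, 12.3, 12.4]
group2_with_outlier = group2 + [40.0]  # one wild observation added

t_without = pooled_t(group1, group2)
t_with = pooled_t(group1, group2_with_outlier)
print(f"t without outlier: {t_without:.2f}")
print(f"t with outlier:    {t_with:.2f}")
```

In this made-up case the single outlier inflates the pooled standard deviation so much that the t statistic shrinks sharply, even though the outlier lies in the direction of the apparent effect.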

Transformations of the Data: Sometimes departures from normality can be corrected by transformations. The most common transformation is the log transform. There are two common log transformations, the natural log and the log base 10, and it is common for writers to use log to mean either one. In my notes, I will often use ln to mean natural log and log to mean log base 10, but I will sometimes use log to mean either. Your text will use log to mean natural log and will use log10 to mean log base 10, though the use of log10 will be rare. Please do not let this confuse or upset you.

found for log transformed data, i.e.,

$$y = \log(x) + \delta,$$

Thus, $e^{\delta}$ is the multiplicative effect on the original scale of measurement. To test for a treatment effect, we use the usual t-test on the log-transformed data.

Population Model: Estimating the Ratio of Population Medians

When drawing inferences about population means using a log transform (natural log or log base 10), there is a problem with transforming back to the original scale of the data, because the mean of the log-transformed sample is not the log of the mean of the original sample. So taking the antilog does not give an estimate of the mean on the original scale.

If the log-transformed values have a symmetric distribution, however, then

mean[log(Y)] = median[log(Y)]

and

median[log(Y)] = log[median(Y)].

In words, the median, or 50th percentile, of the log-transformed values is the log of the 50th percentile of the original values. So, when we transform back to the original scale, we are now drawing inferences about the median.
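
A small numerical check of the two statements above, using a made-up right-skewed sample (the values in `y` are hypothetical). It shows that the median commutes with the log transform while the mean does not.

```python
import math
import statistics

# Hypothetical right-skewed sample with an odd number of values, so the
# median is a single observed value.
y = [1.2, 1.9, 2.4, 3.1, 4.8, 7.5, 21.0]
log_y = [math.log(v) for v in y]

# The median commutes with the (monotone) log transform ...
median_of_logs = statistics.median(log_y)
log_of_median = math.log(statistics.median(y))

# ... but the mean does not: exp(mean of logs) is the geometric mean,
# which is smaller than the arithmetic mean for skewed positive data.
back_transformed_mean = math.exp(statistics.mean(log_y))
arithmetic_mean = statistics.mean(y)

print(median_of_logs, log_of_median)
print(back_transformed_mean, arithmetic_mean)
```

With an even number of observations the sample median is an average of the two middle values, so the first equality would hold only approximately in a sample.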

If we denote the averages of the log-transformed samples as $Z_1$ and $Z_2$, then the difference of these two quantities estimates the log of the ratio of their medians. That is,

$$Z_2 - Z_1 = \log\left[\frac{\mathrm{median}(Y_2)}{\mathrm{median}(Y_1)}\right],$$

where $Y_1$ and $Y_2$ represent the two samples, and hence the right-hand side is an estimate of the log of the ratio of the two population medians.
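
The back-transformation can be sketched in a few lines. The two samples below are hypothetical, and the variable names `z1`, `z2`, and `median_ratio` are chosen here for illustration; the natural log is assumed.

```python
import math
import statistics

# Hypothetical positive-valued samples (illustration only).
y1 = [2.1, 3.4, 1.8, 5.6, 2.9, 4.2]
y2 = [4.0, 6.9, 3.5, 11.8, 5.7, 8.1]

z1 = statistics.mean([math.log(v) for v in y1])  # average of log(Y1)
z2 = statistics.mean([math.log(v) for v in y2])  # average of log(Y2)

log_ratio = z2 - z1                 # estimates log[median(Y2) / median(Y1)]
median_ratio = math.exp(log_ratio)  # estimated ratio of population medians

print(f"estimated median(Y2) / median(Y1): {median_ratio:.2f}")
```

A confidence interval for the ratio would be obtained the same way: compute the usual t interval for the difference on the log scale, then exponentiate both endpoints.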

Other transformations: There are many transformations one can try. There are rules of thumb, but it usually boils down to trial and error. Some common transformations are the square root, the reciprocal, and the arcsine.

Chapter 4

Alternatives to t-tools

Welch’s t-Test for Comparing Two Normal Populations with Unequal Variance.

As mentioned earlier, the standard error for the estimate of the difference between two population means when the variances are not assumed equal is given by:

$$SE_W(\bar{y}_1 - \bar{y}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$

The degrees of freedom are more difficult in this case: the exact d.f., and hence the exact distribution of the test statistic, are not known. The best approximation, Satterthwaite’s approximation, is given by

$$df_W = \frac{\left[SE_W(\bar{y}_2 - \bar{y}_1)\right]^4}{\dfrac{\left[SE(\bar{y}_2)\right]^4}{n_2 - 1} + \dfrac{\left[SE(\bar{y}_1)\right]^4}{n_1 - 1}}.$$

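The t statistic and p-value are then calculated in the same way as for the pooled variance test, with the Welch standard error in place of the pooled standard error and the approximate degrees of freedom in place of the pooled ones.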
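
A minimal sketch of the Welch calculation using only the Python standard library. The function name `welch_t` and the samples `a` and `b` are hypothetical; the formulas are the standard error and Satterthwaite degrees of freedom given above, with $SE(\bar{y}_i) = s_i / \sqrt{n_i}$.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (t, SE_W, df_W) for mean(sample2) - mean(sample1)."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard errors of the two sample means: s_i^2 / n_i.
    se1_sq = statistics.variance(sample1) / n1
    se2_sq = statistics.variance(sample2) / n2
    se_w = math.sqrt(se1_sq + se2_sq)
    # Satterthwaite: SE_W^4 / (SE_2^4 / (n2 - 1) + SE_1^4 / (n1 - 1)).
    df_w = se_w ** 4 / (se2_sq ** 2 / (n2 - 1) + se1_sq ** 2 / (n1 - 1))
    t = (statistics.mean(sample2) - statistics.mean(sample1)) / se_w
    return t, se_w, df_w


# Hypothetical samples with visibly different spreads.
a = [10.1, 11.4, 9.8, 10.9, 10.5, 11.0]
b = [12.0, 15.9, 9.7, 18.2, 13.1, 16.4, 11.5, 14.8]
t_stat, se_w, df_w = welch_t(a, b)
print(f"t = {t_stat:.3f}, SE_W = {se_w:.3f}, df_W = {df_w:.2f}")
```

The approximate degrees of freedom are generally not a whole number and always fall between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$. The p-value then comes from a t distribution with $df_W$ degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test.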

Wilcoxon Rank-Sum Test

Say you believe that students who go home to their families for Thanksgiving Weekend actually do better on their exams because they need to decompress more than they need to study. Say you took a random sample of 8 students who went home for Thanksgiving and 8 who stayed in Missoula and studied, and then obtained their final exam scores out of a total of 200 possible.

Went Home    Studied
113.25       137.9134
 95.94       142.4956
 90.04       129.6706
104.44       115.4934
119.21       183.4077
106.88       123.5596
 94.99        94.7618
131.09       102.0240

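To perform the rank-sum test, we rank the two samples together, from smallest to largest. The test statistic, W, is the sum of the ranks belonging to one of the samples (here, the first). Under the null hypothesis of no difference between the groups, W has mean and variance

$$\mu_W = \frac{n_1 (N + 1)}{2}, \qquad \mathrm{var}(W) = \frac{n_1 n_2 (N + 1)}{12},$$

where $n_1$ and $n_2$ are the two sample sizes and $N = n_1 + n_2$.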

Our two samples are just large enough (each >7) that we can perform this test by calculating a z-score:

$$z = \frac{W - \mu_W}{\mathrm{SD}(W)}.$$

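Pooled scores in increasing order, with their ranks:

Home        Studied     Rank
 90.0400                  1
             94.7618      2
 94.9900                  3
 95.9400                  4
            102.0240      5
104.4400                  6
106.8800                  7
113.2500                  8
            115.4934      9
119.2100                 10
            123.5596     11
            129.6706     12
131.0900                 13
            137.9134     14
            142.4956     15
            183.4077     16

We now sum the ranks for the ‘home’ data. This yields W = 1 + 3 + 4 + 6 + 7 + 8 + 10 + 13 = 52. With $n_1 = n_2 = 8$ and $N = 16$, we have $\mu_W = 8(17)/2 = 68$ and $\mathrm{SD}(W) = \sqrt{8 \cdot 8 \cdot 17 / 12} = \sqrt{90.67} \approx 9.52$, so

$$z = \frac{52 - 68}{9.52} \approx -1.68.$$

Since |z| = 1.68 is smaller than 1.96, the two-sided p-value is larger than 0.05. So, we fail to reject the null hypothesis of no difference between the two groups.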
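
The hand calculation can be checked with a short script. The sketch below recomputes the ranks, the rank sum W for the ‘home’ sample, and the z-score directly from the raw exam scores listed earlier; the variable names are chosen here for illustration, and no continuity correction is applied.

```python
import math

home = [113.25, 95.94, 90.04, 104.44, 119.21, 106.88, 94.99, 131.09]
studied = [137.9134, 142.4956, 129.6706, 115.4934,
           183.4077, 123.5596, 94.7618, 102.0240]

# Rank the pooled scores from smallest (rank 1) to largest (rank N).
# There are no ties in these data, so plain ordinal ranks suffice.
pooled = sorted([(score, "home") for score in home]
                + [(score, "studied") for score in studied])
ranks = {score: rank for rank, (score, _) in enumerate(pooled, start=1)}

n1, n2 = len(home), len(studied)
N = n1 + n2
W = sum(ranks[score] for score in home)  # rank sum for the 'home' sample

mu_w = n1 * (N + 1) / 2
sd_w = math.sqrt(n1 * n2 * (N + 1) / 12)
z = (W - mu_w) / sd_w
print(f"W = {W}, mean = {mu_w}, SD = {sd_w:.3f}, z = {z:.3f}")
```

Note that the z-score divides by the standard deviation of W, that is, the square root of the variance formula, not by the variance itself.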

2) Drop the zeros from the list.

3) Order the absolute differences from smallest to largest and assign ranks.

4) The signed rank statistic, S, is the sum of the ranks from the pairs for which the difference is positive.
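
The steps listed above can be sketched as a small function that takes the paired differences as its input. The function name and the differences in `diffs` are hypothetical. Giving tied absolute differences the average of their ranks is an assumption made here (a common convention), not something stated in these notes.

```python
def signed_rank_statistic(differences):
    """Sum of the ranks of |d| over the pairs whose difference is positive."""
    nonzero = [d for d in differences if d != 0]  # drop the zeros
    ordered = sorted(nonzero, key=abs)            # order by absolute difference
    # Assign ranks 1..n; tied |d| values share the average of their ranks
    # (assumed convention).
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j + 1) / 2
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    # Sum the ranks that belong to positive differences.
    return sum(r for d, r in zip(ordered, ranks) if d > 0)


# Hypothetical paired differences (illustration only).
diffs = [0.7, -0.2, 1.1, 0.0, 0.4, -0.9, 1.5, 0.3]
S = signed_rank_statistic(diffs)
print(S)
```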

samples → choose ‘Wilcoxon’.

Exact p-value for the Signed-Rank Test: The exact p-value is the proportion of all assignments of outcomes within each pair that lead to a test statistic at least as extreme as the one observed. In the schizophrenia example, we assign “affected” or “not affected” within each pair, regardless of true status, in $2^{15}$ different ways. We then calculate the statistic under each of these assignments. The distribution of the statistics calculated in this way is the true distribution under the null hypothesis of no treatment effect.
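
For a small number of pairs the enumeration can be carried out directly. The sketch below uses hypothetical differences (not the schizophrenia data), assumes no ties or zeros, and computes a one-sided p-value, taking "at least as extreme" to mean "at least as large".

```python
from itertools import product


def exact_signed_rank_p(differences):
    """One-sided exact p-value: P(S >= observed) over all 2^n sign assignments."""
    abs_sorted = sorted(abs(d) for d in differences if d != 0)
    # Ranks 1..n of the absolute differences (this sketch assumes no ties).
    rank_of = {a: r for r, a in enumerate(abs_sorted, start=1)}
    n = len(abs_sorted)
    observed = sum(rank_of[abs(d)] for d in differences if d > 0)

    at_least_as_large = 0
    for signs in product((0, 1), repeat=n):  # 1 marks a positive difference
        s = sum(rank for rank, positive in zip(range(1, n + 1), signs) if positive)
        if s >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / 2 ** n


diffs = [0.7, -0.2, 1.1, 0.4, -0.9, 1.5, 0.3]  # hypothetical, no ties or zeros
S, p_one_sided = exact_signed_rank_p(diffs)
print(S, p_one_sided)
```

With n pairs there are $2^n$ assignments, so this brute-force loop is practical only for modest n.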

expected value and standard deviation:

$$\mu_S = \frac{n(n+1)}{4}$$

$$\mathrm{SD}(S) = \left[\frac{n(n+1)(2n+1)}{24}\right]^{1/2}$$

We then compare the usual z-statistic to the quantiles of the normal distribution to obtain a p-value in the usual way.
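
A minimal sketch of that normal approximation, assuming n counts the nonzero differences and that the z-statistic is $(S - \mu_S)/\mathrm{SD}(S)$. The function name and the example values S = 22 and n = 7 are hypothetical, and the p-value shown is the one-sided upper-tail version.

```python
import math
from statistics import NormalDist


def signed_rank_z(S, n):
    """z-statistic and one-sided upper-tail p-value for signed-rank statistic S."""
    mu_s = n * (n + 1) / 4
    sd_s = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (S - mu_s) / sd_s
    p_upper = 1 - NormalDist().cdf(z)  # standard normal upper tail
    return z, p_upper


# Hypothetical values: S = 22 from n = 7 nonzero differences.
z, p = signed_rank_z(22, 7)
print(f"z = {z:.3f}, one-sided p = {p:.3f}")
```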
