today's lecture
timetable, assessment, housekeeping
how to get through
why psychology often requires statistics
normal distributions
z-scores
sampling distribution of the mean
Course coordinator
Sakinah Alhadad
office: S319, Social Sciences Building
email: s.alhadad@psy.uq.edu.au
consult hours: Wednesdays 1pm-3pm, or by appointment, or in the lecture break, or after the lecture
Tutorial allocation
Tutorials
assessment
Mid-semester exam (30%) (in lecture 5, week 3): MC (theory) and calculation
Assignment (15%): analysis, interpretation and write-up of results of a psychological experiment
MC (theory), calculation
material
Howell (2007). Statistical methods for psychology (6th ed.).
Other stats textbooks, the internet: anywhere that might help!
Electronic copy on BB
Optional material
Publication Manual of the American Psychological Association (5th ed.). UQ library: Z253 .A38 2001
Findlay, B. (2006). How to write psychology reports and essays (4th ed.). Frenchs Forest: Pearson. UQ library: BF76.8 .F57 2006 (most are at Ipswich campus but you can order it in) / BF76.8 .F57 2008
To get through...
you should understand first year statistics (but don't panic if you didn't)
attend lectures & tutes
remember: if you are doing 4 courses you are expected to work on your coursework 40 hours per week; this should be considered a full-time job. In the summer, this means that you should commit 20 hours per week to this course.
keep up with lecture material: lecture notes are on Blackboard; use textbook/s
attend tutorials and actually do the exercises (you need to do stats to understand it)
prepare for assessment and consult tutors if you are having problems (during consultation hours; tutors are part-time)
tips
statistics is very much like learning a new language: rehearse until basic words and symbols are automatic to you
Accept that a certain amount of doubt and confusion will occur in a stats subject. This doesn't necessarily mean there's something wrong, or that you're not coping
don't panic if you're not following something in a lecture. Often things don't sink in first time round: think about it and follow it up in tutorials
when you come to something that stresses you out (e.g., formulae), teach yourself to slow down rather than speed up
keep trying to relate your stats back to real-world examples
Make friends and form learning groups, but don't become dependent on others: take responsibility for your own learning (in honours you have to do your stats)
Make sure you really understand the concepts from 2010; this basic knowledge is expected in 3010 (and courses after that)
BUT PLEASE NOTE:
Additional material created by individual tutors may not be available
Access to course materials is a courtesy to students, not an obligation
tutors
Understanding the lecture material or the material covered in tutorials
Exam/assignment questions
Catch-up sessions/private tutoring for those who have missed class
Via email: your message should include your name and a contact phone number.
Are available:
In person:
Tutors will provide office hours close to exam periods. See BB site closer to the time.
PLEASE NOTE:
Tutors are employed on a casual basis and will read their email during consultation times. Responses may not be instant.
Use your student account: UQ and PSY spam filters may automatically remove emails from non-student accounts before they reach their destination.
other help
Forum
A peer learning resource: communication primarily between students
However, the assignment is not to be discussed on the forum
The forum will also be monitored by teaching staff
lecture 1 - preview
general review from earlier courses
standard normal distribution
finding areas under the normal curve
sampling distribution of the mean
Qualitative (categorical)
  One variable: goodness-of-fit chi-squared
  Two variables: contingency table chi-squared
Form of relationship: linear regression
Degree of relationship: Spearman's rho
Two Groups
  Dependent Groups: matched-samples t-test; Wilcoxon's matched-pairs signed-ranks test and sign test
  Independent Groups: independent-groups t-test
Multiple Groups
  Dependent Groups: repeated-measures ANOVA; Friedman
  Independent Groups: one-way ANOVA; Kruskal-Wallis
power
a sample is a small number of scores selected from the entire population
descriptive statistics, e.g., mean ($\bar{X}$, "x-bar") and standard deviation (s or SD)
distribution: a graphical representation that associates a frequency or probability with each value of a variable
the normal distribution is the most important distribution in statistics
the dependent variable is often assumed to be normally distributed (e.g., for parametric tests)
this allows inference about values of a variable in the population
[Figure: normal distribution; vertical axis: number of people]
bell shaped
unimodal and symmetrical (mode = mean = median)
tails extend indefinitely (although we often can't show this on graphs)
area under the curve = 1 (100%)
how to do a z-transformation
computing z-scores
$z = \frac{X - \bar{X}}{s}$
So, a z-score simply describes how far a data point (e.g., a person's score) lies away from the mean, expressed in standard deviation units
your score as a z-score
$z = \frac{X - \mu}{\sigma}$
z-transformation
if you convert your raw scores into z-scores and plot the z-scores on a frequency distribution, the result has mean = 0 and SD = 1; when the raw scores are normally distributed, it's called a standard normal distribution
z-transformation
the standard normal distribution
(1) allows comparison of performance across different tests
Suppose I got 7 out of 10 in an Art History essay, and 62% in a Parasitology exam. Am I doing better in Art History or Parasitology? That depends on the mean and standard deviation of these assessments.
Suppose the Art History essay stats are: N = 37, $\bar{X}$ = 4.97, s = 2.09
Parasitology test stats are: N = 60, $\bar{X}$ = 49.04, s = 8.31
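The comparison can be checked with a few lines of Python. This is only a sketch: the function name `z_score` and the variable names are made up for illustration, and the numbers are the ones given on the slide.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Scores and class statistics from the slide; names are illustrative only.
z_art = z_score(7, 4.97, 2.09)       # Art History essay, 7 out of 10
z_para = z_score(62, 49.04, 8.31)    # Parasitology exam, 62%

print(f"Art History  z = {z_art:.2f}")
print(f"Parasitology z = {z_para:.2f}")
print("Better relative performance:",
      "Parasitology" if z_para > z_art else "Art History")
```

Both scores are above their class means, but the Parasitology score sits further above its mean in standard deviation units, so relative to classmates it is the stronger result.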
so
z-transformation alters the mean and SD of a variable, but not the relative location of scores
z-transforming data from a normal distribution results in a standard normal distribution (mean = 0; and SD = 1)
z-scores represent the number of standard deviations that X is from the mean
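A small sketch of these three points, assuming a made-up set of six raw scores (the data are hypothetical, not from the lecture). It uses `pstdev`, which treats the scores as a whole population; using the sample SD for both steps would behave the same way.

```python
from statistics import mean, pstdev

raw = [2, 4, 4, 5, 7, 8]            # hypothetical raw scores
m, sd = mean(raw), pstdev(raw)      # mean and SD of the raw scores

# z-transform every score: subtract the mean, divide by the SD.
z = [(x - m) / sd for x in raw]

print("raw mean and SD:", m, sd)
print("z mean and SD:  ", round(mean(z), 10), round(pstdev(z), 10))

# The rank order of the scores is unchanged by the transformation.
order_raw = sorted(range(len(raw)), key=raw.__getitem__)
order_z = sorted(range(len(z)), key=z.__getitem__)
print("same ordering:", order_raw == order_z)
```

The transformed scores have mean 0 and SD 1 whatever the raw scores look like; the distribution is only a standard normal one if the raw scores were normal to begin with.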
z-transformation
the standard normal distribution
(1) allows comparison of performance across different tests
(2) tells you how many people score above or below you on a certain measure
Luckily someone sat down and created tables to provide these probabilities. And even more fortunately, because any normal distribution can be transformed into the standard normal distribution (i.e., z-transformation), only one set of tables is needed.
for example
let's say you want to find out how much caffeine you drink relative to the rest of the university student population
$\mu = 115$ mg/day, $\sigma = 15$ mg
let's say your intake is 125 mg/day
[Figure: normal curve f(X) with the mean (115) and a score of 125 marked]
$z = \frac{125 - 115}{15} = \frac{10}{15} = 0.667$
area = .2486 (about 25% of scores fall between 115 and 125)
Or alternatively.
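[Figure: normal curve f(X) with 100 and the mean (115) marked]
[Figure: normal curve f(X) with 105, the mean (115) and 125 marked]
z (your intake) = 0.667
z (yet another student) $= \frac{105 - 115}{15} = -0.667$
.2486 + .2486 = .4972 (about 50% of scores fall between 105 and 125)
[Figure: normal curve f(X) with the mean (115), 125 and 139 marked]
z (another student) $= \frac{139 - 115}{15} = \frac{24}{15} = 1.600$ (area = .4452)
z (you) = .667 (area = .2486)
.4452 - .2486 = .1966 (about 20% of scores fall between 125 and 139)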
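The table look-ups in the caffeine example can be approximated in Python with `statistics.NormalDist` from the standard library. This is a sketch that assumes caffeine intake is exactly normal with the slide's mean of 115 and SD of 15; the helper `area_between` is a made-up name.

```python
from statistics import NormalDist

# Population values from the slide; normality is an assumption.
caffeine = NormalDist(mu=115, sigma=15)

def area_between(dist, low, high):
    """Proportion of the distribution lying between low and high."""
    return dist.cdf(high) - dist.cdf(low)

mean_to_125 = area_between(caffeine, 115, 125)
from_105_to_125 = area_between(caffeine, 105, 125)
from_125_to_139 = area_between(caffeine, 125, 139)

print(round(mean_to_125, 4))
print(round(from_105_to_125, 4))
print(round(from_125_to_139, 4))
```

The results agree with the table values (.2486, .4972, .1966) to about two decimal places. The small differences arise because the printed table is read at z rounded to two decimals (0.67), while the code uses the unrounded z.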
$z = \frac{X - \mu}{\sigma}$
$z\sigma = X - \mu$
$X = \mu + z\sigma$
Then, remember that .80 = .50 + .30
Then, use the table to find the z-score for an area of .30
So, given this equation and the knowledge that the z-score for an area of .30 is .84:
$X = 115 + (15)(.84)$
$X = 127.600$
A person with a score above 80% of other people's scores has a score of 127.6
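Going from a proportion back to a raw score can be sketched the same way, using `NormalDist.inv_cdf` in place of the reverse table look-up. The variable names are illustrative, and the population is again assumed to be normal with mean 115 and SD 15.

```python
from statistics import NormalDist

mu, sigma = 115, 15                     # population values from the slide

# z-score that has 80% of the standard normal distribution below it.
z_80 = NormalDist().inv_cdf(0.80)

# Rearranged z formula: X = mu + z * sigma
x_80 = mu + z_80 * sigma

print(f"z = {z_80:.2f}, X = {x_80:.1f}")

# The same score obtained directly from the caffeine distribution.
x_direct = NormalDist(mu=mu, sigma=sigma).inv_cdf(0.80)
```

The computed z is very slightly larger than the table value of .84 because the table is only accurate to two decimals, so the score comes out a fraction above 127.6.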
z-transformation
the standard normal distribution
(1) allows comparison of performance across different tests
(2) tells you how many people score above or below you on a certain measure
(3) allows you to make inferences concerning the probability that different scores will be obtained
it's all well and good to compare individual scores with a mean, but research usually involves looking at groups of scores (i.e., a sample of scores)
example: let's say you want to know if students with a high GPA drink more (or less) caffeine than average
in this case you need to compare a mean ($\bar{X}$) for the sample with the population mean ($\mu$)
example
let's say you collect a sample of 50 students whose GPA is 6.0 or better. You find that the mean for this sample is 105 mg. Suppose we know the mean for the population is 115 mg. What can we conclude from this?
you might be tempted to say that caffeine consumption negatively affects studying
but the difference between the sample mean and the population mean could be caused by other factors
inferential statistics
using sample data to make inferences about population parameters
if nothing else is known, the statistics of a sample (e.g., the mean) are the best estimates of the population parameters (e.g., height of UQ students based on this class).
But samples may fail to provide good estimates of the population for two reasons:
1) sampling bias
due to faulty sampling methods, some important subgroups of the population may be over- or under-represented in our sample
Systematic variation (e.g., inadvertently got low caffeine consumers, more women study psychology)
2) sampling error
no matter how careful we are, no two samples from the same population will be identical: by chance there will be natural variation in scores (sampling error).
if I took a random sample of 50 students from anywhere, it would be a complete fluke if the mean for that sample was exactly the population mean (115 mg).
the term "sampling error" implies a mistake, but this is misleading: it's a natural thing and can't be helped.
so the question is not whether the sample mean differs from the population mean (it almost always will) but how likely it is that the difference we observed could have occurred by chance.
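Sampling error is easy to see in a quick simulation. The sketch below assumes, purely for illustration, that caffeine intake is normally distributed with the population mean of 115 mg and an SD of 15 mg; the seed and the function name `sample_mean` are arbitrary choices.

```python
import random
from statistics import mean

random.seed(1)   # arbitrary seed so that the run is repeatable

def sample_mean(n=50, mu=115, sigma=15):
    """Mean of one random sample of n scores from a normal population."""
    return mean([random.gauss(mu, sigma) for _ in range(n)])

# Five separate random samples of 50 students each.
means = [sample_mean() for _ in range(5)]

for m in means:
    print(f"sample mean = {m:.1f} mg")
```

Each sample mean lands near 115 but none of them equals it exactly, and no two samples agree with each other. That spread is the sampling error.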
statistical inference
statistical inference is the foundation of hypothesis testing
we use sample data to make inferences about population parameters
this allows the researcher to determine the probability that a sample is from one population and not another
it enables the researcher to evaluate the veracity (truth) of a hypothesis as if a whole population was available instead of just a small (but hopefully representative) sample
sampling distributions
the distribution of a statistic that we would expect if we drew an infinite number of samples (of a given size) from the population
sampling distributions have means and SDs
can have a sampling distribution for any statistic, but the most common is the sampling distribution of the mean
population: 1, 2, 3, 4
(everyone either has 1, 2, 3 or 4 coffees per day)
$\mu = 2.5$, $\sigma = 1.118$
all possible samples of size 2:
sample   mean
1,1      1.0
1,2      1.5
1,3      2.0
1,4      2.5
2,1      1.5
2,2      2.0
2,3      2.5
2,4      3.0
3,1      2.0
3,2      2.5
3,3      3.0
3,4      3.5
4,1      2.5
4,2      3.0
4,3      3.5
4,4      4.0
$\mu_{\bar{X}} = 2.5$, $\sigma_{\bar{X}} = 0.791$
The subscript $\bar{X}$ stands for the sampling distribution of the mean
[Figure: frequency histogram of the sample means]
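The coffee example is small enough to enumerate in full. The sketch below lists every ordered sample of size 2 (drawn with replacement) from the population 1, 2, 3, 4 and computes the mean and SD of the resulting sample means; the variable names are illustrative.

```python
from itertools import product
from statistics import mean, pstdev

population = [1, 2, 3, 4]     # coffees per day, from the slide

# Every ordered sample of size 2, drawn with replacement (16 in total).
samples = list(product(population, repeat=2))
sample_means = [mean(s) for s in samples]

mu, sigma = mean(population), pstdev(population)
mu_xbar, sigma_xbar = mean(sample_means), pstdev(sample_means)

print(f"population:   mean = {mu}, SD = {sigma:.3f}")
print(f"sample means: mean = {mu_xbar}, SD = {sigma_xbar:.3f}")
```

The mean of the sample means equals the population mean, while their SD is smaller than the population SD by a factor of the square root of the sample size.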
standard error of the mean ($\sigma_{\bar{X}}$) is the standard deviation of the distribution of sample means
it represents the typical or average distance between a sample mean $\bar{X}$ and the mean of the population
it is used to define and accurately measure sampling error
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$
$\sigma$: the standard deviation of the scores from the whole population
$N$: the number of people in each sample
e.g.
$\sigma_{\bar{X}} = \frac{1.118}{\sqrt{2}} = 0.791$
$\sigma_{\bar{X}} = \frac{20}{\sqrt{25}} = \frac{20}{5} = 4.00$
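The standard error formula is a one-liner in code. A minimal sketch, with a made-up function name, applied to the two worked examples from the slides:

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: population SD over the square root of n."""
    return sigma / math.sqrt(n)

se_coffee = standard_error(1.118, 2)   # coffee example: sigma = 1.118, N = 2
se_second = standard_error(20, 25)     # second example: sigma = 20, N = 25

print(f"{se_coffee:.3f}")
print(f"{se_second:.2f}")
```

Because N sits under a square root, quadrupling the sample size only halves the standard error.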
$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
we're actually asking about the mean of our sample relative to the sampling distribution of the mean (of the population)
what is the likelihood that our mean comes from this population?
[Figure: sampling distribution of the mean with ~.6% (.0062) of sample means in each tail]
.0062 + .0062 = .0124
1.2% of samples are expected to differ by 10 or more points
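The tail areas can be reproduced with a short calculation. This sketch assumes the figures belong to the example with a population SD of 20 and samples of 25 people (standard error 4.00); that pairing is inferred from the numbers, since a 10-point difference then gives z = 2.5, which matches the tail area of .0062.

```python
import math
from statistics import NormalDist

sigma, n = 20, 25      # assumed to be the values behind the slide's figure
difference = 10        # sample mean minus population mean, in points

se = sigma / math.sqrt(n)          # standard error of the mean
z = difference / se                # z-score of the sample mean

# Probability of a sample mean at least this far above the population mean.
one_tail = 1 - NormalDist().cdf(z)
# Probability of a difference this large in either direction.
two_tail = 2 * one_tail

print(f"z = {z:.2f}")
print(f"one tail = {one_tail:.4f}, two tails = {two_tail:.4f}")
```

Doubling the one-tailed area reflects the fact that a sample mean could differ from the population mean by 10 points in either direction.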
To Sum
The SE is the SD of sample means
It is a measure of how representative a sample is likely to be of the population
A large SE (relative to the sample mean) indicates that there is a lot of variability between the means of different samples, so the sample may not be representative of the population
A small SE indicates that sample means are similar to the population mean, so our sample is likely to be an accurate reflection of the population
lecture 1 - review
general review
standard normal distribution
finding areas under the normal curve
sampling distribution of the mean