PSYC2010

Psychological Research Methodology 2 Lecture 1
today's lecture

- timetable, assessment, housekeeping
- how to get through
- why psychology often requires statistics
- normal distributions
- z-scores
- sampling distribution of the mean
Revision from First year
Course coordinator
Sakinah Alhadad
office: S319, Social Sciences Building
email: s.alhadad@psy.uq.edu.au
consult hours: Wednesdays 1pm-3pm, or by appointment, or in the lecture break, or after the lecture
Tutorial allocation
Tutorials start tomorrow; sign on via mySI-net
assessment
Mid-semester exam (30%) (in lecture 5, week 3)
- MC (theory) and calculation

Assignment (15%)
- analysis, interpretation and write-up of results of a psychological experiment

Final exam (55%)
- MC (theory)
- calculation
Final grade cut-offs

grade  range
7      85 - 100%
6      75 - 84.99%
5      65 - 74.99%
4      50 - 64.99%
3      45 - 49.9%
2      30 - 44.9%
1      0 - 29.9%
material
Textbook
- Field (2009 or 2005). Discovering statistics using SPSS.

Other references:
- Howell (2007). Statistical methods for psychology (6th ed.).
- Other stats textbooks, the internet: anywhere that might help!

Calculator for tutorials and for exams
- Must not have statistical functions
- Non-programmable

Tutorial workbook (handed out in tutes)
- Electronic copy on BB
Optional material
APA manual (a must-have, really)
- Publication Manual of the American Psychological Association (5th ed.). UQ library: Z253 .A38 2001

Findlay (a bit more user-friendly)
- Findlay, B. (2006). How to write psychology reports and essays (4th ed.). Frenchs Forest: Pearson. UQ library: BF76.8 .F57 2006 (most copies are at the Ipswich campus, but you can order it in) / BF76.8 .F57 2008
To get through...

- you should understand first year statistics (but don't panic if you didn't)
- attend lectures & tutes
  - Remember: if you are doing 4 courses, you are expected to work on your coursework 40 hours per week; this should be considered a full-time job
  - In the summer, this means that you should commit 20 hours per week to this course
- keep up with lecture material
  - lecture notes on Blackboard
  - use textbook/s
- attend tutorials and actually do the exercises (you need to do stats to understand stats)
- prepare for assessment, and consult tutors if you are having problems (during consultation hours: tutors are part-time)
tips

- statistics is very much like learning a new language: rehearse until basic words and symbols are automatic to you
- accept that a certain amount of doubt and confusion will occur in a stats subject. This doesn't necessarily mean there's something wrong, or that you're not coping
- don't panic if you're not following something in a lecture. Often things don't sink in first time round: think about it and follow it up in tutorials
- when you come to something that stresses you out (e.g., formulae), teach yourself to slow down rather than speed up
- keep trying to relate your stats back to real-world examples
- make friends and form learning groups, but don't become dependent on others: take responsibility for your own learning (in honours you have to do your own stats)
- make sure you really understand the concepts from 2010: this basic knowledge is expected in 3010 (and courses after that)
Access to lecture/tute material

See the PSYC2010 BB site for:
- Lecture slides
- General materials used in tutorials
- Practice exams

- Lecture notes are posted Monday evening; tutorial notes Wednesday evening/Thursday morning (after tute)
- This course will be podcast (assuming I don't muck it up)
- Occasionally symbols miraculously change between my PC and yours: please let me know if my slides look different to yours
- Note that I will not put the pictures on your slides. They just get too big, and are not necessary.

BUT PLEASE NOTE:
- Additional material created by individual tutors may not be available
- Access to course materials is a courtesy to students, not an obligation
tutors
Can help you with:
- Understanding the lecture material or the material covered in tutorials

Cannot help you with:
- Exam/assignment questions
- Catch-up sessions/private tutoring for those who have missed class

Are available:
- Via email: your message should include your name and a contact phone number.
- In person: tutors will provide office hours close to exam periods. See the BB site closer to the time.

PLEASE NOTE:
- Tutors are employed on a casual basis and will read their email during consultation times. Responses may not be instant.
- Use your student account: UQ and PSY spam filters may automatically remove non-student account emails before they reach their destination
other help
Forum
- A peer learning resource: communication primarily between students
- However, the assignment is not to be discussed on the forum
- The forum will also be monitored by teaching staff
Talk to me, and remain anonymous
Give me feedback throughout the semester, anonymously
lecture 1 - preview
- general review from earlier courses
- standard normal distribution
- finding areas under the normal curve
- sampling distribution of the mean
Data type?

Qualitative (categorical)
- One variable: goodness-of-fit chi-squared
- Two variables: contingency table chi-squared

Quantitative (measurement)
- Question about relationship
  - Form of relationship: linear regression
  - Degree of relationship: Pearson correlation & point-biserial correlation (parametric); Spearman's rho (non-parametric)
- Hypothesis testing: differences
  - Single sample compared to population
    - If pop. variance known: z-test
    - If only pop. mean known: t-test
  - Comparison between groups
    - Two groups
      - Dependent groups: matched samples t-test (parametric); Wilcoxon's matched-pairs signed-ranks test and sign test (non-parametric)
      - Independent groups: independent groups t-test (parametric); Wilcoxon's rank sum test (non-parametric)
    - Multiple groups
      - Dependent groups: repeated-measures ANOVA (parametric); Friedman (non-parametric)
      - Independent groups: one-way ANOVA (parametric); Kruskal-Wallis (non-parametric)
      - Multiple comparisons
        - A priori (planned): Bonferroni t & linear contrasts
        - Post hoc: Scheffé test
- Power
Recall: samples and populations
- population of scores on a variable (not of people)
- population = the entirety of scores
  - descriptive parameters, e.g., mean (mu: $\mu$) and standard deviation (sigma: $\sigma$)
- samples are a small number of scores selected from the entire population
  - descriptive statistics, e.g., mean (x-bar: $\bar{X}$) and standard deviation (s or SD)
Recall: Normal distribution
- distribution: a graphical representation that associates a frequency or probability with each value of a variable
- the normal distribution is the most important distribution in statistics
- the dependent variable is often assumed to be normally distributed (e.g., for parametric tests)
- allows inference about values of a variable in the population

[Figure: frequency distribution of scores (x-axis: score, 0 to 5; y-axis: number of people, 0 to 70)]
characteristics of the normal distribution
- bell shaped
- unimodal and symmetrical (mode = mean = median)
- tails extend indefinitely (although often can't show this on graphs)
- area under the curve = 1 (100%)
Recall: standard deviation and standard scores (z-scores)

- z-transformation transforms a normal distribution into a standard normal distribution

how to do a z-transformation

computing z-scores

$$z = \frac{\text{score} - \text{mean}}{\text{standard deviation}}$$

So, a z-score simply describes how far a data point (e.g., a person's score) lies from the mean, expressed in standard deviation units.

For a score in a sample, where $X$ is your score (e.g., your height) and $z$ is your score as a z-score:

$$z = \frac{X - \bar{X}}{s}$$

For a score in a population:

$$z = \frac{X - \mu}{\sigma}$$
z-transformation
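if you convert your raw scores into z-scores and plot the z-scores on a frequency distribution, the result (provided the raw scores are normally distributed) is called a standard normal distribution (mean = 0 and SD = 1)

[Figure: standard normal curve, with the x-axis in z-score units]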
z-transformation
the standard normal distribution
(1) allows comparison of performance across different tests
(1) allows us to compare scores from different distributions
Suppose I got 7 out of 10 in an Art History essay, and 62% in a Parasitology exam. Am I doing better in Art History or Parasitology? That depends on the mean and standard deviation of these assessments.

Suppose the stats are:
- Art History essay: N = 37, $\bar{X} = 4.97$, s = 2.09
- Parasitology test: N = 60, $\bar{X} = 49.04$, s = 8.31

Example: converting to z-scores

$$z_E = \frac{X - \bar{X}}{s} = \frac{7 - 4.97}{2.09} = 0.971$$

$$z_P = \frac{X - \bar{X}}{s} = \frac{62 - 49.04}{8.31} = 1.560$$
Art History: z = +0.97; Parasitology: z = +1.56. Relative to the rest of each class, the Parasitology result is the better one.
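The comparison above can be sketched in a few lines of Python. This is only an illustration: the helper name `z_score` is an assumption made here, and the numbers are the ones given in the example.

```python
def z_score(score: float, mean: float, sd: float) -> float:
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd


# Figures taken from the Art History / Parasitology example.
z_art = z_score(7, 4.97, 2.09)
z_para = z_score(62, 49.04, 8.31)

print(f"Art History:  z = {z_art:+.2f}")
print(f"Parasitology: z = {z_para:+.2f}")
print("Better relative result:", "Parasitology" if z_para > z_art else "Art History")
```

Because both results are expressed in standard deviation units, they can be compared directly even though the raw marks are on different scales.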
so

- z-transformation alters the mean and SD of a variable, but not the relative location of scores
- z-transforming data from a normal distribution results in a standard normal distribution (mean = 0 and SD = 1)
- z-scores represent the number of standard deviations that X is from the mean
z-transformation
(1) allows comparison of performance across different tests
(2) tells you how many people score above or below you on a certain measure
(2) finding areas under the curve

The normal distribution can be used to calculate the probability that values will lie within a specified interval, and we can use the following formula to work out that probability:

$$f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(X - \mu)^2 / (2\sigma^2)}$$

...or not
finding areas under the curve
Luckily, someone sat down and created tables to provide these probabilities. And even more fortunately, because any normal distribution can be transformed into the standard normal distribution (i.e., z-transformation), only one set of tables is needed.
- 50% of scores lie above the mean and 50% below
- 68.26% lie between -1 and +1 SDs
- 95.44% lie between -2 and +2 SDs
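For anyone curious about what the tables are doing, the sketch below integrates the normal density numerically. It is an illustration under stated assumptions, not how the tables were produced: the function names `normal_pdf` and `area_between` and the choice of the trapezoid rule are made up here. Printed tables round each half-area to four decimals, which is why they give 68.26% and 95.44% where the direct calculation gives roughly 68.27% and 95.45%.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Height of the normal curve at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def area_between(lo: float, hi: float, mu: float = 0.0,
                 sigma: float = 1.0, steps: int = 10_000) -> float:
    """Approximate the area under the curve from lo to hi (trapezoid rule)."""
    width = (hi - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(hi, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * width, mu, sigma)
    return total * width


within_1_sd = area_between(-1.0, 1.0)
within_2_sd = area_between(-2.0, 2.0)

print(f"between -1 and +1 SDs: {within_1_sd:.4f}")
print(f"between -2 and +2 SDs: {within_2_sd:.4f}")
```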
for example

- let's say you want to find out how much caffeine you drink relative to the rest of the university student population
  - $\mu = 115$ mg/day, $\sigma = 15$ mg
- let's say your intake is 125 mg/day...
use of tables of areas under normal curve

1. finding the area between the mean and a score above it

i) convert 125 to a z-score

$$z = \frac{125 - 115}{15} = \frac{10}{15} = 0.667$$

ii) use tables to find the area between the mean and z

- area = .2486 (about 25% of scores fall between 115 and 125)

2. finding the area beyond the score

i) use tables to find the area beyond z

- area = .2514: ~25% will lie above z = .67

3. finding the area below a score above the mean

i) use tables to find the area between the mean and z

- area = .2486

ii) add .50

- area = .7486

Or alternatively: subtract the area beyond z from 1 (1 - .2514 = .7486).

4. finding the area between the mean and a score below it

e.g., another student drinks 100 mg of caffeine per day

i) convert 100 to a z-score

$$z = \frac{100 - 115}{15} = \frac{-15}{15} = -1.000$$

ii) ignore the negative sign (the curve is symmetrical), and look up the tables

- area = .3413

5. area between scores on opposite sides of the mean

i) convert the two scores to z-scores

- your intake (125 mg/day): z = 0.667
- yet another student (105 mg/day): $z = \frac{105 - 115}{15} = -0.667$

ii) find the two areas and add them

- .2486 + .2486 = .4972

6. area between scores on the same side of the mean

i) convert the scores to z-scores

- another student (139 mg/day): $z = \frac{139 - 115}{15} = \frac{24}{15} = 1.600$ (area = .4452)
- you (125 mg/day): z = .667 (area = .2486)

ii) take the difference between the two areas

- .4452 - .2486 = .1966
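The six table look-ups above can be cross-checked in Python. This is a rough sketch that swaps the printed tables for `statistics.NormalDist` from the standard library (Python 3.8+); the variable names are invented for illustration. The code uses the unrounded z of 0.6667 rather than the table entry for .67, so some results differ from the table values in the third decimal place.

```python
from statistics import NormalDist

# Population of caffeine intakes from the example: mean 115 mg/day, SD 15 mg.
caffeine = NormalDist(mu=115, sigma=15)

# 1. area between the mean and a score above it (125)
mean_to_125 = caffeine.cdf(125) - 0.5
# 2. area beyond the score
beyond_125 = 1.0 - caffeine.cdf(125)
# 3. area below a score that is above the mean
below_125 = caffeine.cdf(125)
# 4. area between the mean and a score below it (100)
mean_to_100 = 0.5 - caffeine.cdf(100)
# 5. area between scores on opposite sides of the mean (105 and 125)
between_105_125 = caffeine.cdf(125) - caffeine.cdf(105)
# 6. area between scores on the same side of the mean (125 and 139)
between_125_139 = caffeine.cdf(139) - caffeine.cdf(125)

for label, area in [
    ("1. mean to 125", mean_to_125),
    ("2. beyond 125", beyond_125),
    ("3. below 125", below_125),
    ("4. mean to 100", mean_to_100),
    ("5. 105 to 125", between_105_125),
    ("6. 125 to 139", between_125_139),
]:
    print(f"{label}: {area:.4f}")
```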
finding a score when the area is known

We know that a score is above 80% of others, but do not know the score.

First, rework the equation:

$$z = \frac{X - \mu}{\sigma} \quad\Rightarrow\quad X = \mu + z\sigma$$

Then, remember that .80 = .50 + .30

Then, use the table to find the z-score for an area of .30 between the mean and z: z = .84

So, given this equation and the knowledge that the z-score for .30 is .84:

$$X = 115 + (15)(.84) = 127.6$$

A person with a score above 80% of other people's scores has a score of 127.6
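The reverse look-up can also be sketched in Python. As an assumption for illustration, `statistics.NormalDist.inv_cdf` (Python 3.8+) stands in for reading the table backwards; the variable names are hypothetical. The result is slightly more precise than the table's z of .84.

```python
from statistics import NormalDist

mu, sigma = 115, 15  # caffeine example: population mean and SD

# z-score that has 80% of the standard normal distribution below it.
z_80 = NormalDist().inv_cdf(0.80)

# Rearranged z formula: X = mu + z * sigma
score_80 = mu + z_80 * sigma

print(f"z = {z_80:.2f}, X = {score_80:.1f} mg/day")
```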
z-transformation
(1) allows comparison of performance across different tests
(2) tells you how many people score above or below you on a certain measure
(3) allows you to make inferences concerning the probability that different scores will be obtained
3) comparing a sample with a population
- it's all well and good to compare individual scores with a mean, but research usually involves looking at groups of scores (i.e., a sample of scores)
- example: let's say you want to know if students with a high GPA drink more (or less) caffeine than average
- in this case you need to compare the mean ($\bar{X}$) for the sample with the population mean ($\mu$)
example
- let's say you collect a sample of 50 students whose GPA is 6.0 or better. You find that the mean for this sample is 105 mg. Suppose we know the mean for the population is 115 mg. What can we conclude from this?
- you might be tempted to say that caffeine consumption negatively affects studying
- but the difference between the sample mean and the population mean could be caused by other factors
inferential statistics
- using sample data to make inferences about population parameters
- if nothing else is known, the statistics of a sample (e.g., the mean) are the best estimates of the population parameters (e.g., height of UQ students based on this class)

But samples may fail to provide good estimates of the population for two reasons:
(1) sampling bias and (2) sampling error
1) sampling bias
- due to faulty sampling methods, some important subgroups of the population may be over- or under-represented in our sample
- systematic variation (e.g., inadvertently got low caffeine consumers; more women study psychology)

e.g., a classic example

- 1948 telephone poll for US elections
  - Thomas Dewey (Republican) was predicted to win by a large margin
  - Harry S Truman (Democrat) won easily
  - Why?
2) sampling error
- no matter how careful we are, no two samples from the same population will be identical: by chance there would be natural variation in scores (sampling error)
- if I took a random sample of 50 students from anywhere, it would be a complete fluke if the mean for that sample was exactly the population mean (115 mg)
- the term sampling error implies a mistake, but this is misleading: it's a natural thing and can't be helped
- so the question is not whether the sample mean differs from the population mean (it almost always will) but how likely it is that the difference we observed could have occurred by chance
statistical inference
- statistical inference is the foundation of hypothesis testing
- we use sample data to make inferences about population parameters
- this allows the researcher to determine the probability that a sample is from one population and not another
- it enables the researcher to evaluate the veracity (truth) of a hypothesis as if a whole population were available, instead of just a small (but hopefully representative) sample
sampling distributions

- the distribution of a statistic that we would expect if we drew an infinite number of samples (of a given size) from the population
- sampling distributions have means and SDs
- we can have a sampling distribution for any statistic, but the most common is the sampling distribution of the mean
sampling distribution of the mean

population of four scores (how many cups of coffee 2010 students drink in a day):

1, 2, 3, 4 (everyone has either 1, 2, 3 or 4 coffees per day)

$\mu = 2.5$, $\sigma = 1.118$

Draw all possible samples (n = 2) with replacement:

sample  mean    sample  mean
1,1     1.0     3,1     2.0
1,2     1.5     3,2     2.5
1,3     2.0     3,3     3.0
1,4     2.5     3,4     3.5
2,1     1.5     4,1     2.5
2,2     2.0     4,2     3.0
2,3     2.5     4,3     3.5
2,4     3.0     4,4     4.0
$\mu_{\bar{X}} = 2.5$, $\sigma_{\bar{X}} = 0.791$

The subscript $\bar{X}$ indicates that these values describe the sampling distribution of the mean.

sampling distribution of the mean in diagram form

[Figure: frequency histogram of the 16 sample means (x-axis: mean, from 1 to 4; y-axis: frequency)]

- the distribution resembles the normal distribution, not the original population
- the mean of the sampling distribution is equal to the mean of the actual population:

$$\mu_{\bar{X}} = \mu$$
standard error of the mean
- the standard error of the mean ($\sigma_{\bar{X}}$) is the standard deviation of the distribution of sample means
- it represents the typical or average distance between a sample mean $\bar{X}$ and the mean of the population
- it is used to define and accurately measure sampling error

how to calculate the standard error of the mean:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$$

where $\sigma$ is the standard deviation of the scores from the whole population and $N$ is the number of people in each sample

e.g.

$$\sigma_{\bar{X}} = \frac{1.118}{\sqrt{2}} = 0.791$$
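The four-score coffee example is small enough to enumerate by brute force. The sketch below is one possible way to do that with the standard library, with variable names chosen here for illustration: it lists all 16 samples of size 2 drawn with replacement, then checks that the SD of the sample means matches the standard error formula.

```python
import math
from itertools import product
from statistics import mean, pstdev

population = [1, 2, 3, 4]  # cups of coffee per day
n = 2                      # size of each sample

# All ordered samples of size n, drawn with replacement (4 * 4 = 16 of them).
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)
sigma = pstdev(population)          # population SD (divides by N, not N - 1)
mu_xbar = mean(sample_means)        # mean of the sampling distribution
sigma_xbar = pstdev(sample_means)   # SD of the sampling distribution

print(f"population:   mu = {mu}, sigma = {sigma:.3f}")
print(f"sample means: mu = {mu_xbar}, sigma = {sigma_xbar:.3f}")
print(f"sigma / sqrt(n) = {sigma / math.sqrt(n):.3f}")
```

Note that `pstdev` (the population SD) is used rather than `stdev`, because the four scores and the 16 means are each treated as complete populations here.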
making inferences from sampling distribution of the mean

Example: You want to test the theory that high doses of caffeine improve statistical performance. To test this, you take a random sample of 25 PSYC2010 students and give them high doses of caffeine throughout the semester. The mean result for this sample is 80%. Over the years you know that PSYC2010 students have averaged 70% (standard deviation = 20).

Question: Given a normally distributed population with $\mu = 70$ and $\sigma = 20$, what is the probability of obtaining a sample (N = 25) with a mean of 80 or higher?

calculate the standard error of the mean

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}} = \frac{20}{\sqrt{25}} = \frac{20}{5} = 4.00$$

on average, we expect a sample to have a mean that differs by 4 points from the true population mean

transform the sample mean of 80 to a z-score, using the standard error of the mean as the denominator (not the standard deviation of the scores)

$$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{80 - 70}{4.00} = \frac{10}{4.00} = 2.50$$

we're actually asking about the mean of our sample relative to the sampling distribution of the mean (of the population): what is the likelihood that our mean comes from this population?

use tables to determine the area beyond z

- area beyond z = 2.50 is .0062 (~.6%)

what is the probability that the sample mean will differ from the population mean by 10 points or more?

- .0062 + .0062 = .0124
- 1.2% of samples are expected to differ by 10 or more points
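The whole calculation can be reproduced in a short Python sketch. It is only a check on the arithmetic: `statistics.NormalDist` replaces the printed table, and the variable names are assumptions made here.

```python
import math
from statistics import NormalDist

mu, sigma = 70, 20   # population mean and SD of PSYC2010 results
n = 25               # sample size
sample_mean = 80     # mean of the caffeine sample

# Standard error of the mean: sigma / sqrt(N)
standard_error = sigma / math.sqrt(n)

# z-score of the sample mean within the sampling distribution of the mean
z = (sample_mean - mu) / standard_error

# Area beyond z (one tail), and in both tails (10 or more points either way)
p_one_tail = 1.0 - NormalDist().cdf(z)
p_two_tail = 2.0 * p_one_tail

print(f"SE = {standard_error:.2f}, z = {z:.2f}")
print(f"P(mean >= 80) = {p_one_tail:.4f}")
print(f"P(differs by 10 or more) = {p_two_tail:.4f}")
```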
To Sum
- The SE is the SD of sample means
- It is a measure of how representative a sample is likely to be of the population
- A large SE (relative to the sample mean) indicates that there is a lot of variability between the means of different samples, so a sample may not be representative of the population
- A small SE indicates that sample means are similar to the population mean, so our sample is likely to be an accurate reflection of the population
lecture 1 - review
- general review
- standard normal distribution
- finding areas under the normal curve
- sampling distribution of the mean
