
SAMPLING METHODS

An Illustrative Review

Let us illustrate each of the sampling methods using the same hypothesis: Students with low self-esteem demonstrate lower achievement in school subjects.

Target population: All eighth-graders in California.
Accessible population: All eighth-graders in the San Francisco Bay Area (seven counties).
Feasible sample size: n = 200-250.

1. Simple random sampling
Identify all eighth-graders in all public and private schools in the seven counties (estimated number = 9,000). Assign each student a number, and then use a table of random numbers to select a sample of 250. The difficulty here is that it is time-consuming to identify every eighth-grader in the Bay Area, and to contact (probably) 200 different schools in order to administer instruments to one or two students in those schools.

2. Cluster random sampling
Identify all public and private schools having an eighth grade in the seven counties. Estimates: 9,000 students / 300 classes = 30 students per class; 150 schools; 2 classes per school. Assign each of the 150 schools a number, and then randomly select four schools (4 schools × 2 classes per school × 30 students per class = 240). Cluster random sampling is much more feasible to implement than simple random sampling, but it is limited because of the use of only four schools, even though they are to be selected randomly. For example, the selection of only four schools may exclude the selection of private school students.

3. Stratified random sampling
Obtain data on the number of eighth-grade students in public versus private schools and determine the proportion of each type (e.g., 80 percent public, 20 percent private). Determine the number from each type to be sampled: public = 80 percent (200) = 160; private = 20 percent (200) = 40. Randomly select these numbers from the respective subpopulations of public and private students. Stratification may be used to guarantee that the sample is representative on other variables as well. The difficulty with this method is that stratification requires that the researcher know the proportions in each stratum of the population, and it also becomes increasingly difficult as more variables are added. Imagine trying to stratify not only on the public-private variable but also (for example) on student ethnicity, gender, and socioeconomic status, and on teacher gender and experience.

4. Two-stage random sampling
Randomly select 25 schools from the population of 150 schools, and then randomly select eight eighth-grade students from each school (n = 8 × 25 = 200). This method is much more feasible than simple random sampling and more representative than cluster sampling. It may well be the best choice in this example, but it still requires permission from 25 schools and the resources to collect data from each.

5. Convenience sampling
Select all eighth-graders in four schools to which the researcher has access (n = 30 × 4 × 2 = 240). This method precludes generalizing beyond these four schools, unless a strong argument with supporting data can be made for their similarity to the entire group of 150 schools.

6. Purposive sampling
Select eight classes from throughout the seven counties on the basis of demographic data showing that they are representative of all eighth-graders. Particular attention must be paid to self-esteem and achievement scores. The problem is that such data are unlikely to be available and, in any case, cannot eliminate possible differences between the sample and the population on other variables, such as teacher attitude and available resources.

7. Systematic sampling
Select every forty-fifth student from an alphabetical list for each school (200 students in sample / 9,000 students in population = 1/45). This method is as inconvenient as simple random sampling and is likely to result in a biased sample, since the forty-fifth name in each school is apt to be in the last third of the alphabet (remember there are an estimated 60 eighth-graders in each school), introducing probable ethnic or cultural bias. An alternative is to select every sixth school from the list of 150 schools (150/6 = 25 schools) and then every sixth student on the list of eighth-graders (n = 60/6 = 10 students per school). This method results in n = 25 × 10 = 250 students, but it is inferior to random methods because of the possibility of bias inherent in school or student lists, or both. It is also no easier to carry out than random methods.

There are a few guidelines that we would suggest with regard to the minimum number of subjects needed. For descriptive studies, we think a sample with a minimum number of 100 is essential. For correlational studies, a sample of at least 50 is deemed necessary to establish the existence of a relationship. For experimental and causal-comparative studies, we recommend a minimum of 30 individuals per group, although sometimes experimental studies with only 15 individuals in each group can be defended if they are very tightly controlled; studies using only that many subjects per group should probably be replicated, however, before too much is made of any findings that occur.

TYPES OF PROBABILITY SAMPLING

Samples can be drawn from accessible or target populations by various methods. Some of the methods involve probability sampling, which means that each individual in the population has a known probability of being selected. The probabilities are known because the individuals are chosen by chance. The following sections describe four types of probability sampling: (1) simple random sampling, (2) systematic random sampling, (3) stratified random sampling, and (4) cluster sampling.

Simple Random Sampling

A simple random sample is a group of individuals drawn by a procedure in which all the individuals in the defined population have an equal and independent chance of being selected as a member of the sample. By independent we mean that the selection of one individual for the sample has no effect on the selection of any other individual.

This definition of a simple random sample is adequate, but it does contain a slight flaw. In reality, each individual in the defined population cannot have an exactly equal chance of being selected into the sample. To understand why this is so, suppose there are 1,000 sixth-grade students in our accessible population, and we want to select a simple random sample of 100 students. When we select our first research participant, each student has one chance in 1,000 of being selected. Once this student has been selected, however, only 999 students remain; each student now has one chance in 999 of being selected as our second participant.
Thus, as each student is selected, the probability of being selected next increases slightly because the population from which we are selecting has become one participant smaller.

The flaw in our initial definition of a simple random sample can be corrected by defining it as a sample selected from a population by a process that provides every sample of a given size an equal probability of being selected. In other words, suppose that your population has 1,000 members and you intend to draw a random sample of 50 participants from it. Now imagine every conceivable sample of 50 participants from this population. If you draw a random sample from the population, any one of these samples would have an equal chance of being the sample you select for your study.

The main advantage of randomly selected samples is that they yield research data that can be generalized to a larger population within margins of error that can be determined by statistical formulas. Random sampling also is preferred because it satisfies the logic by which a null hypothesis is tested with inferential statistics (see Chapter 5).

Random Number Generators

Various techniques can be used to obtain a simple random sample. Suppose the research director of a large city school system wishes to obtain a random sample of 100 students from a population of 972 students currently enrolled in the ninth grade in School District A. First, he would obtain a copy of the district census for ninth-grade students and assign a number to each student. Then he would use a table of random numbers to draw a sample from the census list.

Tables of random numbers usually consist of long series of five-digit numbers generated randomly by a computer. Table 6.1 is a small portion of a typical table. To use the random numbers table, randomly select a column as a starting point, then select all the numbers that follow in that column. Because there are three digits in 972 (the number of cases in the accessible population, School District A), you only need to use the last three digits of each five-digit number. If more numbers are needed, proceed to the next column until sufficient numbers have been selected to make up the desired sample size.

In our example, suppose the researcher selects row 1 of column 5 in Table 6.1 as his starting point and uses the last three digits of each number in that column and each successive column. Thus, the researcher would select the 732nd student on the census list, skip the number 983 (there are only 972 cases in the population), select the 970th student, select the 554th student, and so on. This procedure would be followed (with a much larger table of random numbers, of course) until a sample of 100 pupils had been selected.

Another method for generating a sequence of random numbers is to use software with this capability. Also, at least one web site has this capability (http://www.randomizer.org). (You might find other web sites by entering the keyword phrase "random number generator" in a generic search engine.)
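As a software alternative to the printed table, the draw of 100 students from the 972 on the census list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simple_random_sample` and the seed value are hypothetical, students are assumed to be numbered 1 to 972 in census order, and the standard library's pseudo-random generator stands in for the random numbers table.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Return sorted census-list positions (1-based) drawn without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    # random.sample draws without replacement, so no student is picked twice
    # and every possible set of sample_size positions is equally likely.
    positions = rng.sample(range(1, population_size + 1), sample_size)
    return sorted(positions)


# 100 students from the 972 on the district census list
# (the seed is an arbitrary illustrative value).
selected = simple_random_sample(972, 100, seed=2024)
print(len(selected), "students selected; first five positions:", selected[:5])
```

Because the generator only produces positions between 1 and 972, there is no need to skip out-of-range numbers such as 983, as one must when reading three-digit values from a table.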

Systematic Random Sampling

Systematic random sampling is easier than simple random sampling if the sample to be selected is very large and a list of the accessible or target population is available. Suppose the population has 100,000 members and you wish to select a sample of 1,000 members from it. Further suppose the members are listed in a directory. If you were to use simple random sampling, you would need to number the members from 1 to 100,000 and then use a table of random numbers to select the sample of 1,000 members.

If you selected a systematic sample instead, you would first divide the population by the number needed for the sample (100,000 divided by 1,000 = 100). Then select at random a number smaller than the number arrived at by the division (in this example, a number smaller than 100, such as 36). Then, starting with the 36th member on the list, select every 100th name thereafter from the directory list. The time saved is substantial, because there is no need to assign a separate number to each member listed in the directory or to work back and forth between a table of random numbers and the directory.

Systematic random sampling should be avoided if there is any possibility of periodicity in the list (that is, if every nth person on the list shares a characteristic that is not shared by the entire population). For example, suppose you have 100 class lists, and you decide to select a sample by choosing the first name on each list. If the names are in alphabetical order, your sample most likely would include only students whose last name begins with A or B. This sample would underrepresent certain ethnic groups for whom a last name beginning with A or B is uncommon.

Stratified Random Sampling

A stratified random sample is selected so that certain subgroups in the population are adequately represented in the sample. For example, suppose that the population includes 10,000 students, of whom 100 are Laotian. If you draw a random sample of 200 students from this population, there is a strong likelihood that the sample would include no Laotian students or only a very few. Stratified random sampling ensures that a satisfactory representation of Laotian students is included in the sample, if this is important to your study.
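The likelihood mentioned in the Laotian example can be checked with a short calculation. The sketch below assumes the 200 students are drawn by simple random sampling without replacement, so the number of Laotian students in the sample follows a hypergeometric distribution; the function name `prob_at_most` is a hypothetical helper, not part of any library.

```python
from math import comb


def prob_at_most(k, population, subgroup, sample):
    """P(a simple random sample contains at most k subgroup members)."""
    total = comb(population, sample)  # number of equally likely samples
    favorable = sum(
        comb(subgroup, j) * comb(population - subgroup, sample - j)
        for j in range(k + 1)
    )
    return favorable / total


# 10,000 students, 100 of them Laotian, simple random sample of 200.
p_none = prob_at_most(0, 10_000, 100, 200)
p_few = prob_at_most(2, 10_000, 100, 200)
print(f"P(no Laotian students in the sample)  = {p_none:.3f}")
print(f"P(two or fewer Laotian students)      = {p_few:.3f}")
```

The expected number of Laotian students in such a sample is only 200 × 0.01 = 2, which is why a sample that guarantees a workable subgroup size has to be drawn by stratum rather than from the population as a whole.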
In proportional stratified random sampling, the proportion of each subgroup in the sample is the same as its proportion in the population. Suppose we are comparing students with different ethnic backgrounds. Each ethnic background (Laotian, African American, Latino, and so forth) would be considered a separate stratum, that is, subgroup. Further suppose that Laotians are the smallest ethnic group in the population: 100 students out of 10,000, which equals 1 percent of the population. We want to have at least 10 Laotian students in the sample. Therefore, we would randomly select 10 from all the Laotian students in the population. Because Laotian students constitute 1 percent of the population, the sample would need to include 1,000 students to be proportionally correct (10 students in the sample divided by .01). The size of other strata in the sample could then be determined. For example, if the population included 2,000 African American students (which is 20 percent of the population), the sample should include 200 of them (20 percent of the predetermined sample size of 1,000 students).

A variant of this approach is nonproportional stratified random sampling. We might decide to select 20 students of each ethnic background in the population, regardless of their proportion in the population. This approach is quite acceptable, as long as we make generalizations only about the findings for students of each ethnic background. We cannot make generalizations from the total sample, because it does not represent accurately the proportional ethnic composition of the population.

Cluster Sampling

In cluster sampling, the unit of sampling is a naturally occurring group of individuals. Cluster sampling is used when it is more feasible to select groups of individuals (called clusters) rather than individuals from a defined population. For example, suppose you wish to administer a questionnaire to a random sample of 300 students in a population defined as all sixth-graders in four school districts. The population includes a total of 1,500 sixth-graders in 50 classrooms, with an average of 30 students in each classroom. One approach to sample selection would be to draw a simple random sample of 300 students using a census list of all 1,500 students. In cluster sampling, by contrast, you might draw a random sample of ten classrooms, assuming 30 students on average per classroom. Thirty students per classroom multiplied by ten classrooms equals 300 students, which is the desired sample size. Thus, you have achieved the efficiency of only having to access ten classrooms in order to administer a questionnaire to a random sample of 300 students. If you had used simple random sampling instead, you probably would have had to arrange for access to all 50 classrooms, even though some of these classrooms might include only one student in the random sample.

A variation of this method is multistage cluster sampling, which involves first selecting clusters and then selecting individuals within clusters. In the example we have been considering, suppose you wish to supplement the questionnaires with interviews of individual students. It is relatively easy to group-administer the questionnaire to every student in the ten classrooms. However, it would be very time-consuming to interview all 300 students in them. Therefore, you could institute another sampling procedure (the second stage of multistage cluster sampling), in which you randomly select five students, for example, from each of the ten classrooms to interview. The interview sample, then, will include 50 students.

Conventional formulas for computing statistics on research data should not be used with samples chosen by cluster sampling. Special statistical formulas are available, but they are less sensitive to population differences.
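The two selection stages in the classroom example can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the roster is invented (50 classrooms of exactly 30 students each, where the text gives 30 only as an average), and the function name `multistage_cluster_sample` and the seed are hypothetical.

```python
import random


def multistage_cluster_sample(classrooms, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick whole classrooms. Stage 2: pick students within each one."""
    rng = random.Random(seed)
    # Stage 1: a simple random sample of clusters (classrooms).
    chosen_rooms = rng.sample(sorted(classrooms), n_clusters)
    # Every student in a chosen classroom receives the questionnaire.
    questionnaire = [s for room in chosen_rooms for s in classrooms[room]]
    # Stage 2: a simple random sample of students inside each chosen classroom.
    interviews = [
        student
        for room in chosen_rooms
        for student in rng.sample(classrooms[room], n_per_cluster)
    ]
    return chosen_rooms, questionnaire, interviews


# Hypothetical roster: 50 classrooms x 30 students = 1,500 sixth-graders.
classrooms = {
    f"room{r:02d}": [f"room{r:02d}-student{s:02d}" for s in range(1, 31)]
    for r in range(1, 51)
}
rooms, questionnaire, interviews = multistage_cluster_sample(
    classrooms, n_clusters=10, n_per_cluster=5, seed=7
)
print(len(rooms), len(questionnaire), len(interviews))  # prints: 10 300 50
```

Only ten classrooms need to be visited, yet the questionnaire sample still reaches 300 students and the interview sample 50. The analysis caveat stated just before this sketch, that cluster samples call for special formulas that are less sensitive to population differences, applies to samples drawn this way.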
This disadvantage must be weighed against the possible savings in time and money that can result from cluster sampling.

In summary, you have a variety of sampling methods to consider in carrying out a quantitative research study. Sophisticated variants of the probability sampling methods we have described have been developed, primarily for use in large-scale survey research. Therefore, if you plan to carry out a large-scale survey research study, you should investigate these methods further.

NONPROBABILITY SAMPLING

In nonprobability sampling, individuals are not selected by chance, but by some other means. In the rest of this section, we describe a common method of nonprobability sampling used in quantitative research, usually called convenience sampling. Later we describe the various sampling methods used in qualitative research, which are grouped under the general heading of purposeful sampling. With one exception, called purposeful random sampling, all the sampling strategies used in qualitative research also are based on nonprobability sampling.

It is much more difficult to make valid inferences about a population from nonprobability sampling methods, but these methods are used in more than 95 percent of research studies in the social sciences. Undoubtedly, the reason for their prevalence is that it is much easier to select a nonprobability sample than a random sample when studying individuals in their natural environment.

Convenience Sampling

Each of the sampling methods described above involves a defined population and a sample of individuals or groups randomly drawn from that population. In actuality, many quantitative researchers do not use any of these methods to select a sample. Rather, the researcher selects a sample that suits the purposes of the study and that is convenient. The sample can be convenient for a variety of reasons: the sample is located at or near where the researcher works; the administrator who will need to approve data collection is a close colleague of the researcher; the researcher is familiar with the site and might even work in it; some of the data that the researcher needs already have been collected.

In fact, many research studies that appear in journals involve college students, because the researcher is a professor and these students provide a convenient sample. In view of the fact that college students are not representative of the adult population in general, one would be justified in questioning the universality of certain principles of learning and instruction that appear in textbooks and other sources that are based on such research.

Researchers often need to select a convenience sample or face the possibility that they will be unable to do the study. Although a sample randomly drawn from a population is more desirable, it usually is better to do a study with a convenience sample than to do no study at all, assuming, of course, that the sample suits the purposes of the study.

If a convenience sample is used, the researchers and readers of their report must infer a population to which the results might generalize. The researcher can assist the inference process by providing a careful description of the sample. Although this recommendation seems obvious, it sometimes is violated in practice. For example, we came across this description of a sample (slightly paraphrased to mask the researcher's identity) in a journal issue: "The study involved 58 undergraduate seniors majoring in education at a southeastern university." That is the description in its entirety. There is insufficient information in this description to infer whether the results would generalize to all universities or to a limited subset (e.g., small private universities), and whether the results would generalize to all education majors or to a limited subset (e.g., education majors who have completed at least one school practicum, or those planning to teach at the high school level).

Inferential statistics often are used to analyze data collected from convenience samples, even though the logic of inferential statistics requires that the sample be randomly drawn from a defined population. Some researchers believe that inferential statistics for these samples cannot be interpreted meaningfully. Others believe that it is possible to conceptualize a population that the sample represents. They then reason that because the sample is representative of this population, the sample is equivalent to a sample randomly drawn from the population; therefore, the use of inferential statistics is justified.

Maybe inferential statistics can be used with data collected from a convenience sample if the sample is carefully conceptualized to represent a particular population. Nevertheless, we believe that one should be cautious about accepting findings as valid and making generalizations from them on the basis of one study. Repeated replication of the findings is much stronger evidence of their validity and generalizability than is a statistically significant result in one study.

Determining Sample Size for a Quantitative Research Study

The general idea in quantitative research is to use the largest sample possible. The larger the sample, the more likely the research participants' scores on the measured variables will be representative of population scores.

In addition to this general rule, researchers have developed rules of thumb for determining the minimum number of participants needed for different research methods. In correlational research, a minimum of 30 participants is desirable. In causal-comparative and experimental research, there should be at least 15 participants in each group to be compared. For survey research, Seymour Sudman suggested a minimum of 100 participants in each major subgroup and 20 to 50 in each minor subgroup.

Mathematical procedures are available to make more precise estimates of the sample size needed to reject the null hypothesis when in fact it is false, and to determine the likely value of population parameters (typically, the population mean and standard deviation). In addition to statistical power analysis, you should consider the following three factors in determining an optimal sample size for a quantitative research study:

1. Subgroup analysis. In many quantitative studies, it is desirable to break groups into subgroups for further analysis. For example, the primary data analysis for an experiment might involve a comparison of all research participants in the experimental group with all research participants in the control group. In addition, one might compare all male participants in the experimental group with all male participants in the control group, and similarly for female participants. This type of subgroup analysis sometimes is done as an afterthought, with the unfortunate consequence that the subgroup size is too small to produce adequate statistical power. Therefore, it is best to plan subgroup analyses at the design stage of a study so that an adequate sample size is selected.

2. Attrition. Attrition sometimes is a problem in research that extends over a substantial period of time. For example, in studies of school children, researchers often find that substantial numbers of them leave the school during the course of a school year; this is especially true of schools in low-income communities. Robert Goodrich and Robert St. Pierre estimate that 20 percent attrition per year is a realistic level for planning. Attrition can be minimized by strategies such as developing research participants' commitment to the study and establishing good rapport with them. Still, it is best to increase sample size by a certain percentage to account for possible attrition.

3. Reliability of measures. Measures with low reliability weaken the power of tests of statistical significance and estimates of population parameters. Therefore, if you must use measures with low reliability, you need to increase sample size accordingly. The converse is also true: If you must use a small sample, you should use measures with high reliability.

In many quantitative studies, there is a cost-benefit trade-off involving sample size. For example, in some studies it is desirable to use role playing, depth interviews, and other time-consuming measurement techniques. These techniques cannot be used in large-sample studies unless considerable financial support is available. The alternative is to obtain a large sample but to use relatively inexpensive measures such as questionnaires and standardized tests. However, a study that probes deeply into the characteristics of a small sample often provides more knowledge than a study that attacks the same problem by collecting only superficial data from a large sample.

In some research, very close matching of subjects on the critical variables concerned in the study is possible. Under these conditions, a small sample often will have good statistical power and can yield important results. The classic study by Horatio Newman, Frank Freeman, and Karl Holzinger on the intelligence of identical twins is a good example of such a study. Because identical twins have the same genes, they are ideal for studying the relative influence of heredity and environment on various human characteristics. One phase of their research included only 19 pairs of separated identical twins, but this sample provided information about the relative influences of heredity and environment on intelligence that would have been difficult to obtain with large samples of less closely matched subjects.

SAMPLING IN QUALITATIVE RESEARCH

Now we turn to the procedures that qualitative researchers use to select cases to provide a basis for building or testing a theory. Qualitative research is more flexible with respect to sampling techniques than quantitative research. This flexibility reflects the emergent nature of qualitative research design, which allows researchers to modify their research approach as data are collected. Therefore, the sampling techniques discussed in this section are suggestive rather than prescriptive, and they do not necessarily exhaust the possible ways in which a qualitative research sample might be selected.

Rationale of Purposeful Sampling

The sample size in qualitative studies typically is small. The sample might be a single case. Michael Patton describes the type of sampling procedure used in qualitative research as purposeful sampling. In purposeful sampling the goal is to select cases that are likely to be information-rich with respect to the purposes of the study.

Thus, the researchers in our hypothetical example decided to identify at least one beginning teacher for a case study. Suppose there are five such teachers in the school district. In approaching the first teacher, they find that he is nervous, uncommunicative, and willing to participate in the study only if required to do so. The researchers decide not to pursue this teacher as a possible case.
The next teacher is much more open and eager to participate, and is agreeable to the additional demands on his time that data collection will require. This teacher is selected for the study. At this point, the researchers decide that the intensity of data collection precludes the possibility of including another beginning teacher in the study. Thus, they settle for a sample of one beginning teacher. They would then go through a similar reasoning process to select one experienced teacher for the sample.

It is clear that purposeful sampling is not designed to achieve population validity. The intent is to achieve an in-depth understanding of selected individuals, not to select a sample that will represent accurately a defined population.

Types of Purposeful Sampling

Qualitative researchers use a variety of purposeful sampling strategies to select their cases for study. Table 6.2 summarizes 16 purposeful sampling strategies described by Patton. If the researchers are using a single-case design, only some of the strategies are appropriate, that is, the 12 strategies for which the entries in column 2 (Cases Selected) begin with the word Cases, rather than with the words Multiple Cases. If the researchers are using a multiple-case design, any of these sampling strategies can be used.

Strategies to Select Cases Representing a Key Characteristic

The sampling strategies we have numbered 1 to 7 in Table 6.2 all involve selection based on a key characteristic of the cases to be studied.

1. Extreme or deviant case sampling focuses on cases that are unusual or special. The findings of research on extreme cases can provide an understanding of more typical cases.

2. Intensity sampling. Suppose a researcher was interested in the characteristics of inservice presenters. If an extreme-case sampling approach were used, the researcher might select star performers in the world of inservice education: the individuals who regularly make keynote presentations at national conferences and who consult nationally and internationally. These cases might be interesting, but of little relevance to educators who do inservice presenting on a much smaller scale. Therefore, the researcher might consider selecting educators who are highly respected as inservice presenters within their school district or local region. These educators still qualify as exceptional cases, but they are more like the vast majority of inservice presenters. By studying these less extreme cases, the researcher is more likely to obtain findings that deepen the understanding of most inservice presenters about ways in which they might improve. Also, the findings might enlighten administrators about reasonable qualifications and expectations for local staff who aspire to be inservice presenters.

3. Typical case sampling, as one might expect, involves the selection of typical cases to study. This strategy might be particularly useful in field tests of new programs. Developers and policy makers want their programs to be effective for the great majority of the individuals to be served by the program; otherwise, the program will not be considered cost-effective. Also, stories about typical cases might be useful for selling the program to various constituencies.

These first three sampling strategies (extreme or deviant case sampling, intensity sampling, and typical case sampling) complement one another. None is inherently superior to the others. Each serves important, but different, purposes in qualitative research.

4. Maximum variation sampling involves selecting cases that illustrate the range of variation in the phenomena to be studied. For example, suppose a researcher wishes to study the experiences of different school districts that have received state-funded grants to develop innovative projects. In using a maximum variation sampling strategy, the researcher might select districts that vary widely in size, community setting (e.g., urban, rural), proximity to a university that has a college of education, and the type of project undertaken (e.g., curriculum development, staff development, services for a certain type of student). This strategy serves two purposes: to document the range of variation in the funded projects, and to determine whether common themes, patterns, and outcomes cut across this variation.

5. Stratified purposeful sampling is slightly different from maximum variation sampling. A stratified purposeful sample includes several cases at defined points of variation (e.g., average, above average, and below average) with respect to the phenomenon being studied. By including several cases of each type, the researcher can develop insights into the characteristics of each type, as well as insights into the variations that exist across types. In contrast, a researcher who uses maximum variation sampling is likely to have one case of each type, which might be insufficient for drawing conclusions about that type.

6. Homogeneous sampling is the opposite strategy of maximum variation sampling. Its purpose is to select a sample of similar cases so that the particular group that the sample represents can be studied in depth. For example, suppose a researcher is interested in orientation programs for incoming high school students. In doing pilot work, the researcher discovers that many high schools have orientation programs for all students, but only some high schools have a special orientation program for at-risk students. In planning the main study, the researcher might decide to limit the sample of cases to orientation programs for at-risk students. These programs can be the focus of intensive data collection and study rather than being one aspect of a broader study of orientation programs in general.

7. Random purposeful sampling involves selecting a random sample using the methods of quantitative research. Nevertheless, the purpose of the random sample is not to represent a population, which would be its purpose in quantitative research. Rather, the purpose is to establish that the sampling procedure is not biased. For example, if a researcher is evaluating a program of which some constituencies are critical, the researcher can gain more credibility for his findings if he selects cases at random rather than looking for success stories to report.

8. Critical case sampling involves selecting a single case that provides a crucial test of a theory, program, or other phenomenon. For example, Galileo provided a critical, and convincing, test of his theory of gravity by demonstrating that a feather fell at the same rate as a coin when both were placed in a vacuum. Theories in education tend not to yield such precise predictions as are found in the physical sciences, and therefore a critical case sampling strategy might have less applicability. However, this sampling strategy could prove useful in studying educational programs and related phenomena. For example, a researcher might wish to evaluate a program by selecting a site in which it would be very difficult for the program to succeed. If a study of this case yielded positive results, one would be justified in claiming a strong generalization of the form: If this program works here, it should work anywhere.

9. Theory-based or operational construct sampling is used when the purpose of the study is to gain understanding of real-world manifestations of theoretical constructs.
To illustrate, we can consider Piaget's theory of intellectual development, which is widely used to interpret educational phenomena. One of the constructs of the theory is the concrete stage of development. A researcher might wish to develop further understanding of this construct by studying how it is manifested in particular settings. To achieve this purpose, the researcher would need to select a sample of children who are at this stage of development. Then she could do an intensive analysis of how they function intellectually in various situations of interest to her. In this example, then, the selection of cases is determined by a particular theoretical construct.

10. Confirming and disconfirming case sampling is done to validate findings of previous research. The validation process can be carried out in two ways. The first approach is to study cases that are likely to confirm patterns, themes, and meanings discovered in previous case studies. If the new case or cases are confirmatory, the validity and generalizability of the patterns, themes, and meanings are strengthened. The second approach is to look for cases that are good candidates for disconfirming previous research findings. If the findings from these cases replicate previous findings of patterns, themes, and meanings, their validity and generalizability are greatly strengthened. However, if the findings are, in fact, disconfirming, the researcher might develop new insights about the generalizability limits of previous findings.

11. Criterion sampling involves the selection of cases that satisfy an important criterion. This strategy is particularly useful in studying educational programs. For example, suppose a researcher is planning to study a particular graduate program that prepares educational administrators. Using a criterion sampling strategy, the researcher might select two types of cases to study: (1) recent graduates who took more than ten years to obtain their doctorates; and (2) recent graduates who received their doctorates in three years or less. A study of cases that satisfied one or the other of these criteria most likely would yield rich information about aspects of the program that work well or poorly.

12. Politically important case sampling is a strategy that might serve a useful purpose for the researcher or funding agency.

13. Opportunistic sampling involves the use of findings from one case to inform the researcher's selection of the next case for study. In fact, the findings may alter the research design to be used in studying the next case. Opportunistic sampling is one of the most important strategies in selecting qualitative research samples. Although Patton lists opportunistic sampling as a separate type of purposeful sampling, the principle underlying it applies to many of the strategies described above. For example, if you were to use the extreme or deviant case sampling strategy, you might start with the study of one case that you consider extreme. As you develop an understanding of this case, it may give you ideas about what to look for in selecting another extreme case, or it may cause you to switch to a typical case sampling strategy. Opportunistic sampling allows you the flexibility to make these switches.

Consider a typical problem that use of opportunistic sampling early in a research study might help the researcher avoid. We know of instances in which researchers have selected a multiple-case sample at the outset of the study. They secure the cooperation of the sample and their informed consent letters, and thus feel obliged to study all of them in depth. Unfortunately, the researchers sometimes discover after analyzing data from the first few cases that they would learn more by studying other cases than those to whom they have become obligated. By then, however, they may lack the resources to select new cases because of their commitments to the initially selected cases.

14. Snowball or chain sampling involves asking well-situated people to recommend cases to study. As the process continues, the researcher might discover an increasing number of well-situated people and an increasing number of recommended cases, all or some of whom can be included in the sample. Also, the names of a few individuals might come up repeatedly in talking to different well-situated people. If this type of convergence occurs, these individuals would make a highly credible sample.

15. Combination/mixed purpose sampling reflects a decision to change from one sampling strategy to another as data collection progresses in order to meet multiple research interests and needs or for triangulation of the results. As Patton describes it:

For example, an extreme group or maximum heterogeneity approach may yield an initial potential sample size that is still larger than the study can handle. The final selection, then, may be made randomly (a combination approach). . . . Because research and evaluations often serve multiple purposes, more than one qualitative strategy may be necessary.

Strategy Lacking a Rationale

16. Convenience sampling, as we discussed earlier, is the strategy of selecting cases simply because they are available and easy to study. For example, researchers might select teachers at a school where they formerly worked, because they believe that it will be easy to obtain the teachers' agreement to participate in the study. Patton observes that convenience is the least desirable basis for selecting cases to study. This strategy should be avoided because it is not purposeful in the same sense that the other 15 sampling strategies described above are purposeful.

Determining the Number of Cases for a Qualitative Research Study

In qualitative research, determining the number of cases is entirely a matter of judgment; there are no set rules. Patton suggests that selecting an appropriate sample size for a qualitative study involves a trade-off between breadth and depth:

With the same fixed resources and limited time, a researcher could study a specific set of experiences for a larger number of people (seeking breadth) or a more open range of experiences for a smaller number of people (seeking depth). In-depth information from a small number of people can be very valuable, especially if the cases are information-rich. Less depth from a larger number of people can be especially helpful in exploring a phenomenon and trying to document diversity or understand variation.

Patton suggests that the ideal sampling procedure is to keep selecting cases until one reaches the point of redundancy, that is, until no new information is forthcoming from new cases. Sample size obviously will be affected by the purposeful sampling strategy that you select in planning a qualitative study. If you are using a critical case strategy, a single, well-selected case might be sufficient; adding another one or two critical cases could serve as a replication of the first case. However, the decision to use a maximum variation strategy perhaps would require ten or more cases, even if the study was an initial exploration into the phenomenon of interest.
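The redundancy rule above can be sketched as a simple stopping loop. This is only an illustration under stated assumptions: the function name `sample_until_redundant`, the `patience` parameter (how many consecutive cases with no new themes count as "redundancy"), and the case data are all hypothetical, and in real qualitative work the judgment that nothing new is emerging is interpretive rather than mechanical.

```python
def sample_until_redundant(cases, patience=2):
    """Study cases in order and stop once `patience` consecutive cases
    contribute no theme that has not already been seen.

    `cases` is an iterable of (case_name, set_of_themes) pairs.
    Returns the list of cases actually studied and the set of themes found.
    """
    seen_themes = set()
    studied = []
    cases_without_new_info = 0
    for name, themes in cases:
        studied.append(name)
        new_themes = set(themes) - seen_themes
        seen_themes |= new_themes
        # Reset the counter whenever a case adds something new.
        cases_without_new_info = 0 if new_themes else cases_without_new_info + 1
        if cases_without_new_info >= patience:
            break
    return studied, seen_themes


# Hypothetical cases and themes, invented for illustration only.
cases = [
    ("case A", {"workload", "mentoring"}),
    ("case B", {"mentoring", "funding"}),
    ("case C", {"workload"}),
    ("case D", {"funding", "mentoring"}),
    ("case E", {"leadership"}),
]
studied, themes = sample_until_redundant(cases, patience=2)
```

In this made-up data the loop stops after case D, so the new theme in case E is never observed. That is the practical risk of any redundancy rule: stopping early is a judgment that further cases would add little, not a guarantee.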

Threats to External Validity

By ruling out extraneous factors and assuming that the treatment influences an outcome, researchers make claims about the generalizability of the results. Threats to external validity are problems that threaten our ability to draw correct inferences from the sample data to other persons, settings, and past and future situations. According to Cook and Campbell (1979), three threats may affect this generalizability:

• Interaction of selection and treatment: This threat to external validity involves the inability to generalize beyond the groups in the experiment, such as other racial, social, geographical, age, gender, or personality groups. One strategy researchers use to increase generalizability is to make participation in the experiment as convenient as possible for all individuals in a population.

• Interaction of setting and treatment: This threat to external validity arises from the inability to generalize from the setting where the experiment occurred to another setting. For example, private high schools may be different from public high schools, and the results from our civics experiment on smoking may not apply outside the public high school where the researcher conducts the experiment. This threat may also result from trying to generalize results from one level in an organization to another. For example, you cannot generalize treatment effects you obtain from studying entire school districts to specific high schools. The practical solution to an interaction of setting and treatment is for the researcher to analyze the effect of a treatment for each type of setting.

• Interaction of history and treatment: This threat to external validity develops when the researcher tries to generalize findings to past and future situations. Experiments may take place at a special time (e.g., at the beginning of the school year) and may not produce similar results if conducted earlier (e.g., students attending school in the summer may be different from students attending school during the regular year) or later (e.g., during semester break). One solution is to replicate the study at a later time rather than trying to generalize results to other times.

In our civics-smoking experiment, the researcher needs to be cautious about generalizing results to other high schools, other students in civics classes, and other situations where discussions about the hazards of smoking take place. The behavior of adolescents who smoke may change due to factors associated with the cost of cigarettes, parental disapproval, and advertising. Because of these factors, it is difficult to generalize the results from our civics experiment to other situations.

Threats to Validity

The final idea in experiments is to design them so that you minimize compromises in drawing good conclusions from the scores you obtain in the experiment. A threat to validity means that design issues may threaten the experiment so that the conclusions reached from data may provide a false reading about probable cause and effect between the treatment and the outcome.

Threats to Internal Validity

A number of threats to drawing appropriate inferences relate to the actual design and procedures used in an experiment. Threats to internal validity are problems that threaten our ability to draw correct cause-and-effect inferences that arise because of the experimental procedures or the experiences of participants. Of all of the threats to validity, these are the most severe because they can compromise an otherwise good experiment.

The first category addresses threats related to participants in the study and their experiences:

• History: Time passes between the beginning of the experiment and the end, and events may occur (e.g., additional discussions about the hazards of smoking besides the treatment lecture) between the pretest and posttest that influence the outcome. In educational experiments, it is impossible to have a tightly controlled environment and monitor all events. However, the researcher can have the control and experimental groups experience the same activities (except for the treatment) during the experiment.

• Maturation: Individuals develop or change during the experiment (i.e., become older, wiser, stronger, and more experienced), and these changes may affect their scores between the pretest and posttest. A careful selection of participants who mature or develop in a similar way (e.g., individuals at the same grade level) for both the control and experimental groups helps guard against this problem.

• Regression: When researchers select individuals for a group based on extreme scores, they will naturally do better (or worse) on the posttest than the pretest regardless of the treatment. Scores from individuals, over time, regress toward the mean.

• Selection: People factors may introduce threats that influence the outcome, such as selecting individuals who are brighter, more receptive to a treatment, or more familiar with a treatment for the experimental group. Random selection may partly address this threat.

• Mortality: When individuals drop out during the experiment for any number of reasons (e.g., time, interest, money, friends, parents who do not want them participating in an experiment about smoking), drawing conclusions from scores may be difficult. Researchers need to choose a large sample and compare those who drop out with those who remain in the experiment on the outcome measure.
• Interactions with selection: Several of the threats mentioned thus far can interact (or relate) with the selection of participants to add additional threats to an experiment. Individuals selected may mature at different rates (e.g., 16-year-old boys and girls may mature at different rates during the study). Historical events may interact with selection because individuals in different groups come from different settings. For instance, vastly different socioeconomic backgrounds of students in the teen smoking experiment may introduce uncontrolled historical factors into the selection of student participants. The selection of participants may also influence the instrument scores, especially when different groups score at different mean positions on a test whose intervals are not equal.

The next category addresses threats related to treatments used in the study:

• Diffusion of treatments: When the experimental and control groups can communicate with each other, the control group may learn from the experimental group information about the treatment and create a threat to internal validity. The diffusion of treatments (experimental and nonexperimental) for the control and experimental groups needs to be different. As much as possible, experimental researchers need to keep the two groups separate in an experiment (e.g., have two different civics classes participate in the experiment). This may be difficult when, for example, two civics classes of students in the same grade in the same high school are involved in an experiment about teen smoking.

• Compensatory equalization: When only the experimental group receives a treatment, an inequality exists that may threaten the validity of the study. The benefits (i.e., the goods or services believed to be desirable) of the experimental treatment need to be equally distributed among the groups in the study. To counter this problem, researchers use comparison groups (e.g., one group receives the health-hazard lecture, whereas the other receives a handout about the problems of teen smoking) so that all groups receive some benefits during an experiment.

• Compensatory rivalry: If you publicly announce assignments to the control and experimental groups, compensatory rivalry may develop between the groups because the control group feels that it is the underdog. Researchers can try to avoid this threat by attempting to reduce the awareness and expectations of the presumed benefits of the experimental treatment.

• Resentful demoralization: When a control group is used, individuals in this group may become resentful and demoralized because they perceive that they receive a less desirable treatment than other groups. One remedy to this threat is for experimental researchers to provide a treatment to this group after the experiment has concluded (e.g., after the experiment, all classes receive the lecture on the health hazards of smoking). Researchers may also provide services equally attractive to the experimental treatment but not directed toward the same outcome as the treatment (e.g., a class discussion about the hazards of teen driving with friends).

The following category addresses threats that typically occur during an experiment and relate to the procedures of the study:

• Testing: A potential threat to internal validity is that participants may become familiar with the outcome measures and remember responses for later testing. During some experiments, the outcome is measured more than one time, such as in pretests (e.g., repeated measures of number of cigarettes smoked). To remedy this situation, experimental researchers measure the outcome less frequently and use different items on the posttest than those used during earlier testing.

• Instrumentation: Between the administration of a pretest and a posttest, the instrument may change, introducing a potential threat to the internal validity of the experiment. For example, observers may become more experienced during the time between a pretest and the posttest and change their scoring procedures (i.e., observers change the location to observe teen smoking). Less frequently, the measuring instrument may change so that the scales used on a pretest and a posttest are dissimilar. To correct for this potential problem, you standardize procedures so that you use the same observational scales or instrument throughout the experiment.

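The regression threat listed among the participant-related threats can be made concrete with a small simulation. The sketch below is an illustration under assumptions that are not in the text: each observed score is modeled as a stable "true" score plus independent measurement noise, no treatment is applied at all, and the function name `simulate_regression` and every numeric setting are hypothetical.

```python
import random


def simulate_regression(n=10000, top_fraction=0.1, seed=0):
    """Select the highest pretest scorers and compare their pretest and
    posttest means when NO treatment is given.

    Assumed model: observed score = true score + independent noise.
    Returns (mean pretest, mean posttest) for the selected extreme group.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    pretest = [t + rng.gauss(0, 10) for t in true_scores]
    posttest = [t + rng.gauss(0, 10) for t in true_scores]

    # Choose the group on the basis of extreme (high) pretest scores.
    group_size = int(n * top_fraction)
    ranked = sorted(range(n), key=lambda i: pretest[i], reverse=True)
    selected = ranked[:group_size]

    mean_pre = sum(pretest[i] for i in selected) / group_size
    mean_post = sum(posttest[i] for i in selected) / group_size
    return mean_pre, mean_post


mean_pre, mean_post = simulate_regression()
print(f"selected group pretest mean:  {mean_pre:.1f}")
print(f"selected group posttest mean: {mean_post:.1f}")
```

Under this model the selected group's posttest mean falls back toward the overall mean of 50 even though nothing was done to the participants, which is why a change in an extreme-score group cannot be credited to a treatment without a comparison group selected the same way.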
HOW DO YOU VALIDATE THE ACCURACY OF YOUR FINDINGS?

Throughout the process of data collection and analysis, you need to make sure that your findings and interpretations are accurate. Validating findings means that the researcher determines the accuracy or credibility of the findings through strategies such as member checking or triangulation. Several qualitative researchers have addressed this idea (e.g., Creswell & Miller, 2000; Lincoln & Guba, 1985). Qualitative researchers do not typically use the word bias in research; they will say that all research is interpretive and that the researcher should be self-reflective about his or her role in the research, how he or she is interpreting the findings, and his or her personal and political history that shapes his or her interpretation (Creswell, 2007). Thus, accuracy or credibility of the findings is of utmost importance. There are varied terms that qualitative researchers use to describe this accuracy or credibility (e.g., see authenticity and trustworthiness in Lincoln & Guba, 1985), and the strategies used to validate qualitative accounts vary in number (see eight forms in Creswell & Miller, 2000). Our attention here will be on three primary forms typically used by qualitative researchers: triangulation, member checking, and auditing.

• Qualitative inquirers triangulate among different data sources to enhance the accuracy of a study. Triangulation is the process of corroborating evidence from different individuals (e.g., a principal and a student), types of data (e.g., observational field notes and interviews), or methods of data collection (e.g., documents and interviews) in descriptions and themes in qualitative research. The inquirer examines each information source and finds evidence to support a theme. This ensures that the study will be accurate because the information draws on multiple sources of information, individuals, or processes. In this way, it encourages the researcher to develop a report that is both accurate and credible.

Credibility and Trustworthiness of Action Research

In this section we describe five validity criteria proposed by Gary Anderson and Kathryn Herr for use in evaluating the credibility and trustworthiness of action research studies. For each criterion we also provide an example of an action research study that we believe addresses that criterion well. All these studies happen to involve collaboration between practitioners and academics, although our intention in their selection was to identify a variety of clearly described, well-designed, recent studies that include actions involving documented improvements in educators' practice, their effect on clients, or both. In designing action research studies or reading others' studies, we agree with Zeichner and Noffke's recommendation that practitioners not rely solely on existing models, but instead develop and apply their own criteria to improve and evaluate research quality.

1. Outcome Validity
Outcome validity involves the extent to which actions occur that lead to a resolution of the problem under study or to the completion of a research cycle that results in action.

2. Process Validity
Process validity addresses the adequacy of the processes used in different phases of research, such as data collection, analysis, and interpretation, and whether triangulation of data sources and methods was used to guard against bias.

3. Democratic Validity
Democratic validity indicates the extent to which the research is done in collaboration with all parties who have a stake in the problem under investigation and to which multiple perspectives and interests are taken into account.

4. Catalytic Validity
Catalytic validity examines the degree to which the action research energizes the participants so that they are open to transforming their view of reality in relation to their practice, and highlights the emancipatory potential of practitioner research.

5. Dialogic Validity
Dialogic validity is an assessment of the degree to which the research promotes a reflective dialogue among all the participants in the research, to generate and review the action research findings and interpretations. Another aspect of dialogic validity is shown by citations to other work by both authors, who have written about and presented papers on global education to their professional peers.

Are Scores on Past Use of the Instrument Reliable and Valid?

You want to select an instrument that reports individual scores that are reliable and valid. Reliability means that scores from an instrument are stable and consistent. Scores should be nearly the same when researchers administer the instrument multiple times at different times. Also, scores need to be consistent. When an individual answers certain questions one way, the individual should consistently answer closely related questions in the same way. Validity, however, means that the individual's scores from an instrument make sense, are meaningful, and enable you, as the researcher, to draw good conclusions from the sample you are studying to the population.

Reliability and validity are bound together in complex ways. These two terms sometimes overlap and at other times are mutually exclusive. Validity can be thought of as the larger, more encompassing term when you assess the choice of an instrument. Reliability is generally easier to understand as it is a measure of consistency. To comprehend these concepts more fully, it may be necessary to untangle their relationship. If scores are not reliable, they are not valid; scores need to be stable and consistent first before they can be meaningful. Additionally, the more reliable the scores from an instrument, the more valid the scores may be (however, scores may still not measure the particular construct and may remain invalid). The ideal situation exists when scores are both reliable and valid.

Reliability

A goal of good research is to have measures or observations that are reliable. Several factors can result in unreliable data, including when:

• Questions on instruments are ambiguous and unclear
• Procedures of test administration vary and are not standardized
• Participants are fatigued, are nervous, misinterpret questions, or guess on tests (Rudner, 1993).

Researchers can use any one or more of five available procedures to examine an instrument's reliability. You can distinguish these procedures by the number of times the instrument is administered, the number of versions of the instrument administered by researchers, and the number of individuals who make an assessment of information.

1. The test-retest reliability procedure examines the extent to which scores from one sample are stable over time from one test administration to another. To determine this form of reliability, the researcher administers the test at two different times to the same participants at a sufficient time interval. If the scores are reliable, then they will relate (or will correlate) at a positive, reasonably high level, such as 0.6. This approach has the advantage of requiring only one form of the instrument; however, an individual's scores on the first administration of the instrument may influence the scores on the second administration. Consider this example: A researcher measures a stable characteristic, such as creativity, for sixth graders at the beginning of the year. When the characteristic is measured again at the end of the year, the researcher assumes that the scores will be stable during the sixth-grade experience. If scores at the beginning and the end of the year relate, there is evidence for test-retest reliability.

2. Another approach is alternative forms reliability. This involves using two instruments, both measuring the same variables and relating (or correlating) the scores for the same group of individuals to the two instruments. In practice, both instruments need to be similar, with the same content, same level of difficulty, and same types of scales. Thus, the items for both instruments represent the same universe or population of items. The advantage of this approach is that it allows you to see if the scores from one instrument are equivalent to scores from another instrument, for two instruments intended to measure the same variables. The difficulty, of course, is whether the two instruments are equivalent in the first place. Assuming that they are, the researchers relate or correlate the items from the one instrument with its equivalent instrument. Examine this example: An instrument with 45 vocabulary items yields scores from first graders. The researcher compares these scores with those from another instrument that also measures a similar set of 45 vocabulary items. Both instruments contain items of approximately equal difficulty. When the researcher finds the items to relate positively, we have confidence in the accuracy or reliability of the scores from the first instrument.

The alternate forms and test-retest reliability approach is simply a variety of the two previous types of reliability. In this approach, the researcher administers the test twice and uses an alternate form of the test from the first administration to the second. This type of reliability has the advantages of both examining the stability of scores over time and having the equivalence of items from the potential universe of items. It also has all of the disadvantages of both test-retest and alternate forms of reliability. Scores may reflect differences in content or difficulty or in changes over time. An example follows: The researcher administers the 45 vocabulary items to first graders twice at two different times, and the actual tests are equivalent in content and level of difficulty. The researcher correlates or relates scores to both tests and finds that they correlate positively and highly. The scores to the initial instrument are reliable.

3. Interrater reliability is a procedure used when making observations of behavior. It involves observations made by two or more individuals of an individual's or several individuals' behavior. The observers record their scores of the behavior and then compare scores to see if their scores are similar or different. Because this method obtains observational scores from two or more individuals, it has the advantage of negating any bias that any one individual might bring to scoring. It has the disadvantages of requiring the researcher to train the observers and requiring the observers to negotiate outcomes and reconcile differences in their observations, something that may not be easy to do. Here is an example: Two observers view preschool children at play in their activity center. They observe the spatial skills of the children and record on a checklist the number of times each child builds something in the activity center. After the observations, the observers compare their checklists to determine how close their scores were during the observation. Assuming that their scores were close, they can average their scores and conclude that their assessment demonstrates interrater reliability.
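The procedures described so far (test-retest, alternate forms, and comparing two observers) all come down to relating two sets of scores for the same individuals. The sketch below shows one common way to do that, a Pearson correlation coefficient; the text does not name a specific coefficient, so that choice is an assumption, as are the helper name `pearson_r` and the invented checklist counts for five children.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical counts: times each of five children built something,
# as recorded independently by two observers.
observer_1 = [3, 5, 2, 6, 4]
observer_2 = [3, 4, 2, 6, 5]

r = pearson_r(observer_1, observer_2)
print(f"correlation between observers: {r:.2f}")
```

The same function applies to test-retest data (first versus second administration) or alternate forms (form A versus form B). Note that a high correlation shows that the two sets of scores rank individuals similarly; it does not by itself show that the observers assigned identical counts.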
Scores from an instrument are reliable and accurate if an individual's scores are internally consistent across the items on the instrument. If someone completes items at the beginning of the instrument one way (e.g., positive about negative effects of tobacco), then they should answer the questions later in the instrument in a similar way (e.g., positive about the health effects of tobacco). The consistency of responses can be examined in several ways. One way is to split the test in half and relate or correlate the items. This test is called the Kuder-Richardson split half test (KR-20, KR-21) and it is used when (a) the items on an instrument are scored right or wrong as categorical scores, (b) the responses are not influenced by speed, and (c) the items measure a common factor. Since the split half test relies on information from only half of the instrument, a modification in this procedure is to use the Spearman-Brown formula, which estimates full-length test reliability using all questions on an instrument. This is important because the reliability of an instrument increases as researchers add more items to the instrument. Finally, the coefficient alpha is used to test for internal consistency (Cronbach, 1984). If the items are scored as continuous variables (e.g., strongly agree to strongly disagree), the alpha provides a coefficient to estimate consistency of scores on an instrument.
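The two internal-consistency calculations mentioned above can be sketched in a few lines. The formulas used are the standard textbook forms, which the text itself does not spell out: the Spearman-Brown correction for a test split into two halves, r_full = 2 * r_half / (1 + r_half), and coefficient alpha, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), for k items. The function names and the response data are hypothetical.

```python
from statistics import pvariance


def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation between
    two halves of a test (standard Spearman-Brown correction)."""
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, where each
    respondent is a list of item scores of the same length."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: four respondents, three items scored 1 to 5
# (e.g., strongly disagree to strongly agree).
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 4, 5],
    [1, 2, 2],
]

full_length = spearman_brown(0.6)
alpha = cronbach_alpha(responses)
print(f"Spearman-Brown estimate from a half-test r of 0.6: {full_length:.2f}")
print(f"coefficient alpha: {alpha:.2f}")
```

The Spearman-Brown estimate is higher than the half-test correlation it starts from, which reflects the point made above that reliability tends to increase as items are added. Population variance is used for both the item and total variances here; using sample variance throughout gives the same alpha, because the correction factor cancels in the ratio.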
