Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to
Statistical Science.
Statistical Science
1988, Vol. 3, No. 2, 149-195

Employment Discrimination and Statistical Science
Arthur P. Dempster
employee. The framework was used to describe a collection of obstacles facing a statistician seeking to infer the presence or absence of discrimination, or the extent of discrimination, practiced by an employer against a legally protected class of employees such as women, blacks or older persons. I also described a controversy over which of two forms of regression analysis, called direct regression and reverse regression, is the correct form to apply to employment data when seeking to estimate a discrimination effect, and I suggested that my framework could be used to resolve the controversy. Pratt (1986) concluded a brief discussion of Dempster (1984) with the remark, "But we must keep striving toward sensible modes of using statistics in legal and public arenas -- think of the alternatives!" The present analysis attempts constructive development of the ideas in the previous paper.

The goal of Section 2 is to define a mode of statistical thinking that I believe to be appropriate and sufficient to deal with the problem. Then, in Section 3, I take the modeling process back to first principles, and set out the reasoning behind the type of model which I advocate. Simple mathematical analysis of my model indicates that either direct or reverse regression gives unbiased estimates of discrimination effects if corresponding assumptions hold, but that in general neither is valid, because the correct analysis depends on the value of a parameter that captures an essential aspect of the real world, but whose value cannot be estimated from the statistician's data. Two scientific hurdles are raised by this situation. One involves teaching statisticians and others to see clearly and explicitly that statistical analysis on its own rarely offers complete solutions to externally specified problems. Instead, statistical analysis brings us to the point of seeing clearly what the gap in our knowledge is, so that we may then address the second hurdle, namely, the difficult task of looking outside the data for evidence bearing on the missing information. Section 3 also relates the new models to more traditional descriptions of econometric models as typified by Goldberger (1984), and argues that Goldberger's framework locks in arbitrary assumptions which prejudice his conclusions and lead him to propose an irrelevant test for the validity of reverse regression. Finally, in Section 4, I discuss a conceptual dilemma, which can scarcely be avoided, either by employers or regulators. The question is: what to do if economic efficiency and legal avoidance of discrimination collide? I call this the problem of judgmental discrimination. I suggest that the model of Section 3 provides a basis for analyzing the dilemma, and leads to policy attitudes consistent with fair and reasonable ways to handle the problem.

Causal thinking is intrinsic to the issues. Is discrimination causing reduced pay for certain employees? Are actions or decisions by an employer causing employees to suffer ill effects of discrimination? Statisticians and their clients clearly need shared working principles leading to agreed understanding of potential causal inferences. One set of such principles, ably reviewed by Holland (1986a), represents a major intellectual contribution of statistics to science. The idea is to identify randomization-based studies as situations where causal inferences may be soundly drawn, and to evaluate other attempts at causal inference by judging whether their nonrandomized circumstances adequately conform to critical assumptions which are transparently satisfied in randomized studies. Statisticians such as Pratt and Schlaifer (1984) or Holland (1986a) are skeptical that many claimed causal effects in econometric or other social science models can in fact meet the assumptions.

Because the data sets on which claims of employment discrimination are usually based are administrative records, and therefore about as far from randomized studies as one can get, it might seem that statisticians concerned about their reputations for scientific credibility would declare causal inference impossible or worse, and leave the matter there. Nevertheless, while I accept the cogency and importance of the negative arguments, I believe that we bear professional responsibility for carrying the discussion further, but along lines complementary to the main issues raised so far.

Holland (1986a) classified discussions of causation into those addressing the meaning of the term cause, those attempting to understand or establish causal mechanisms in relation to a specific class of phenomena and those trying to identify and measure causal effects in specific situations. Although only the third type is directly concerned with inferring causal effects, implied attitudes and understandings involving the other two types must underlie any specific causal analysis of statistical data. For my views on the operational meaning of causation, I refer to Dempster (1987). An essential point is that causal language implies the presence of some action mechanism operating in the real world and yielding consequences called causal effects. The second type of discussion picks up from here and asks, for example, in an employment discrimination case, what are the causal mechanisms operating to determine rewards of employment? My claim that the analysis of Section 3 below makes a contribution to the statistics of employment discrimination rests on the adoption of an explicit view of the basic mechanism of reward determination which is at best left implicit in traditional econometric models.

The distinction does not involve the discrimination mechanism itself, because it seems safe and reasonable to regard the discrimination causal effect as a quantity
Krasker and Pratt, 1984). I regard the other two arguments as specious. Causal interpretation is essential to inferring discrimination as a cause and cannot be taken lightly. There is no logical connection between whether or not a parameter exists and is important in the real world, and whether or not we are lucky enough to have data permitting estimation of the parameter. If the data are not there, the only course open to the objectively oriented Bayesian is to seek better data, or, much the same, to seek objective sources of prior distributions.

Despite these negative remarks, there are excellent reasons for beginning a study by fitting model (1) and estimating $\alpha$. One reason is to find out what the data can legitimately tell us, namely, how well $Y$ can be predicted from $X$. A second reason is more technical. It turns out to be easier to give interpretable mathematical expressions for the bias $\alpha - \alpha^*$ than for $\alpha^*$ directly. This is due to the simplicity of the process of adding variables to a regression analysis.

If we think hypothetically of carrying out the regression analysis indicated by (3*) in two stages, first regressing on $X$, and second bringing in the information in $X^*$ not contained in $X$, we may write

(8)    $X^*\beta^* = X_1 + X_2$

where $X_1$ has the form $X\beta'$, while $X_2$ has the form $X^*\beta''$ and the linear compounds represented by $X_2$ are uncorrelated with $X$. Substituting (8) into (6) leads to

(9)    $Y = G\alpha^* + X_1 + X_2,$

hence comparison with (1) shows that $\beta' = \beta$ so that

(10)    $X_1 = X\beta,$

and

(11)    $G\alpha^* + X_2 = G\alpha + e.$

If we denote the male and female population means of $X_2$ by $\mu_{M2}$ and $\mu_{F2}$, and take population averages of (11) for $G = 1$ and $G = 0$, then, because $e$ has zero means for both gender groups, we find that $\mu_{F2} = 0$ and $\alpha = \alpha^* + \mu_{M2}$, hence

(12)    $\alpha = \alpha^* + (\mu_{M2} - \mu_{F2}).$

In words, the bias from using $\alpha$ in place of $\alpha^*$ is given by the difference of male and female population means of the additional predictive variable $X_2$ known to the employer but not to the statistician. The reason for retaining $\mu_{F2}$ in (12) is that the condition $\mu_{F2} = 0$ is an artefact of the particular choice of gender indicator $G$. Formula (12) holds for any gender indicator such that the male-female difference has absolute value unity.

The statistician's data $Y$ and $X$ provide $\alpha$ and $X_1$, assuming effectively infinite sample size, but is devoid of information on $X_2$, and thence on $\mu_{M2}$ and $\mu_{F2}$. It is easy to construct artificial $X^*$ consistent with given data $Y$ and $X$ such that $\mu_{M2}$ and $\mu_{F2}$ have arbitrary values. Such arbitrary values may often be unreasonable on a priori grounds, but there is no magic cure to be found in the data for the bias in (12). In this sense, we know that it is hopeless to try to solve the problem by replacing the traditional "direct" regression based on (1) with an alternative form called "reverse" regression, whatever the definition of reverse regression. Still, the reverse regression story is fascinating, and leads me to conclude that reverse regression has a possible role in helping the statistician who is serious about developing a prior (= posterior) distribution for the bias $\alpha - \alpha^*$.

The original motivation for the statistical method called reverse regression, as well as for the contrasting terms direct and reverse, comes from contrasting definitions of "fairness" which are virtually free of stochastic or causal modeling assumptions. Both approaches agree that each employee should be paid exactly what he or she deserves, and then ask for a substitute principle to be used in the real world where such perfection is not achievable. In the first approach, the principle is to require that, given equal qualifications, males and females should be paid the same on average. Ordinary, or direct, regression based on (1) is seen as a means to obtain a suitable qualifications measure to be used as a practical standard for judging such equality of pay averages over gender groups. It follows that $\alpha = 0$ is the criterion for no discrimination, and $\alpha$ is the amount to be added to female pay to achieve parity with males. This simple line of reasoning sounds appealing, until it is realized that the choice of standard is far from innocuous. For example, I also support the principle, but with the more appropriate $X^*\beta^*$ used in place of $X\beta$. The second approach reverses the roles of pay and qualifications and suggests that the criterion for no discrimination should be that males and females with given pay should on average have equal qualifications measure. Statisticians contemplating reverse regression are also naturally drawn to $X\beta$ from (1) as a suitable composite qualifications measure, hence testing for discrimination requires looking at the "reverse" regression of $X\beta$ on $Y$ and $G$. If the gender groups have parallel but different regression lines, as implied by the assumption of common covariance structures in the two gender groups, the remedy is to adjust each female's pay by the amount required to bring the regression lines into conformity. The problem is that the required shift is different in the case of reverse regression from the case of direct regression.

All statisticians know that the regression lines of $Y$ on $X$ and of $X$ on $Y$ have different slopes, but the
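The bias formula (12) lends itself to a quick numerical check. The following is a minimal sketch and not part of the original analysis: it assumes a single scalar qualification variable X, normally distributed variables, and a hidden component X2 drawn independently of X within each gender group. All numerical values (`alpha_star`, `beta`, `mu_M2`, `mu_F2`, the standard deviations) are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical parameter values, chosen only for illustration.
alpha_star = 0.2          # employer's gender effect, alpha* in (9)
beta = 1.0                # coefficient of the statistician's variable X
mu_M2, mu_F2 = 0.5, 0.0   # group means of the hidden component X2

# G = 1 for males, G = 0 for females, as in the averaging of (11).
G = rng.integers(0, 2, size=n).astype(float)

# X: qualification variable available to the statistician.
X = rng.normal(loc=1.0 * G, scale=1.0)

# X2: information known only to the employer; assumed independent of X
# within each gender group, with group means mu_M2 and mu_F2.
X2 = rng.normal(loc=np.where(G == 1.0, mu_M2, mu_F2), scale=0.7)

# Reward mechanism as in (9), with X1 = X * beta.
Y = G * alpha_star + X * beta + X2

# Direct regression of Y on an intercept, G and X, as in model (1).
design = np.column_stack([np.ones(n), G, X])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
alpha_hat = coef[1]

# Value predicted by formula (12).
predicted = alpha_star + (mu_M2 - mu_F2)
print(f"direct estimate: {alpha_hat:.3f}, formula (12): {predicted:.3f}")
```

Under these assumptions the fitted coefficient of G should settle near `alpha_star + (mu_M2 - mu_F2)` rather than near `alpha_star`, and nothing in `Y`, `X` and `G` alone reveals the size of the gap.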
by Harry Roberts in the summer of 1979, at a time when he was carrying out massive analyses of employee records at the Harris Bank in Chicago, in preparation for a hearing on charges that the bank was practicing discrimination. There has been a long subsequent literature, including Conway and Roberts (1983), with subsequent commentaries and rejoinder in the April 1984 issue of the Journal of Business and Economic Statistics, and Goldberger (1984). The key point is that direct and reverse regressions often give conflicting messages. In situations where males are on average more qualified than females, it often happens that the average male salary exceeds the average female salary among employees with a given qualifications measure, suggesting discrimination against females, whereas simultaneously among employees with a given salary the average qualification measure of males exceeds the average qualification measure of

A comparison of (12) and (15) shows clearly how a change from the "direct" assessment $\alpha$ to the "reverse" assessment $\alpha_R$ can easily switch the sign of the effect. Note that all the quantities appearing in the additional term $[T_2/T_1](\mu_{M1} - \mu_{F1})$ are determined by the statistician's data $Y$, $X$ and $G$, so that the problem of assessing the bias in $\alpha_R$ is no different from the problem of assessing the bias in $\alpha$, i.e., for a personalist it is still the problem of assessing a probability distribution for $(\mu_{M2} - \mu_{F2})$, given whatever evidence can be brought to bear on the question. I do think that some statisticians facing this admittedly formidable task would be helped by having the additional quantities associated with reverse regression in view. In particular, the condition that $\alpha_R$ is unbiased can be written

(16)    $\mu_{M2} - \mu_{F2} = [T_2/T_1](\mu_{M1} - \mu_{F1}).$
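The sign switch between the "direct" and "reverse" assessments can be reproduced in a small simulation. The sketch below is an illustration, not part of the original analysis. It assumes a scalar X, normal variables, and a reverse assessment defined as the pay shift that brings the two parallel reverse regression lines into conformity, computed as `-gamma / b` from the fit of the composite qualifications measure on `Y` and `G`. Equation (15) is not reproduced in this passage, so the closed form in the code comment is derived only for this simulated setup, and reading $T_2/T_1$ as the ratio of within-group variances of $X_2$ and $X_1$ is an assumption. The function name and all numerical values are hypothetical.

```python
import numpy as np


def direct_and_reverse(Y, X, G):
    """Return (alpha_direct, alpha_reverse) from two least-squares fits."""
    ones = np.ones_like(Y)

    # Direct regression: Y on an intercept, G and X.
    coef_d, *_ = np.linalg.lstsq(np.column_stack([ones, G, X]), Y, rcond=None)
    alpha_d, beta_hat = coef_d[1], coef_d[2]

    # Reverse regression: composite qualifications X * beta_hat on
    # an intercept, G and Y.
    X1 = X * beta_hat
    coef_r, *_ = np.linalg.lstsq(np.column_stack([ones, G, Y]), X1, rcond=None)
    gamma, b = coef_r[1], coef_r[2]

    # Pay shift that aligns the male and female reverse regression lines.
    alpha_r = -gamma / b
    return alpha_d, alpha_r


rng = np.random.default_rng(1)
n = 200_000

# Hypothetical setup: males more qualified on average (mean difference 1.0),
# employer effect alpha* = 0.2, hidden component X2 with equal group means.
G = rng.integers(0, 2, size=n).astype(float)
X = rng.normal(loc=1.0 * G, scale=1.0)    # within-group variance 1.0
X2 = rng.normal(loc=0.0, scale=0.7, size=n)  # within-group variance 0.49
Y = 0.2 * G + 1.0 * X + X2

alpha_d, alpha_r = direct_and_reverse(Y, X, G)

# In this setup the direct estimate should be near 0.2, and the reverse
# assessment near 0.2 - (0.49 / 1.0) * 1.0 = -0.29 (derived for this
# simulation only; an assumed reading of the additional term).
print(f"direct: {alpha_d:.3f}, reverse: {alpha_r:.3f}")
```

With males more qualified on average and equal group means of the hidden component, the direct fit points to a positive effect while the reverse fit points to a negative one, even though both are computed from the same `Y`, `X` and `G`.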
Comment
Franklin M. Fisher