
Unobserved Product Differentiation in Discrete-Choice Models: Estimating Price Elasticities and Welfare Effects
Author(s): Daniel A. Ackerberg and Marc Rysman
Source: The RAND Journal of Economics, Vol. 36, No. 4 (Winter, 2005), pp. 771-788
Published by: Wiley on behalf of RAND Corporation
Stable URL: http://www.jstor.org/stable/4135256
Accessed: 20/12/2013 10:52

RAND Journal of Economics
Vol. 36, No. 4, Winter 2005
pp. 771-788

Unobserved product differentiation in discrete-choice models: estimating price elasticities and welfare effects

Daniel A. Ackerberg* and Marc Rysman**

Commonly used discrete-choice models such as logit, nested logit, and random-coefficients models place very strong restrictions on how unobservable characteristic space changes with the number of products. We argue (and show with Monte Carlo experiments) that these restrictions can lead to biased estimates of price elasticities and the welfare consequences from additional products. In addition, these restrictions can identify parameters that are not intuitively identified given the data at hand. We suggest an alternative model that does not have these properties and present a structural interpretation of the model. Monte Carlo experiments and an empirical example show that this issue can be important in practice.

1. Introduction
The recent literature in applied economics, and empirical Industrial Organization in particular, has often turned to discrete-choice models to estimate demand for differentiated products or different alternatives. In these models, consumer utility functions, market shares, and substitution patterns depend on product characteristics that are observed by the econometrician. In addition, these models typically allow for unobserved product characteristics through the inclusion of some form of "symmetric unobserved product differentiation" (SUPD).1 The most common examples of SUPD are logit errors in consumers' utility functions (see McFadden, 1974).

The economic justification for including unobservable product differentiation is that an econometrician typically does not observe all of the product characteristics that are relevant to consumers' choices. From an econometric standpoint, allowing for unobservable product
* University of California at Los Angeles; ackerber@econ.ucla.edu.
** Boston University; mrysman@bu.edu.
The authors wish to thank seminar audiences at the Federal Reserve Bank, Board of Governors, Stanford Graduate School of Business, UCLA, Carnegie Mellon, UBC, Johns Hopkins, Georgetown, SITE, the Department of Justice, and the Econometric Society meetings. We also wish to thank Aviv Nevo, Ariel Pakes, two referees, and the Editor for their comments. Particular thanks are due to Anne Hall for her important comments on a later draft of the article. Rysman received financial support from NSF grant no. SES-0112527.
1 Notable exceptions are Bresnahan (1987), Feenstra and Levinsohn (1995), and Leslie (2004).
Copyright © 2005, RAND.


differentiation often prevents these models from predicting zero market shares. Its inclusion can also ease estimation.

We argue that while SUPD in itself may be helpful, commonly used models (e.g., logit models, probit models, nested logit models, and the random-coefficient models of Berry, Levinsohn, and Pakes (1995; henceforth BLP)) implement it in an undesirable way. These models assume that each product added to the market adds one additional dimension to SUPD space. This feature results in very little "congestion" in unobserved characteristic space and can be problematic in situations where different consumers face different numbers of products, because consumers are drawn either from different geographies or from different time periods.2 Researchers may intuitively think that in markets with more products, unobservable characteristic space should "fill up" in some sense. These standard models place strong restrictions on how this occurs.

We show that these restrictions play a major role in econometric identification of two of the major quantities of interest in differentiated-product markets. First are the welfare effects of new products. This problem is one that has been recognized, e.g., in Trajtenberg (1990), Petrin (2002), Berry and Pakes (1999), and Bajari and Benkard (2001). Because of the lack of crowding in the standard treatment of SUPD, welfare calculations in standard models tend to overpredict gains from the introduction of new products. This problem has potentially serious implications for policy issues such as the construction of price indices.3 Second and less recognized are the implications of SUPD on estimated substitution patterns. We argue that using the standard versions of SUPD can lead to misleading econometric conclusions regarding price elasticities, in terms of both magnitudes and statistical significance. Restrictions of standard SUPD force variation in the number of products in the choice set to identify (or help identify) price elasticities.
Interestingly, we show that with these restrictions, one can often "identify" price elasticities without observing meaningful variation in prices. This source of identification relies entirely on assumptions about unobservable characteristic space. These assumptions are even more unreliable if, as is often the case, "defining" different products has some arbitrariness to it.4

There are two previous approaches in the literature that address these issues. The first set of work (e.g., BLP (1995) and Petrin (2002)) tries to reduce the importance of SUPD by linking substitution patterns to observable continuous characteristics (e.g., BLP) or observed groupings (e.g., the nested logit). This approach keeps SUPD (e.g., logit errors) in the model but attempts to reduce its importance. These methodologies work to the extent that the econometrician observes the relevant product characteristics. However, as inflexible SUPD still exists in these models, its effects can still exist.5

A second and more recent approach, advocated by Berry and Pakes (1999) and Bajari and Benkard (2002), eliminates SUPD altogether from the model.6 In their "pure hedonic" models, products are unobservably differentiated only with respect to a single dimensional unobserved characteristic. As new products enter, this unobserved characteristic space becomes more crowded. While these approaches are intuitively very attractive in the sense that there are no ad hoc logit
2 There are many examples. Berry and Waldfogel (1999), Crawford (2000), Arcidiacono (2005), and Rysman (2004) face cross-sectional variation in the number of available products. Berry, Levinsohn, and Pakes (1995), Bresnahan, Stern, and Trajtenberg (1997), Petrin (2002), and Crawford and Shum (2005) face temporal variation. Nevo (2001), Town and Liu (2003), and Shum (1994) face both. This list is far from exhaustive.
3 While we believe that our methodologies will provide "better" estimates of these welfare effects, the welfare gains of any new product will depend on the shape of the demand curve at very high prices. Thus, unless one observes a wide range of prices, any calculation of welfare gains is going to rely on fairly "structural" assumptions about the upper portion of the demand curve.
4 For example, with cars and computers, the empirical definition of what constitutes a "choice" clearly has some arbitrariness to it (e.g., BMW 3 Series versus (BMW 330, BMW 325) versus (BMW 330i, BMW 330Ci, BMW 330Ci convertible)).
5 In addition to logit errors, these approaches typically allow for a scalar unobserved product characteristic corresponding to each product. However, as these scalar unobserved product characteristics are equally valued by all consumers, they do not play a major role in determining the extent of product differentiation.
6 Feenstra and Levinsohn (1995) also estimate a multidimensional pure hedonic model, albeit without any unobserved characteristics.

errors, they are also either more computationally intensive (Berry and Pakes) or more data intensive (Bajari and Benkard) than standard models including a logit error.7

This article suggests a third approach, which we interpret as somewhat of a compromise between the above two. We argue that it is the unnecessary inflexibility of standard logit errors that can adversely affect estimates of parameters of interest such as substitution patterns and welfare effects. As such, we keep logit errors in our model but allow them to be considerably more flexible than is currently done. This flexibility allows an econometrician to estimate how fast unobserved characteristic space expands with the addition of new products, not assume it as prior work does. In practice, our approach simply puts functions of the number of products in a market (and/or the number of products in a group or nest) into the discrete-choice estimating equation. We show that this model has a structural interpretation: one where new products crowd out existing products in retail store or shelf space. Although this model of "crowding out" is very stylized, it is intuitive and captures phenomena that we believe actually occur in markets.8 Our flexible logit error imposes no additional computational burden, and thus our approach is considerably simpler to implement than Berry and Pakes (1999) as well as less data intensive than Bajari and Benkard (2002). On the other hand, those with a more structural leaning might prefer their methods in that they completely eliminate ad hoc logit errors, while we only make them more flexible.9

We proceed as follows. In Section 2 we argue (1) that traditional discrete-choice models place unnecessary restrictions on SUPD, (2) that these restrictions can "identify" parameters that intuitively should not be identified, and (3) that these restrictions can bias estimates of parameters of interest.
Section 3 introduces our model of product congestion and discusses estimation. In Section 4 we present Monte Carlo results showing that in the presence of product congestion, standard estimation procedures can give biased results (sometimes very large) and that these biases tend to be in particular directions. Section 5 applies the estimator to data on Yellow Pages demand from Rysman (2004). We find that the adjustments significantly affect predictions. Section 6 discusses a multiplicative adjustment, which provides many of the same benefits for estimation with a slightly different theoretical motivation.

Lastly, note that our applications are largely focused on the context of estimating aggregated discrete-choice models. These models are typically estimated on data across markets (in space or time) where one often observes changes in the size of the choice set and where our concerns are relevant. However, our comments and techniques are equally applicable for discrete-choice models estimated on individual-level data (e.g., product, employment, or transportation choice) when there are changes in the choice set over individuals or time.

7 Another possible critique of these hedonic models is that while unobserved characteristic space may expand too much with logit errors, it may expand too little with the pure hedonic models, at least when unobserved characteristic space is modelled as one-dimensional.
8 For example, retail stores often sell only a small subset of the available wholesale products. Computer retailers, e.g., typically display between 10 and 30 computers, while the total number of wholesale computers available in a given year is between 150 and 250 (Pakes, 2003). Presumably, this is due to the costs of retail space.
9 An interesting issue is to what extent these various models are observationally equivalent in terms of market shares, market share derivatives, or market share changes with new products (see, e.g., Anderson, DePalma, and Thisse (1992) and McFadden and Train (2000)). Regardless of observational equivalence, the fact that the literature has gravitated toward using models including logit errors makes our results relevant.

2. Unobserved differentiation in common discrete-choice models
This section argues that assumptions about unobservable characteristic space used in traditional discrete-choice models are restrictive, and that this leads to undesirable identification results. We briefly suggest our solution to the problem, which is formalized and further motivated in Section 3. We use the nested logit model to formally illustrate our identification points, but we discuss the extension of our arguments to a full random-coefficients model.

Identification. We start by using derivative-based identification arguments to show how the nested logit model handles economically interesting variation in a restrictive way. For exposition,


assume there are $J$ products and an outside option, labelled product 0. The $J$ products are in one group (nest) $g$ and the outside option is in a group by itself. In the nested logit model, the utility obtained by consumer $i$ from product $j$ ($j > 0$) is
$$ u_{ij} = \beta_0 + X_j\beta_1 + \nu_{ig}(\sigma) + \epsilon_{ij}, $$

where $\epsilon_{ij}$ is distributed type-I extreme value, and $\nu_{ig}(\sigma)$ is constant for each individual within the nest and distributed such that $\nu_{ig}(\sigma) + \epsilon_{ij}$ is distributed type-I extreme value (see Cardell, 1997). Note that $\epsilon_{ij}$ represents consumer $i$'s idiosyncratic taste for good $j$ and $\nu_{ig}$ represents $i$'s idiosyncratic taste for products in group $g$. As is standard, we assume $u_{i0} = \nu_{i0}(\sigma) + \epsilon_{i0}$, normalizing the "mean" utility of the outside option to zero. The parameter $\sigma \in [0, 1]$ measures the correlation in unobserved utility among products in the nest. Lower values of $\sigma$ imply stronger within-group substitution relative to across-group substitution (in this case, substitution to the outside alternative). In what follows, we interpret $X_j$ as the price of product $j$, but our arguments trivially apply to elasticities with respect to general product characteristics.10

Denote the market share for firm $j$ as $s_j$, the market share for the entire group of inside products as $s_g$, and the market share for $j$ within group $g$ as $s_{j|g}$. We then have (Cardell, 1997)11,12

$$ s_{j|g} = \frac{\exp(\beta_0 + X_j\beta_1)}{\sum_{k=1}^{J}\exp(\beta_0 + X_k\beta_1)}, \qquad s_g = \frac{D^{\sigma}}{1 + D^{\sigma}}, \qquad s_j = s_{j|g}\,s_g, \qquad \text{where } D = \sum_{k=1}^{J}\exp(\beta_0 + X_k\beta_1). \tag{1} $$
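The share formulas in (1) can be sketched in a few lines of code. This is only an illustrative sketch: the function name `nested_logit_shares`, the prices, and the parameter values are assumptions made here, not part of the article.

```python
import numpy as np

def nested_logit_shares(x, beta0, beta1, sigma):
    """Shares from equation (1): one inside nest plus an outside option."""
    x = np.asarray(x, dtype=float)
    v = np.exp(beta0 + beta1 * x)            # exp(beta_0 + X_j * beta_1) per product
    D = v.sum()                              # inclusive term D
    s_within = v / D                         # within-group shares s_{j|g}
    s_group = D**sigma / (1.0 + D**sigma)    # group share s_g
    return s_within * s_group, s_within, s_group

# Hypothetical prices and parameters, chosen only for illustration.
prices = [1.0, 1.5, 2.0]
s_j, s_within, s_group = nested_logit_shares(prices, beta0=0.5, beta1=-1.0, sigma=0.6)
s_outside = 1.0 - s_group                    # share of the outside option
```

With $\sigma = 1$ the group share becomes $D/(1 + D)$, so the product shares collapse to the plain logit formula; smaller values of $\sigma$ make the group share less responsive to $D$.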

There are three forms of variation in data that identify the parameters $\beta_1$ and $\sigma$ in the nested logit model.13 The first type is variation in within-group market shares due to changes in observable product characteristics. Looking at the derivative corresponding to this type of variation tells us what parameters are identified by the variation.14 The derivative is

$$ \frac{\partial s_{j|g}}{\partial X_j} = \beta_1 s_{j|g}(1 - s_{j|g}), \tag{2} $$

suggesting that this type of variation identifies $\beta_1$. The second and third types of variation are changes in group market shares ($s_g$) due to (1) changes in observable product characteristics and (2) changes in the number of products. To focus on group-level changes, assume $X_j = X\ \forall j$. In that case, the derivatives of group shares with respect to $X$ and $J$ are
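$$ \frac{\partial s_g}{\partial X} = \sigma\beta_1 s_g(1 - s_g), \qquad \frac{\partial s_g}{\partial J} = \frac{\sigma}{J}\, s_g(1 - s_g). \tag{3} $$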
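As a rough numerical check of the group-share derivatives in (3), the sketch below compares them with central finite differences. The function name `group_share` and all parameter values are hypothetical, and $J$ is treated as continuous, as the derivative itself does.

```python
import math

def group_share(J, X, beta0, beta1, sigma):
    """s_g from equation (1) when all J inside products share the same X."""
    D = J * math.exp(beta0 + beta1 * X)
    return D**sigma / (1.0 + D**sigma)

# Hypothetical parameter values for illustration only.
beta0, beta1, sigma, X, J = 0.2, -1.0, 0.5, 1.0, 4.0
h = 1e-6

s_g = group_share(J, X, beta0, beta1, sigma)

# Central finite differences in X and in J.
ds_dX_numeric = (group_share(J, X + h, beta0, beta1, sigma)
                 - group_share(J, X - h, beta0, beta1, sigma)) / (2 * h)
ds_dJ_numeric = (group_share(J + h, X, beta0, beta1, sigma)
                 - group_share(J - h, X, beta0, beta1, sigma)) / (2 * h)

# Closed forms from (3).
ds_dX_formula = sigma * beta1 * s_g * (1 - s_g)
ds_dJ_formula = (sigma / J) * s_g * (1 - s_g)
```

In this sketch the ratio of the two closed-form derivatives is $\beta_1 J$, which does not involve $\sigma$: once $\beta_1$ is known, the response of the group share to $X$ pins down its response to $J$.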

The derivatives in (3) suggest that there are two sources of identification for $\sigma$: cross-group switching from changes in the number of products and cross-group switching from changes in observed characteristics. Note that there are also two sources of identification for $\beta_1$: within-group switching from changes
10 We ignore endogeneity issues regarding price, which has been a focus of the prior literature. The points in our article are valid whether price movements are purely exogenous or whether they are endogenous and one must find some exogenous source of price variation.
11 Note that the normalization (and notation) used above and in the following identification arguments is different from the normalization used by Berry (1994). To convert our parameters to Berry's (denote these as $\beta_{Berry}$ and $\sigma_{Berry}$), one can use the transformations $\sigma = 1 - \sigma_{Berry}$ and $\beta = \beta_{Berry}/(1 - \sigma_{Berry})$.
12 While the description of our identification arguments would be slightly different with the Berry normalization, the models are identical. One needs to be more careful with these alternative normalizations when there are multiple nests (see, e.g., Hensher and Greene (2002)).
13 Note that the constant term $\beta_0$ is identified by the level of the inside product market shares.
14 Note that these comparative statics correspond to hypothetical "experiments" we would like to do in the data to identify parameters.

in observed characteristics and cross-group switching from changes in observed characteristics. Given that these three comparative statics ($\partial s_g/\partial X$, $\partial s_g/\partial J$, and $\partial s_{j|g}/\partial X_j$) map into only two structural parameters ($\beta_1$ and $\sigma$), the model implies a restrictive relationship between the effects.

This restrictive relationship has interesting implications for identification. Observing markets where product characteristics (or price) differ across markets but the number of products is the same in all markets can identify both $\sigma$ and $\beta_1$. Therefore, a researcher can identify the effects of adding a product to the choice set (e.g., the additional welfare generated by the new product) without ever observing variation in the number of products. Perhaps even more unintuitively, one can identify cross-price elasticities between products in the group without ever observing changes in relative prices of the products (for a simple example of this, see the first part of Section 3). More generally, not only do price changes play a role in identifying price elasticities, but less intuitively, changes in the number of products play a role in identifying price elasticities. Similarly, not only do observed changes in the number of products play a role in identifying the effects of changing the number of products, but less intuitively, changes in prices or characteristics will play a role in identifying these effects.15

Are these unintuitive sources of identification believable? Clearly this identification is coming from something in the structure of the demand model. Thus, the answer to this question should depend on whether this structure is believable. We argue through the rest of the article that what is generating this identification is a very peculiar and unintuitive property of standard logit errors. As such, our answer to the above question would be "no."

Properties of logit errors.
Any model including logit errors implicitly makes restrictive assumptions about the relationship between unobservable characteristic space and the number of products. Specifically, logit errors imply that the dimension of unobservable characteristic space expands proportionally to the number of products. To see this, note that we can write consumer $i$'s set of logit errors for the $J$ products as
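$$ \begin{aligned} \epsilon_{i1} &= d_{11}\epsilon_{i1} + \cdots + d_{1J}\epsilon_{iJ} \\ &\;\;\vdots \\ \epsilon_{iJ} &= d_{J1}\epsilon_{i1} + \cdots + d_{JJ}\epsilon_{iJ}, \end{aligned} $$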

where $d_{jk}$ are dummy variables with $d_{jk} = 1$ if and only if $j = k$, $d_{jk} = 0$ otherwise. Written in this way, we can interpret logit errors as representing a $J$-dimensional characteristic space: $(\epsilon_{i1}, \ldots, \epsilon_{iJ})$ are consumer $i$'s preferences over the $J$ dimensions, and the vector $(d_{j1}, \ldots, d_{jJ})$ represents product $j$'s "location" in the $J$-dimensional space.

With this interpretation, note that if we add another product ($J + 1$) to the model, this product differentiates in an entirely new dimension (that of $d_{J+1,J+1}$), which is associated with a new logit error $\epsilon_{i,J+1}$. Thus, the dimensionality of unobserved characteristic space expands by 1 with the addition of the new product.

Another implication of logit errors is that all products are "equidistant" from each other in unobserved characteristic space and this distance remains constant as the number of products in the market changes. In a sense, there is no "crowding out" or "congestion" in unobserved characteristic space. This is counterintuitive in the following way. With classical product-differentiation models such as the Hotelling model or the Salop circular model in mind, one would naturally expect products in more dense markets to be "closer" in characteristic space.16
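To make the contrast with a Hotelling line concrete, the sketch below computes the expected distance between two distinct, randomly chosen products as the number of products grows. It assumes products space themselves out as much as possible, taken here to mean evenly spaced on the unit interval with the endpoints occupied; the function name `expected_pair_distance` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def expected_pair_distance(J):
    """Mean distance between two distinct products evenly spaced on [0, 1]."""
    positions = [Fraction(k, J - 1) for k in range(J)]
    pairs = list(combinations(positions, 2))   # all unordered pairs, no replacement
    return sum(b - a for a, b in pairs) / len(pairs)

distances = {J: expected_pair_distance(J) for J in range(2, 6)}
# distances maps J = 2, 3, 4, 5 to 1, 2/3, 5/9, 1/2
```

Under these assumptions the expected distance shrinks as products are added, whereas in logit characteristic space it stays the same for any number of products.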
15 One's choice of instruments can affect how these comparative statics play a role in identifying parameters. For example, suppose one has the choice of using $(1/J)\sum x_j$ and/or $J$ as an instrument (for within-group share in the Berry (1994) inversion) in the nested logit model. Using only $(1/J)\sum x_j$ ($J$) as an instrument would correspond to fitting the second (third) comparative static more closely and would probably lead to better (worse) estimates of price elasticities and worse (better) estimates of welfare effects. If one uses both instruments (or a combination of the two, e.g., $\sum x_j$ as suggested in BLP), which comparative static is fitted "better" will depend on the relative amounts of variation in $(1/J)\sum x_j$ and $J$ in the data.
16 In logit characteristic space, if one randomly chooses two products from a market, the expected difference between $\epsilon_{i1}$ and $\epsilon_{i2}$ is the same regardless of the number of products in the markets. In contrast, consider a Hotelling

As will become clear later, it is these strong assumptions about the relationship between unobservable characteristic space and the number of products that generate the unintuitive identification results above. Therefore, unless one completely believes in this "no crowding out" property of logit errors, one should probably not believe these sources of identification and should worry about obtaining biased estimates of parameters (e.g., $\sigma$ and $\beta$), price elasticities, and welfare calculations.17

More-general models. The arguments of the first part of this section were based on a fairly simple nested logit model. Do random-coefficients models with logit errors (e.g., BLP, McFadden and Train (2000), and Nevo (2001)) or nested logit models with more-complicated nesting structures have similar identification properties? We believe so. Comparative statics in these models are too complicated to formulate arguments like the above. However, we can appeal to a number of informal arguments. First, note that the nested logit model is in fact a random-coefficients model: one where the random coefficient is on a group dummy variable and has a particular distribution (parameterized by $\sigma$). The intuition behind these identification results should not change if random coefficients are instead on continuous characteristics and/or are assumed to have normal distributions.18 Second, as with the nested logit model, all of the parameters would be identified if one estimated a random-coefficients model on a set of markets all with the same number of products. Therefore, any variation due to the fact that markets have different numbers of products is necessarily handled in a restrictive way. Third, our Monte Carlo results on random-coefficient models suggest that they have similar problems.

Generally, we believe that any model including standard logit errors will have similar properties, and that identification in these models is suspect unless one believes the unintuitive and restrictive assumptions inherent in standard logit errors.

Proposed solution. We now briefly preview our proposed solution to the problem, showing that it eliminates the perverse identification results discussed above. Later, in Section 3, we give a structural interpretation of our solution. This structural interpretation corresponds to relaxing the "no crowding out" assumption of standard logit errors.
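We propose adding a function $f(J; \gamma)$ with parameter $\gamma$ to the term $\beta_0 + X_j\beta_1$ in the nested logit model (1), i.e.,

$$ s_{j|g} = \frac{\exp(\beta_0 + X_j\beta_1 + f(J;\gamma))}{\sum_{k=1}^{J}\exp(\beta_0 + X_k\beta_1 + f(J;\gamma))}, \qquad s_g = \frac{D^{\sigma}}{1 + D^{\sigma}}, \qquad s_j = s_{j|g}\,s_g, \qquad \text{where } D = \sum_{k=1}^{J}\exp(\beta_0 + X_k\beta_1 + f(J;\gamma)). \tag{4} $$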

With this model, the threecomparative statics discussed above are


aSjlg = plSjlg(l Sg) = g( - Sg) g = sg(1 sg) + f(J; y)

Xj

aX

-ag(l

Sg) y+f(J;y)}.

(5)

model where products space themselves out as much as possible. With two products in the market, the expected distance between two randomly chosen products (without replacement) is trivially 1, with three products in the market the expected difference is 1/3*1 + 2/3*1/2 = 2/3, with four products it is 3/6*1/3 + 2/6*2/3 + 1/6*1 = 5/9, and with five products it is 4/10*1/4 + 3/10*2/4 + 2/10*3/4 + 1/10*1 = 1/2.
17 The CES demand system also does not display crowding, and is in fact subject to many of the criticisms about elasticities and welfare effects that we make of logit-based models. Extensions of our adjustment to the CES model are available from the authors.
18 In other words, we expect that in these models, comparative statics in entry and exit will also play a role in identifying price elasticities, and comparative statics in prices will play a role in identifying the effects of additional products. Again, this identification is likely to be highly reliant on the exact structure of errors.

The first two comparative statics are the same as before, but the third now depends on an additional parameter, $\gamma$, in the new function.19

Note that this adjustment gives the nested logit model the ability to match all of the observed variation in market shares in the data. In this model, $\beta_1$ will be identified by variation in within-group market shares due to changes in observable product characteristics. Conditional on $\beta_1$, $\sigma$ is identified by changes in group market share in response to changes in observable product characteristics, and conditional on $\sigma$, $\gamma$ is identified by changes in group market share in response to changes in the number of products. In this adjusted model, one cannot identify the effect of adding a product to the choice set without observing variation in the number of products, nor identify cross-price elasticities between products in the group without observing changes in relative prices of the products. More generally, in this model we expect price elasticities to be identified by price variation, not by changes in the number of products. Similarly, effects of changing the number of products should be identified by actual changes in the number of products, not by changes in prices. In essence, this adjusted model eliminates the unintuitive sources of identification described earlier.
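A small sketch may help show how the extra parameter frees the response of the group share to the number of products. It assumes, purely for illustration, the functional form $f(J; \gamma) = \gamma \ln J$ and a common $X$ for all inside products; the function name `adjusted_group_share` and the parameter values are hypothetical.

```python
import math

def adjusted_group_share(J, X, beta0, beta1, sigma, gamma):
    """s_g from (4) with the assumed form f(J; gamma) = gamma * ln(J) and common X."""
    D = J * math.exp(beta0 + beta1 * X + gamma * math.log(J))
    return D**sigma / (1.0 + D**sigma)

# Hypothetical values for illustration only.
beta0, beta1, sigma, X = 0.2, -1.0, 0.5, 1.0

# gamma = 0 reproduces the standard nested logit: the group share rises with J.
standard = [adjusted_group_share(J, X, beta0, beta1, sigma, gamma=0.0) for J in (1, 2, 4)]

# gamma = -1 offsets the ln(J) term in D exactly: the group share does not move with J.
crowded = [adjusted_group_share(J, X, beta0, beta1, sigma, gamma=-1.0) for J in (1, 2, 4)]
```

Under this assumed form, the response of the group share to $X$ is unchanged by $\gamma$, so the response to $J$ is no longer implied by $\sigma$ and $\beta_1$ alone.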

3. A structural interpretation
In this section we exhibit a structural model that generates the adjustment suggested in the previous section. This provides a structural interpretation of the new parameters, which can aid in understanding and adding further detail to the model (for instance, writing a first-order condition for the producers). It also shows how the above adjustment corresponds to a more flexible logit error that eliminates the "no crowding out" assumption of standard logit errors. We also discuss estimation issues.

Intuition behind the model. We begin with a story. Suppose one is interested in estimating a nested logit model of competition between fast food firms (one nest is the fast food restaurants and one nest is a composite "outside" good). Data are obtained on prices and market shares for two time periods. In the first time period there is only one firm, MD, and in the second period there is entry and thus two firms, MD and BK. Suppose that prices are identical for all firms in all periods, that in the first period MD has a 50% market share, and that in the second period both MD and BK have 25% market shares.

Since the entry of BK "steals" market share only from MD (and not the outside alternative), a nested logit model will necessarily estimate $\sigma = 0$, i.e., that the within-group variance is zero. This $\sigma = 0$ implies (1) that MD and BK are identical in all respects to all consumers, and (2) that the cross-price elasticity between MD and BK is infinite. Note that identification here has come solely from changes in the number of products, as there is no variation in prices.

Now consider an alternative story of what is going on in this data. Suppose these firms operate through outlets (franchises) and there is important geographical differentiation (i.e., all else equal, consumers tend to go to the nearest outlet location). Other than geographic differentiation through their outlet locations, the food served by BK and MD is identical. In the first period there are two outlets, both franchised to MD. In the second period there are also two outlets, but one of the MD outlets has been taken over by BK.
Since prices remain constant and MD and BK serve identical food, this story is perfectly consistent with the market share data above. But is the nested logit prediction of infinite price elasticities correct in this example? We would expect not. Due to the geographic differentiation, we would expect a price cut by BK to only partially cut into MD's market share. The nested logit model estimate of $\sigma = 0$ is highly misleading here: unintuitive restrictions of the model (rather than valid price variation) are incorrectly identifying price elasticities to be infinite.

The intuition behind this story can motivate a structural model in which $J$ enters the discrete-choice estimating equation. In the example, unobserved characteristic space (in this case, outlet
19 Note that our adjusted model is somewhat in the spirit of McFadden's (1975) "universal logit" model, which includes characteristics of all products in the utility function for a particular product. In contrast, we somewhat arbitrarily focus on a specific adjustment and provide a structural model generating this adjustment.

locations) is subject to congestion: the entry by BK reduces the number of outlets MD has. This "crowding" at the outlet level confounds the observation that a new product has entered. Standard logit-based models simply do not deal well with such congestion, hence the incorrectly predicted price elasticities. We now present a formal model of such retail crowding or product congestion that deals with this issue. If we were to take this model to the fast food data described above, price elasticities would not be identified, an intuitive outcome given the lack of any variation in prices.

A model of product congestion. Suppose that the products of interest are sold through a retail market consisting of $R$ retail outlets. As in the above example, we consider the standard case where market shares are observed at the product level: data at the retail outlet level are not observed. Modelling unobserved retail outlets is simply a way of motivating our more general logit errors.

Assume that each retail outlet sells only one of the wholesale products, and that product $j$ is sold in $R_j$ retail outlets where $\sum_j R_j = R$. The twist of our congestion model is that logit errors represent idiosyncratic, unobserved consumer preferences over retail outlets rather than over products. (In the next section we expand the model to one in which consumers have logit errors based around both retail outlets and products.) Precisely, the logit utility function for consumer $i$ purchasing from retail outlet $r$ takes the form
$$ u_{ijr} = u_j + \epsilon_{ir}, $$

where $u_j$ measures mean product quality. A typical specification for $u_j$ is $u_j = X_j\beta - \alpha p_j + \xi_j$, where $(X_j, \xi_j)$ are product $j$'s characteristics (observed and unobserved, respectively) and $p_j$ is its price. The important distinction between this and a standard logit model is that it contains $\epsilon_{ir}$, not $\epsilon_{ij}$. Intuitively, $\epsilon_{ir}$ might capture the fact that consumers live different distances from the $R$ retail outlets.

Note how this model captures congestion as new products enter the market. In the standard logit model, when new products enter the market, new $\epsilon_{ij}$ are drawn for the new products. In the extreme version of our congestion model, where the number of retail stores $R$ does not change as new products enter, there are no new unobservable terms drawn. The dimensionality of the unobserved characteristic space remains the same as the new products simply crowd out the old products from retail stores.

To aggregate the model to the level of observation (the product level), we need to aggregate over retail outlets. The share of product $j$ is the sum of the shares of all the retail outlets that carry product $j$. As the probability that $i$ buys from $r$ is the same across outlets that carry $j$, the market share for product $j$ is

$$s_j = \frac{R_j e^{u_j}}{1 + \sum_k R_k e^{u_k}} \qquad (6)$$

$$= \frac{e^{u_j + \ln(R_j)}}{1 + \sum_k e^{u_k + \ln(R_k)}}. \qquad (7)$$
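The aggregation step behind (6) and (7) can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the utilities `u` and outlet counts `R` are made-up illustrative values, and the function names are hypothetical. It builds a plain logit over individual outlets, sums outlet shares by product, and compares the result with the product-level formula in (7).

```python
import numpy as np

def outlet_level_shares(u, R):
    """Product shares obtained by summing a plain logit over individual outlets."""
    # One entry per outlet: every outlet carrying product j has mean utility u[j].
    outlet_u = np.repeat(u, R)
    exp_u = np.exp(outlet_u)
    outlet_share = exp_u / (1.0 + exp_u.sum())  # outside good has utility 0
    # Index of the product sold at each outlet, used to sum shares by product.
    owner = np.repeat(np.arange(len(u)), R)
    return np.bincount(owner, weights=outlet_share, minlength=len(u))

def congestion_logit_shares(u, R):
    """Equation (7): a standard logit with ln(R_j) added to mean utility."""
    v = np.exp(u + np.log(R))
    return v / (1.0 + v.sum())

# Illustrative (assumed) values: three products sold in 3, 1, and 2 outlets.
u = np.array([0.5, -0.2, 1.0])
R = np.array([3, 1, 2])

s_outlets = outlet_level_shares(u, R)
s_formula = congestion_logit_shares(u, R)
print(np.allclose(s_outlets, s_formula))
```

The two computations agree because each of the $R_j$ outlets carrying product $j$ has the same choice probability, so summing them multiplies $e^{u_j}$ by $R_j$.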

Note that the difference between our congestion logit model and a standard logit model is simply the additional term $\ln(R_j)$ in the market share function.

o Estimating the model. With individual-level data, (6) could be estimated by maximum likelihood. With aggregate data, this model can be estimated using the Berry (1994) inversion:

$$\ln\left(\frac{s_j}{s_0}\right) = u_j + \ln(R_j).$$

In practice, one needs to parametrically specify $R_j$. In the simplest case, where each product is sold in an equal number of retail stores, we have $R_j = R/J$ and we need only specify $R$. One example is $R = \gamma_0 + \gamma_1 J$, where $J$ is the number of products. As scaling up $R$ is unidentifiable from the constant term in the utility function, a normalization is necessary, an obvious one being

$$R = \gamma + (1 - \gamma)J.$$

This results in the estimating equation

$$\ln\left(\frac{s_j}{s_0}\right) = u_j + \ln\left(\frac{\gamma}{J} + 1 - \gamma\right). \qquad (8)$$

This specification is attractive in that it nests the pure logit model ($\gamma = 0$) as well as the pure congestion model ($\gamma = 1$). With $\gamma = 0$, the number of retail outlets (and correspondingly the dimension of SUPD) increases proportionally to the number of products, whereas with $\gamma = 1$ it does not change in the number of products. Intermediate cases are captured by $0 < \gamma < 1$.

Another suggestion for parameterizing the additive term is to let $\ln(R_j) = \gamma \ln(J)$. In this case, $\gamma = 0$ is still the standard logit model and $\gamma = -1$ is still a full crowding model (in the sense that expected welfare depends on observable product characteristics but not the number of products). A nice attribute of this specification is that, in contrast to (8), it can be estimated with OLS or IV techniques. A drawback is that this specification lacks a clear structural interpretation of the parameter. Last, note that one might estimate $R(J)$ nonparametrically. Given that $J$ is discrete, this is extremely simple: one just includes indicator functions for different market sizes (with a normalization for one $J$).
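To make the OLS remark concrete, here is a minimal Python sketch of the log-linear specification $\ln(R_j) = \gamma \ln(J)$. Everything in it is an assumption for illustration: the data are simulated, the "true" values (`beta0 = 1`, `alpha = 1`, `gamma = -0.5`) are arbitrary, and price is exogenous by construction, so plain least squares is adequate. With endogenous prices one would use instruments instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" parameters for this illustration only.
beta0, alpha, gamma = 1.0, 1.0, -0.5
n_markets = 2000

rows = []
for _ in range(n_markets):
    J = int(rng.integers(1, 11))                  # 1 to 10 products per market
    p = rng.normal(2.0, np.sqrt(0.2), size=J)     # exogenous prices
    xi = rng.normal(0.0, 0.1, size=J)             # unobserved product quality
    u = beta0 - alpha * p + xi
    v = np.exp(u + gamma * np.log(J))             # congestion logit numerator
    s = v / (1.0 + v.sum())                       # inside shares
    s0 = 1.0 / (1.0 + v.sum())                    # outside share
    for j in range(J):
        rows.append((np.log(s[j] / s0), p[j], np.log(J)))

data = np.array(rows)
y = data[:, 0]
X = np.column_stack([np.ones(len(y)), data[:, 1], data[:, 2]])

# OLS on ln(s_j/s_0) = beta0 - alpha * p_j + gamma * ln(J) + xi_j.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0_hat, price_coef_hat, gamma_hat = coef
print(beta0_hat, price_coef_hat, gamma_hat)
```

The coefficient on $\ln(J)$ is the congestion parameter; an estimate near zero would point toward the standard logit, and an estimate near $-1$ toward full crowding.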

o Extensions of the model. The assumption that all products are sold by an equal number of retail stores might not seem reasonable. However, given no data on retailers, it is hard to imagine how one could intuitively separate out effects of product characteristics and price on utilities versus their effects on the number of retail stores carrying the product. To formalize this, suppose that

$$R_j = f(J)\, e^{X_j \eta},$$

so that product characteristics do affect $R_j$. In this case, $\eta$ is not separately identified from $\beta$, the parameters in the utility function. With other specifications of $R_j$, the different effects might be identified computationally, but this identification would be completely dependent on nonlinearities. As such, we suggest the specification where all products are sold by an equal number of stores.

The assumption that logit errors are not correlated for the same product sold across different outlets may also seem unreasonable. However, we can obtain a very similar estimating equation in a model that relaxes this assumption. Suppose consumers have unobserved tastes over both products and retail stores, i.e.,

$$U_{ijr} = u_j + \epsilon^1_{ij} + \rho\,\epsilon^2_{ir},$$

where $\epsilon^1_{ij}$ is consumer $i$'s product-specific taste, $\epsilon^2_{ir}$ is consumer $i$'s retail outlet-specific taste, and $\rho$ is a weighting parameter that measures the relative importance of the two unobservables. This formulation is very similar to the standard nested logit model. With the standard nested logit distributional assumptions ($\epsilon^2_{ir}$ distributed type-I extreme value, $\epsilon^1_{ij}$ distributed such that $\epsilon^1_{ij} + \rho\,\epsilon^2_{ir}$ is distributed type-I extreme value), we get the following product-level market shares:

$$s_j = \frac{\left[R_j \exp(u_j/\rho)\right]^{\rho}}{1 + \sum_k \left[R_k \exp(u_k/\rho)\right]^{\rho}} = \frac{\exp(u_j + \rho \ln(R_j))}{1 + \sum_k \exp(u_k + \rho \ln(R_k))},$$


where $R_j$ is the number of retail stores in which product $j$ is sold. Then we have the estimating equation

$$\ln\left(\frac{s_j}{s_0}\right) = u_j + \rho \ln(R_j).$$

Consider the specification $\ln(R_j) = \gamma \ln(J)$. In this case, $\gamma$ and $\rho$ are not separately identified; only their product $\rho\gamma$ is. While a different specification for $R_j$ might lead to separate identification of $\rho$ and $\gamma$, it would again be based on a nonlinearity. This lack of identification is not a drawback, because separating the parameters (e.g., $\rho$ versus $\gamma$) is irrelevant for empirical or welfare implications. It means that our original model is robust to unobserved tastes at both the product and retail store level.

o Application to more-general discrete-choice models. The above subsections added congestion to a simple logit model. We can similarly add congestion to more-realistic models such as nested logit and random-coefficients models. For example, consider the nested logit utility function:

$$U_{ijr} = u_j + \zeta_{ig} + \epsilon_{ir},$$

where $\zeta_{ig}$ is consumer $i$'s idiosyncratic taste for products in group $g$. Note that this nested error term is defined over product groupings and not retail store groupings (since retail stores are not observable, one cannot group them). The variable $\epsilon_{ir}$ is still a retail store-specific unobservable. With this utility function, product shares are given by

$$s_j = s_{j|g}\, s_g = \frac{R_j e^{u_j/\sigma}}{\sum_{k \in g_j} R_k e^{u_k/\sigma}} \cdot \frac{\left(\sum_{k \in g_j} R_k e^{u_k/\sigma}\right)^{\sigma}}{1 + \sum_{g} \left(\sum_{k \in g} R_k e^{u_k/\sigma}\right)^{\sigma}},$$

and estimation can proceed using the Berry (1994) inversion:[20]

$$\ln\left(\frac{s_j}{s_0}\right) = u_j + \sigma \ln(R_j) + (1 - \sigma)\ln s_{j|g}.$$
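As a sanity check on the inversion, the following Python sketch computes nested congestion logit shares and confirms numerically that $\ln(s_j/s_0) = u_j + \sigma\ln(R_j) + (1-\sigma)\ln s_{j|g}$. It assumes the share formula takes the usual nested logit form with $R_k e^{u_k/\sigma}$ inside each nest and an outside good with utility zero in its own nest; the utilities, outlet counts, and group labels are invented for illustration.

```python
import numpy as np

def nested_congestion_shares(u, R, group, sigma):
    """Shares for a nested logit in which product k enters as R_k * exp(u_k / sigma).

    Returns (s, s_within, s0): product shares, within-group shares, outside share.
    Group labels must be integers 0..G-1, each with at least one product.
    """
    group = np.asarray(group)
    w = R * np.exp(u / sigma)              # R_k * exp(u_k / sigma)
    D = np.bincount(group, weights=w)      # nest-level sums
    denom = 1.0 + np.sum(D ** sigma)       # outside good contributes exp(0) = 1
    s_within = w / D[group]                # s_{j|g}
    s_group = D ** sigma / denom           # s_g
    return s_within * s_group[group], s_within, 1.0 / denom

# Illustrative (assumed) inputs: four products in two groups.
u = np.array([0.3, -0.5, 1.0, 0.2])
R = np.array([2.0, 1.0, 3.0, 1.0])
group = [0, 0, 1, 1]
sigma = 0.8

s, s_within, s0 = nested_congestion_shares(u, R, group, sigma)
lhs = np.log(s / s0)
rhs = u + sigma * np.log(R) + (1.0 - sigma) * np.log(s_within)
print(np.allclose(lhs, rhs))
```

Setting all $R_j = 1$ removes the $\sigma \ln(R_j)$ term, which gives the nested logit inversion without congestion.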

Again, we need to parameterize $\ln(R_j)$ to estimate this model. The simplest approach would be to do exactly what we did in the logit model, specifying $\ln(R_j)$ as equal to either $\ln(\gamma/J + 1 - \gamma)$ or $\gamma \ln(J)$. As a more ambitious and flexible alternative, one might want to allow congestion to vary across products. In other words, one might expect goods in one nest to crowd out (in terms of retail space) goods in the same nest more than goods in different nests. One could accommodate this possibility by, e.g., allowing $R_j$ to depend on the number of products in the nest as well as the total number of products. With multiple-level nested logit models or other GEV models (e.g., the model of Bresnahan, Stern, and Trajtenberg (1997)), one could allow more complex $R_j$ functions.

In random-coefficients models like BLP, one could again simply add $\ln(\gamma/J + 1 - \gamma)$ or $\gamma \ln(J)$ to the conditional (on random coefficients) market share equations. Again, while this allows congestion, it assumes that the congestion occurs equally across products. A more flexible approach might let $R_j$ be a weighted count of the number of products in the market, where the weights depend on how close other products are to $j$ in characteristic space. For instance, one could specify $R_j$ as
$$R_j = \frac{1}{\sum_{k=1}^{J} \phi\left((X_j - X_k)'\,(\mathrm{cov}(X))^{-1}\,(X_j - X_k)\right)},$$

20 Note that we now (and in the rest of the article) use Berry's (1994) normalization, although we still use our redefined $\sigma$. Formally, to transform these parameters (and the parameters in equation (9)) to Berry's parameters, use $\sigma = 1 - \sigma_{\text{Berry}}$ and $\beta = \beta_{\text{Berry}}$.

where $\phi$ is the normal probability density function. This specification is similar in spirit to counting products in the same nest differently than products in different nests in the nested logit model. Intuitively, researchers might expect that products that are close together in observed space crowd each other out more than more distant products; this specification allows for this possibility.

4. Monte Carlo results

In this section we use Monte Carlo simulations to study how standard logit-based models perform when data are generated according to our congestion models. In particular, we examine how well the standard models estimate price elasticities and the welfare effects of new product introductions. We find potentially large biases in both quantities across a variety of specifications, suggesting that ignoring congestion can be problematic in practice.

o Nested logit model. The rows of Table 1 contain various specifications of our congestion model in a nested logit framework. In all specifications, we simulate data from a very large number of markets (N = 5,000). Because of this large amount of data, there is very little estimation error in our estimates (and resulting elasticities), so these estimates can essentially be interpreted as asymptotic results. In each market, there are between 1 and 10 products, distributed uniformly across this range. There are two nests in each market; the first contains all the inside products, the second contains only the outside alternative. In the base specification, price is drawn from a normal distribution with mean 2 and variance .2.[21] The constant term in utility is 1 and the coefficient on price is -1. The nested logit parameter $\sigma$ is initially set at .8. As is standard, the utility from the outside alternative is normalized to zero. The various specifications in the rows of the table change various parameters of the model.

The "nested logit" subrows contain the results of naive nested logit estimation on these data, using the standard Berry (1994) inversion with the number of products in the market ($J$) and the mean characteristic in the market ($(1/J)\sum_j X_j$) as instruments for the endogenous $s_{j|g}$.[22] Because of the large amount of data, the "truth" subrows in the tables are not only the estimation results from our congestion models, but also the true values of these quantities (since we use a large amount of data and the truth is our congestion model).
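The welfare measure reported in Table 1 (the percentage increase in welfare moving from a market with 1 product to a market with 10 products) can be sketched directly from the base design. The Python code below is an illustration under stated assumptions rather than the authors' code: it takes welfare to be the nested logit inclusive value $\ln\bigl(1 + (\sum_k R_k e^{u_k/\sigma})^{\sigma}\bigr)$ with a single inside nest, evaluates every product at $X = 2$ with the unobserved characteristic at zero, and uses $R_j = \gamma/J + 1 - \gamma$. The function names are hypothetical.

```python
import numpy as np

def expected_welfare(J, gamma, sigma=0.8, beta0=1.0, beta1=-1.0, x=2.0):
    """Inclusive value for J identical inside products in one nest (outside utility 0)."""
    u = beta0 + beta1 * x                  # mean utility of each inside product
    R = gamma / J + 1.0 - gamma            # crowding term R_j = (gamma + (1 - gamma) J) / J
    D = J * R * np.exp(u / sigma)          # nest sum over J identical products
    return np.log(1.0 + D ** sigma)

def welfare_increase_pct(gamma, **kwargs):
    """Percentage welfare change from a 1-product to a 10-product market."""
    w1 = expected_welfare(1, gamma, **kwargs)
    w10 = expected_welfare(10, gamma, **kwargs)
    return 100.0 * (w10 / w1 - 1.0)

for g in (1.0, 0.95, 0.80, 0.5):
    print(g, round(welfare_increase_pct(g), 1))
```

Under these assumptions, full congestion ($\gamma = 1$) gives no welfare gain from additional products, and the values for $\gamma = .95$, $.80$, and $.5$ come out at roughly 28, 94, and 185 percent.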
The columns of the table contain various elasticity and welfare calculations at the estimates. Elasticities are computed for the mean market with $J = 5$ and $X_j = 2\ \forall j$. Cross-price elasticity is $(\partial s_k/\partial X_j)(X_j/s_k)$; outside-good price elasticity is $(\partial s_0/\partial X_j)(X_j/s_0)$. Welfare increase refers to the percentage increase in welfare moving from a market with 1 product to a market with 10 products.

The first row of Table 1 contains results for the full congestion model. In this model $\gamma = 1$, i.e., the number of retail outlets does not change as the number of products increases. Naive nested logit estimation of this model gives extremely poor results. The nested logit estimates the average own-price elasticity to be -11.07, while the actual own-price elasticity is -2.29. Within-group cross-price elasticities are off by an order of magnitude, and estimates of across-group (to the outside alternative) price elasticities are about 70% of their true value. While in actuality there is no welfare gain moving from 1 product to 10 products (since in the full-congestion model new products "completely" crowd out the old ones), the nested logit estimates suggest a gain of 20%. Interestingly, in this case the nested logit model does a reasonable job at matching welfare gains (at least in an absolute sense), but a terrible job at price elasticities.[23]

21 Note that the McFadden and Train (2000) result regarding the generality of the mixed multinomial logit (or random-coefficients) model does not apply to data generated from our model. The reason is that in our congestion model, the distribution of the unobservable term for each wholesale product depends on the number of other wholesale products.
22 Half of this variation in price is within-market, half is across-market.
23 In generating our data, we also allowed a scalar unobserved characteristic valued equally by consumers, $\xi_j$, to generate an econometric error at the aggregate level (see Berry, 1994). The variance of $\xi_j$ across products was set at .5.

TABLE 1 Monte Carlo Results for Nested Logit Model

| Parameters | Estimator | Own-Price Elasticity | Cross-Price Elasticity | Outside-Good Price Elasticity | Welfare Increase (%) |
|---|---|---|---|---|---|
| $\gamma = 1$ (full congestion) | Truth | -2.29 | .21 | .11 | 0 |
| | Nested logit | -11.07 | 2.54 | .07 | 20.00 |
| $\gamma = .95$ | Truth | -2.28 | .22 | .12 | 28.40 |
| | Nested logit | -4.85 | .97 | .09 | 60.60 |
| $\gamma = .80$ | Truth | -2.25 | .25 | .15 | 94.30 |
| | Nested logit | -2.98 | .49 | .13 | 146.40 |
| $\gamma = .5$ | Truth | -2.21 | .29 | .19 | 184.50 |
| | Nested logit | -2.42 | .36 | .18 | 233.20 |
| $\gamma = .95$, $\sigma = .2$ | Truth | -8.28 | 1.71 | .11 | 6.50 |
| | Nested logit | -1.39 | 2.29 | .10 | 30.90 |
| $\gamma = .95$, $\beta_0 = 2$ | Truth | -2.18 | .31 | .21 | 23.00 |
| | Nested logit | -4.95 | 1.04 | .16 | 48.20 |
| $\gamma = .95$, $\beta_1 = -2$ | Truth | -4.75 | .24 | .04 | 33.50 |
| | Nested logit | -7.87 | 1.26 | .03 | 110.10 |
| $\gamma = .95$, Mean(X) = 1 | Truth | -1.09 | .16 | .11 | 23.00 |
| | Nested logit | -2.39 | .52 | .08 | 48.20 |
| $\gamma = .95$, Var(X) = .95 | Truth | -2.28 | .22 | .12 | 28.40 |
| | Nested logit | -3.51 | .61 | .10 | 110.50 |

There is a clear intuition as to why, in the presence of congestion, standard estimation methods are prone to overestimate within-group cross-price elasticities and underestimate across-group cross-price elasticities. The standard nested logit specification underestimates the nesting parameter $\sigma$ (e.g., in row 1 the standard nested logit model estimates $\sigma = .093$ while in truth, $\sigma = .8$). Consider again the estimating equation under our adjustment:

$$\ln\left(\frac{s_j}{s_0}\right) = X_j\beta - \alpha p_j + (1 - \sigma)\ln(s_{j|g}) + \sigma \ln(R_j(J)) + \xi_j. \qquad (9)$$

The standard approach ignores the term $\sigma \ln(R_j(J))$. Recall that $R_j(J)$ will decline in $J$ if there is any congestion, i.e., if the number of retail stores in which product $j$ is sold declines in $J$. Typically the within-group share, $\ln(s_{j|g})$, will also decline in $J$, so the omitted variable will be positively correlated with $\ln(s_{j|g})$ (and one of the instruments for $\ln(s_{j|g})$, $J$). This will tend to bias the estimate of $\sigma$ downward in the standard nested logit model. The underestimate of $\sigma$ suggests too much insulation between groups. As such, across-group substitution is estimated to be too weak, and within-group substitution too strong.

Rows 2 through 9 perturb the parameters of the model. In rows 2 through 4, the congestion parameter $\gamma$ is varied. As would be expected, the nested logit estimates are closer to the truth as $\gamma$ decreases (recall that $\gamma = 0$ implies no congestion, i.e., the standard nested logit model is the truth). However, even at $\gamma = .5$, there are still significant biases in the nested logit results. Row 5 changes the nesting parameter $\sigma$ from .8 to .2. While the nested logit does a bit better on price elasticities (proportionally), it does worse with welfare predictions. Rows 6 through 8 respectively change the constant term in the utility function, the slope term in the utility function, and the mean of price. The large biases in price elasticities and welfare calculations persist. In the last row of the table, the variance of price is increased in the simulated dataset. Interestingly, estimates of price elasticities get considerably better, while estimates of welfare changes worsen. We believe the intuition behind this result is that increasing the variance in price increases the data's information on the second comparative static (in Section 2) relative to the third comparative static. This will tend to move parameters such that the second comparative static is more closely satisfied, but the third comparative static is less closely satisfied. Since the second comparative static is directly related to elasticities, while the third comparative static is more related to welfare changes due to changes in the number of products, this improves price elasticity estimates, but worsens estimates of welfare effects.

o Random-coefficients (BLP) model. We also simulate a random-coefficients logit model. This is the type of model used in BLP. We again use 5,000 markets with the number of products distributed uniformly from 1 to 10. Price is again drawn from a normal distribution with mean 2 and variance .2. The constant term in the utility function is initially set at 2. We allow a random coefficient on price, equal to $\beta_1 \exp(\sigma_p z)$, where $z$ is a standard normal, and initially, $\beta_1 = -2$ and $\sigma_p = .2$. We impose that the data are drawn from a crowding model with crowding term $\ln(\gamma/J + 1 - \gamma)$. In Table 2, rows marked "RCM" correspond to estimates of a naive BLP-style random-coefficients model with a regular logit error (using a constant, $X_j$, $J$, and $(1/J)\sum_j X_j$ as instruments). Rows marked "truth" correspond to estimates of a random-coefficients model with our more flexible logit error (again, this corresponds to the true elasticities and welfare effects).

Examining the table, the first row considers the case where $\gamma = .95$. As in the nested logit case, both elasticities and welfare calculations are considerably biased with the naive RCM. Again, as we lower the level of crowding, the standard RCM does better, but there are still significant biases when $\gamma = .5$.

TABLE 2 Monte Carlo Results for Random-Coefficients Model (RCM)

| Parameters | Estimator | Own-Price Elasticity | Cross-Price Elasticity | Outside-Good Price Elasticity | Welfare Increase (%) |
|---|---|---|---|---|---|
| $\gamma = .95$ | Truth | -4.31 | .09 | .08 | 5.1 |
| | RCM | -9.69 | 1.7 | .07 | 29.20 |
| $\gamma = .80$ | Truth | -4.28 | .12 | .11 | 19.40 |
| | RCM | -5.66 | .62 | .11 | 41.90 |
| $\gamma = .5$ | Truth | -4.23 | .19 | .17 | 44.10 |
| | RCM | -4.6 | .35 | .17 | 58.40 |
| $\gamma = .95$, $\sigma_p = .4$ | Truth | -4.55 | .08 | .06 | 3.80 |
| | RCM | -1.14 | 1.76 | .05 | 22.60 |
| $\gamma = .95$, $\beta_0 = 4$ | Truth | -4.06 | .38 | .35 | 14.50 |
| | RCM | -9.34 | 1.89 | .28 | 62.30 |
| $\gamma = .95$, $\beta_1 = -3$ | Truth | -6.38 | .02 | .02 | .80 |
| | RCM | -14.21 | 2.69 | .01 | 6.10 |
| $\gamma = .95$, Mean(X) = 1 | Truth | -2.05 | .21 | .21 | 15.40 |
| | RCM | -3.65 | .69 | .17 | 66.20 |
| $\gamma = .95$, Var(X) = .95 | Truth | -4.31 | .09 | .08 | 5.10 |
| | RCM | -6.13 | .76 | .07 | 30.80 |
The remaining rows in the table again perturb the parameters of the model, and again not much changes: biases in both price elasticity and welfare calculations persist. As in the nested logit results, increasing the variance of the observed characteristic in the last row improves price elasticities but worsens welfare calculations. The worsening of welfare calculations is marginal, though. This may be due to the fact that as the variance of the observable characteristic increases, the relative importance of logit errors in the model decreases. One would expect this effect to tend to improve both price elasticities and welfare calculations, perhaps counteracting the worsening of welfare calculations due to the comparative static effect.[24] In summary, our Monte Carlo results suggest that ignoring possible congestion and using standard logit errors can significantly bias estimates of price elasticities and welfare effects, even in random-coefficient, BLP-style models.

24 This does match the fast food franchise story in Section 3, where the nested logit model predicts $\sigma = 0$, thus correctly measuring the welfare gains due to the entry of BK to be zero.

5. Empirical example

We end with an empirical example. Rysman (2004) studies a dataset on the Yellow Pages industry, measuring the positive feedback loop between consumers' choice of directory (which is driven by the amount of advertising in the directory) and retailers' placement of advertisements in directories (which is driven by consumer usage patterns). Rysman models the consumer's decision as a discrete choice between available directories and an unspecified outside option. He observes a cross-section of directories and usage behavior where consumers in different geographic markets have access to different numbers of directories. Figure 1 shows the percentage of consumers served by different numbers of directories. The variance in this number of directories makes this a natural place to apply the techniques presented in this article.[25] Correctly estimating the elasticity of usage to the quantity of advertising in a directory is important for measuring the importance of the feedback loop. In addition, correctly measuring the welfare benefits of competing directories is important for the policy question studied in the article.[26]

The dataset consists of observations on the number of uses, per household, per month, in the distribution areas of 428 directories in 1996.[27] We assume that a representative consumer needs information of the kind she could find in the Yellow Pages $M$ times per month. The exogenous parameter $M$ is constant across markets. Each time a consumer needs information, she can use one of the Yellow Pages in the area or turn to the outside option. The utility to consumer $i$ from using directory $j$ is

$$U_{ij} = \alpha \ln(A_j) + X_j\beta + \xi_j + \epsilon_{ij}.$$

The variable $A_j$ is the quantity of advertising at directory $j$, and the matrix $X_j$ represents demographic variables that may affect usage.[28] The variable $\xi_j$ represents directory-specific factors that are unobservable to the econometrician, such as the quality of the book or regional usage habits.

We estimate this model and a model with our adjustment. A complicating factor is that distribution areas for Yellow Pages overlap with each other. A directory may face no competitors for some of its consumers and one or more competitors for another group of consumers. Although we observe these distribution areas, we cannot distinguish how much usage comes from different portions of a directory's distribution area.

Even so, implementing the simple logit model is straightforward. We observe $s_j$ (the market share for directory $j$) and $s_0$ (the market share for the outside option)[29] in directory $j$'s total market, and submarkets (areas of a directory's market that are served by a uniform set of directories) are distinguished only by the presence of an "irrelevant alternative." Under the logit model, the ratio $s_j/s_0$ is independent of the presence of these alternatives, so $s_j/s_0$ is the same in each submarket.
25 That is, the effect that increasing the variance of X will tend to make the estimated model match the second comparative static better and the third comparative static worse.
26 Since Yellow Pages are not sold through retail stores, there is no literal retail congestion in this market. However, one can think of our congestion model as capturing the possibility that households have a limited amount of bookshelf or drawer space, and throw out books that don't fit.
27 The policy question is whether or not welfare improves as competition increases. Multiple directories reduce market power but dissipate network effects. Rysman also estimates retailer demand for advertising and a publisher's first-order condition for setting the quantity of advertising. Here, we focus only on the consumer's decision.
28 The data were collected by National Yellow Pages Monitor. NYPM survey respondents maintain diaries of their Yellow Pages usage for one week. NYPM normally surveys between 1,000 and 3,000 people per MSA, although it used 11,200 respondents in the Los Angeles area. This usually results in at least a few hundred respondents even for very small directories.
29 As a measure of advertising, Rysman uses the number of pages in a book times the number of columns in a directory. The number is multiplied by .8 for directories that are observably smaller than a standard directory. For $X_j$, each directory is associated with a central county, and $X_j$ comes from county-level census data.



FIGURE 1 NUMBER OF DIRECTORIES PER PERSON [bar chart: percentage of consumers served, by number of directories]

Therefore, we can use the standard logit equation. For the simple logit model, we estimate

$$\ln(s_j) - \ln(s_0) = \alpha \ln(A_j) + X_j\beta + \xi_j.$$

To implement the crowding model, we take the crowding term to be the population-weighted average of $R_j$ across submarkets. In that case, we estimate

$$\ln(s_j) - \ln(s_0) = \alpha \ln(A_j) + X_j\beta + \ln(R_j) + \xi_j,$$

where

$$R_j = \sum_{k \in K(j)} \psi_{jk}\, R(k).$$


Here, $K(j)$ is the set of submarkets in $j$'s market area, $\psi_{jk}$ is the percentage of $j$'s population that lives in submarket $k$, and $J(k)$ is the number of products in submarket $k$.

We use two specifications of the crowding term $R_j$. The first is the parameterization suggested in Section 3: $R_j = (\gamma + (1 - \gamma)J)/J$. The second specification is nonparametric; we allow the $R_j$ to take on different values for each $J$. We observe very few markets with more than 5 directories, so we restrict markets with 6, 7, or 8 directories to have the same adjustment parameter in the nonparametric case. We estimate both specifications by the generalized method of moments (Hansen, 1982) using the same set of instruments as in Rysman (2004).

Results appear in Table 3. Parameter estimates suggest that congestion is important. In the parametric case, $\gamma = .62$ and is precisely measured. Recall that $\gamma = 0$ implies no crowding and $\gamma = 1$ is full crowding. In the nonparametric case, the parameters for the crowding term are close to being monotonic in $J$ and decrease at a decreasing rate. Wald tests reject the joint equality of the estimates for different $J$. Regarding estimates of the other parameters, the two crowding models find coefficients closer to zero than the simple logit model, presumably to compensate for the effect of the crowding terms on elasticities.

Table 4 presents elasticity and welfare estimates. The columns on the left present elasticities of usage with respect to advertising. While the differences are not tremendously large, there are some differences between the models. First, it appears that the standard logit specification overestimates elasticities by 10% to 20%. Second, the standard logit model underpredicts changes in elasticities as the number of products increases. When the number of products goes from 1 to 8, the standard logit model shows that elasticity increases by 14%, whereas the crowding models both find that elasticity increases by 23%. This coincides with our intuition about how standard logit-based models restrict the extent to which crowding can occur as the number of products increases.
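The population-weighted crowding term used in the parametric specification is simple to compute. The Python sketch below illustrates $R_j = \sum_{k \in K(j)} \psi_{jk} R(k)$ with $R(k) = (\gamma + (1 - \gamma)J(k))/J(k)$. The submarket population shares and directory counts are hypothetical, the function names are invented for this example, and the value of `gamma` is simply the parametric estimate from Table 3.

```python
import numpy as np

def crowding_R(J, gamma):
    """Parametric crowding term R = (gamma + (1 - gamma) * J) / J for J products."""
    J = np.asarray(J, dtype=float)
    return (gamma + (1.0 - gamma) * J) / J

def weighted_crowding_term(pop_shares, n_directories, gamma):
    """Population-weighted average of R(k) over the submarkets of one directory."""
    psi = np.asarray(pop_shares, dtype=float)
    if not np.isclose(psi.sum(), 1.0):
        raise ValueError("population shares must sum to one")
    return float(np.sum(psi * crowding_R(n_directories, gamma)))

# Hypothetical directory: 60% of its population faces no competitor,
# 40% lives where three directories are distributed.
gamma = 0.616
R_j = weighted_crowding_term([0.6, 0.4], [1, 3], gamma)
print(R_j, np.log(R_j))
```

The logarithm of this term is what enters the estimating equation alongside advertising and demographics; in a GMM setup, $\gamma$ would be chosen jointly with the other parameters rather than fixed in advance as it is here.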


TABLE 3 Estimation Results for Yellow Pages Data

| Variable | Standard: Coefficient | Standard: Standard Error | Parametric Crowding Term: Coefficient | Parametric Crowding Term: Standard Error | Nonparametric Crowding Term: Coefficient | Nonparametric Crowding Term: Standard Error |
|---|---|---|---|---|---|---|
| Advertising | .705 | (.069) | .631 | (.070) | .632 | (.073) |
| Constant | -6.08 | (1.07) | -4.92 | (1.01) | -4.94 | (1.17) |
| % urban population | -.023 | (.006) | -.016 | (.005) | -.013 | (.005) |
| % lived in different county | .078 | (.015) | .058 | (.013) | .061 | (.016) |
| % lived in different state | .047 | (.020) | .031 | (.017) | .027 | (.023) |
| % own house | -.019 | (.012) | -.020 | (.011) | -.021 | (.012) |
| % graduated high school | -.042 | (.014) | -.032 | (.012) | -.040 | (.013) |
| % graduated college | -.015 | (.016) | -.023 | (.014) | -.007 | (.016) |
| Per-capita income | .029 | (.021) | .032 | (.018) | .022 | (.022) |
| Telco book | 1.156 | (.103) | 1.050 | (.100) | 1.018 | (.103) |
| County population growth | .003 | (.016) | .012 | (.014) | .016 | (.015) |
| % take public transportation | -.035 | (.030) | -.023 | (.027) | -.041 | (.034) |
| % have not moved | .072 | (.016) | .047 | (.015) | .054 | (.019) |
| Population density | -1.11E-0 | (3.88E-05) | -8.39E-05 | (3.43E-05) | -7.20E-0 | (3.50E-05) |
| Gamma | | | .616 | (.120) | | |
| Adjustment, J = 1 | | | | | 0 | Fixed |
| Adjustment, J = 2 | | | | | -.350 | (.142) |
| Adjustment, J = 3 | | | | | -.343 | (.177) |
| Adjustment, J = 4 | | | | | -.743 | (.217) |
| Adjustment, J = 5 | | | | | -.865 | (.308) |
| Adjustment, J = 6, 7, 8 | | | | | -.967 | (.364) |

More striking are the welfare calculations. The logit model predicts that even the 7th and 8th Yellow Pages directories imply nontrivial welfare increases, over a third of what the first directory generates. On the other hand, the crowding model implies much lower benefits from new directories. When going from 1 to 8 directories, the standard model finds that welfare increases by over 400%. Under the crowding models, welfare increases by 180% and 146% for the parametric and nonparametric cases. These rates of increase are precisely measured and significantly different across models. Note that the nonparametric model actually finds that welfare decreases when going from 3 to 4 directories. The possibility that welfare actually increases is well within confidence intervals for these estimates, and this result disappears when we parameterize the crowding function. For assessing welfare gains to new products, and to a lesser extent in estimating advertising elasticities, standard logit-based models appear to give biased results in this dataset.

TABLE 4 Summary Variables for Yellow Pages Data (standard errors in parentheses)

| Firms | Elasticity: Standard | Elasticity: Parametric | Elasticity: Nonparametric | Welfare: Standard | Welfare: Parametric | Welfare: Nonparametric |
|---|---|---|---|---|---|---|
| 1 | .55 (.052) | .45 (.053) | .45 (.054) | .20 (.007) | .28 (.025) | .27 (.026) |
| 2 | .58 (.056) | .52 (.057) | .52 (.060) | .36 (.012) | .37 (.012) | .36 (.026) |
| 3 | .60 (.058) | .55 (.059) | .54 (.060) | .51 (.015) | .45 (.019) | .50 (.044) |
| 4 | .61 (.059) | .56 (.060) | .57 (.064) | .63 (.018) | .52 (.032) | .46 (.061) |
| 5 | .62 (.060) | .57 (.061) | .58 (.066) | .74 (.020) | .59 (.044) | .50 (.108) |
| 6 | .63 (.061) | .57 (.062) | .58 (.066) | .84 (.022) | .66 (.055) | .53 (.136) |
| 7 | .64 (.062) | .58 (.063) | .59 (.066) | .93 (.023) | .72 (.064) | .60 (.149) |
| 8 | .64 (.062) | .58 (.063) | .59 (.067) | 1.02 (.024) | .78 (.073) | .66 (.159) |
| Increase (%) | | | | 410.5 (5.4) | 180.5 (49.4) | 146.1 (67.3) |

6. An alternative approach

In this section we briefly propose an alternative approach to allowing crowding in standard discrete-choice models. Intuitively, this approach tries to model a situation where additional firms entering the market differentiate into dimensions of unobserved characteristic space that consumers care less about. This seems to make intuitive sense; for example, in a market with only a few breakfast cereals, cereals may be primarily differentiated by how healthy they are or how crunchy they are. In a market with many cereals, cereals may be primarily differentiated only by the characters on their boxes, likely a less important characteristic. Ackerberg and Rysman (2003) (see www.rje.org/main/sup-mat.html) construct a structural model exhibiting this property. Applied to a basic logit model, this model generates market shares of the form

$$s_j = \frac{\exp\left(u_j / g(J, r)\right)}{1 + \sum_k \exp\left(u_k / g(J, r)\right)},$$

where $g(J, r)$ is a function of the number of products in the market and a parameter $r$. Note the similarity between this model and the model of Section 3. Both allow $J$ to enter the market share equation: the former adjusts the equation multiplicatively, the latter adjusts it additively. This multiplicative adjustment essentially allows the variance of the logit errors to depend on $J$. Ackerberg and Rysman (2003) show that the implications of this "multiplicative" approach are very similar to what we derived above for the "additive" approach. A $g(J, r)$ that decreases in $J$ implies that welfare benefits of new products in crowded markets are attenuated, that elasticities increase in more crowded markets (relative to a case without a crowding term), and that the three comparative statics from Section 2 can be matched with the additional parameter $r$. They also discuss estimation and study the impact of the adjustment in Monte Carlo studies similar to those here.
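To illustrate the elasticity claim, the Python sketch below evaluates own-price elasticities under a multiplicative adjustment of the form $s_j = \exp(u_j/g)/(1 + \sum_k \exp(u_k/g))$. The functional form $g(J, r) = J^{-r}$ is purely an assumption made for this example (it equals 1 when $J = 1$ and decreases in $J$ for $r > 0$); the article does not specify $g$ here. Utility is taken to be $u_j = \beta_0 - \alpha p_j$ with identical products, so that $\partial s_j/\partial p_j = -(\alpha/g)\, s_j (1 - s_j)$.

```python
import numpy as np

def g(J, r):
    """Hypothetical scaling function (an assumption): g = J**(-r), decreasing in J if r > 0."""
    return float(J) ** (-r)

def shares(u, r):
    """Logit shares with mean utilities divided by g(J, r); outside good has utility 0."""
    J = len(u)
    v = np.exp(u / g(J, r))
    return v / (1.0 + v.sum())

def own_price_elasticity(price, alpha, J, r, beta0=1.0):
    """Own-price elasticity of one of J identical products with u = beta0 - alpha * price."""
    u = np.full(J, beta0 - alpha * price)
    s = shares(u, r)[0]
    # d s_j / d p_j = -(alpha / g) * s_j * (1 - s_j), so elasticity = -(alpha / g) * p * (1 - s_j).
    return -(alpha / g(J, r)) * price * (1.0 - s)

e_few = own_price_elasticity(price=2.0, alpha=1.0, J=2, r=0.5)
e_many = own_price_elasticity(price=2.0, alpha=1.0, J=10, r=0.5)
print(e_few, e_many)
```

With $r = 0$ the function $g$ equals 1 and the expression collapses to the standard logit elasticity $-\alpha p_j (1 - s_j)$; with $r > 0$ demand becomes more elastic as the market gets more crowded.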

7. Conclusion
This article highlights problems that arise as a result of the way that standard discrete-choice models handle symmetric unobserved product differentiation. We show that restrictive assumptions about the relationship between the number of products in a market and the dimensionality of unobserved characteristic space can lead to significantly biased estimates of elasticities and welfare changes. We suggest a straightforward adjustment that introduces the number of products in a market into the estimating equation. We present a structural interpretation of our solution, showing how it could arise from an agent maximization problem. We end with Monte Carlo and empirical evidence showing that this issue can be important in practice.

References
ANDERSON, S., DE PALMA, A., AND THISSE, J.-F. Discrete Choice Theory of Product Differentiation. Cambridge, Mass.: MIT Press, 1992.
ARCIDIACONO, P. "Affirmative Action in Higher Education: How Do Admission and Financial Aid Rules Affect Future Earnings?" Econometrica, Vol. 73 (2005), pp. 1477-1524.
BAJARI, P. AND BENKARD, C.L. "Discrete Choice Models as Structural Models of Demand: Some Economic Implications of Common Approaches." Working Paper, Stanford University, 2001.
-- AND --. "Demand Estimation with Heterogeneous Consumers and Unobserved Product Characteristics: A Hedonic Approach." NBER Working Paper no. 272, 2003.
BERRY, S.T. "Estimating Discrete Choice Models of Product Differentiation." RAND Journal of Economics, Vol. 25 (1994), pp. 242-262.
-- AND PAKES, A. "Estimating the Pure Hedonic Discrete Choice Model." Working Paper, Yale University, 1999.
-- AND WALDFOGEL, J. "Free Entry and Social Inefficiency in Radio Broadcasting." RAND Journal of Economics, Vol. 30 (1999), pp. 397-420.
--, LEVINSOHN, J.A., AND PAKES, A. "Automobile Prices in Market Equilibrium." Econometrica, Vol. 63 (1995), pp. 841-890.
BRESNAHAN, T.F. "Competition and Collusion in the American Auto Industry: The 1955 Price War." Journal of Industrial Economics, Vol. 35 (1987), pp. 457-482.
--, STERN, S., AND TRAJTENBERG, M. "Market Segmentation and the Sources of Rents from Innovation: Personal Computers in the Late 1980s." RAND Journal of Economics, Vol. 28 (1997), pp. S17-S44.
CARDELL, N.S. "Variance Components Structures for the Extreme Value and Logistic Distributions with Applications to Models of Heterogeneity." Econometric Theory, Vol. 13 (1997), pp. 185-213.
CRAWFORD, G. "The Impact of the 1992 Cable Act on Household Demand and Welfare." RAND Journal of Economics, Vol. 31 (2000), pp. 422-449.
-- AND SHUM, M. "Uncertainty and Learning in Pharmaceutical Demand." Econometrica, Vol. 73 (2005), pp. 1137-1173.
FEENSTRA, R.C. AND LEVINSOHN, J.A. "Estimating Markups and Market Conduct with Multidimensional Product Attributes." Review of Economic Studies, Vol. 62 (1995), pp. 19-52.
HENSHER, D.A. AND GREENE, W.H. "Specification and Estimation of the Nested Logit Model: Alternative Normalization." Transportation Research, Part B-Methodological, Vol. 36 (2002), pp. 1-17.
LESLIE, P. "Price Discrimination in Broadway Theater." RAND Journal of Economics, Vol. 35 (2004), pp. 520-541.
MCFADDEN, D. "Conditional Logit Analysis of Qualitative Choice Behavior." In P. Zarembka, ed., Frontiers in Econometrics. New York: Academic Press, 1974.
--. "On Independence, Structure, and Simultaneity in Transportation Demand Analysis." Working Paper no. 7511, Institute of Transportation and Traffic Engineering, UC Berkeley, 1975.
-- AND TRAIN, K. "Mixed MNL Models for Discrete Response." Journal of Applied Econometrics, Vol. 15 (2000), pp. 447-470.
NEVO, A. "Measuring Market Power in the Ready-to-Eat Cereal Industry." Econometrica, Vol. 69 (2001), pp. 307-342.
PAKES, A. "A Reconsideration of Hedonic Price Indexes with an Application to PCs." American Economic Review, Vol. 93 (2003), pp. 1578-1596.
PETRIN, A. "Quantifying the Benefits of New Products: The Case of the Minivan." Journal of Political Economy, Vol. 110 (2002), pp. 705-729.
RYSMAN, M. "Competition Between Networks: A Study of the Market for Yellow Pages." Review of Economic Studies, Vol. 71 (2004), pp. 483-512.
SHUM, M. "Does Advertising Overcome Brand Loyalty? Evidence from the Breakfast-Cereals Market." Journal of Economics and Management Strategy, Vol. 13 (2004), pp. 241-272.
TOWN, R. AND LIU, S. "The Welfare Impact of Medicare HMOs." RAND Journal of Economics, Vol. 34 (2003), pp. 719-736.
TRAJTENBERG, M. Economic Analysis of Product Innovation: The Case of CT Scanners. Cambridge, Mass.: Harvard University Press, 1990.
