Miguel Santana
1 Summary
In “Salience and Taxation: Theory and Evidence” (2009), Raj Chetty, Adam Looney and Kory Kroft set out to investigate
one of the main tenets of orthodox public economics: consumers fully optimize with respect to tax-inclusive prices, so
that taxes only affect demand insofar as they affect market prices. For the European economist, this sentence may seem
a truism: after all, posted prices already include taxes, so it cannot be that consumers take into account anything other
than the tax-inclusive price. However, in the United States prices are usually posted without the sales tax. This means
consumers are explicitly informed of the full price only when they are about to pay at the register. As
a consequence, there is scope for incomplete optimisation, since consumer choice is subject to two stages: first, decisions
are made without clear information on the prices paid; then, after buyers have already made a preliminary decision on
quantities, they are informed about full expenditure. This lends itself to the study of salience effects: do non-salient taxes
influence demand through channels other than their quantitative impact on the prices paid? To answer this question,
consider the demand equation
\[ \ln x(p, \tau^s) = \alpha - \varepsilon_{x,p} \ln p - \varepsilon_{x,1+\tau^s} \ln(1 + \tau^s), \tag{1} \]
where $p$ is the pre-tax price, $\tau^s$ is the non-salient tax, $x(p, \tau^s)$ is the demand for good $x$ at pre-tax price $p$ and non-salient
tax $\tau^s$, and $\varepsilon_{x,(\cdot)}$ are elasticities. If it is true that salience matters, the slope coefficients should not be equal. Therefore,
the hypothesis of incomplete optimisation is
\[ \theta_\tau \equiv \frac{\partial \ln x(p, \tau^s) / \partial \ln(1 + \tau^s)}{\partial \ln x(p, \tau^s) / \partial \ln p} = \frac{\varepsilon_{x,1+\tau^s}}{\varepsilon_{x,p}} \neq 1. \tag{2} \]
Note that ignorance of tax rates cannot account for these effects: in that case, myopia is simply being added to the
standard model. $\theta_\tau \neq 1$ can only happen if fully informed consumers under- or overreact to non-salient taxes.
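To make the definition in equation (2) concrete, here is a minimal Python sketch of the demand curve in equation (1). The parameter values are hypothetical, chosen only so that the ratio comes out at 0.35; they are not estimates from the paper or from this review. The function names are illustrative.

```python
import math

def log_demand(p, tau_s, alpha, eps_p, eps_tau):
    """Log demand as in equation (1): alpha - eps_p*ln(p) - eps_tau*ln(1 + tau_s)."""
    return alpha - eps_p * math.log(p) - eps_tau * math.log(1.0 + tau_s)

def theta_tau(eps_p, eps_tau):
    """Ratio of the tax elasticity to the price elasticity, as in equation (2)."""
    return eps_tau / eps_p

# Hypothetical parameter values, for illustration only.
ALPHA, EPS_P, EPS_TAU = 2.0, 1.5, 0.525

base = log_demand(1.0, 0.05, ALPHA, EPS_P, EPS_TAU)
# Raise the pre-tax price by 1%, holding the tax fixed.
d_price = log_demand(1.01, 0.05, ALPHA, EPS_P, EPS_TAU) - base
# Raise the gross-of-tax factor (1 + tau_s) by 1%, holding the price fixed.
d_tax = log_demand(1.0, 1.05 * 1.01 - 1.0, ALPHA, EPS_P, EPS_TAU) - base

print(theta_tau(EPS_P, EPS_TAU))  # ratio of elasticities
print(d_tax / d_price)            # same ratio, recovered from demand responses
```

Under full optimisation the two demand responses would coincide and the ratio would equal one; a ratio below one corresponds to underreaction to the non-salient tax.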
To test this hypothesis, the authors adopt two empirical strategies. The first strategy consists of estimating treatment
effects in data from an experiment at a grocery store. During a three-week period in 2006, 750 products in an aisle of
a supermarket in the United States had their price tags changed so as to include information on the full price. Using
two control groups (other products in the same aisle and the same products in other supermarkets in nearby cities),
the authors perform differences-in-differences(-in-differences) estimation to obtain a point estimate of $\hat{\theta}_\tau = 0.35$, which is
statistically significant and suggests consumers underreact to taxes. The second strategy relies on observational data for
alcohol consumption, and is motivated by the possibility that the results of the experiment described above were driven
not by differences in tax salience, but by other mechanisms through which price tags themselves may act. In a panel of 50
states (plus the District of Columbia) over 34 years, the authors analyse the effect of changes in excise beer taxes (which
are salient) and in local sales taxes (which are not) on beer consumption per capita, effectively estimating tax-demand
curves. After controlling for several factors and performing robustness checks, the average point estimate is $\hat{\theta}_\tau = 0.06$, an
even more extreme result than in the experimental setting.
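The differences-in-differences(-in-differences) logic of the first strategy can be sketched as plain arithmetic. The cell means below are hypothetical and are not data from the experiment; the sketch only shows how the within-store comparison (treated products against other products in the same aisle) is then differenced against the same comparison in the control stores.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus change in the control group."""
    return (treated_after - treated_before) - (control_after - control_before)

# Hypothetical mean quantities sold (not data from the paper).
# "treated" = products whose tags were changed, "control" = other products in the aisle.
treatment_store = {"treated_before": 26.0, "treated_after": 24.0,
                   "control_before": 27.0, "control_after": 28.0}
control_stores = {"treated_before": 30.0, "treated_after": 30.5,
                  "control_before": 28.0, "control_after": 28.0}

dd_treatment_store = diff_in_diff(**treatment_store)  # within the experiment store
dd_control_stores = diff_in_diff(**control_stores)    # same comparison elsewhere
ddd = dd_treatment_store - dd_control_stores          # triple difference

print(dd_treatment_store, dd_control_stores, ddd)  # -3.0 0.5 -3.5
```

The triple difference removes both product-specific shocks common to all stores and store-specific shocks common to all products, which a single double difference would not.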
The final sections of the paper turn to the implications of salience effects on theory. First, the authors briefly discuss
what kind of positive theories may be associated with underreaction to taxation. Then, they adapt the traditional
formulas of tax incidence and excess burden of taxation to account for incomplete optimisation. Here, it is shown that
under additional assumptions, some standard results no longer hold. These include the independence of statutory and
economic incidence and the positive relation between elasticities and incidence. Chetty and co-authors conclude by suggesting
that the adapted formulas can be used in other settings, such as complex social programs with non-salient features.
This review focuses on the second empirical strategy, the estimation of tax-demand curves. Section 2 replicates and
discusses the results. Section 3 questions the key identifying assumptions and provides an alternative way to show that
the introduction of stricter regulations on drunk driving does not affect the tax elasticities. Section 4 concludes.
2 Results
\[ \Delta \ln \mathit{beerpc}_{st} = \alpha - \varepsilon_{x,1+\tau^E} \ln(1 + \tau^E_{st}) - \varepsilon_{x,1+\tau^s} \ln(1 + \tau^s_{st}) + x'_{st} \rho + \varepsilon_{st}, \quad s = 1, \dots, 49, \; t = 1971, \dots, 2003 \tag{3} \]
where $\mathit{beerpc}_{st}$ is beer consumption per capita, $1 + \tau^E_{st}$ is the gross-of-excise-tax price, $1 + \tau^s_{st}$ is the gross-of-sales-tax
price, and $x_{st}$ is a vector of covariates. Two states must be excluded from the sample: Hawaii and West Virginia. The
exclusion of the former is due to the fact that it levies no sales tax; the latter is omitted due to low correlation between the
sales tax and state tax revenues. Since identification requires significant variation in the sales tax, the authors regress the
percentage change in tax revenues on the percentage change in the sales tax, controlling for state income. Low correlation
may indicate frequent changes in the tax base, leading to imprecise estimates for the elasticities.
Table 1 and Figures 1 and 2, which replicate, respectively, Table 5 and Figures 2A and 2B of the paper, are instructive.
Summary statistics for the pooled dataset show that excise and sales taxes are not a negligible fraction of the price paid.
Indeed, excise and sales taxes represent, on average, 6.5% and 4.5% of the price, respectively. On the other hand, the
figures suggest that the relation inferred from the grocery store experiment is still present. They plot
the mean log changes in beer consumption per capita against rounded log changes in taxes¹. Since the slope of the fitted
line in Figure 1, which considers the excise tax, is considerably steeper than the slope of the fitted line in Figure 2, which
considers the sales tax, we still expect to find that consumers underreact to non-salient taxes.
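The construction behind such figures can be sketched as follows: state-year observations are grouped by their rounded log tax change, the consumption changes are averaged within each group, and a line is fitted through the resulting points. The bin width, the data and the unweighted fit below are all assumptions made for illustration; the rounding actually used is not reproduced here.

```python
from collections import defaultdict

def binned_means(tax_changes, consumption_changes, bin_width):
    """Average consumption changes within bins of rounded tax changes."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for d_tax, d_cons in zip(tax_changes, consumption_changes):
        bin_index = round(d_tax / bin_width)  # nearest multiple of bin_width
        sums[bin_index] += d_cons
        counts[bin_index] += 1
    return [(k * bin_width, sums[k] / counts[k]) for k in sorted(sums)]

def fitted_slope(points):
    """Unweighted least-squares slope through (x, y) points."""
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx

# Hypothetical log changes for five state-year pairs (illustration only).
d_tax = [0.0, 0.001, 0.009, 0.011, 0.021]
d_beer = [0.02, 0.0, -0.01, -0.03, -0.04]

points = binned_means(d_tax, d_beer, bin_width=0.01)  # bin width is an assumption
slope = fitted_slope(points)
print(points, slope)
```

Comparing the slope obtained with excise tax changes to the one obtained with sales tax changes gives the visual counterpart of the elasticity comparison discussed in the text.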
1 Mean log changes in beer consumption are computed as averages across those state-year pairs with the same rounded gross-of-tax price (to
Table 2, analogous to Table 6 of the paper, presents the estimates of the model in equation (3) and the p-value of an
F-test for the equality of the tax elasticities. As indicated in the body of the paper, standard errors should be clustered
by state, even though the results reported in the paper are not generated with clustering. Since clustering adjusts for serial
correlation, we nevertheless adopt that practice². Column (1) refers to a baseline regression which includes only year fixed
effects and log changes in population as covariates. The remaining columns estimate the same model with additional
covariates. Column (2) controls for the business cycle by including log changes in income per capita and the unemployment
rate. Column (3) includes dummies for the imposition of stricter drunk driving regulations as regressors, since the excise tax,
being a policy instrument, may be correlated with other measures intended to reduce incentives for alcohol consumption.
Finally, column (4) controls for the possibility that changes in the excise tax rate are related to the evolution of social norms by
introducing region fixed effects into the underlying model. Using clustered standard errors, the null hypothesis of equal
elasticities is rejected at the 5% significance level for all specifications except that of column (1). Furthermore, the direction
of the alternative hypothesis is as we suspected: the excise tax elasticity is always higher (in absolute value) than the sales tax elasticity.
Note, however, that we can never reject the null hypothesis that the sales tax elasticity is equal to zero at any conventional
significance level.
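The test of equal elasticities involves a single linear restriction, so it can be sketched directly from two coefficient estimates and the relevant entries of their covariance matrix (cluster-robust, if standard errors are clustered by state). The numbers below are hypothetical and are not results from Table 2; the p-value uses a standard normal approximation, which is an assumption of this sketch and not necessarily what a statistical package reports.

```python
import math

def equality_test(b_excise, b_sales, var_excise, var_sales, cov_excise_sales):
    """Wald test of H0: b_excise == b_sales (one linear restriction)."""
    diff = b_excise - b_sales
    # Variance of a difference: Var(a) + Var(b) - 2 Cov(a, b).
    var_diff = var_excise + var_sales - 2.0 * cov_excise_sales
    t_stat = diff / math.sqrt(var_diff)
    f_stat = t_stat ** 2  # with one restriction, F equals t squared
    # Two-sided p-value under a standard normal approximation (assumption);
    # with few clusters, a t or F reference distribution is more conservative.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return f_stat, p_value

# Hypothetical coefficient estimates and variances, for illustration only.
f_stat, p_value = equality_test(
    b_excise=-0.9, b_sales=-0.1,
    var_excise=0.09, var_sales=0.07, cov_excise_sales=0.0,
)
print(f_stat, p_value)
```

Because only the covariance matrix changes when clustering is switched on, the point estimates stay the same while the test statistic and its p-value move, which is why the replication can reach different conclusions from identical coefficients.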
Table 3, which replicates all columns of Table 7 of the paper with the exception of column (3) (for which no data is
publicly available), reports results for robustness checks. Since the excise tax rate in the baseline regression is converted to
ad valorem form by dividing the CPI-adjusted tax in dollars by the average national price of beer in year 2000, there is the
possibility that some of the variation in the excise tax is being incorrectly captured as a policy measure. Instead, it might
be due to inflation-induced erosion. As such, column (1) reports results for a model where the authors instrument for the
ad valorem rate using a time-invariant excise tax rate, which is generated by dividing the dollar amount of the excise tax
by the sample average of the national price of beer. Column (2) estimates the model with 3-year differences to account
for “learning effects”, acknowledging that it may take time for consumers to realise that tax rates have changed. Column (3)
reports results for the subsample of states where food, a reasonable substitute for beer, is exempt from the sales tax: in
that case, tax changes effectively induce changes in relative prices, so that if insensitivity to the sales tax is still present, it
cannot be due to dilution of beer consumption in a broad base. Column (4) accounts for substitution towards other forms
of alcohol consumption by changing the dependent variable to share of beer in total ethanol consumption. As the authors
point out, because the excise taxes are highly correlated across these beverages, this share is unaffected by the beer tax
rate; i.e., no substitution is taking place. As for the remaining robustness checks, columns (1) and (3) imply rejection of
the null hypothesis at a 5% significance level, while column (2) does not. Furthermore, we still have the result that the
sales tax elasticity is not statistically significant.
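The concern about inflation-induced erosion can be illustrated with a small sketch of the two rate definitions described above. All inputs are hypothetical, and the way the CPI adjustment is applied (deflating the nominal tax to year-2000 dollars) is an assumption about the construction, not a description of the authors' code.

```python
def baseline_rate(nominal_tax, cpi, cpi_2000, price_2000):
    """CPI-adjusted dollar tax (in year-2000 dollars) over the year-2000 beer price."""
    real_tax = nominal_tax * cpi_2000 / cpi
    return real_tax / price_2000

def policy_rate(nominal_tax, sample_mean_price):
    """Nominal dollar tax over a fixed sample-average beer price."""
    return nominal_tax / sample_mean_price

# Hypothetical inputs: the statutory tax per case is unchanged while the CPI rises.
years = [1990, 1995, 2000]
nominal_tax = {1990: 0.50, 1995: 0.50, 2000: 0.50}
cpi = {1990: 75.0, 1995: 90.0, 2000: 100.0}
PRICE_2000 = 12.5          # hypothetical national beer price in 2000
SAMPLE_MEAN_PRICE = 10.0   # hypothetical sample-average national beer price

baseline = [baseline_rate(nominal_tax[y], cpi[y], cpi[2000], PRICE_2000) for y in years]
policy = [policy_rate(nominal_tax[y], SAMPLE_MEAN_PRICE) for y in years]

for y, b, p in zip(years, baseline, policy):
    print(y, round(b, 4), round(p, 4))
```

With no legislative change, the baseline rate still falls over time because inflation erodes the real value of the tax, whereas the policy rate moves only when the statutory dollar amount does. This is the variation the instrument is meant to isolate.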
2 As a result, the p-values of the individual significance tests and of the above-mentioned F-test in this replication differ
from those in the paper. The values presented in the paper can be recovered by omitting the clustering option in the Stata do-file.
3 Discussion and Further Estimation
• The instrument for the excise tax rate should be policy-related, but cannot display any correlation with quantity
demanded. One possibility is the lagged state budget deficit. When budget deficits are high, it is reasonable to
expect that governments raise taxes. What is not to be expected, however, is that demand for beer is correlated
with the budget deficit.
• The instrument for the price should solely be supply-related, in the spirit of Working (1927). Here, we could use
data on barley production by the relevant suppliers to instrument for the price. Note that this would imply a
state-by-state analysis. If a single firm produces and sells beer for the whole of the US, then we are back to Chetty et al.’s
assumption, and there is no longer any need to instrument for the price.
Lack of data has prevented us from carrying out these estimation procedures.
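Although the estimation itself could not be carried out, the logic of the proposed instruments can be illustrated on simulated data. The sketch below is not the review's estimation: the data-generating process, the coefficient values and the variable names are all hypothetical. It only shows why an instrument that shifts the tax but is unrelated to demand shocks recovers the demand coefficient when ordinary least squares does not.

```python
import random

def covariance(a, b):
    """Sample covariance of two equally long lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

random.seed(0)
TRUE_COEFFICIENT = -1.0  # hypothetical effect of the tax on demand
n = 20000

# z: instrument (for example a lagged budget deficit), unrelated to demand shocks.
z = [random.gauss(0, 1) for _ in range(n)]
# u: unobserved demand shock that also influences the tax, creating endogeneity.
u = [random.gauss(0, 1) for _ in range(n)]
tax = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
demand = [TRUE_COEFFICIENT * ti + 2.0 * ui + random.gauss(0, 1)
          for ti, ui in zip(tax, u)]

ols = covariance(tax, demand) / covariance(tax, tax)  # biased by the shock u
iv = covariance(z, demand) / covariance(z, tax)       # simple IV (Wald) estimator

print(round(ols, 3), round(iv, 3))
```

In this simulated setting the OLS estimate is pulled towards zero by the common shock, while the IV estimate stays close to the hypothetical true coefficient. The same reasoning is what would make a valid policy-related instrument useful for equation (3).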
this is another underlying assumption; since demand in equation (1) is not a function of income, a possible representation of preferences is the
quasilinear utility function, where general equilibrium effects are absent.
5 Note the p-value for the individual significance test of the coefficient associated with the indicator for introduction of this law in column
(3) of Table 2.
and automatically suspended independent of criminal proceedings whenever a driver either refuses to submit to chemical
testing, or submits to testing with results indicating a blood alcohol content of 0.08% or higher.
Over time, more and more states implemented the ALR law. However, nine of them never did. Hence, we can
use this subsample as a control group to estimate the effects of such implementation on beer consumption per capita. Since
we have a panel data structure, we are in principle able to perform differences-in-differences estimation. Nevertheless, a
generalisation is needed: as different states introduce the law in different years, the traditional DD regression must include
year dummies, and the interaction term must be replaced by a policy variable, as suggested by Imbens and Wooldridge
(2007). Hence, the regression is
\[ \mathit{beerpc}_{sgt} = \alpha + \beta\, \mathit{treat}_g + y'\delta + \gamma\, \mathit{policy}_{sgt} + x'_{st}\rho, \quad s = 1, \dots, 49, \; g = 0, 1, \; t = 1970, \dots, 2003, \tag{4} \]
where $g = 1$ stands for the treatment group, $\mathit{beerpc}_{sgt}$ is per capita beer consumption, $\mathit{treat}_g$ is an indicator for inclusion in
the treatment group, $y$ is a vector of year dummies, $\mathit{policy}_{sgt}$ is an indicator for presence of the ALR law in the state-year
pair, and $x_{st}$ is a vector of covariates which includes the excise tax in dollars, the sales tax, state income per capita, the
state unemployment rate, state population, dummies for the presence of additional drunk driving regulations and region
fixed effects.
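The treatment and policy indicators in equation (4) can be built from a table of adoption years. The sketch below uses hypothetical states and adoption dates, and it assumes that the law counts as present from the adoption year onwards; neither is taken from the data used in this review.

```python
# Hypothetical adoption years; None marks a state that never adopts the law.
adoption_year = {"A": 1985, "B": 1992, "C": None}
years = range(1970, 2004)  # 1970 to 2003 inclusive

def build_indicators(adoption_year, years):
    """Return one row per state-year with the treat and policy dummies."""
    rows = []
    for state, adopted in sorted(adoption_year.items()):
        treat = 1 if adopted is not None else 0  # ever-adopter group (g = 1)
        for year in years:
            # policy switches on in the adoption year and stays on afterwards
            policy = 1 if adopted is not None and year >= adopted else 0
            rows.append({"state": state, "year": year,
                         "treat": treat, "policy": policy})
    return rows

rows = build_indicators(adoption_year, years)
print(len(rows))
print([r for r in rows if r["state"] == "A" and r["year"] in (1984, 1985)])
```

Because the switch-on date differs across states, the policy dummy is not collinear with the year dummies or the treatment-group indicator, which is what allows the coefficient on the policy variable to be estimated in the generalised regression.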
Column (1) of Table 4 reports the estimated coefficients. For a regression including all these controls, $\gamma$ is not
statistically different from zero at any conventional significance level. However, neither are most of the other alcohol regulation
indicators (with the exception of the lower 0.02 BAC limit for youth drivers), income per capita, the sales tax, the
unemployment rate, population, and the region fixed effects. Nevertheless, we still do not obtain a statistically significant
estimate for $\gamma$ if we exclude all these controls, as can be seen from column (2) of Table 4. Hence, we cannot reject the
null hypothesis that the ALR law has no influence on per capita beer consumption. Here the identification crucially
depends, as usual, on the “common trends” assumption, but in this context it needs to hold only conditional on
the covariates.
Note also that, with multiple time periods, we can perform a test for Granger causality by including leads and
lags of the policy variable in the regression, as Angrist and Pischke (2008) suggest. The results are given in column (3)
of Table 4 for 3-year lags and leads. None of the leads of the policy variable is significant, so we cannot reject the null
hypothesis that future adoption of the law has no effect on current consumption, as Granger causality running from the
law to consumption requires. Furthermore, the coefficient associated with the 2-year lag is statistically different
from zero at a 10% significance level (and negative, as one might expect), suggesting that beer consumption may respond
to the introduction of the ALR law with a lag.
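The regressors for such a test can be sketched as literal leads and lags of the policy indicator. The adoption year, the function names and the 3-year window default below are illustrative, and the sketch assumes that adoption dates are known beyond the edges of the estimation sample; whether the original do-file builds the variables this way is not stated.

```python
def policy(year, adopted):
    """1 if the law is in force in `year`, 0 otherwise (None = never adopted)."""
    return 1 if adopted is not None and year >= adopted else 0

def leads_and_lags(year, adopted, window=3):
    """Policy indicator plus its leads and lags up to `window` years."""
    row = {"policy": policy(year, adopted)}
    for k in range(1, window + 1):
        row[f"lead{k}"] = policy(year + k, adopted)  # law in force k years ahead
        row[f"lag{k}"] = policy(year - k, adopted)   # law in force k years ago
    return row

# Hypothetical state adopting in 1985, observed two years before adoption.
example = leads_and_lags(1983, 1985)
print(example)
```

Significant coefficients on the leads would indicate that consumption moves before the law is introduced, which would cast doubt on a causal reading; significant lags indicate a delayed response.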
4 Conclusion
In this review, we have replicated the results reported in Section III of Chetty et al. (2009). When we cluster
standard errors by state, as the authors suggest in the paper, we find that the test statistic for the
F-test of equality of tax elasticities changes. In addition, we note that the sales tax is insignificant in all specifications, leading to
possible inefficiencies in estimation if the specification is otherwise correct.
The assumption of perfectly elastic supply at the state level may be called into question, since the relevant robustness
check used the excise tax as an instrument. In a context of optimal taxation, this ceases to be methodologically sound.
Accordingly, we suggest two possible instruments to consistently estimate the demand equation (3). Finally, we perform
differences-in-differences estimation to analyse the effects of ALR laws on beer consumption per capita. If anything, we
find that such laws have a lagged effect on consumption.
5 Appendix
Figure 1
Figure 2
Table 5 - Summary Statistics for State Beer Consumption, Taxes, and Regulation
Table 3 - Effect of Excise and Sales Taxes on Beer Consumption: Robustness Checks
Dependent variable, columns (1)-(3): Change in log(per capita beer consumption)
Dependent variable, column (4): Share of ethanol from beer
(1) IV for excise with policy rate
(2) 3-year differences
(3) Food exempt
(4) Share of ethanol from beer
References
[1] Angrist, J., Pischke, J.-S., (2008) Mostly Harmless Econometrics: An Empiricist’s Companion, Draft Manuscript
[2] Chetty, R., Looney, A., Kroft, K., (2009) Salience and Taxation: Theory and Evidence, American Economic Review,
vol. 99, issue 4, 1145-1177
[3] Imbens, G., Wooldridge, J., (2007) NBER Summer Institute 2007 Methods Lectures, Lecture 10 (available at
http://www.nber.org/WNE/lect_10_diffindiffs.pdf)
[4] Working, E.J., (1927) What Do Statistical "Demand Curves" Show?, The Quarterly Journal of Economics, vol. 41, issue
2, 212-235