To cite this article: Steven B. Caudill & Franklin G. Mixon Jr. (2016) Estimating class-specific
parametric models using finite mixtures: an application to a hedonic model of wine prices,
Journal of Applied Statistics, 43:7, 1253-1261, DOI: 10.1080/02664763.2015.1094036
Download by: [Orta Dogu Teknik Universitesi] Date: 08 April 2016, At: 03:12
Journal of Applied Statistics, 2016
Vol. 43, No. 7, 1253–1261, http://dx.doi.org/10.1080/02664763.2015.1094036
Hedonic price models are commonly used in the study of markets for various goods, most notably those
for wine, art, and jewelry. These models were developed to estimate implicit prices of product attributes
within a given product class, where in the case of some goods, such as wine, substantial product
differentiation exists. To address this issue, recent research on wine prices employs local polynomial
regression clustering (LPRC) for estimating regression models under class uncertainty. This study
demonstrates that a superior empirical approach – estimation of a mixture model – is applicable to a
hedonic model of wine prices, provided only that the dependent variable in the model is rescaled. The
present study also catalogues several advantages of estimating mixture models over LPRC modeling.
Keywords: local polynomial regression clustering; finite mixture models; latent class models; hedonic
price models; wine markets
© 2015 Taylor & Francis
1254 S.B. Caudill and F.G. Mixon
[8] employ a unique approach – local polynomial regression clustering (LPRC) (see [3]) – to
estimating hedonic price (regression) models under class uncertainty.
This study demonstrates that a superior approach – a mixture model – is applicable to a hedonic
model of wine prices, such as that in [8], provided that the dependent variable in the model is
rescaled. In doing so, this study details some advantages that the mixture approach may have
over LPRC and, as much as possible, it provides comparisons between the estimation results
from the two statistical methods using the California and Washington wine data examined in
[8]. Our empirical results show that the two mixture models estimated have better aggregate
out-of-sample performance than the LPRC models estimated in prior research. Before turning to
our empirical approach to hedonic pricing of a differentiated good (i.e. wine), we briefly review,
in the section that follows, both the foundational and more recent literature on hedonic price
models, with particular attention to those applied to wine.
verifiable, thus supporting the goods-classification research by Nelson [15,16], Darby and Karni
[9], and Mixon [14], and that offering wine through both retail channels is an effective strategy
for suppliers [13]. Next, Oczkowski and Doucouliagos [20] examine the relationship between
the price of wine and its quality through a literature review of more than 180 hedonic wine
price models developed over 20 years. Their review suggests that the relationship between the
price of wine and its sensory quality rating is a modest partial correlation of +0.30, which
exists despite the lack of information held by consumers about a wine’s quality and the
inconsistency of expert tasters in evaluating wines. The review of the literature in this study supports
the existence of strategic buying opportunities for better-informed consumers as well as strategic
price setting possibilities for wine producers given the incomplete quality information held by
consumers [20].
A recent study of house prices by Belasco et al. [1] demonstrates how academic researchers
are employing more sophisticated statistical procedures in order to better understand how hedonic
pricing works. Through the use of a finite mixture model, these authors are able to identify
latent submarkets in housing that are based on the demographic characteristics of residents. The
approach taken in this study is important, given that participants in each submarket value housing
characteristics differently [1]. The academic literature on the hedonic pricing of wine is also
making use of more advanced statistical procedures in order to explore separate classifications of
wine. A particularly interesting study along these lines is that of Costanigro et al. [8], who point
out that economists have long been interested in markets for differentiated products, adding also
that when differentiated products are located far apart in product space, they no longer compete
against each other [11]. The hedonic approach as developed by Rosen [21] is meant to estimate
implicit prices of product attributes within a given product class (see Note 5). However, as the product space
between two given goods increases, the market valuation of the attributes included in them will
diverge [8].
In addressing the product space or separate markets issue in the case of wine, Costanigro
et al. [8] employ a unique approach – LPRC (see [3]) – to estimating hedonic price (regression)
models under class uncertainty. LPRC consists of several steps, beginning with the estimation of
the functional relationship between dependent and independent variables in the regression model
locally (via local polynomial regression), and followed by aggregating the sample observations
into clusters sharing functionally similar local nonparametric estimates. Finally, once the clusters
have been identified, class-specific parametric estimates are obtained by ordinary least squares
(OLS) using data for each cluster.
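
The three LPRC steps described above can be illustrated with a minimal sketch. It is not the implementation used in [8]. It assumes a single continuous regressor with distinct values, uses a tricube-weighted local linear fit over the k nearest neighbours as a simple stand-in for local polynomial regression, clusters on the estimated slope alone with a naive Ward-style agglomeration, and fits a simple OLS line per cluster. All function names (local_slopes, ward_clusters, ols_line, lprc_sketch) are hypothetical.

```python
def local_slopes(x, y, k):
    """Step 1: weighted local linear fit at each point; return the local slope dy/dx."""
    slopes = []
    for x0 in x:
        # k nearest neighbours of x0, weighted by the tricube kernel
        idx = sorted(range(len(x)), key=lambda i: abs(x[i] - x0))[:k]
        h = max(abs(x[i] - x0) for i in idx)  # assumes distinct x values, so h > 0
        w = [(1 - (abs(x[i] - x0) / h) ** 3) ** 3 for i in idx]
        sw = sum(w)
        xb = sum(wi * x[i] for wi, i in zip(w, idx)) / sw
        yb = sum(wi * y[i] for wi, i in zip(w, idx)) / sw
        sxy = sum(wi * (x[i] - xb) * (y[i] - yb) for wi, i in zip(w, idx))
        sxx = sum(wi * (x[i] - xb) ** 2 for wi, i in zip(w, idx))
        slopes.append(sxy / sxx)
    return slopes


def ward_clusters(values, n_clusters):
    """Step 2: merge the pair of clusters with the smallest Ward cost until n_clusters remain."""
    clusters = [[i] for i in range(len(values))]

    def mean(c):
        return sum(values[i] for i in c) / len(c)

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                na, nb = len(clusters[a]), len(clusters[b])
                # increase in within-cluster sum of squares if a and b are merged
                cost = na * nb / (na + nb) * (mean(clusters[a]) - mean(clusters[b])) ** 2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def ols_line(x, y):
    """Step 3: simple OLS fit; returns (intercept, slope)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    slope = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum((a - xb) ** 2 for a in x)
    return yb - slope * xb, slope


def lprc_sketch(x, y, k, n_clusters):
    """Run the three steps; returns a list of (member indices, (intercept, slope))."""
    slopes = local_slopes(x, y, k)
    clusters = ward_clusters(slopes, n_clusters)
    return [(c, ols_line([x[i] for i in c], [y[i] for i in c])) for c in clusters]
```

Because the clusters are formed from first-stage estimates and then treated as fixed in the OLS step, each observation is assigned to exactly one class in this sketch.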
Costanigro et al. [8] apply their method to estimate hedonic price regressions for a sample
of 9600 wines in California and Washington. Following a study conducted by Ernst & Young
Consulting [10] for Australian wine producers, Costanigro et al. [8] assume the existence of four
wine classes. For the first stage, Costanigro et al. [8] use the local regression smoothing (LOESS)
algorithm [4] and only the continuous independent variables in their data set to obtain estimated
partial derivatives at each point in the data set, after which the Ward algorithm is used to partition
the sample into four clusters based on the similarity of these derivatives. OLS is applied to each
of the four clusters, including several dummy variables omitted from the first stage local polynomial
regression, thus producing class-specific hedonic regression models. Costanigro et al. [8]
demonstrate the usefulness of their approach using their estimation results to make out-of-sample
forecasts for a similar sample of wines (see Note 6).
This study demonstrates that a superior approach – a mixture model – is applicable to a hedonic
model of wine prices, such as that in [8] and the earlier studies discussed above, provided that
the dependent variable in the model is rescaled. In doing so, this study details some advantages
that the mixture approach may have over LPRC and, as much as possible, it provides comparisons
between the estimation results from the two statistical methods using the California and
Washington wines data examined in [8]. In the following section, we discuss the estimation of a
mixture model with concomitant variables, which are included in the probability function to help
explain regime or class membership. Such a discussion indicates that mixture models have some
advantages over the LPRC approach.
variables is to improve cluster assignment. Second, hypothesis testing on the coefficients of the
concomitant variables is possible. Finally, if desired, the mixture model can be estimated without
including any concomitant variables if their inclusion is worrisome.
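
As an illustration of how concomitant variables can enter the probability function, the sketch below assumes one common specification: multinomial-logit class probabilities and normal regression components. The functional forms actually estimated in this study are not shown in this excerpt, so the specification, the parameter names (betas, sigmas, gammas), and the function names are all assumptions made for illustration.

```python
import math


def class_probabilities(z, gammas):
    """Prior class probabilities from a multinomial logit in the concomitant variables z.

    gammas[j] holds the (assumed) coefficients for class j.
    """
    scores = [sum(g * zi for g, zi in zip(gamma, z)) for gamma in gammas]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def posterior_probabilities(y, x, z, betas, sigmas, gammas):
    """Probability that an observation (y, x, z) belongs to each class, by Bayes' rule."""
    priors = class_probabilities(z, gammas)
    joint = [
        p * normal_pdf(y, sum(b * xi for b, xi in zip(beta, x)), s)
        for p, beta, s in zip(priors, betas, sigmas)
    ]
    total = sum(joint)
    return [j / total for j in joint]
```

Under this assumed specification, passing z = [1.0] (an intercept only) makes the class probabilities constants, which corresponds to a mixture estimated without concomitant variables.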
There are a number of additional benefits of the mixture approach. The inclusion of dummy
variables as concomitant variables to assist in class assignment is straightforward, whereas the
inclusion of dummy variables in the LOESS approach [4] used by Costanigro et al. [8] in the
first stage is problematic. The mixture model also facilitates the use of information criteria for
the selection of the number of regimes or classes. This cannot be accomplished in the LPRC
approach in its present form, although Costanigro et al. [8] recognize this limitation and mention
it as the subject of future research. We now turn to the estimation of a mixture model employing
the Costanigro et al. [8] wine data, with comparisons to follow.
Notes: The dependent variable is rescaled. Table reports only the coefficients of the continuous independent variables.
Lastly, the figures in parentheses are absolute values of t-ratios.
For reference, Table 1 reports the Costanigro et al. [8] estimation results for the continuous
independent variables only (Cases, Score, and Age), using our rescaled dependent variable.
These coefficients correspond to the results in Table 5 in [8]; however, they are multiplied by 100.
Table 1 also provides information on the average wine price and fraction of sample observations
for each of the four Costanigro et al. [8] regimes. The coefficients for regimes 3 and 4 seem to be
similar, while the coefficients for regimes 1 and 2 are less so. Regime 1 represents about one-half
of the sample and has the lowest average wine price, which is about $17. Regime 4 accounts
for the fewest observations, about 5%, and the highest wine price, which is about $93. Taken
together, their first two regimes account for about 85% of the sample.
In order to demonstrate the usefulness of their approach, Costanigro et al. [8] forecast wine
prices for an additional sample of 3233 wines (see Note 8). They repeat their first steps to obtain four new
clusters from the sample, and then use the original in-sample-based OLS regressions to make
predictions. They measure forecasting precision using the median percentage error rate (MPER),
given by,
\[
\mathrm{MPER} = \operatorname{median}\left( \frac{|y - \hat{y}|}{y} \right). \tag{2}
\]
The LPRC method yields an overall MPER of 12.24% (see Note 9). This particular MPER is important
because it provides the easiest avenue for comparing results from the two methods. Aggregate
comparisons are the only comparisons that can be easily made between the two methods because
the resulting sample partitions, or clusters, from each method differ.
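
Equation (2) can be computed directly from observed and predicted prices. The following is a minimal sketch, assuming strictly positive prices and reporting the result in percent, which matches how the MPER values are quoted in the text. The function name mper is hypothetical.

```python
from statistics import median


def mper(actual, predicted):
    """Median percentage error rate: median of |y - y_hat| / y, expressed in percent."""
    return 100.0 * median(abs(y - y_hat) / y for y, y_hat in zip(actual, predicted))
```

For example, observed prices of 10, 20 and 40 with predictions of 11, 18 and 40 have relative errors of 0.10, 0.10 and 0, so the MPER is 10%.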
the values, thus preventing them from becoming too large or too small. When the mixture model
is used for out-of-sample forecasts, the resulting aggregate MPER is 10.17%, which is about
17% less than the value of 12.24% using LPRC.
Comparing predictions by cluster is complicated by the fact that the clusters identified are
not the same with LPRC and the mixture model, and, unlike LPRC, the mixture assignment to
clusters is probabilistic. In an effort to shed some light on the identification of clusters for the
two methods, we used in-sample predictions for both methods; then, using the clusters defined
by the in-sample LPRC, we calculated simple correlations between the LPRC prediction for that
cluster and the predictions from each of the four regimes of the mixture model. The results of
this exercise are provided in Table 3. What is immediately striking about the results is how high
the correlations are throughout, even though we are fitting the mixture predictions to the LPRC
partition. For example, all regimes have correlations in excess of 0.72 with the predictions from
the first LPRC class. Regime 3 has a correlation in excess of 0.88 with the second LPRC class.
The highest correlation with the third LPRC class is 0.663 with the predictions from regime 1.
Finally, the highest correlation with the fourth LPRC class, 0.658, is again with regime 1.
Clearly, there is considerable overlap in the predictions.
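
The correlation exercise described above can be sketched as follows. This is an illustration rather than the authors' code: it assumes the in-sample LPRC cluster labels, the LPRC predictions, and one list of predictions per mixture regime are already available, and the names pearson and correlations_by_cluster are hypothetical.

```python
import math


def pearson(a, b):
    """Simple (Pearson) correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def correlations_by_cluster(lprc_cluster, lprc_pred, regime_preds):
    """Within each LPRC cluster, correlate the LPRC prediction with each regime's prediction.

    Returns {cluster label: [correlation with regime 1, regime 2, ...]}.
    """
    table = {}
    for c in sorted(set(lprc_cluster)):
        rows = [i for i, label in enumerate(lprc_cluster) if label == c]
        base = [lprc_pred[i] for i in rows]
        table[c] = [pearson(base, [pred[i] for i in rows]) for pred in regime_preds]
    return table
```

Each row of the resulting table holds the correlations for one LPRC class, with one entry per mixture regime.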
mixture model, the disparities in cluster size and wine prices are not as great as with the LPRC
model. The effect of additional information on the switching function does increase the contrast
in cluster size and wine price as compared to the first model estimated (see Note 10). At 9.04%, the
aggregate MPER is once again below the LPRC value, in this case by about 26%.
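
The relative improvements quoted for the two mixture models (about 17% and about 26%) follow from the reported aggregate MPER values of 12.24% for LPRC and 10.17% and 9.04% for the mixture models. A small check, with the hypothetical helper relative_reduction:

```python
def relative_reduction(baseline, new):
    """Percentage by which `new` falls below `baseline`."""
    return 100.0 * (baseline - new) / baseline


# Aggregate MPER values reported in the text (in percent)
lprc_mper = 12.24
first_mixture = relative_reduction(lprc_mper, 10.17)   # roughly 16.9
second_mixture = relative_reduction(lprc_mper, 9.04)   # roughly 26.1
```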
5. Conclusion
Hedonic price models are commonly used in the study of markets for various goods, most notably
those for wine, art, and jewelry. These models were developed to estimate implicit prices of
product attributes within a given product class, where in the case of some goods, such as wine,
substantial product differentiation exists. To address this issue, recent research on wine prices
employs LPRC for estimating regression models under class uncertainty.
This study shows how a simple rescaling of the dependent variable makes possible the estimation
of a mixture model for hedonic wine regressions. Our empirical results show that the
two mixture models estimated have better aggregate out-of-sample performance than the LPRC
models estimated in prior research. This study also catalogues several advantages of estimating
mixture models over LPRC modeling.
Acknowledgements
The authors wish to thank, without implicating, an anonymous referee of this journal for providing helpful suggestions
on a previous version. Much of this research project was done while the first author was a visiting professor of economics
at the University of Milan – Bicocca.
Disclosure statement
No potential conflict of interest was reported by the authors.
Notes
1. A recent study by Sirmans et al. [24] provides a catalogue and extensive discussion of the most prevalent physical
property characteristics included in this line of inquiry. These characteristics are included in a new study by Salter
et al. [22], which controls for about 10 housing characteristics in examining the relationship between real estate
agent attractiveness and transactions prices.
2. Oczkowski [19] provides a concise comparison of popular and regression-based approaches to wine pricing.
3. See Schamel and Anderson [23] for a related hedonic price study of wines from both Australia and New Zealand.
4. See Combris et al. [6] for a related study of Burgundy wine, and Landon and Smith [12] for a study of the impact of
quality and reputation on the market for Bordeaux wine.
5. As Costanigro et al. [8] indicate, it is assumed that the goods in question are somewhat differentiated but similar
enough that consumers consider them as variations of the same product.
6. Costanigro et al. [8] admit, throughout their study, that the latent class, or finite mixture, model is the closest
competitor to LPRC. They also realize that the superior performance of LPRC in comparison to finite mixture models
would validate LPRC. However, the authors had little success estimating a finite mixture model.
7. This fmm procedure was not available to Costanigro et al. [8] at the time their study was produced. Exploratory tests
using the fmm procedure with the unscaled dependent variable proved troublesome in producing the current study.
8. In their study, Costanigro et al. [8] state that the size of the auxiliary sample is 3265, but we are able to reproduce
their pooled sample result of 19.71% given in their Table 3, so we feel confident in our calculations.
9. This MPER is presented in Table 3 of Costanigro et al. [8].
10. We also estimated a mixture model including the dummy variables as concomitant variables and the contrast between
cluster size and wine prices across the four clusters continued to increase.
References
[1] E. Belasco, M.C. Farmer, and C.A. Lipscomb, Using a finite mixture model of heterogeneous households to
delineate housing submarkets, J. Real Estate Res. 34 (2012), pp. 577–594.
[2] F. Caracciolo, L. Cembalo, and E. Pomarici, The hedonic price for an Italian grape variety, Ital. J. Food Sci. 25
(2013), pp. 289–294.
[3] K. Chen and Z. Jin, Local polynomial regression analysis of clustered data, Biometrika 92 (2005), pp. 59–74.
[4] W.S. Cleveland, S.J. Devlin, and E. Grosse, Regression by local fitting, J. Econometrics 37 (1988), pp. 87–114.
[5] P. Combris, S. Lecocq, and M. Visser, Estimation of a hedonic price equation for Bordeaux wine: Does quality
matter? Econ. J. 107 (1997), pp. 390–402.
[6] P. Combris, S. Lecocq, and M. Visser, Estimation of a hedonic price equation for Burgundy wine, Appl. Econ. 32
(2000), pp. 961–967.
[7] A. Corsi and S. Strom, The price premium for organic wines: Estimating a hedonic farm-gate price equation,
J. Wine Econ. 8 (2013), pp. 29–48.
[8] M. Costanigro, R.C. Mittelhammer, and J. McCluskey, Estimating class-specific parametric models under class
uncertainty: Local polynomial regression clustering in an hedonic analysis of wine markets, J. Appl. Economet.
24(1) (2009), pp. 117–135.
[9] M.R. Darby and E. Karni, Free competition and the optimal amount of fraud, J. Law Econ. 16 (1973), pp. 67–88.
[10] Ernst & Young Entrepreneurs, Etude des Filières et des Stratégies de Développement des Pays Producteurs de Vins
Dans le Monde: Analyse de la Filière Viticole Australienne, ONIVINS (Office National Interprofessionnel des
Vins), 1999.