
BIG DATA & ANALYTICS IN NETWORKED BUSINESS

USING BIG DATA TO MODEL TIME-VARYING EFFECTS FOR MARKETING RESOURCE (RE)ALLOCATION¹
Alok R. Saboo, V. Kumar, and Insu Park
J. Mack Robinson College of Business, Georgia State University, Atlanta, GA 30303 U.S.A.
{asaboo@gsu.edu} {vk@gsu.edu} {ipark8@gsu.edu}

Marketing resource allocation has been a topic of intense scrutiny, yet the literature on the topic has not paid adequate attention to the fact that the effectiveness of marketing-mix elements varies over time. Despite the fact that firms collect volumes of data on their customers, existing estimation approaches do not readily lend themselves to modeling the temporal variations for big data and provide little guidance to managers in terms of their resource allocation decisions. We address this gap and argue that marketing-mix effectiveness varies with the evolution of the consumer–brand relationship and explicitly model these temporal variations using a time-varying effects model (TVEM) that accounts for self-selection of customers into receiving marketing communications and endogeneity of the number of such communications. The proposed TVEM framework handles the complexities associated with big data analytics and provides novel insights for data-driven decision making. We combine transaction data from a Fortune 500 retailer with demographic information obtained from Acxiom Corp. for over a quarter million customers to test our framework. The results provide strong support for our proposed framework. Specifically, we find that the influence of marketing mailers, other transaction characteristics (coupon redemption, returns, and cross-buy), and demographic factors (age, income, household size, and interests) on sales varies significantly over the customer life cycle, and that ignoring such temporal variations can lead to gross misallocation of marketing investments. In particular, our results suggest that firms can increase their revenues by over 17 percent just by reallocating their resources based on the proposed framework. To facilitate adoption of our proposed framework, we provide guidance and actionable insights for managerial relevance.

Keywords: Time-varying effect model, TVEM, big data, dynamic marketing resource allocation, time series
models, dynamic models, marketing-mix effectiveness, direct marketing

Introduction¹

Firms have long been concerned with optimizing resource allocation across marketing activities such as advertising, promotions, incentives, or direct marketing to influence outcomes of interest (e.g., sales, employee turnover, and customer engagement). Firms typically base their resource allocation decisions on the effectiveness of their marketing investments such that a marketing input with a higher effectiveness (e.g., higher sales) gets more allocation than the one that is less effective (Raman et al. 2012). However, such rules of thumb can be misleading considering that marketing-mix effectiveness varies over time. For example, the product life cycle theory suggests that consumers are responsive to advertising in the early stages when they are looking for information, whereas they respond to promotions in the maturity stage (e.g., Parsons 1975). Similarly, marketing-mix effectiveness can change due to other market events such as entry of new competitors (Pan and Lehmann 1993), advances in technology (Chen and Stallaert 2014), or regulatory changes (Stremersch and Lemmens 2009). Finally, the effectiveness of marketing-mix elements can change due to the

¹ Bart Baesens, Ravi Bapna, James R. Marsden, Jan Vanthienen, and J. Leon Zhao served as the senior editors for this paper.

MIS Quarterly Vol. 40 No. 4, pp. 911-939/December 2016 911


Saboo et al./Using Big Data to Model Time-Varying Effects

evolution of the consumer–brand relationship that evolves after every interaction. All these factors suggest that resource allocations based on historical performance can be counterproductive and that managers should continuously reallocate marketing resources based on the expected returns (Kumar 2013). This frequent reallocation of marketing resources, although demanding, has been shown to provide superior shareholder returns in a recent study by McKinsey & Company (Fruk et al. 2013).

Real-time adjustment in organizational strategies requires continuous measurement and analysis of data to make information usable at a higher frequency. Fortunately, firms are now collecting huge amounts of data on their consumers and should be able to use that information to reallocate resources. Whereas the answer to “how much data is big” may vary by organizational size, the objective is to create value from the analysis of their data to provide novel insights. Yet, academics have not offered any systematic mechanism that handles the challenges associated with big data and generates real-time intelligence to guide this dynamic resource allocation problem, a gap that we seek to address through this research (e.g., Goes 2014).

Several theories explain the phenomenon through which the effectiveness of marketing inputs varies with changes in the consumer–brand relationship. As consumers become familiar with a brand either through external information (i.e., advertising) or product usage, their knowledge structure about the firm changes and decreases the uncertainty about the firm, affecting their price perceptions, purchase intentions, and willingness to pay (e.g., Rao and Monroe 1988). However, research on usage dominance suggests that once consumers become familiar with a brand and have tried the same, their personal usage experience dominates all external information as an input into the purchase decision, suggesting a reduction in effectiveness of advertising for future purchase decisions (Deighton et al. 1994). This finding is consistent with Hoch and Ha’s (1986) framing experiment where they find that advertising had no effect on the attitudes of participants who were allowed to experience the product. Similarly, psychological concepts such as habit or inertia have been documented to influence consumer purchases over time (e.g., Dubé et al. 2010). For instance, Givon and Horsky (1990) find that purchase reinforcement and habitual loyalty effects are stronger than advertising carryover effects. Thus, although there is enough evidence that marketing-mix effectiveness changes over the customer life cycle as customers learn about the firm, we are not aware of any empirical study that investigates this topic and provides managerial guidance for implementation.

We propose a time-varying effects model (TVEM) that utilizes big data to explicitly account for temporal variations (due to a variety of factors such as inertia, habit, etc.) in consumer responsiveness to a firm’s direct marketing efforts so as to enhance the ROI on marketing on a real-time basis and make both methodological and substantive contributions. Specifically, traditional methodological approaches such as panel data regression models ignore the potential variations in the impact of direct marketing efforts on sales over time and may yield misleading or, worse, incomplete insights. As an alternative approach, we use the advances in time-varying effect analysis to model the coefficients (or the parameter estimates) as a smooth (continuous) function of time (Tan et al. 2012). Further, instead of using multilevel models that impose strong parametric assumptions about the nature of change in the relationship between the two variables, we use nonparametric techniques to model the coefficient functions. The proposed TVEM framework relies on nonparametric assumptions that require large volumes of data for estimation and hence is an ideal candidate for a big data context. Thus, TVEM offers an alternate approach to the functional regression analysis, which is designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time (Shi and Choi 2011). Of greater importance, our estimation methodology overcomes the limitations of the existing approaches highlighted by Leeflang et al. (2009). First, unlike other methodologies such as the moving window regression (Mahajan et al. 1980) or the piecewise regression (Parsons 1975) that employ arbitrary windows or utilize only a subset of the data each time, we recover the true functional form by using nonparametric techniques. We also improve upon alternative time-varying parameter models that make assumptions about the shape of the time-varying process (e.g., Foekens et al. 1994) or a priori distinguish between the effectiveness of marketing instruments based on known information such as performance regimes (Pauwels and Hanssens 2007). Our approach is immune to errors caused by incorrect specification as we do not rely on any a priori information about the shape of the time-varying process or make any “arbitrary distinction” between performance regimes (Osinga et al. 2010, p. 175). Also, unlike Kalman filtering or dynamic linear models (DLM) that are based on state-space modeling, our approach does not assume any prespecified number of underlying states and does not take “several hours or days” to estimate despite the large transaction dataset (Leeflang et al. 2009, p. 15).

We illustrate the value of incorporating the time-varying effects of marketing inputs using a large dataset from a Fortune 500 national retailer. Our dataset comprises over a quarter million unique customers (N = 281,150) and offers a wide variety of customer information including online and offline transactions recorded between 2007 and 2010, qualitative data regarding customers’ preferences, and demographic information. The large dataset, as we discuss subsequently, has all the principal characteristics of big data (i.e., high


volume, velocity, and variety), and is in line with the demands placed by our nonparametric estimation approach that requires the estimation of additional parameters. Although the large dataset is difficult to store and analyze using existing methods, it is in line with organizational realities where firms have volumes of transaction data and are looking for ways to generate meaningful insights from it. Our time-varying approach efficiently models the effects of marketing inputs over time and the heterogeneity across customers, and, most importantly, provides actionable insights to managers. The proposed model is superior to alternate specifications and can be easily estimated within seconds or minutes (depending on the size of the data); we provide the required code for managers to implement our model using their existing infrastructure. Our results provide strong support for our framework, which suggests that the effectiveness of marketing-mix elements varies over time, and highlight how firms can increase their return-on-investment by incorporating these temporal variations in their resource allocation strategy. More importantly, our results are also managerially significant. Specifically, firms can increase their returns on marketing investment by over 17% by reallocating their resources in line with the proposed TVEM model without making any other changes, thereby preventing the misallocation of scarce marketing resources across customers, thus saving millions of dollars.

In summary, our TVEM approach is easy to implement within existing organizational infrastructure and provides novel insights, helping firms in analyzing and exploiting big data to increase firm value. Firms “struggling with understanding and deciding” (Goes 2014, p. vi) how to use big data for decision making can easily employ our framework for real-time strategic decision making. Specifically, our research directly helps firms in four broad ways in which they can use big data to create value (e.g., Manyika et al. 2011). First, our approach can unlock significant value by making information usable at much higher frequency, allowing firms to change their strategies in real time. Second, as firms create and store more data about customers, our approach provides a more accurate and detailed description of how various factors such as transaction characteristics or demographic factors influence the desired outcomes and how the influence of these factors varies over time, allowing firms to make better decisions and adjust their business levers accordingly.² Third, our TVEM methodology allows firms to have a deeper understanding of consumer decision criteria and, in turn, precisely tailor their products or services. Fourth, our approach can substantially improve decision making and allow firms to reallocate their scarce resources efficiently. Overall, our TVEM approach is flexible enough to allow firms to use big data for basic low-frequency forecasting to high-frequency nowcasting, for improving firm performance (Manyika et al. 2011). We are not presenting TVEM as a replacement for other longitudinal estimation techniques. We believe that TVEM should be viewed as another tool in the toolkit of scholars working on longitudinal data that offers significant value without much up-front costs.

We organize the remainder of the paper as follows. We begin by providing some motivation for our research question. Then, we discuss the basic logic behind the concept of time-varying effects modeling (or TVEM). Next, we describe in detail the estimation of such models, where we provide alternative estimation algorithms. Then, we discuss the data employed in this research and comment on the data-related opportunities and challenges associated with TVEM estimation. Thereafter, we discuss our results, and the implications of our research for both scholars and managers. We conclude with the limitations and future directions of this research.

Motivation

Substantive Perspective

Marketing resource allocation has gained prominence as managers are under increasing pressure to deliver results on tighter budgets. As firms limit the size of their marketing budgets, marketers must find ways to maximize the impact of their marketing dollars. Scholars have produced a rich body of literature that offers both methodological and substantive insights into the complex decisions related to marketing resource allocation. The broad theme in this stream of research is to assess the impact of marketing actions on consumer demand so as to adjust marketing resource allocation (across media, channels, functions, geographies, products, customers, etc.) to increase firm value (Gupta and Steenburgh 2008). We briefly summarize the key themes across this research domain (interested readers should refer to review articles by Gupta and Steenburgh 2008; Kumar and Reinartz 2012; Rust et al. 2004; Shankar 2008).

The dominant theme across the marketing resource allocation domain is to identify factors that can explain and influence consumer response to marketing actions and use those insights to help managers in their resource allocation decisions. Along those lines, scholars have proposed transaction characteristics such as recency, frequency, and monetary value (e.g., Venkatesan and Kumar 2004), shopping characteristics such as return behavior (Anderson et al. 2009; Petersen and Kumar 2009) or shopping across multiple channels (Venkatesan et al. 2007; Verhoef et al. 2010), attitudinal

² Although TVEM is an explanatory model at its core, as we document subsequently, it has nice predictive properties and can help firms in their resource allocation decisions.


characteristics such as satisfaction (Bendapudi and Berry 1997; Seiders et al. 2005) or loyalty (Lewis 2004; Stern and Hammond 2004), and organizational marketing efforts (Elsner et al. 2004; Manchanda and Chintagunta 2004) to explain the differences across consumer purchases. Using the insights thus gained, scholars have examined how marketing resources should be allocated across customers (Elsner et al. 2004; Kumar et al. 2013) and marketing activities such as customer acquisition and retention (Berger and Bechwati 2001; Reinartz et al. 2005), advertising and sales (Naik and Raman 2003), or value creation and appropriation (Mizik and Jacobson 2003); and markets or geographies (Elberse and Eliashberg 2003). Many of these tools or methodologies have been made available to managers to help them with their resource allocation decisions (e.g., Divakar et al. 2005; Elsner et al. 2004; Shankar et al. 2008).

Despite the significant advances in the resource allocation domain and wide acceptance of the fact that firms that strategically reallocate resources frequently based on marketplace considerations deliver superior returns (Fruk et al. 2013), a recent study by McKinsey & Co. highlights that companies continue to allocate resources based on historical allocations and rules of thumb (Doctorow et al. 2009). This is partly because the “models for dynamic resource allocation typically assume that marketing effectiveness is constant over time” (Raman et al. 2012, p. 44), an assumption that may not stand up to scrutiny. Scholars are beginning to acknowledge that the effectiveness of marketing instruments may not be constant over time and some have even explicitly modeled the same (Osinga et al. 2010), yet a majority of such studies focus on long-term effects and do not investigate the decisions that managers must make routinely (e.g., Slotegraaf and Pauwels 2008). For instance, managers must decide on the amount of resources that they need to allocate to each customer in the coming week or month and they can maintain their historic allocations or reshuffle their allocations across customers. Most academic studies overlook such short-term decisions, providing little guidance to managers who then, not surprisingly, are very “slow to reshuffle their resources,” resulting in poor firm performance (Fruk et al. 2013, p. 56). This is frustrating as firms spend huge resources in collecting volumes of data on their customers, yet most companies struggle to take advantage of their big data resources in decision making, in part due to the lack of methods that can be easily implemented in the big data realm within the existing organizational infrastructure (Chen et al. 2012; Goes 2014).

Methodological Perspective

Our study focuses on short-term resource allocation decisions and seeks to provide some guidance to managers in their reallocation decisions. Consider a simple market response model in the most general form where each customer can have different measurement occasions and measurement window sizes:

$$S_{ij} = \beta_0 + \beta_1 \times X_{ij} + \varepsilon_{ij}; \quad i = 1, \ldots, n, \; j = 1, \ldots, m_i \qquad (1)$$

where $S_{ij}$ is the sales (or other outcome variable) for customer $i$ measured at time $t_{ij}$, $X_{ij}$ is the marketing input (e.g., advertising) for customer $i$ measured at time $t_{ij}$, $n$ represents the total number of subjects, $m_i$ is the number of repeated observations for subject $i$,³ $t_{ij}$ is the measurement time of the $j$th observation for subject $i$, that is, $t_{ij}$ are the different measurement occasions for subject $i$, and random errors $\varepsilon_{ij}$ are assumed to be normally and independently distributed.⁴

The model in Equation 1 is akin to some of the initial models in this space (e.g., Parsons and Schultz 1976) and assumes constant parameter estimates, for instance, the effectiveness of advertising remains constant. However, not only can the intercept, $\beta_0$, vary over time due to market conditions, other response parameters, $\beta_1$, also evolve over time due to a variety of factors (e.g., changes in consumer preferences, competitive landscape, or economic conditions) (Leeflang et al. 2009; Raman et al. 2012).

One way to incorporate the temporal variations in the relationship between the variables of interest is to use multilevel or hierarchical modeling and include the interactions with time or an explicit process function (e.g., Foekens et al. 1994; Mela et al. 1998). We can extend Equation 1 and write a simple multilevel linear regression model as:

$$S_{ij} = \beta_{00} + \beta_{01} t_{ij} + (\beta_{10} + \beta_{11} t_{ij}) X_{ij} + \varepsilon_{ij} \qquad (2)$$

where we rewrite $\beta_0$ and $\beta_1$ from Equation 1 as functions of time $t_{ij}$, where $t_{ij}$ is the measurement time of the $j$th observation for subject $i$, such that $\beta_0 = \beta_{00} + \beta_{01} t_{ij}$ and $\beta_1 = \beta_{10} + \beta_{11} t_{ij}$.

Traditionally, scholars assume a shape of the coefficient functions such as linear, quadratic, or exponential. Since there is a range of admissible functional forms, the chosen functional form is almost surely mis-specified (Bierens and Pott-Buter 1991). Such a specification can be justified for panel data with a small number of repeated observations where there is

³ As a practical matter, we recommend using a minimum of 10 observations per individual for the model (i.e., $m_i \geq 10$).

⁴ Please note that our specification does not require every customer to have the same measurement times (i.e., $t_{ij}$ can be different for each $i$) and allows for a different number of measurement points for each customer (i.e., $m_i$ can be different for each $i$).
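To make the contrast between Equations 1 and 2 concrete, the following is a minimal sketch on simulated data, not the authors' code. All numbers (200 customers, 36 months, a response coefficient that declines linearly from 2.0 by 0.05 per month) and the helper names `solve` and `ols` are hypothetical and chosen only for illustration. It fits the constant-coefficient model of Equation 1 and the time-interaction model of Equation 2 by ordinary least squares using the Python standard library.

```python
import random

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)

# Simulated panel (hypothetical): the true effect of X on S shrinks over time.
random.seed(0)
data = []
for i in range(200):            # customers
    for t in range(1, 37):      # months
        x = random.randint(1, 10)                     # e.g., mailers received
        s = 10.0 + 0.2 * t + (2.0 - 0.05 * t) * x + random.gauss(0, 1)
        data.append((t, x, s))

y = [s for _, _, s in data]

# Equation 1: S = b0 + b1 * X (constant effectiveness).
eq1 = ols([[1.0, x] for _, x, _ in data], y)

# Equation 2: S = b00 + b01 * t + (b10 + b11 * t) * X (linear-in-time effectiveness).
eq2 = ols([[1.0, t, x, t * x] for t, x, _ in data], y)

print("Equation 1 [b0, b1]:", [round(v, 3) for v in eq1])
print("Equation 2 [b00, b01, b10, b11]:", [round(v, 3) for v in eq2])
```

Under these made-up parameters, the single slope from Equation 1 settles near the time-average of the true effect and hides its decline, whereas Equation 2 recovers the decline only because the simulated trend happens to be linear, which is the parametric assumption the text goes on to question.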


limited information to recover the true shape of relationship change. However, with the advent of big data, existing statistical methods that force the data into a prespecified shape may be incapable of providing a comprehensive understanding of the phenomenon. Such mis-specifications will typically result in inconsistent parameter estimates and, consequently, an incomplete, or worse, incorrect, understanding of the mechanisms of change.

The accurate portrayal of the coefficient functions provides information about (1) the shape of change and (2) peaks and troughs in the change function, which can provide an accurate depiction of how the relationship evolves over time and help managers make better decisions. Since the only way to avoid functional misspecification is to not use a functional form at all, we use nonparametric methods to let the functional form of the change function be determined by the data. Nonparametric methods have been widely used to alleviate the problems related to functional misspecification (Hollander et al. 2013), including in marketing (Leeflang et al. 2009).

Proposed Modeling Approach

We demonstrate the merits of using time-varying effect models (TVEM) that are free of problems related to functional misspecification and recover the shape of coefficient functions using the data.⁵ The TVEM framework provides an ideal approach to model the continuously evolving nature of the customer–brand relationship as it assumes that these functions are smooth (i.e., with no sudden jumps or break points) and does not impose any parametric assumptions on the coefficient functions.⁶ TVEM models are an extension of the varying coefficient model proposed by Hastie and Tibshirani (1993) that allows estimating the coefficients of a variable as a smooth function of other variables. Instead of assuming constant parameter estimates (such as in Equation 1), these models increase the flexibility of linear regression models by allowing their coefficients to vary smoothly as a function of other variables. Thus, we can rewrite Equation 1 in a more general form as:⁷

$$S_{ij} = \beta_0(r_0) + X_{ij}\,\beta_1(r_1) + \varepsilon_{ij} \qquad (3)$$

where $X_{ij}$ is the covariate, $\beta_0(r_0)$ and $\beta_1(r_1)$ are the associated coefficient functions such that the intercepts and the coefficients of $X_{ij}$ vary with the values of $r_0$ and $r_1$, respectively. When the coefficient function is specified as a function of time, the varying coefficient model reduces to the special case of TVEM (using the notation of Equation 2):

$$S_{ij} = \beta_0(t_{ij}) + \beta_1(t_{ij})\,X_{ij} + \varepsilon_{ij} \qquad (4)$$

where $\beta_0(t_{ij})$ and $\beta_1(t_{ij})$ are assumed to be the continuous coefficient functions (i.e., they are continuous over the range of $t_{ij}$), and the model suggests that the influence of $X_{ij}$ on $S_{ij}$ varies over time $t_{ij}$, where $t_{ij}$ is the measurement time of the $j$th observation for subject $i$. This flexible model allows the time-varying effects of the covariates without assuming parametric functions.

TVEMs can estimate the changing relationship between a time-varying parameter and the dependent variable over time without making any assumptions with respect to the shape of the trajectories of the variables (Tan et al. 2012). TVEMs are “conditionally parametric because the time-varying parameters are nonparametric functions, whereas the model is parametric for a specified t” (Stremersch and Lemmens 2009, p. 700). The semiparametric approach is a good compromise between parametric models that are usually restrictive and less robust when the model is incorrectly specified and nonparametric models that are usually more complex and less efficient (Wu and Zhang 2006). This freedom comes at a price: We need a lot more data to fit a TVEM framework than that required by a parametric model (Leeflang et al. 2000). Fortunately, big data provides the required resources to encourage the application of such models. While the data required may be large, the suggested modeling approach takes much less time to estimate than other models such as dynamic linear models or Kalman filtering.

Data Description

The empirical context for our research is a large Fortune 500 retailer selling a wide assortment of products related to home improvement, gardening needs, furniture, and home appliances. For reasons of confidentiality, we cannot reveal the name of the retailer. The retailer operates more than 1,000 stores across the United States and stocks an average of 20,000 items per store with prices ranging from a few cents to several thousand dollars. The retailer started a loyalty club for its customers and we obtain our data from the members of this club. The company targets the members of this club through mailers (which is operationalized as the number of direct mail and emails sent to each customer). Given that the

⁵ These models have also been referred to as time-varying parameter models (Leeflang et al. 2009), time-varying coefficient models (Wu and Zhang 2006), or dynamic generalized linear models (West et al. 1985).

⁶ A function is smooth if the first-order derivative function is continuous.

⁷ For ease of exposition, we present the model with a single predictor; it can be trivially extended for multiple predictors.
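As a rough illustration of the time-varying specification in Equation 4, the sketch below estimates the coefficient functions at each month with a kernel-weighted local least squares fit. This is an assumption made for brevity: a Gaussian kernel smoother is only one of several nonparametric options and is not necessarily the estimator used in this paper. The simulated data, the bandwidth `h`, and the names `local_fit`, `true_beta0`, and `true_beta1` are all hypothetical.

```python
import math
import random

def local_fit(data, t0, h):
    """Weighted least squares of S on (1, X), with Gaussian kernel weights centred at t0.

    Returns (b0, b1), the local estimates of beta0(t0) and beta1(t0).
    """
    sw = swx = swxx = sws = swxs = 0.0
    for t, x, s in data:
        w = math.exp(-0.5 * ((t - t0) / h) ** 2)
        sw += w
        swx += w * x
        swxx += w * x * x
        sws += w * s
        swxs += w * x * s
    det = sw * swxx - swx * swx
    b1 = (sw * swxs - swx * sws) / det
    b0 = (sws - b1 * swx) / sw
    return b0, b1

def true_beta0(t):
    """Hypothetical intercept function used only to simulate data."""
    return 5.0 + 0.1 * t

def true_beta1(t):
    """Hypothetical effect function: strong early in the relationship, then flattening."""
    return 1.0 + 3.0 * math.exp(-t / 6.0)

# Simulated customer-month panel (hypothetical sizes and noise level).
random.seed(1)
data = []
for i in range(300):            # customers
    for t in range(1, 37):      # months since first purchase
        x = random.randint(1, 10)
        s = true_beta0(t) + true_beta1(t) * x + random.gauss(0, 1)
        data.append((t, x, s))

# Estimated coefficient functions on a monthly grid; h is an assumed bandwidth.
curve = {t0: local_fit(data, t0, h=1.5) for t0 in range(1, 37)}

for t0 in (3, 12, 24, 36):
    b0, b1 = curve[t0]
    print(f"month {t0:2d}: beta1_hat = {b1:.3f}, true beta1 = {true_beta1(t0):.3f}")
```

No functional form is imposed on the coefficient paths here; the bandwidth controls how smooth the estimated curves are, and choosing it (for example by cross-validation) is left outside this sketch.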


retailer has a fixed direct marketing budget, it currently allocates resources on the basis of the nature of customers (e.g., business versus retail customers).⁸ Direct mail serves two primary purposes: providing product or marketing information and providing a purchase trigger. However, as customers become familiar with the firm and its offerings, the value of such marketing communications changes. In addition, customers’ responsiveness to the firm’s marketing communications tends to vary with customers’ demographics and their transaction histories. These factors should be duly incorporated in a firm’s mailing strategy.

Accordingly, we obtain transaction data for 1.3 million unique customers who do not have a prior relationship with the firm over a three-year time period from 2007 to 2010. The data provide rich information on customer purchases, including in-store and online transaction details such as the time of purchase, store location, purchase amount, number of items purchased, cross-buy, and the number of product returns. We also have data on firm-initiated marketing communications (e.g., direct mail, coupons, and emails), and customers’ responses to those communications. In addition, we were able to augment the transaction data with customer characteristics, which include age, income, marital status, population density, interests, and several others for 0.9 million out of the 1.3 million unique customers through our collaboration with Acxiom Corp., resulting in a SAS dataset of over 9.5 GB.

We drop a small number of customers from the dataset with poor coverage of the customer characteristics obtained through Acxiom, resulting in a dataset of approximately 0.76 million customers. We identify and remove any outliers. For the purpose of this study, we focus on a specific segment of customers and hence exclude B2B customers from our analysis as there may be other factors that influence their relationships with the firm (e.g., account executives). In line with our objectives to measure the influence of marketing communications, we also exclude customers who have opted out of any marketing communications from the retailer.⁹ We observe the behavior for these customers over a period of 36 months and have data on their transactions as well as marketing interactions over the period. This time frame is well beyond the recommendation of at least 10 periods for estimating time-varying models (Tan et al. 2012; Walls et al. 2006). Our final dataset includes 36 months of customer-level information for 281,150 unique customers with 10,121,400 customer-month observations.¹⁰

In line with Goes (2014) and Zikopoulos et al. (2012), our data have all the characteristics of big data: (1) large volume of information (detailed customer transaction records for 281,150 customers), (2) variety of information (e.g., online and offline transactions, qualitative information on preferences, demographics, and customer characteristics), and (3) high velocity (transaction summary across entire retail network recorded in real time). Specifically, our original dataset has a large number of customers and detailed transaction information on them, which represents the high volume property. Further, the sales transactions are recorded in real time at the point-of-purchase, which represents the high velocity property. Finally, by incorporating the information on households’ characteristics (e.g., demographics and qualitative information on preferences) from Acxiom’s database into the firm’s sales data, our dataset obtains the variety property. The availability of big data offers enough statistical power to empirically detect the temporal variations in marketing effectiveness.

The sheer volume of our data also presents several computation challenges. We employed the latest versions of scalable analytics software packages (SAS 9.3 and STATA 13) running on a Dell Supercomputer with 512GB RAM (most modern machines have around 8GB RAM), 10TB of hard drive, and a Dual Eight Core XEON processor for our empirical analyses. Neither of the analytics software packages imposes any limit on the size of the dataset from a computational standpoint as long as the hardware (memory and processing speed) is adequate.

Key Variables

We describe the key variables for our study in this section. We use the total amount of dollar sales (SALES) from each customer per month as our dependent variable. The key independent variable in this study is the number of mailers, physical as well as electronic, sent to each customer per month (MAILS). All loyalty-club customers receive anywhere from 1 to 30 (with an average of 4.43) mailers from the retailer in a month. These mailers typically include product-related information and some coupons. If customers redeem these coupons, the transaction is recorded at their point-of-

⁸ Given that firms choose their mailing strategy based on customer characteristics, the number of mailers sent to a customer may not be exogenous. We correct for the same in our analysis.

⁹ As we explain subsequently, we correct for the self-selection issue that may arise due to this exclusion as customers receiving the firm’s marketing communications may have greater preference for its products than those that opt out of such communications, and this difference rather than the marketing communications may be driving sales.

¹⁰ As a practical matter, the TVEM approach works equally well on the entire sample (where we only include the transaction data and do not drop customers for whom we cannot match demographic information).

916 MIS Quarterly Vol. 40 No. 4/December 2016


Saboo et al./Using Big Data to Model Time-Varying Effects

Table 1. Variable Descriptions


Variables Descriptions
Sales (SALES) Total amount of dollars spent each month
Mails (MAILS) Number of mailers (e-mails and direct mails) received by the customer
Coupons Redeemed (REDEEM) Number of coupons redeemed
Returns (RETURNS) Number of products returned
Crossbuy (CROSSBUY) Number of categories across which customer shops in a given month
Population density (PDENS) 12-point Likert scale from 1 (very rural) to 12 (very urban)
Seasonality (SEASON) Dummy variable that equals 1 if the month of transaction is April or May and 0 otherwise
Financial stability (FSTABILITY) 12-point Likert scale from 1 (financially very unstable) to 12 (financially very stable)
Household size (HSIZE) Number of people in the household
Interest in gardening (GARDEN) Dummy variable that equals 1 if a customer is interested in gardening and 0 otherwise
Interest in home improvement (DECOR) Dummy variable that equals 1 if a customer is interested in home improvement and 0 otherwise
Marriage (MARRIAGE) Dummy variable that equals 1 if a customer is married and 0 otherwise
Age (AGE) Age of the customer
Online purchases (ONLINEPURCH) Dummy variable that equals 1 if a customer makes online purchases and 0 otherwise
Interest in general reading (MGEN) Dummy variable that equals 1 if a customer is interested in general reading and 0 otherwise
Interest in news and political reading (MPOL) Dummy variable that equals 1 if a customer is interested in news and political reading and 0 otherwise
Mail order responder (MOR) Dummy variable that equals 1 if a customer responds to mail orders and 0 otherwise

purchase. In line with the literature that suggests that promotions (i.e., coupons) influence overall sales (e.g., Venkatesan and Farris 2012), we include the number of coupons redeemed (REDEEM) in our model.11 Coupon redemption varies from 0 through 20 (with an average of 0.027), and provides us with a means to capture the differences in the prices paid by customers.

11 Customers can also obtain coupons from sources other than firm-initiated marketing communications (e.g., newspapers or magazines).

Besides firm-initiated marketing communications, the customer-brand relationship develops through prior interactions with the firm such as purchases or returns (Petersen and Kumar 2009). Thus, we also include the total product returns (RETURNS) and cross-buy (CROSSBUY) in our framework. We measure product returns as the number of products returned in a month and cross-buy as the number of different product categories that a customer has bought from the firm. We also include demographic and socio-economic factors that may influence customer purchases in the home improvement, gardening, and construction category; we obtain this data through our research collaboration with Acxiom Corp. Specifically, we include population density (PDENS) measured on a 12-point scale, seasonality (SEASON) that equals 1 for summer months or moving season (April and May) and 0 otherwise, financial stability index (FSTABILITY) measured on a 12-point scale ranging from 1 (financially very unstable) to 12 (financially very stable), and household size (HSIZE) indicating the total number of people in the household ranging from 1 to 9 (indicating 9 or more people in the household). See Table 1 for a summary of all our variables.

Model Development

In this section, we discuss the issues that guide our estimation approach. First, an important consideration in testing our


framework is an alternative explanation of customer self-selection. This explanation suggests that customers receiving the firm’s marketing communications may have greater preference for its products than those that opt out of such communications, and this difference rather than marketing communications may be driving sales. Second, and relatedly, there are concerns regarding the potential endogeneity of the mailers received by customers. Firms choose their mailing strategy based on customer characteristics (i.e., the number of mailers sent to a customer may not be exogenous).12 Finally, as discussed earlier, the goal of our TVEM model is to reveal the shape of the smooth coefficient function over time. Accordingly, we discuss our approach for estimating the unknown coefficient function.

12 Given the fact that MAILS is the only decision variable that the firm can influence, we correct for the “first order endogeneity” of MAILS in the proposed model (Rossi 2014). However, as per the suggestion of an anonymous reviewer, we also explore the potential endogeneity of our other transactional variables (REDEEM, RETURNS, and CROSSBUY) in our robustness checks.

Accounting for Selection Bias

We use the Heckman two-step method that has been shown to be an effective approach to check and correct for potential self-selection bias (Heckman 1979). The first stage involves a probit model on the probability of opting out of retailers’ marketing communications. Let $z_i^{Opt}$ constitute the set of exogenous variables that influence the customers’ choice of opting out of marketing communications. In line with prior studies in this domain (e.g., Kumar et al. 2014), we include a range of demographic and psychographic variables that are likely to influence customers’ choice. Specifically, we include age of the head of the household (AGE), household size (HSIZE), financial stability (FSTABILITY), marital status (MARRIAGE), and population density of the area in which they live (PDENS). We also include some psychographic variables such as whether the customers are interested in gardening (GARDEN), home improvement (DECOR), general reading (MGEN), news and politics (MPOL), and whether they make online purchases (ONLINEPURCH). Thus, we specify the following first stage probit model:

$$Opt_i^{*} = z_i^{Opt}\lambda^{Opt} + \eta_i^{Opt} \tag{5}$$

where $Opt_i^{*}$ denotes the latent measurement, and the observed binary response for the ith customer is the indicator $Opt_i = I\{Opt_i^{*} > 0\}$; $\lambda^{Opt}$ is the unknown parameter vector; $z_i^{Opt}$ is a vector of exogenous variables; and the random error $\eta_i^{Opt}$ is assumed to be normally distributed.

Next, we use the estimates of $z_i^{Opt}$ from Equation 5 and the resultant estimates of $z_i^{Opt}\lambda^{Opt}$ to compute the inverse Mills ratio (correction term; IMR) for each observation (e.g., Saboo et al. 2016) as follows:

$$IMR_i = \begin{cases} \dfrac{\phi\left(z_i^{Opt}\lambda^{Opt}\right)}{\Phi\left(z_i^{Opt}\lambda^{Opt}\right)}, & \text{if } Opt_i = 1 \\[2ex] -\dfrac{\phi\left(z_i^{Opt}\lambda^{Opt}\right)}{1 - \Phi\left(z_i^{Opt}\lambda^{Opt}\right)}, & \text{if } Opt_i = 0 \end{cases} \tag{6}$$

where $\phi$ and $\Phi$ are the probability density function and the cumulative distribution function of the standard normal distribution, respectively.

Then, we include the correction term (IMR) as an additional variable in our final model. Although the model is identified by the nonlinearity of the inverse Mills ratio, for better identification, we exclude three variables including whether they make online purchases in general (ONLINEPURCH), whether they are interested in general reading (MGEN) or news and politics (MPOL) from our final model to meet the exclusion restriction. Both MGEN and MPOL should influence customers’ desire to receive and read mailers, but consumers’ reading habits are less likely to influence their purchases. Similarly, online purchases (ONLINEPURCH) should encourage customers to receive and read mailers, but should not influence total purchases as customers also make offline purchases, an assumption that is also empirically validated by the low correlation (ρ = -.005) between these variables.13

13 The exclusion variables, ONLINEPURCH, MGEN, and MPOL, were obtained from the Acxiom dataset that includes a range of demographic and psychographic variables about the customers. They refer to the general propensity of the customer to make online purchases and their reading interests (general as well as about news and politics) and are not specific or related to the focal retailer or the current context.

Accounting for Endogeneity of Mailers

As is common in the literature, we use a control function approach to model the potential endogeneity of a firm’s mailing strategy (e.g., Petrin and Train 2010; Wang et al. 2015). Firms base their mailing strategy on customer characteristics and with an explicit objective to increase sales, suggesting that the number of mailers received by a customer may be endogenous. We use the control function approach as proposed by Garen (1984) to correct for any such endogeneity.14

14 Although related, the control function approach is distinct from the instrumental variable (IV) approach to correct for potential endogeneity. While the IV approach relies on a good instrument (that is correlated with the endogenous variable, but not the error term), the control function approach relies


on deriving a proxy variable that partitions the endogenous variable into endogenous and exogenous components (Petrin and Train 2010). The control function approach requires modeling the endogenous variable (e.g., MAILS) using a set of exogenous variables, including the excluded variable, in the first stage regression and using the residuals from this stage in the second stage as a regressor with the assumption that the errors of the two stages follow a bivariate normal distribution (Luan and Sudhir 2010; Wooldridge 2010). Inclusion of the first stage residuals in the second stage “solves the endogeneity problem regardless of how the endogenous regressor appears” (Imbens and Wooldridge 2007, p. 12), offering distinct advantages for models nonlinear in endogenous variables. Having said that, we estimated another model using the classical IV method (2SLS approach) and, not surprisingly, obtained identical results to those reported in the manuscript. Moreover, the AIC and BIC values of the proposed model (using the control function approach) are marginally lower (or better) than the ones with the IV approach.

The control function approach requires two-stage estimation, wherein we estimate the correction term in the first stage by regressing the endogenous variable, MAILS, on a set of exogenous variables (e.g., Wang et al. 2015). Let $z_{ij}^{MAILS}$ be a vector of exogenous variables that influence the organizational choice of the number of mailers sent to the customer i in period j. Extant research suggests that customer characteristics and previous transactions influence organizational mailing strategy (e.g., Elsner et al. 2004); for example, customers who purchase frequently and in large quantities are likely to receive more mailers (Gönül and ter Hofstede 2006). Accordingly, we include customer characteristics and recent transaction history as explanatory variables that may influence organizational mailing strategy. Specifically, in terms of transaction characteristics, we include lagged sales ($SALES_{ij-1}$), mailers sent in the previous period ($\widehat{MAILS}_{ij-1}$),15 lagged coupons redeemed ($REDEEM_{ij-1}$), lagged products returned ($RETURNS_{ij-1}$), and the number of product categories that a customer purchased in the previous period ($CROSSBUY_{ij-1}$). We also include customer characteristics that should influence the mailing decision such as age (AGE), financial stability (FSTABILITY), household size (HSIZE), population density of the area they live in (PDENS), marital status (MARRIAGE), interest in gardening (GARDEN) or home improvement (DECOR), and whether the customer is interested in mail orders (MOR). Finally, given the seasonal nature of our product category, we also include seasonality (SEASON) to indicate the summer months or moving season (April and May). Thus, we specify:

$$MAILS_{ij} = z_{ij}^{MAILS}\lambda^{MAILS} + \eta_{ij}^{MAILS} \tag{7}$$

where $MAILS_{ij}$ indicates the number of mailers sent to customer i in period j, $\lambda^{MAILS}$ is the unknown parameter vector, $z_{ij}^{MAILS}$ is a vector of exogenous variables specified earlier, and the random error $\eta_{ij}^{MAILS}$ is assumed to be independently and normally distributed.

15 To account for the dynamic panel bias introduced due to the presence of the lagged dependent variables, we use a system-GMM estimator that relies on relatively mild restrictions (Blundell et al. 2000), where we use the lagged differences in the dependent variable as instruments for our equations in levels, i.e., $MAILS_{ij} = \Delta MAILS_{ij-1} + e_{ij}$ (Arellano and Bover 1995; Blundell and Bond 1998) and use the predicted value of mailers sent in the previous period ($\widehat{MAILS}_{ij-1}$).

We obtain consistent estimates of $\lambda^{MAILS}$ and then use the residuals $\eta_{ij}^{MAILS}$ for mailers as additional explanatory variables in our final model shown in Equation 8. For model identification purposes, we exclude the dummy variable that indicates whether the customer is interested in mail orders (MOR) from the second stage model shown in Equation 8. Interest in mail orders is likely to be related to the number of mailers received by the customer, but less likely to be related to final sales, which can be confirmed by the relatively low correlation between interest in mail orders and sales as compared to that between interest in mail orders and the number of mailers received. The final model that we estimate is

$$S_{ij} = \beta_0(t_{ij}) + \beta_1(t_{ij})X_{ij} + \gamma_1\eta_{ij}^{MAILS} + \gamma_2 IMR_i + \varepsilon_{ij} \tag{8}$$

where $\gamma_1$ and $\gamma_2$ are the coefficients for the correction terms.

Estimation of Time-Varying Effect Model

In this section, we focus on the estimation of unknown coefficient functions, intercept $\beta_0(t_{ij})$ and slope $\beta_1(t_{ij})$ in Equation 8. Smoothing methods have been widely used to estimate unknown functions and provide an attractive compromise between nonparametric approaches that make no assumptions and parametric approaches that make strong assumptions about the functional form. Current popular smoothing methods include spline-based methods such as smoothing splines (e.g., cubic splines, kriging), polynomial splines (e.g., P-splines, B-splines), and regression splines and kernel-based methods such as LOESS (or LOcal regrESSion) and local polynomial kernels (for details on smoothing methods, interested readers should refer to Fahrmeir et al. 2013; Pagan and Ullah 1999; Ruppert et al. 2003; Simonoff 1996; Wu and Zhang 2006). While each method has its pros and cons, we select the penalized-spline (P-spline) method due to its flexibility and computational efficiency (Tan et al. 2012) and the fact that it has also previously been used in marketing (Sloot et al. 2006; Stremersch and Lemmens 2009). Compared to other methods, P-splines show no boundary effects, can fit polynomial data exactly, and can conserve moments of the data (Eilers and Marx 1996). In addition, the penalty used by the P-spline approach is more general than the one used for smoothing splines. Finally, P-splines are easy to estimate using a mixed model estimation methodology and


are not sensitive to knot parameter selection (Baladandayuthapani et al. 2005).

The general idea behind P-splines is that we can approximate function $\beta(t)$ with lower order polynomial functions, for example, truncated power basis (Tan et al. 2012):

$$t^0, t^1, t^2, \ldots, t^q, (t - \tau_1)_+^q, \ldots, (t - \tau_K)_+^q \tag{9}$$

where the first q + 1 terms are the power functions of t of order 0, 1, 2, …, q, and the remaining K terms are truncated power functions of order q determined by K truncation points or knots $\tau_1, \tau_2, \ldots, \tau_K$ over the range of t; the notation $(t - \tau)_+^q$ indicates that the function equals zero for $t \le \tau$ and $(t - \tau)^q$ otherwise; we can specify quadratic splines using q = 2 and cubic splines using q = 3.

The choice of q is less critical, but given that “quadratic splines are not often used” (Jain 2003, p. 269), we use q = 3 (or cubic splines).16 We can now specify the unknown coefficient functions in Equation 4 as:

$$\beta_0(t_{ij}) = a_0^0 + a_1^0 t_{ij} + a_2^0 t_{ij}^2 + a_3^0 t_{ij}^3 + \sum_{k=1}^{K} a_{3+k}^0 (t_{ij} - \tau_k)_+^3 \tag{10a}$$

$$\beta_1(t_{ij}) = a_0^1 + a_1^1 t_{ij} + a_2^1 t_{ij}^2 + a_3^1 t_{ij}^3 + \sum_{k=1}^{K} a_{3+k}^1 (t_{ij} - \tau_k)_+^3 \tag{10b}$$

where coefficients $a_{3+k}^0$ and $a_{3+k}^1$ are then shrunk toward zero (or penalized) to obtain smoother estimates (Wand 2003). Inserting the above coefficient functions (Equations 10a and 10b) in Equation 4, we obtain

$$\begin{aligned} S_{ij} ={}& a_0^0 + a_1^0 t_{ij} + a_2^0 t_{ij}^2 + a_3^0 t_{ij}^3 + \sum_{k=1}^{K} a_{3+k}^0 (t_{ij} - \tau_k)_+^3 + a_0^1 X_{ij} \\ &+ a_1^1 t_{ij} X_{ij} + a_2^1 t_{ij}^2 X_{ij} + a_3^1 t_{ij}^3 X_{ij} + \sum_{k=1}^{K} a_{3+k}^1 (t_{ij} - \tau_k)_+^3 X_{ij} + \varepsilon_{ij} \end{aligned} \tag{11}$$

The above model is a linear regression model that can be estimated using ordinary least squares regression. Wand (2003) suggests treating $a_{3+k}^0$, $a_{3+k}^1$, k = 1, 2, …, K in the above equation as random variables with normal distributions to obtain optimal smoothing parameters.

$$a_{3+k}^0 \sim N(0, \eta_0), \quad a_{3+k}^1 \sim N(0, \eta_1); \quad k = 1, 2, \ldots, K \tag{12}$$

16 An approximation with quartic or higher-order splines is not recommended due to Runge’s phenomenon (i.e., an oscillation problem occurring at the boundary of an interval; Chapra 2011).

Equation 11 can now be estimated using a linear mixed model with $a_i^j$, i = 1, …, 3, j = 0 or 1 as fixed effects and $a_{3+k}^0$, $a_{3+k}^1$, k = 1, 2, …, K as random effects with variances $\eta_0$ and $\eta_1$, respectively. The variance parameters effectively shrink the random effect coefficients, with a small variance parameter implying that the random effect coefficients would be closer to zero and hence a smooth function, whereas a large variance parameter implies larger random effect coefficients and hence a closer fit. The optimal balance can be determined using the restricted maximum likelihood (REML) approach as demonstrated by Wand (2003). Moreover, such models can be easily estimated across several platforms (e.g., nlme package in R or S-Plus, PROC MIXED in SAS), making it an excellent practical choice. Ngo and Wand (2004) and Tan et al. (2012) provide a friendly implementation guide for SAS, R, and S-Plus.

Thus, in addition to our focal variable (MAILS), our final model includes customer demographic and psychographic characteristics and recent transaction characteristics. Specifically, we include customer characteristics such as age (AGE), financial stability (FSTABILITY), household size (HSIZE), population density of the area they live in (PDENS), marital status (MARRIAGE), and interest in gardening (GARDEN) or home improvement (DECOR). Given the seasonal nature of our product category, we also include seasonality (SEASON) to indicate the summer months (April and May). We also include recent transaction characteristics such as REDEEM, RETURNS, and CROSSBUY. Further, acknowledging that some of these transaction characteristics are known to have nonlinear effects (Petersen and Kumar 2009), we include quadratic effects of MAILS, REDEEM, RETURNS, and CROSSBUY.

Finally, to account for any serial correlation, we use a first-order autoregressive model (e.g., Naik and Raman 2003), where we include the lagged value of our dependent variable, SALES. The presence of lagged dependent variables (SALES in Equation 13 and MAILS in Equation 7) violates the strong exogeneity assumptions. To account for the dynamic panel bias introduced due to the presence of these lagged dependent variables, we use a system-GMM estimator that relies on relatively mild restrictions (Blundell et al. 2000), where we use the lagged differences in the dependent variable as instruments for our equations in levels, i.e., $SALES_{ij} = \Delta SALES_{ij-1} + e_{ij}$ (Arellano and Bover 1995; Blundell and Bond 1998). The underlying assumption behind these instruments is that past changes in the dependent variable y are uncorrelated with the current errors in levels (Roodman 2009). Thus, the final model that we estimate is:


$$\begin{aligned} SALES_{ij} ={}& \beta_0(t_{ij}) + \beta_1(t_{ij}) MAILS_{ij} + \beta_2(t_{ij}) MAILS_{ij}^2 + \beta_3(t_{ij}) REDEEM_{ij} \\ &+ \beta_4(t_{ij}) REDEEM_{ij}^2 + \beta_5(t_{ij}) RETURNS_{ij} + \beta_6(t_{ij}) RETURNS_{ij}^2 \\ &+ \beta_7(t_{ij}) CROSSBUY_{ij} + \beta_8(t_{ij}) CROSSBUY_{ij}^2 + \beta_9(t_{ij}) AGE_{ij} \\ &+ \beta_{10}(t_{ij}) FSTABILITY_{ij} + \beta_{11}(t_{ij}) HSIZE_{ij} + \beta_{12}(t_{ij}) PDENS_{ij} \\ &+ \beta_{13}(t_{ij}) MARRIAGE_{ij} + \beta_{14}(t_{ij}) GARDEN_{ij} + \beta_{15}(t_{ij}) DECOR_{ij} \\ &+ \beta_{16}(t_{ij}) SEASON_{ij} + \gamma_1 \widehat{SALES}_{ij-1} + \gamma_2 \eta_{ij}^{MAILS} + \gamma_3 IMR_i + \varepsilon_{ij} \end{aligned} \tag{13}$$

where $\widehat{SALES}_{ij-1}$ is the predicted value of SALES from the equation $SALES_{ij} = \Delta SALES_{ij-1} + e_{ij}$.

To provide a benchmark model comparison, we estimate the above equation using each of the three specifications: (1) no time-varying effects (baseline model), (2) multilevel model with parameter estimates as linear functions of time $t_{ij}$, and (3) the proposed time-varying effects model with cubic splines.

Results

We present the pairwise correlations and the descriptive statistics in Table 2. Next, we present the results of our estimation, starting with the results of our sample selection and endogeneity correction models. We then provide a fit comparison of various specifications, and present and discuss the results of the best model.17

17 We reestimated all of our models after removing the insignificant variables and obtained qualitatively identical results. Thus, in the interest of theoretical relevance, we have retained the insignificant variables in the results.

Sample Selection Procedure

We present the results of the first stage probit model as detailed in Equation 5 in Table 3. The results provide insights into consumers’ decision to receive marketing communications. The likelihood of opting in to receive marketing communications increases with customers’ financial stability (β = .002, p < .001), household size (β = .003, p < .05), age of the head of the household (β = .002, p < .001), online purchases (β = .032, p < .001), and interest in general reading (β = .041, p < .001). In contrast, the likelihood of opting in to receive marketing communication decreases with population density (β = -.004, p < .001) and marriage (β = -.030, p < .001). Although the results are specific to our product category, they are in the expected direction and in line with some of the studies in this domain. For example, Cotton and Babb (1978) find that household size has a positive influence on coupon usage, and Teel et al. (1980) find that coupon usage decreases with the age of the housewife. Married customers (MARRIAGE) or those living in dense (or urban) areas (PDENS) may have little time for do-it-yourself (DIY) activities and hence are less likely to opt in to receive marketing communications from the retailer. Finally, in line with the optimum stimulation level (OSL) theory (Steenkamp and Baumgartner 1992), customers interested in general reading (MGEN) or “information seekers” pursue a higher level of external stimuli and may be more likely to opt in to receive marketing communications than those with low OSL.

Endogeneity Correction Procedure

Next, in Table 4, we present the first stage results of our control function approach to correct for the potential endogeneity of the mailers sent. The results from Table 4 provide some insights into the retailer’s mailing strategy. In line with the view that firms use historical allocations to guide future resource allocation (Doctorow et al. 2009), we find that the number of mailers sent in the current period depends on the number of mailers sent in the previous period (β = .756, p < .001). Similarly, the mailing strategy is influenced by recent customer transactions and customer characteristics. The number of mailers sent is positively related to sales (β = .00002, p < .05), returns (β = .054, p < .001), and coupon redemptions (β = .741, p < .001), but negatively related to cross-buy (β = -.032, p < .001) of the previous period. Thus, other than the negative effect of cross-buy, our results are in line with the expectations that would suggest that the number of mailers sent to customers should increase with their interactions with the firm (Simester et al. 2006). One explanation for the negative effect could be that firms may not send as many mailers to loyal customers (as indicated by high levels of cross-buys) due to the lower return on investments for loyal customers, as compared to those sent to disloyal customers (i.e., a nonlinear effect of cross-buy on mailers).

To explore this thought further, we estimated the model with the quadratic term for cross-buy and indeed we do find a nonlinear relationship between cross-buys and the number of mailers sent, such that the number of mailers received by customers increases at a decreasing rate as cross-buy increases. This result confirms the view that customers that are loyal to the firm (as indicated by high cross-buys) require little persuasion and marketing investments (Dick and Basu 1994). Customers interested in gardening (β = .254, p < .001) and home improvement (β = .187, p < .001) receive more mailers than those who are not interested in such activities, which is not surprising given the retailer’s focus on these categories. As one would expect, the retailer sends more mailers to customers who respond to mail orders than those


Table 2. Pairwise Correlation Coefficients


Variable (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15) (16) (17)
1 SALES 1.000
2 MAILS -0.019a 1.000
3 REDEEM 0.222a 0.041a 1.000
4 RETURNS 0.335a -0.005a 0.132a 1.000
5 CROSSBUY 0.630a -0.031a 0.218a 0.382a 1.000
6 PDENS -0.008a 0.011a -0.007a 0.012a 0.010a 1.000
7 SEASON 0.063a -0.003a 0.039a 0.019a 0.057a 0.000 1.000
8 FSTABILITY 0.017a 0.020a 0.016a 0.012a 0.018a -0.123a 0.000 1.000
9 HSIZE 0.008a -0.017a -0.005a -0.001c 0.011a -0.028a 0.000 0.136a 1.000
10 GARDEN 0.002a 0.013a 0.012a 0.003a 0.009a -0.076a 0.000 0.303a 0.156a 1.000
11 DECOR 0.003a 0.012a 0.009a 0.003a 0.007a -0.057a 0.000 0.324a 0.174a 0.592a 1.000
12 MARRIAGE 0.015a -0.028a 0.007a 0.009a 0.027a -0.117a 0.000 0.183a 0.314a 0.182a 0.181a 1.000
13 AGE -0.006a -0.025a 0.013a 0.015a 0.006a -0.028a 0.000 0.011a -0.065a 0.224a 0.168a 0.093a 1.000
14 ONLINEPURCH -0.005a 0.039a 0.002a -0.009a -0.007a -0.044a 0.000 0.299a 0.154a 0.289a 0.28a 0.116a 0.019a 1.000
15 MGEN 0.000 0.000 0.003a 0.002a 0.001a -0.025a 0.000 0.048a 0.044a 0.085a 0.075a 0.048a 0.108a 0.049a 1.000
16 MPOL 0.001b 0.000 0.001c 0.001c 0.001a 0.003a 0.000 0.007a 0.000 0.002a 0.00a a 0.000 0.001a 0.002a 0.002a 1.000
17 MOR 0.003a 0.007a 0.006a 0.003a 0.007a -0.038a 0.000 0.300a 0.186a 0.427a 0.480a 0.174a 0.126a 0.239a 0.062a 0.001a 1.000
Mean 65.949 4.429 0.027 0.088 0.990 5.180 0.167 14.778 3.257 0.822 0.870 0.762 51.442 0.551 0.077 0.000 0.939
S.D. 178.291 5.836 0.199 0.380 1.693 2.399 0.373 4.754 1.496 0.382 0.337 0.426 13.229 0.497 0.266 0.010 0.240
Notes. a: p < 0.001; b: p < 0.01; c: p < 0.05

Table 3. Parameter Estimates for the First-Stage Probit Model to Correct for Sample Selection
Independent Variables Parameter Estimates Standard Errors 95% Confidence Interval
Financial stability index (FSTABILITY) 0.002*** 0.0004 [0.001, 0.003]
Household size (HSIZE) 0.003* 0.001 [0.001, 0.006]
Population density (PDENS) -0.004*** 0.001 [-0.006, -0.003]
Age (AGE) 0.002*** 0.0002 [0.0021, 0.0027]
Marriage (MARRIAGE) -0.030*** 0.005 [-0.040, -0.021]
Interest in gardening (GARDEN) -0.003 0.006 [-0.016, 0.010]
Interest in home improvement (DECOR) 0.016* 0.007 [0.001, 0.030]
Online purchases (ONLINEPURCH) 0.032*** 0.004 [0.024, 0.040]
General reading (MGEN) 0.041*** 0.007 [0.027, 0.055]
News and political reading (MPOL) 0.432 0.223 [-0.006, 0.869]
Constant 0.118*** 0.012 [0.094, 0.142]

Notes. *p < 0.05, **p < 0.01, ***p < 0.001
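
As an illustration of how the correction term follows from a first-stage probit such as the one reported in Table 3, the sketch below evaluates the inverse Mills ratio of Equation 6 for a given probit index. It is a minimal sketch and not the authors' code: the function name `inverse_mills_ratio` and the example index values are assumptions introduced here, and only the Python standard library is used.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1)
_STD_NORMAL = NormalDist()


def inverse_mills_ratio(index: float, opted_in: bool) -> float:
    """Correction term of Equation 6 for a single customer.

    index    -- fitted probit index z_i * lambda (hypothetical input)
    opted_in -- True when Opt_i = 1, False when Opt_i = 0
    """
    pdf = _STD_NORMAL.pdf(index)
    cdf = _STD_NORMAL.cdf(index)
    if opted_in:
        return pdf / cdf
    return -pdf / (1.0 - cdf)


if __name__ == "__main__":
    # Illustrative index values only; they are not estimates from the paper.
    for idx in (-1.0, 0.0, 1.0):
        print(idx, inverse_mills_ratio(idx, True), inverse_mills_ratio(idx, False))
```

In a full two-step procedure, the value returned for each customer would enter the second-stage sales equation as the regressor IMR.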


Table 4. First-Stage Results for the Control Function Approach to Correct for Endogeneity of Mailers
Independent Variables Parameter Estimates Standard Errors 95% Confidence Interval
SALESj-1 0.00002* 0.00001 [0.0000042, 0.0000044]
MAILSij-1 0.756*** 0.002 [0.753, 0.759]
REDEEMj-1 0.741*** 0.007 [0.728, 0.754]
RETURNSj-1 0.054*** 0.004 [0.047, 0.062]
CROSSBUYj-1 -0.032*** 0.001 [-0.034, -0.029]
Population density (PDENS) 0.031*** 0.003 [0.025, 0.036]
Seasonality (SEASON) -0.013*** 0.004 [-0.012, -0.006]
Financial stability index (FSTABILITY) 0.029*** 0.002 [0.026, 0.032]
Household size (HSIZE) -0.065*** 0.005 [-0.074, -0.055]
Interest in gardening (GARDEN) 0.254*** 0.024 [0.207, 0.300]
Interest in home improvement (DECOR) 0.187*** 0.028 [0.133, 0.242]
Marriage (MARRIAGE) -0.39*** 0.018 [-0.424, -0.355]
Age (AGE) -0.013*** 0.001 [-0.015, -0.012]
Mail order responder (MOR) 0.124** 0.042 [0.041, 0.207]
Constant 1.314*** 0.053 [1.209, 1.418]

Notes: *p < 0.05, **p < 0.01, ***p < 0.001
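
The two-stage logic behind Table 4 can be sketched on simulated data: regress the endogenous variable on exogenous variables, then add the first-stage residual to the outcome equation. The example below is a minimal illustration under stated assumptions, not the authors' estimation code. The data-generating process, the true slope of 2.0, and all function names are hypothetical; `z` merely plays the role of an excluded variable such as MOR and `x` the role of MAILS. A small pure-Python least-squares solver is used so that the block has no external dependencies.

```python
import random


def ols(columns, y):
    """Least-squares coefficients of y on the given regressor columns.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    An intercept must be supplied explicitly as a column of ones.
    """
    k = len(columns)
    a = []
    for i in range(k):
        row = [sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(k)]
        row.append(sum(u * v for u, v in zip(columns[i], y)))
        a.append(row)
    for p in range(k):
        best = max(range(p, k), key=lambda r: abs(a[r][p]))  # partial pivoting
        a[p], a[best] = a[best], a[p]
        for r in range(k):
            if r != p:
                factor = a[r][p] / a[p][p]
                a[r] = [v_r - factor * v_p for v_r, v_p in zip(a[r], a[p])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=5000, seed=0):
    """Hypothetical data in which x is endogenous because u drives both x and y."""
    rng = random.Random(seed)
    ones, z, x, y = [], [], [], []
    for _ in range(n):
        z_i = rng.gauss(0.0, 1.0)   # exogenous, excluded from the outcome equation
        u_i = rng.gauss(0.0, 1.0)   # unobserved factor causing the endogeneity
        x_i = z_i + u_i
        y_i = 1.0 + 2.0 * x_i + 0.8 * u_i + rng.gauss(0.0, 0.5)
        ones.append(1.0)
        z.append(z_i)
        x.append(x_i)
        y.append(y_i)
    return ones, z, x, y


def naive_slope(ones, x, y):
    """Slope on x when endogeneity is ignored."""
    return ols([ones, x], y)[1]


def control_function_slope(ones, z, x, y):
    """Slope on x after adding the first-stage residual as a regressor."""
    a0, a1 = ols([ones, z], x)                                   # first stage
    resid = [x_i - a0 - a1 * z_i for x_i, z_i in zip(x, z)]
    return ols([ones, x, resid], y)[1]                           # second stage


if __name__ == "__main__":
    ones, z, x, y = simulate()
    print("naive OLS slope:", naive_slope(ones, x, y))
    print("control-function slope:", control_function_slope(ones, z, x, y))
```

In this simulation the naive slope is biased upward, while the control-function slope should lie close to the assumed true value, which is the behavior the correction is meant to deliver.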

who do not (β = .124, p < .01) to increase the response rate. In contrast, married customers (β = -.390, p < .001) receive fewer mailers than those who are single. The negative finding is in line with Baker (2013), who argued that newly married couples, couples with young kids, and women are less likely to be interested in DIY activities (McGoldrick and Collins 2007). Similarly, the number of mailers decreases with a customer’s age (β = -.013, p < .001), in line with the view that mature customers may be less interested in DIY home improvements. Also, the retailer reduces the number of mailers sent out in the summer months or moving season (β = -.013, p < .001), as customers may require less persuasion during summer months, which is the peak season for gardening as well as home improvement projects. Finally, the retailer sends more mailers to customers who are financially stable and are likely to buy more (β = .029, p < .001) and live in densely populated areas (β = .031, p < .001), and fewer to those who live in large households, as customers with larger households, who are likely to be married and with kids, are less likely to be interested in DIY activities and hence less interested in the products offered by the focal retailer (β = -.065, p < .001).

Model Fit Comparisons

Before we present the model fit comparison, we briefly discuss the knot selection issue. As discussed in Equation 9 above, the P-splines approach approximates the lower order polynomial function by dividing the entire time range t into (K+1) smaller regions using K truncation points or knots. We distribute the knots evenly over the entire time period. Although there is no agreement on the number of knots (K) to be selected, Wand (2003) suggests that K can be selected as the minimum of 35 and T/4, where T is the number of distinct measurement times. Since our dataset spans 36 months, we select K = MIN (35, 36/4) = 9 for the discussion of our results.18

18 Our results are highly robust to the number of knots. Results with 5 ≤ K ≤ 12 yield virtually identical results. Incidentally, the model with K = 9 also has the best fit statistics (AIC/BIC values) and hence we use the same for further discussions.

We compare the model fit with respect to alternate model specifications. Specifically, we compare the following models: (1) baseline model with no time-varying effects as specified in Equation 1, (2) multilevel model with parameter estimates as linear functions of time $t_{ij}$, as specified in Equation 2, (3) time-varying effects model with linear, quadratic, and cubic splines for only transaction characteristics (MAILS, REDEEM, RETURNS, and CROSSBUY), and (4) the full time-varying effects model with linear, quadratic, and cubic splines for all variables as specified in Equation 13. The fit statistics


Table 5. Comparison of Model Fit Statistics

| Model | Trend Specification | -2 Res LL | AIC | BIC |
|---|---|---|---|---|
| Baseline (Time-invariant) | NA | 1.1200E+08 | 1.1200E+08 | 1.1200E+08 |
| MLM | NA | 1.1182E+08 | 1.1182E+08 | 1.1182E+08 |
| Only transaction characteristics (MAILS, REDEEM, RETURNS, and CROSSBUY) specified as time-varying | Linear spline | 111,812,703 | 111,812,715 | 111,812,703 |
| | Quadratic spline | 111,811,815 | 111,811,827 | 111,811,815 |
| | Cubic spline | 111,811,453 | 111,811,465 | 111,811,453 |
| All variables specified as time-varying | Linear spline | 111,812,656 | 111,811,987 | 111,811,965 |
| | Quadratic spline | 111,811,044 | 111,811,068 | 111,811,044 |
| | Cubic spline | 111,810,744 | 111,810,768 | 111,810,744 |

Table 6. Parameter Estimates for the Baseline Model (Without Any Time-Varying Effects)

| Independent Variables | Parameter Estimates | Standard Errors | 95% Confidence Interval |
|---|---|---|---|
| SALES_{j-1} | 0.06*** | 0.007 | [0.047, 0.074] |
| Mails (MAILS) | 0.412*** | 0.027 | [0.359, 0.466] |
| Mails × Mails (MAILS²) | -0.013*** | 0.002 | [-0.016, -0.009] |
| Redeemed coupons (REDEEM) | 98.065*** | 0.81 | [96.476, 99.653] |
| Redeemed coupons² (REDEEM²) | -7.647*** | 0.277 | [-8.189, -7.105] |
| Return frequency (RETURNS) | 48.046*** | 0.531 | [47.006, 49.086] |
| Return frequency² (RETURNS²) | 0.129 | 0.187 | [-0.237, 0.495] |
| Crossbuy (CROSSBUY) | 44.926*** | 0.136 | [44.660, 45.192] |
| Crossbuy² (CROSSBUY²) | 2.734*** | 0.028 | [2.679, 2.790] |
| Population density (PDENS) | -1.076*** | 0.021 | [-1.118, -1.035] |
| Seasonality (SEASON) | 11.599*** | 0.133 | [11.338, 11.860] |
| Financial stability index (FSTABILITY) | 0.233*** | 0.012 | [0.208, 0.257] |
| Household size (HSIZE) | 0.311*** | 0.03 | [0.252, 0.370] |
| Interest in gardening (GARDEN) | -1.398*** | 0.14 | [-1.672, -1.123] |
| Interest in home improvement (DECOR) | -0.006 | 0.166 | [-0.331, 0.319] |
| Marriage (MARRIAGE) | -1.613*** | 0.114 | [-1.836, -1.389] |
| Age (AGE) | -0.092*** | 0.006 | [-0.103, -0.081] |
| Residual (RESID) | -0.557*** | 0.018 | [-0.592, -0.522] |
| Inverse Mills Ratio (IMR) | -15.211*** | 3.063 | [-21.216, -9.206] |
| Constant | 18.492*** | 2.699 | [13.201, 23.784] |

Notes: *p < 0.05, **p < 0.01, ***p < 0.001
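The inverted-U effect of MAILS in the baseline model can be made concrete with a short calculation. The sketch below takes the MAILS and MAILS² coefficients from Table 6 and applies the first-order condition that the paper describes in its footnote on the optimal number of mails; it also recomputes the revenue uplift from the per-customer sales figures quoted in the managerial discussion. The function and variable names are illustrative assumptions, not part of the authors' code.

```python
# Coefficients as reported in Table 6 (baseline, time-invariant model).
B_MAILS = 0.412       # linear MAILS coefficient
B_MAILS_SQ = -0.013   # MAILS squared coefficient


def optimal_mails(b_linear, b_squared):
    """Maximiser of b_linear*m + b_squared*m**2 (first-order condition)."""
    if b_squared >= 0:
        raise ValueError("an interior maximum needs a negative squared term")
    return -b_linear / (2 * b_squared)


m_star = optimal_mails(B_MAILS, B_MAILS_SQ)   # unrounded optimum
mails_per_month = round(m_star)               # rounded to whole mailers
total_mails = 36 * mails_per_month            # 36-month window used in the paper

# Per-customer revenue figures quoted in the managerial discussion.
baseline_sales, tvem_sales = 156.83, 183.03
uplift = (tvem_sales - baseline_sales) / baseline_sales

print(f"optimum = {m_star:.2f} mails/month, rounded to {mails_per_month}")
print(f"total over 36 months = {total_mails}")
print(f"revenue uplift = {uplift:.1%}")
```

The unrounded optimum is slightly below 16, and the uplift is about 16.7%, which the paper reports as 17%.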


are presented in Table 5, where we include the log-likelihood and various fit statistics.19

As can be seen from the fit statistics, inclusion of temporal variations improves the model fit. Not surprisingly, the baseline model (with no time-varying effects) has the worst fit (AIC and BIC: 1.1200E+08), and the fit improves for the multilevel model (AIC and BIC: 1.1182E+08). The proposed TVEM framework provides the best fitting model (AIC and BIC: 1.1181E+08), suggesting that the effect of variables in our model varies over time. The results from the three models provide the same overall conclusion, with each subsequent model providing a more refined understanding of the relationship between the variables. To understand the differences between the baseline model and our proposed approach, we present the results of the baseline model in Table 6,20 the multilevel model in Figure 1, the time-varying effects model in Figure 2, and duly discuss the effects of each of our variables.21

Parameter Estimates

First, we highlight that both the correction terms are significant, providing due justification for our correction procedures. The negative coefficient of the inverse Mills ratio parameter estimate suggests that the customers in our sample (those that have opted-in) have fewer sales than those excluded from the sample. Further, as can be seen from the results, the sales series (SALES) exhibits significant persistence, and the lagged sales are a significant predictor of future sales (e.g., Naik and Raman 2003). Regarding the role of mailers, our results suggest that the number of mailers (MAILS) has an inverted-U shaped effect on purchases, possibly due to information overload or fatigue (e.g., Eastlick et al. 1993). Moreover, the results are economically significant; results from the baseline model suggest that, on average, a one standard deviation increase in MAILS sent increases the SALES per month to $65.73 compared to $63.77 at the mean value of MAILS, an increase of $1.96 or 3%. However, as can be seen from Figure 3, the nonlinear effect vanishes over the customer life cycle. Coupon redemptions (REDEEM) also have an inverted-U shaped effect on the sales, suggesting that the customers who redeem a lot of coupons (or shop only promotional goods) may hurt sales (e.g., Kopalle et al. 1999). However, as can be seen from Figure 3, the effect is far more nuanced with the relationship showing significant variations as customers develop a relationship with the retailer. Both returns (RETURNS) and cross-buy (CROSSBUY) exhibit a nonlinear effect on sales. Specifically, as the returns and cross-buys increase, sales increase at an increasing rate. This finding is in line with the view that product returns reduce the customer’s purchase risk and hence encourage future sales (Petersen and Kumar 2009). We hasten to highlight that our results are only related to sales and not profitability, which remains a fertile avenue for future research. Similarly, when customers buy across multiple categories, there is an increased likelihood of them purchasing across multiple product categories on each purchase occasion, resulting in higher sales as compared to those who do not shop across multiple categories (Kumar et al. 2008).

The parameter estimates of our control variables are also in the expected direction for a retailer specializing in home improvement, construction, and gardening. Customers that are financially stable (FSTABILITY) and live in larger households (HSIZE) purchase more than others. In contrast to our expectations, interest in gardening (GARDEN) has a significant negative effect on sales. Informal surveys with executives and customers revealed that the focal retailer has a lower reputation for gardening products as compared to another major retailer. Thus, the customers that are interested in gardening shop at a competitor’s store or other specialty retailers. Sales are lower for married customers (MARRIAGE) in line with the view that married customers (or those with young kids) have little time for home improvement or gardening activities (Baker 2013). Along similar lines, sales decrease with the age (AGE) of customers as mature customers may be less motivated to engage in such activities.

Although the basic results are interesting, the value of our TVEM model lies in its ability to highlight the temporal variations in these customer-brand relationships, as can be seen in Figure 2. The plots clearly reveal that the effect of these variables is hardly constant over the time period, a fact that is largely overlooked by traditional estimation techniques. To assess the robustness of our findings, we estimate several alternate specifications, including using different sets of exclusion variables in the self-selection and endogeneity correction

19 The Baltagi-Wu LBI-statistic of 1.97 and the modified Bhargava et al. Durbin-Watson statistic of 1.91 suggest that serial autocorrelation is not an issue.

20 To avoid the p-value problem associated with large samples, we use a conservative approach and only discuss results that are significant at the 5% level or better and also report the confidence intervals (Lin et al. 2013). Further, in line with the economics literature (Petrin and Train 2010; Wooldridge 2010), we implement a bootstrap to account for the extra source of variation and correct the generated regressor’s standard error.

21 The coefficient functions β(t_ij) in the TVEM framework are approximated using lower order polynomial functions as discussed in Equation 11. Thus, in our case, each coefficient function is a function of 13 estimates as we use 9 truncation points or knots, resulting in a total of 173 parameter estimates. Hence, we only present the plots of the coefficient functions to conserve space; parameter estimates can be requested from the authors.


Notes: Plots of control variables (e.g., PDENS, HSIZE) are omitted to conserve space; the same can be requested from
the authors.

Figure 1. Results from Multilevel Linear Regression Model

models, alternate measures for some of our variables, excluding squared terms, using a first-order autoregressive model, varying the number and locations of the knots (K), and including interaction effects. Our results are highly robust to these alternate specifications and the model presented in the paper is superior to such alternate specifications.

Comparison with Dynamic Linear Models

We suggested earlier that the existing state-space methodologies such as dynamic linear models (DLM) or Kalman filters (Petris et al. 2009; West and Harrison 1997) are not suitable for analyzing large datasets due to the assumptions regarding the underlying states and the time required to estimate such models (Leeflang et al. 2009). To illustrate this claim, we compare our TVEM approach with DLM estimation. Since DLM cannot handle the large dataset (as can be seen from Table 7), we use a synthetic dataset for this comparison.22 We present the fit statistics (AIC and BIC values) along with the time required for estimating these models for different numbers of individuals (N) in Table 7.

22 For this exercise, we used a simple model with just one independent variable and 120 time periods.


Notes: Plots of control variables (e.g., PDENS, HSIZE) are omitted to conserve space; the same can be requested from
the authors.

Figure 2. Time-Varying Effect Model with All Variables Specified as Time-Varying
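The coefficient functions plotted in Figure 2 are spline approximations. The paper states that, with 9 knots, each coefficient function is described by 13 estimates; a cubic truncated power basis (intercept, t, t², t³, plus one truncated cubic term per knot) is one construction consistent with that count. The sketch below builds such a basis for 36 monthly time points. The equally spaced knot placement and all function names are assumptions made for illustration; the authors' exact basis and knot locations are not given in this section.

```python
def truncated_power_basis(t, knots, degree=3):
    """Row [1, t, ..., t^degree, (t-k_1)_+^degree, ..., (t-k_K)_+^degree]."""
    row = [t ** d for d in range(degree + 1)]
    row += [max(t - k, 0.0) ** degree for k in knots]
    return row


months = 36
# Knot-count rule quoted in the text: K = MIN(35, 36/4) = 9.
n_knots = min(35, months // 4)

# Assumption: knots equally spaced strictly inside the observation window.
knots = [1 + (months - 1) * (j + 1) / (n_knots + 1) for j in range(n_knots)]

# One basis row per month; a coefficient function beta(t) is then the
# dot product of a row with a vector of 13 estimated spline coefficients.
basis = [truncated_power_basis(float(t), knots) for t in range(1, months + 1)]

print("knots:", knots)
print("columns per coefficient function:", len(basis[0]))
```

A time-varying coefficient at month t is then the dot product of that month's basis row with the 13 estimated spline coefficients.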

As expected, DLM outperforms the TVEM approach for small Ns. However, the TVEM model provides a better fit as N increases. The number of parameters to be estimated remains constant for TVEM, whereas it grows rapidly with N for the DLM estimation (from 3 parameters at N = 1 to 2,600 at N = 50, roughly quadratic growth; see Table 7). Moreover, even for a very small sample of N = 50, the DLM estimation takes over twenty-four hours (making it impractical for big data analytics) compared to a few seconds in the case of TVEM. In sum, for large datasets, the TVEM approach not only fits better but also requires significantly less time for estimation.

Predictive Validity

Although TVEM is an explanatory model that allows us to capture the changes in the coefficient function, managers can still use the results to make forward predictions. To demonstrate the predictive abilities of our proposed model, we assess the predictive accuracy of our model relative to alternate specifications. Specifically, we carry out both in-sample and out-of-sample predictive validity tests and compare our model to two alternate specifications: (1) no time-varying effects


Figure 3. 3D Plots for the Effect of MAILS, REDEEM, RETURNS, and CROSSBUY on SALES Over Time

(baseline model) and (2) multilevel model with parameter estimates as linear functions of time t_ij. We use the root mean square error (RMSE) and the mean absolute deviation (MAD) between the predicted and actual values for the prediction tasks. For the in-sample prediction task, we predict the sales for an average customer from month 4 through 36;23 for the out-of-sample task, we predict sales for four periods (37 through 40) after our estimation period.24 The results in Table 8 provide strong support for our model relative to the other models. The proposed model fits both in-sample and out-of-sample data well, providing credibility and confidence in the explanatory as well as the predictive power of our model.

Accounting for Individual Heterogeneity

Pauwels et al. (2004) note that the problems associated with ignoring unobserved heterogeneity are especially significant in dynamic models. Although we account for such unobserved heterogeneity using the mixed effects specification, one could argue in favor of both the intercept and slope heterogeneity25 (e.g., Pesaran et al. 1996), an issue that we discuss in this section.

First, given the nature of big data, where we have access to large volume and variety of information about consumers, the concerns of unobserved heterogeneity are significantly mitigated. Firms literally have access to almost all of the possible information about their consumers that can be included in their estimation, partly mitigating the concerns of unobserved heterogeneity. Moreover, as has been frequently pointed out, unobserved heterogeneity is a form of selection bias or omitted variable bias (Murray 2005; Schunck 2014) and we account for the same through our control function approach.

More importantly, our TVEM methodology addresses the bigger challenge presented by velocity of data, wherein firms

23 Given the autoregressive nature of our model (Equation 13), we cannot make predictions for the first three periods.

24 For the prediction task, we used the estimates from the training dataset and the observed values of the other transaction variables (e.g., REDEEM, RETURNS, and CROSSBUY) to compute the predicted SALES. As a practical matter, if a company wants to predict SALES in future periods, we recommend using the average values or the last observed values for transaction variables. An alternate approach would be to estimate the model using the lagged values of the other transaction variables.

25 We thank an anonymous reviewer for this suggestion.


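Table 7. Comparison with Dynamic Linear Models

| Model | Statistic | N = 1 | N = 2 | N = 3 | N = 20 | N = 30 | N = 35 | N = 40 | N = 45 | N = 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| DLM | AIC | 548.6 | 1163.6 | 1807.5 | 15279.9 | 23779.3 | 28720.5 | 33423.2 | 38573.0 | 43871.9 |
| DLM | BIC | 557.0 | 1185.9 | 1849.3 | 16506.4 | 26455.2 | 32330.3 | 38106.2 | 44468.5 | 51119.4 |
| DLM | No. of parameters | 3 | 8 | 15 | 440 | 960 | 1295 | 1680 | 2115 | 2600 |
| DLM | Estimation time | 0.5 min | 1.5 min | 3 min | 2 hours | 5 hours | 7 hours | 10.5 hours | 15 hours | 24.5 hours |
| TVEM | AIC | 730.6 | 1527.7 | 2309.8 | 16119.3 | 24184.2 | 28469.9 | 32834.1 | 37132.9 | 41164.9 |
| TVEM | BIC | 754.6 | 1552.5 | 2340.9 | 16151.2 | 24354.4 | 28574.3 | 32892.6 | 37241.7 | 41378.6 |
| TVEM | Number of parameters | 28 | 28 | 28 | 28 | 28 | 28 | 28 | 28 | 28 |
| TVEM | Estimation time | Less than a minute | Less than a minute | Less than a minute | Less than a minute | Less than a minute | Less than a minute | Less than a minute | Less than a minute | Less than a minute |

Notes: N is the number of individuals; for TVEM, the number of knots is 10. Both models were estimated on the same machine.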
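Two patterns in Table 7 can be checked directly from the reported numbers. The sketch below transcribes the DLM parameter counts and both AIC rows, tests whether the parameter counts match the expression N(N + 2), and finds the smallest reported N at which TVEM has the lower AIC. The N(N + 2) expression is an observation about the tabulated values, not a formula stated by the authors, and the variable names are illustrative.

```python
# Values transcribed from Table 7.
dlm_params = {1: 3, 2: 8, 3: 15, 20: 440, 30: 960,
              35: 1295, 40: 1680, 45: 2115, 50: 2600}
dlm_aic = {1: 548.6, 2: 1163.6, 3: 1807.5, 20: 15279.9, 30: 23779.3,
           35: 28720.5, 40: 33423.2, 45: 38573.0, 50: 43871.9}
tvem_aic = {1: 730.6, 2: 1527.7, 3: 2309.8, 20: 16119.3, 30: 24184.2,
            35: 28469.9, 40: 32834.1, 45: 37132.9, 50: 41164.9}
TVEM_PARAMS = 28  # constant across N in Table 7


def quadratic_count(n):
    """Candidate closed form for the DLM parameter count (an observation)."""
    return n * (n + 2)


# Does the candidate expression reproduce every reported DLM count?
matches = all(quadratic_count(n) == p for n, p in dlm_params.items())

# Smallest reported N at which TVEM fits better (lower AIC) than DLM.
crossover = min(n for n in dlm_aic if tvem_aic[n] < dlm_aic[n])

print("DLM counts equal N*(N+2):", matches)
print("TVEM has the lower AIC from N =", crossover)
```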

Table 8. Predictive Accuracy by Model Type

| Prediction Task | Measure | Proposed Model (TVEM) | Baseline Model (time-invariant) | Multilevel Model |
|---|---|---|---|---|
| In-Sample | RMSE | 1.81 | 15.13 | 9.14 |
| In-Sample | MAD | 0.25 | 1.82 | 1.23 |
| Out-of-Sample | RMSE | 2.30 | 5.79 | 2.55 |
| Out-of-Sample | MAD | 0.97 | 2.59 | 1.22 |

Notes: Root mean squared error (RMSE) represents the sample standard deviation of the differences between predicted values and observed values; mean absolute deviation (MAD) is the mean of the absolute deviations between predicted and actual values.
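The two accuracy measures in Table 8 are straightforward to compute. The sketch below implements RMSE as the square root of the mean squared prediction error and MAD as the mean absolute prediction error, then applies both to a small made-up series. The numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not the paper's data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    errors = [p - a for p, a in zip(predicted, actual)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mad(predicted, actual):
    """Mean absolute deviation between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical monthly sales, for illustration only.
predicted_sales = [10.0, 12.0, 9.0]
actual_sales = [11.0, 12.0, 7.0]

print("RMSE:", round(rmse(predicted_sales, actual_sales), 3))
print("MAD:", mad(predicted_sales, actual_sales))
```

Because RMSE squares the errors, it penalises a few large misses more heavily than MAD does, which is one reason to report both.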

Table 9. Comparison of Model Fit Statistics between (Baseline and TVEM) Models that Account for Heterogeneity and Those that Do Not

| Model | Model Fit (AIC value) | Model Fit (BIC value) |
|---|---|---|
| Baseline Model with Random Intercepts | 37,032.5 | 37,164.7 |
| Baseline Model with Random Coefficients | 36,334.6 | 36,454.7 |
| TVEM without Random Coefficients (Proposed One) | 36,325.2 | 36,307.2 |
| TVEM with Random Intercepts | 36,317.2 | 36,293.2 |
| TVEM with Random Coefficients | 36,129.4 | 36,103.4 |

Notes: Random Coefficients include random intercepts and random slopes.

collect data on a continuous basis, by allowing us to incorporate updated information in real time and update decisions accordingly.

Nevertheless, TVEM can be extended to account for both intercept and slope heterogeneity by adding random coefficients (both random intercept and random slopes). A random-intercept model allows intercepts to vary with individual customers and allows us to control for unobserved heterogeneity at the customer level. Random slopes allow the parameter estimates to vary across individuals. The addition of random coefficients, however, requires a different estimation technique and significantly increases the model complexity and estimation time. Thus, one needs to weigh the relative benefits of such an approach. To establish the relative importance of time-varying effects and accounting for individual heterogeneity, we carry out additional analysis on a subsample (100 customers) to compare several models that


help us tease out the relative importance of time-varying effects and individual heterogeneity.26 Specifically, we estimate (1) a baseline panel data regression model with random intercepts, (2) a baseline panel data regression model with random coefficients that includes both random intercept and random slopes, (3) baseline TVEM model, (4) TVEM model with random intercepts, and (5) TVEM model with random coefficients that includes both random intercept and random slopes and present the results in Table 9.

As can be clearly seen, the baseline TVEM model without any random coefficients provides a better fit compared to a panel data model that accounts for both intercept and slope heterogeneity. Not surprisingly, the TVEM models with random intercepts and random coefficients do better than the baseline TVEM model. The results highlight that firms will be better off with the TVEM model and that the value of incorporating time-varying effects will only go up as the volume of data goes up. These results also suggest that big data has its unique challenges that may be more important than issues that plague traditional estimation approaches (such as endogeneity, heterogeneity, and autocorrelation).

Given the economic significance of accounting for time-varying effects and the fact that most firms make resource allocation decisions on segments rather than on individual customers as it is extremely expensive to personalize offerings at an individual level, we instead recommend the use of segmentation techniques to model both slope and intercept heterogeneity (e.g., Andrews et al. 2002; Bago d'Uva 2005; Clark et al. 2005).27 Specifically, firms can use their existing segmentation bases or segment their customers based on desirable characteristics using the K-means clustering algorithm or latent-class segmentation (Wedel and Kamakura 2000) and then run the TVEM model with random intercepts for each segment.28 Firms need to choose the optimal number of segments based on costs and benefits of having multiple segments. Most firms make resource allocation decisions on segments rather than on individual customers as it is extremely expensive to personalize offerings at an individual level. Thus, our approach of analyzing the TVEM model for a segment of customers is in line with organizational realities.

As a demonstration of the above approach, we created subsegments of customers based on available demographic variables using the K-means clustering algorithm and estimated the TVEM model for each segment with the random intercept. We varied the number of segments from 30 to 300 (with the corresponding change in the number of customers in each segment) and successfully estimated our model for each subsegment. Clearly, firms should use their managerial judgment in using the appropriate segmentation variables and the size of segments. However, in line with our overall objective, the proposed approach can easily handle both slope and intercept heterogeneity.

Next, we would like to highlight that one can easily estimate the first difference model (that eliminates any time-invariant individual characteristics) within our TVEM framework, which may eliminate any residual concerns about unobserved heterogeneity. Finally, we would also like to highlight that we estimated the random-intercepts model using a Bayesian Markov Chain Monte Carlo technique that can easily account for unobserved heterogeneity. We can share the WinBUGS code upon request. The code can be easily extended by scholars to other environments (e.g., R or C++) for future research.

Accounting for Endogeneity of Other Transactional Variables

Whereas we correct for the potential endogeneity of MAILS in the proposed model as that is the only variable that the focal firm can directly influence, one can argue that our other transactional variables may also be endogenous and should be corrected for.29 For example, REDEEM may be endogenous as consumers using coupons may be qualitatively different (e.g., consumers who have higher intention to purchase may search for and use coupons actively) from other customers. Similarly, CROSSBUY and RETURNS may be considered endogenous. For example, Kumar et al. (2008) document that customers who shop across multiple product categories shop more; similarly, Petersen and Kumar (2009) demonstrate that returns increase sales by reducing the uncertainty associated

26 We replicated the analysis multiple times on different sets of customers and obtained the same pattern of results. We assume that both the random intercepts and the slope parameters follow a normal distribution. However, scholars can explore the idea of specifying these parameters as functions of individual characteristics (i.e., hierarchical specification). We thank an anonymous reviewer for this suggestion.

27 Please note the TVEM approach of recovering the temporal variations in relationships between variables is fundamentally distinct from personalization techniques that allow individual-level customization such as recommendation systems that assume stable preferences and recommend products “based upon a description of the item and a profile of the user's interests” (Pazzani and Billsus 2007, p. 325).

28 While it may be very expensive to manage, firms can create micro-segments (with a few hundred customers in each segment), if desired.

29 We thank an anonymous reviewer for highlighting the potential endogeneity of our transaction variables.


with product purchases. In sum, in addition to MAILS, our estimation approach may also have to account for the potential endogeneity of REDEEM, CROSSBUY, and RETURNS.

Accordingly, we estimate an additional model to test and correct for the endogeneity of the three transactional variables (i.e., REDEEM, RETURNS, and CROSSBUY), in addition to our focal marketing variable (i.e., MAILS). Specifically, we follow the same control function approach, where we first predict the potentially endogenous variable using a set of exogenous variables (e.g., demographics) and use the correction terms (residuals) from the first stage as additional explanatory variables in our final model. We use the cumulative values of the respective variables as exclusion variables for identification purposes. The cumulative value of the respective transaction characteristics (Cum_REDEEM, Cum_RETURNS, and Cum_CROSSBUY) is likely to be a good predictor of the current value of the respective variable, but is less likely to be a good predictor for current sales. Indeed, the correlation between the cumulative and the current value of each transaction variable is high, whereas there is almost no correlation between the cumulative transaction variables and current sales. Thus, the final model that we estimate is as follows:

\[
S_{ij} = \beta_0(t_{ij}) + \cdots + \gamma_1 \eta_{1ij}^{MAILS} + \gamma_2 \eta_{2ij}^{REDEEM} + \gamma_3 \eta_{3ij}^{RETURNS} + \gamma_4 \eta_{4ij}^{CROSSBUY} + \gamma_5 IMR_i + \varepsilon_{ij} \tag{14}
\]

where γ_i, i = 1, …, 5 are the coefficients for the correction terms.

While results from the final model confirm the endogeneity of our transactional variables as all the correction terms are significant, the results are qualitatively identical to the ones reported here and can be obtained from the authors, providing confidence in our results.30

Incorporating Long-Term Effects of Marketing-Mix Elements

We include only the contemporaneous MAILS in our model. Thus, our model does not account for the long-term effects of MAILS sent in previous periods. While this may be valid for promotional materials that are time sensitive and typically have a very small redemption period (typically less than 10 days, in our case), other marketing-mix elements may have persistent effects (Osinga et al. 2010).31

In line with Osinga et al. (2010, p. 174) who indicate that “persistent effects can occur only in nonstationary series,” we performed a unit root test and found overwhelming evidence against the null hypothesis of a unit root and conclude that all of our focal variables are stationary. That is, there is no evidence of any long-run (or persistent) effects in the data as there is no evolutionary or long-run component in SALES (Dekimpe and Hanssens 1995).

However, we acknowledge that other marketing mix elements may have persistent effects and that there could also be delays in consumers’ responses to marketing actions.32 For such situations, the TVEM framework can be easily extended to include additional lags of marketing inputs. We recommend a cautious approach as including additional lags can cause multicollinearity issues.

Incorporating Additional Data

As we suggested earlier, our original objective is to have firms update their resource allocations frequently as and when new data becomes available. This leads to a question about the impact of such frequent updates on the results of our model. Clearly, as new data comes in, some of the estimates will change to reflect the changes in the shape of the coefficient function. However, to establish the reliability of our approach and for managers to trust our model, it is imperative that our results do not change dramatically (e.g., change signs or large changes in the magnitudes of our estimates) every time new data comes in. To alleviate those concerns, we re-estimate our model for a subsample of 259,308 customers for whom we have 40 months of transaction information and present the results in Figure 4. Minor differences aside, our results are very similar to those that we presented for the 36-month sample earlier, lending further credibility and confidence to our approach.

Notes: Plots of control variables (e.g., PDENS, HSIZE) are omitted to conserve space; the same can be requested from the authors.

Figure 4. Results from Time-Varying Effect Model with Extended Sample (40 Months)

A related concern that one may have is whether it is optimal to use all data (starting from the first period) to estimate the model. It is possible that using a “moving window approach” to emphasize more recent behavioral data may yield more accurate sales predictions.33 Given that consumer preferences evolve, their recent behavior should be a better predictor of their future actions, making the moving window approach intuitively attractive.

To test this assumption, we estimated our final model with different moving windows (most recent 12, 18, 24, 30, and 36 months of data) to predict the sales for the next month and measure the predictive accuracy using the standard measure, mean absolute deviation (MAD).34 The results confirm the intuition that the recent data can produce more accurate sales predictions. In our analysis, we get the best prediction using the most recent 24 months of data to make one-month forward predictions. Specifically, the MADs for a one-month forward prediction using the most recent 12, 18, 24, 30, and 36 months of data are .31, .06, .01, .31, and .55, respectively. Clearly, the interval will be different for other contexts, but these results highlight that it may not be necessary to use all of the data (starting from the first period) to make better predictions and thus significantly lower the estimation complexities.35 We hasten

30 The results of our first stage models and those of the final model (Equation 14) can be obtained from the authors.

31 We thank an anonymous reviewer for this insight.

32 We thank an anonymous reviewer for this insight.

33 We thank an anonymous reviewer and the editor for this comment.

34 Other metrics such as root mean squared error (RMSE) and mean absolute error (MAE) provide similar conclusions.

35 For example, we estimated another model to predict the sales for the next quarter (four months) using the same moving window approach and found that using the most recent 30 months of data provides the best predictions. Thus, managers need to identify the optimal number of data points required for the task at hand.
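The control function correction behind Equation 14 has two stages: predict the potentially endogenous variable from exogenous and exclusion variables, then add the first-stage residual to the final model as an extra regressor. The sketch below illustrates the idea on synthetic data with a single endogenous regressor. The data-generating process, the variable names, and the small `ols` helper are assumptions made for illustration; the sketch omits the spline terms and the inverse Mills ratio and is not the authors' specification.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


rng = random.Random(0)
n, true_effect = 5000, 2.0
z = [rng.gauss(0, 1) for _ in range(n)]   # exclusion variable (hypothetical)
u = [rng.gauss(0, 1) for _ in range(n)]   # unobserved driver of both x and y
x = [zi + ui for zi, ui in zip(z, u)]     # endogenous regressor
y = [true_effect * xi + 1.5 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

# Stage 1: regress the endogenous variable on the exclusion variable.
a0, a1 = ols([[1.0, zi] for zi in z], x)
resid = [xi - (a0 + a1 * zi) for xi, zi in zip(x, z)]

# Stage 2: add the first-stage residual as a correction term.
_, cf_slope, gamma = ols([[1.0, xi, ri] for xi, ri in zip(x, resid)], y)

# Naive regression that ignores endogeneity, for comparison.
_, naive_slope = ols([[1.0, xi] for xi in x], y)

print("naive slope:", round(naive_slope, 2))
print("control-function slope:", round(cf_slope, 2))
print("correction-term coefficient:", round(gamma, 2))
```

In this synthetic setup the naive slope is biased upward because the unobserved driver raises both the regressor and the outcome. A correction-term coefficient that is clearly non-zero is the signal of endogeneity, analogous to the significant correction terms reported for Equation 14.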


to highlight that if the objective is to understand how the coefficient function evolves so that the learning can be used for a new cohort of customers, firms may want to use the full data.

Managerial Implications and Discussion

Firms are collecting more data on their customers than ever before, which offers both significant opportunities and challenges as firms try to convert all of this data into actionable insights. A recent McKinsey report, suggesting that retailers using big data can increase their operating margin by more than 60%, duly highlights this opportunity (Manyika et al. 2011). However, Parmar et al. (2014, p. 89) argue “companies are notoriously bad at finding ways to make money” and suggest that companies use data they own (or have access to) to facilitate growth. Along this vein, the objective of our research was to utilize volumes of customer transaction data that organizations collect to guide their marketing resource allocation decisions across customers. Specifically, we examine a retailer’s decision to send marketing mailers to customers. We conceptualize that customers’ response to such mailers changes as they become familiar (and establish a relationship) with the company and that firms should adjust their marketing allocations accordingly on a frequent basis to increase the returns from their marketing investments. Given the dynamic environment, frequent updating of resource allocation decisions is essential as suggested by Fruk et al. (2013) who demonstrate that frequent reallocation of resources results in superior performance. Accordingly, we propose and demonstrate a time-varying effects model that explicitly accounts for the temporal changes in the effectiveness of the marketing-mix elements and provide actionable insights to managers. The proposed model relies on nonparametric assumptions and hence is ideal for the big data context. The results from our time-varying effects model provide strong support for our framework and we urge managers to reconsider their marketing resource allocation decisions. Firms can significantly increase performance and create value by targeting customers at the right time, when they are proven to be most responsive to the firm’s communications, as opposed to sending communications on an ad hoc basis, when they are less likely to respond to organizational communications. Our results are both practically significant and academically relevant, as we discuss in this section.

same. Specific to our context, we find that firms can increase their revenues per customer from $156.83 (under the baseline model that assumes a constant effect of marketing actions) to $183.03, by using the proposed TVEM approach over a 36-month window (i.e., gain an increase of 17% in revenues per customer without any additional expenditure).36 Stated differently, firms can achieve the same level of sales as under the baseline scenario even after reducing the number of mailers sent under the baseline scenario by a fifth, indicating savings over 20% in marketing budgets. These results provide some credence to the McKinsey report that suggests that retailers can increase their margins by more than 60% by using big data to its full potential (Manyika et al. 2011).

The results from our model can help managers reallocate their marketing resources to increase sales per customer by over 17% without any additional investment.37 Moreover, executives can easily use the proposed model to make projections for future time periods and allocate their marketing resources accordingly.38 We simulated this forward-looking resource allocation approach, wherein we used the parameter estimates at the end of the 36th month to make resource allocation decisions for the next four months (months 37 through 40). The results suggest that, on average over the four-month period, firms can increase their revenues by up to 20.55% from their current levels using the TVEM approach compared to only 8.86% using the baseline (time-invariant) model without additional investments.39 Although we make projections

36 The absolute sales numbers do not include the effects of other variables.

37 We used the first order condition to compute the optimal number of mails to be sent using the baseline model. The baseline model suggests that the optimal number of MAILS = \( -\frac{0.412}{2 \times (-0.013)} \approx 16 \), that is, 16 mails every month for a total of 36 × 16 = 576 mails. We then use the results from the TVEM model to reallocate the 576 mails that were sent under the baseline scenario and compute the predicted sales per customer, which turns out to be $183.03, compared to $156.83 under the baseline scenario; that is, an increase of 17% ((183.03 - 156.83)/156.83 ≈ 17%) without any additional marketing investments. Please note that our semi-parametric approach does not lend itself to traditional optimization techniques. Instead, we apply the first order condition at each time point to compute the number of mails to be sent in each period under TVEM.

38 We would like to remind readers that although TVEM is an explanatory model, it has nice predictive properties that can be used for predictive purposes.

39 For the forward-looking resource allocation task, we compare the predicted SALES for next four months (months 37 through 40) by multiplying the parameter estimates of the two models and the recommended number of mails
To managers, we highlight that using a constant resource sent. Once again, not surprisingly, the TVEM model delivers better numbers.
allocation approach is suboptimal. Instead, we recommend The results suggest that, on average over the four month period, firms can
that firms take into account customers’ changing behaviors in increase their revenues by up to 20.55% from their current levels using the
TVEM approach compared to only 8.86% using the baseline (time-invariant)
their resource allocation decisions and constantly update the
model without additional investments.
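
The arithmetic in footnote 37 can be retraced with a short script. The sketch below assumes that the baseline model is quadratic in MAILS, so that the first order condition gives MAILS* = −b1/(2·b2). The coefficient values 0.412 and −0.013 are read from the footnote's formula; the function and variable names are ours and are introduced only for illustration.

```python
def optimal_mails(linear_coef: float, quadratic_coef: float) -> float:
    """Maximizer of a + b1*MAILS + b2*MAILS**2, from the first order condition.

    Assumes a concave quadratic response (b2 < 0); this functional form is an
    assumption consistent with footnote 37, not a restatement of the full model.
    """
    if quadratic_coef >= 0:
        raise ValueError("quadratic coefficient must be negative for an interior maximum")
    return -linear_coef / (2.0 * quadratic_coef)


def percent_lift(new_value: float, old_value: float) -> float:
    """Percentage change from old_value to new_value."""
    return 100.0 * (new_value - old_value) / old_value


# Baseline coefficients and horizon as reported in footnote 37.
B1, B2 = 0.412, -0.013
MONTHS = 36

mails_per_month = round(optimal_mails(B1, B2))  # 15.85 rounds to 16
total_mails = MONTHS * mails_per_month          # mails sent under the baseline
lift = percent_lift(183.03, 156.83)             # TVEM vs. baseline sales per customer

print(mails_per_month, total_mails, round(lift, 1))
```

Under these inputs the script gives 16 mails per month, 576 mails over 36 months, and a lift of about 16.7%, which the text rounds to 17%.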

MIS Quarterly Vol. 40 No. 4/December 2016 933


Saboo et al./Using Big Data to Model Time-Varying Effects

for four periods for this exercise, in line with the McKinsey report that encourages frequent reallocation of resources (Manyika et al. 2011), we encourage executives to estimate the model in every decision period to frequently update their resource allocation strategy. To investigate the value of frequent resource allocation, we repeated the forward-looking resource allocation simulation for one period only: we used the results at the end of the 36th month to allocate resources for the 37th month. Results from the one-period forward simulation are in line with our expectations; firms can increase their revenues by over 21.54% from their current levels using the TVEM approach compared to only 12.12% using the recommendations from the baseline model. Thus, our results provide empirical support to the McKinsey study (Manyika et al. 2011) that calls for frequent resource allocation by incorporating the latest information.

While it is not surprising to learn that consumer response to marketing mailers changes over time, ours is the first research that allows managers to recover the exact pattern of behavior change (i.e., shape of the change) without forcing any assumptions about the functional form and take actions based on such changes. Most importantly, the proposed model does not impose any additional requirements and can be easily implemented by most organizations using their existing resources. Finally, unlike some of the other models in this domain that are resource intensive and can only be implemented infrequently, the proposed framework can be easily run frequently so that managers can adjust their resource allocations on a “real-time” basis. Although the TVEM framework can be applied to a long list of variables that firms may have, we suggest that they use managerial insights and simple models (e.g., stepwise regression) to identify the important variables and use time-varying effects for these variables. Given the ease of implementation of our framework, we encourage managers to run the proposed model in every period (when new data becomes available) and adjust their resource allocation accordingly. To facilitate the adoption of our approach, we direct managers to Ngo and Wand (2004) and Tan et al. (2012), who provide friendly implementation guides for SAS, R, and S-Plus; we can share the WinBUGS code for the Bayesian implementation upon request.

Implications for Theory

In addition to the practical significance of our study, our research also contributes to the literature on marketing resource allocation, dynamic modeling, and big data analytics.

To the marketing resource allocation literature (e.g., Montoya et al. 2010; Naik et al. 2005), we provide compelling arguments for including the impact of temporal variations in the effectiveness of marketing-mix elements. Although scholars acknowledge that consumer response to marketing activities can vary over time due to a variety of factors such as consumer learning (Narayanan and Manchanda 2009), effects of marketing actions (Janakiraman et al. 2008), or changes in consumer preferences (Neelamegham and Chintagunta 2004), the models for “dynamic marketing resource allocation typically assume that marketing effectiveness is constant over time” (Raman et al. 2012, p. 910).40 Thus, ours is the first empirical study in the dynamic resource allocation literature to model the temporal changes in marketing-mix effectiveness and recover the change function for firms to act on it. We acknowledge that the proposed methodology may not be suitable for small panels (T < 10; Tan et al. 2012); however, with the proliferation of big data and the availability of large datasets through other sources (e.g., Wharton Research Data Services, Wharton Customer Analytics Initiative, World Bank), we hope scholars consider the temporal variations in their models.

To the dynamic modeling literature, we offer an alternative to Kalman filtering (e.g., Osinga et al. 2010; Sriram et al. 2006) and dynamic linear models (DLM) (Ataman et al. 2007; Van Heerde et al. 2004) to incorporate temporal variations in the effectiveness of marketing-mix elements. While both Kalman filtering and DLM have desirable properties and are attractive options to include time-varying parameters, they make assumptions about the underlying state-space, “take several hours or days” to estimate, and require complicated coding in matrix language (Leeflang et al. 2009, p. 15). The proposed TVEM framework does not make strong assumptions about the underlying states and can be easily implemented, providing an attractive alternative to the two established methodologies for dynamic models. More importantly, unlike Kalman filtering and DLM that assume discrete time and discrete state space (Dekimpe et al. 2008; Pauwels et al. 2004), our proposed framework can be used for continuous time and continuous state space.41 Discrete time models suffer from temporal aggregation bias and depend on the observed data frequency for model development (Bergstrom and Nowman 2007). Continuous time modeling avoids such issues and, more importantly, allows predictions for shorter intervals, allowing firms to act in real time (Bergstrom 1996).

40 For an exception, see Wiesel et al. (2011) who use a VAR specification to explicitly account for the fact that (online and offline) media communications can have varying effectiveness over different stages of consumers’ “purchase funnels,” but do not recover the shape of the change function.

41 Xie et al. (1997) propose an extension to the Kalman-filtering approach to accommodate continuous state and discrete observations in the context of new product diffusion.
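
To illustrate what evaluating a time-varying coefficient in continuous time can look like, the sketch below represents β(t) with a truncated-linear spline basis, a common choice in the penalized-spline literature (e.g., Ruppert et al. 2003). This representation is an assumption made for illustration and not a restatement of the specification estimated in this study; all coefficient and knot values are hypothetical placeholders rather than estimates.

```python
from typing import Sequence


def beta_t(t: float, intercept: float, slope: float,
           knots: Sequence[float], knot_coefs: Sequence[float]) -> float:
    """Evaluate beta(t) = intercept + slope*t + sum_k u_k * max(t - kappa_k, 0)."""
    value = intercept + slope * t
    for kappa, u in zip(knots, knot_coefs):
        value += u * max(t - kappa, 0.0)  # truncated-linear basis function
    return value


# Hypothetical placeholders (NOT estimates from the study): a mailer effect
# that declines quickly at first and then flattens after months 12 and 24.
INTERCEPT, SLOPE = 0.9, -0.04
KNOTS = (12.0, 24.0)
KNOT_COEFS = (0.02, 0.01)


def mailer_effect(t: float) -> float:
    """Illustrative effectiveness of MAILS at relationship age t (in months)."""
    return beta_t(t, INTERCEPT, SLOPE, KNOTS, KNOT_COEFS)


# The function is defined for any real-valued t, not only whole months.
for month in (0.0, 12.0, 24.0, 36.0, 36.5):
    print(month, round(mailer_effect(month), 3))
```

Because β(t) is defined for any real t, the same function can be queried at month 36.5 as easily as at month 36, which is the property the continuous-time argument relies on.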


Similarly, discrete space models are sensitive to the assumptions about the number of underlying states (Leeflang et al. 2009). Thus, our research answers the call by Leeflang et al. (2009), who argue for the importance of accounting for time-varying parameters: our study offers a potential solution for scholars looking for ways to do the same, one that can be easily implemented in other contexts with large panels where the relationship between variables can be expected to vary. Further, TVEM offers an alternative approach to functional regression analysis, which is designed for sparse longitudinal data, where both the predictor and response are functions of a covariate such as time (Shi and Choi 2011).42 Given our focus on the managerial application of the model where purchases (or most other covariates) do not have a smooth functional form and where firms do not suffer from limited information (or measurement) issues, the proposed TVEM approach presents a nice complement to the functional data analysis. TVEM does not assume any underlying functional form for variables and uses all of the available data without any major investments within existing organizational infrastructure, and allows firms to reap the purported benefits of big data. We firmly believe that TVEM should be viewed as another tool in the toolkit of scholars working on longitudinal data that offers significant value without much up-front cost.

Finally, our research contributes to the emerging stream of research on big data analytics. The TVEM approach allows firms to understand temporal variations in the relationships between variables of interest, enabling them to adjust strategies in real time. Thus, our research responds to the call by Goes (2014, p. vi) to help firms in the “generation of knowledge and intelligence to support decision making” using big data.

Opportunities for Future Research

Our research addresses an important gap in the dynamic marketing resource allocation literature by incorporating temporal variations in the effectiveness of marketing-mix elements and we sincerely hope that our research will motivate scholars to consider temporal variations in their models. We outline the potential substantive as well as methodological topics that scholars can fruitfully pursue.

In terms of substantive topics, scholars can investigate the temporal effects of a variety of marketing activities. Scholars can investigate other customer-focused resource allocation activities such as targeted promotions or salesperson resource allocation. Similarly, scholars can easily extend our approach to other units of analysis; for example, one can optimize advertising spending by investigating advertising effectiveness across markets over time or make calculated market entry or exit decisions such as an airline’s decision to fly (or stop flying) to a market. One can easily tweak our approach to investigate firm-level decisions such as investments in geographical markets or product technology. Another area of future research would be to examine how marketing activities (such as mailers) influence online and offline purchases and the substitution effect between the two.43

Although our model is fairly robust and flexible enough to accommodate a range of topics, we believe that scholars can build upon our framework and extend it even further. First, although we account for individual heterogeneity using the random-intercept specification and slope-heterogeneity (i.e., allow for the slopes to vary across customers) using segmentation techniques, we are barely scratching the surface here. Consumers react differently to marketing stimuli and scholars can find ways to personalize marketing actions instead of estimating an “average effect” to ensure the effectiveness of marketing resource allocation. Second, instead of the time factor that we use as a reference in our framework, scholars can experiment with other reference variables such as size or age. Time may not be a valid reference in many situations, thereby making our proposed approach of limited value, for example when the initiation of a relationship is not observed. Alternatively, scholars can treat the unobserved starting point as a missing data problem and extend the model using data augmentation techniques to estimate the model when the starting point is not observed. Third, scholars can examine persistent or long-term effects of marketing-mix elements. Although we can easily incorporate additional lags of marketing-mix elements within the TVEM framework, other robust techniques that account for the transient and persistent effects of marketing-mix elements should be explored. Another fruitful avenue is to model how some of the other variables influence the shape of the change function and how firms can alter the shape of the change function to their advantage. For instance, our results (Figure 2) show that the

42 Functional regression is a particular case of functional data analysis (FDA) and a generalization of regression to the case when outcomes or regressors or both are functional objects instead of scalars (Bapna et al. 2008; Crainiceanu et al. 2009). Thus, unlike our TVEM approach where the focus is on estimating the smooth coefficient function β(t) (or the evolving relation between variables), functional regression analysis emphasizes using nonparametric techniques to develop functional variables x(t) from limited observed data. An essential assumption of the FDA approach is that the ith response is a smooth real function with associated covariate vector (i.e., there exists a smooth functional behavior of the generating process that underpins the data; Faraway 1997). This property allows one to use the limited number of data points available (possibly with error) to recover the functional form of the variables and use the same to carry out functional regression.

43 We thank an anonymous reviewer for this suggestion.


effectiveness of mailers (MAILS) decreases with time and any research that can highlight ways to prevent this decline in the effectiveness of MAILS will be highly valuable. Finally, scholars can examine alternate dependent variables such as profitability (which we could not model as we do not have the cost information) or the rate of coupon redemption.44

44 We thank an anonymous reviewer for this suggestion.

Acknowledgments

We thank the coeditors of the special issue on big data and the review team for their valuable guidance in the revision process. We thank the firm for providing us the data for the study. We thank Anita Luo, Denish Shah, Yi Zhao, and Gayatri Shukla for their comments on an earlier version of the manuscript. We also thank Renu for copyediting the manuscript.

References

Anderson, E. T., Hansen, K., and Simester, D. 2009. “The Option Value of Returns: Theory and Empirical Evidence,” Marketing Science (28:3), pp. 405-423.
Andrews, R. L., Ainslie, A., and Currim, I. S. 2002. “An Empirical Comparison of Logit Choice Models with Discrete Versus Continuous Representations of Heterogeneity,” Journal of Marketing Research (39:4), pp. 479-487.
Arellano, M., and Bover, O. 1995. “Another Look at the Instrumental Variable Estimation of Error-Components Models,” Journal of Econometrics (68:1), pp. 29-51.
Ataman, M. B., Mela, C. F., and Van Heerde, H. J. 2007. “Consumer Packaged Goods in France: National Brands, Regional Chains, and Local Branding,” Journal of Marketing Research (44:1), pp. 14-20.
Bago d'Uva, T. 2005. “Latent Class Models for Use of Primary Care: Evidence from a British Panel,” Health Economics (14:9), pp. 873-892.
Baker, F. 2013. “Improving Targeting: Lessons from DIY Behaviour Online,” (http://marketingblogged.marketingmagazine.co.uk/2013/08/20/improving-targeting-lessons-from-diy-behaviour-online/; retrieved December 31, 2013).
Baladandayuthapani, V., Mallick, B. K., and Carroll, R. J. 2005. “Spatially Adaptive Bayesian Penalized Regression Splines (P-Splines),” Journal of Computational and Graphical Statistics (14:2), pp. 378-394.
Bapna, R., Jank, W., and Shmueli, G. 2008. “Price Formation and Its Dynamics in Online Auctions,” Decision Support Systems (44:3), pp. 641-656.
Bendapudi, N., and Berry, L. L. 1997. “Customers’ Motivations for Maintaining Relationships with Service Providers,” Journal of Retailing (73:1), pp. 15-37.
Berger, P. D., and Bechwati, N. N. 2001. “The Allocation of Promotion Budget to Maximize Customer Equity,” Omega (29:1), pp. 49-61.
Bergstrom, A. R. 1996. Survey of Continuous Time Econometrics, New York: Cambridge University Press.
Bergstrom, A. R., and Nowman, K. B. 2007. A Continuous Time Econometric Model of the United Kingdom with Stochastic Trends, New York: Cambridge University Press.
Bierens, H. J., and Pott-Buter, H. A. 1991. “Specification of Household Engel Curves by Nonparametric Regression,” Econometric Reviews (9:2), pp. 123-184.
Blundell, R., and Bond, S. 1998. “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models,” Journal of Econometrics (87:1), pp. 115-143.
Blundell, R., Bond, S., and Windmeijer, F. 2000. “Estimation in Dynamic Panel Data Models: Improving on the Performance of the Standard GMM Estimator,” in Nonstationary Panels, Panel Cointegration, and Dynamic Panels, B. H. Baltagi (ed.), New York: Elsevier Science Inc., pp. 53-91.
Chapra, S. C. 2011. Applied Numerical Methods with Matlab for Engineers and Scientists (3rd ed.), New York: McGraw-Hill.
Chen, H., Chiang, R. H., and Storey, V. C. 2012. “Business Intelligence and Analytics: From Big Data to Big Impact,” MIS Quarterly (36:4), pp. 1165-1188.
Chen, J., and Stallaert, J. 2014. “An Economic Analysis of Online Advertising Using Behavioral Targeting,” MIS Quarterly (38:2), pp. 429-449.
Clark, A., Etilé, F., Postel Vinay, F., Senik, C., and Van der Straeten, K. 2005. “Heterogeneity in Reported Well Being: Evidence from Twelve European Countries,” The Economic Journal (115:502), pp. C118-C132.
Cotton, B., and Babb, E. M. 1978. “Consumer Response to Promotional Deals,” Journal of Marketing (42:3), pp. 109-113.
Crainiceanu, C. M., Staicu, A.-M., and Di, C.-Z. 2009. “Generalized Multilevel Functional Regression,” Journal of the American Statistical Association (104:488), pp. 1550-1561.
Deighton, J., Henderson, C. M., and Neslin, S. A. 1994. “The Effects of Advertising on Brand Switching and Repeat Purchasing,” Journal of Marketing Research (31:1), pp. 28-43.
Dekimpe, M. G., Franses, P. H., Hanssens, D. M., and Naik, P. A. 2008. “Time-Series Models in Marketing,” in Handbook of Marketing Decision Models, B. Wierenga (ed.), New York: Springer, pp. 373-398.
Dekimpe, M. G., and Hanssens, D. M. 1995. “The Persistence of Marketing Effects on Sales,” Marketing Science (14:1), pp. 1-21.
Dick, A. S., and Basu, K. 1994. “Customer Loyalty: Toward an Integrated Conceptual Framework,” Journal of the Academy of Marketing Science (22:2), pp. 99-113.
Divakar, S., Ratchford, B. T., and Shankar, V. 2005. “Practice Prize Article—CHAN4CAST: A Multichannel, Multiregion Sales Forecasting Model and Decision Support System for Consumer Packaged Goods,” Marketing Science (24:3), pp. 334-350.
Doctorow, D., Hoblit, R., and Sekhar, A. 2009. “Measuring Marketing: McKinsey Global Survey Results,” McKinsey & Company (http://www.mckinsey.com/insights/marketing_sales/measuring_marketing_mckinsey_global_survey_results; accessed December 1, 2013).
Dubé, J. P., Hitsch, G. J., and Rossi, P. E. 2010. “State Dependence and Alternative Explanations for Consumer Inertia,” RAND Journal of Economics (41:3), pp. 417-445.


Eastlick, M. A., Feinberg, R., and Trappey, C. 1993. “Information Overload in Mail Catalog Shopping: How Many Catalogs Are Too Many?,” Journal of Direct Marketing (7:4), pp. 14-19.
Eilers, P. H., and Marx, B. D. 1996. “Flexible Smoothing with B-Splines and Penalties,” Statistical Science (11:2), pp. 89-102.
Elberse, A., and Eliashberg, J. 2003. “Demand and Supply Dynamics for Sequentially Released Products in International Markets: The Case of Motion Pictures,” Marketing Science (22:3), pp. 329-354.
Elsner, R., Krafft, M., and Huchzermeier, A. 2004. “Optimizing Rhenania's Direct Marketing Business through Dynamic Multilevel Modeling (DMLM) in a Multicatalog-Brand Environment,” Marketing Science (23:2), pp. 192-206.
Fahrmeir, L., Kneib, T., Lang, S., and Marx, B. 2013. Regression—Models, Methods, and Applications, New York: Springer.
Faraway, J. J. 1997. “Regression Analysis for a Functional Response,” Technometrics (39:3), pp. 254-261.
Foekens, E. W., Leeflang, P. S., and Wittink, D. R. 1994. “A Comparison and an Exploration of the Forecasting Accuracy of a Loglinear Model at Different Levels of Aggregation,” International Journal of Forecasting (10:2), pp. 245-261.
Fruk, M., Hall, S., and Mittal, D. 2013. “Never Let a Good Crisis Go to Waste,” McKinsey Quarterly (4), pp. 56-59.
Garen, J. 1984. “The Returns to Schooling: A Selectivity Bias Approach with a Continuous Choice Variable,” Econometrica (52:5), pp. 1199-1218.
Givon, M., and Horsky, D. 1990. “Untangling the Effects of Purchase Reinforcement and Advertising Carryover,” Marketing Science (9:2), pp. 171-187.
Goes, P. 2014. “Editor’s Comments: Big Data and IS Research,” MIS Quarterly (38:3), pp. iii-viii.
Gönül, F. F., and Ter Hofstede, F. 2006. “How to Compute Optimal Catalog Mailing Decisions,” Marketing Science (25:1), pp. 65-74.
Gupta, S., and Steenburgh, T. 2008. “Allocating Marketing Resources,” in Marketing Mix Decisions: New Perspectives and Practices, R. A. Kerin and R. O’Regan (eds.), Chicago: American Marketing Association.
Hastie, T., and Tibshirani, R. 1993. “Varying-Coefficient Models,” Journal of the Royal Statistical Society Series B (Methodological) (55:4), pp. 757-796.
Heckman, J. J. 1979. “Sample Selection Bias as a Specification Error,” Econometrica (47:1), pp. 153-161.
Hoch, S. J., and Ha, Y.-W. 1986. “Consumer Learning: Advertising and the Ambiguity of Product Experience,” Journal of Consumer Research (13:October), pp. 221-233.
Hollander, M., Wolfe, D. A., and Chicken, E. 2013. Nonparametric Statistical Methods (3rd ed.), Hoboken, NJ: John Wiley & Sons.
Imbens, G. W., and Wooldridge, J. M. 2007. “Control Function and Related Methods,” What’s New in Econometrics, Lecture Notes 6, Summer 2007, National Bureau of Economic Research.
Jain, M. K. 2003. Numerical Methods for Scientific and Engineering Computation, New Delhi, India: New Age International.
Janakiraman, R., Dutta, S., Sismeiro, C., and Stern, P. 2008. “Physicians’ Persistence and its Implications for Their Response to Promotion of Prescription Drugs,” Management Science (54:6), pp. 1080-1093.
Kopalle, P. K., Mela, C. F., and Marsh, L. 1999. “The Dynamic Effect of Discounting on Sales: Empirical Analysis and Normative Pricing Implications,” Marketing Science (18:3), pp. 317-332.
Kumar, V. 2013. Profitable Customer Engagement: Concept, Metrics and Strategies, Thousand Oaks, CA: SAGE Publications.
Kumar, V., Bhaskaran, V., Mirchandani, R., and Shah, M. 2013. “Creating a Measurable Social Media Marketing Strategy for Hokey Pokey: Increasing the Value and ROI of Intangibles and Tangibles,” Marketing Science (32:2), pp. 194-212.
Kumar, V., George, M., and Pancras, J. 2008. “Cross-Buying in Retailing: Drivers and Consequences,” Journal of Retailing (84:1), pp. 15-27.
Kumar, V., and Reinartz, W. 2012. Customer Relationship Management: Concept, Strategy, and Tools (2nd ed.), New York: Springer Science & Business Media.
Kumar, V., Zhang, X. A., and Luo, A. 2014. “Modeling Customer Opt-in and Opt-out in a Permission-Based Marketing Context,” Journal of Marketing Research (51:4), pp. 403-419.
Leeflang, P. S., Bijmolt, T. H., Van Doorn, J., Hanssens, D. M., Van Heerde, H. J., Verhoef, P. C., and Wieringa, J. E. 2009. “Creating Lift Versus Building the Base: Current Trends in Marketing Dynamics,” International Journal of Research in Marketing (26:1), pp. 13-20.
Leeflang, P. S. H., Wittink, D. R., Wedel, M., and Naert, P. A. 2000. Building Models for Marketing Decisions, Boston: Kluwer Academic Publishers.
Lewis, M. 2004. “The Influence of Loyalty Programs and Short-Term Promotions on Customer Retention,” Journal of Marketing Research (41:3), pp. 281-292.
Lin, M., Lucas Jr., H. C., and Shmueli, G. 2013. “Too Big to Fail: Large Samples and the P-Value Problem,” Information Systems Research (24:4), pp. 906-917.
Luan, Y. J., and Sudhir, K. 2010. “Forecasting Marketing-Mix Responsiveness for New Products,” Journal of Marketing Research (47:3), pp. 444-457.
Mahajan, V., Bretschneider, S. I., and Bradford, J. W. 1980. “Feedback Approaches to Modeling Structural Shifts in Market Response,” Journal of Marketing (44:1), pp. 71-80.
Manchanda, P., and Chintagunta, P. K. 2004. “Responsiveness of Physician Prescription Behavior to Salesforce Effort: An Individual Level Analysis,” Marketing Letters (15:2-3), pp. 129-145.
Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., and Byers, A. H. 2011. “Big Data: The Next Frontier for Innovation, Competition, and Productivity,” McKinsey Global Institute.
McGoldrick, P. J., and Collins, N. 2007. “Multichannel Retailing: Profiling the Multichannel Shopper,” International Review of Retail, Distribution and Consumer Research (17:2), pp. 139-158.
Mela, C. F., Jedidi, K., and Bowman, D. 1998. “The Long-Term Impact of Promotions on Consumer Stockpiling Behavior,” Journal of Marketing Research (35:2), pp. 250-262.
Mizik, N., and Jacobson, R. 2003. “Trading Off between Value Creation and Value Appropriation: The Financial Implications of Shifts in Strategic Emphasis,” Journal of Marketing (67:1), pp. 63-76.
Montoya, R., Netzer, O., and Jedidi, K. 2010. “Dynamic Allocation of Pharmaceutical Detailing and Sampling for Long-Term Profitability,” Marketing Science (29:5), pp. 909-924.


Murray, M. P. 2005. Econometrics: A Modern Introduction, Boston: Prentice Hall.
Naik, P. A., and Raman, K. 2003. “Understanding the Impact of Synergy in Multimedia Communications,” Journal of Marketing Research (40:4), pp. 375-388.
Naik, P. A., Raman, K., and Winer, R. S. 2005. “Planning Marketing-Mix Strategies in the Presence of Interaction Effects,” Marketing Science (24:1), pp. 25-34.
Narayanan, S., and Manchanda, P. 2009. “Heterogeneous Learning and the Targeting of Marketing Communication for New Products,” Marketing Science (28:3), pp. 424-441.
Neelamegham, R., and Chintagunta, P. K. 2004. “Modeling and Forecasting the Sales of Technology Products,” Quantitative Marketing and Economics (2:3), pp. 195-232.
Ngo, L., and Wand, M. P. 2004. “Smoothing with Mixed Model Software,” Journal of Statistical Software (9:1), pp. 1-54.
Osinga, E. C., Leeflang, P. S., and Wieringa, J. E. 2010. “Early Marketing Matters: A Time-Varying Parameter Approach to Persistence Modeling,” Journal of Marketing Research (47:1), pp. 173-185.
Pagan, A., and Ullah, A. 1999. Nonparametric Econometrics, Cambridge, UK: Cambridge University Press.
Pan, Y., and Lehmann, D. R. 1993. “The Influence of New Brand Entry on Subjective Brand Judgments,” Journal of Consumer Research, pp. 76-86.
Parmar, R., Mackenzie, I., Cohn, D., and Gann, D. 2014. “The New Patterns of Innovation,” Harvard Business Review (92:1/2), pp. 86-95.
Parsons, L. J. 1975. “The Product Life Cycle and Time-Varying Advertising Elasticities,” Journal of Marketing Research (12:4), pp. 476-480.
Parsons, L. J., and Schultz, R. L. 1976. Marketing Models and Econometric Research, Amsterdam: North-Holland Publishing Company.
Pauwels, K., Currim, I., Dekimpe, M. G., Hanssens, D. M., Mizik, N., Ghysels, E., and Naik, P. 2004. “Modeling Marketing Dynamics by Time Series Econometrics,” Marketing Letters (15:4), pp. 167-183.
Pauwels, K., and Hanssens, D. M. 2007. “Performance Regimes and Marketing Policy Shifts,” Marketing Science (26:3), pp. 293-311.
Pazzani, M. J., and Billsus, D. 2007. “Content-Based Recommendation Systems,” in The Adaptive Web, P. Brusilovsky, A. Kobsa and W. Nejdl (eds.), Berlin: Springer-Verlag, pp. 325-341.
Pesaran, H., Smith, R., and Im, K. 1996. “Dynamic Linear Models for Heterogenous Panels,” in The Econometrics of Panel Data, L. Mátyás and P. Sevestre (eds.), Dordrecht, The Netherlands: Springer Netherlands, pp. 145-195.
Petersen, J. A., and Kumar, V. 2009. “Are Product Returns a Necessary Evil? Antecedents and Consequences,” Journal of Marketing (73:3), pp. 35-51.
Petrin, A., and Train, K. 2010. “A Control Function Approach to Endogeneity in Consumer Choice Models,” Journal of Marketing Research (47:1), pp. 3-13.
Petris, G., Petrone, S., and Campagnoli, P. 2009. Dynamic Linear Models with R, New York: Springer.
Raman, K., Mantrala, M. K., Sridhar, S., and Tang, Y. E. 2012. “Optimal Resource Allocation with Time-Varying Marketing Effectiveness, Margins and Costs,” Journal of Interactive Marketing (26:1), pp. 43-52.
Rao, A. R., and Monroe, K. B. 1988. “The Moderating Effect of Prior Knowledge on Cue Utilization in Product Evaluations,” Journal of Consumer Research (15:2), pp. 253-264.
Reinartz, W., Thomas, J., and Kumar, V. 2005. “Balancing Acquisition and Retention Resources to Maximize Customer Profitability,” Journal of Marketing (69:1), pp. 63-79.
Roodman, D. 2009. “A Note on the Theme of Too Many Instruments,” Oxford Bulletin of Economics and Statistics (71:1), pp. 135-158.
Rossi, P. E. 2014. “Even the Rich Can Make Themselves Poor: A Critical Examination of IV Methods in Marketing Applications,” Marketing Science (33:5), pp. 655-672.
Ruppert, D., Wand, M. P., and Carroll, R. J. 2003. Semiparametric Regression, Cambridge, UK: Cambridge University Press.
Rust, R. T., Ambler, T., Carpenter, G. S., Kumar, V., and Srivastava, R. K. 2004. “Measuring Marketing Productivity: Current Knowledge and Future Directions,” Journal of Marketing (68:4), pp. 76-89.
Saboo, A. R., Grewal, R., and Chakravarty, A. 2016. “Organizational Debut on the Public Stage: Marketing Myopia and Initial Public Offerings,” Marketing Science (35:4), pp. 656-675.
Schunck, R. 2014. Transnational Activities and Immigrant Integration in Germany, New York: Springer.
Seiders, K., Voss, G. B., Grewal, D., and Godfrey, A. L. 2005. “Do Satisfied Customers Buy More? Examining Moderating Influences in a Retailing Context,” Journal of Marketing (69:4), pp. 26-43.
Shankar, V. 2008. “Strategic Marketing Resource Allocation: Methods and Insights,” in Marketing Mix Decisions: New Perspectives and Practices, R. A. Kerin and R. O’Regan (eds.), Chicago: American Marketing Association, pp. 154-183.
Shankar, V., Azar, P., and Fuller, M. 2008. “Practice Prize Paper—BRAN*EQT: A Multicategory Brand Equity Model and Its Application at Allstate,” Marketing Science (27:4), pp. 567-584.
Shi, J. Q., and Choi, T. 2011. Gaussian Process Regression Analysis for Functional Data, Boca Raton, FL: CRC Press.
Simester, D. I., Sun, P., and Tsitsiklis, J. N. 2006. “Dynamic Catalog Mailing Policies,” Management Science (52:5), pp. 683-696.
Simonoff, J. S. 1996. Smoothing Methods in Statistics, New York: Springer.
Sloot, L. M., Fok, D., and Verhoef, P. C. 2006. “The Short- and Long-Term Impact of an Assortment Reduction on Category Sales,” Journal of Marketing Research (43:4), pp. 536-548.
Slotegraaf, R. J., and Pauwels, K. 2008. “The Impact of Brand Equity and Innovation on the Long-Term Effectiveness of Promotions,” Journal of Marketing Research (45:3), pp. 293-306.
Sriram, S., Chintagunta, P. K., and Neelamegham, R. 2006. “Effects of Brand Preference, Product Attributes, and Marketing Mix Variables in Technology Product Markets,” Marketing Science (25:5), pp. 440-456.

Steenkamp, J. B. E., and Baumgartner, H. 1992. “The Role of Optimum Stimulation Level in Exploratory Consumer Behavior,” Journal of Consumer Research (19:3), pp. 434-448.
Stern, P., and Hammond, K. 2004. “The Relationship Between Customer Loyalty and Purchase Incidence,” Marketing Letters (15:1), pp. 5-19.
Stremersch, S., and Lemmens, A. 2009. “Sales Growth of New Pharmaceuticals across the Globe: The Role of Regulatory Regimes,” Marketing Science (28:4), pp. 690-708.
Tan, X., Shiyko, M. P., Li, R., Li, Y., and Dierker, L. 2012. “A Time-Varying Effect Model for Intensive Longitudinal Data,” Psychological Methods (17:1), pp. 61-77.
Teel, J. E., Williams, R. H., and Bearden, W. O. 1980. “Correlates of Consumer Susceptibility to Coupons in New Grocery Product Introductions,” Journal of Advertising (9:3), pp. 31-46.
Van Heerde, H. J., Mela, C. F., and Manchanda, P. 2004. “The Dynamic Effect of Innovation on Market Structure,” Journal of Marketing Research (41:2), pp. 166-183.
Venkatesan, R., and Farris, P. W. 2012. “Measuring and Managing Returns from Retailer-Customized Coupon Campaigns,” Journal of Marketing (76:1), pp. 76-94.
Venkatesan, R., and Kumar, V. 2004. “A Customer Lifetime Value Framework for Customer Selection and Resource Allocation Strategy,” Journal of Marketing (68:4), pp. 106-125.
Venkatesan, R., Kumar, V., and Ravishanker, N. 2007. “Multichannel Shopping: Causes and Consequences,” Journal of Marketing (71:2), pp. 114-132.
Verhoef, P. C., Venkatesan, R., McAlister, L., Malthouse, E. C., Krafft, M., and Ganesan, S. 2010. “CRM in Data-Rich Multichannel Retailing Environments: A Review and Future Research Directions,” Journal of Interactive Marketing (24:2), pp. 121-137.
Walls, T. A., Jung, H., and Schwartz, J. E. 2006. Multilevel Models for Intensive Longitudinal Data, New York: Oxford University Press.
Wand, M. P. 2003. “Smoothing and Mixed Models,” Computational Statistics (18:2), pp. 223-249.
Wang, R., Saboo, A. R., and Grewal, R. 2015. “A Managerial Capital Perspective on Chief Marketing Officer Succession,” International Journal of Research in Marketing (32:2), pp. 164-178.
Wedel, M., and Kamakura, W. A. 2000. Market Segmentation: Conceptual and Methodological Foundations, Boston: Kluwer Publishing.
West, M., and Harrison, J. 1997. Bayesian Forecasting and Dynamic Models, New York: Springer.
West, M., Harrison, P. J., and Migon, H. S. 1985. “Dynamic Generalized Linear Models and Bayesian Forecasting,” Journal of the American Statistical Association (80:389), pp. 73-83.
Wiesel, T., Pauwels, K., and Arts, J. 2011. “Practice Prize Paper—Marketing’s Profit Impact: Quantifying Online and Off-Line Funnel Progression,” Marketing Science (30:4), pp. 604-611.
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data (2nd ed.), Cambridge, MA: The MIT Press.
Wu, H., and Zhang, J.-T. 2006. Nonparametric Regression Methods for Longitudinal Data Analysis: Mixed-Effects Modeling Approaches, Hoboken, NJ: John Wiley & Sons.
Xie, J., Song, X. M., Sirbu, M., and Wang, Q. 1997. “Kalman Filter Estimation of New Product Diffusion Models,” Journal of Marketing Research (34:3), pp. 378-393.
Zikopoulos, P., Parasuraman, K., Deutsch, T., Giles, J., and Corrigan, D. 2012. Harness the Power of Big Data, New York: McGraw Hill.

About the Authors

Alok R. Saboo is an assistant professor of Marketing and assistant director of the Center for Excellence in Brand & Customer Management, J. Mack Robinson College of Business, Georgia State University, Atlanta.

V. Kumar (VK) is the Regents’ Professor, Richard and Susan Lenny Distinguished Chair, and Professor of Marketing, and executive director, Center for Excellence in Brand & Customer Management, J. Mack Robinson College of Business, Georgia State University, Atlanta, GA; Chang Jiang Scholar, Huazhong University of Science and Technology, China; Senior Fellow, Indian School of Business, India; and Faculty Fellow, Texas A&M University Institute for Advanced Study, College Station, TX.

Insu Park is a doctoral student in Marketing at the Center for Excellence in Brand & Customer Management, J. Mack Robinson College of Business, Georgia State University, Atlanta.

Copyright of MIS Quarterly is the property of MIS Quarterly and its content may not be
copied or emailed to multiple sites or posted to a listserv without the copyright holder's
express written permission. However, users may print, download, or email articles for
individual use.
