
An Alternative Theoretical Explanation and Empirical Insights
into Over-ordering Behavior in Supply Chains

Tarikere T. Niranjan, Stephan M. Wagner, and Christoph Bode

Department of Management, Technology, and Economics, Swiss Federal Institute of Technology
Zurich, Scheuchzerstrasse 7, 8092 Zurich,
e-mail: ttniranjan@ethz.ch, stwagner@ethz.ch, cbode@ethz.ch

Cite as:
Niranjan T. T., Wagner, S. M., Bode, C. (2011): An Alternative Theoretical
Explanation and Empirical Insights into Overordering Behavior in Supply
Chains, Decision Sciences, Vol. 42, No. 4, November, pp. 859-888

Acknowledgement: The authors would like to thank B. A. Metri, Vijay Aggarwal, Nagananda
Kumar, and Srinagesh Gavirneni for their inputs during the initial stages of this research. The
authors are also grateful to Dinesh Sharma and his colleagues at the case study company for their
support. Finally, they would like to thank the editorial team for their help in greatly improving
the quality of this paper.

Corresponding author.

ABSTRACT
The beer game and the supply line underweighting (SLU) theory are central to our knowledge of
decision making in dynamic environments such as supply chains. The core of these theories is
that people are incapable of recognizing the pipeline inventory and this is the main cause of
overordering and dysfunctional behavior. This article identifies lacunae in the theoretical and
empirical foundations of extant literature and proposes an alternate explanation, a correction
model, explaining why overreactions occur. We adopt a multi-method research design,
comprising a field case study and laboratory experiments, to ground our findings.
Subject Areas: Dynamic Decision Making, Misperception of Feedback, Supply Chain
Management.

INTRODUCTION
Dynamic decision tasks are those in which decisions made today alter the state of the system and
thus affect information that decisions to be made tomorrow will be based on (Diehl & Sterman,
1995). The feedback loop connecting decisions and environmental cues embodies the dynamic
nature of the system. Experiments have shown that people learn poorly when they are delayed in
receiving feedback about the outcomes of their actions. These studies have been conducted and
applied to a variety of settings such as firefighting (Brehmer, 1995), inventory control (Sterman,
1989a, b; Davis & Kottemann, 1995), and management of software development teams
(Sengupta & Abdel-Hamid, 1993). This substantial body of knowledge rests on the supply line
underweighting (SLU) theory. At its core, SLU holds that people are incapable of recognizing
the delayed effects of their actions (Sterman, 1989a, b; Diehl & Sterman, 1995; Croson &
Donohue, 2002, 2005, 2006; Senge, 2006; Croson, Donohue, Katok, & Sterman, 2008). SLU is
generally considered an undesirable cognitive bias that causes several organizational problems.
SLU implies that inventory managers are incapable of keeping track of in-transit shipments
and as a result, they overorder. The results are delayed learning, poor performance, and system-
wide chaos (Sterman, 2000; Senge, 2006). SLU bias is articulated through the stock management
formulation model (Sterman, 1989a), which is the building block of system dynamics modeling
in supply chain management, inventory management, and capacity management. The stock
management formulation model also has various other applications such as human learning, and
the modeling of organizational-level dynamics (Sterman, 2000, 2006; Dogan, 2007). Sterman
(1989a) first proposed this model in the beer game, which simulates a serial supply chain game
with four players. Participants are asked to minimize the total supply chain cost within the rules
of the game; customer demand is exogenous, and stock-out costs and inventory costs are charged
to each player. Decision making is complicated by the fact that there are lead times for
information and physical flows. Today the beer game is one of the most popular business
simulation games. More detailed descriptions may be found at www.systemdynamics.org and in
Sterman (1989a). Sterman's model was later validated by several beer game based studies
(Sterman, 2000; Croson & Donohue, 2002, 2005, 2006; Sterman, 2006; Wu & Katok, 2006;
Croson et al., 2008).
Surprisingly, despite being developed back in 1989, SLU has only been studied in
laboratory settings and has never been tested in a field setting (to the best of our knowledge).
Exploring how SLU plays out in a real world supply chain setting is an interesting question in its
own right, given the salient position SLU occupies in literature. In addition to the lack of
empirical support, we believe that there are several reasonable arguments which cast serious
doubts on the robustness of SLU's theoretical basis. Evidence of SLU even within beer games is
not unequivocal; in fact, as we show in this paper, it is just as likely that the direction of effects is
opposite to that predicted by SLU. This exposes a need to reexamine the theoretical and
empirical support of SLU and seek out more robust explanations in order to understand and
improve managerial decision making. Our study responds to this need.
This study uses a multi-method approach (Boyer & Swink, 2008) involving a case study
and laboratory experiments to identify an alternative explanation that complements existing
theory. The behavior posited here is centered on interpersonal dynamics: frequently, customers
in the real world yell at unresponsive suppliers hoping that it will improve future deliveries, or at
least to vent their frustration. In some cases, however, free communication is not possible. For
example, in the beer game, participants are not allowed to speak to their supply chain partners.
Similarly, in the real world, a powerful supplier will not listen to the smaller buyer, so
communication may not be possible. In such cases, customers may inflate their orders
commensurate with the level of undersupply, hoping this correction will influence and
improve the supplier's behavior. This behavior, which we call the correction model, is developed
through a field case study and a lab experiment.
ANALYSIS OF LITERATURE
Overview
SLU bias was discovered by Sterman (1989a) through an experiment based on the beer game.
SLU was subsequently incorporated into the literature on organizational behavior and decision
making, and found applications in many settings outside inventory control. Sterman (1989a)
developed this theory through the stock management formulation, Equation (1), which is now the
workhorse of system dynamics modeling, and has applications in a wide array of business
domains (Sterman, 2000). In this formulation, $S_t$ and $SL_t$ represent the inventory and supply
line positions, respectively, at time $t$. The desired stock and desired supply line are denoted by
$S^*$ and $SL^*$, respectively, and are assumed to be constants chosen by the managers through the
anchoring and adjustment (A&A) heuristic (explained in the next section) at the commencement
of the task. Denoting $O_t$ as the order rate, $L_t$ as the expected loss rate (demand from downstream),
$\alpha_S$ and $\alpha_{SL}$ as the fractional adjustment rates for stock and supply line, and
$\beta = \alpha_{SL} / \alpha_S$ as the weighting ratio, the hypothesized decision rule is

$$O_t = \max\left[0,\; L_t + \alpha_S (S^* - S_t) + \alpha_{SL} (SL^* - SL_t)\right]. \tag{1}$$
The beer game simulation is then used to generate data and estimate the coefficients of this
model. A weighting ratio less than 1 implies a lower weight for the supply line than for inventory.
It is then argued that, because there is no rational justification for weighting the two variables
differently, this must constitute a cognitive bias, and it is considered evidence of SLU.
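As an illustration only, the decision rule in Equation (1) can be sketched in a few lines of Python. The function name `order_rate` and every numeric input below are hypothetical values chosen for this sketch; they are not parameters estimated by Sterman (1989a) or in the present study.

```python
def order_rate(expected_loss, stock, supply_line,
               desired_stock, desired_supply_line, alpha_s, beta):
    """Order rate O_t from Equation (1), using alpha_SL = beta * alpha_S."""
    alpha_sl = beta * alpha_s
    stock_adjustment = alpha_s * (desired_stock - stock)
    supply_line_adjustment = alpha_sl * (desired_supply_line - supply_line)
    # Orders cannot be negative, hence the floor at zero.
    return max(0.0, expected_loss + stock_adjustment + supply_line_adjustment)


# Hypothetical state: stock is below target while a large quantity
# is already on order (supply line above its desired level).
state = dict(expected_loss=8, stock=2, supply_line=30,
             desired_stock=12, desired_supply_line=16, alpha_s=0.5)

full_weight = order_rate(beta=1.0, **state)   # supply line fully counted
under_weight = order_rate(beta=0.2, **state)  # supply line underweighted
ignored = order_rate(beta=0.0, **state)       # supply line ignored
```

With these assumed inputs the order rises from 6 to roughly 11.6 to 13 as the weighting ratio falls from 1 to 0, which is the overordering pattern that SLU attributes to an underweighted supply line.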
Empirical Support for SLU
There is abundant evidence that people perform poorly in the beer game (Senge, 2006; Sterman,
2006; Gino & Pisano, 2008). SLU is thought to be a robust phenomenon based on the
consistently poor performance noticed in beer game studies (Sterman, 1989a; Sterman, 2000;
Croson & Donohue, 2002, 2005, 2006; Oliva & Gonçalves, 2005; Dogan & Sterman, 2006;
Senge, 2006; Sterman, 2006; Croson et al., 2008). Consider, for example, the decision heuristic
hypothesized by Croson and Donohue (2006). These authors relaxed Sterman's assumptions that
people anchor and adjust, and instead proposed a simpler formulation:

$$O_t = \max\left[0,\; \alpha_O + \alpha_I I_t + \alpha_R R_t + \alpha_S S_t + \alpha_N N_t + \varepsilon_t\right] \tag{2}$$
Orders $O_t$ are regressed against the independent variables: in-hand inventory $I_t$, orders
received $R_t$, incoming shipment $S_t$, and total outstanding orders $N_t$. According to this
formulation, if there were no SLU, then the decision-makers would be fully cognizant of each of
the four variables; therefore $\alpha_I = \alpha_N = \alpha_S = -1$. Croson and Donohue (2006) tested this
formulation using their beer game simulation data, reasoning that
[i]f participants are accurately accounting for the supply line, the coefficient of orders outstanding
should be the same as the coefficient on inventory. If Sterman's conjecture holds (i.e. participants
are underweighting the supply line), then we should find $\alpha_N > \alpha_I$.
(Croson & Donohue, 2006, p. 330)
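To make the test concrete, the following Python sketch estimates a regression of the form of Equation (2) for a single player and checks the sign pattern described in the quotation. It is a minimal illustration under stated assumptions: the data are synthetic, the generating coefficients (`a_0`, `a_I`, and so on) are invented for the sketch, and plain least squares is used in place of whatever estimator the original authors employed.

```python
import numpy as np

rng = np.random.default_rng(0)
weeks = 48  # hypothetical number of decision periods for one player

# Synthetic regressors (illustrative ranges, not beer game data).
inventory = rng.integers(-20, 40, size=weeks).astype(float)   # I_t
received = rng.integers(0, 20, size=weeks).astype(float)      # R_t
shipments = rng.integers(0, 20, size=weeks).astype(float)     # S_t
outstanding = rng.integers(0, 60, size=weeks).astype(float)   # N_t

# Assumed coefficients, used only to generate the synthetic orders.
a_0, a_I, a_R, a_S, a_N = 30.0, -0.25, 0.50, -0.10, -0.03
noise = rng.normal(0.0, 0.5, size=weeks)
orders = np.maximum(
    0.0,
    a_0 + a_I * inventory + a_R * received + a_S * shipments
    + a_N * outstanding + noise,
)

# Least squares on [1, I_t, R_t, S_t, N_t]. With these synthetic inputs the
# floor at zero does not bind, so an uncensored fit is adequate for the sketch.
X = np.column_stack([np.ones(weeks), inventory, received, shipments, outstanding])
coef, *_ = np.linalg.lstsq(X, orders, rcond=None)
est_0, est_I, est_R, est_S, est_N = coef

# The pattern taken as evidence of SLU: alpha_N greater than alpha_I.
slu_pattern = est_N > est_I
```

In real beer game data the floor at zero may bind, in which case a censored regression would be the more defensible choice; the sketch deliberately avoids that complication.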
Their analysis did in fact seem to support their hypothesis. Based on the average values
($\alpha_N = -0.0302$ and $\alpha_I = -0.2368$) it was concluded that SLU was supported (i.e.
$\alpha_N > \alpha_I$).
Croson and Donohue first estimated the individual-level parameters using regressions of
individual players' decisions and then averaged them over the complete sample as the basis for their
conclusions. Averaging the parameters can, however, mask the individual effects. A similar
example would be to conclude normalcy based on the combined average weight of a sample
comprising only obese and underweight people. A more appropriate approach would be to use
the individual-level results and model estimates, or to first aggregate the data, and then
investigate the model for group effects. Therefore we first re-examined Croson and Donohue's
(2006) results on the model estimates individually. This led to some interesting findings.
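The masking effect of averaging can be shown with a small Python sketch. The per-player coefficients below are hypothetical numbers invented for the illustration; they are not the Croson and Donohue (2006) estimates.

```python
from statistics import mean

# Hypothetical per-player estimates of alpha_N (illustrative only):
# six players subtract part of the supply line, four add to it.
alpha_n = [-0.40, -0.35, -0.30, -0.25, -0.20, -0.15,
           0.30, 0.35, 0.40, 0.45]

average = mean(alpha_n)  # close to zero despite large individual weights
share_negative = sum(a < 0 for a in alpha_n) / len(alpha_n)
share_positive = sum(a > 0 for a in alpha_n) / len(alpha_n)
largest_magnitude = max(abs(a) for a in alpha_n)
```

In this constructed sample the average is about -0.015, although no individual coefficient is smaller than 0.15 in absolute value and the players split 60/40 by sign. An average near zero is therefore compatible with two distinct groups behaving in opposite ways.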
The absolute value of $\alpha_I$ exceeded that of $\alpha_N$, as predicted by theory. It must be noted,
however, that for this dataset to support SLU, not only should $\alpha_I$ exceed $\alpha_N$ in absolute
value, but the sign of each parameter must also be within the bounds set by the SLU framework. For
about 60% of the participants, the weight for the supply line was within the hypothesized range of
$-1 < \alpha_N < 0$, which was consistent with Sterman's SLU framework. However, surprisingly, for the
remaining 40% of the participants the weight was $0 < \alpha_N < 1$ (i.e., these participants placed a
weight on the supply line whose sign was opposite to that hypothesized). This implied that the larger
the quantity on order, the larger the present order quantity. This behavior is quite at odds with the
hypothesized SLU behavior: rather than forgetting to subtract the supply line, the results indicated
that subjects were cognizant of the supply line, and were adding it to their net quantity. Moreover,
whereas the average value of $\alpha_N$ was found to be $-0.0302$ (as noted previously, this was
statistically significant), it did not seem to be substantively significant: $\alpha_N$ was very close
to 0. Even if one or two data points were removed, the results could swing the other way and indicate
a positive value instead of a negative one. As Combs remarked with concern about management research,
I see more and more studies in which correlations and standardized regression coefficients of 0.05
or less receive the prized label highly significant I wonder whether the corresponding
phenomenal statistical power might mask other shortcomings of our research designs and leave us
with (phenomenal power and) itty-bitty effect sizes that limit the relevance of our research.
(Combs, 2010, p. 9)
Effect sizes of less than 0.05 would mean that the field is plagued by false positives (Ioannidis,
2005). Therefore, the beer game evidence of SLU does not seem robust.
Metters (1997) averred that ignoring prior orders is highly implausible in real-life
inventory management. However, Sterman offers the following anecdotal evidence in support of
SLU:
You cook on an electric range. To get dinner going as soon as possible, you set the burner under
your pan to high. After a while you notice the pan is just hot enough, so you turn the heat down.
But the supply line of heat in the glowing coil continues to heat the pan even after the current is
cut, and your dinner is burned anyway.
You arrive late and tired to an unfamiliar hotel. You turn on the shower, but the water is freezing.
You turn up the hot water. Still cold. You turn the hot up some more. Ahhh. Just right. You step
in. A second later you jump out screaming, scalded by the now too-hot water. Cursing, you
realize that, once again, you've ignored the time delay for the hot water to heat the cold pipes and
get to your shower.
(Sterman, 2006, p. 47)
These examples are intuitively appealing, but it is not readily evident to what extent they
support the SLU model. To begin with, they do not resemble real-life inventory management
settings. Managers usually have refined information about their own operations and in general
make good decisions (Bowman, 1963; Bolton & Katok, 2008). In regular and repeated tasks such
as inventory management, managers have at least a passing familiarity with their environment,
know the range of demand to expect, use rules of thumb for ordering, and would not routinely
make egregious mistakes, just as most of us do not routinely burn dinner or scald ourselves in
bathtubs at home or even in unfamiliar hotels. As previous studies in field environments have
shown, experienced decision makers deal effectively with temporal dependencies; Joslyn and
Hunt's (1998) field experiment of police dispatchers' decisions regarding disposition of police
units, and Kanfer and Ackerman's (1989) study of air traffic controllers are only two examples.
As Gibson (2000, p. 144) notes, this contrasts with Sterman's (1989a) original observation that
decision makers are fundamentally unable to cope with feedback delays.
Moreover, in real-life inventory management, the supply line is almost always explicitly
recorded by means of enterprise resource planning (ERP) systems or other electronic or manual
recording systems. It therefore seems difficult to miss the supply line while computing current
orders. Sterman's anecdotal evidence may not be representative of such settings. Be that as it
may, barring these anecdotal examples and the beer game evidence, to the best of our
knowledge, there is no evidence from the field of real-life inventory management that supports
SLU. We now present analysis of the theoretical support of SLU.
Theoretical Support of SLU
Let us recapitulate the anchoring and adjustment (A&A) heuristic (Tversky & Kahneman, 1974).
This heuristic captures the bias wherein people anchor or overly rely on certain values, and then
adjust them to account for other elements of the situation. Usually, once the anchor is set, there is
a bias toward that value, as described by Tversky and Kahneman (1974). In this study, people
were asked to guess the percentage of African nations that are members of the United Nations.
Those asked whether the percentage was more, or less, than 10% guessed lower values (their
answer averaged 25%) than those who were asked if the percentage was more, or less, than 65%
(their answer averaged 45%) (Figure 1). This indicated that, firstly, people subconsciously
anchored their answer to the cue. This is the anchoring part of the heuristic. Secondly, supposing that
30% was the answer that the subject pool would have given on average in the absence of any
externally provided anchor, people then adjusted the anchored value toward their
independent answer. This meant that the adjustment was in the correct direction (i.e. toward their
independent belief of the answer) but its magnitude was insufficient (i.e. it stopped somewhere
mid-way between the anchor and the independent answer). This is the adjustment part of the
A&A heuristic.
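The arithmetic behind this reading of the example can be sketched as follows. The helper `adjustment_fraction` is a name introduced here for illustration, and the unanchored answer of 30% is the supposition made in the text rather than a reported result.

```python
def adjustment_fraction(anchor, answer, independent):
    """Share of the gap between the anchor and the independent answer
    that the reported answer actually closes (1.0 = full adjustment)."""
    return (answer - anchor) / (independent - anchor)


independent = 30  # assumed unanchored answer (%), as supposed in the text

low_anchor = adjustment_fraction(anchor=10, answer=25, independent=independent)
high_anchor = adjustment_fraction(anchor=65, answer=45, independent=independent)

# A fraction strictly between 0 and 1 means the adjustment went in the
# right direction but stopped short of the independent answer.
insufficient = all(0 < f < 1 for f in (low_anchor, high_anchor))
```

Under this supposition, subjects given the 10% anchor close 75% of the gap and subjects given the 65% anchor close about 57% of it, so both groups adjust in the correct direction but stop short.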

Insert Figure 1 here

Against this background, the stock management formulation may be analyzed. We find
that although A&A is the theoretical foundation of the stock management formulation, the
literature has been ambiguous on what exactly the anchor is. Sterman (1989a, p. 324; 2000, p.
670; 2006, p. 29) identifies the anchor explicitly, but inconsistencies remain in how it is
explained with reference to the supporting equations (Sterman, 1989a, p. 331):
$$IO_t = L_t + AS_t + ASL_t$$
$$AS_t = \alpha_S (S^* - S_t)$$
Here the anchor is the expected loss rate [emphasis added] $L$. Adjustments are then made
to correct discrepancies between the desired and actual stock (AS), and between the desired and
actual supply line (ASL) (Sterman, 1989a, p. 324).
How do subjects select the desired stock? ...subjects strongly anchor desired stocks on
their initial level [emphasis added] (Sterman, 1989a, p. 334).
Thus, at the heart of the theoretical support of the stock management formulation, it is not
readily evident what the theorized anchor is that was later statistically supported through
Equation (1): was it the expected loss rate (p. 324), or the initial stock level (p. 334)?
Let us now examine the main assumption behind the stock management formulation, that
desired stock $S^*$ is anchored to the initial stock at the commencement of the game and remains
at this level throughout the game. Note that in the beer game, demand varies spectacularly at
upstream echelons (orders in the hundreds and even thousands are common). Further, as the game
proceeds, one sees a series of high demands, followed by a series of zero demands, which is the
essence of the beer game (Senge, 2006). Sterman (1989a) characterizes this phenomenon as the
amplification, phase lag, and oscillation of orders as they propagate through the supply chain.
While Sterman suggests that players of the beer game will remain impervious to prevailing
demand when making ordering decisions (as indicated by their anchoring to a constant value), it
seems more reasonable to expect that players will periodically revise their desired stock ($S^*$)
based on prevailing demand (perhaps the most salient environmental cue for inventory
decisions), especially in cases where demand fluctuates wildly. In real life, most suppliers, when
they see customers repeatedly placing extremely high order volumes, would not stick to their low
initial desired stock. For example, in the beer game, for an initial demand of 4, a beginning
inventory of 12 (as in the classic board version of the game), and an initial desired stock ($S^*$) of
20, it seems reasonable that when the demand moves to the thousands, then the desired stock will be
revised substantially, likely to a value of several hundred, rather than remaining fixed at 20.
Specifically, experimental studies have established that inventory orders are anchored at
the mean of the incoming demand, which means that the incoming demand has a major influence
on the orders (Schweitzer & Cachon, 2000; Bolton & Katok, 2008; Gavirneni & Isen, 2010). If
anything, people may over-react to incoming signals (Bowman, 1963); this can also plausibly
influence their desired stock. Therefore the assumption that players remain impervious to
changes in demand seems unreasonable and the issue begs closer examination.
Sterman (1989a, p. 331) observes that [t]heory suggests the desired stock should be
chosen to minimize expected costs given the cost function and expected variability of deliveries
and incoming orders and that the desired supply line is in theory a variable. The regression
models reported in Sterman (1989a) assume, however, that the desired stock and desired supply
line are constant because complex calculations and revisions to these quantities during the course
of the game seem implausible on cognitive grounds: In the absence of a procedure to calculate
optimal inventory levels, however, one might expect the subjects' choice of $S^*$ to be anchored to
the initial level of 12 units (p. 331).
Although it seems plausible that decision makers might not carry out complex
computations, it seems implausible to us that they would not even update their desired stock to
reasonable levels. Instead, they might have set reasonable desired stocks intuitively or by some
implicit heuristics rather than by making complex calculations. We suspect this because in real
life managers would not likely have 12 as the desired stock when incoming customer orders are
in the thousands. For example, Croson et al. (2008) noted upstream orders in the tens of
thousands in the beer games even as the customer demand was a constant eight units per week. It
is quite unlikely that players would hold on to the initial desired stock level of 12 despite
incoming orders in the thousands. Therefore the selection of a constant $S^*$ in the stock
management formulation does not seem defensible.
We end this section with a final observation on the theoretical support of the stock
management formulation. It is believed that the A&A heuristic is the theoretical underpinning of
the formulation. From a careful examination of literature, however, A&A seems quite distinct
from and unconnected to the stock management formulation. In A&A, one typically estimates an
unknown quantity by beginning with (anchoring on) a known quantity, usually something
salient. One then adjusts the initial estimate provided by the anchor based on other factors.
The stock management formulation, however, seems to mix up the desired state (which in
system dynamics is derived from some locally rational heuristic or cognitive process) and the
anchor (which does not have any such justification in the A&A heuristic): in the A&A heuristic
people move away from the hypothesized anchor and toward the independent choice, which is
the subjects' best answer if they were not influenced by the experimental treatment. In the stock
management formulation, people always move away from the current state, and toward the
hypothesized anchor. A simplified version is depicted in Figure 2.


Therefore, the A&A heuristic, the main theoretical support, does not seem related to the SLU
formulation. It follows that SLU lacks both adequate empirical support and a theoretical basis,
and hence there is a need for a new theory to explain ordering behavior. The case study responds
to this need.
CASE STUDY
Company Background
The focal firm chosen was Phoenix (a pseudonym), located in one of the largest countries in
Asia. Phoenix began operations in 1985 as a small tier-1 supplier in collaboration with JTEKT
Corporation, Japan, and grew rapidly to become the domestic market leader for automotive
steering systems, with a market share of around 55% in 2006. It is a well-respected firm, and a
recipient of many prestigious national and international awards for its quality and management
practices.
Phoenix had implemented the Oracle ERP system in 2001, and some of its major
customers, primarily subsidiaries or joint-ventures with Japanese automotive OEMs Toyota and
Suzuki, operated on a just-in-time basis with real-time information sharing (100% visibility) of
production line status with their tier-1 suppliers.
Phoenix supplies more than 50 products (the exact number varies every few weeks due to
retirement of existing auto models, introduction of new ones, modifications, and customers
sourcing from other suppliers) to more than 20 domestic and foreign automotive OEM
customers. During 2006-07, which is the period we studied, the market was booming and was
relatively unaffected by the economic crisis in the West. There was severe pressure on capacity
in upstream echelons of the supply chain.
A centralized purchasing department with ten purchase executives handles procurement of
the approximately 1,500 different parts. The first stage of research involved socialization with the
buyers and familiarization with their buying practices. One of the authors was assigned workspace among the
buyers in the open plan workplace almost daily over a three-month period and intermittently over
a six-month period; this made it possible to listen to the buyers and to closely monitor their work.
Incoming information from customers includes long-term forecasts, medium-term
forecasts, and firm and tentative production plans for the coming week or fortnight with day-
wise details. The short-term information consists of final call-offs. The production planning and
control department (PPC) uses the medium-term information to generate the company-wide
month plan that details the day-wise production for each of the four plants. Using this month
plan, each plant develops its own plan that details the shift-wise production plan, subject to the
stock position, production capacity, manning levels, and other considerations. The buyers feed
this information into the ERP system which then calculates orders and forecasts for suppliers.
Ideally, this information ought to be sent to the supplier unaltered.
We carefully measured the information and physical flows between the OEMs, Phoenix
PPC, Phoenix purchasing department, and the tier-2 suppliers for a select few components, and
we noticed an interesting phenomenon. The undercurrent in this supply chain was that everyone,
with the exception of some of the OEMs, was overordering in a systematic manner.
Consider the case of Phoenix's largest OEM customer, formerly a partial subsidiary of
Suzuki. Phoenix was among Suzuki's preferred suppliers who were nurtured in its industrial hub
to enable just-in-time supply delivery. Suzuki constitutes over 40% of Phoenix's revenues. The
sales department of Phoenix received long term forecasts (three months to two years horizon)
from Suzuki at regular intervals. However, PPC also received copies of this information by
default. This was in addition to the full visibility PPC had over Suzuki's production plans in the
short term; PPC could view the production line status in real time electronically, and its location
a mere 30 km from Suzuki ensured high quality information sharing including informally
transmitted inputs.
Surprisingly, information transmission within Phoenix or at the interface between Phoenix and its
tier-2 suppliers was not smooth; there seemed to be significant information distortion. A direct
comparison of the information flows was not possible because Phoenix and Suzuki used different
information formats; neither the frequency nor the nature of information coincided. Also, the
term "ordering" is not as straightforward as a single number. Rather, information about
demand came at different time horizons. Suzuki issued a firm production plan for the upcoming
week, and tentative plans for the subsequent two weeks; this information was updated each week
on a rolling basis. Superseding this, the next day's production plan detailed hour-wise
requirements, and this was updated (but rarely modified substantially) each shift. The important
point is that each of these conveyed some information about the demand to be expected,
although, technically, only the final call-off was the true order. This order was, however, not
literally placed by Suzuki; rather, PPC viewed it electronically (they had access to the production
line status of Suzuki) and realized the true demand.
PPC consolidated information flowing in from various customers (in different formats) and
made firm production plans for the next month and tentative plans for the subsequent two
months. In addition, they would upload tentative requirements on the ERP system: a firm plan
for the upcoming week and a tentative plan for the subsequent fortnight. The buyers use this
information and transfer it in the same format to the order-generating module of the ERP, which
releases the firm demand (detailing the day-wise requirements) and tentative plans for the
subsequent fortnight. Buyers use this as a baseline and add corrections based on various day-to-
day exceptions that the ERP would not have considered. This is transmitted to tier-2 suppliers by
email. In addition, each day buyers would pass on the latest updates by phone to ensure that only
the required quantity arrived. Note that over 95% of Phoenix's suppliers were located within 100
km of its plant, and their goal was to make it 100% within 50 km soon, just as Suzuki had for its
tier-1 suppliers. All high-value items arrived in multiple batches per day, and Phoenix typically
held inventory sufficient to last from a couple of hours up to one day. A few low-value, low-
volume items such as fasteners and rubber items sourced from distant areas arrived in larger
batches once every few weeks. There was a high level of heterogeneity in the parts.
Despite the different information formats, we would expect the information to be
comparable at the aggregate level of, for example, a month across the supply chain. We carefully
recorded data on orders for a few high-value representative parts. Consider the axle assembly for
one of the most popular car models in the Suzuki portfolio. We carefully recorded the day-wise
data from different sources (the data sheets recorded by PPC in paper-based archives as well as
daily work sheets, the ERP system, and emails sent out by the buyer to tier-2 suppliers) and
aggregated the data over a month. At most levels, there was significant overordering. For
example, Suzuki had released tentative plans for the next fortnight totaling 17,359. The
approximately comparable value released by PPC was 17,690 and the corresponding value
passed by the buyer to the supplier was 19,392.
As for the day-wise information, the final call-offs by Suzuki totaled 17,280. This
compares favorably with actual production of 17,150 (which is the true demand, because all of
Suzuki's demand is met 100% of the time throughout the year). The difference of 130 is worth
about a quarter of a day's production, which could be explained by the inventory level variation
at Phoenix. However the sum of the day-level planned production figures passed by PPC to
buyers was 19,764, which clearly points to overordering. There was no corresponding day-level
information passed from buyers to tier-2 suppliers in written form. This information is conveyed,
as noted earlier, by telephone, and had a propensity for even more exaggeration. We could not
record this because the buyer was handling approximately 50 suppliers and made several dozen
calls each day, and watching him constantly for a month would be highly intrusive and difficult.
Instead, we observed the buyer for several hours each day to form our opinions. We also
collected quantitative data more or less completely for six items and discussed the findings with
several managers who admitted that these findings would qualitatively apply to the whole firm,
and probably the entire industry among tier-1 suppliers and further upstream.
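The degree of inflation at each hand-off follows directly from the totals reported above for the axle assembly. The Python sketch below uses only those reported figures; the helper name `inflation_pct` is introduced here for illustration.

```python
def inflation_pct(value, baseline):
    """Percentage by which `value` exceeds `baseline`."""
    return 100.0 * (value - baseline) / baseline


# Tentative plans for the next fortnight (units), as reported in the text.
suzuki_plan, ppc_plan, buyer_plan = 17_359, 17_690, 19_392

# Day-wise figures aggregated over the month (units), as reported in the text.
call_offs, actual_production, ppc_daily_total = 17_280, 17_150, 19_764

ppc_vs_suzuki = inflation_pct(ppc_plan, suzuki_plan)          # PPC over Suzuki
buyer_vs_ppc = inflation_pct(buyer_plan, ppc_plan)            # buyer over PPC
buyer_vs_suzuki = inflation_pct(buyer_plan, suzuki_plan)      # cumulative
ppc_daily_vs_call_offs = inflation_pct(ppc_daily_total, call_offs)
call_off_gap = call_offs - actual_production                  # units
```

On these figures PPC's tentative plan exceeds Suzuki's by about 1.9%, the buyer's figure exceeds PPC's by about 9.6% (about 11.7% over Suzuki's plan), and the day-level production figures passed by PPC exceed the final call-offs by about 14.4%, whereas the call-offs themselves differ from actual production by only 130 units.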
The conclusion is that despite having a leading ERP system in place for over six years,
decision-makers were introducing information distortion not only between firms, but also within
the firm. There are two main reasons a more systematic analysis of the underlying behavioral
issues was not possible:
(i) The high frequency of and manner in which information was exchanged made it
impossible to record final orders. The final and binding orders from Phoenix to
the tier-2 suppliers were conveyed telephonically; all prior information was
simply estimates. As a result the binding orders could not be recorded.
(ii) There were vastly different information structures across different buyers. For
example, Toyota was among the best-practice customers, but it followed a
different information regime than Suzuki and collected its shipments from
Phoenix following its own milk-run schedule.
There was one buyer, however, whose behavior could possibly be studied closely, whose
ordering process was completely independent of the ERP system. This constitutes the more
focused part of the case study. The unit of analysis was the dyad comprising a buyer at Phoenix
(Buyer), and a tier-2 supplier of raw castings (Supplier). Buyer was wholly and solely
responsible for purchasing raw castings for 12 models of axle assemblies produced by Phoenix.
These are the biggest and most expensive of the purchased components, with annual purchase
values of $3,000,000. This was a high-priority area, and whereas all other buyers ordered
through the Oracle ERP system, Buyer operated outside due to the complexities of the task.


Figure 3 depicts the supply chain with the lead times. Supplier was located about 16 hours
away by road, or 12 hours by rail. Shipments are received at Phoenix's warehouse, which serves
as a preliminary inspection and transit point for castings (in addition to holding the bulk of the
inventory, especially of the smaller, inexpensive parts from remote suppliers which are supplied
in bulk and infrequently). Whereas the month-level usage of castings was usually fairly
predictable, intra-month variations and departures from scheduled production were significant.
These fluctuations and discrepancies were caused mainly by intra-week fluctuations in OEM
demand and uncertainties in production. Maintaining some stock was therefore necessary (the
warehouse did not hold this inventory).
The raw castings were machined at the five machining suppliers who were located within
20 km of the Phoenix plant and also served as stocking points for the raw castings. Typical
stocking levels were 3-4 days of stock for items with heavy and predictable usage, and up to 10
days for others, although Phoenix would have preferred to hold more.


As shown in the timeline depicted in Figure 4, the primary information on expected
production at Phoenix was the monthly Firm Plan released by PPC on the 25th of each month.
Buyer processes this information, and around the 28th of each month, emails orders of each
casting for the next month. Based on these orders, Supplier dispatches consignments of a mix of
castings by truck. Each shipment typically consists of 4-5 types of castings and about 15-20
shipments are sent per month. Buyer does not know the exact mix of castings until the truck is
loaded and ready to depart from Supplier, at which time he is given these details. Supplier had
consistently fallen short of the demanded quantity over the month, reducing the stock Phoenix
held and frequently forcing the Buyer to make expedited orders by telephone and to procure
critical castings by premium freight.
The crucial decision for Buyer was to determine the total tonnage of the orders to be placed
each month, and to then allocate it among the 12 castings, considering the product-specific
information Buyer was privy to.


Figure 5 shows the relevant data collected for each of the 12 castings during the two-year
period between January 2006 and December 2007. There are 8,760 (12 castings × 365 days × 2 years) data
points for each of the following variables: Orders to Supplier, monthly PPC Firm Plan at
Phoenix, and Receipt of shipments from Supplier. Note that although the decisions were at item-
level, the key decision is the total weight of castings ordered once a month. Further, its
periodicity is monthly. Therefore, we aggregated these data suitably to month level and total
tonnage of all castings. Although the production plans and sales are in numbers ordered, the raw
castings were priced and ordered by weight. Such aggregation would mask the product-specific
sources of variation that would have obfuscated the underlying ordering behavior that is of
interest to this study.
The data collection period was chosen after careful consideration to minimize extraneous
sources of variation. As with most large firms in this industry, the supply chain was arborescent
(i.e. parts being procured independently by more than one buyer for use in more than one
assembly line) and the customer base and the car models changed constantly. In this setting it
was nearly impossible to find a reasonably long period during which a single buyer was solely
responsible for the same product portfolio, buying from the same unique supplier (for smaller
and more generic components, business was frequently shifted across multiple competing tier-2
suppliers). All but one buyer had been in the firm for less than two years. Possible individual
differences among the buyers necessitated keeping the buyer as a constant to maintain continuity
for the duration of the study.
Fortunately, Buyer fulfilled all of these requirements; Buyer had been in the firm since
2005 and handled the same role for the same product portfolio, and thus provided a well-
controlled case to study. The year 2005 could be treated as the training and settling-in period,
and the period 2006-2007, which fulfilled the other conditions, was a reasonable period for
analysis. Phoenix planned to source from an additional casting supplier from 2008 onwards, at
which point decision making would no longer be a unique buyer-supplier relationship, meaning
that demand allocation between the two suppliers would obfuscate the behavioral patterns of
interest. Moreover, additional products were planned to be manufactured in the future, which
would also reduce the control of the research. For these reasons we chose the period 2006-2007.
Analysis
We consolidated the item-level data (Figure 5). The key variables are Orders placed by Buyer,
PPC Firm Plan, and Receipt of raw castings by Phoenix. One can see from Figure 5 that Orders
exceeded PPC Firm Plan almost consistently, and cumulatively, by 9%. This corroborated
Buyer's initial admission that he overordered. He was further aware that in turn, Supplier
discounted his order, thereby supplying less than the orders (3,096 metric tons of supplies against
total orders of 4,105 metric tons, as shown in Figure 5). Moreover, PPC Firm Plan and Orders
did not match perfectly. This is not surprising, given that Buyer was considering several other
factors while ordering, such as current stock position, shortfall from previous months, and other
factors, that necessitated a departure from PPC Firm Plan.
In order to estimate the degree of overordering, we first need to operationalize it. Unlike in
the beer game, here the customer's knowledge of the true demand keeps evolving over time. For
example, Buyer bases his estimate of true demand on the planned production as well as on the 3-
month forecast issued by PPC. However, this differs from the day-wise production plan and
actual production, sometimes for reasons beyond the control of PPC, and often because PPC
releases plans and forecasts that Buyer knows from experience or through informal channels of
information flow, are optimistic. Therefore, overordering is not a straightforward term. Even if
the customer demand is known with certainty, the quantity demanded from suppliers can vary
due to schedule instabilities or the lack of other parts and other process uncertainties that can halt
the line and alter the final demand for hundreds of other components (especially in a low-
inventory environment such as this). Therefore, while it was straightforward to record orders in
the case study, it was trickier to determine the actual demand that Buyer faced.
Phoenix's production coincided with the true OEM demand over a reasonably long time
horizon of a day or two: there was neither overproduction, nor underproduction. There were
several instances of delays by a few hours due to non-availability of parts or overproduction to
save on set-up costs, but at the day-level these variations were absorbed and OEM demand was
met 100% on all days. Individual castings for which shortage was imminent were procured
through expedited orders made over telephone and transported by premium freight (by train at
four times the normal cost). Therefore, at the aggregate level, Receipt is a close proxy of the
usage or true demand at Phoenix. We now define the degree of overordering as the difference
between Orders (X) and Receipt (Y) (i.e. X minus Y) if X is greater than Y, and we define
undersupply as the difference between Receipt (Y) and Orders (X) (i.e. Y minus X), if Y is greater
than X. This definition of overordering calculates the difference between the observed decision
(orders) by the buyer, without any indication of actual demand at the plant, and the receipts. The
degree of undersupply was computed in both metric tons and percentage and is depicted in
Figure 6. It is clear that there is a spike of overordering (undersupplying) in the middle of the
period under investigation.
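The definition above can be illustrated with a minimal Python sketch. The function names and the choice of Orders as the percentage base are assumptions made for illustration (the text does not state the base); the only figures taken from the case are the two-year totals of 4,105 metric tons ordered and 3,096 metric tons received.

```python
def overordering(orders: float, receipts: float) -> float:
    """Degree of overordering: X - Y when Orders (X) exceed Receipt (Y), else 0."""
    return max(orders - receipts, 0.0)


def overordering_pct(orders: float, receipts: float) -> float:
    """Overordering as a percentage of Orders (the base is an assumption)."""
    if orders <= 0:
        raise ValueError("orders must be positive to compute a percentage")
    return 100.0 * overordering(orders, receipts) / orders


# Two-year totals reported in the case study (metric tons).
total_orders, total_receipts = 4105.0, 3096.0
gap_tons = overordering(total_orders, total_receipts)
gap_pct = overordering_pct(total_orders, total_receipts)
print(f"{gap_tons:.0f} t, {gap_pct:.1f}% of orders")
```

Applied month by month to the aggregated series, the same two functions would yield the kind of tonnage and percentage curves that Figure 6 is described as showing.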


We discerned a similar pattern amid the graphs presented in Figure 5. It appeared that even
as the order reached a maximum, the corresponding delivery was at its minimum (January 2007),
suggesting a curious inverse relation between orders and supplies! This pattern persisted across
most of the 12 castings when considered individually. Moreover, the orders seemed to be serially
correlated even when demand was not (i.e. in some months, although demand rose and fell
orders were stepped up successively without showing a corresponding drop).
It is clear that overordering was occurring. What caused it? We began our investigation by
focusing on SLU, with the question: would SLU exist in this set-up, and if so, how? The typical
buyer handled 150 part numbers from about 20 different suppliers. In addition to watching the
Phoenix production line status closely, buyers were also responsible for expediting many of the
parts dependent on production line variations; this increased their workload considerably. The
warehouse released the company-wide daily stock report in an Excel spreadsheet at the
beginning of each day. This list included over 1,500 part numbers, and was therefore too
unwieldy to be handled by individual buyers. Therefore, the buyers maintained their own
spreadsheets separately, which were hyperlinked to the daily stock report, to elicit information
on the stock position of only their respective parts. Only the respective buyer was privy to
additional details pertaining to parts under his charge. This included information on the quantity
of parts in various stages of the supply chain. Completed parts were stored at nearby third-party
supplier warehouses, the Phoenix warehouses, or were in transit (with critical items in express
freight). Other parts were works-in-progress in the Phoenix production line. The buyer updated
these data manually for his own use as and when they became available.
Each Wednesday the detailed company-wide production plan was released and the ERP
system generated the outgoing orders for suppliers late in the night. Buyers used this information
the next day to finalize the weekly release of firm orders. This being their key decision for the
week, they considered additional subjective factors before finalizing the orders. We observed and
participated with some buyers during this process on several Thursdays and they were found to
keep most of the data columns hidden most of the time, opening them only during data entry.
Although individual buyers customized their own spreadsheets, each sheet had a column that
totaled the net stock. So unless one specifically wanted to underweight the supply line, in which
case one would have to deliberately unhide the columns to be able to separate out the in-transit
and in-hand stock, it was almost impossible to weight the material in-hand and material in-transit
differently. SLU did not, therefore, seem to be a possibility with buyers in this company.


In particular, Buyer seemed to have even less possibility of underweighting the supply line.
Unlike other buyers, the parts under his charge, raw castings, were not stored in the Phoenix
warehouse. Instead, the shipment was split upon receipt and different castings were routed to the
five machining suppliers. As soon as Buyer was informed that the truck carrying the shipments
had departed from Supplier's yard, he updated his records and recorded the quantity "as good as
in hand," as shown in the Figure 7 column "Total backlog incl. backlog." Buyer could not
distinguish between the material in-transit and stock in hand while computing his order
quantities unless he specifically decided to and took steps to do so. SLU as a cognitive bias was
therefore virtually impossible. What else could explain the overordering here?
With this question as a guide, we probed Buyer further, with one of the authors
participating in the decision-making process for the last three months of 2007 to comprehend the
thought processes accompanying the decisions. The salient part of the resulting insights is
explained here through a simplified numerical illustration. Suppose the quantity ordered and
quantity received during each of the last ten months were 100 and 90 respectively. Note that this
makes no allusion to the true demand; one hundred was simply the quantity ordered each month.
Suppose the total undersupply (orders outstanding) is 100 over these ten months. How would the
decision for month 11 be made? Suppose true demand for month 11 is 200. True demand is the
desired inflow considering several factors such as the PPC production plan, replenishment of
stock, and correction for quality rejects. Buyer would attempt to game Supplier by ordering in
excess of this true requirement, but he would not add a correction factor equal to the entire
cumulative backorder of 100. Rather, he would add only a certain fraction, say one third, of the
recent accumulated orders outstanding, for perhaps the last three months. We denote this the
correction factor $\alpha_C$. In this case the correction would be $\alpha_C \times$ orders outstanding from the last
3 months $= \tfrac{1}{3} \times 30 = 10$. The corrected value of the order would then be 200 + 10 = 210. The
correction factor ($\alpha_C$) acknowledges that Buyer (i) considers only the recent past while making
the current month's ordering decision, and (ii) weights the recent past less than 1. In effect, if
there are orders in the distant past (perhaps over three months ago) that are not yet satisfied by
the supplier, the buyer assumes that these orders will never be shipped. We call this the
correction mechanism; the parameter $\alpha_C$ determines the extent of correction.
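The numerical illustration can be restated as a short Python sketch. The function name, argument names, and the default three-month window are assumptions chosen to match the example in the text; they are not a specification of how Buyer actually computed his orders.

```python
from typing import Sequence


def corrected_order(true_demand: float,
                    outstanding_by_month: Sequence[float],
                    correction_factor: float = 1.0 / 3.0,
                    window: int = 3) -> float:
    """True demand plus a fraction of the orders left outstanding in recent months.

    Outstanding orders older than `window` months are ignored, mirroring the
    buyer's assumption that they will never be shipped.
    """
    recent = outstanding_by_month[-window:] if window > 0 else []
    return true_demand + correction_factor * sum(recent)


# Ten months of ordering 100 and receiving 90 leave 10 units outstanding per month.
outstanding = [100.0 - 90.0] * 10
order_month_11 = corrected_order(true_demand=200.0, outstanding_by_month=outstanding)
print(round(order_month_11, 6))
```

With the defaults, only the last three months (30 units) enter the correction, so the order for month 11 is the true demand of 200 plus one third of 30.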
Note how this mechanism contrasts with SLU although ultimately both result in
overordering. The difference arises from how they treat and weight the supply line. In SLU, the
decision-maker forgets or fails to completely account for the on-order quantity and weights the
order quantity by a factor $\beta$ such that $0 \le \beta \le 1$, where a value of zero indicates that the supply
line is ignored completely (Sterman, 1989a). This approach assumes that backorders continue in
perpetuity and that a rational buyer ought to expect them to be eventually filled completely. As
an illustration, if there have been 10 units backordered during each of the previous 10 months,
and if the expected consumption in the next month is 200 units, then
\[
\begin{aligned}
\text{ordered quantity} &= \text{expected consumption} - \beta \times \text{accumulated backorders} \\
&= 200 - 0.0 \times 100 = 200 \quad \text{if } \beta = 0 \text{ (ignoring the SL)} \\
&= 200 - 0.5 \times 100 = 150 \quad \text{if } \beta = 0.5 \text{ (SLU)} \\
&= 200 - 1.0 \times 100 = 100 \quad \text{if } \beta = 1 \text{ (perfectly rational)}
\end{aligned}
\]
According to SLU logic, for non-zero values of $\beta$, as the number of parts on-order
increases, the current order will decrease. In contrast, in the correction model, for non-zero
values of $\alpha_C$, as the undersupply (parts on-order) increases, the current order increases. In the
SLU regime, implicitly, from the supplier's point of view $\beta$ continues to be effectively 1 as he
attempts to fill the entire accumulated backlog. Eventually he would dispatch $200 + 1 \times 100 = 300$.
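The opposite directions of the two rules can be made concrete with a small Python sketch. Both functions are simplified stand-ins written for this comparison; the weight of 0.5 used for the correction rule is a hypothetical value, and the same outstanding quantity is fed to both rules only to make the contrast visible.

```python
def slu_order(expected_consumption: float, backlog: float, weight: float) -> float:
    """SLU-style rule: subtract a weighted share of the accumulated backlog."""
    return expected_consumption - weight * backlog


def correction_order(expected_consumption: float, recent_outstanding: float,
                     correction_factor: float) -> float:
    """Correction-style rule: add a weighted share of recent outstanding orders."""
    return expected_consumption + correction_factor * recent_outstanding


# As the outstanding quantity grows, the SLU order falls and the correction order rises.
for outstanding in (0.0, 50.0, 100.0):
    print(outstanding,
          slu_order(200.0, outstanding, weight=0.5),
          correction_order(200.0, outstanding, correction_factor=0.5))
```

The first function reproduces the three cases in the illustration (200, 150, and 100 for weights of 0, 0.5, and 1 on a backlog of 100), while the second moves the order upward for any positive correction factor.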
In contrast, in the correction model, not only does the supplier not consider the whole
backlog, but he also does not believe the orders completely. While the supplier may initially
strive to increase the shipment rate to make up for the previous undersupply, he soon begins to
behave in the opposite manner. Key to this effect, at the aggregate level, is that in our case
setting Supplier senses that despite continuing to supply 90 units, the Phoenix line never really
comes to a standstill for want of castings. This means that 90 rather than 100 ought to have been
the true demand at Phoenix in all these months, and Buyer had been overordering all along. Over
time, Supplier also learns of this behavior through his informants at the machining suppliers who
sense or guess what must have been the approximate true production at Phoenix.
Because of this realization, Supplier begins to discount future orders by a factor that is an
increasing function of Buyer's exaggeration. The direction of this effect is that as the size of the
orders increases (including orders in excess of true demand), the amount supplied decreases.
This is evident in Figure 5 where the orders (Orders) and supplies (Receipt) move away from
each other.
In this case, overordering was a defensive mechanism against supplier short-supplying,
much like stocking to guard against coordination risk (Croson et al., 2008) wherein players
deviate from equilibrium and build inventory (knowing it to be suboptimal to do so) to protect
themselves against the perceived risk that others will not behave optimally. However, Croson et
al.'s (2008) study does not address how the coordination stock would be computed by the
decision-makers, and more importantly, why it is assumed to be a constant; intuitively one would
expect coordination stock to be revised dynamically and not remain static.
In the proposed correction model, however, the stock correction varies dynamically and
reflects the recent strategic behavior between Buyer and Supplier. This behavior is conceptually
similar to the strategic behavior of buyers and suppliers in forecast sharing: buyers often over-
forecast and suppliers discount these figures based on experience (Terwiesch, Ren, Ho, &
Cohen, 2005). Had it been a one-period model of a one-off transaction, the behavior could be
explained by the classic prisoner's dilemma game, in which the non-Pareto equilibrium would be
that the buyer would not cooperate (i.e. would over-forecast) and the supplier would also not
cooperate (i.e. would discount the forecasts and not build enough capacity). This can be
explained as a rational action or can be viewed from the behavioral lens that would consider trust
and reputation issues. The economics literature, however, predicts that repeated games can lead
to more cooperative outcomes. In repeated games (the ordering in the Phoenix case and beer
game share this feature) parties are likely to adopt a tit-for-tat strategy: they cooperate (the buyer
forecasts orders correctly on average and the supplier reacts to the forecast order) as long as the
other party does the same and retaliate (the buyer overforecasts and the supplier ignores forecast
orders) upon the other party's defection (Axelrod, 1981; Kreps, Milgrom, Roberts, & Wilson,
1982; Terwiesch et al., 2005). The key point is that recent behavior is considered while arriving
at the current decision.
The data fall into place with the proposed correction model nearly perfectly for January
through December 2006. However, a discontinuity at the 12th and 13th months of analysis begs
explanation. The reason for this is that in December 2006 the orders reached a maximum which
coincided with the lowest supplied quantity in the 24 months. In other words, a situation arose
such that the more the buyer ordered, the less they received! It did not help matters that the
period also coincided with a generally buoyant market and hence increased pressure on capacity
for both Buyer and Supplier.
At this point, both parties had realized the impasse and held a face-to-face negotiation
meeting at the Phoenix site. A detailed description of this meeting is impossible because it took
place before the present research began. We only know unambiguously that this meeting was a
significant event at the company; Supplier (the Chief Operating Officer who also owned a
substantial share in the company) visited only about once a year whereas most face-to-face
meetings for smaller issues would take place through Buyer visiting the supplier's location a few
times a year. In the January 2007 meeting, however, larger teams from both sides met,
underscoring the significance of this meeting to both sides. It was mutually agreed there, and
subsequently implemented, that both parties would henceforth control their respective double-
guessing and order responsibly.
The learning literature contains some theoretical parallels to the main behavior identified in
the case study (i.e. that the buyer ordered more when there were more previous orders), but they
are opposite in direction to SLU. Prior research views the inventory decisions as discrete
instances of judgment or choice rather than adopting a continuous perspective. For example, the
variants of decision heuristics (Croson & Donohue, 2006; Sterman, 2006) model each decision
as a function of the value of a variable in a single period. Today's orders are a function of only
today's inventory, shipment receipt, downstream orders, and today's supply line. In contrast, in
the continuous perspective, judgment is an ongoing process; the reasoning behind current
decisions often also incorporates knowledge of the past actions and reactions: a given error in
judgment need not have dire consequences, since there may be subsequent opportunities to catch
and correct the error using feedback from the choice outcomes (Hogarth, 1981; Kleinmuntz &
Thomas, 1987, p. 342). In such dynamic decision making environments a player can make a
decision, see if it results in a reasonably quick desired outcome, and modify his subsequent
decisions (orders) suitably to move the outcome closer to the desired state. This can be an
alternative explanation to SLU, of the overordering behavior observed in beer games. Since the
subjects in the beer game are prevented from communicating with their suppliers regarding the
alternating crises of shortages and excess inventory that arise in their game, they might plausibly
react by reinforcing their previous orders until the supplies match the desired rate. This would be
akin to yelling at the supplier for poor delivery performance. In other words, if the most recent
orders have been high, and the supplies do not reflect the same, then the decision maker would
place an even higher order in the next period because he cannot really yell or even talk to his
game team members.
The simple recurrent network (SRN) learning model (Cleeremans & McClelland, 1991;
Gibson, 2000) provides an interesting theoretical lens through which to view this kind of
behavior. SRN assumes that people learn by iteratively minimizing the difference between their
actual decision and feedback from the environment. In other words, in dynamic decision
environments where the actions of the decision maker are one of the factors impacting the
system, during each period the decision maker considers not only the environmental cues
relevant to the task but also corrects it by a factor based on his previous experience of the
sensitivity of the system to his decisions. Thus, the internal categorization or the mental model
of the decision maker evolves over time, reflecting the latest learning. Expressed mathematically,
\[
\text{Internal categorization}(t) = f\bigl[\text{Environmental cues}(t),\ \text{Internal categorization}(t-1)\bigr]
\]

where, in each period, the internal categorization that results in the decision for the period
is a function (f) of the current environmental cues and the internal categorization developed in
the previous period.
The foregoing case analysis leads to the proposition: As Buyer notices that his previous
decision does not result in the expected outcome, his propensity to overorder increases. In
accordance with this behavior, we model the current order as a function of the currently available
information (I, R, S just as in Equation (2)) corrected by a factor that is a function of the previous
order.
The correction factor captures a behavior that is not just an alternative to SLU, but in fact
acts in the opposite direction to SLU. The correction factor reflects the real world phenomenon
of customers correcting for the supplier's unreliable supplies by (i) speaking to the supplier and
exerting pressure or (ii) over-ordering, when such communication options are absent because of
power relations between the two or other practical reasons. The latter is relevant to
the present case study and beer game. In such situations the buyer responds by reinforcing the
previous order and over-ordering in lieu of pressuring the suppliers. The effect is that when there
are more previous orders in the pipeline, the current order will be larger (and vice versa); this is
opposite of the direction predicted by SLU. To make this proposition empirically testable, we
articulate it mathematically as

\[
O_t = \max\bigl[0,\ \alpha_0 + \alpha_C O_{t-1} + \alpha_I I_t + \alpha_R R_t + \alpha_S S_t\bigr] \tag{3}
\]
The restriction to non-negative orders follows the reasoning behind Equations (1) and (2).
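As a reading aid, the decision rule in Equation (3) can be sketched in Python. The function name, argument names, and every coefficient value below are hypothetical placeholders chosen only to show the mechanics (the floor at zero and the positive dependence on the previous order); they are not estimates from this study.

```python
def order_equation_3(prev_order: float, inventory: float,
                     incoming_orders: float, shipments: float,
                     intercept: float, c_prev: float, c_inv: float,
                     c_in: float, c_ship: float) -> float:
    """Current order as a linear function of the cues, floored at zero."""
    raw = (intercept
           + c_prev * prev_order      # correction term on the previous order
           + c_inv * inventory        # inventory cue (I)
           + c_in * incoming_orders   # incoming orders cue (R)
           + c_ship * shipments)      # shipments received cue (S)
    return max(0.0, raw)


# Hypothetical coefficients, for illustration only.
COEFFS = dict(intercept=2.0, c_prev=0.5, c_inv=-0.2, c_in=0.3, c_ship=-0.1)

low = order_equation_3(prev_order=4.0, inventory=5.0,
                       incoming_orders=4.0, shipments=4.0, **COEFFS)
high = order_equation_3(prev_order=12.0, inventory=5.0,
                        incoming_orders=4.0, shipments=4.0, **COEFFS)
floored = order_equation_3(prev_order=0.0, inventory=40.0,
                           incoming_orders=0.0, shipments=0.0, **COEFFS)
print(low, high, floored)
```

Holding the other cues fixed, a larger previous order yields a larger current order whenever the coefficient on the previous order is positive, which is the direction the correction proposition predicts.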
EXPERIMENTAL VALIDATION
Equation (3) was estimated using the usual procedures followed in prior literature. The empirical
basis consists of beer game data obtained from the MBA Class of 2008 operations management
course at a leading business school where one of the authors was affiliated. The average age of
the class was 23 years; two-thirds of the class had less than 1 year of work experience. They
possessed very high analytical and quantitative skills; the students were in the top 2% of the
country-wide common admission test. Females constituted one third of the class. We used
standard features of the beer game (Croson & Donohue, 2006) such as a four-level supply chain,
two periods of physical and information lead times, a 2:1 ratio for inventory to backorder costs,
40 time periods of the game, a uniform distribution of customer demand [0,8] known to the
participants, and team cost minimization as the objective. The incentive was a credit of up to
10% in the course, proportional to the relative performance in class. In all, the results from 22
teams (88 participants) were used for further analysis, after dropping two teams that had failed to
record data completely or failed to complete the game. The participants were told that the length
of the game was uncertain, but that it would most likely be around 60 rounds.
In the beer game, the first few rounds may not accurately represent the behavior of the
supply chain decision makers because of the initial conditions (e.g., the orders and shipments
that were in the pipeline at the start of the game). Therefore, in a departure from the traditional
approach, we excluded from our analysis the data pertaining to the first six time periods (the
results remain practically unchanged, however, even when we include these periods).
Traditionally, in almost all prior studies based on beer game experiments, all four roles are
considered on equal terms for analysis. However, whereas our proposed behavior, "yelling at the
supplier," holds for the retailer, wholesaler, and distributor, it ought not to hold for the factory
because the factory orders from itself. Therefore, we hypothesize that Equation (3) holds for the
retailers, wholesalers, and distributors (i.e. $\alpha_C$ will be significant for the downstream roles as
treatment group), but will not hold for the factory (i.e. $\alpha_C$ will not be significant for the control
group). The beer game provides an opportunity to validate (or refute) this hypothesis under a
tight experimental setup: players are randomly assigned to the four roles and the treatment is that
the downstream roles can blame or game their suppliers and try to yell at them for tardy
responses, whereas the factory has no one else to blame or correct, and so acts as the control
group.
Note that previous studies supporting SLU used an econometric approach to validate SLU
by estimating the hypothesized equation parameters (Sterman, 1989a; Croson & Donohue, 2006;
Croson et al., 2008), whereas our study tests the behavior through a true experimental design
with a sharply demarcated control group (factory) and treatment group (downstream roles).
The results follow. We first tested the direction of the main effects through correlation
analysis. The prediction following SLU theory is that orders $O_t$ would be uncorrelated with the
supply line N (because people ignore the supply line!), or at best, weakly negatively correlated
with it (because some people might consider the supply line and subtract it from their intended
decision, hence the negative sign, but they underweight it, hence the weak correlation). In
contrast, our prediction is that both $O_{t-1}$ (which forms a part of N) and N ought to be positively
correlated with $O_t$. Indeed, the results, shown in Table 1, support the prediction: the correlations
are 0.77 and 0.58, both positive and highly significant (p < 0.01).
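The direction test can be sketched in a few lines of Python. The order series below is made up purely for illustration and is not data from the experiment; the helper names are likewise assumptions. The sketch only shows how the correlation between current and previous orders would be computed for one player.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def lagged_order_correlation(orders: Sequence[float]) -> float:
    """Correlation between each order and the order placed one period earlier."""
    return pearson(orders[1:], orders[:-1])


# Hypothetical order series for one player (not experimental data).
orders = [4, 4, 6, 9, 12, 14, 13, 10, 8, 5]
print(round(lagged_order_correlation(orders), 2))
```

A positive value is what the correction model predicts, whereas SLU theory would predict a value near zero or weakly negative when the supply line is used in place of the lagged order.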

Insert Table 1 here

Further analysis was carried out using Tobit regression (Greene, 2008). The non-negativity
constraint on orders was treated as censoring because an order of zero could
indicate situations in which a player intended to cancel a previously placed order, but was
restricted by the rules of the game to a minimum order of zero. Furthermore, we corrected for the
presence of heteroskedasticity by using the robust Huber-White sandwich estimator of variance.
Analysis was conducted in two steps. First, we treated the 22 retailers, 22 wholesalers, and 22
distributors together (downstream roles) as the treatment group and the 22 factories as the control
group, and estimated Equation (3) (Table 2). We note that the signs of the coefficients for
inventory (I), incoming orders (R), and shipments received (S) are consistent with the
formulation, and statistically significant. This supports the idea that the decision makers
(treatment and control group) take into consideration information cues pertinent to the problem:
I, R, and S. The coefficient of $O_{t-1}$ is significant and in the hypothesized direction ($\alpha_C = 0.52$,
$p < 0.001$) for the treatment group, and insignificant for the control group, meaning that only the former
seems to behave in accordance with Equation (3). Note that the effect sizes, even when taken
at the individual role-level, are substantial: 0.49, 0.64 and 0.40. Next, in order to ascertain that
the three downstream roles individually reflect the same behavior as the treatment group, we
performed similar analysis after splitting the data role-wise as retailer, wholesaler, and
distributor. The results remain qualitatively unchanged.
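For readers unfamiliar with censored regression, the following Python sketch writes out the log-likelihood that a Tobit model left-censored at zero maximizes. It is a generic textbook formulation using only the standard library, not the estimation code used in this study; the function and argument names are assumptions, and the robust (sandwich) variance correction mentioned above is not shown.

```python
import math
from typing import Sequence


def _norm_logpdf(z: float) -> float:
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def _norm_cdf(z: float) -> float:
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loglik(y: Sequence[float], X: Sequence[Sequence[float]],
                 coefs: Sequence[float], sigma: float) -> float:
    """Log-likelihood of a Tobit model with left-censoring at zero.

    Observations with y <= 0 contribute the probability that the latent
    order is non-positive; the others contribute a normal density.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for yi, xi in zip(y, X):
        mu = sum(c * x for c, x in zip(coefs, xi))
        if yi <= 0.0:
            total += math.log(_norm_cdf(-mu / sigma))  # censored observation
        else:
            total += _norm_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total
```

In practice the coefficients and sigma would be chosen by a numerical optimizer to maximize this function, with each row of `X` holding a constant and the cues from Equation (3).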

Insert Table 2 here

As we can see from Table 2, the results hold robustly: each of the three downstream roles
shows behavior consistent with the treatment group and the parameters are all highly significant,
and more importantly, in the hypothesized direction.
It must be noted that Equation (3) does not, in itself, preclude SLU; it leaves open the
possibility that both correction and underweighting could be occurring in tandem and we only
see the sum effect of these two distinct behaviors. To check this possibility, and increase the
robustness and uniqueness of the proposed behavior, we replicated the analysis following Croson
and Donohue (2006); this served as an additional check on the conclusions drawn from the analysis of
their results in the literature review section of this paper (Section 2). If SLU were occurring
simultaneously, then $\alpha_N$ ought to be negative in accordance with Equation (2) in our dataset as
well. The results are presented in Table 3.

Insert Table 3 here

As seen from Table 3, $\alpha_N$ was positive (0.07 by Tobit, and 0.09 by OLS regression).
While this was statistically significant, it does not seem substantively significant: the difference
between Croson and Donohue's (2006) value of 0.0302 and our value of 0.09 seems small in
magnitude. Therefore we would not claim strong support for $\alpha_N$ to be positive; rather, we point
out that there is no strong evidence that it is negative as predicted by SLU. Therefore, the bulk of
the explanatory power comes from the posited correction model.
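The gap between statistical and substantive significance is easier to see once the sample size is taken into account. The sketch below computes the smallest absolute Pearson correlation that reaches two-tailed significance for a given number of observations. It assumes n = 2887 (the sample size reported in Table 3) and approximates the t critical value by the normal quantile, which is reasonable only for large samples; the function name `critical_r` is illustrative.

```python
import math
from statistics import NormalDist


def critical_r(n: int, alpha: float) -> float:
    """Smallest |r| that is significant at two-tailed level alpha.

    Uses r = t / sqrt(df + t^2) with df = n - 2, approximating the
    t critical value by the standard normal quantile (large-n assumption).
    """
    df = n - 2
    t_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return t_crit / math.sqrt(df + t_crit ** 2)


N_OBS = 2887  # assumption: sample size taken from Table 3

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: |r| > {critical_r(N_OBS, alpha):.2f}")
```

Under this assumption the thresholds round to 0.04 and 0.05, in line with the note to Table 1. With a sample this large, even very weak associations are statistically significant, so the magnitude of a coefficient deserves separate attention.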

The beer game is known to remain essentially the same wherever and by whomever it is
played (Senge, 2006; Sterman, 2006). In the present research, analyses of two different datasets,
Croson and Donohue's (2006) and our own, point in the same direction, supporting our proposed
explanation and making the findings more conclusive.
Whereas the proposed behavior may seem to be at odds with SLU, it does seem consistent
with the A&A heuristic. Apparently, in each decision period, the previous decision serves as the
anchor, and the final choice is adjusted in the direction in accordance with the available
information of incoming orders, incoming shipments, and inventory status.
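The anchoring-and-adjustment reading above can be illustrated with a small ordering rule in which the previous order is the anchor and the available information supplies the adjustment. This is only a sketch under stated assumptions: the linear form floored at zero is inferred from the variable list in Table 2 and from the fact that orders cannot be negative, the time trend is omitted, and the names `Weights` and `next_order` as well as all numeric weights (including their signs) are hypothetical, not estimates from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Hypothetical weights; these are NOT the estimates in Table 2."""
    intercept: float
    prev_order: float          # weight on O_{t-1}, the anchor
    inventory: float           # weight on I, inventory status
    incoming_orders: float     # weight on R, incoming orders
    incoming_shipments: float  # weight on S, incoming shipments


def next_order(w: Weights, prev_order: float, inventory: float,
               incoming_orders: float, incoming_shipments: float) -> float:
    """Anchor on the previous order, adjust with current cues, floor at zero."""
    adjustment = (w.intercept
                  + w.inventory * inventory
                  + w.incoming_orders * incoming_orders
                  + w.incoming_shipments * incoming_shipments)
    return max(0.0, w.prev_order * prev_order + adjustment)


# Illustrative values only (assumed, including the negative inventory weight).
w = Weights(intercept=0.0, prev_order=0.5, inventory=-0.1,
            incoming_orders=0.6, incoming_shipments=0.2)
order = next_order(w, prev_order=8, inventory=4,
                   incoming_orders=10, incoming_shipments=6)
print(f"next order: {order:.1f}")
```

Because the previous order enters with a positive weight in this sketch, a large order in one period raises the order in the next, all else being equal, which is the reinforcing pattern the correction behavior describes.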
DISCUSSION AND CONCLUSION
Decision making in dynamic environments has attracted researchers from areas ranging from
human learning and infant learning (Elman, Bates, Johnson, Karmiloff-Smith, Parisi, & Plunkett,
1996; Munakata, McClelland, Johnson, & Siegler, 1997) to management of software
development teams (Sengupta & Abdel-Hamid, 1993) and macroeconomics (Sterman, 1989b).
The fundamental paradigm upon which these studies were built, and either supported or
challenged, was that people have an innate inability to consider the delayed response of their
actions, and this is called the SLU bias. This position was based on studies in the supply chain
context (Sterman, 1989a). The stock management formulation as captured in Equation (1)
underpins our current understanding of inventory decision making and also forms the baseline
for system dynamics modeling of inventory management as well as many other areas of
application such as capital investment, human resources, cash management, marketing, hog
farming, agricultural commodities, and commercial real estate (Sterman, 2000).
We have carefully analyzed the limited empirical evidence and the theory behind the stock
management formulation heuristic and SLU bias. Our analysis revealed some limitations of SLU
and strongly suggested the presence of at least some other biases that we then investigated
through a multi-method research design. The proposed explanation was uncovered through a
case study of a large component manufacturer's supply chain practices. Drawing upon the SRN
learning model, the resulting behavior was explicitly formulated and tested through a carefully
controlled laboratory experiment. The key effect tested was that inventory decision makers resort
to reinforcing their previous orders (i.e., in an endogenous loop, the decision maker corrects his
previous decision hoping for the desired result), which we call the correction behavior. Note that
several other biases studied in behavioral operations, such as those examined by Benzion, Cohen,
and Shavit (2010), Bolton and Katok (2008), Gavirneni and Isen (2010), and Schweitzer and Cachon
(2000), could still be occurring in tandem and influence the ordering behavior. In particular, Su
(2008) also found that overordering occurred even when the supply line was simply non-existent,
underscoring the fact that SLU is not a unique explanation for overordering.
Correction behavior applies in situations where free communication with the supplier is not
possible, as in the case study where the buyer had little power over the powerful supplier, and in
the beer game where communication among channel members is not allowed. Curiously, we
found that this behavior occurred even in the beer game where players have a common team cost
minimization objective (fully integrated supply chain); so we believe this effect is likely to hold
even more strongly in real-world supply chains where players have local profit maximization
objectives and therefore the incentive to yell at suppliers is stronger.
The distinction between the two behaviors can be sharpened using the intentions, actions
and reactions framework used in behavioral operations (Bendoly, Donohue, & Schultz, 2006).
According to this framework, SLU falls under the actions category, which pertains to cognitive
biases (i.e. failure of the decision maker to make a choice consistent with his or her true
intentions or personal preference). Loch and Wu (2007) subsume all three of these categories
(intentions, actions and reactions) under individual decision biases due to cognitive limitations,
and place SLU within this broad category. This implies that rather than choosing not to recognize
the supply line, SLU connotes an incapability to recognize it due to a faulty mental model (Senge,
2006; Bendoly, Croson, Gonçalves, & Schultz, 2010). In contrast, we position the correction
model under the intentions category: the decision maker is hypothesized to consciously consider
the previous decision (the recent, relevant part of the supply line); he overorders because of,
rather than in spite of, the existing supply line (i.e., he is explicitly modeled as recognizing
the supply line).
All research studies have limitations, and this one is no exception. Although we used a
formal experimental test of the proposed behavior instead of relying exclusively on econometric
testing, the general issues pertaining to the use of classroom experiments persist in this study.
Our study also derives some propositions from qualitative responses obtained from the beer
game participants. We caution against reliance on this approach. The main reason is that
cognitive biases and bounded rationality are, by default, not clearly known to the people
themselves; therefore their responses would likely be either incomplete or inaccurate. Adoption
of alternate research methods in future research can overcome some of the limitations of our
study.
The other noteworthy limitation of this study is the use of a single case study. While we
developed deep knowledge of the case setting and the undercurrents of ordering behavior in the
whole company, it remains that only a single Buyer-Supplier dyad provided a level of control
high enough to make it amenable for a rigorous quantitative analysis. Statistical generalizability
of single case studies has been the subject of much debate in management research, and we defer
the matter to more authoritative sources (Yin, 2009).
We now offer our brief thoughts on future research possibilities that could build on this
contribution. Although the correction model was developed from a case study and validated in a
lab setting, we believe the findings are tentative. Replication studies outside of the inventory
management context need to be carried out before the model can gain wider acceptance and we
especially hope that researchers from outside the operations management community will take
up this task. Further, overordering or over-reaction is a symptom of a broad problem with
numerous underlying rational and behavioral causes; SLU, coordination risk mitigation (Croson
et al., 2008), over-reaction to backorders (Oliva & Gonçalves, 2005), buyer over-stating and
supplier discounting forecasts (Terwiesch et al., 2005), and now correction behavior, are but a
few. We suspect that these behaviors seldom occur in isolation but most likely occur
simultaneously and interact with each other. Most academic studies, including this one, isolate
and study one cause at a time. We hope future research will model the interactions of these
behaviors, for example, by combining two or three of these behaviors in simulation models and
testing how they add to, multiply with, or cancel each other out. We also hope SLU will receive
increased attention with future studies offering theoretical and empirical support to it. At a broad
level, the correction model is a tentative model that departs from the mainstream literature and in
a sense, stands alone; we hope future studies will either lend it support or refute it, and in either
case extend knowledge on this important topic.
To conclude, the correction model provides an alternate perspective on why and how
people over-react in dynamic decision settings such as inventory management. A better
understanding of these behaviors can provide value to both academics (to build on these results
and incorporate these behaviors into existing models to make them more robust) and managers
(who, by being sensitized to these behaviors, can improve their decision making).
REFERENCES
Axelrod, R. (1981). The emergence of cooperation among egoists. American Political Science
Review, 75(2), 306-318.
Bendoly, E., Croson, R., Gonçalves, P., & Schultz, K. L. (2010). Bodies of knowledge for
research in behavioral operations. Production and Operations Management, 19(4), 434-
452.
Bendoly, E., Donohue, K., & Schultz, K. L. (2006). Behavior in operations management:
Assessing recent findings and revisiting old assumptions. Journal of Operations
Management, 24(6), 737-752.
Benzion, U., Cohen, Y., & Shavit, T. (2010). The newsvendor problem with unknown
distribution. Journal of the Operational Research Society, 61(6), 1022-1031.
Bolton, G. E., & Katok, E. (2008). Learning by doing in the newsvendor problem: A laboratory
investigation of the role of experience and feedback. Manufacturing & Service Operations
Management, 10(3), 519-538.
Bowman, E. H. (1963). Consistency and optimality in managerial decision making. Management
Science, 9(2), 310-321.
Boyer, K. K., & Swink, M. L. (2008). Empirical elephants: Why multiple methods are essential
to quality research in operations and supply chain management. Journal of Operations
Management, 26(3), 338-344.
Brehmer, B. (1995). Feedback delays in complex dynamic decision tasks. In P. A. Frensch, & J.
Funke (Eds.), Complex problem solving: The European perspective. Hillsdale, NJ:
Erlbaum, 103-130.
Cleeremans, A., & McClelland, J. L. (1991). Learning the structure of event sequences. Journal
of Experimental Psychology: General, 120(3), 235-253.
Combs, J. G. (2010). From the editors: Big samples and small effects: Let's not trade relevance
and rigor for power. Academy of Management Journal, 53(1), 9-13.
Croson, R., & Donohue, K. (2002). Experimental economics and supply chain management.
Interfaces, 32(5), 74-82.
Croson, R., & Donohue, K. (2005). Upstream versus downstream information and its impact on
the bullwhip effect. System Dynamics Review, 21(3), 249-260.
Croson, R., & Donohue, K. (2006). Behavioral causes of the bullwhip effect and the observed
value of inventory information. Management Science, 52(3), 323-336.
Croson, R., Donohue, K., Katok, E., & Sterman, J. D. (2008). Order stability in supply chains:
Coordination risk and the role of coordination stock. MIT Sloan Working Paper.
Cambridge, MA.
Davis, F. D., & Kottemann, J. E. (1995). Determinants of decision rule use in a production
planning task. Organizational Behavior and Human Decision Processes, 63(2), 145-157.
Diehl, E., & Sterman, J. D. (1995). Effects of feedback complexity on dynamic decision making.
Organizational Behavior and Human Decision Processes, 62(2), 198-215.
Dogan, G. (2007). Bootstrapping for confidence interval estimation and hypothesis testing for
parameters of system dynamics models. System Dynamics Review, 23(4), 415-436.
Dogan, G., & Sterman, J. D. (2006). I am not hoarding, I am just stocking before the hoarders get
there. Proceedings of the 24th International Conference of the System Dynamics Society.
Nijmegen, The Netherlands.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., & Plunkett, K.
(1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge,
MA: MIT Press.
Gavirneni, S., & Isen, A. M. (2010). Anatomy of a newsvendor decision: Observations from a
verbal protocol analysis. Production and Operations Management, 19(4), 453-462.
Gibson, F. P. (2000). Feedback delays: How can decision makers learn not to buy a new car
every time the garage is empty? Organizational Behavior and Human Decision Processes,
83(1), 141-166.
Gino, F., & Pisano, G. (2008). Toward a theory of behavioral operations. Manufacturing &
Service Operations Management, 10(4), 676-691.
Greene, W. H. (2008). Econometric Analysis (6th ed.). Upper Saddle River, NJ: Prentice Hall.
Hogarth, R. M. (1981). Beyond discrete biases: Functional and dysfunctional aspects of
judgmental heuristics. Psychological Bulletin, 90(2), 197-217.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8),
e124.
Joslyn, S., & Hunt, E. (1998). Evaluating individual differences in response to time-pressure
situations. Journal of Experimental Psychology: Applied, 4(1), 16-43.
Kanfer, R., & Ackerman, P. L. (1989). Motivation and cognitive abilities: An
integrative/aptitude treatment interaction approach to skill acquisition. Journal of Applied
Psychology, 74(4), 657-690.
Kleinmuntz, D. N., & Thomas, J. B. (1987). The value of action and inference in dynamic
decision making. Organizational Behavior and Human Decision Processes, 39(3), 341-
364.
Kreps, D. M., Milgrom, P., Roberts, J., & Wilson, R. (1982). Rational cooperation in the finitely
repeated prisoners' dilemma. Journal of Economic Theory, 27(2), 245-252.
Loch, C. H., & Wu, Y. (2007). Behavioral Operations Management. Foundations and Trends in
Technology, Information and Operations Management, 3(1), 121-232.
Metters, R. (1997). Quantifying the bullwhip effect in supply chains. Journal of Operations
Management, 15(2), 89-100.
Munakata, Y., McClelland, J. L., Johnson, M. H., & Siegler, R. S. (1997). Rethinking infant
knowledge: Toward an adaptive process account of successes and failures in object
permanence tasks. Psychological Review, 104(4), 686-713.
Oliva, R., & Gonçalves, P. M. (2005). Behavioral causes of demand amplification in supply
chains: Satisficing policies with limited information cues. Proceedings of 23rd
International Conference of the System Dynamics Society. Boston, MA.
Schweitzer, M. E., & Cachon, G. P. (2000). Decision bias in the newsvendor problem with a
known demand distribution: Experimental evidence. Management Science, 46(3), 404-420.
Senge, P. (2006). The Fifth Discipline: The Art and Practice of the Learning Organization. New
York, NY: Doubleday.
Sengupta, K., & Abdel-Hamid, T. K. (1993). Alternative conceptions of feedback in dynamic
decision environments: An experimental investigation. Management Science, 39(4), 411-
428.
Sterman, J. D. (1989a). Modeling managerial behavior: Misperceptions of feedback in a dynamic
decision making experiment. Management Science, 35(3), 321-339.
Sterman, J. D. (1989b). Misperceptions of feedback in dynamic decision making. Organizational
Behavior and Human Decision Processes, 43(3), 301-335.
Sterman, J. D. (2000). Business Dynamics: Systems Thinking and Modeling for a Complex
World. Homewood, IL: McGraw-Hill.
Sterman, J. D. (2006). Operational and behavioral causes of supply chain instability. In O.
Carranza, & F. Villega (Eds.), The Bullwhip Effect in Supply Chains. New York, NY:
Palgrave Macmillan, 17-56.
Su, X. (2008). Bounded rationality in newsvendor models. Manufacturing & Service Operations
Management, 10(4), 566-589.
Terwiesch, C., Ren, Z. J., Ho, T. H., & Cohen, M. A. (2005). An empirical analysis of forecast
sharing in the semiconductor equipment supply chain. Management Science, 51(2), 208-
220.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases.
Science, 185(4157), 1124-1131.
Wu, D. Y., & Katok, E. (2006). Learning, communication, and the bullwhip effect. Journal of
Operations Management, 24(6), 839-850.
Yin, R. K. (2009). Case Study Research: Design and Methods (4th ed.). Thousand Oaks, CA:
Sage.

Table 1: Descriptive statistics and intercorrelations.

Variable   O        O_{t-1}  I        R        S        N        t
O          1
O_{t-1}    0.77**   1
I          0.52**   0.55**   1
R          0.75**   0.71**   0.47**   1
S          0.27**   0.39**   0.27**   0.43**   1
N          0.58**   0.74**   0.58**   0.53**   0.64**   1
t          0.11**   0.10**   0.01     0.13**   0.17**   0.23**   1
M          6.57     6.41     2.27     5.76     6.49     28.40
SD         9.87     9.63     32.24    7.15     10.14    40.37

Note: Pearson correlation coefficients are shown.
* p < 0.05 (equals |r| > 0.04), ** p < 0.01 (equals |r| > 0.05) (two-tailed).

Table 2: Results of regression analysis according to Equation (3).

                Factory           Total downstream  Retailer          Wholesaler        Distributor
Variables       b        (SE)     b        (SE)     b        (SE)     b        (SE)     b        (SE)
Intercept       1.99     (1.27)   0.70*    (0.34)   0.53     (0.36)   0.83     (0.55)   1.07     (0.59)
O_{t-1}         0.31     (0.20)   0.52***  (0.06)   0.49***  (0.08)   0.64***  (0.08)   0.40***  (0.08)
I               0.08     (0.05)   0.09***  (0.01)   0.06***  (0.01)   0.10***  (0.02)   0.13***  (0.03)
R               0.79***  (0.13)   0.67***  (0.07)   0.60***  (0.06)   0.60***  (0.09)   0.76***  (0.10)
S               0.23***  (0.06)   0.24***  (0.04)   0.10**   (0.04)   0.35***  (0.07)   0.26***  (0.05)
t               0.11**   (0.04)   0.03***  (0.01)   0.02     (0.01)   0.03     (0.02)   0.05*    (0.02)
n               721               2166              723               721               722
Log likelihood  2099.69           5287.61           1575.42           1718.13           1823.52

Note: Tobit regression models were estimated. b refers to unstandardized regression estimates. Robust standard
errors (Huber-White) in parentheses.
p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001 (two-tailed).

Table 3: Regression analysis: Replication of Croson and Donohue (2006).

                Tobit             OLS
Variables       b        (SE)     b        (SE)
Intercept       1.83***  (0.53)   0.92     (0.49)
I               0.12***  (0.02)   0.02*    (0.01)
R               0.93***  (0.11)   0.87***  (0.10)
S               0.34***  (0.05)   0.25***  (0.04)
N               0.07***  (0.02)   0.09***  (0.01)
t               0.07***  (0.02)   0.02     (0.01)
n               2887              2887
Adj. R^2                          0.65
F                                 1095.32
Log likelihood  7815.72           9168.77

Note: b refers to unstandardized regression estimates. Robust standard errors (Huber-White) in parentheses.
p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001 (two-tailed).

Figure 1: Anchoring and adjustment (A&A) heuristic.

[Figure labels: Independent answer: 30%; Anchor: 10%; Average answer: 25%; Anchor: 65%; Average
answer: 45%. Solid arrow indicates adjustment toward independent value; dotted arrows signify the
magnitude of "insufficient adjustment".]

Note: The decision will usually lie between the subject's independent best estimate of the true value and the anchor
provided. Note that these are average values across the subjects; individuals' behavior may differ.

Figure 2: Comparing and contrasting the A&A heuristic and stock management formulation.

[Figure labels, A&A heuristic panel: Optimal level (independent answer); Anchor (cue); Decision
(selected value). Stock management formulation panel: Current stock; Anchor: Initial stock or
something else; Decision (order quantity). Dotted arrow signifies the magnitude of insufficient
adjustment.]
Figure 3: Castings supply chain showing transit lead-times.

[Figure labels: Castings supplier; Buyer PPC; Phoenix; Receipt at warehouse; Machining suppliers;
Production line; OEMs; Information flow; Physical flow; lead-times of 1.5 days, 0.5 days, 0.5 days,
JIT, and 2 - 3 days.]

Figure 4: Sequence of events.

[Figure labels: Month n; Month n + 1; dates 25th, 28th, 31st, and 1st; Phoenix PPC; Phoenix Buyer;
Supplier. Events: Releases production plans; Receives firm plans from various OEMs; Releases firm
orders by email; Supplies approx. 20 times over the month; Releases expediting orders by telephone
as and when needed; Observes actual pickup by OEM.]
Figure 5: Summary of aggregated (12 castings) data.

Figure 6: Differences between aggregated (12 castings) orders and receipts.
