Nonfinancial Information and Accounting: A Reconsideration of Benefits and Challenges

Joan Luft

Accounting Horizons, Vol. 23, No. 3 (September 2009), pp. 307-325. DOI: 10.2308/acch.2009.23.3.307. American Accounting Association. Commentary. Submitted: May 2007; Accepted: May 2009; Published Online: August 2009.

Joan Luft is a Professor at Michigan State University. Corresponding author: Joan Luft; email: luftj@bus.msu.edu. The author is grateful to Karen Sedatole, Tyler Thomas, two anonymous reviewers, and Ella Mae Matsumura (editor) for helpful comments.
SYNOPSIS: Recent years have seen widespread interest in supplementing or replacing accounting information with nonfinancial information (NFI) in a variety of uses such as incentive compensation, prediction of costs and profits, and firm valuation. The joint use of NFI and accounting has had mixed results, however. Research has documented benefits to such use but has also documented significant challenges. This commentary summarizes research that addresses two particularly important challenges in using combinations of accounting and NFI: measuring nonfinancial performance accurately and weighting measures appropriately when multiple accounting and nonfinancial measures are used together. These challenges are related, in that the nature and magnitude of measurement error helps to determine appropriate weights on multiple measures. Two common themes appear in strategies for dealing successfully with these challenges. The first is that matching information properties to decision types can limit the need for costly or infeasible improvements in measurement. Measurement errors that have significant negative impact on some decisions can be innocuous when the information is used for other decisions. The second theme is a portfolio approach to measurement error: the negative decision effects of error in individual measures can be significantly mitigated by well-chosen combinations of NFI and accounting measures.
INTRODUCTION

Proposals to supplement conventional accounting with the use of nonfinancial information (NFI) have exerted a powerful appeal in recent years. Balanced scorecards and similar performance measurement systems have been advocated intensively and are widely used by organizations (e.g., Eccles et al. 2001; Kaplan and Norton 2001a, 2001b, 2001c, 2008). Business-risk or strategic-systems audits, which rely on NFI to understand the client's business, have been put forward as a way to conduct efficient, high-quality audits in a challenging economic and regulatory environment (Bell et al. 2002; Peecher et al. 2007). Financial analysts use NFI to forecast earnings and stock prices (Dempsey et al. 1997; Chandra et al. 1999; Rajgopal, Venkatachalam, and Kotha 2003; Peecher et al. 2007), and the Financial Accounting Standards Board
(FASB) has considered mandating the reporting of nonfinancial measures along with traditional financial statements (FASB 2001; Maines et al. 2002; Upton 2001).[1]
Recent evidence, however, suggests that high initial expectations about the value of NFI were not fulfilled in many instances. NFI appeared particularly value relevant (that is, associated with stock prices) for Internet firms in the later 1990s, but this value relevance fell significantly (not to zero, however) after the end of the Internet bubble (Demers and Lev 2001; Rajgopal, Venkatachalam, and Kotha 2003). Many firms that adopted NFI-based incentive systems subsequently discarded them (e.g., 42 percent in the sample analyzed by HassabElnaby et al. 2005). Recent research on business risk audits has reported considerable unwillingness by auditors to rely on NFI-based approaches (Knechel 2007; Curtis and Turley 2007). After relatively intensive consideration at the beginning of the decade, the FASB has not acted to mandate NFI reporting.
Given the "retreat to the financial" that appears in this recent research, a number of questions arise for accountants. First, why, if at all, should accountants be involved with the development and use of NFI, rather than, for example, leaving customer-satisfaction measurement to marketers and employee-morale measurement to human resource specialists? Second, what has been learned from the experience of recent years about the actual benefits and challenges of using NFI in conjunction with accounting information? Third, if and when accountants are involved with NFI development and use, what assistance does accounting research provide to deal with the observed challenges in the development and use of NFI? The remainder of this commentary addresses these three issues in turn.
NFI AND ACCOUNTING

Most organizations use a wide range of data that is important to the organizational mission but falls outside the purview of the organization's financial function. Accountants typically have little to do with NFI documenting, for example, procedures for engineering experiments or biometric indicators for high-security employee IDs. What makes selected NFI the business of accountants or users of accounting information?

Accounting research summarized below provides evidence that selected NFI can be used both to substitute for and to complement accounting information in tasks for which accounting is typically important, such as forecasts of future financial performance or evaluation of current performance. Accounting and NFI work together as a portfolio of measures, in which the value of using and refining accounting measures depends on the information properties of NF measures included in the portfolio, and the information value of any specific NF measure depends on the properties of accounting.
In consequence, the accountant's tasks depend on the properties of NFI as well as of accounting information. Whether accountants should, for example, devote significant effort to developing financial measures of intellectual capital as an input to the valuation of knowledge-intensive firms depends on how cost-effectively NF measures such as patents and publications can provide the same information. In this case, accounting and NF measures are substitutes, and more informative NFI means less need for accountants to develop, or users to seek out, financial measures.

In contrast, when NFI complements accounting, more use of NFI means more use of accounting, because accounting is more valuable when used together with NFI than when used alone. For example, in Amir and Lev (1996), accounting earnings alone appear irrelevant to stock prices for wireless communication firms; but when NF measures of growth potential are included in the model, earnings become significantly value relevant.[2]
[1] The definition of "financial" and "nonfinancial" varies across users. Some regard all measures denominated in dollars or other currency (e.g., cost of quality measures) as financial, while measures like defect counts or satisfaction ratings are nonfinancial (e.g., Nagar and Rajan 2001). Others regard financial measures as consisting primarily of GAAP earnings and its components and stock prices or returns, while measures like customer profitability or cost of quality are nonfinancial even though dollar-denominated (Kaplan and Norton 2001a; Upton 2001). In general, the observations in this paper apply to NFI identified by either definition.
Similarly, in performance evaluation and reward systems, accounting earnings that are imperfect measures of employees' actions can be more heavily weighted (i.e., more dollars of reward can be provided for a given increase in earnings) when appropriate NFI is also included in the reward base (Feltham and Xie 1994; Datar et al. 2001). In such cases, more informative NFI means that accountants can more confidently advocate the use of earnings or other accounting information in decision making, even though earnings is not a perfect measure of firm, business-unit, or individual performance.

Thus accounting and nonfinancial measurement can usefully inform each other and do not, in principle, benefit from being performed in isolation. But as described in more detail below, results of recent experience with joint use of accounting and NFI have been mixed, and users of multiple-measure systems have expressed a number of specific frustrations with them.
BENEFITS AND CHALLENGES OF NFI USE

Benefits

Benefits of combining selected NFI with accounting measures have been documented in numerous studies in recent years. The incremental explanatory power for earnings and stock returns provided by NFI has been well established (e.g., Amir and Lev 1996; Ittner and Larcker 1998; Hughes 2000; Trueman et al. 2000; Nagar and Rajan 2001; Rajgopal, Venkatachalam, and Kotha 2002, 2003; Rajgopal, Shevlin, and Venkatachalam 2003; Smith and Wright 2004), although prior earnings is often a stronger predictor (e.g., for stock returns in Francis et al. 2003). Brazel et al. (2007) find that NFI (specifically, inconsistency between patterns in financial and NF information) is a significant indicator of financial fraud.

The use of diverse financial (F) and NF measures to manage organizations (e.g., to allocate resources and reward employees) appears to be positively associated with organizational performance on average. The association is often weak, however; there is considerable variation in the experience of individual organizations, and attempts to predict which types of organizations will benefit more from NFI use than others have had mixed results (Hoque and James 2000; Banker et al. 2000; Ittner, Larcker, and Randall 2003; Said et al. 2003; Chenhall 2005; Van der Stede et al. 2006).
Another stream of studies focuses on associations between NFI and learning, which may not be captured clearly in tests of the direct (especially short-term) association between NFI use and organizational performance. Chenhall (2005), using survey data, finds that organizational learning mediates the effect of customer-measure use on strategically important customer outcomes. That is, how strongly the use of customer measures is associated with successful customer outcomes depends on how strongly the use of customer measures is associated with organizational learning. Campbell (2008), using archival data from a restaurant chain, finds that NF-based incentives increase performance, and a portion of the performance improvement remains after the incentive is reduced or eliminated, apparently because of nonreversible learning gains. Similarly, experimental results in Farrell et al. (2008) indicate that even in settings where leading NF indicators have no incentive value because employees have long horizons, employee performance is higher when compensation contracts include forward-looking NF measures than when they do not. The inclusion of forward-looking measures appears to induce more focused testing of task strategies, which increases performance over time.[3]
[2] A discussion of this study by Shevlin (1996) expresses some reservations about the analyses employed, but the general principle that accounting can be more informative when complemented by NFI remains valid. For example, Brazel et al. (2007) find that a measure developed by combining revenue growth and growth in NF measures (e.g., number of employees or facilities) is a significant predictor of fraud; revenue growth without the comparison to NF growth seems unlikely to be an equally valuable fraud indicator.
Sometimes incentives for current performance conflict with learning, but well-chosen NFI can contribute to resolving this conflict. Dye (2004) observes that the actions by managers that do the most to increase profit in the current period do not always provide the most valuable information about which actions will improve profits in the future. Thus the evaluation and reward system must carefully balance incentives for performance against incentives for experimentation and learning.[4] Analytic results in Dye (2004) indicate that the better an organization's information system tracks the intermediate outcomes (product quality, customer satisfaction, etc.) between managers' actions and financial performance, the more worthwhile it is for managers to experiment, because they can learn more from the outcomes of their experimentation.
Challenges to Effective Use of NFI

In spite of the benefits of NFI use documented above, attempts to implement systems of F and NF measures have had mixed success. Some of the challenges to effective NFI use are primarily management issues: for example, leadership failures in implementation (Kaplan 2006) and failure to link the NFI to strategy (Ittner and Larcker 2003) or even agree on a strategy to which NFI could be linked (Kasurinen 2002). Other challenges relate to information design and use, however, and thus fall more clearly in the domain of accounting.

Two key information-related challenges to effective NFI use are measuring important nonfinancial factors and weighting multiple F and NF indicators appropriately in decisions like resource allocation, planning, and performance evaluation and reward. A Deloitte (2007) survey of senior executives and board members identifies understanding of how to measure nonfinancial drivers of performance as the primary requirement for more successful use of NFI. According to Ittner and Larcker (2003), one of the most common mistakes organizations make in using NFI is incorrect measurement. A number of studies have documented the importance of measurement problems in blocking the development and use of multiple-performance-measure systems (e.g., Malina and Selto 2001, 2004; Cavalluzzo and Ittner 2004; Andon et al. 2007).

Incorrect weighting of multiple measures can also lead to disappointments with NFI, and determining appropriate weights is likely to be a difficult and conflict-ridden process (see Malina and Selto 2001, 2004, for examples).[5]
Users of NFI are sometimes simply unsure about appropriate weights; for example, a manager quoted in one field study says: "I don't have a sense of which of these measures have the most leverage compared to others" (Malina and Selto 2004, 459).

Even when users of NFI have more confidence about weights on multiple measures, the weights can be systematically mistaken, thus supporting decisions with disappointing outcomes. Daniel and Titman (2006) and Rajgopal, Shevlin, and Venkatachalam (2003) argue that the equity markets systematically overweight selected NFI. In single-firm studies, Ittner, Larcker, and Meyer (2003) and Malina and Selto (2001) find that managers initially put significant weights on NFI when a new performance measurement system is introduced but soon redistribute the weight to more traditional financial or market-share measures, possibly because managers regard the initial weights on NFI as mistakes. Moers (2005), using proprietary information from a large European industrial firm, finds that the use of more diverse performance measures is associated with more lenient and more compressed evaluations. That is, it appears that supervisors weight multiple measures differently for different subordinates to avoid the unpleasant task of giving low evaluations or evaluations that differ much across individuals.
[3] Results of an experiment by Webb (2004) indicate that incentive effects of contracting on forward-looking measures depend on the prima facie plausibility of the measures' effects on future performance, and learning might also be affected by this factor.

[4] Note that the tradeoff Dye (2004) describes is not the usual tradeoff between actions that improve current performance and actions that improve future performance, but between actions that improve current performance and actions that reduce managers' uncertainty about what will improve future performance.

[5] Weights can be explicit in formal prediction models or incentive-compensation formulas, or implicit in subjective evaluations or predictions that are more influenced by some measures than others. They can also be implicit in other elements of the control system; for example, the intensity with which managers respond to variances on different measures (Lillis 2002).
Thus both measurement and weighting pose significant problems for organizations. Popular understanding of measurement and weighting problems sometimes seems limited to a belief that "accurate" or "reliable" measures should be chosen, and that more important measures should be weighted more heavily. In this view, when performance is multidimensional (e.g., both innovation and cost management are important), an appropriate approach would be to choose for each dimension (innovation and cost management) the most accurate measure that can be acquired or constructed at a reasonable cost and then to weight the measures based on the relative strategic importance of the performance dimensions.

Both analytical and empirical research indicate, however, that sometimes inaccurate measures are quite serviceable in decision making, and at other times important measures need to be weighted lightly (see below for examples). Hence a more refined approach to measurement and weighting can be helpful, and research suggests two important principles for making such refinements.
The first principle is to match information properties appropriately to decisions. The type and magnitude of error that make a measure virtually useless for one decision can be relatively harmless in another. Because error reduction is often costly, identifying the decision settings in which it is more valuable can make NFI use more cost effective.

The second principle is to take a portfolio approach in performing this matching: that is, to consider the information properties of the set of F and NF measures used, rather than evaluating the measures one by one. Important dimensions of performance are sometimes difficult to measure, but even a poor measure can be useful if other measures in the set can reduce the first measure's negative effects on decision quality.

The remainder of this commentary first provides brief definitions of decision and measurement error types as a basis for successful matching. The following sections then summarize recommendations for measurement of NFI that can be drawn from recent research, followed by recommendations for weighting of F and NF measures in multiple-measure portfolios.
DEFINITIONS

Decisions

Based on Demski and Feltham's (1976) distinction between decision-influencing and decision-facilitating uses of information, decisions are categorized as follows:

(1) Performance evaluation for purposes of reward is a decision-influencing use of information. Here NFI, in combination with accounting, is used as a basis for determining performance-based rewards such as bonuses, equity compensation, and promotions. A measure or portfolio of measures is better for this purpose, the better it captures employee efforts and talents that increase firm value or other relevant organizational objectives, and the less it responds to "window-dressing" or the occurrence of random external events that influence organizational outcomes.

(2) Provision of predictor variables is a decision-facilitating use of information. Here NFI is used, for example, as a leading indicator to forecast future financial performance, and to estimate the expected return of alternative investment projects as a basis for choosing between them. A measure or portfolio of measures is better for this purpose when it supports more accurate predictions.
Measurement Errors

Measurement errors can be usefully divided into two categories, bias and random error. When NFI is upwardly or downwardly biased, on average it overstates or understates actual values. NFI can also, or instead, contain random error (noise). If the measure is noisy but unbiased, it is accurate on average but overstates or understates in particular instances.
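To make the distinction concrete, the following short simulation (an illustrative sketch with hypothetical numbers, not data from any study cited here) generates a biased-but-precise measure and an unbiased-but-noisy measure of the same underlying construct and compares their average and per-observation errors.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = rng.uniform(4, 8, size=10_000)       # hypothetical true construct values

biased = true_value + 2.0                         # constant upward bias, no noise
noisy = true_value + rng.normal(0, 1.5, 10_000)   # unbiased, but contains random error (noise)

print("mean error, biased measure:", np.mean(biased - true_value))   # ~ +2.0
print("mean error, noisy measure: ", np.mean(noisy - true_value))    # ~ 0.0
print("typical error per observation, biased:", np.mean(np.abs(biased - true_value)))  # ~ 2.0
print("typical error per observation, noisy: ", np.mean(np.abs(noisy - true_value)))   # ~ 1.2
```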
MEASUREMENT

Measurement issues in the use of NFI take a number of forms. When multiple measures of a particular performance dimension (e.g., quality, customer satisfaction) are available, criteria are needed for choosing among the available measures. When accountants and managers are constructing portfolios of F and NF measures, they need to respond to users' concerns about the possible inaccuracy of the measures included. (See Malina and Selto 2001, 2004 for examples of these problems in a field study.) Methods of addressing these problems include not only reducing the error in a given measure, but also mitigating the negative decision consequences of irreducible error and identifying decision settings in which the particular errors are relatively innocuous and thus NFI can be useful in spite of measurement error.

The recommendations below begin with identifying relevant errors and matching them to decisions, and then continue with means of mitigating errors and/or their effects.
Identify Error in Measuring the Construct, Not the Indicator

A measure such as patent counts is an indicator of an underlying construct such as firm-value-increasing innovation. In economic models of management accounting (see, e.g., Lambert 2007), both bias and random error are defined with respect to the underlying construct that the measure intends to capture, not with respect to the indicator. Thus a patent count that correctly reports the number of patents the organization actually received is not error-free in the sense that is relevant here.

A focus on accuracy in the indicator (for example, choosing NF indicators that are precisely countable and discarding those that are not) can lead to disappointing experiences with NF measures that are "accurate" but are neither good predictors nor good motivators. For predictions, a patent count would be error-free in this sense if it measured exactly the innovativeness that generates future expected revenues. For reward decisions, a patent count would be error-free if it measured exactly the employees' efforts that contributed to this innovativeness. Neither F nor NF measures are likely to be error-free in this sense. Sometimes "softer" or "fuzzier" measures such as knowledgeable subjective judgments of innovation quality can be more accurate in measuring the underlying construct and can provide better support for predictions and performance evaluations.

Care needs to be taken about possible biases in subjective measures (for example, favoritism in subjective performance evaluations), but such measures should certainly not be discarded out of hand in favor of "harder," more objectively countable measures. Analytic research finds that including subjective measures of performance can improve overall performance evaluation and motivation (Prendergast 1999), and Gibbs et al. (2004) find empirical evidence consistent with this prediction, particularly when employees have long tenure, perhaps because long tenure is an indicator of their trust in the subjective evaluation system, and/or subjective evaluation is more accurate for employees with longer track records.
Match Measures to Decisions: Different Errors in Performance Evaluation and Prediction

A measure that is inaccurate and unsatisfactory for one decision type can sometimes serve very well for another. For example, a division might generate a large number of high-value patents based largely on work that was done before the present divisional manager's arrival. This can be true for some years after the manager's arrival in settings where R&D is a slow and cumulative process. The patent count can thus be quite inaccurate as a measure of the current divisional manager's contributions to firm value but quite accurate as a predictor of future revenues.

Careful matching of measures to decision types based on construct-measurement error avoids two mistakes in judgment about NFI measures. One mistake in this setting would be to regard the patent count as a good measure because it is an excellent predictor, and therefore to insist on using it as a basis for rewarding the manager. The other mistake would be to regard it as a bad measure because it works poorly as a basis for reward, and therefore to fail to use it in predictions.
Match Measures to Decisions: Random Error and Prediction Characteristics

The predictive power of NFI for future financial performance or for other NFI is often low, and measurement error in both predictor and predicted variables is one of the sources of this low predictive power.[6] In consequence, the errors of NFI-based predictions will be large; but both the magnitude of the prediction error and the seriousness of its consequences depend on specifics of the decision setting. A NF measure with considerable random error in it can still be valuable in making some predictions and need not be discarded; but conversely, the fact that the measure provides valuable predictions in some decision settings does not mean it will provide valuable predictions in other settings.

Key elements of the decision setting are the number of observations to be predicted and the use to which the predictions are to be put. Holding the prediction model constant, the expected error in the prediction of mean future financial performance will be smaller for the mean of a large number of observations than for the mean of a small number or for the prediction of a single observation. Thus a model with an R² that provides tolerable error levels in predicting the future profits of a large portfolio of firms can be problematic for predicting the future profits of a small number of business units (cf. use of NFI in contemporary budgeting techniques, Hansen et al. 2003) or forming expectations as a basis for judging the plausibility of an audit client's unaudited numbers (cf. Bell et al. 2002).
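A small simulation (hypothetical numbers, not drawn from any of the models cited above) illustrates why the same prediction model can be tolerable for a large portfolio but problematic for a single unit: the error in a predicted mean shrinks roughly with the square root of the number of units being averaged, while the error for any single unit does not.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 5.0           # hypothetical prediction error (in $ millions) for one firm's profit
n_trials = 100_000

# Error when predicting a single firm's profit
single_errors = rng.normal(0, sigma, n_trials)

# Error when predicting the mean profit of a portfolio of 100 firms
portfolio_errors = rng.normal(0, sigma, (n_trials, 100)).mean(axis=1)

print("typical error, single firm:   ", np.abs(single_errors).mean())     # ~ 4.0
print("typical error, portfolio mean:", np.abs(portfolio_errors).mean())  # ~ 0.4
```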
The consequence of large prediction errors in small-sample predictions depends on whether the decision effects of the errors offset. Consider, for example, an auditor who is responsible for four clients. In this setting, error effects do not offset. Forming too low an expectation of earnings, resulting in unnecessary audit work and conflict with one client, does not make up for forming too high an expectation of earnings and failing to find a significant error or irregularity at another client.

In contrast, consider a manager who forecasts sales of four different product lines and adds up the forecasts to get a total sales revenue forecast, and assume that accuracy of the total sales revenue forecast is the primary objective in this case. In this setting the errors tend to offset. An overstatement in the revenue prediction for one product line is partly offset by an understatement in the revenue prediction for another product line. Holding constant the predictive ability of the model and the sample size (four clients or products), the overall error effect is less damaging in the case of the sales forecast. In consequence, different levels of measurement error can be tolerated in the two different settings.
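The following sketch (hypothetical forecast errors, chosen only for illustration) contrasts the two settings: when four product-line forecasts are added up, positive and negative errors partly cancel; when four clients are audited, each client's error matters on its own and no cancellation occurs.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trials, n_units = 100_000, 4
errors = rng.normal(0, 10, (n_trials, n_units))   # hypothetical forecast error per unit

# Sales-forecast setting: only the error in the TOTAL matters, so errors offset.
total_error = np.abs(errors.sum(axis=1))

# Audit setting: the error at EACH client matters separately, so errors do not offset.
per_client_error = np.abs(errors).sum(axis=1)

print("typical error in total sales forecast:", total_error.mean())      # ~ 16
print("sum of typical errors across clients: ", per_client_error.mean()) # ~ 32
```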
[6] Although NFI is a significant predictor of future financial performance, the R²s of models relating NFI to financial performance are often low: 1 percent to 5 percent for customer satisfaction in Ittner and Larcker (1998), and single-digit or low double-digit incremental R²s for various NFI in Nagar and Rajan (2001), Francis et al. (2003), Amir and Lev (1996), Trueman et al. (2000), and Rajgopal, Venkatachalam, and Kotha (2002, 2003). Lambert (1998) makes the point that measurement error in the NF predictors could account for the low R²s of some models.
Match Measures to Decisions: Innocuous Measurement Biases

Bias can seem like a more serious problem than mean-zero random error, because a biased measure does not represent the true value of the underlying construct even on average, while a noisy but unbiased measure does. For some decisions, however, pure bias is a relatively innocuous type of measurement error. The presence of bias in measures need not always be an obstacle to implementing NFI-based performance measurement and prediction, and accepting some bias in return for a reduction in random error can be worthwhile when such tradeoffs are possible.[7] Situations in which bias is relatively harmless are identified separately below for performance evaluation and prediction.
Performance Evaluation

Performance evaluations and rewards are often based not on the observed NFI measure itself but on the change in the measure or a comparison between the measure and a target (Murphy 2000). These practices significantly mitigate the effect of bias. For example, suppose that a customer satisfaction measure has an upward bias of two points on a 10-point scale. Perhaps the questions are designed to make customers reflect on positive more than negative experiences. When the true value changes from four to six, the reported value changes from six to eight. In this case, the change of two units is an unbiased measure although the absolute levels are biased. Users do not need to know the amount of the bias; they only need assurance that it is stable over time.[8]

Similarly, when performance evaluation is based on comparing a measure to a target, the comparison can do much to eliminate the effects of bias. If the target is based on past performance, then the comparison to target is very similar to a change measure. If the target is based on external benchmarks instead, and decision makers have some awareness of the existence and magnitude of the bias, they can set the target accordingly. The more upward bias there is likely to be in the measure, the more the target performance should exceed an unbiased external benchmark.
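A minimal numeric sketch of the point above, using made-up satisfaction scores: a constant two-point upward bias drops out entirely when evaluation is based on the change in the measure or on a bias-adjusted target.

```python
import numpy as np

true_scores = np.array([4.0, 6.0, 5.5, 7.0])    # hypothetical true satisfaction by period
reported_scores = true_scores + 2.0             # measure with a stable +2 upward bias

# Change-based evaluation: the bias cancels, so changes are measured without error.
print(np.diff(true_scores))       # [ 2.  -0.5  1.5]
print(np.diff(reported_scores))   # [ 2.  -0.5  1.5]  -- identical

# Target-based evaluation: raise an external benchmark of 6.0 by the known bias.
benchmark, bias = 6.0, 2.0
meets_target_true = true_scores >= benchmark
meets_target_reported = reported_scores >= benchmark + bias
print(meets_target_true, meets_target_reported)   # same pass/fail pattern
```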
Providing Predictor Variables

When NFI is used to provide predictor variables, bias can be harmless as long as the predictive models or individuals' subjective prediction strategies were developed using previous data with the same bias. The bias will be captured in the intercept of the model, and both the coefficient representing the effect of NFI and the prediction itself will be unbiased. In such settings, the profit increase associated with, for example, an increase in reported customer satisfaction from six to eight in the past provides a reasonable basis for estimating the profit increase associated with an increase in reported customer satisfaction from six to eight in the future, even though the measure itself is overstated.
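The regression sketch below (simulated data, chosen only to illustrate the mechanics) shows a predictor with a constant upward bias of two points: the intercept absorbs the bias, while the slope and the predictions made from reported values are essentially unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
true_sat = rng.uniform(3, 8, 500)                   # hypothetical true satisfaction
profit = 1.5 * true_sat + rng.normal(0, 0.5, 500)   # profit driven by true satisfaction
reported_sat = true_sat + 2.0                       # reported measure has a stable +2 bias

# Fit profit on the biased reported measure (slope and intercept via least squares).
slope, intercept = np.polyfit(reported_sat, profit, 1)
print(slope)       # ~ 1.5  -- same slope as with the true measure
print(intercept)   # ~ -3.0 -- intercept shifts by -slope * bias, absorbing the bias

# A prediction for a unit reporting satisfaction of 8 (true value 6) is still unbiased:
print(slope * 8 + intercept)   # ~ 9.0, i.e., about 1.5 * 6
```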
Reduce Error by Aggregating Multiple Measures

It is not always intuitively obvious that total measurement error can be reduced by using more measures that contain error, but this is often a cost-effective way of decreasing random error. Such reductions can be worthwhile for organizations, because random error has significantly negative effects on decision making. Random error in predictor variables reduces the accuracy of predictions, and random error in employees' evaluations reduces the motivation that a given level of monetary incentive provides to risk-averse employees. Random error in performance measures can also reduce the ability of an organization to attract talented but risk-averse individuals, because it reduces their certainty that they will be rewarded for the exercise of their talents.

[7] For example, in sample-based measures like customer satisfaction surveys or defect counts, sample composition and sample size choices can determine the magnitudes of noise and bias and the terms of tradeoff between the two.

[8] When, as is often the case, a NF measure captures the underlying construct with random error as well as bias, then a change in the reported measure or a comparison to target will still contain error. But it is important to realize in such cases that the random error, not the bias, is the problem that must be addressed.
The basic principle of reducing random error by averaging multiple observations is intuitive in some instances. For example, average divisional earnings over several periods are likely to be regarded as being a more reliable measure of a divisional manager's talents and efforts than a single period's earnings.

Research also provides examples of more sophisticated combinations of measures via statistical methods in prediction settings. Rajgopal et al. (2002) use factor analysis to reduce a large number of specific NF information items such as new product introductions and managerial team-building actions. Taken singly, the specific actions are too numerous, diverse, and ambiguous in their implications to be easily used as predictors. But combined into two factors, they explain a substantial portion of the cross-sectional variation in e-commerce firms' stock market returns post-IPO, even after controlling for reported earnings and analysts' forecasts of future earnings and revenues. Similarly, Demers and Lev (2001) and Dikolli and Sedatole (2007) use factor analysis to combine multiple measures of website performance into two factors that have significant explanatory power for stock returns and future profitability of e-commerce firms. Data-reduction techniques of this kind offer considerable promise for reducing random error in NFI measurement.
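A simple sketch of the averaging logic that underlies these data-reduction techniques (hypothetical numbers; factor analysis is a weighted, more sophisticated version of the same idea): combining several noisy indicators of the same construct cuts the random error roughly in proportion to the square root of the number of indicators.

```python
import numpy as np

rng = np.random.default_rng(4)
n_obs, n_indicators = 10_000, 9
true_construct = rng.normal(0, 1, n_obs)

# Nine noisy indicators of the same underlying construct (independent errors).
indicators = true_construct[:, None] + rng.normal(0, 1.2, (n_obs, n_indicators))

single = indicators[:, 0]            # use one indicator alone
combined = indicators.mean(axis=1)   # simple average of all nine

print(np.std(single - true_construct))    # ~ 1.2
print(np.std(combined - true_construct))  # ~ 0.4  (= 1.2 / sqrt(9))
```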
Mitigate Error Effects: Offsetting Deliberate Bias

Bias is particularly troubling when it is the result of deliberate actions (window-dressing or gaming) on the part of individuals whose performance is being measured. In such cases it may not be stable across individuals or time, as motivations to introduce bias will vary across individuals and across time. A well-designed portfolio of measures for performance evaluation and reward can reduce intentional bias, however, by including measures on which window-dressing has countervailing effects. That is, actions taken to window-dress one measure and increase the employee's reward will tend to make another measure look worse and thereby decrease the employee's reward, thus reducing the overall incentive to window-dress. (See Feltham and Xie 1994 and Datar et al. 2001 for analyses of the construction of multiple-measure evaluation and reward systems.)

For example, consider rewarding managers for a particular NF measure, high inventory turnover, as a measure of effective inventory management. Managers may game the measure, resulting in high values of reported turnover but not effective inventory management, as too-low inventories lead to stockouts, poor customer service, and even reduction in product innovation, because of managers' uncertainty about whether radically innovative products will move quickly enough to keep the inventory turnover measure high (see examples in Melnyk et al. 2005 and Melnyk et al. forthcoming). Thus gaming of the inventory management measure has negative effects if the measure is used alone. But when it is used as a component in a portfolio of measures that includes GAAP income, managers' incentives to game the measure are mitigated. Understocking will improve inventory turnover but reduce income by reducing sales. Conversely, the use of inventory turnover as a measure mitigates the tendency of managers to bias reported income upward through overproduction (see Roychowdhury 2006 for evidence of overproduction as an earnings management technique).[9] Overproduction will increase current GAAP absorption-costing income, but will make inventory turnover worse. The combination of measures limits gaming and motivates decisions more congruent with organizational goals.[10]
WEIGHTING MULTIPLE MEASURES

Just as the problem of measurement does not reduce to a problem of finding the single most accurate measure for a given construct, so the problem of weighting does not reduce to a question of which measures are more important in general. In prediction models using NFI, error in the measures as well as the predictive importance of the underlying constructs can influence weight estimation. In performance evaluation and reward systems, optimal weights on F and NF measures depend not only on the underlying constructs' strategic importance or contribution to firm value, but also on a complex array of factors such as contract length and individuals' time horizon (Dikolli 2001; Dutta and Reichelstein 2003), product architecture in supply chains (Baiman et al. 2001), whether the incentive contract is implicit or explicit (Budde 2007), how tasks are bundled together (for example, whether one employee is responsible for sales only and another for service only, or each is responsible for a mix of sales and service; Hughes et al. 2005), and whether the incentive compensation is simply paid out based on measured performance or a bonus pool is determined first and then divided among employees (Rajan and Reichelstein 2006).

This section focuses on how measurement properties, a key concern of accountants, affect the weighting of NFI and related financial measures. Measurement-property effects are not always intuitively obvious. For example, Hemmer (1996) shows analytically that adding a measure of customer satisfaction to an incentive system based on accounting earnings can either increase or decrease the optimal weight on earnings, depending on whether the customer satisfaction measure is the mean level of satisfaction or the number of customers that exceed a certain satisfaction threshold.

Because appropriate weighting is not always intuitive and weighting decisions are often made subjectively, an important element of effective use of NFI is avoiding common biases in subjective decision making. Hence the recommendations below include notice of potential biases in subjective weighting and techniques for mitigating these biases, insofar as recent research has addressed these issues. Weighting problems and solutions differ considerably between performance evaluation and prediction settings, and thus the two settings are presented separately.
Weighting in Performance Evaluation

Weight High-Random-Error Measures Cautiously, Even When Important

The effect of measurement properties on incentive weighting that has received the most attention in accounting research is the negative effect of random measurement error[11] on optimal weights when a measure is used as a basis for rewarding risk-averse employees (Lambert 2007). The larger the error typically is in a measure of employees' efforts and talents, the more uncertain and less motivating is the compensation based on the measure, and the less valuable it is for an organization to weight the measure heavily; that is, to pay large amounts for changes in the level of the unreliable measure.

[9] An alternative solution for the overproduction problem under absorption costing is of course to use variable costing instead, but this is not the best solution in all settings. For example, the public nature of GAAP income can mean that important rewards are attached to absorption-costing income (e.g., reputation, career concerns) even if variable costing is used internally. Or, if incentive compensation and its basis must be made public, the organization may prefer to use GAAP earnings as the basis because it is already public information. In such settings, adding NFI to the evaluation system may be preferred.

[10] Datar et al. (2001) point out that reducing gaming by creating combinations of measures is not always an equally feasible solution: it is more feasible when the number of different activities performed by the individual being evaluated is not large compared to the number of performance measures.

[11] In this context, random measurement error means error in the measure as a measure of employee actions, not a measure of quality, innovation, etc. as such.
Low incentive weights on strategically important NFI measures such as innovation or customer satisfaction of course reduce the motivational value of the NFI. If there is no way of mitigating the risk created by using unreliable high-error measures, then low weighting is often the lesser of two evils. However, as the following recommendations indicate, there are often ways of mitigating this risk by taking advantage of the portfolio properties of sets of F and NF measures. To the extent that irreducible random error remains, there are potential gains from avoiding common decision errors in dealing with this error, as described in the last set of recommendations for performance-evaluation uses.
Use Measures with Negatively Correlated Errors to Allow Higher Weights

Random error in both accounting and NFI as measures of employee efforts and talents is often caused by external shocks such as macroeconomic changes. In a well-constructed portfolio of measures, the effects of these shocks on some measures will be negatively correlated with their effects on other measures, resulting in a lower error in the performance evaluation based on combining all the measures.

For example, a plant manager might be responsible for both unit cost and quality of the product but might be unable to predict or significantly influence production volume (cf. managers' responsibility for unit costs, quality, and customer service, all of which are influenced by uncontrollable volume fluctuations, in the disk-drive manufacturer studied by Davila and Wouters 2005). In this case, an unexpected upward spike in volume adds positive error to the cost measure (unit costs go down, but not because of the manager's efforts or talents) and adds negative error to the quality measure (unexpected volume stresses the production system and increases defects, but not because of the manager's lack of effort or talent). The reverse happens when production volume spikes downward. In this case, if only cost or only quality was included as a basis for the manager's performance evaluation, the volume shocks could add considerably to errors in evaluation. But if the manager is evaluated on a weighted sum of the cost and quality measures, the positive and negative errors tend to offset, and the error in the overall evaluation is relatively small. In consequence, substantial errors in individual measures do not result in equally substantial errors in the overall evaluation on which compensation is based. Both measures can therefore be weighted relatively heavily (that is, significant monetary incentives can be offered for performance on both dimensions) without imposing excessively costly risk on the manager (example from Krishnan et al. 2005, based on Feltham and Xie 1994).
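The portfolio effect in this example can be seen in a short simulation (illustrative parameter values only, not the model in the cited papers): a volume shock that pushes the cost measure's error in one direction pushes the quality measure's error in the other, so the error in an equally weighted combination of the two is much smaller than the error in either measure alone.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
volume_shock = rng.normal(0, 1, n)   # uncontrollable volume fluctuation

# Measurement errors relative to the manager's true contribution (hypothetical scaling);
# the opposite signs encode the opposite-direction effects of a volume spike.
cost_error = -1.0 * volume_shock + rng.normal(0, 0.3, n)     # volume up: cost measure looks better
quality_error = +1.0 * volume_shock + rng.normal(0, 0.3, n)  # volume up: quality measure looks worse

combined_error = 0.5 * cost_error + 0.5 * quality_error      # equally weighted evaluation

print(np.std(cost_error))      # ~ 1.04
print(np.std(quality_error))   # ~ 1.04
print(np.std(combined_error))  # ~ 0.21 -- the volume shocks largely cancel in the combination
```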
Leverage the Effects of Reducing Error in One Measure to Allow Higher Weights on Other Measures

Reducing the random error in an important NF measure to allow it to be weighted heavily is often costly; for example, customer satisfaction measures can be improved through more sophisticated survey design and the collection of larger samples. Cost-benefit analyses of such actions should not neglect the fact that improving one measure can improve overall performance evaluation and motivation by allowing heavier weights on other measures as well. Further development of the example from the previous recommendation illustrates this potential benefit.

Suppose that a measure of product quality contributes significant random error to the overall performance evaluation of the plant manager in the example, and the error is not fully offset by negatively correlated error in other measures. In this case, the manager's compensation cannot depend too heavily on quality because the measure is too unreliable. In consequence, compensation also cannot depend too heavily on cost or other measures (that is, the manager's pay cannot be very performance-based) because a high weight on cost and a low weight on quality will skew the manager's efforts suboptimally toward cost. In such a case, lowering the measurement error in quality will allow not only quality but also cost to be weighted more heavily, and pay can be more strongly performance-based without skewing employees' allocation of attention and effort (Feltham and Xie 1994). Similarly, improvements in accounting that reduce the measurement error in cost (e.g., a well-designed activity-based costing system) allow compensation to depend more heavily not only on cost but also on quality, thus increasing employee motivation for both objectives.
Avoid Common Subjective Weighting Errors

Because weights on performance measures are often determined subjectively in incentive-compensation systems, avoiding common subjective decision errors can increase gains from using NFI. For example, Krishnan et al. (2005) provide experimental evidence that nonexpert compensation system designers tend to incorporate negative error correlation effects into their weighting choices (Recommendation 2), but are less likely to realize that decreasing the independent random error in one measure means that the weights not only on that measure but also on other measures should be increased (Recommendation 3).

The basic principle that compensation for risk-averse employees should not depend heavily on high-error performance evaluations is often intuitively clear. But in some instances it is not, resulting in significant obstacles to successful implementation of portfolios of F and NF measures. One recurring problem appears to be weighting a NF measure heavily based on its strategic importance without discounting for its error. The overweighting can lead to unsatisfactory results (large compensation changes unconnected with changes in employee efforts) and potentially an overreaction against the measure. Malina and Selto (2001), in a field study of a balanced scorecard adoption, find that initially heavy weights on "learning and growth" and "corporate citizenship" measures were sharply reduced later because of the unreliability of the measures. Ittner, Larcker, and Meyer (2003) describe another large firm in which significant initial weights on NF measures were rapidly reduced, perhaps in part because of reliability concerns, and arguably representing too extreme a reaction, as the NF measures were given zero or near-zero weights in bonus determination, even though they were significantly associated with future financial performance and could be influenced by managers' actions.

An opposite problem is that evaluators can be sensitive to random error in performance measures but respond by deliberately increasing the incentive weight in response to higher random error. They believe that a larger amount of risky pay (instead of a fixed risk premium) is a good way to compensate employees for risk. Krishnan et al. (2005) document this belief experimentally, noting that the practitioner literature sometimes expresses similar beliefs (e.g., Bloom 1999). Arguments for a larger amount of risky pay as an appropriate way of compensating for risk can sometimes be found in the justifications offered for high levels of risky executive compensation (e.g., justifications mentioned by Bettis et al. 2008). But in general, making employees' compensation more dependent on a measure when it is more unreliable as an indicator of their efforts and talents seems to be an unpromising basis for incentive compensation.
Another frequently observed problem in subjective weighting arises from comparative evaluation of multiple managers. Lipe and Salterio (2000), in a much-replicated experiment, find that when managers of two divisions are being evaluated subjectively, based on balanced scorecards tailored to the strategy of each division, evaluators tend to put more weight on the measures shared by both divisions than on those unique to each division, although the unique measures are meant to be equally important to divisional strategy.[12] Banker et al. (2004), Libby et al. (2004), and Dilla and Steinbart (2005) replicate this finding and identify ways of reducing (though not usually eliminating) the apparent overweighting of common measures. Providing additional training on the balanced scorecard, emphasizing the strategic relevance and reliability of the unique measures, or requiring that the evaluator explicitly justify the evaluation all increase relative weights on the measures unique to individual divisions.[13]
Weighting in Predictions

The accuracy of predictions based on portfolios of F and NF measures depends in part on the accuracy of the weights placed on individual predictors. These weights (coefficients in predictive models) can also play an important role in resource allocation. For example, a higher weight on NF measure 1 than on NF measure 2 in a model predicting profits suggests that, if the cost of improving performance on either measure is the same, more resources should be devoted to improving 1 than to improving 2.

As noted in the previous section, stable bias is relatively innocuous when estimating and using weights on multiple predictors: weights in predictive models are unaffected by stable bias. Random error can be more problematic, but the nature and magnitude of the problems depend on the predictive decisions being made and on characteristics of the random error.

Consider a balanced scorecard, in which learning and growth is expected to lead to improvements in internal business processes, which lead in turn to customer-measure improvements and higher financial performance. Internal business process measures can be used both to test the effects of learning and growth initiatives (e.g., is a higher level of measured employee skills associated with higher product quality?) and to predict customer and/or financial measures. Recommendations for matching NFI characteristics with decisions are made in the context of this example.
Match Error Reduction Efforts to Prediction-Model Characteristics and Uses

The quality measure plays a dual role in the example given above. It is predicted by learning and growth measures in one model, and it is a predictor of customer and/or financial measures in another model. In some cases, a good estimate of one of these two predictive models may have higher priority than a good estimate of the other predictive model. There may, for example, be more ex ante uncertainty about the strategy component represented by one of these models than the other, or there may be more important managerial decisions dependent on the weights in one model than in the other.

Random error in the quality measure has different effects on determining the weights in these two models, and so reduction in random error may matter more or less depending on the relative importance of the two models. When quality is the dependent variable, predicted by learning and growth, random error in the quality measure is not an obstacle to estimating unbiased weights (coefficients) on these indicators. But when quality is the independent variable, predicting customer or financial measures, random error in the quality measure can be more problematic. Random error in predictors creates misweighting if the error is correlated with the reported predictor; for example, if instances of particularly high reported quality are likely to be overstated and particularly low instances are likely to be understated. This kind of error will result in biased weights on NFI in predictive models, not only with OLS regression but also with subjective judgments based on high-low comparisons: these analyses will tend to underestimate the actual effects of NF performance on financial performance.

[12] Arya et al. (2005) point out that common measures can be more informative because they allow evaluators to remove common measurement error via relative performance evaluation. Thus underweighting unique measures can be appropriate because unique measures in effect contain more error. However, it is also possible that when divisions have radically different strategies, the error in their common measures may not be common: factors other than managers' actions may affect the same measure differently in the different divisions.

[13] Whether the increased weights on unique measures in these studies are better weights is unclear; but it is not unreasonable to suppose that modest positive weights on the unique measures are better than the zero weights that appear in some experimental settings.
Suppose, for example, that managers compare the product-quality levels of business units with highest and lowest profits, and find that a difference of five points on a 10-point quality scale is associated with a $20 million profit difference in business units of similar size. It appears that, at least as a rough estimate, a one-point gain in quality is associated with a $4 million difference in profit. But suppose that measurement error in quality means that the real difference between the relevant observations of quality is only four points, perhaps because the extreme observations are the result of outright clerical errors or faults in the quality-measure construction. If this is the case, then a one-point gain in actual quality is associated with a $5 million difference in earnings rather than a $4 million difference. The measurement error in quality has downwardly biased the estimate of the effect of actual quality on earnings, possibly leading to mistaken judgments about the value of initiatives to promote quality.[14]

[14] The effect is intuitively clear for a pair of high-low observations, but it also occurs with OLS regression when the measurement errors and the reported values are correlated (i.e., high reported observations are likely to contain large positive errors). Random error in a NF predictor does not bias the coefficient, however, if the magnitude of the error is correlated with the true level of the underlying construct rather than with the reported measure (Wooldridge 2006).
Moreover, random error in one of the NF predictors included in a model with multiple predictors (e.g., other NFI and past earnings) not only biases the estimate of the noisy predictor's coefficient; it also can bias the coefficient estimates of other measures in the model, in unknown directions and amounts, unless the other predictors in the model are uncorrelated with the true value of the noisy measure (Greene 2000). Because it is quite likely that NF predictors are correlated (consider innovation, quality, and customer satisfaction, for example, as predictors of financial performance), this can be a significant problem in identifying weights for predictive models.

Whether the weights in such a setting are too unreliable for use in important decisions, and whether resources should be devoted to random-error reduction, depends in part on the properties of the F and NF measures employed and in part on the intended uses of the predictive model. If the variation in the true value of the measure is large relative to the variance of the measurement error (for example, if the product quality in different observations used in the estimate is actually radically different), then coefficient bias will not be large (Wooldridge 2006). But if the variation in the true value of the measure is not large (for example, if real differences in product quality across observations are modest and random measurement error is large), then the coefficient bias in regression analyses can be substantial.
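A short errors-in-variables simulation (hypothetical parameter values) illustrates both points in this subsection: the estimated weight on a noisy quality measure is pulled toward zero by a factor of roughly Var(true)/(Var(true) + Var(error)), so the bias is small when true quality varies a lot across observations and large when it does not.

```python
import numpy as np

rng = np.random.default_rng(6)

def estimated_slope(true_sd, error_sd, beta=5.0, n=200_000):
    """Regress profit on a noisy quality measure and return the estimated slope."""
    quality = rng.normal(0, true_sd, n)               # true quality across business units
    profit = beta * quality + rng.normal(0, 1.0, n)   # $5M per true quality point (hypothetical)
    reported = quality + rng.normal(0, error_sd, n)   # reported quality contains random error
    slope, _ = np.polyfit(reported, profit, 1)
    return slope

# Large true variation relative to measurement error: little attenuation.
print(estimated_slope(true_sd=2.0, error_sd=0.5))   # ~ 5 * 4/(4 + 0.25), i.e., about 4.7

# Small true variation relative to measurement error: substantial attenuation.
print(estimated_slope(true_sd=0.5, error_sd=0.5))   # ~ 5 * 0.25/(0.25 + 0.25) = 2.5
```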
Mitigate Scale-Compatibility Biases on Subjective Predictions When Needed

When predictions are made subjectively instead of based on regression models, a variety of common judgment biases can affect weighting. Jackson (2008) calls attention to scale-compatibility bias in a study of the use of NFI by nonprofessional investors in screening investments. Consistent with prior psychological research, these investors tend to weight information more heavily when it is scaled in the same way as the screening criterion than when it is scaled differently.

The scaling differences in Jackson (2008) are relatively slight (ratings versus rankings). The wide variety of scales used in NFI (counts, ratios, seven-point scales, etc.) could exacerbate this problem: it could, for example, lead to underweighting of NF relative to F information in predicting financial performance. However, in Jackson's (2008) experiment, the scale-compatibility bias is eliminated when investors compare several firms at once rather than completing the screening evaluation of one firm before examining the next firm: the cognitive processes involved in simultaneous rather than sequential analysis counteract the judgment bias.
Mitigate Self-Serving Biases in Weighting Multiple NF and F Measures
Another potential problem with subjective weighting is self-serving bias. Because NFI can be interpreted and weighted in a variety of ways, individuals can easily use NFI to support self-serving judgments (for example, judging that their favored management initiative has a stronger effect on profit than a nonfavored initiative). Often at least some part of the bias is not evident to the individual suffering from it; the judgment is sincere and thus not easily altered by incentives for greater truthfulness.
For example, Tayler (2008) provides experimental evidence that managers using balanced scorecard information judge customer-value-creation initiatives with no financial value as more successful when they have chosen the initiatives, even though the evidence available (customer and financial measures) does not support this judgment, and the biased judgment generates no financial reward for them. This self-serving bias is reduced, however, when individuals are responsible for selecting the scorecard measures and the scorecard is explicitly framed as a causal model of performance (consistent with Kaplan and Norton 2001a), rather than simply as a balanced set of four perspectives on performance. It appears that the causal-model representation draws attention to the failure of the expected positive association between customer measures and financial measures, and responsibility for choosing the measures induces individuals to take more seriously the conclusions the measures suggest.
Especially When Complex Predictive Relations Are Likely, Supplement Subjective Weighting with Statistical Analysis
The relations of NF measures to each other and to financial performance often take complex forms. For example, Ittner and Larcker (1998) document strongly nonlinear effects of customer satisfaction on future revenues. Nagar and Rajan (2005), Dikolli and Sedatole (2007), and Chen (2007) find significant interactions among F and NF measures in predictive models of individual-firm performance: that is, the magnitude, or even the sign, of a NF measure's effect on future financial performance depends on the level of another measure. Nagar and Rajan (2005) find that a path model including both direct and indirect effects of NFI provides a different and stronger explanation of financial performance in retail banks than a standard multiple regression.

Subjective weighting of predictors tends to be less accurate for nonlinear, interaction, and indirect relations than for linear additive relations (Karelaia and Hogarth 2008; Diehl and Sterman 1995). When it seems likely, based on managers' knowledge of causal processes in the firm, that a good predictive model will include nonlinearities, interactions, and mediated indirect relations, and when accurate weighting in predictive models is important to the organization, it may be time to call in the statisticians rather than rely on subjective estimation.
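As a hedged illustration of what such an analysis might look like (the data, variable names, and functional form below are simulated assumptions, and statsmodels simply stands in for whatever estimation tools the organization's analysts prefer), a flexible specification can be compared directly against a linear additive one:

```python
# Hypothetical sketch (simulated data, assumed coefficients): when the effect of a
# NF measure depends on the level of another measure, a linear additive model can
# average the effect away, while a specification with an interaction recovers it.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 2_000
df = pd.DataFrame({
    "satisfaction": rng.uniform(1, 7, size=n),   # e.g., a seven-point survey scale
    "retention": rng.uniform(0.5, 1.0, size=n),  # customer retention rate
})
# Assumed process: satisfaction raises future revenue when retention is high
# and lowers it when retention is low (its sign depends on the other measure).
df["future_revenue"] = (
    6 * df["satisfaction"] * (df["retention"] - 0.75)
    + 10 * df["retention"]
    + rng.normal(0, 1, size=n)
)

additive = smf.ols("future_revenue ~ satisfaction + retention", data=df).fit()
interacted = smf.ols("future_revenue ~ satisfaction * retention", data=df).fit()

# The additive model's satisfaction weight is close to zero; the interaction
# model shows a satisfaction effect that switches sign with retention.
print(additive.params[["satisfaction", "retention"]])
print(interacted.params[["satisfaction", "retention", "satisfaction:retention"]])
print(f"Additive R-squared: {additive.rsquared:.3f}  "
      f"Interaction R-squared: {interacted.rsquared:.3f}")
```

In this simulated setting the additive model assigns satisfaction a weight near zero even though satisfaction matters a great deal, with a sign that depends on retention; that is exactly the kind of structure that subjective linear weighting tends to miss.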
CONCLUSION
Measurement and weighting of NFI are challenging problems, and the experience summarized in recent research does not provide complete solutions. It does, however, identify important features of potential solutions. First, aiming at the highest possible accuracy in each measure is often not the most cost-effective approach to measurement. When multiple measures (F and NF) are used together, the portfolio characteristics of the measures (the way they offset random error and bias in each other) can offer important opportunities for effective use of imperfect measures.
Second, NF and F measures are not "accurate" or "inaccurate" as such: they are accurate with respect to particular decision requirements. The type, magnitude, and effect of measurement errors vary, depending on the decisions for which the measures are used. The fact that a particular NF measure is useful in predicting stock returns does not necessarily make it equally useful in managing the firm or auditing its financial statements, and vice versa.

Finally, weights on NF and F measures, both for performance evaluation and for prediction, depend on the error properties of the whole portfolio of measures as well as on the relevance or importance of the measure to organizational objectives. Optimal weighting is a particularly complex task, vulnerable both to statistical estimation problems and to subjective judgment biases. Research has engaged frequently with these questions in recent years, but more remains to be done.
REFERENCES
Amir, E., and B. Lev. 1996. Value-relevance of nonfinancial information: The wireless communications industry. Journal of Accounting and Economics 22: 3–30.
Andon, P., J. Baxter, and W. F. Chua. 2007. Accounting change as relational drifting: A field study of experiments with performance measurement. Management Accounting Research 18 (2): 273–308.
Arya, A., J. Glover, B. Mittendorf, and L. Ye. 2005. On the use of customized versus standardized performance measures. Journal of Management Accounting Research 17: 7–21.
Baiman, S., P. E. Fischer, and M. V. Rajan. 2001. Performance measurement and design in supply chains. Management Science 47 (1): 173–188.
Banker, R. D., G. Potter, and S. Srinivasan. 2000. An empirical investigation of an incentive plan that includes nonfinancial measures. The Accounting Review 75 (1): 65–92.
Banker, R. D., M. Chang, and M. J. Pizzini. 2004. The balanced scorecard: Judgmental effects of performance measures linked to strategy. The Accounting Review 79 (1): 1–23.
Bell, T. B., M. Peecher, and I. Solomon. 2002. The 21st Century Public-Company Audit: Conceptual Elements of KPMG's Global Audit Methodology. Montvale, NJ: KPMG.
Bettis, C., J. Bizjak, J. Coles, and S. Kalpathy. 2008. Equity grants with performance-based vesting conditions. Working paper, SSRN.
Bloom, M. 1999. The art and context of the deal: A balanced view of executive incentives. Compensation and Benefits Review 31 (1): 25–31.
Brazel, J. F., K. L. Jones, and M. Zimbelman. 2007. Using nonfinancial measures to assess fraud risk. Working paper, North Carolina State University.
Budde, J. 2007. Performance measure congruity and the balanced scorecard. Journal of Accounting Research 45 (3): 515–539.
Campbell, D. 2008. Nonfinancial performance measures and promotion-based incentives. Journal of Accounting Research 46 (2): 297–332.
Cavalluzzo, K., and C. D. Ittner. 2004. Implementing performance measurement innovations: Evidence from government. Accounting, Organizations and Society 29 (3–4): 243–267.
Chandra, U., A. Procassini, and G. Waymire. 1999. The use of trade association disclosures by investors and analysts: Evidence from the semiconductor industry. Contemporary Accounting Research 16: 643–670.
Chen, C. X. 2007. Relevance of customer satisfaction measures in a setting with multiple customer groups: Evidence from a health insurance company. Working paper, University of Illinois.
Chenhall, R. 2005. Integrative strategic performance measurement systems, strategic alignment of manufacturing, learning and strategic outcomes: An exploratory study. Accounting, Organizations and Society 30 (5): 395–422.
Curtis, E., and S. Turley. 2007. The business risk audit: A longitudinal case study of an audit engagement. Accounting, Organizations and Society 32 (4–5): 439–462.
Daniel, K., and S. Titman. 2006. Market reactions to tangible and intangible information. The Journal of Finance 61 (4): 1605–1643.
Datar, S., S. C. Kulp, and R. A. Lambert. 2001. Balancing performance measures. Journal of Accounting Research 39 (1): 75–92.
Davila, A., and M. Wouters. 2005. Managing budget emphasis through the explicit design of conditional budgetary slack. Accounting, Organizations and Society 30 (7–8): 587–608.
Deloitte. 2007. In the Dark II: What Many Boards and Executives Still Don't Know About the Health of Their Businesses. New York, NY: Deloitte Touche Tohmatsu.
Demers, E., and B. Lev. 2001. A rude awakening: Internet shakeout in 2000. Review of Accounting Studies 6 (2–3): 331–359.
Dempsey, S., J. D. Gatti, D. J. Grinnel, and W. Cats-Baril. 1997. The use of strategic performance variables as leading indicators in financial analysts' forecasts. Journal of Financial Statement Analysis 2 (4): 61–79.
Demski, J. S., and G. A. Feltham. 1976. Cost Determination: A Conceptual Approach. Ames, IA: Iowa State University Press.
Diehl, E., and J. Sterman. 1995. Effects of feedback complexity on dynamic decision-making. Organizational Behavior and Human Decision Processes 62 (2): 198–215.
Dikolli, S. S. 2001. Agent employment horizons and contracting demand for forward-looking performance measures. Journal of Accounting Research 39 (3): 481–494.
Dikolli, S. S., and K. D. Sedatole. 2007. Improvements in the information content of non-financial forward-looking performance measures: A taxonomy and empirical application. Journal of Management Accounting Research 19: 71–105.
Dilla, W. N., and P. J. Steinbart. 2005. Relative weighting of common and unique balanced scorecard measures by knowledgeable decision makers. Behavioral Research in Accounting 17: 43–53.
Dutta, S., and S. Reichelstein. 2003. Leading indicator variables, performance measurement, and long-term versus short-term contracts. Journal of Accounting Research 41 (5): 837–866.
Dye, R. 2004. Strategy selection and performance measurement choice when profit drivers are uncertain. Management Science 50 (12): 1624–1638.
Eccles, R., R. Herz, E. Keegan, and D. M. H. Phillips. 2001. The Value Reporting Revolution. New York, NY: Wiley.
Farrell, A. M., K. Kadous, and K. L. Towry. 2008. Contracting on contemporaneous vs. forward-looking measures: An experimental investigation. Contemporary Accounting Research 25 (3): 773–802.
Financial Accounting Standards Board (FASB). 2001. Improving Business Reporting: Insights into Enhancing Voluntary Disclosures. Steering Committee Report, Business Reporting Research Project. Norwalk, CT: FASB.
Feltham, G., and J. Xie. 1994. Performance measure congruity and diversity in multi-task principal/agent settings. The Accounting Review 69: 429–453.
Francis, J., K. Schipper, and L. Vincent. 2003. The relative and incremental explanatory power of earnings and alternative (to earnings) performance measures for returns. Contemporary Accounting Research 20 (1): 121–164.
Gibbs, M., K. A. Merchant, W. A. Van der Stede, and M. E. Vargus. 2004. Determinants and effects of subjectivity in incentives. The Accounting Review 79 (2): 409–436.
Greene, W. H. 2000. Econometric Analysis. 4th edition. Upper Saddle River, NJ: Prentice Hall.
Hansen, S. C., D. T. Otley, and W. A. Van der Stede. 2003. Practice developments in budgeting: An overview and research perspective. Journal of Management Accounting Research 15: 95–116.
HassabElnaby, H. R., A. A. Said, and B. Wier. 2005. The retention of nonfinancial performance measures in compensation contracts. Journal of Management Accounting Research 17: 23–43.
Hemmer, T. 1996. On the design and choice of modern management accounting measures. Journal of Management Accounting Research 8: 87–116.
Hoque, Z., and W. James. 2000. Linking the balanced scorecard measures to size and market factors: Impact on organizational performance. Journal of Management Accounting Research 12: 1–17.
Hughes, J. S., L. Zhang, and J. Xie. 2005. Production externalities, congruity of aggregate signals, and optimal task assignments. Contemporary Accounting Research 22 (2): 393–408.
Hughes, K. E. 2000. The value relevance of nonfinancial measures of air pollution in the electric utility industry. The Accounting Review 75 (2): 209–228.
Ittner, C. D., and D. F. Larcker. 1998. Are nonfinancial measures leading indicators of financial performance? An analysis of customer satisfaction. Journal of Accounting Research 36 (Supplement): 1–36.
Ittner, C. D., and D. F. Larcker. 2003. Coming up short on nonfinancial performance measurement. Harvard Business Review 81 (11): 88–95.
Ittner, C. D., D. F. Larcker, and M. W. Meyer. 2003. Subjectivity and the weighting of performance measures: Evidence from a balanced scorecard. The Accounting Review 78 (3): 725–758.
Ittner, C. D., D. F. Larcker, and T. Randall. 2003. Performance implications of strategic performance measurement in financial services firms. Accounting, Organizations and Society 28 (7–8): 715–741.
Jackson, K. L. 2008. Debiasing scale compatibility effects when investors use non-financial measures to screen potential investments. Contemporary Accounting Research 25 (3): 803–826.
Kaplan, R. S., and D. Norton. 2001a. The Strategy-Focused Organization. Boston, MA: Harvard Business School Press.
Kaplan, R. S., and D. Norton. 2001b. Transforming the balanced scorecard from performance measurement to strategic management: Part I. Accounting Horizons 15 (1): 87–104.
Kaplan, R. S., and D. Norton. 2001c. Transforming the balanced scorecard from performance measurement to strategic management: Part II. Accounting Horizons 15 (2): 147–161.
Kaplan, R. S. 2006. The competitive advantage of management accounting. Journal of Management Accounting Research 18: 127–135.
Kaplan, R. S., and D. Norton. 2008. Mastering the management system. Harvard Business Review 86 (1): 62–77.
Karelaia, N., and R. M. Hogarth. 2008. Determinants of linear judgment: A meta-analysis of lens-model studies. Psychological Bulletin 134 (3): 404–426.
Kasurinen, T. 2002. Exploring management accounting change: The case of balanced scorecard implementation. Management Accounting Research 13 (3): 323–343.
Knechel, W. R. 2007. The business risk audit: Origins, obstacles and opportunities. Accounting, Organizations and Society 32 (4–5): 383–408.
Krishnan, R., J. Luft, and M. D. Shields. 2005. Effects of accounting-method choices on subjective performance-measure weighting: Experimental evidence on precision and error covariance. The Accounting Review 80 (4): 1163–1192.
Lambert, R. A. 1998. Customer satisfaction and future financial performance: Discussion of "Are nonfinancial measures leading indicators of financial performance? An analysis of customer satisfaction." Journal of Accounting Research 36 (Supplement): 37–46.
Lambert, R. A. 2007. Agency theory and management accounting. In Handbook of Management Accounting Research, Vol. 1, edited by C. Chapman, A. Hopwood, and M. Shields. Oxford, U.K.: Elsevier.
Libby, T., S. E. Salterio, and A. Webb. 2004. The balanced scorecard: The effects of assurance and process accountability on managerial judgment. The Accounting Review 79 (4): 1075–1095.
Lillis, A. 2002. Managing multiple dimensions of manufacturing performance: An exploratory study. Accounting, Organizations and Society 27 (6): 497–529.
Lipe, M. G., and S. E. Salterio. 2000. The balanced scorecard: Judgmental effects of common and unique performance measures. The Accounting Review 75 (3): 283–298.
Maines, L., E. Bartov, P. M. Fairfield, D. E. Hirst, T. E. Iannaconi, R. Mallett, C. M. Schrand, D. J. Skinner, and L. Vincent. 2002. Recommendations on disclosure of nonfinancial performance measures. Accounting Horizons 16 (4): 353–362.
Malina, M. A., and F. H. Selto. 2001. Communicating and controlling strategy: An empirical study of the effectiveness of the balanced scorecard. Journal of Management Accounting Research 13: 47–90.
Malina, M. A., and F. H. Selto. 2004. Choice and change of measures in performance measurement models. Management Accounting Research 15 (4): 441–469.
Melnyk, S. A., R. J. Calantone, J. Luft, D. M. Stewart, G. A. Zsidisin, J. Hanson, and L. A. Burns. 2005. An empirical investigation of the metrics alignment process. International Journal of Productivity and Performance Management 54 (5/6): 312–324.
Melnyk, S. A., D. L. Stewart, R. J. Calantone, and C. Speier. Forthcoming. Metrics and the Supply Chain: An Exploratory Study. Alexandria, VA: APICS E&R Foundation.
Moers, F. 2005. Discretion and bias in performance evaluation: The impact of diversity and subjectivity. Accounting, Organizations and Society 30 (1): 67–80.
Murphy, K. J. 2000. Performance standards in incentive contracts. Journal of Accounting and Economics 30 (3): 245–278.
Nagar, V., and M. V. Rajan. 2001. The revenue implications of financial and operational measures of quality. The Accounting Review 76 (4): 495–513.
Nagar, V., and M. V. Rajan. 2005. Measuring customer relationships: The case of the retail banking industry. Management Science 51 (6): 904–920.
Peecher, M., R. Schwartz, and I. Solomon. 2007. It's all about audit quality: Perspectives on strategic-systems auditing. Accounting, Organizations and Society 32: 463–485.
Prendergast, C. 1999. The provision of incentives within firms. Journal of Economic Literature 37 (1): 7–63.
Rajan, M. V., and S. Reichelstein. 2006. Subjective performance indicators and discretionary bonus pools. Journal of Accounting Research 44 (3): 585–618.
Rajgopal, S., M. Venkatachalam, and S. Kotha. 2002. Managerial actions, stock returns, and earnings: The case of business-to-business Internet firms. Journal of Accounting Research 40 (2): 529–556.
Rajgopal, S., T. Shevlin, and M. Venkatachalam. 2003. Does the stock market fully appreciate the implications of leading indicators for future earnings? Evidence from order backlog. Review of Accounting Studies 8 (4): 461–492.
Rajgopal, S., M. Venkatachalam, and S. Kotha. 2003. The value relevance of network advantages: The case of e-commerce firms. Journal of Accounting Research 41 (1): 135–162.
Roychowdhury, S. 2006. Earnings management through real activities manipulation. Journal of Accounting and Economics 42 (3): 335–370.
Said, A. A., H. R. HassabElnaby, and B. Wier. 2003. An empirical investigation of the performance consequences of nonfinancial measures. Journal of Management Accounting Research 15: 193–223.
Shevlin, T. 1996. The value-relevance of nonfinancial information: A discussion. Journal of Accounting and Economics 22 (1–3): 31–42.
Smith, R. E., and W. F. Wright. 2004. Determinants of customer loyalty and financial performance. Journal of Management Accounting Research 16: 183–206.
Tayler, W. 2008. The balanced scorecard as a strategy-evaluation tool: The effects of responsibility and causal-chain focus. Working paper, Emory University.
Trueman, B., M. H. F. Wong, and X.-J. Zhang. 2000. The eyeballs have it: Searching for the value in Internet stocks. Journal of Accounting Research 38 (Supplement): 137–162.
Upton, W. S. 2001. Business and Financial Reporting: Challenges from the New Economy. Norwalk, CT: FASB.
Van der Stede, W., C. W. Chow, and T. W. Lin. 2006. Strategy, choice of performance measures, and performance. Behavioral Research in Accounting 18: 185–206.
Webb, R. A. 2004. Managers' commitment to the goals contained in a strategic performance measurement system. Contemporary Accounting Research 21 (4): 925–958.
Wooldridge, J. M. 2006. Introductory Econometrics: A Modern Approach. 3rd edition. Mason, OH: Thomson South-Western.