Description
This paper uses recent experimental studies of financial accounting to illustrate our view of how such experiments
can be conducted successfully. Rather than provide an exhaustive review of the literature, we focus on how particular
examples illustrate successful use of experiments to determine how, when and (ultimately) why important features of
financial accounting settings influence behavior. We first describe how changes in views of market efficiency, reliance
on the experimentalist’s comparative advantage, new theories, and a focus on key institutional features have allowed
researchers to overcome the criticisms of earlier financial accounting experiments.
Experimental research in ?nancial accounting
Robert Libby *, Robert Bloom?eld, Mark W. Nelson
Johnson Graduate School of Management, 383 Sage Hall,
Cornell University, Ithaca NY 14853 6201, USA
Abstract
This paper uses recent experimental studies of ?nancial accounting to illustrate our view of how such experiments
can be conducted successfully. Rather than provide an exhaustive review of the literature, we focus on how particular
examples illustrate successful use of experiments to determine how, when and (ultimately) why important features of
?nancial accounting settings in?uence behavior. We ?rst describe how changes in views of market e?ciency, reliance
on the experimentalist’s comparative advantage, new theories, and a focus on key institutional features have allowed
researchers to overcome the criticisms of earlier ?nancial accounting experiments. We then describe how speci?c
streams of experimental ?nancial accounting research have addressed questions about ?nancial communication
between managers, auditors, information intermediaries, and investors, and indicate how future research can extend
those streams. We focus particularly on (1) how managers and auditors report information; (2) how users of ?nancial
information interpret those reports; (3) how individual decisions a?ect market behavior; and (4) how strategic inter-
actions between information reporters and users can a?ect market outcomes. Our examples include and integrate
experiments that fall into both the ‘‘behavioral’’ and ‘‘experimental economics’’ literatures in accounting. Finally, we
discuss how experiments can be designed to be both e?ective and e?cient. # 2002 Elsevier Science Ltd. All rights
reserved.
1. Introduction
Financial accounting research is a broad ?eld
that examines ?nancial communication between
managers, auditors, information intermediaries,
and investors, as well as the e?ects of regulatory
regimes on that process. Much of this literature
focuses on managers’ and auditors’ reporting deci-
sions and their relationships to analysts’ forecasts
and value estimates, investors’ trading decisions,
and resulting market prices. This clear focus on
judgment and decision making led to the large
number of experimental ?nancial accounting
studies published in major accounting journals in
the 1960s and 1970s.
Serious criticisms of this early research (e.g.
Gonedes &Dopuch, 1974) turned experimentalists’
focus away from ?nancial accounting issues in the
1980s and early 1990s. As discussed by Maines
(1995) and Berg, Dickhaut, and McCabe (1995),
major elements of these criticisms were: (1) the
irrelevance of individual behavior in market set-
tings, in which competitive forces will eliminate
individual ‘‘errors’’; (2) poor matching of research
methods to research questions; (3) the lack of
psychological or economic theory to predict e?ects
and specify the mechanisms through which they
occur; and (4) failure to capture relevant aspects
0361-3682/02/$ - see front matter # 2002 Elsevier Science Ltd. All rights reserved.
PI I : S0361- 3682( 01) 00011- 3
Accounting, Organizations and Society 27 (2002) 775–810
www.elsevier.com/locate/aos
* Corresponding author. Tel.: +1-607-255-3348; fax: +1-
607-254-4590.
E-mail address: [email protected] (R. Libby).
of the decisions of interest, in particular, decision
maker attributes and institutional features.
Beginning in the mid-1990s, there was a resur-
gence of experimental research addressing an even
broader spectrum of ?nancial accounting issues.
This paper presents our view of how this new lit-
erature has addressed prior criticisms, and how it
can continue to shed light on ?nancial accounting
questions. We argue that signi?cant evidence of
capital market ine?ciency has renewed interest in
how individuals make key accounting-related
decisions and how these decisions a?ect market
prices. Recent studies take advantage of the
experimentalist’s comparative advantage at disen-
tangling variables that are confounded in natural
settings and measuring intervening processes to
draw strong causal inferences. Theories combining
psychology and economics have allowed experi-
mentalists to specify more clearly the mechanisms
a?ecting individual and market behavior. Finally,
most of the new studies focus on issues of clear
relevance to ?nancial accounting, particularly the
e?ects of decision-maker knowledge and motiva-
tion, the complex information environment, reg-
ulation, and strategic interaction.
This paper is aimed primarily at those who plan
to conduct ?nancial accounting experiments, and
secondarily at other ?nancial accountants who are
interested in what can be learned from experi-
mental studies. Our primary goal is to use recent
experimental studies of ?nancial accounting to
illustrate our view of how such experiments can be
conducted successfully. The core of our view is
that successful ?nancial accounting experiments use
the comparative advantages of the experimental
approach to determine how, when and (ultimately)
why important features of ?nancial accounting set-
tings in?uence behavior. By elaborating on this
view, we hope to increase the impact of future
experiments and help the new literature avoid the
mistakes and fate of the earlier literature. We do
not provide an exhaustive review of the literature,
nor do we provide detailed critiques of particular
studies. Instead, we focus on how particular exam-
ples illustrate successful use of experiments to
address important ?nancial accounting issues. Our
examples include and integrate experiments that
fall into both the ‘‘behavioral’’ and ‘‘experimental
economics’’ literatures in accounting.
1
Although
these literatures evolved from di?erent traditions,
we see them as essentially similar — both use
experiments to shed light on ?nancial accounting
issues, and therefore, both present similar oppor-
tunities and challenges to researchers. Naturally,
our review is also deeply a?ected by our own bia-
ses and the ?nancial accounting issues that we
have been addressing in our own recent research.
In Section 2, we describe in more detail how
changes in views of market e?ciency, reliance on
the experimentalist’s comparative advantage, new
theories, and a focus on key institutional features
have allowed recent experiments in ?nancial
accounting to overcome the criticisms of the ear-
lier literature. In Section 3, we describe how spe-
ci?c streams of experimental ?nancial accounting
research have addressed questions about ?nancial
communication between managers, auditors,
information intermediaries, and investors, and
indicate how future research can extend those
streams. We focus particularly on (1) how man-
agers and auditors report information; (2) how
users of ?nancial information interpret those
reports; (3) how individual decisions a?ect market
behavior; and (4) how strategic interactions
between information reporters and users can a?ect
market outcomes. While we address studies of
auditors in their ?nancial reporting role, to limit
the scope of the review, we do not address issues
related to the demand for and conduct of auditing.
We also do not address studies of creditors’ deci-
sions, which have received little attention in recent
?nancial accounting experiments.
In Section 4, we discuss how experiments can be
designed to be both e?ective and e?cient. We use
the ‘‘predictive validity framework’’ (Libby, 1981;
Runkel & McGrath, 1972) to structure our discus-
sion of maximizing e?ectiveness through careful
hypothesis development and research design. Our
discussion of e?ciency focuses on the consumption
of scarce resources, such as subjects and compen-
sation to those subjects. We conclude in Section 5
with a brief summary of our main points.
1
See Haynes and Kachelmeier (1998) and Moser (1998) for
recent discussions of the integration of the behavioral and eco-
nomic approaches to experimentation.
776 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
2. Factors a?ecting the supply and demand for
experimental ?nancial accounting research
In this section, we examine four interdependent
factors that have mitigated concerns raised about
the earlier experimental literature and promoted
recent progress in experimental ?nancial accounting
research: changing views of market e?ciency,
recognition of the strengths and weaknesses of
experimental methods in addressing ?nancial
accounting questions, the availability of new the-
oretical bases for the research, and a more detailed
view of the institutional features of ?nancial
accounting settings. We discuss each of these factors
in turn.
2.1. Changing views of market e?ciency
Much of the ?nancial accounting research in the
1960s implicitly assumed that some investors’ fail-
ure to adjust fully for the e?ects of accounting
method choices would a?ect allocation of resour-
ces in the economy and disadvantage these less
sophisticated investors in their exchanges with
more sophisticated investors (see Maines, 1995 for
a review). A series of papers in ?nance (particularly
Fama, 1970) persuaded many accounting research-
ers that if just a small fraction of investors are
sophisticated enough to respond appropriately to
accounting information, they will compete among
themselves to set security prices equal to their
expected values. As a result, the market becomes a
‘‘fair game’’ in which even unsophisticated inves-
tors are protected by the informational e?ciency
of prices.
2
This research led Gonedes and Dopuch
(1974), among others, to argue that experimental
research on individual behavior could have only
limited importance for ?nancial accounting.
In the late 1980s and 1990s, however, numerous
studies reported market ine?ciencies.
3
One line of
research provides direct support for the assumptions
underlying early ?nancial accounting research:
accounting policies a?ect pricing, even when they
have no true economic e?ects (e.g. Andrade, 1999;
Hand, 1990; Sloan, 1996; Vincent, 1997). Another
line of research indicates more generally that fun-
damental analysis of public ?nancial statement
information can lead to higher stock returns (e.g.
Frankel & Lee, 1998; Lee, Myers, & Swami-
nathan, 1999; Ou & Penman, 1989). A third line of
research suggests that even sell-side analysts —
generally recognized as among the most sophisti-
cated users of ?nancial statements — are pre-
dictably biased (DeBondt & Thaler, 1990; Dechow
& Sloan, 1997; La Porta, 1996).
The best-known lines of e?ciency research focus
on momentum in earnings and prices. A volumi-
nous literature on post-earnings-announcement
drift shows that markets underreact to large earn-
ings surprises (Ball & Bartov, 1996; Bernard &
Thomas, 1989, 1990; Bhushan, 1994; Brown &Han,
2000; Foster, Olsen, & Shevlin, 1984). Another lit-
erature, primarily published in ?nance journals,
shows that after adjusting for risk, stock returns
are positively autocorrelated over periods of sev-
eral months (e.g. Chan, Jegadeesh, & Lakonishok,
1996), but negatively autocorrelated over periods
of several years (DeBondt & Thaler, 1985, 1987).
The literature on market ine?ciency is con-
troversial, and many of the papers alleging ine?-
ciency have been criticized on methodological
grounds (Ball, 1992; Fama, 1998; Kothari, 2000).
Nevertheless, many researchers now doubt whe-
ther markets satisfy the requirements of the semi-
strong form of the e?cient markets hypothesis
(that markets respond e?ciently to all publicly
available information), or even the weak form
(that markets respond e?ciently to information
contained in past market prices). Even some of the
most skeptical seem to be convinced that post-
earnings-announcement drift is not simply an
artifact of research design (Ball, 1992). Recent
research on e?ciency has also led theorists to
examine how the assumptions underlying the e?-
cient markets hypothesis might be relaxed to
account for archival results. (We discuss these
models more in Section 2.3). As a result, experi-
mental researchers can more easily argue that
individual behavior can be an important element
in determining market behavior, even in the pre-
sence of competitive forces.
2
Watts and Zimmerman (1986) also provided particularly
in?uential arguments.
3
See Fama (1998), Kothari (2000), and Thaler (1999) for
more comprehensive reviews of this literature.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 777
2.2. The comparative advantage of ?nancial
accounting experiments
Earlier ?nancial accounting experiments typically
sought to determine whether speci?c accounting
policy choices would a?ect investors’ decisions.
Answers to such research questions call for esti-
mates of the magnitude of an e?ect (or error) by
representative actors in representative circum-
stances, a task ill suited to experiments. Such a
task is more appropriate for archival-empirical
research, which examines large representative
samples of naturally occurring phenomena.
More recent experimental research strives to use
experimentalists’ comparative advantage to focus
on disentangling the e?ects of variables that are
confounded in natural settings and determining
under what circumstances and through which
processes speci?c phenomena arise. Experiments
are well suited to this task because they construct
their own research setting. In a constructed
research setting, one can manipulate the indepen-
dent variables, control for other potentially in?u-
ential variables by holding them constant or
through randomisation, and measure the inter-
vening processes (such as information search or
the path players take to equilibrium outcomes in
strategic settings) and mental states (such as
knowledge, beliefs, or con?dence) that a?ect ?nal
outcomes. This allows an experimentalist to dis-
entangle the e?ects of variables that are con-
founded in the environment to draw strong causal
inferences, and to test the e?ects of conditions that
do not yet exist or do not exist in su?cient quan-
tity in the natural environment (Libby & Luft,
1993). Experiments testing how and why (rather
than whether or not) ?nancial accounting phe-
nomena occur can be based on theories of psy-
chological, economic or institutional processes.
We discuss these theories next.
2.3. Theoretical advances in psychology, ?nance,
and economics
Earlier experimental research was criticized for
the lack of psychological or economic theory that
speci?ed the mechanisms through which e?ects of
accounting disclosures would occur. Recent
experiments in ?nancial accounting can rely on
well-developed psychological theories of judgment
and decision making
4
that were in their infancy
when the studies reviewed by Gonedes and Dopuch
(1974) were conducted. Recent research can also
rely on economic models that describe more care-
fully when and how equilibrium outcomes arise.
The major idea underlying much research on
judgment and decision making is that decision
makers are boundedly rational (Simon, 1957).
Decision makers often have limited information
on which to base their judgments and decisions,
limited ability to retain and retrieve that informa-
tion from memory, limited ability to process and
use that information, and limited insight into their
own decision processes and future preferences.
Studies over the last 25 years have focused on how
various attributes of human cognition determine
exactly what humans do well and what they do
poorly. A number of their ?ndings have in?uenced
recent thinking in ?nancial accounting and the
study of ?nancial markets.
Many decision-making studies emphasize the
role of heuristics (Tversky & Kahneman, 1974).
Heuristics are simpli?ed decision rules developed
to deal with complex situations. These heuristics
are e?cient and often work well. But in some cir-
cumstances they may lead to systematic biases
such as over- and under-con?dence in judgment
(Gri?n & Tversky, 1992) and misperceptions of
the covariation between signals and events (Lipe,
1991), which can systematically a?ect the manner
in which individuals react to ?nancial accounting
information and the manner in which that infor-
mation is impounded in prices. Learning to over-
come biases is di?cult because of the uncertainty
and poor feedback inherent in complex environ-
ments. Often what we learn from experience is not
valid (Einhorn, 1980).
The importance of (imperfect) storage and
retrieval of information from memory has also
been recognized in recent ?nancial accounting
experiments. Some of these studies rely on models
4
Syntheses of the key constructs or ideas that drive psycho-
logical theories of judgment and decision making have been
provided by Carroll and Johnson (1990), Hogarth (1993),
Bazerman (1998), and others.
778 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
of memory organization (e.g. Smith & Medin,
1981) that indicate how knowledgeable decision
makers e?ciently organize and retrieve data.
Other studies recognize that memory for events is
in?uenced by factors that are normatively rele-
vant, such as their frequency of occurrence, and
factors that are normatively irrelevant, such as
primacy, recency, and contrast e?ects (e.g.
Hogarth & Einhorn, 1992). Still others recognize
that the limited capacity of working memory
a?ects our ability to consider multiple factors in
making a judgment or choice. Consequently, even
normatively relevant factors that decision makers
are aware of often have limited in?uence on their
judgments and decisions.
Recent research in accounting and ?nance also
relies on psychological models of risk (e.g. Kahne-
man & Tversky, 1979) and ambiguity (e.g. Einhorn
& Hogarth, 1986) that characterize individuals’
responses to risk and reward in ways that deviate
from standard expected utility theory.
5
This more
recent psychology literature provides greater abil-
ity to predict under what circumstances behavior
will be more or less likely to di?er from the pre-
dictions of standard economic theory (e.g. in
earnings predictions versus trading behavior, in
di?erent information environments). A large lit-
erature on social psychology could also be used to
understand interaction between participants in
?nancial accounting settings. For example, research
related to accountability (e.g. Tetlock, 1992),
motivated reasoning (e.g. Kunda, 1990) and group
decision processes (e.g. Yetton & Bottger, 1982)
has signi?cantly in?uenced auditing studies.
Other ?nancial accounting studies use advances
in ?nancial economics to test the assertion that
biased traders will be driven out of the market
through systematic trading losses. Some of these
models focus on how biases might in?uence mar-
ket outcomes. For example, Barberis, Shleifer, and
Vishny (1998) use psychological models of how
people perceive random-walk sequences in a
model with a representative investor. Daniel,
Hirshleifer, and Subrahmanyam (1998), Gervais
and Odean (1997) and Odean (1998) incorporate
overcon?dence into trading models. Other models
focus on forces that keep unbiased traders from
exploiting price errors. For example, De Long,
Shleifer, Summers, and Waldmann (1991) show
that traders who respond irrationally to irrelevant
information (‘‘sentiment’’) create enough noise in
prices to keep rational traders from exploiting the
resulting price errors. Fischer and Verrecchia
(1999) and Kyle and Wang (1997) show that
overcon?dence, although irrational, can actually
give traders higher payo?s than their rational
compatriots. These results make it di?cult to
argue that some form of natural selection will
eliminate irrational traders in dynamic equilibria,
and provide accounting researchers with speci?c
models of how and when individual biases might
in?uence market prices.
Experiments focusing on game theoretic models
of ?nancial accounting settings can now rely on
new economic models that move beyond the tra-
ditional equilibrium view. Rather than simply
identifying an equilibrium and assuming that it
will occur, many economists have examined in
detail what assumptions about rationality must be
satis?ed for equilibria to have predictive power
(Bernheim, 1984; Pearce, 1984; Tan & Werlang,
1988). Other models have examined the process by
which equilibria are achieved, using either psy-
chological theories based on behaviorism (Herrn-
stein & Vaughn, 1980) or evolutionary theories of
natural selection (Maynard Smith, 1982). In a simi-
lar vein, Gode and Sunder (1993, 1997) used such
ideas to show that ‘‘zero-intelligence’’ traders, who
do nothing more than avoid obviously horrible
strategies, can achieve e?cient security allocations
in some markets. By focusing on processes by
which equilibria are achieved, these studies provide
indications of when equilibria will and will not
predict behavior in ?nancial accounting settings.
2.4. Key institutional features of ?nancial
accounting settings
Most early experimental studies in ?nancial
accounting took relatively narrow views of ?nan-
cial accounting institutions. They typically focused
on the set of rules governing how accounting infor-
mation could be reported in ?nancial statements,
5
See Hodder, Koonce, and McAnally (2001) for further
discussion of risk in ?nancial accounting settings.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 779
implicitly assuming that reporting choices (and
interpretations of those choices) were made neu-
trally, rather than being in?uenced by the incen-
tives of a strategic manager or auditor. Early
studies also implicitly assumed that responses to
?nancial accounting information would be inde-
pendent of the expertise or incentives of the user,
and that interactions among users and reporters
would not alter outcomes.
Consistent with the advice of Libby and Luft
(1993), recent experimental research in ?nancial
accounting has considered institutional features
more broadly, and has also focused on the inter-
action between individual and environmental
characteristics. Two key individual characteristics
are the knowledge and motivation of information
reporters and users. These determine the parties’
goals, and how they use ?nancial accounting to
achieve those goals. Key environmental char-
acteristics include the complex regulations govern-
ing reporting, the existence of ?nancial markets,
and the strategic interactions between reporters
and users, as well as between di?erent sets of
users. Regulations determine the set of choices
open to managers and auditors, and may also
determine the results of those actions (e.g. lawsuit
outcomes). Financial markets a?ect how indivi-
dual decisions result in aggregate market out-
comes, such as stock prices, liquidity and trading
volume, and may also determine wealth transfers
among di?erent sets of investors. Strategic inter-
actions capture the intertwining of the incentives
and actions of the many parties to ?nancial
accounting decisions. Financial accounting set-
tings include managers, auditors, investors and
information intermediaries (analysts and the
press) who may all interact strategically. Man-
agers and auditors negotiate to determine the
contents of the ?nancial statement and audit
report. Investors draw inferences about managers’
and analysts’ information and incentives from
observing reports. Managers may choose reports
in an attempt to ‘‘fool’’ investors, but the investors
may be able to anticipate these attempts.
6
Focusing explicitly on individual and environ-
mental characteristics allows experimental
researchers to shed light on how and when
experimental results will generalize to target set-
tings, and also indicate how variations in these
institutions will alter behavior. In this way, an
institutional focus helps researchers to exploit the
comparative advantage of experimentation. In the
next section, we describe how speci?c streams of
experimental ?nancial accounting research have
done so, and indicate how future research could
extend those streams.
3. Key ?nancial accounting questions and
experimental evidence
The goals of the literature that we review are
similar to those of the broader ?nancial account-
ing literature: to increase our understanding of the
?nancial reporting process and its e?ects. While all
of the studies that we examine share the same
general goal, they focus on di?erent elements of
the interactions of boundedly rational managers,
auditors, information intermediaries, and inves-
tors. These di?erences in emphasis led us to divide
the studies into four related categories described
by the following questions.
1. How do managers’ and auditors’ incentives
and ?nancial accounting regulations deter-
mine how they report events?
2. How do knowledge of accounting regula-
tions, managers’ incentives, and the infor-
mation content of accounting reports a?ect
users’ (investors and information inter-
mediaries) interpretations of accounting
reports?
3. How do individual responses to information
a?ect market-level phenomena?
4. How do strategic interactions between
reporters and users of information a?ect
reporting and market outcomes?
We focus primarily on papers published since
the publication of Maines’s (1995) review of this
literature.
6
Financial accounting information is also used for con-
tracting and stewardship purposes, but that has not been the
focus of signi?cant experimental research.
780 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
3.1. How do managers’ and auditors’ incentives
and ?nancial accounting regulations determine how
they report events?
Reporting performance is fundamental to ?nan-
cial accounting. Discretion provided by ?nancial
accounting regulations, coupled with the inherent
subjectivity of much accounting measurement,
allows managers some ?exibility to opportunisti-
cally report or manage earnings. Consequently,
much archival and experimental research has
focused on this area.
Archival studies typically examine opportunistic
reporting by identifying whether earnings or
accruals di?er fromexpectation in a manner favored
by managers’ incentives (see Healy & Wahlen,
1999 for a review). While these studies have
demonstrated numerous instances of apparent
earnings management, their conclusions are some-
times criticized because of methodological di?-
culties, including poor incentive proxies, misstated
discretionary accruals models, or potential omit-
ted variables such as operating choices that have
non-earnings-management rationales but that
a?ect discretionary accruals (Bernard & Skinner,
1996; Dechow, Sloan, & Sweeney, 1995). Also,
archival studies of earnings management focus on
post-audit ?nancial statements that are a joint
product of the negotiations between managers and
auditors, which makes it di?cult to distinguish the
separate contributions of managers and auditors
to earnings management or to determine how
managers’ and auditors’ separate incentives in?u-
ence their reporting and attesting behavior (Nel-
son, Elliott, & Tarpley, 2000).
Experimental studies avoid these problems by
manipulating incentives and assessing treatment
e?ects rather than attempting to measure unex-
pected accruals, and by holding constant task
characteristics that create potential omitted vari-
ables problems. Experiments can examine man-
agers’ and auditors’ judgments separately, but can
also examine auditor–client interactions. These
characteristics of experimental work have led to a
growing experimental literature that complements
the archival work in this area.
The largest group of experimental earnings-
management studies focuses on auditors’ incentives
and the circumstances under which they allow
managers to take aggressive accounting positions.
Consistent with the general auditing literature (e.g.
Kinney & Martin, 1994), results indicate that audi-
tors reduce the aggressiveness of ?nancial reports.
For example, Hirst (1994) provides evidence that
auditors consider management competence and
objectivity when evaluating management-provided
evidence. Phillips (1999) demonstrates that, after
auditors receive evidence of aggressive reporting in
high-risk accounts, they are more likely to attend
to it elsewhere, even in accounts they typically
consider to be of low risk. Kinney and Nelson
(1996) demonstrate a circumstance in which
auditors make audit-reporting judgments that
are as conservative as thought appropriate by
even those investors who are evaluating the
audit report in the presence of negative outcome
information.
However, other studies indicate that auditors
are more likely to allow their clients to take
aggressive accounting positions when the relevant
evidence or precedents o?er more room for inter-
pretation. For example, Nelson and Kinney (1997)
provide evidence that auditors are more (less)
conservative than users required when the relevant
evidence was precise (ambiguous). Similarly, Salt-
erio and Koonce (1997) provide evidence that
auditors’ treatment of clients’ capitalization versus
deferral decisions depends on whether the relevant
precedents unanimously favor one alternative.
When the precedents favor one alternative, audi-
tors follow the precedents, but when the pre-
cedents are mixed, auditors tend to follow their
client’s preference. Mayhew, Schatzberg, and Sev-
cik (2000) provide consistent evidence in experi-
mental markets. When participants in the role of
auditor were sure of the appropriate disclosure,
they made that disclosure, but as their uncertainty
about appropriate disclosure increased, they ten-
ded to misreport in favor of their client.
Other studies have focused on the role of spe-
ci?c incentives in auditors’ reporting decisions.
For example, Hackenbrack and Nelson (1996)
provide evidence that auditors are more likely to
allow their clients to take aggressive accounting
positions if the auditors’ litigation risk is reduced,
and that auditors justify the aggressive position
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 781
with aggressive interpretations of the relevant
?nancial accounting regulations. Hackenbrack
and Nelson hold constant the underlying audit
evidence while varying auditors’ incentives and
whether those incentives favored accrual or foot-
note disclosure of a contingency, allowing them to
infer with high con?dence that incentives were
driving the e?ects they observed. Using the same
case materials, Kennedy, Kleinmuntz, and Peecher
(1997) provide evidence that, even when litigation
risk is relatively high, auditors may tend to take
aggressive reporting positions when they can dif-
fuse personal responsibility by consulting other
experts within the ?rm. Wilks (2001) provides evi-
dence that auditors’ interpretations of evidence
and decisions are a?ected by the views of more
senior auditors. Beeler and Hunton (2001) provide
evidence that incentives from lowballing or man-
agement-advisory services a?ect audit partners’
going concern judgments. Bazerman, Morgan,
and Loewenstein (1997) suggest that auditors
cannot be independent because of the unconscious
e?ect of such incentives, or even because of a sense
of auditor–client a?liation that occurs through
multiple interactions. However, Dopuch and King
(1996) provide evidence that competitive pressures
can reduce the e?ect of incentives like lowballing,
and King (2001) provides evidence that, holding
constant economic incentives, professional–group
a?liation can o?set the in?uence of auditor–cli-
ent a?liation, demonstrating that o?setting
a?liations can have o?setting e?ects on auditors’
independence.
A smaller group of studies examines how man-
agers’ incentives a?ect the aggressiveness of their
reporting decisions. These studies take two
approaches. One approach is to elicit managers’
judgments directly. For example, Cloyd, Pratt,
and Stock (1996) gather data from corporate
?nancial executives at both public and private
manufacturing ?rms. They provide evidence
that, when a manager has selected an aggressive
tax treatment, the manager tends to choose a
?nancial accounting method that conforms to
the tax choice in hopes of better defending the
appropriateness of the tax choice if it is later
questioned by the IRS. Managers of public ?rms
were less likely to choose conformity than were
managers of private ?rms, presumably because
managers of public ?rms face more disincentives
for making income-decreasing ?nancial accounting
disclosures.
The second approach is to elicit the joint pro-
duct of the manager–auditor negotiation indirectly
from auditors. Three di?erent studies use di?erent
versions of this approach. Libby and Kinney
(2000) manipulate factors that a?ect managers’
incentives and ask auditors to determine how the
audited ?nancial statements would appear. They
provide evidence that correction of quantitatively
immaterial errors is much less likely if the correc-
tion would cause the ?rm to miss analysts’ EPS
forecasts (i.e. is qualitatively material), and that
the recently promulgated SAS 89 has little e?ect
on this behavior. Gibbins, Salterio, and Webb
(2000) develop a model of auditor–client negotia-
tion and support their model by surveying audi-
tors concerning their experiences negotiating
contentious accounting issues with their clients.
Nelson, Elliott, and Tarpley (2000) survey audi-
tors concerning their experiences with clients’
attempts to manage earnings, and provide evidence
concerning managers’ incentives for attempting
earnings management, the ?nancial accounting
areas in which managers attempt earnings man-
agement, and the circumstances under which
auditors pass or thwart managers’ attempts.
Overall, these studies provide direct evidence
that managers and auditors use the ?exibility
inherent in accounting rules to make disclosures
that are favored by their incentives. Holding con-
stant amount of ?exibility, changes in incentives
move disclosure in the direction favored by those
incentives. Holding incentives constant, increasing
?exibility increases the degree to which incentives
a?ect decisions.
Certainly one direction for future research is to
continue examining how managers’ and auditors’
incentives a?ect their decisions. In addition, the
literature could work more to identify the pro-
cesses through which these e?ects occur. To what
extent are these e?ects intentional and strategic
versus the unintended results of cognitive limita-
tions? Wilks (2001) provides evidence that incen-
tives a?ect decisions more when the incentives are
made apparent to subjects prior to evaluating
782 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
evidence, suggesting that incentive e?ects in?uence
the evaluation process as well as the decisions that
result from that process. Beeler and Hunton
(2001) provide evidence that incentives a?ect both
the favorability and weighting of evidence, and
that auditors believe that incentives a?ect other
auditors’ judgments, but not their own. A fruitful
direction for future research is to further under-
stand how and when such incentive e?ects occur.
Another useful direction is to examine how
changes in regulations or other interventions
might a?ect the aggressiveness of ?nancial report-
ing. For example, Libby and Kinney (2000), Hirst
and Hopkins (1998), and Maines and McDaniel
(2000) provide evidence of recent regulatory
changes that do not appear to prevent managers
from making aggressive reporting decisions. Cuc-
cia, Hackenbrack, and Nelson (1995) provide
evidence in a tax context that increasing the pre-
cision of a standard does not prevent aggressive
reporting when the underlying evidence also pro-
vides latitude for interpretation. When coupled
with evidence of the e?ect of incentives on report-
ing judgments, ?ndings indicating the ine?ective-
ness of some regulatory interventions suggest that
regulators might reduce aggressiveness more
e?ectively by addressing incentives directly via
changes in penalties. Alternatively, other approa-
ches like improvements in audit-evidence sequen-
cing (Phillips, 1999) or within-?rm consultation
(Kennedy et al., 1998) might also a?ect the
aggressiveness of ?nancial reports, by a?ecting the
extent to which auditors discourage aggressive
reporting.
Finally, future research could focus more on the
interaction among participants in the ?nancial
reporting process. Researchers are only beginning
to consider the process by which auditors negoti-
ate with their clients to produce the joint product
that investors consume. Also, the increasing role
of audit committees in this process remains lar-
gely uninvestigated. Addressing these issues via
experiments (e.g. Libby & Kinney, 2000), surveys
(e.g. Gibbins et al., 2000; Nelson, Elliot, & Tarp-
ley, 2000), and laboratory markets (e.g. Mayhew
et al., 2000) appear to be useful directions for
future research. These issues are discussed more in
Section 3.4.
3.2. How do information users interpret reports,
given their knowledge of the regulations governing
those reports, and their knowledge of the reporters’
incentives?
Three streams of literature address distinct
facets of this question:
1. How do accounting methods and disclosure
alternatives a?ect earnings predictions and
value estimates of investors and information
intermediaries?
2. How do investors and analysts use the time-
series properties of earnings to predict future
earnings?
3. What determines analysts’ forecasting and
valuation performance?
We discuss each in turn.
3.2.1. How do accounting methods and disclosure
alternatives a?ect earnings predictions and value
estimates of investors and information intermediaries?
The earliest experimental research in ?nancial
accounting tended to be motivated by the need for
evidence to address speci?c accounting policy
debates. These studies focused on whether inves-
tors and others adjusted appropriately for the
e?ects of accounting methods and disclosure
alternatives (e.g. Dyckman, 1964; Jensen, 1966).
Looking back on the earlier literature, it is readily
apparent that the answer to this question is
‘‘sometimes.’’ Some participants in nearly every
study of this type demonstrate some degree of
functional ?xation; they do not fully adjust for
di?erences in the e?ects of accounting alternatives
on the bottom line (Maines, 1995, p. 90, 91). As a
consequence, ?rms that are in identical economic
circumstances except for their choice of accounting
alternatives are sometimes judged to be di?erent.
These speci?c policy-oriented studies did little to
tell us how the extent of functional ?xation will
vary across types of decision makers or economic
circumstances, or what psychological processes
underlie insu?cient adjustments to accounting
policies. Consistent with this concern, much
recent research has heeded the advice of Maines
(1994) to focus on the dimensions of disclosure,
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 783
environmental factors, and processes that deter-
mine the degree to which appropriate adjustments
are made. In response to a recent call for more
speci?c policy-oriented experiments (Beresford,
1994), Maines (1994) noted that ‘‘Psychological and
sociological research may be most productively used
to guide behavioral accounting research on gen-
eral issues that underlie many di?erent accounting
standards, rather than focusing on issues relevant
to only one standard.’’ Understanding the e?ects
of these general factors will dramatically broaden
the relevance of this research.
Three groups of studies demonstrate progressive
re?nement in the manner in which this research
question has been addressed. The ?rst group
focuses on the mechanisms through which place-
ment and classi?cation of accounting disclosures
a?ect the use and interpretation of the disclosures.
The second group explicitly or implicitly recog-
nizes that managers issuing accounting reports
have their own strategic interests and will report
opportunistically, and examines how users
respond to voluntary disclosures by managers.
The third recognizes that analysts respond to their
own strategic interests and examines how users
respond to potential relationship induced bias in
analysts’ reports. We discuss each in turn.
3.2.2. General issues underlying functional ?xation
The development of category structures in
memory plays a major role in allowing expert
decision makers to respond e?ectively and e?-
ciently in complex decision environments. In these
structures, attributes are associated with cate-
gories as opposed to individual instances of the
category. An individual instance or event is then
interpreted based in part on its category member-
ship. This allows for e?cient and often e?ective
processing of attributes of the environment, but
sometimes produces errors when the particular
instance does not match the typical category
attributes well. A number of recent papers have
recognized that classi?cation issues like the
assignment of a ?nancial disclosure to a particular
?nancial statement, to a speci?c subsection within
a statement, or to the notes, will a?ect decision
makers’ categorization of that disclosure and
interpretation of its relevance and meaning.
Existing studies have examined three dimensions
of classi?cation. Hopkins (1996) examined the
e?ects of classi?cation of items on the right side of
the balance sheet as debt, equity, or mezzanine
?nancing on judgments of the stock price e?ects of
new ?nancing. He found that experienced buy-side
analysts who had knowledge of the di?erential
stock price e?ect of debt and equity issuances
found in ?nancial economics research responded
to the issuance of hybrid securities based on their
categorization. When the securities were classi?ed
as mezzanine, for which the analysts had no well-
de?ned category, they responded based on the
attributes of the individual security. Similarly,
Hopkins, Houston, and Peters (2000) examined
issues related to categorization of costs as operat-
ing expenses, one-time charges, or note disclosure.
Experienced buy-side analysts treated the account-
ing acquisition premium in a merger in part based
on its classi?cation. One-time charges and note
disclosures were treated as less relevant to stock
valuation than operating expenses. Finally, Hirst
and Hopkins (1998) and Maines and McDaniel
(2000) examined whether placement of elements of
comprehensive income on the income statement
versus the statement of stockholders’ equity a?ec-
ted the ability to detect earnings management and
changes in earnings volatility. Information placed
on the income statement (the primary perfor-
mance statement) was much more likely to be
treated as relevant to future performance esti-
mates by the experienced analysts in Hirst and
Hopkins (1998) as well as by the evening MBA
students in Maines and McDaniel (2000).
Maines and McDaniel (2000) also present the
beginnings of a theory of format e?ects. Their
theory lists ?ve factors that a?ect the degree to
which investors will rely on a particular disclosure
in assessments of corporate performance: place-
ment, labeling as income, linkage (to net income),
isolation, and degree of aggregation. Such a the-
ory holds the promise of allowing predictions of
e?ects beyond the scope of individual studies, as
Maines (1994) recommends. Future research can
re?ne and test the model in other circumstances.
Other studies identify the stage in the decision
process where any failure to adjust for accounting
or disclosure di?erences occurs. Following prior
784 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
credit analysis and auditing research (e.g. Abdel-
Khalik & El-Sheshi, 1980; Bonner, 1990), Lipe
(1998) uses a series of debrie?ng questions to
separate the e?ects of measurement from weight-
ing. She examines whether investors can accurately
assess the variance and covariance of returns in
making risk assessments and whether they use
those assessments in their investment decisions.
7
Maines and McDaniel (2000) use a combination
of debrie?ng questions and regression analysis to
determine whether di?erences in accessing the
information cues, interpreting or measuring the
cues, or weighting the cues caused their results.
They suggest that participants in all disclosure
conditions accessed and interpreted the cues in the
same manner, but weighted them more heavily in
the income statement presentation condition.
Another set of studies uses improved theories of
functional ?xation to de?ne ‘‘superior’’ disclosure
methods. Early studies only determined if di?erent
judgments or decisions are made and ignore the
issue of determining the superior disclosure
method. Many of the newer studies specify sub-
tasks necessary for successful ?nal judgments or
decisions, such as detection of earnings manage-
ment (Hirst & Hopkins, 1998), assessment of
variability in underlying ‘‘core’’ earnings (Maines
& McDaniel, 2000), or covariance assessment (Lipe,
1998). Alternatively, Maines, Mautz, Wright, Gra-
ham, Rosman, and Yardley (2000) approach the
question of assessing which disclosure method is
superior in a way similar to the training and deci-
sion aids literature in auditing. They suggest that
high quality reporting methods (1) allow novice
decision makers to perform like expert decision
makers and (2) allow the same decisions to be
made as completely disaggregated disclosures.
They apply their approach in a study of joint-ven-
ture ?nancial reporting standards. The approach is
consistent with the SEC and FASB’s concern for
the naive investor, as well as e?ciency concerns
and Hand’s (1990) suggestion of investor sophisti-
cation e?ects as a partial explanation for market
ine?ciencies. This study, Maines, McDaniel, and
Harris’s (1997) study of segment standards, and a
number of the above-mentioned are motivated in
part by a particular policy issue of current interest.
Again, we believe that their impact is determined
by their ability to relate the particular policy issue
of interest to more general phenomena that inform
a wider array of policy questions.
3.2.3. Responses to voluntary disclosures
The studies discussed above implicitly assume
that disclosures are generated by a neutral process.
However, managers issuing accounting reports
generally have their own strategic interests and
will report opportunistically. A number of studies
address how this strategic element a?ects users’
decisions.
The ?rst two studies examine the e?ects of the
form of disclosures. Kennedy, Mitchell, and Sefcik
(1998) examine how investors interpret the di?er-
ent allowable forms of contingent environmental
liability disclosure: minimum, best estimate, max-
imum, or range of the distribution. Experienced
?nancial executive, manager, banker, and MBA
student participants’ assessments of the distribu-
tion of possible losses implied by each disclosure
did not match the commonly accepted meaning of
the terms. For example, when the ‘‘best estimate’’
was disclosed by management, the participants
interpreted it as the minimum, and when a range
was disclosed, the participants’ estimates of the
expected value were well above the midpoint of
the range. The participants clearly believed that
managers bias their disclosures downward.
8
It also
indicates that accounting information has di?erent
e?ects on di?erent judgments, in this case, man-
agement credibility and ?rm value.
Hirst, Koonce, and Miller (1999) examine
investors’ interpretation of point versus range
forecasts and historic forecast accuracy on earnings
7
She also examines how they react when market and
accounting measures con?ict. Her study is unique at this point
in jointly examining the role of accounting and non-accounting
information. It also suggests the possibility that the weight
placed on normatively relevant information may change with
the inclusion of less-relevant information and presents a
potential explanation for the lack of diversi?cation of indivi-
dual portfolios.
8
Participants also believed that managers that decided to
disclose the minimum were the least credible, yet they valued
their ?rms the most highly. This suggests that the accounting
standard provides managers with a perverse incentive to pro-
vide the least informative disclosure.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 785
forecasts and con?dence in forecasts (which they
relate to trading). If both of these forecast attri-
butes indicate precision of the forecast, they both
should a?ect forecasts and con?dence. However,
only prior accuracy had an e?ect on earnings
forecasts, while both factors a?ected con?dence
and trading. This again indicates that normatively
relevant attributes of accounting information may
a?ect some judgments and decisions but not others.
Libby and Tan (1999) and Tan, Libby, and
Hunton (2000) investigate the e?ects of earnings
warnings or preannouncements on sell-side ana-
lysts’ forecasts of future periods’ earnings. Libby
and Tan provide a demonstration of the process
through which the same disclosure can have dif-
ferential e?ects on di?erent judgments and deci-
sions. They examine why analysts say in the press
that they reward ?rms that warn, yet punish them
in their forecasts. They demonstrate that this
inconsistency results from the simultaneous pro-
cessing of the warning and earnings announce-
ment in answers to press questions versus the
sequential processing of the same signals in the
forecasting setting. Tan, Libby, and Hunton (2000)
demonstrate that ?rms that low-ball preanno-
uncements of both positive and negative earnings
surprises will receive higher forecasts for future
period’s earnings, even though the reporting man-
agers themselves are judged as having lower
integrity and competence. Also, analysts are aware
of management’s tendency to low-ball the pre-
announcements, but do not adjust their estimates
of earnings of ?rst time preannouncers in light of
this base rate knowledge. This again indicates that
known attributes of accounting information do
not a?ect all judgments in the same manner.
3.2.4. Responses to analyst’s forecasts
Hirst, Koonce, and Simko (1995) and Ackert,
Church, and Shehata (1997) investigate the e?ects
of potential bias in analysts’ reports on investors’
use of those reports. MBA student subjects in
Hirst, Koonce, and Simko (1995) expected ana-
lysts whose employers also provide investment
banking services to the company to be more
biased than those that do not. However, this per-
ceived bias only a?ected their reliance on the report
when the report gave a negative recommendation.
Similarly, the strength of the analysts’ arguments
had an e?ect only for negative recommendations.
Ackert, Church, and Shehata (1997) extend this
study to a multiperiod setting where subjects have
the option to acquire forecasts from analysts, and
also observe actual earnings. Individuals were
much less willing to acquire analysts’ forecasts
that proved to be biased in the past, even when the
forecast information was useful. Both studies sug-
gest the need to better understand the processes
that determine when reports from analysts and
other information intermediaries will be purchased
and relied upon.
A general picture emerges from the above stud-
ies. First, management’s often cited (Beresford,
1994) preoccupation with the bottomline, and more
speci?cally with potential penalties for earnings
volatility and e?ects of cosmetic di?erences,
appears at least in part well founded. Second, we
have begun to understand that placement, categor-
ization, and labeling all play a role in the simpli?-
cations that even professional analysts apply when
evaluating accounting information. Future research
on the knowledge structures developed by experts
for di?erent types of companies and di?erent types
of ?nancial judgments and decisions promises to
increase our understanding of these e?ects.
It is also clear from the above results that the
information that decision makers rely upon in
their judgments is limited, and the information
emphasized clearly changes depending on the
?nancial judgment being made and other elements
of the environment. In fact, awareness of cosmetic
di?erences (and ability to ‘‘do the math’’) does not
ensure full consideration of their implications for
valuation. The same is true of knowledge of man-
agement’s tendency to opportunistically employ
vague reporting standards or analysts’ tendency to
bias their reports. There appear to be many cases
where the same normatively relevant factors are
ignored in one circumstance, but adequately
weighted in another by the same decision maker.
The fact that results here tie closely to archival
data gathered in prior studies adds to the cred-
ibility of the results. Future studies should focus
on systematically determining the circumstances in
which di?erent classes of information receive ?rst-
order consideration.
786 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
Earlier research on the e?ect of task complexity
on the use of alternative decision rules in credit
decisions (e.g. Biggs, Bedard, Gaber, & Linsmeier,
1985; Paquette & Kida, 1988; see Payne, Bettman,
& Johnson, 1992 for a review of psychological
studies) will provide some guidance in this area.
However, it appears that the determinants of which
information items receive ?rst order consideration
in particular judgment situations involves more
than task complexity. Findings of the importance
of cue-response compatibility (Slovic & Lichten-
stein, 1968) and other task determinants of cue
usage in early judgment and decision making
research (e.g. Einhorn & Hogarth, 1981; Slovic &
Lichtenstein, 1971) may provide useful directions
for future research in this area. Furthermore, the
interplay between these factors, investor sophistica-
tion and e?ort, and various market attributes dis-
cussed in Section 3.3 appear critical in determining
the importance of cosmetic disclosure di?erences.
3.2.5. How do investors and analysts use the time-
series properties of earnings to predict future
earnings?
Post-earnings-announcement drift has become a
very active stream of archival research. Bernard
and Thomas (1990) provide evidence that drift
arises because investors misperceive the time-series
of earnings. Speci?cally, quarterly earnings follow a
Brown–Roze? model, which has two key elements.
One element is the autoregressive component —
changes from one quarter of one year to the same
quarter of the next tend to be positively auto-
correlated. The other element is the ‘‘moving
average’’ component — the di?erences between
actual and predicted earnings tend to be negatively
correlated from one quarter to the same quarter of
the next year. Research by Bernard and Thomas
(1990) and Ball and Bartov (1996) indicate that
investors underestimate both the autoregressive
and moving-average components of quarterly
earnings; results from Abarbanell and Bernard
(1992) indicate that analysts make a similar mistake.
Recent studies have used the advantages of the
experimental approach to understand the psycho-
logical nature of investors’ and analysts’ time-ser-
ies prediction errors. Calegari and Fargher (1997)
provides a logical starting point — they attempt to
replicate drift in the laboratory, using experi-
mental controls to rule out the possibility that
prediction errors are driven by factors other than
judgment errors.
9
Just as archival studies focus
only on ?rms with extreme earnings surprises,
Calegari and Fargher use time series that exhibit
unusually large earnings changes in the most
recent quarter. Their results are largely consistent
with archival research — both individual traders
and market prices underreact to earnings surprises.
Maines and Hand (1996) extend this ?nding in
two ways. First, they present MBA students with
two di?erent 40-quarter time-series. One series has
strong autoregressive and moving-average com-
ponents. Another is simply a seasonal random
walk with no such components. Subjects under-
react to both elements when they are present, but
also act as if the autoregressive element is present
when it is not. This suggests that drift may arise in
the target environment simply because it is too
di?cult for investors to discern the autoregressive
and moving average terms. Drift may therefore be
less severe for ?rms that adhere more closely to a
seasonal random walk. Second, Maines and Hand
directly test Bernard’s (1993) hypothesis that
investors anchor too strongly on earnings from the
same quarter of the previous year, perhaps
because it is stressed in the reporting format used
in the popular press. Maines and Hand test this
supposition by presenting a new set of subjects
with a Brown–Roze? time-series, and reporting
earnings relative to earnings from four quarters
ago. The results raise doubts about Bernard and
Thomas’s (1990) hypothesis, because these sub-
jects place even more weight on the autoregressive
component of the time series. These results suggest
the need to test for alternative causes.
Bloom?eld, Libby, and Nelson (2000a) argue
that drift may arise because people naturally over-
rely on unreliable information (Bloom?eld, Libby,
& Nelson, 2000b; Gri?n & Tversky, 1992), and
old earnings numbers tend to be unreliable pre-
dictors of future earnings, once more current
9
For example, investors and analysts could appear to make
prediction errors in archival studies because they respond to
information other than earnings, because they have incentives
for something other than prediction accuracy, or because they
are attempting to manage risk.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 787
earnings are known. They test this hypothesis by
manipulating information about old earnings per-
formance, holding recent earnings performance
constant. Student subjects rely much too heavily
on old earnings numbers, and generate errors
consistent with post-earnings-announcement drift,
even when they are presented with a time series
that is much simpler than that used in other
experiments. This suggests that drift may not arise
merely because the time-series properties of earn-
ings are so complex.
Future research in time-series perceptions might
follow several directions. One direction is to inte-
grate the di?erent research approaches described
above. The realistic time-series used by Calegari
and Fargher (1997) and Maines and Hand (1996)
allow them to generalize their results readily to
archival settings, but make it di?cult for them to
ascertain how aspects of the time-series data
interact with psychological processes to cause
prediction errors. The simpler time-series data
used in Bloom?eld, Libby, and Nelson (2000a)
poses precisely the opposite problem. Future
research might attempt to work toward the middle
of these two approaches, either by using time-
series that are progressively simpler than in the
former studies, or progressively more complex
than in the latter study.
Future research might also investigate the model
of Barberis, Shleifer, and Vishny (1998). That
model assumes that earnings follow a random
walk, but that investors believe that earnings
switch between regimes of positive autocorrelation
and regimes of negative autocorrelation. This
misperception results in both underreactions to
recent earnings changes and overreactions to long-
termtrends. While such misperceptions are broadly
consistent with psychological ?ndings indicating
representativeness and conservatism biases, no
single study supports its assumptions, and their
predictions are not entirely consistent with archi-
val evidence (e.g. Lee and Swaminathan, 2000).
Finally, future studies might attempt to inte-
grate research on time-series predictions with
other research streams that consider earnings pre-
diction more broadly. For example, how might
knowledge of earnings components (accruals, cash
?ows) alter subjects’ time-series predictions?
3.2.6. What personal and process attributes
determine analysts’ forecasting and valuation
performance?
As Maines (1995) notes, a number of studies in
the 1970s and 1980s examined the manner in
which expert and novice analysts process
accounting information (e.g. Mear & Firth, 1987;
Panko? & Virgil, 1970; Slovic, Fleissner, & Bau-
man, 1972; Wright, 1977). The studies assessed
various characteristics of information search, cue
weighting, judgment consistency and consensus,
and self-insight into information processing. A
number of the more recent studies in this group
used detailed process tracing techniques in an
attempt to tie individual or process attributes to
judgment accuracy (e.g. Anderson, 1988; Biggs,
1984; Bouwman, 1984). However, most studies
were only able to relate process attributes to
experience because of subject sample constraints
or di?culty in measuring judgment performance.
These earlier experiments also did not focus on the
e?ects of analysts’ incentives, which have received
a great deal of attention in recent archival studies.
Three recent studies have added substantially to
our understanding of the relationship of personal
and process variables to forecast accuracy as well
as the impact of relationship incentives on bias in
forecasts. Hunton and McEwen (1997) emphasize
both process measurement and disentangling
variables that are confounded in natural settings.
They address whether sell-side analysts’ search
strategies and incentives (in the form of their rela-
tionship to the company) a?ected the accuracy
and bias of their earnings forecasts. Information
search strategy was assessed with an eye move-
ment measurement system that eliminates most
concerns about the reactivity and validity of ver-
bal protocols. The authors measured the accuracy
of the forecasts made in the experiment as well as
historical accuracy from company archives, which
assures external validity. Analysts that followed a
more directed (as opposed to sequential) search
strategies were more accurate both in the experi-
mental task and in practice. The analysts in the
underwriting condition gave higher (more biased)
forecasts than those in the following condition,
which were higher than those in the no relation-
ship condition. Careful use of controls eliminates
788 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
concerns about omitted variables such as informa-
tion availability, time on task, and some forms of
selection that could have explained similar ?ndings
in archival studies (see Kothari, 2000 for a review).
Few studies have examined the knowledge and
abilities that lead to successful performance by
analysts. Ghosh and Whitecotton (1997) present
evidence that two standard psychometric measures
of information processing ability (perceptual abil-
ity and tolerance for ambiguity) were correlated
with forecast accuracy. But, as in Hunton and
McEwen (1997), experience was unrelated to
accuracy. However, Whitecotton (1996) reports
that experienced analysts outperformed MBA
students, who outperformed undergraduate stu-
dents, though the experienced analysts were the
most over optimistic.
Like similar work in auditing, these ?ndings are
potentially relevant to the selection and training of
analysts, as well as the interpretation of their
forecasts and reports. Again, the fact that results
here tie closely to archival data, gathered either in
the same study in the case of Hunton and McE-
wen’s (1997) accuracy measures, or in prior studies
in the case of their incentives ?ndings, adds to the
credibility of the results. Recent archival studies
by Mikhail, Walther, and Willis (1997), Clement
(1999), and Jacob, Lys, and Neale (1999) have
documented di?erences in the experiences of more
and less accurate analysts that may indicate direc-
tions for future research. In the auditing literature,
expertise studies have re?ned such ?ndings in studies
that specify the knowledge necessary to complete
various tasks, when it is acquired, and the
mechanisms through which knowledge content and
structure a?ect performance. These studies can
provide guidance for future ?nancial accounting
research in this area. Other recent work has begun
to look at how these individual responses a?ect
market-level performance and the characteristics
of markets that will a?ect information dissemina-
tion. This research is discussed in the next section.
3.3. How do individual responses to information
a?ect market-level phenomena?
Early experimental research in ?nancial account-
ing implicitly assumed that individual behavior
would a?ect market-level prices in some straight-
forward manner (e.g. the price might be simply the
average of all investors’ beliefs), and that some
investors would lose money to more sophisticated
investors by trading unwisely at market prices.
Counter-arguments by proponents of the e?cient
markets hypothesis have led many experimental
researchers to make these assumptions explicit and
subject them to testing. We divide this literature
into three lines: those that address di?erences
between individual and aggregate behavior, infor-
mation aggregation, and excess trading volume.
3.3.1. Di?erences between individual and
aggregate behavior
A number of papers examine whether or not
individual responses to information extend to the
market level. Two papers examine whether indivi-
dual responses to risk extend to the market level.
Coller (1996) shows that both individual traders
and market prices respond to uncertainty in public
disclosures in a manner roughly consistent with
Bayesian rationality. Bloom?eld and Wilks (2000)
show that, consistent with theoretical and archival
work on disclosure, more accurate disclosures
increase individual and market prices relative to
expected values, and also increase individual and
market liquidity. A larger number of papers show
that biases in individual decisions result in biased
market prices as well. For example, Calegari and
Fargher (1997) show that post-earnings-announce-
ment drift persists in a double auction market, and
Bloom?eld, Libby, and Nelson (2000a) show that
over-reliance on previous years’ earnings persists
in a clearinghouse market. Tuttle, Coller, and
Burton (1997) show that recency e?ects extend to
the market level.
Dietrich, Kachelmeier, Kleinmuntz, and Lins-
meier (2000) conduct a study closely related to the
functional ?xation (e.g. Hopkins, 1996) and volun-
tary disclosure (e.g. Kennedy et al., 1998) studies
discussed in Section 3.2.1. They demonstrate that
more explicit disclosure of accounting information
about oil-producing properties leads to more e?-
cient market prices even though the same infor-
mation can be inferred from the balance sheet and
income statement. Di?erent disclosure forms
either mitigate or exacerbate biases in prices. The
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 789
authors test their process explanation by tying
individual participant’s behavior to prices to
ensure that the market price results are the result
of individual information processing biases.
Other research investigates how competitive for-
ces might allow less biased traders to have more
in?uence on price, and use that explanation to
guide examination of when this is more likely to
occur. Of particular interest is the ‘‘smart-trader’’
hypothesis, which states that traders who are less
susceptible to the bias trade more actively than
other traders, driving prices to unbiased levels
(Camerer, 1987, 1992). The intuition behind this
hypothesis underlies the strong-form of the e?-
cient markets hypothesis, which states that prices
will fully re?ect information even if it is held only
by a small number of traders.
Anderson and Sunder (1995) provide evidence
that the smart-trader hypothesis might be more
predictive among professional traders than among
student traders. They compare the extent of base-
rate neglect in markets involving student subjects
with the bias in markets involving professional
traders. They report that price biases in markets of
professional traders exhibit less base-rate neglect
over time, while price biases in markets of students
do not. This is so even though the professional
traders’ individual value estimates do not appear
to di?er from the students’ estimates. This suggests
that the professional traders are able to trade in a
way that reduces bias more (or increases it less).
Bloom?eld, Libby, and Nelson (1996) provide
evidence favoring the smart-trader hypothesis in a
market in which security values are determined by
the answer to general business knowledge ques-
tions. Traders with more accurate answers do
indeed trade more actively than other traders.
When prices are in?uenced by trading volume,
prices become more accurate than the simple
average of all traders’ value estimates. (Prices are
no more accurate than average estimates when
they are not in?uenced by trading volume.) This
study might support the smart-trader hypothesis
more strongly than the studies above because
inaccurate traders are not biased, but merely
uninformed. It is possible that uninformed peo-
ple are more likely to know that their answers
are inaccurate (and therefore trade less aggres-
sively) than biased people, because biases are
unconscious.
Kachelmeier (1996a) uses an analysis of bids
and asks to show the di?culty in determining
exactly how markets can debias prices. He induces
a sunk-cost fallacy that signi?cantly increases sell-
ers’ asking prices and buyers’ bidding prices.
However, these biases have no e?ect on transac-
tion prices, because the higher bids and asks cause
more trades to take place at the bids, which keeps
prices low.
Other recent studies show that market structure
can be important in determining when the smart-
trader hypothesis is likely to be supported.
Ganguly, Kagel, and Moser (1994) present student
subjects with a problem that leads to base-rate
neglect. They ?nd that, because traders are not
allowed to sell shares they do not own (short-sell-
ing is prohibited), market prices are set by the
traders with the highest valuation. As a result,
market prices exhibit base-rate neglect most
strongly (weakly) when the biased prices are
higher (lower) than the Bayesian expected values.
Bloom?eld and Wilks (2000) ?nd strong indivi-
dual evidence of an ‘‘endowment’’ e?ect — incon-
sistent with Bayesian optimization, traders choose
higher ask (selling) prices for riskier securities,
even as they simultaneously enter lower bid (buy-
ing) prices. However, higher risk does not cause
the market ask price to rise. This form of irra-
tionality at the individual level is eliminated at the
market level because the market ask is determined
by the lowest individual ask. The market ask,
therefore, re?ects the selling price of the investor
who succumbs least to the endowment e?ect. In
this way, the structure of the market combines
with the nature of the bias to mitigate the bias at
the market level.
Future research could examine the foundations
of the smart-trader hypothesis more directly. In
particular, what factors might induce less-biased
traders to exploit biases, or keep them from doing
so? What factors might make more-biased traders
curtail their trading activity? How might changes
in market structure, or the degree of market depth
and liquidity, a?ect bias mitigation? (Archival
studies routinely show larger biases in less liquid
stocks.) Future research could also examine how
790 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
the nature of ?nancial accounting information will
a?ect the di?erence between individual and
aggregate behavior. To the extent that informa-
tion induces biases, rather than degrees of
informedness that di?er across traders, prices
would seem more likely to represent an average of
all traders’ beliefs.
3.3.2. Information aggregation and underreaction
A di?erent stream of research examines the
ability of ?nancial markets to aggregate informa-
tion held by di?erent traders. Like studies of the
smart-trader hypothesis, aggregation studies are
motivated by the belief that traders who know a
security value does not re?ect their own informa-
tion will trade aggressively to exploit that fact,
thereby revealing their information to the market.
Early studies on information aggregation
showed that markets do often aggregate informa-
tion. They do so most e?ectively when security
values are tied to states of nature in very simple
ways (O’Brien & Srivastava, 1991; Plott & Sunder,
1988), and when experienced traders have com-
mon knowledge regarding the information envir-
onment (Forsythe & Lundholm, 1990).
More recent studies have examined how uncer-
tainty a?ects information aggregation. In a series
of double-auction markets, Lundholm (1991)
manipulates the ‘‘aggregate uncertainty’’ that
remains after combining investors’ information
about security value. He ?nds that markets with
aggregate uncertainty aggregate information much
less e?ciently than those with aggregate certainty.
Imperfect aggregation can lead markets to under-
react to information, because prices will be too
high when the aggregate information indicates a
very low value, and too low when the aggregate
information indicates a very high value. Bloom-
?eld (1996a, 1996b) shows a similar type of
underreaction in a setting which allows aggregate
certainty, but in which the information structure is
su?ciently complex that information aggregation
is still very di?cult.
Other papers show that market prices can even
underreact to public information that need not be
aggregated. Gillette, Stevens, Watts and Williams
(1999) construct a market in which security values
are determined by a sequence of random dividends.
The authors analyze the market’s reactions as the
dividends are announced publicly one-by-one.
They ?nd that the individual traders’ estimates of
value underreact slightly to the dividend announce-
ments, possibly because they erroneously believe
that random events tend to reverse over time (the
‘‘gambler’s fallacy’’). More interesting is the fact
that market prices underreact substantially more
than individual value estimates. The reason for
this sluggishness in market prices is not clear, but
the authors replicate it in both double-auctions
and call markets. Bloom?eld, Libby, and Nelson
(2000b) also observe a similar e?ect in clearing-
house markets. Bloom?eld (1996a) shows that
markets react to a public signal when it is subject
to manipulation by a self-interested seller, but not
when the signal is purely random. These results
raise the possibility that post-earnings-announce-
ment drift and underreactions to other informa-
tion (e.g. fundamental values, analysts’ estimates)
may arise simply due to a generic underreaction of
market prices to information, rather than infor-
mation-speci?c biases.
Several future directions for research in this area
entail making endogenous the distribution of
information among subjects. All of the aggrega-
tion studies described above manipulate informa-
tion distribution by exogenously altering who is
given information and who is not. Future studies
might relax this assumption by recognizing that
collection of information is an intentional action
that is driven in part by the perceived bene?t of
becoming informed, as in Tucker (1997). Alter-
natively, one might recognize that some informa-
tion may be e?ectively widely distributed because
it is more easily analyzed. For example, Sloan’s
(1996) archival evidence that prices are too high
(low) when ?rms have high (low) accruals might
simply re?ect an underreaction to ?nancial state-
ment information that is not widely known. This
result is consistent with Bloom?eld and Libby’s
(1996) ?nding that laboratory markets respond
more strongly to information that is more widely
available. However, a more direct test of this
hypothesis would be to give all traders the same
information (e.g. a complete ?nancial statement),
and vary the ease with which the information can
be analyzed (as in Dietrich et al., 2000), as well as
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 791
the traders’ knowledge and training that would
help with such analysis.
More generally, researchers might start with the
features we argue are essential for progress in
functional ?xation research — explicitly under-
standing how people process and interpret the
information in ?nancial statements, and then con-
sidering how di?erences in that processing might
alter market behavior.
3.3.3. Trading volume
A third line of research examines the determi-
nants of trading volume in laboratory markets.
Many of these studies are motivated by a general-
ization of the ‘‘no-trade’’ theorem (Milgrom &
Stokey, 1982), which shows that under fairly gen-
eral conditions, information releases should not
induce any trading between traders. The intuition
is that if one trader expects to make money trad-
ing at a given price, the trader on the other side of
the transaction must expect to lose money (since
trading is a zero-sum game).
Gillette et al. (1999) ?nd routine violations of
the no-trade theorem: trading volume is generally
quite high, and is even higher after very high or
low dividend announcements. These results are
consistent with archival evidence on trading
volume (e.g. Bamber, 1987; Bamber, Barron, &
Stober, 1997), which have generated a number of
theoretical models that generate trade through
complex interactions between public and private
information (e.g. Kim & Verrecchia, 1994). How-
ever, the simplicity of the market in Gillette et al.
(1999) makes such explanations unlikely.
Excess trading is a puzzle in Gillette et al.
(1999), but it has few welfare implications because
all traders are identical, and therefore wealth
transfers can be ignored (or are at best impossible
to interpret). Bloom?eld, Libby, and Nelson
(1999) examine excess trading that has very clear
welfare implications. They create markets in which
less-informed traders hold a subset of the infor-
mation available to better-informed traders. Less-
informed traders unwisely trade with — and lose
money to — the more-informed traders. However,
additional instructions that clarify to less-
informed investors the extent of their informa-
tional disadvantages reduce these wealth transfers
(although it has no apparent e?ect on price bia-
ses). These results have regulatory implications:
less sophisticated individual investors (who have
less information than more sophisticated indivi-
duals or institutional investors) can be protected
by regulations that emphasize the extent of their
informational disadvantage.
There appear to be a number of open questions
related to trading volume. Archival papers have
examined volume in response to earnings
announcements, or tie volume to pricing anoma-
lies (Lee & Swaminathan, 2000; Swaminathan &
Lee, 2000). These ?ndings may be caused by fac-
tors indicated in economic models (e.g. Kim &
Verrecchia, 1994) or by psychological factors. The
literature on motivated reasoning seems particu-
larly promising, because it examines how initial
variations in beliefs and preferences can be mag-
ni?ed by ambiguous public disclosures of infor-
mation (Wilks, 2001).
3.4. How do strategic interactions between
reporters and users of information a?ect reporting
and market outcomes?
Game theory has been exceptionally useful in
modeling the strategic interactions between sellers
(who can make reports about their value) and
buyers who rely on those reports in making their
trading decisions. These models potentially have
regulatory implications, because they show that
seemingly reasonable regulations may be unneces-
sary or unwise when one considers the joint
response of buyers and sellers to the regulation.
The models are very di?cult to test with archival
methods, because their predictions are derived in
settings that are far simpler than natural markets.
However, a number of experimental researchers
have chosen to examine behavior in settings that
closely resemble those described in the models. In
this section, we brie?y review some of these
experiments.
One line of research examines voluntary dis-
closure models, in which sellers choose between
honestly disclosing the exact value of the security
they are selling, and not disclosing anything at all.
Two papers by King and Wallin ?nd strong sup-
port for the qualitative predictions of the models
792 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
of Jung and Kwon (1988), and Wagenhofer
(1990). King and Wallin (1991b) ?nd that increas-
ing the probability that the seller is informed leads
sellers to disclose more often, and also leads buyers
to draw more unfavorable inferences when they do
not observe disclosure (making disclosure a wise
strategy for sellers). King and Wallin (1995) show
that disclosure is also limited by introducing a cost
to disclosing favorable information (a competitor
who will take advantage of favorable disclosures
to enter the seller’s product market), because even
high-value ?rms might choose not to disclose. In
both cases, however, results deviate substantially
from the point predictions of the models.
Forsythe, Lundholm, and Reitz (1999) show
how disclosure regulations a?ect the welfare of
buyers and sellers in a simple market with volun-
tary disclosure. When sellers are not permitted to
disclose their information about value, many sur-
plus-enhancing transactions do not occur, and
both buyers and sellers su?er. Allowing sellers to
disclose any value (even a false one) increases
market surplus, but these gains accrue almost
entirely to the sellers. Requiring sellers’ reports to
include the true value shifts part of this surplus
from the sellers to the buyers.
King (1996) examines whether disclosure pat-
terns change when sellers have an opportunity to
develop reputations. He permits sellers to report
any value they wish, but imposes a cost on buyers
when the seller’s report is inaccurate. This setting
includes two equilibria. In an ‘‘in?ation’’ equili-
brium, sellers always report the highest value, and
buyers pay expected value net of the cost of inac-
curacy. In a ‘‘reputation’’ equilibrium, the seller
reports honestly, and the buyers believe the reports
until the seller reports dishonestly; at that point,
the players revert to the in?ation equilibrium.
King ?nds that an exogenous cost for inaccuracy
does permit reputation formation, but that the
reputation equilibrium arises only in a few cases.
There are several natural directions for research
in strategic disclosure. There is certainly no short-
age of new disclosure models to test. However, it is
probably more important for researchers to begin
to delve into how and why various equilibria do
and do not have predictive power. Some research-
ers have begun doing so by asking whether
‘‘adaptive’’ strategies (doing more of strategies
that performed better in the past) lead to a given
equilibrium. For example, King and Wallin (1995)
?nd little support for an ‘‘adaptively unstable’’
equilibrium that is not the end result of adaptive
behavior. Other researchers focus more directly on
the players’ thought processes. For example,
experiments by Bloom?eld and Hales (2000)
examine how sellers’ abilities to form reputations
for honest reporting are in?uenced by buyers’ and
sellers’ expectations of one another’s likely beha-
vior and beliefs.
Future research might also begin to integrate
disclosure research with the other literatures
described in this section. For example, Bloom?eld
(1996a) integrates the disclosure literature with the
information aggregation literature by showing that
sellers are willing to pay a fee to in?ate a public
signal, even though the information available to
the market as a whole is unchanged. They are will-
ing to do this because markets tend to react more
strongly to information held by more investors.
Researchers might also integrate economics-
based disclosure research with the psychology-based
literature described in Section 3.1. That research
focuses on how investors could use ?nancial
reporting choices to draw inferences about man-
agers’ incentives and information, but ignores the
fact that managers should anticipate investors’
reactions. On the other hand, the psychology-
based research presents a more comprehensive
treatment of ?nancial accounting institutions, by
allowing managers to choose how to classify and
report accounting information. We believe it
would be worthwhile — though di?cult — to
examine fully strategic interactions in more complex
accounting institutions. Researchers in ?nancial
accounting might also attempt to integrate game
theory and social psychology, as has been done
successfully in the auditing context by King (2001).
4. E?ective and e?cient research design:
methodological considerations in experiments
Section 3 presented a number of directions for
future experiments. In this section, we discuss
how these experiments can be designed to be
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 793
both e?cient and e?ective. An experiment is e?-
cient if it achieves a given level of e?ectiveness as
economically as possible. An experiment is e?ec-
tive if it provides evidence of su?cient internal
validity that readers should believe the results of
hypothesis tests, while being of su?cient external
validity that it bears on a signi?cant part of the
?nancial accounting issue of interest.
10
Both
internal and external validity are key to e?ective-
ness. An experiment that lacks internal validity
fails by providing a misleading indication of the
relation between the dependent and independent
variable, while an experiment that lacks external
validity produces results that are (or at least
should be) divorced from the motivation of the
study. We do not provide an exhaustive treatment
of research design (see Kinney, 1986; Runkel &
McGrath, 1972; Trotman, 1996 for more compre-
hensive discussions). Rather, we focus on issues
that we believe are particularly important or are
often misunderstood. Section 4.1 addresses tech-
niques for maximizing e?ectiveness through care-
ful hypothesis development and research design.
Section 4.2 addresses when it is (and is not) possi-
ble to improve e?ciency by consuming fewer
resources without sacri?cing e?ectiveness. We
address the number and type of subjects used in
the experiment, the payment of monetary incen-
tives, the use of within-subject designs, and the
decision to use single-person tasks rather than
interactive tasks (such as ?nancial markets or
strategic reporting settings).
4.1. Increasing experimental e?ectiveness
We organize our discussion of experimental
e?ectiveness around the predictive validity model
(Libby, 1981; Runkel & McGrath, 1972). This
model provides a useful description of the
hypothesis testing process, and focuses our atten-
tion on the key determinants of the internal and
external validity of a research design.
Fig. 1 illustrates the predictive validity model as
it applies to Hypothesis H1b from Hunton and
McEwen (1997; hereafter, HM). As noted earlier,
based on prior theory and evidence HM hypothe-
sized that sell-side analysts’ relationship-based
incentives would decrease their forecast accuracy.
Analysts’ relationship-based incentives were oper-
ationally de?ned as a three-level independent
variable: an ‘‘underwriting relationship’’ that has a
direct impact on fees, a ‘‘following relationship’’
that creates the need for future access to private
information, or ‘‘no future relationship.’’ HM
expect analysts in the underwriting condition to
provide the most optimistic forecasts, those who
follow the ?rm to be next most optimistic, and
analysts who do not follow the ?rm to be the least
optimistic. They operationally de?ne optimism
(the dependent variable) as the analysts’ forecast
minus the actual earnings outcome. HM also con-
trolled for a number of other potentially in?uential
variables including subject background, experi-
ence, time on task, and information availability.
In Fig. 1, link 1 depicts the relationship in HM’s
underlying theory. No theory can be tested
directly; rather, a theory is tested by assessing the
relationship between the operational de?nitions of
key concepts in the theory (i.e. by assessing link 4).
For this test to be valid, the links between the
concepts and the operational de?nitions (links 2
and 3) must be valid, and other factors that might
a?ect the dependent variable (link 5) must be
controlled or have no e?ect. A study’s internal and
external validity is determined by the validity of
these ?ve links. We now discuss ways in which
researchers can strengthen each of these links.
4.1.1. Link 1: theory and hypotheses
The ?rst determinant of experimental e?ective-
ness is speci?cation of a good research question. A
good research question addresses the relation
between two or more concepts, can be stated
clearly and unambiguously as a question, implies
the possibility of empirical testing, and is impor-
tant to the researcher and others (Kinney, 1986).
Experimental tests of research questions must
rely on some theory depicting forces that in?uence
behavior in the experimental setting. Theories may
range from highly speci?c numerical models (such
10
Internal validity is the degree to which you can be sure
that observed e?ects are the result of the independent variables.
External validity is the degree to which results can be general-
ized beyond the speci?c tasks, measurement methods, and par-
ticipants employed in the study.
794 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
as those derived from economics or arti?cial-
intelligence cognition models) to more general
qualitative predictions based on prior evidence
(such as systematic evidence that people use a
certain heuristic in a given setting). Regardless of
its nature, the theory suggests the expected answer
to the research question, and serves to guide the
many decisions and tradeo?s that must be made
during the design and administration of an experi-
ment. Whereas archival researchers analyze data
from secondary sources,
11
the experimental setting
is speci?cally designed to gather data relevant to
the hypotheses. Consequently, all stages of the
design of experiments are profoundly a?ected by
the need for a well-formulated research question
and hypotheses. In this section, we emphasize four
issues that are particularly important in develop-
ing good research questions and hypotheses in
experimental ?nancial accounting research.
First, the hypotheses must have external valid-
ity; that is, readers must believe that the theore-
tical concepts and the relationships between them
capture important aspects of the target environ-
ment. Although people often speak of external
validity as an aspect of experimental stimuli, we
consider it an element of theory as well. If the
theory and hypotheses are appropriately capturing
relationships among elements of the target envir-
onment, an internally valid experiment will test
that theory in a manner that generalizes to the tar-
get environment. External validity is established
empirically by extensions of the research that test
additional hypotheses concerning environmental
contingencies that de?ne the limits of generality of
the initial hypotheses (Trotman, 1996).
For example, HM’s research question of ‘‘Do
sell-side analysts’ relationships with the ?rms they
cover decrease their forecast accuracy?’’ relates an
antecedent (analysts’ relationships) and consequ-
ence (forecast accuracy) that clearly maps into ?rst
order concerns indicated by theory and prior evi-
dence. If the experiment operationalizes those
concepts well and provides an internally valid test
of their relation, it will provide insight into the
real-world e?ect of analysts’ incentives on their
judgments. Future research can then test the
extent to which those insights can be generalized.
Second, experimental research questions in
?nancial accounting should focus on how theories
drawn from fundamental disciplines (such as psy-
chology and economics) interact with details of
?nancial accounting institutions (as discussed in
Section 2.4). As Gibbins and Swieringa (1995)
suggest, accounting experiments should be ‘‘both
theory driven and setting sensitive.’’
Tying the accounting institution to theory from
a fundamental discipline allows hypotheses to
have relevance beyond the very speci?c practice
context that motivated the experiment (as recom-
mended by Maines, 1994). It also allows experi-
menters to contribute to both ?nancial accounting
and the fundamental discipline. For example,
Nelson and Kinney (1997) apply Einhorn and
Hogarth’s (1986) ambiguity model to predict how
Fig. 1. Predictive validity framework.
11
That is, the data is initially gathered for a di?erent purpose.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 795
ambiguity a?ects ?nancial statement auditors’ and
users’ judgments of appropriate contingent-liabi-
lity disclosure. Their study shows how the di?er-
ences between auditors’ and users’ incentives lead
auditors to use the discretion provided by ambig-
uous evidence to justify lower levels of disclosure
than users desire. This result is of clear interest to
?nancial accounting researchers, and also con-
tributes to psychologists’ understanding of how
incentives interact with ambiguity.
12
A more ambitious approach is to use funda-
mental disciplines to develop and experimentally
test a general theory that is applied to the ?nancial
accounting phenomenon of interest. For example,
Maines and McDaniel (2000) identify various
general dimensions of formats that signal infor-
mation importance or that a?ect the cognitive cost
of processing information (see also Lipe, 1998).
They apply their theory when testing whether
information-disclosure format a?ects considera-
tion of the volatility of unrealized gains and losses,
but their theory is much broader than the parti-
cular practice context that they examine.
Third, researchers should frame their theories at
the least speci?c level that can account for the data
expected to arise from the experiment. Stating the
theory with greater speci?city will simply encou-
rage readers to argue that the results are driven by
a slightly di?erent theory (such as a di?erent the-
ory of categorization) that yields identical predic-
tions in the experimental setting. Such debates are
rarely productive. If the distinction is likely to be
important in accounting settings, researchers inter-
ested in accounting issues should consider what
other experiments might illustrate this importance.
If the distinction is unlikely to have important
rami?cations for accounting settings, experiments
discriminating between such theories are more
appropriately seen as contributions to the funda-
mental disciplines from which the theory is drawn.
Finally, experimental research questions should
be based on a theory that describes causal rela-
tionships between concepts. As discussed above,
the key advantage of the experimental method lies
in its ability to disentangle factors that are con-
founded in natural settings, and thus provide
indications of how and why phenomena arise. A
causal theory also improves external validity,
because causal forces are more likely to generalize
to di?erent settings. This also leads to a preference
for research questions that focus on a directional
prediction of di?erences, as opposed to a single
point prediction. As Trotman (1996) indicates,
‘‘the basis of any experimental design is that one
or more independent variables are manipulated
and the e?ect on the dependent variable(s) is
observed.’’ Since experiments require abstraction
from the real world, any number of di?erences
between the experimental and real-world environ-
ments could a?ect the particular levels of observed
measures. Consequently, evidence consistent with
point predictions (e.g. ‘‘the market price will be
$5.00’’) and particular parameter estimates (e.g.
‘‘managers will weight current year’s earnings
twice as heavily as prior year’s earnings’’) are
unlikely to generalize to real-world environments.
Directional e?ects are more likely to generalize,
because di?erences between the experimental set-
ting and the target setting are more likely to alter
the magnitude of an e?ect than its direction. A
focus on directional e?ects also makes it much
easier to design an experiment that controls for
competing explanations. We discuss this latter
issue further in Section 4.1.3.
4.1.2. Links 2 and 3: operationalizing dependent
and independent variables
Link 2 relates the antecedent theoretical concept
A to the independent variable(s) operationalized
in the experiment. Link 3 relates the consequential
concept B to the dependent variable operationalized
in the experiment. An internally valid test requires
manipulation of each independent variable in a
way that changes only one theoretical antecedent
at a time. At the same time, they must construct
an operational dependent variable that measures
the conceptual variable, and that variable alone.
This section discusses three particularly di?cult
12
Of course, the theory should entail some element of doubt
before testing. Experiments applying psychology to accounting
settings can be uninteresting if readers are certain that the
results obtained in psychology will readily extend to accounting
even without seeing the experimental results. Experiments
applying economics to accounting settings can be uninteresting
if they are little more than complex ways of showing that peo-
ple prefer more money to less (Kachelmeier, 1996b).
796 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
issues in operationalizing variables: (1) choosing
the appropriate realism of the stimuli presented to
participants; (2) choosing the appropriate levels of
independent variables; and (3) using measured
independent variables.
4.1.2.1. Realism of stimuli. A common challenge
in operationalizing independent variables is decid-
ing how realistic the stimuli should be. The
appropriate level of realism in the operationaliza-
tion of an independent variable is determined by
the role of realism in the theory to be tested.
Experiments testing psychological theories typi-
cally present participants with more realistic sti-
muli than experiments testing economic theory,
because psychology-based experiments are typi-
cally focused on how participants make decisions
using cognitive processes and knowledge that
developed in response to their real-world educa-
tion, training, and experience. Without relatively
realistic stimuli, participants may not rely on the
cognitive processes and knowledge of interest. For
example, HM’s theory relates analysts’ knowledge
of their incentives to their earnings estimates. In
order to test this theory, the experiment must
provide the participants with a su?ciently realistic
stimulus to activate that knowledge. Similarly,
Hopkins (1996) tests the theory that classi?cation
of debt-equity hybrid securities alters analysts’
inferences about ?rm value; this theory can be
tested only with relatively realistic stimuli and
value-assessment tasks.
Experiments testing economic theories typically
present participants with less rich information and
less realistic stimuli, because they focus on how par-
ticipants make decisions using economic informa-
tion given particular preferences, constraints, and
incentives. The decision processes depicted in these
theories are not hypothesized to depend on task
realism, so these studies are less concerned with it.
For example, King and Wallin (1991a) test theories
relating the probability that a seller knows the
asset value to the sellers’ disclosure strategies and
buyers’ responses to those disclosures. That study
does not require realism or knowledge of parti-
cular real-world institutions, so it uses abstract
stimuli and tasks to avoid introducing extraneous
factors that might compromise internal validity.
This discussion should not be construed as indi-
cating that all experiments testing theories drawn
from psychology (economics) must have high (low)
stimulus realism. Experiments testing very general
psychological theories (such as the relation between
short-term memory and optimism) could contribute
to ?nancial accounting research with stimuli and
tasks that possess very low degrees of realism.
Similarly, experiments testing the e?ects of super-
ior accounting knowledge on trading pro?ts
would require high degrees of realism. It is the
goal of the experiment that determines whether
realism adds to or detracts from internal validity.
Stimulus realism can also provide bene?ts
beyond that required for an internally valid test of
the underlying theory. First, realism can help
authors convey to readers the ways in which the
results relate to prior research. For example,
Hopkins (1996) and Tan, Libby, and Hunton
(2000) are able to compare their pricing and earn-
ings-forecast di?erence results for some treatments
directly to prior archival studies, which increases
con?dence in the generality of the results of treat-
ment combinations for which no (or insu?cient)
archival data are available. Second, realism can
help subjects understand the task they are being
asked to perform, thereby reducing noise in the
data. This may be particularly important in eco-
nomics-based experiments, which place high
demands on participants’ attention.
However, it is important not to exaggerate the
bene?ts that stimulus realism provides when it is
not directly enhancing internal or external valid-
ity. Such realism may not substantially increase
external validity, which is determined mainly by
the theory itself and how e?ectively the theoretical
constructs have been operationalized. Similarly, it
is important not to exaggerate its costs. Experi-
mental economists often worry that realism may
in?uence behavior in ways that lie outside their
theories, and thus reduce internal validity
(Camerer, 1997; Smith, 1976), but as we will dis-
cuss in Section 4.1.3, these concerns typically can
be dealt with through good experimental design.
4.1.2.2. Choosing levels of independent variables.
After choosing the nature of independent vari-
ables, the researchers must choose their levels. A
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 797
general goal is to choose levels that are di?erent
enough that the experiment has su?cient power to
yield strong e?ects, yet be within the relevant range.
As indicated above, in some cases it is appro-
priate to choose levels that depict real-world con-
ditions. For example, HM’s independent variable
consists of treatment levels that re?ect what ana-
lysts might experience in practice. Given that their
theory is testing the relation between those real-
world incentives and analysts’ behavior, this rea-
listic depiction provides a strong test of the theory.
However, it is usually di?cult to ensure a repre-
sentative sample of independent variable values,
which limits the interpretability of levels of e?ects
and parameter estimates in most experiments.
Choosing realistic versions of naturally occurring
phenomena can also make it di?cult to manip-
ulate only a single theoretical antecedent while
holding all others constant. This is particularly
true in studies of alternative accounting methods
or disclosures, where di?erences in method or dis-
closure (the experimental treatments in these
studies) can convey unintended information about
the nature of the underlying transactions that
a?ect the dependent variable but are not included
in the theory being tested. Experimental controls
discussed under link 5 can be employed to reduce
this concern (e.g. Hopkins, 1996).
In other cases, it can be wise to create levels that
are unrealistically extreme. For example, For-
sythe, Lundholm, and Reitz (1999) compare a
regulatory regime that prohibits disclosure with
one that allows any disclosure (even fraudulent
statements). While these levels are unrealistic, they
allow a very powerful test of e?ects that would
likely generalize to milder changes in disclosure
regulations.
It can even be useful to specify at least one level
of the independent variables that cannot occur in
practice, to enable a cleaner test of the underlying
theory. One example of this approach is provided
by Libby and Tan (1999). They seek to understand
how analysts can say they reward ?rms for issuing
early warning of negative earnings surprises, while
actually punishing them in their forecast revisions.
Libby and Tan address this question by oper-
ationalizing three ‘‘warning’’ conditions. Two
conditions are realistic: one in which no warning
occurs prior to an earnings announcement, and
one in which the warning is followed by the nega-
tive earnings announcement. A third condition
cannot exist in practice: the warning and negative
earnings announcement occur simultaneously.
This ‘‘simultaneous warning’’ condition allows
them to separate the e?ect of the warning from the
sequential processing of two signals by creating
two comparisons (each treatment compared to the
simultaneous warning condition) that manipulate
only one antecedent. The other two settings
enhance external validity by mapping naturally
into the institutional setting and archival ?ndings
the authors seek to inform.
Regardless of how one chooses the levels of the
independent variables, it is usually advisable to
conduct manipulation checks. These are measures,
often taken during debrie?ng, which seek to deter-
mine whether subjects noticed and interpreted cor-
rectly the independent variable(s). Manipulation
checks test link 2 of the predictive validity frame-
work. Manipulation checks are particularly useful
when analyses reveal no signi?cant treatment
e?ect, since one alternative explanation for the
lack of a signi?cant e?ect is ine?ective oper-
ationalization of the independent variable (a link 2
problem). However, it is critical that the manip-
ulation check tests recognition and comprehension
of the independent variable, as opposed to serving
as another test of the treatment e?ect. Otherwise,
the manipulation check is really just a second
measure of the dependent variable (testing link 4
rather than link 2).
4.1.2.3. Measured independent variables. Some
independent variables in accounting experiments
are observed, rather than manipulated. Because
subjects are not assigned randomly to measured
treatment levels, measuring independent variables
gives up some of the experimentalist’s comparative
advantage. Such studies are subject to the same
correlated-omitted variables problems that com-
promise internal validity in archival research.
Therefore, it is typically preferable to manipulate
important independent variables whenever possi-
ble, rather than measuring them.
However, there are at least four circumstances
where measuring independent variables is useful.
798 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
The ?rst is that it is impossible or impractical to
manipulate an antecedent. For example, HM
hypothesize that analysts that are considered by
their ?rms to be more accurate forecasters tend to
use a more directive, hypothesis-driven evidential
search strategy. Because HM cannot randomly
assign analysts to ‘‘high historical accuracy classi-
?cation’’ and ‘‘low historical accuracy classi?ca-
tion’’ treatments, it is possible that historic
accuracy classi?cation is correlated with some
other variable (such as age or intelligence) that
determines use of a directive search strategy. As a
consequence, HM include a number of control
variables to test these alternative explanations for
results, and are careful to discuss these results in
terms of ‘‘associations’’ rather than ‘‘causes.’’ A
second reason to use measured independent vari-
ables is that the theory relating the antecedent to
the consequence involves mediating variables (a
sequence of links through intervening variables).
For example, Hopkins (1996) predicts that the bal-
ance-sheet classi?cation of manditorily redeemable
preferred stock (concept a) a?ects analysts’ beliefs
concerning the total amount of equity outstanding
(concept b), which in turn a?ects their stock price
estimates (concept c). Because analysts’ beliefs
about outstanding equity are actually a dependent
variable in a part of his theory (a a?ects b), Hop-
kins cannot manipulate it directly. Those beliefs
become a measured independent variable when
testing the second part of the theory (b a?ects c).
Similarly, almost every multi-person task involves
intervening variables, because the behavior of one
person is determined by the (necessarily endogen-
ous) behavior of another. For example, King (1996)
tests whether imposing exogenous costs on buyers
for inaccurate value estimates induces sellers to
report values accurately. One simple breakdown
of this theory is that exogenous costs (concept a)
reduce the prices buyers are willing to pay when
the seller has previously reported inaccurately
(concept b), which leads the seller to choose higher
reporting accuracy (concept c). Because equili-
brium models involve many forces acting simulta-
neously (e.g. the seller should anticipate the
buyers’ response to his reports, and the buyers
should anticipate the seller’s response to their
likely price-setting behavior), it is di?cult to
measure all of those forces simultaneously in one
experiment. Thus, King measured some potential
intervening variables (he chose to examine how
sellers’ reporting accuracy a?ects buyers’ reliance
on those reports), but not others.
One way to avoid measured independent vari-
ables is to construct separate experiments testing
the separate parts of the theory. Hopkins could
have tested the ‘‘a,b’’ and ‘‘b,c’’ links separately or
in sequence, reasoning that ?nding support for
both links suggests (but does not demonstrate) an
‘‘a,c’’ link. However, he chose to provide a clean
test of the ‘‘a,c’’ link by testing it directly, and
using subsequent measurement of ‘‘b’’ to provide
comfort that subjects behaved as predicted. Simi-
larly, King could have separately tested buyers’
responses to seller decisions. However, we believe
that both authors were justi?ed in focusing their
cleanest tests on the primary antecedent and con-
sequence concepts in their theory. A full under-
standing of the causal path may be somewhat
encumbered by the problems associated with
measured independent variables, but remaining
problems can be addressed in future research. For
example, Bloom?eld and Hales (2000) use a series
of experiments to understand more of the linkages
in King’s study.
Third, it is sometimes much less interesting to
examine reactions to a manipulated variable than
a naturally occurring one. For example, it would
have been less interesting for Hopkins (1996) to
test whether analysts who are told that there are
more shares outstanding would place a lower value
on a ?rm’s stock, all else held equal. It seems much
more reasonable to ask whether the same analysts
would use that belief to assess stock value when the
belief arises naturally. This type of concern is even
more salient in tests of equilibrium models.
Fourth, measured variables often provide the
keys to understanding underlying processes that
produce the e?ects of interest. For example,
Maines and McDaniel (2000) make a contribution
by demonstrating e?ects of format on judgments
of management e?ectiveness and stock risk (an
‘‘a,b’’ link), even though their lack of signi?cant
e?ects of format on valuation could be viewed as
an insigni?cant ‘‘a,c’’ link. After all, each inter-
vening successive link adds noise and diminishes
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 799
the experimenter’s ability to detect an e?ect of ‘‘a’’
on a later consequence (particularly when the later
consequence is a very complicated judgment like
stock valuation). Only by eliciting intervening
variables does a clear pattern of results emerge.
Hirst, Koonce, and Miller (1999) demonstrate the
importance of specifying the correct causal path.
They show that the form of a forecast will a?ect
trading decisions, not through estimates of future
earnings, but through con?dence in estimates. This
further highlights the need to elicit intervening
dependent variables that aid in interpreting results
with respect to tests of complex theories. We
encourage researchers to measure potential inter-
vening variables whenever possible, if only after
they measure their primary dependent variable.
13
4.1.3. Links 4 and 5: statistics and other
potentially in?uential variables
As noted earlier, internal validity refers to the
degree to which variation in the dependent variable
can be attributed to variation in the independent
variable. Link 4 assesses the relations between the
operational independent and dependent variables.
Link 5 captures ‘‘other potentially in?uential’’ or
‘‘extraneous’’ variables besides the independent
variable that could a?ect the dependent variable. A
key advantage of the experimental approach is that
the e?ects of extraneous variables can be controlled
for primarily by holding them constant or through
randomization. As a result, statistical analyses in
experiments are typically straightforward, often
consisting of simple t-tests, ANOVAs, or non-
parametric equivalents. Extraneous variables can
also be measured as in archival studies, and used
to enhance the power of analyses by accounting
for variation in the dependent variable that is not
related to the theory being tested. Finally, extra-
neous variables can be manipulated to directly test
their e?ect. Given the expense typically associated
with this approach, it should only be used when
the experimenter believes the extraneous variables
cannot be dealt with another way.
Very complex statistics are typically necessary in
experiments only when they rely heavily on mea-
sured independent variables, or when researchers
must try to boost power when subject resources
are scarce. When those circumstances are not
apparent, complicated statistical tests may signal
poor experimental design — the experimenter is
trying to grapple after the fact with concerns that
should have been headed o? with good experi-
mental design.
This section describes some of the powerful
array of techniques experimenters can use to deal
with extraneous variables. The most important
technique available to the experimentalist to con-
trol for extraneous variables is to assign subjects
randomly to treatments. Random assignment,
combined with manipulation of independent vari-
ables, enables experimentalists to ensure that their
results are not biased by factors of which they are
aware, as well as factors of which they are not
aware. For example, HM randomly assign ana-
lysts to incentive-treatment conditions. This
results in an unbiased distribution of industry
familiarity, age, experience, prior accuracy, etc.
across the three levels of the incentive treatment.
Thus, HM can conclude, with a speci?ed level of
statistical con?dence, that these variables, and
other unspeci?ed variables such as motivation or
breakfast size, did not account for the results. In
fact, had HM not chosen to measure analysts’
experience and use it as a covariate to reduce var-
iance in their analysis, they could have ignored
experience and expected that it would not a?ect
their mean results because of random assignment
across treatment conditions.
More generally, random assignment to treat-
ment conditions allows experimentalists to avoid
many of the omitted variable concerns that limit
causality inferences in archival studies. For exam-
ple, Kothari (2000) notes that the direction of
cause and e?ect between relationship and forecast
optimism documented in the archival literature is
not clear. It could as easily result from managers’
selection of investment banks whose analysts pro-
vide a more optimistic forecast as from opportu-
nistic forecasting by analysts with relationships.
This selection alternative explanation is eliminated
in HM by random assignment of analyst subjects.
13
Of course, the experimenter needs to worry about carry-
over e?ects (i.e. earlier measurements a?ecting later behavior).
Sometimes the order in which successive dependent variables
are elicited is manipulated between subjects to reduce this con-
cern. This is discussed further in Section 4.1.3.
800 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
As a result of random assignment, the expected
value of analyst optimism prior to the treatment is
unbiased across the three incentive treatment
groups.
Second, experimentalists can hold extraneous
variables constant at a particular level. For exam-
ple, HM hypothesize that analysts who will be
underwriting securities exhibit di?erent forecast
bias than analysts who do not, because they face
di?erent incentives. However, compared to non-
underwriting analysts, underwriting analysts could
also have larger amounts of information available
about a ?rm, or spend di?erent amounts of time
forecasting earnings. HM deal with these potential
alternative explanations for changes in their
dependent variable (forecast accuracy) by holding
constant across treatments the amount of infor-
mation analysts have available and the amount of
time analysts can spend on the experimental task.
More generally, experimentalists typically hold
constant aspects of the institutional setting that
they believe are potentially important but that are
not part of the portion of the research question
examined in that particular study.
A third way to deal with extraneous variables is
to measure them (typically during debrie?ng).
These measurements can be used as covariates or
measured independent variables to account for
their e?ects. For example, HM identi?ed prior
research that indicated that analysts’ forecast
accuracy changes as they become more experi-
enced. Since a general experience e?ect was not
part of their hypotheses, but might a?ect their
dependent variable, HM measured experience by
eliciting years spent as a ?nancial analyst and used
it as a covariate in their analysis. Years of experi-
ence cannot have been in?uenced by HM’s treat-
ment e?ect, so they use it as a covariate to reduce
noise in their analyses without fear that it is actu-
ally capturing some element of the e?ect of the
independent variable on the dependent variable
(link 4). Similarly, Hirst, Koonce, and Miller
(1999) use a pretest measure of forecasted earnings
taken before the treatment was administered to
reduce noise and increase power.
Measurements of extraneous variables are also
useful for testing competing explanations for
experimental results. For example, Hopkins (1996)
tests whether subjects infer management signaling
or di?erential tax treatment from the balance
sheet classi?cation of the hybrid security. Either of
these inferences could explain an e?ect of classi?-
cation on forecast error, but neither is included in
Hopkins’ theory. Hopkins provides evidence
against these explanations by eliciting in debrie?ng
subjects’ inferences about the underlying transac-
tion and demonstrating a lack of signi?cant dif-
ference in inference between treatment conditions.
Such measures operate much like a manipulation
check, but rather than providing evidence that the
independent variable operationalizes the ante-
cedent concept the experimenter intended, they
provide evidence that the independent variable did
not operationalize antecedent concepts other than
those intended by the experimenter. The assurance
they provide is limited (in that they provide evi-
dence by ?nding an insigni?cant di?erence), but it
is assurance nonetheless.
A fourth way to deal with extraneous variables
is to manipulate them and test their e?ects. For
example, Bloom?eld, Libby, and Nelson (2000b)
present their subjects with a number of securities,
and vary between subjects the order in which
securities are presented. They test for order e?ects
and ?nd none, allowing them to discount order of
presentation as a potential explanation for their
results. Even if they did not test for such e?ects,
manipulating order in a balanced design would
reduce the risk that results are speci?c to a parti-
cular order. In general, manipulating factors
unrelated to the hypotheses can be useful, but
expensive in terms of use of subjects.
Finally, experimentalists can deal with link-5
factors by ignoring them. By ‘‘ignore’’ we really
mean ‘‘abstract from,’’ because those factors will
not be included in the experimental environment.
Ignoring some extraneous variables is necessary
because it is not practical to mimic all elements of
reality in an experiment; some abstraction is
necessary for the experiment to be conducted in a
timely manner. To the extent that subjects make
assumptions about information that is not inclu-
ded in the experimental environment, those
assumptions are randomly distributed across
treatment conditions, and do not a?ect inter-
pretation of results, as long as the treatments do
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 801
not di?erentially a?ect subjects’ assumptions
about extraneous variables.
It is important to note that these methods of
accounting for extraneous variables are e?ective
only when the experimental design manipulates the
variables of primary interest to test e?ects of
directional predictions. For example, Tuttle, Col-
ler, and Burton (1997) wish to examine how
security prices are in?uenced by the order in which
information is revealed to investors. They provide
investors with rich ?rm-speci?c information about
market conditions and corporate events, rather
than the abstract information used in many mar-
kets experiments. Because the authors cannot
know exactly what knowledge investors bring to
bear in interpreting this rich information, it could
have a number of unknown e?ects on stock price,
and might lead prices on average to be higher and
lower than they should be. However, rather than
comparing prices to a point prediction of true
value, they examine whether the order of infor-
mation release causes a di?erence in prices. This
di?erence cannot be a?ected by extraneous vari-
ables created by the rich information (although
they surely exist), because the total information is
held constant across the settings being compared.
As discussed by Bloom?eld and Libby (1996),
this type of ‘‘paired securities’’ design can generally
be used to eliminate concerns about unanticipated
e?ects of realism in experiments. Experiments that
attempt to compare behavior to point predictions
sacri?ce this powerful form of experimental con-
trol. Even apparently innocuous variables in an
experimental setting (such as the color of a com-
puter screen or the time of day at which data col-
lection occurs) could cause deviations of behavior
from a point prediction, but are unlikely to cause
those deviations to vary across levels of the
manipulated independent variables.
4.2. Increasing experimental e?ciency (without
compromising e?ectiveness)
Experimenters make many choices that a?ect
the amount of resources consumed by their
experiments. This section discusses four such
choices: whether to use professional subjects
(which are di?cult to obtain); whether to provide
those subjects with monetary incentives (which are
expensive); whether to use between-subjects designs
(which use more subjects than within-subjects
designs); and whether to place subjects in a labora-
tory market (which requires more subjects than
would a study of individual judgments). Choosing
to consume more resources does not necessarily
increase experimental e?ectiveness. Rather, it
increases e?ectiveness in some circumstances,
reduces it in others, and has a small enough e?ect
in others that it is not justi?ed from a cost/bene?t
perspective. We discuss each choice in turn.
4.2.1. Subject selection
When should experiments use professional sub-
jects? Our advice is to match subjects to the goals
of the experiment, but to avoid using more
sophisticated subjects than is necessary to achieve
those goals.
Experiments that examine the e?ects of some
attribute subjects have developed before entering
the experiment must use subjects who possess the
necessary attribute. Many studies use experiments
to ‘‘peer into the minds’’ of speci?c groups of
experienced professionals to determine what they
have learned about relevant concepts and events
and how that learning a?ects decisions. Hopkins
(1996) examines how knowledge of the di?erential
e?ects of debt and equity o?erings determines how
classi?cation of debt-equity hybrids a?ects ana-
lysts’ judgments. Libby and Kinney (2000) seek to
explore how auditors’ beliefs about managers and
their own incentives determine the e?ect of old
and new regulations. In both of these cases, the
experimenter is interested in how subjects’ use of
some type of knowledge learned in the real world
causes treatment e?ects, so they must use subjects
with the requisite knowledge. Thus, these studies
use professionals as subjects.
In some cases, the experimenter can train stu-
dent subjects to possess an attribute (e.g. knowl-
edge) that the experimenter is interested in
examining. This approach is cost-e?ective given
students’ greater availability than professional
subjects, and is well suited for testing the e?ects of
speci?c features of the learning environment and
elements of the resulting knowledge (cf. Bonner &
Walker, 1994). However, this must be done with
802 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
care since recently acquired knowledge is unlikely
to be of the same depth and breadth, or integrated
as well with subjects’ pre-existing knowledge.
Student subjects are also entirely appropriate in
studies that focus on general cognitive abilities, or
responses to economic institutions or ?nancial-
market forces that are expected to be learned
within the experimental setting. Maines and Hand
(1996) provide an example of the former; they
examine the e?ects of general tendencies in the
processing of time-series information on forecast-
ing behavior. Any of the reporting studies by King
and Wallin (1991a, 1991b, 1995) provide examples
of the latter; those studies examine how subjects
respond to the strategic forces in disclosure games.
Other experiments focus on the judgments of
general or novice investors, and so require subjects
who possess only basic familiarity with accounting
and investing. Student populations that have such
basic familiarity are appropriate here as well.
MBA students and executive-program partici-
pants are particularly useful, as they often have
some accounting knowledge and investing experi-
ence. Studies of this type employing student sub-
jects include Bloom?eld, Libby, and Nelson (1999,
2000a), Hirst, Koonce, and Miller (1999), Hirst,
Koonce, and Simko (1995), Kennedy, Mitchell,
and Sefcik (1998), Lipe (1998), Bloom?eld and
Libby (1996), Maines and McDaniel (2000), and
Nelson, Krische, and Bloom?eld (2000).
In general, experimenters should avoid using
professional subjects unless it is necessary to
achieve their research goals. In addition to
increasing the experimenters’ own time and
expense, inappropriate use of professional subjects
has negative externalities — they may make it
more di?cult for other experimenters to gain
access to this very valuable resource.
4.2.2. Monetary incentives
When is it appropriate to provide explicit
monetary incentives in ?nancial accounting
experiments? As in subject selection, the answer
should be driven primarily by the goals of the
experiment.
First, as noted above, experiments that focus on
incentives rely on participating professionals to
bring their knowledge of and behavior learned in
response to real world incentives to the experi-
ment. Such experiments attempt to examine how
professional practice has provided professionals
with incentives that a?ect their behavior in parti-
cular ways. For example, HM studied the e?ect of
analysts’ incentives on their forecast accuracy,
with those incentives determined by the analysts’
perceptions and understanding of the relationship
that the analyst has with the ?rm whose perfor-
mance is being forecasted. Providing performance-
contingent incentives in this type of experiment
would distort or interfere with the e?ects of the
real world incentives, and is therefore inappropri-
ate. While the e?ects of professionals’ perceived
incentives might be diminished in the experimental
setting, their direction should not be altered, so
their directional e?ects should not be altered.
Experiments testing responses to economic the-
ory (such as those described in Section 3.4) need to
provide performance-contingent incentives in
order to induce subjects to possess the incentives
assumed by the economic model (Smith, 1976).
Without such incentives, a fundamental causal
element of the model may not be present, and
there is no reason to expect theoretical predictions
to hold. Performance-contingent incentives are
almost always appropriate in laboratory market
experiments that examine how individual biases
can be mitigated by competitive forces. For
example, the ‘‘smart trader’’ hypothesis relies on
an assumption that more accurate traders trade
more actively because they will earn money by
doing so.
A researcher who has concluded that perfor-
mance-contingent incentives are appropriate must
then decide on how sensitive payments should be
to variations in performance. Our casual observa-
tions suggest that most experimental tests of eco-
nomic theories pay subjects an average of $8 to
$20 per hour, with payments ranging from $5/
hour to $100/hour (or sometimes more). These
numbers re?ect tradition and resource limitations
more than any reasoned theory. These incentives
are obviously much less than most agents in
?nancial accounting target environments would
expect. However, we doubt behaviors would be
substantially di?erent with larger incentives. Past
experiments show little evidence that biases are
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 803
eliminated by incentive compensation, just as
?nancial rewards have not allowed athletes to run
a 3-min mile. Limitations on abilities, rather than
a lack of reward, drive these results. More gen-
erally, larger monetary incentives might reduce the
size of biases, but are unlikely to alter their basic
nature and direction.
14
Thus, larger incentives
would probably not change the inferences drawn
from directional hypothesis tests.
4.2.3. Within- vs. between-subjects designs
When should experiments use between-subjects
designs, rather than within-subjects designs?
Within-subjects (or ‘‘repeated-measures’’ designs)
where subjects provide more than one observa-
tion, generally enhance statistical power by allow-
ing control of between-subjects di?erences (i.e.
there is a ‘‘subject factor’’ in the analyses that
accounts for subject-speci?c noise). This approach
has the added advantage of using fewer subjects.
However, repeated measures designs can also
a?ect results by making treatment e?ects more
salient, which may signal to subjects that the
experimenter wants them to respond to the
manipulation (the familiar ‘‘demand e?ect’’ con-
cern). Also, repeated measures are vulnerable to
carryover e?ects from the elicitation of one
measure to the next. Therefore, these designs are
most e?ective when increased salience of manipu-
lated variables is desirable from the standpoint of
the experiment’s goals and/or when any carryover
e?ect is desired or can be minimized via manip-
ulation of the order in which measures occur.
As noted earlier, Hirst, Koonce, and Miller
(1999) use one type of repeated measures design,
the pretest–posttest design. Their subjects ?rst
forecast earnings and assess con?dence in that
forecast, given only company background infor-
mation and the prior years’ ?nancial data. The
subjects were then provided with the experimental
treatments (management forecast and information
about management forecast accuracy), and again
forecasted earnings and assessed con?dence. This
pretest–posttest design allows Hirst, Koonce, and
Miller to increase power by using the pretest as a
covariate in their analyses or by analyzing the
change in forecasts caused by the treatment. Since
they want their subjects to attend carefully to the
information contained in their treatments, and
their analyses are based on comparisons between
treatment conditions (which hold treatment sal-
ience constant), they are not concerned about
drawing extra attention to the treatment.
Within-subject treatments are particularly com-
mon in laboratory markets and games. For exam-
ple, Bloom?eld and Wilks (2000) create a setting
in which each group of subjects participates in
eight di?erent treatments (every cell of a 2Â2Â2
design) over the course of two trading sessions.
Such repetition reduces noise in the data, which is
often high in early repetitions because the envir-
onment is so complex. Repetition also uses sub-
jects’ time very e?ciently, which reduces the
already high cash cost of running such experi-
ments. However, repetition also requires Bloom-
?eld and Wilks to balance the orders of the
treatments, to ensure that treatment e?ects are not
confounded with order e?ects.
Tan, Libby, and Hunton (2000) also suggest the
use of a combination of between- and within-sub-
jects designs as a method of partitioning the e?ects
of unintentional biases from intentional judgment
policies. Following Kahneman and Tversky (1996),
they suggest that the between-subjects design pro-
vides a clean test of the subject’s natural reasoning
process, while the within-subjects design draws
attention to the independent variable of interest
and thus gives the subject a chance to detect and
correct errors and inconsistencies in their respon-
ses. Comparison of results under the two approa-
ches highlights how subjects address any con?ict
between what they do and what they know. Evi-
dence of di?erences using between-subjects treat-
ments, but not using within-subjects treatments,
suggests that the between-subjects di?erences are
unintentional. On the other hand, evidence of dif-
ferences using within-subjects treatments, but not
using between-subjects treatments, suggests that
subjects are aware of the implications of the di?er-
ences in the stimuli, but that, in their natural rea-
soning process, the stimuli were ignored or subjects’
related knowledge was not accessed and used. This
method should be useful in other studies that
14
See Kachelmeier and Shehata (1992) for a study on how
very large incentives in?uence responses to risk.
804 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
attempt to distinguish between the e?ects of judg-
ment heuristics versus knowledge.
The choice of between- versus within-subjects
designs a?ects analyses, since within-subjects mani-
pulation (i.e. repeated measures) yields observa-
tions that are not independent. For example,
Bloom?eld and Wilks (2000) observe well over a
thousand closing prices in their study. However,
since there are only eight distinct groups of sub-
jects, their repeated-measures analyses e?ectively
compute the average treatment e?ect (a signed
di?erence) for each group, and then perform a t-
test on the eight di?erences. This design is more
powerful than it might seem, because each of the
eight numbers is the average of a large number of
observations, and therefore has very little noise.
As discussed in Section 4.1.2, most laboratory
markets conduct supplementary analyses that
break a theory into parts using measured inter-
vening variables. For example, Bloom?eld and
Wilks (2000) examine how disclosure quality
a?ects market price through its e?ects on market
liquidity, which is measured. It is more di?cult to
apply pure repeated-measures statistical techni-
ques to such analyses. However, experimenters
should be aware that inappropriate statistical
methods overstate sample size (and therefore
understate P-values), and should be interpreted
with caution. More importantly, researchers must
make every attempt to use repeated-measures
analyses for their main hypothesis tests.
4.2.4. Using laboratory ?nancial markets
When is it necessary to place individuals in
laboratory markets? Critics of individual decision-
making experiments often suggest that biases and
suboptimal behavior would be driven away by
market forces. In our view, this criticism alone
rarely justi?es the cost of a market experiment. As
discussed in Section 3.3, few experiments have
shown that market forces eliminate biases; even
when they mitigate a bias, they tend to a?ect its
magnitude, but not its sign (e.g. market prices are
still too high, but not by as much). Because only
directional e?ects are easily generalized from
experiments to target settings, using a ?nancial
market does not substantially alter an experiment’s
e?ectiveness. On the other hand, the market does
dramatically increase the cost of the experiment. A
group of 50 subjects will yield 50 judgments that
are statistically independent of one another.
Forming those subjects into 10 separate ?ve-trader
markets yields only 10 judgments (market prices)
that are statistically independent of one another.
As a result, the use of a market either reduces
power or increases the costs of the study.
Laboratory markets are most appropriate when
examining particular forces within the market that
might a?ect bias mitigation (such as the smart-
trader hypothesis), or when examining dependent
variables that are simply unde?ned at the indivi-
dual level (such as trading volume or market
liquidity). Even in these cases, however, one can
sometimes address experimental goals in indivi-
dual decision-making tasks. For example, Nelson,
Krische, and Bloom?eld (2000) use an individual
decision-making task to examine how con?dence
in one’s own ability to ‘‘pick winners,’’ relative to
con?dence in large-sample anomalies (such as
post-earnings-announcement drift) can a?ect tra-
ders’ willingness to rely on a disciplined trading
strategy. They do not have traders transact with
each other, but rather examine the number of
shares that each trader o?ers to transact. This
approach allows researchers to examine the rela-
tion between judgment and trading behavior, but
does not allow researchers to capture strategic
interactions between market participants.
Given that one chooses to conduct a ?nancial
market, there are many decisions that can reduce
the cost of each observation. One method used
almost universally in laboratory ?nancial markets
and laboratory games (as in Sections 3.3. and 3.4)
is to have each group provide many observations
(a repeated-measures design). As noted in Section
4.2.3, repeated-measures designs o?er many
advantages, but a?ect the statistical analyses that
must be performed.
5. Conclusions
This paper discusses how recent experimental
research in ?nancial accounting has responded to
past criticisms, discusses how the recent literature
has developed and how it can be extended, and
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 805
provides our perspective on how future experiments
can be designed to maximize both e?ectiveness and
e?ciency. Our comments are driven by our belief
that experiments — whether based on psychologi-
cal or economic theory — must exploit the primary
advantages of the experimental method. Those
advantages include the ability to construct an envir-
onment in which a causal theory of phenomena can
be tested with a maximum of internal validity.
Experimental research is still only a small part
of empirical ?nancial accounting research. This
raises the question of how ?nancial accounting
experiments should relate to the more dominant
archival-empirical work. One of the most notable
characteristics of the better studies that we have
reviewed is their close tie to formal or informal
empirical observation. These observations often
provide part of the motivation for the experi-
mental studies, and are relied upon to demonstrate
the external validity of experimental results.
Future research can relate even more closely to
this literature by testing alternative potential
explanations for archival ?ndings when there are
natural confounds, measurement problems, or
where causality is unclear, by explaining contra-
dictory ?ndings, and by examining conditions
where large samples are unavailable. Experiments
can also point to directions for future archival-
empirical studies by specifying either the limits to
the generality of existing ?ndings or other ?ndings
that should exist in further archival studies.
Acknowledgements
Prepared for the Accounting, Organizations and
Society 25th Anniversary Conference, Oxford Uni-
versity, July 2000. We thank Ron King, Lisa
Koonce, Laureen Maines, Greg Waymire, the par-
ticipants at the AOS 25th Anniversary Conference
and the Emory Behavioral Financial Accounting
Research Conference for their comments and sug-
gestions, and Bernadine Low for her assistance.
References
Abarbanell, J. S., & Bernard, V. L. (1992). Tests of analysts’
overreaction/underreaction to earnings information as an
explanation for anomalous stock price behavior. Journal of
Finance, 47(3), 1181–1208.
Abdel-Khalik, A. R., & El-Sheshi, K. (1980). Information
choice and cue utilization in an experiment on default pre-
diction. Journal of Accounting Research, 18(2), 325–342.
Ackert, L. F., Church, B. K., & Shehata, M. (1997). An
experimental examination of the e?ects of forecast bias on
individuals’ use of forecasted information. Journal of
Accounting Research, 35(1), 25–42.
Anderson, M. J. (1988). A comparative analysis of information
and evaluation behavior of professional and non-profes-
sional ?nancial analysts. Accounting, Organizations and
Society, 13(5), 431–446.
Anderson, M. J., & Sunder, S. (1995). Professional traders as
intuitive Bayesians. Organizational Behavior and Human
Decision Processes, 64(2), 185–202.
Andrade, G. (1999). Do appearances matter? The impact of EPS
accretion and dilution on stock prices. Working Paper, Har-
vard Business School.
Ball, R. (1992). The earnings–price anomaly. Journal of
Accounting and Economics, 15(2), 319–345.
Ball, R., & Bartov, E. (1996). How naive is the stock market’s
use of earnings information? Journal of Accounting and Eco-
nomics, 21(3), 319–337.
Bamber, L. (1987). Unexpected earnings, ?rm size, and trading
volume around quarterly earnings announcements. The
Accounting Review, 62(3), 510–532.
Bamber, L., Barron, O., & Stober, T. (1997). Trading volume
and di?erent aspects of disagreement coincident with earn-
ings announcements. The Accounting Review, 72(4), 575–597.
Barberis, N., Shleifer, A., & Vishny, R. (1998). A model of
investor sentiment. Journal of Financial Economics, 49(3),
307–343.
Bazerman, M. H. (1998). Judgment in managerial decision
making. New York: John Wiley.
Bazerman, M. H., Morgan, K. P., & Loewenstein, G. F. (1997).
The impossibility of auditor independence. Sloan Manage-
ment Review, Summer, 89–94.
Beeler, J., & Hunton, J. E. (2001). Contingent economic rents:
insidious threats to auditor independence. Working Paper,
South Florida University.
Beresford, D. R. (1994). A request for more research to support
?nancial accounting standard setting AAA — accounting,
behavior and organization section. Behavioral Research in
Accounting, 6(Supplement), 190–203.
Berg, J., Dickhaut, J., & McCabe, K. (1995). The individual
versus the aggregate. In R. H. Ashton, & A. H. Ashton
(Eds.), Judgment and decision-making research in accounting
and auditing (pp. 102–134). New York: Cambridge.
Bernard, V. L. (1993). Stock price reactions to earnings
announcements: a summary of recent anomalous evidence
and possible explanations. In R. Thaler (ed.), Advances in
behavioral ?nance (pp. 303–340).
Bernard, V. L., & Skinner, D. J. (1996). What motivates man-
agers’ choice of discretionary accruals? Journal of Accounting
and Economics, 22(1–3), 313–325.
806 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
Bernard, V. L., & Thomas, J. (1989). Post-earnings announce-
ment drift: delayed price response or risk premium? Journal
of Accounting Research, 27(1), 1–48.
Bernard, V. L., & Thomas, J. (1990). Evidence that stock prices
do not fully re?ect the implications of current earnings for
future earnings. Journal of Accounting and Economics, 13(4),
305–340.
Bernheim, D. (1984). Rationalizable strategic behavior. Econo-
metrica, 52(5), 1007–1028.
Bhushan, R. (1994). An informational e?ciency perspective on
the post-earnings-announcement drift. Journal of Accounting
and Economics, 18(1), 45–65.
Biggs, S. F. (1984). Financial analysts’ information search in
the assessment of corporate earning power. Accounting,
Organizations and Society, 9(3), 313–323.
Biggs, S. F., Bedard, J. C., Gaber, B. G., & Linsmeier, T. J.
(1985). The e?ects of task size and similarity on the decision
behavior of bank loan o?cers. Management Science, 31(8),
970–987.
Bloom?eld, R. (1996a). The interdependence of reporting dis-
cretion and informational e?ciency in laboratory markets.
The Accounting Review, 71(4), 493–511.
Bloom?eld, R. (1996b). Quotes, prices and estimates of value in
a laboratory market. Journal of Finance, 51(5), 1791–1808.
Bloom?eld, R., & Hales, J. (2000). Developing reputations for
reliable reporting: the role of expectations. Working Paper,
Cornell University.
Bloom?eld, R., & Libby, R. (1996). Market reactions to di?er-
entially available information in the laboratory. Journal of
Accounting Research, 34(2), 183–207.
Bloom?eld, R., Libby, R., & Nelson, M. W. (1996). Commu-
nication of con?dence as a determinant of group judgment
accuracy. Organizational Behavior and Human Decision Pro-
cesses, 68(3), 287–300.
Bloom?eld, R., Libby, R., & Nelson, M. W. (1999). Con?dence
and the welfare of less-informed investors. Accounting,
Organizations and Society, 24(8), 623–647.
Bloom?eld, R., Libby, R., & Nelson, M. W. (2000a). Over-
reliance on previous years’ earnings. Working Paper, Cornell
University.
Bloom?eld, R., Libby, R., & Nelson, M. W. (2000b). Under-
reactions, over-reactions, and moderated con?dence. Journal
of Financial Markets, 3, 113–137.
Bloom?eld, R., & Wilks, T. J. (2000). Disclosure e?ects in the
laboratory: liquidity, depth and the cost of capital. The
Accounting Review, 75(1), 13–42.
Bonner, S. E. (1990). Experience e?ects in auditing: the role of
task-speci?c knowledge. The Accounting Review, 65(1), 72–92.
Bonner, S. E., & Walker, P. L. (1994). The e?ects of instruction
and experience on the acquisition of auditing knowledge. The
Accounting Review, 69(1), 157–178.
Bouwman, M. J. (1984). Expert vs. novice decision making in
accounting: a summary. Accounting, Organizations and
Society, 9(3), 325–327.
Brown, L., & Han, J. (2000). Do stock prices re?ect the impli-
cations of current earnings for future earnings for AR1
?rms? Journal of Accounting Research (in preparation).
Calegari, M., & Fargher, N. L. (1997). Evidence that prices do
not fully re?ect the implications of current earnings for
future earnings: an experimental markets approach. Con-
temporary Accounting Research, 14(3), 397–433.
Camerer, C. (1987). Do biases in probability judgment matter
in markets, experimental evidence. American Economic
Review, 77(5), 981–997.
Camerer, C. (1992). The rationality of prices and volume in
experimental markets. Organizational Behavior and Human
Decision Processes, 51(2), 237–272.
Camerer, C. (1997). Rules for experimenting in psychology and
economics, and why they di?er. In Van Dam et al., Under-
standing strategic interaction: essays in honor of R Selten.
Berlin, New York: Springer.
Carroll, J. S. & Johnson, E. (1990). Decision research: a ?eld
guide. Sage
Chan, L., Jegadeesh, K. C., & Lakonishok, J. (1996). Momen-
tum strategies. Journal of Finance, 51(5), 1681–1713.
Clement, M. (1999). Analyst forecast accuracy: do ability,
resources, and portfolio complexity matter? Journal of
Accounting and Economics, 27(3), 285–303.
Cloyd, C. B., Pratt, J., & Stock, T. (1996). The use of ?nancial
accounting choice to support aggressive tax positions: public
and private ?rms. Journal of Accounting Research, 34(1), 23–43.
Coller, M. (1996). Information, noise, and asset prices: an
experimental study. Review of Accounting Studies, 1, 35–50.
Cuccia, A. D., Hackenbrack, K., & Nelson, M. W. (1995). The
ability of professional standards to mitigate aggressive
reporting. The Accounting Review, 70(2), 227–248.
Daniel, K., Hirshleifer, D., & Subrahmanyam, A. (1998).
Investor psychology and security market under- and over-
reactions. Journal of Finance, 53(6), 1839–1885.
DeBondt, W., & Thaler, R. (1985). Does the stock market
overreact. Journal of Finance, 40(3), 793–818.
DeBondt, W., & Thaler, R. (1987). Further evidence of investor
overreaction and stock market seasonality. Journal of
Finance, 42(3), 557–581.
DeBondt, W., & Thaler, R. (1990). Do security analysts over-
react? American Economic Review, 80(2), 52–57.
Dechow, P., & Sloan, R. (1997). Returns to contrarian invest-
ment strategies: tests of naive expectation hypotheses. Jour-
nal of Financial Economics, 43(1), 3–27.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1995).
Detecting earnings management. The Accounting Review,
70(2), 193–225.
De Long, J. B., Shleifer, A., Summers, L. H., & Waldmann,
R. J. (1991). The survival of noise traders in ?nancial mar-
kets. The Journal of Business, 64(1), 1–19.
Dietrich, J. R., Kachelmeier, S. J., Kleinmuntz, D. N., & Lins-
meier, T. J. (2000). Market e?ciency, bounded rationality,
and supplemental business reporting disclosures. Journal of
Accounting Research (in preparation).
Dopuch, N., & King, R. R. (1996). The e?ects of lowballing on
audit quality: an experimental markets study. Journal of
Accounting, Auditing and Finance, 11, 45–69.
Dyckman, T. R. (1964). On the investment decision. The
Accounting Review, 39(2), 285–295.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 807
Einhorn, H. J. (1980). Learning From experience and sub-
optimal rules in decision making. In T. Wallsten (Ed.), Cog-
nitive processes in choice and decision behavior (pp. 1–20).
Hillsdale, NJ: Erlbaum.
Einhorn, H. J., & Hogarth, R. M. (1981). Behavioral decision
theory: processes of judgment and choice. Annual Review of
Psychology, 32, 53–88.
Einhorn, H. J., & Hogarth, R. M. (1986). Decision making
under ambiguity. Journal of Business, 59(4), S225–S250.
Fama, E. F. (1970). E?cient capital markets: a review of theory
and empirical work. Journal of Finance, 25(2), 383–417.
Fama, E. F. (1998). Market e?ciency, long-termreturns, andbeha-
vioral ?nance. Journal of Financial Economics, 49(3), 283–306.
Fischer, P., &Verrecchia, R. (1999). Public information and heur-
istic trade. Journal of Accounting and Economics, 27(1), 89–124.
Forsythe, R., & Lundholm, R. (1990). Information aggregation
in an experimental market. Econometrica, 58(2), 309–348.
Forsythe, R., Lundholm, R., & Reitz, T. (1999). Cheap talk,
fraud and adverse selection in ?nancial markets: some experi-
mental evidence. Review of Financial Studies, 12, 518–581.
Foster, G., Olsen, C., & Shevlin, T. (1984). Earnings releases,
anomalies, and the behavior of security returns. The
Accounting Review, 59(4), 574–603.
Frankel, R., & Lee, C. (1998). Accounting valuation, market
expectation, and cross-sectional stock returns. Journal of
Accounting and Economics, 25(3), 283–319.
Ganguly, A. R., Kagel, J. H., & Moser, D. V. (1994). The
e?ects of biases in probability judgments on market prices.
Accounting, Organizations and Society, 19(8), 675–700.
Gervais, S., & Odean, T. (1997). Learning to be overcon?dent.
Unpublished Working Paper, University of Pennsylvania.
Ghosh, D., & Whitecotton, S. M. (1997). Some determinants of
analysts’ forecast accuracy. Behavioral Research in Account-
ing, 9(Supplement), 50–68.
Gibbins, M., Salterio, S., & Webb, A. (2000). Evidence about
auditor–client management negotiation concerning client’s
?nancial reporting. Journal of Accounting Research (in
preparation).
Gibbins, M., & Swieringa, R. J. (1995). Twenty years of judgment
research in accounting and auditing. In R. H. Ashton, & A. H.
Ashton (Eds.), Judgment and decision-making research in
accounting and auditing (pp. 231–249). New York: Cambridge.
Gillette, A. B., Stevens, D. E., Watts, S. G., & Williams, A. W.
(1999). Price and volume reactions to public information relea-
ses: an experimental approach incorporating traders’ subjective
beliefs. Contemporary Accounting Research, 16(3), 437–479.
Gode, D., & Sunder, S. (1993). Allocative e?ciency of markets
with zero-intelligence traders: market as a partial substitute
for individual rationality. The Journal of Political Economy,
101(1), 119–140(February).
Gode, D., &Sunder, S. (1997). What makes markets allocationally
e?cient? The Quarterly Journal of Economics, 112(2), 603–630.
Gonedes, N., & Dopuch, N. (1974). Capital market equilibrium,
information production, and selecting accounting techniques:
theoretical framework and review of empirical work. Journal
of Accounting Research, 12(Supplement), 48–129.
Gri?n, D., & Tversky, A. (1992). The weighing of evidence and
the determinants of con?dence. Cognitive Psychology, 24(3),
411–435.
Hackenbrack, K., & Nelson, M. W. (1996). Auditors’ incen-
tives and their application of ?nancial accounting standards.
The Accounting Review, 71(1), 43–59.
Hand, J. (1990). A test of the extended functional ?xation
hypothesis. The Accounting Review, 65(4), 740–763.
Haynes, C. M., & Kachelmeier, S. J. (1998). The e?ects of
accounting contexts on accounting decisions: a synthesis of
cognitive and economic perspectives in accounting experi-
mentation. Journal of Accounting Literature, 17, 97–136.
Healy, P. M., & Wahlen, J. M. (1999). A review of the earnings
management literature and its implications for standard set-
ting. Accounting Horizons, 13(4), 365–383.
Herrnstein, R., &Vaughn, W. (1980). Melioration and behavioral
allocation. In J. Staddon (Ed.), Limits to action: the allocation
of individual behavior (pp. 143–176). NewYork, NY: Academic
Press.
Hirst, D. E. (1994). Auditor sensitivity to earnings manage-
ment. Contemporary Accounting Research, 11(1), 405–422.
Hirst, D. E., & Hopkins, P. E. (1998). Comprehensive income
reporting and analysts’ valuation judgments. Journal of
Accounting Research, 36(Supplement), 47–75.
Hirst, D. E., Koonce, L., & Miller, J. (1999). The joint e?ect of
management’s prior forecast accuracy and the form of its
?nancial forecasts on investor judgment. Journal of Account-
ing Research, 37(Supplement), 101–124.
Hirst, D. E., Koonce, L., & Simko, P. J. (1995). Investor reac-
tions to ?nancial analysts’ research reports. Journal of
Accounting Research, 33(2), 335–351.
Hodder, L., Koonce, L., & McAnally, M. L. (2001). SEC mar-
ket risk disclosures: implications for judgment and decision
making. Accounting Horizons (in preparation).
Hogarth, R. M. (1993). Accounting for decisions and decisions
for accounting. Accounting, Organizations and Society, 18(5),
407–424.
Hogarth, R. M., & Einhorn, H. J. (1992). Order a?ects in belief
updating: the belief-adjustment model. Cognitive Psychology,
24(1), 1–55.
Hopkins, P. E. (1996). The e?ect of ?nancial statement classi-
?cation of hybrid ?nancial instruments on ?nancial analysts’
stock price judgments. Journal of Accounting Research,
34(Supplement), 33–50.
Hopkins, P. E., Houston, R. W., & Peters, M. F. (2000). Pur-
chase, pooling, and equity analysts’ valuation judgments.
The Accounting Review, 75(3), 257–281.
Hunton, J. E., & McEwen, R. A. (1997). An assessment of the
relation between analysts’ earnings forecast accuracy, moti-
vational incentives and cognitive information search strat-
egy. The Accounting Review, 72(4), 497–515.
Jacob, J., Lys, T., & Neale, M. (1999). Expertise in forecasting
performance of security analysts. Journal of Accounting and
Economics, 28(1), 51–82.
Jensen, R. (1966). An experimental design for study of e?ects
of accounting variations in decision making. Journal of
Accounting Research, 4(2), 224–238.
Jung, W., & Kwon, Y. (1988). Disclosure when the market is
808 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
unsure of information endowment of managers. Journal of
Accounting Research, 26(1), 146–153.
Kachelmeier, S. (1996a). Do cosmetic reporting variations
a?ect market behavior? A laboratory study of the accounting
emphasis on unavoidable costs. Review of Accounting Stud-
ies, 1, 115–140.
Kachelmeier, S. (1996b). Discussion of ‘‘tax advice and report-
ing under uncertainty: theory and experimental evidence.’’.
Contemporary Accounting Research, 13, 81–90.
Kachelmeier, S., & Shehata, M. (1992). Examining risk pre-
ferences under high monetary incentives: experimental evi-
dence from the People’s Republic of China. The American
Economic Review, 82, 1120–1141.
Kahneman, D., & Tversky, A. (1979). Prospect theory: an
analysis of decision under risk. Econometrica, 47(2), 263–291.
Kahneman, D., & Tversky, A. (1996). On the reality of cogni-
tive illusions. Psychological Review, 103(3), 582–588.
Kennedy, J., Kleinmuntz, D. N., & Peecher, M. E. (1997).
Determinants of the justi?ability of performance in ill-struc-
tured audit tasks. Journal of Accounting Research, 35(Sup-
plement), 105–123.
Kennedy, J., Mitchell, T., & Sefcik, S. E. (1998). Disclosure of
contingent environmental liabilities: some unintended con-
sequences? Journal of Accounting Research, 36(Autumn),
257–277.
Kim, O., & Verrecchia, R. (1994). Market liquidity and volume
around earnings announcements. Journal of Accounting and
Economics, 17(1), 41–67.
King, R. R. (1996). Reputation formation for reliable report-
ing: an experimental investigation. The Accounting Review,
71(3), 375–396.
King, R. R. (2001). An experimental investigation of self-serving
biases in an auditing trust game: the e?ect of group a?liation.
Working Paper, Washington University.
King, R. R., & Wallin, D. E. (1991a). Market-induced infor-
mation disclosures: an experimental markets investigation.
Contemporary Accounting Research, 8(1), 170–197.
King, R. R., & Wallin, D. E. (1991b). Voluntary disclosures
when seller’s level of information is unknown. Journal of
Accounting Research, 29(1), 96–108.
King, R. R., & Wallin, D. E. (1995). Experimental tests of dis-
closure with an opponent. Journal of Accounting and Eco-
nomics, 19(1), 139–168.
Kinney, W. R. (1986). Empirical accounting research design for
PhD students. The Accounting Review, 61(2), 338–350.
Kinney, W. R., & Martin, R. D. (1994). Does auditing reduce
bias in ?nancial reporting? A review of audit-related adjust-
ment studies. Auditing: A Journal of Practice and Theory,
13(1), 149–156.
Kinney, W. R., & Nelson, M. W. (1996). Outcome information
and the ‘expectations gap’: the case of loss contingencies.
Journal of Accounting Research, 34(2), 281–299.
Kothari, S. P. (2000). Capital markets research in accounting.
Journal of Accounting and Economics (in preparation).
Kunda, Z. (1990). The case for motivated reasoning. Psycholo-
gical Bulletin, 108(3), 480–498.
Kyle, A. S., & Wang, F. A. (1997). Speculation duopoly with
agreement to disagree. Can overcon?dence survive the mar-
ket test?. Journal of Finance, 52(5), 2073–2090.
LaPorta, R. (1996). Expectations and the cross-section of stock
returns. Journal of Finance, 51(5), 1715–1742.
Lee, C., Myers, J., & Swaminathan, B. (1999). What is the intrin-
sic value of the dow? Journal of Finance, 54(5), 1693–1741.
Lee, C., & Swaminathan, B. (2000). Price momentum and
trading volume. Journal of Finance (in preparation).
Libby, R. (1981). Accounting and human information processing:
theory and applications. Englewood Cli?s: Prentice-Hall.
Libby, R., & Kinney, W. R. (2000). Earnings management,
audit di?erences, and analysts’ forecasts. The Accounting
Review (in preparation).
Libby, R., & Luft, J. (1993). Determinants of judgment per-
formance in accounting settings: ability, knowledge, motiva-
tion, and environment. Accounting, Organizations and
Society, 18(5), 425–450.
Libby, R., & Tan, H-T. (1999). Analysts’ reactions to warnings
of negative earnings surprises. Journal of Accounting
Research, 37(2), 415–436.
Lipe, M. G. (1991). Counterfactual reasoning as a framework for
attribution theories. Psychological Bulletin, 109(3), 456–471.
Lipe, M. G. (1998). Individual investors’ risk judgments and
investment decisions: the impact of accounting and market
data. Accounting, Organizations and Society, 23(7), 625–640.
Lundholm, R. J. (1991). What a?ects the e?ciency of a mar-
ket? Some answers from the laboratory. The Accounting
Review, 66(3), 486–515.
Maines, L. A. (1994). The role of behavioral accounting
research in ?nancial accounting standard setting. Behavioral
Research in Accounting, 6(Supplement), 204–212.
Maines, L. A. (1995). Judgment and decision-making research
in ?nancial accounting: a review and analysis.
In R. H. Ashton, & A.H Ashton (Eds.), Judgment and deci-
sion-making research in accounting and auditing (pp. 76–101).
New York: Cambridge.
Maines, L. A., & Hand, J. R. M. (1996). Individuals’ percep-
tions and misperceptions of time series properties of quar-
terly earnings. The Accounting Review, 71(3), 317–336.
Maines, L. A., Mautz, R. D., Wright, G. B., Graham, L. E., Ros-
man, A. J., &Yardley, J. A. (2000). Implications of international
diversity in joint venture ?nancial-reporting standards for ?nan-
cial analysts’ stock values. Working Paper, Indiana University.
Maines, L. A., & McDaniel, L. S. (2000). E?ects of compre-
hensive income volatility on nonprofessional investors’ judg-
ments: the role of presentation format. The Accounting
Review (in preparation).
Maines, L. A., McDaniel, L. S., & Harris, M. S. (1997). Impli-
cations of proposed segment reporting standards for ?nan-
cial analysts’ investment decisions. Journal of Accounting
Research, 35(Supplement), 1–24.
Mayhew, B. W., Schatzberg, J. W., & Sevcik, G. R. (2000). The
e?ect of accounting uncertainty and auditor reputation on
auditor independence. Working Paper, University of
Wisconsin — Madison.
Maynard Smith, J. (1982). Evolution and the theory of games.
Cambridge, UK: Cambridge University Press.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 809
Mear, R., & Firth, M. (1987). Cue usage and self-insight of
?nancial analysts. The Accounting Review, 62(1), 176–182.
Mikhail, M., Walther, B., & Willis, R. (1997). Do security
analysts improve their performance with experience? Journal
of Accounting Research, 35(Supplement), 131–157.
Milgrom, P., & Stokey, N. (1982). Information, trade and com-
mon knowledge. Journal of Economic Theory, 26(1), 17–27.
Moser, D. V. (1998). Using an experimental economics
approach in behavioral accounting research. Behavioral
Research in Accounting, 10(Supplement), 94–110.
Nelson, M. W., Elliott, J. A., & Tarpley, R. L. (2000). Where
do companies attempt earnings management, and when do
auditors prevent it? Working Paper, Cornell University.
Nelson, M. W., & Kinney, W. R. (1997). The e?ect of ambi-
guity on auditors’ loss contingency reporting judgments. The
Accounting Review, 72(2), 257–274.
Nelson, M. W., Krische, S. D., &Bloom?eld, R. (2000). Sticking
with the program: why investors don’t exploit anomalies shown
in large-sample studies. Working Paper, Cornell University.
O’Brien, J., & Srivastava, S. (1991). Dynamic stock markets
with multiple assets: an experimental analysis. Journal of
Finance, 46(5), 1811–1838.
Odean, T. (1998). Volume, volatility, price, and pro?t when all
traders are above average. Journal of Finance, 53(6), 1887–1934.
Ou, J., & Penman, S. (1989). Financial statement analysis and
the prediction of stock returns. Journal of Accounting and
Economics, 11(4), 295–329.
Panko?, L. D., & Virgil, R. L. (1970). Some preliminary ?nd-
ings from a laboratory experiment on the usefulness of
?nancial accounting information to security analysts. Journal
of Accounting Research, 8(Supplement), 1–48.
Paquette, L., & Kida, T. (1988). The e?ect of decision strategy
and task complexity on decision performance. Organizational
Behavior and Human Decision Processes, 41(1), 128–142.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1992). Beha-
vioral decision research: a constructive processing perspec-
tive. Annual Review of Psychology, 43, 87–131.
Pearce, D. G. (1984). Rationalizable strategic behavior and the
problem of perfection. Econometrica, 52(5), 1029–1050.
Phillips, F. (1999). Auditor attention to and judgments of
aggressive ?nancial reporting. Journal of Accounting
Research, 37(1), 167–189.
Plott, C., & Sunder, S. (1988). Rational expectations and the
aggregation of diverse information in laboratory security
markets. Econometrica, 56(5), 1085–1118.
Runkel, P., & McGrath, J. (1972). Research on human behavior:
a systematic guide to method. New York: Holt, Rinehart and
Winston.
Salterio, S., & Koonce, L. (1997). The persuasiveness of audit
evidence: the case of accounting policy decisions. Accounting,
Organizations and Society, 22(6), 573–587.
Simon, H. A. (1957). Models of man. New York: Wiley.
Sloan, R. (1996). Do stock prices fully re?ect information in
accruals and cash ?ows about future earnings. The Account-
ing Review, 71(3), 289–315.
Slovic, P., Fleissner, D., & Bauman, W. S. (1972). Analyzing
the use of information in investment decision making: a
methodological perspective. Journal of Business, 45, 283–
301.
Slovic, P., & Lichtenstein, S. C. (1968). The relative importance
of probabilities and payo?s in risk taking. Journal of
Experimental Psychology Monograph Supplement, 78.
Slovic, P., & Lichtenstein, S. C. (1971). Comparison of baye-
sian and regression approaches to the study of information
processing in judgment. Organizational Behavior and Human
Performance, 6, 649–744.
Smith, E. E., & Medin, D. L. (1981). Categories and concepts
(pp. 1–17). Harvard: Cambridge.
Smith, V. (1976). Experimental economics: induced value the-
ory. American Economic Review, 66, 274–279.
Swaminathan, B., & Lee, C. (2000). Do stock prices overreact to
earnings news? Working Paper, Cornell University.
Tan, H. T., Libby, R., & Hunton, J. (2000). Analysts’ reactions
to earnings preannouncement strategies. Working Paper,
Cornell University.
Tan, T. C., & Werlang, S. (1988). The bayesian foundations of
solution concepts of games. Journal of Economic Theory,
45(2), 379–391.
Tetlock, P. (1992). The impact of accountability on judgment and
choice: towardasocial contingencymodel. InL. Berkowitz(Ed.),
Advances in experimental social psychology 25 (pp. 331–376).
NewYork: Academic Press.
Thaler, R. H. (1999). The end of behavioral ?nance. Financial
Analysts’ Journal, 55(November/December), 12–17.
Trotman, K. T. (1996). Research methods for judgment and
decision making studies in auditing. Melbourne, Australia:
Coopers and Lybrand.
Tucker, R. R. (1997). The relationship between public and pri-
vate information: an experimental markets study. Behavioral
Research in Accounting, 9, 219–249.
Tuttle, B., Coller, M., & Burton, F. G. (1997). An examination
of market e?ciency: information order e?ects in a laboratory
market. Accounting, Organizations and Society, 22(1), 89–103.
Tversky, A., & Kahneman, D. (1974). Judgment under uncer-
tainty: heuristics and biases. Science, 185, 1124–1131.
Vincent, L. (1997). Equity valuation implications of purchase
versus pooling accounting. The Journal of Financial State-
ment Analysis, 2(4), 5–19.
Wagenhofer, A. (1990). Voluntary disclosure with a strategic
opponent. Journal of Accounting and Economics, 12(4), 341–363.
Watts, R., & Zimmerman, J. (1986). Positive accounting
research. Englewood Cli?s, NJ: Prentice Hall.
Whitecotton, S. M. (1996). The e?ects of experience and con-
?dence on decision aid reliance: a causal model. Behavioral
Research in Accounting, 8, 194–216.
Wilks, J. (2001). Predecisional distortion of evidence as a con-
sequence of real-time audit review. Working Paper, Brigham
Young University.
Wright, W. F. (1977). Financial information processing models:
an empirical study. The Accounting Review, 52(3), 676–689.
Yetton, P. W., & Bottger, P. C. (1982). Individual versus group
problem solving: an empirical test of a best-member strategy.
Organizational Behavior and Human Decision Processes, 29,
307–321.
810 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
doc_261777888.pdf
This paper uses recent experimental studies of financial accounting to illustrate our view of how such experiments
can be conducted successfully. Rather than provide an exhaustive review of the literature, we focus on how particular
examples illustrate successful use of experiments to determine how, when and (ultimately) why important features of
financial accounting settings influence behavior. We first describe how changes in views of market efficiency, reliance
on the experimentalist’s comparative advantage, new theories, and a focus on key institutional features have allowed
researchers to overcome the criticisms of earlier financial accounting experiments.
Experimental research in ?nancial accounting
Robert Libby *, Robert Bloom?eld, Mark W. Nelson
Johnson Graduate School of Management, 383 Sage Hall,
Cornell University, Ithaca NY 14853 6201, USA
Abstract
This paper uses recent experimental studies of ?nancial accounting to illustrate our view of how such experiments
can be conducted successfully. Rather than provide an exhaustive review of the literature, we focus on how particular
examples illustrate successful use of experiments to determine how, when and (ultimately) why important features of
?nancial accounting settings in?uence behavior. We ?rst describe how changes in views of market e?ciency, reliance
on the experimentalist’s comparative advantage, new theories, and a focus on key institutional features have allowed
researchers to overcome the criticisms of earlier ?nancial accounting experiments. We then describe how speci?c
streams of experimental ?nancial accounting research have addressed questions about ?nancial communication
between managers, auditors, information intermediaries, and investors, and indicate how future research can extend
those streams. We focus particularly on (1) how managers and auditors report information; (2) how users of ?nancial
information interpret those reports; (3) how individual decisions a?ect market behavior; and (4) how strategic inter-
actions between information reporters and users can a?ect market outcomes. Our examples include and integrate
experiments that fall into both the ‘‘behavioral’’ and ‘‘experimental economics’’ literatures in accounting. Finally, we
discuss how experiments can be designed to be both e?ective and e?cient. # 2002 Elsevier Science Ltd. All rights
reserved.
1. Introduction
Financial accounting research is a broad ?eld
that examines ?nancial communication between
managers, auditors, information intermediaries,
and investors, as well as the e?ects of regulatory
regimes on that process. Much of this literature
focuses on managers’ and auditors’ reporting deci-
sions and their relationships to analysts’ forecasts
and value estimates, investors’ trading decisions,
and resulting market prices. This clear focus on
judgment and decision making led to the large
number of experimental ?nancial accounting
studies published in major accounting journals in
the 1960s and 1970s.
Serious criticisms of this early research (e.g.
Gonedes &Dopuch, 1974) turned experimentalists’
focus away from ?nancial accounting issues in the
1980s and early 1990s. As discussed by Maines
(1995) and Berg, Dickhaut, and McCabe (1995),
major elements of these criticisms were: (1) the
irrelevance of individual behavior in market set-
tings, in which competitive forces will eliminate
individual ‘‘errors’’; (2) poor matching of research
methods to research questions; (3) the lack of
psychological or economic theory to predict e?ects
and specify the mechanisms through which they
occur; and (4) failure to capture relevant aspects
0361-3682/02/$ - see front matter # 2002 Elsevier Science Ltd. All rights reserved.
PI I : S0361- 3682( 01) 00011- 3
Accounting, Organizations and Society 27 (2002) 775–810
www.elsevier.com/locate/aos
* Corresponding author. Tel.: +1-607-255-3348; fax: +1-
607-254-4590.
E-mail address: [email protected] (R. Libby).
of the decisions of interest, in particular, decision
maker attributes and institutional features.
Beginning in the mid-1990s, there was a resur-
gence of experimental research addressing an even
broader spectrum of ?nancial accounting issues.
This paper presents our view of how this new lit-
erature has addressed prior criticisms, and how it
can continue to shed light on ?nancial accounting
questions. We argue that signi?cant evidence of
capital market ine?ciency has renewed interest in
how individuals make key accounting-related
decisions and how these decisions a?ect market
prices. Recent studies take advantage of the
experimentalist’s comparative advantage at disen-
tangling variables that are confounded in natural
settings and measuring intervening processes to
draw strong causal inferences. Theories combining
psychology and economics have allowed experi-
mentalists to specify more clearly the mechanisms
a?ecting individual and market behavior. Finally,
most of the new studies focus on issues of clear
relevance to ?nancial accounting, particularly the
e?ects of decision-maker knowledge and motiva-
tion, the complex information environment, reg-
ulation, and strategic interaction.
This paper is aimed primarily at those who plan
to conduct ?nancial accounting experiments, and
secondarily at other ?nancial accountants who are
interested in what can be learned from experi-
mental studies. Our primary goal is to use recent
experimental studies of ?nancial accounting to
illustrate our view of how such experiments can be
conducted successfully. The core of our view is
that successful ?nancial accounting experiments use
the comparative advantages of the experimental
approach to determine how, when and (ultimately)
why important features of ?nancial accounting set-
tings in?uence behavior. By elaborating on this
view, we hope to increase the impact of future
experiments and help the new literature avoid the
mistakes and fate of the earlier literature. We do
not provide an exhaustive review of the literature,
nor do we provide detailed critiques of particular
studies. Instead, we focus on how particular exam-
ples illustrate successful use of experiments to
address important ?nancial accounting issues. Our
examples include and integrate experiments that
fall into both the ‘‘behavioral’’ and ‘‘experimental
economics’’ literatures in accounting.
1
Although
these literatures evolved from di?erent traditions,
we see them as essentially similar — both use
experiments to shed light on ?nancial accounting
issues, and therefore, both present similar oppor-
tunities and challenges to researchers. Naturally,
our review is also deeply a?ected by our own bia-
ses and the ?nancial accounting issues that we
have been addressing in our own recent research.
In Section 2, we describe in more detail how
changes in views of market e?ciency, reliance on
the experimentalist’s comparative advantage, new
theories, and a focus on key institutional features
have allowed recent experiments in ?nancial
accounting to overcome the criticisms of the ear-
lier literature. In Section 3, we describe how spe-
ci?c streams of experimental ?nancial accounting
research have addressed questions about ?nancial
communication between managers, auditors,
information intermediaries, and investors, and
indicate how future research can extend those
streams. We focus particularly on (1) how man-
agers and auditors report information; (2) how
users of ?nancial information interpret those
reports; (3) how individual decisions a?ect market
behavior; and (4) how strategic interactions
between information reporters and users can a?ect
market outcomes. While we address studies of
auditors in their ?nancial reporting role, to limit
the scope of the review, we do not address issues
related to the demand for and conduct of auditing.
We also do not address studies of creditors’ deci-
sions, which have received little attention in recent
?nancial accounting experiments.
In Section 4, we discuss how experiments can be
designed to be both e?ective and e?cient. We use
the ‘‘predictive validity framework’’ (Libby, 1981;
Runkel & McGrath, 1972) to structure our discus-
sion of maximizing e?ectiveness through careful
hypothesis development and research design. Our
discussion of e?ciency focuses on the consumption
of scarce resources, such as subjects and compen-
sation to those subjects. We conclude in Section 5
with a brief summary of our main points.
1
See Haynes and Kachelmeier (1998) and Moser (1998) for
recent discussions of the integration of the behavioral and eco-
nomic approaches to experimentation.
776 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
2. Factors a?ecting the supply and demand for
experimental ?nancial accounting research
In this section, we examine four interdependent
factors that have mitigated concerns raised about
the earlier experimental literature and promoted
recent progress in experimental ?nancial accounting
research: changing views of market e?ciency,
recognition of the strengths and weaknesses of
experimental methods in addressing ?nancial
accounting questions, the availability of new the-
oretical bases for the research, and a more detailed
view of the institutional features of ?nancial
accounting settings. We discuss each of these factors
in turn.
2.1. Changing views of market e?ciency
Much of the ?nancial accounting research in the
1960s implicitly assumed that some investors’ fail-
ure to adjust fully for the e?ects of accounting
method choices would a?ect allocation of resour-
ces in the economy and disadvantage these less
sophisticated investors in their exchanges with
more sophisticated investors (see Maines, 1995 for
a review). A series of papers in ?nance (particularly
Fama, 1970) persuaded many accounting research-
ers that if just a small fraction of investors are
sophisticated enough to respond appropriately to
accounting information, they will compete among
themselves to set security prices equal to their
expected values. As a result, the market becomes a
‘‘fair game’’ in which even unsophisticated inves-
tors are protected by the informational e?ciency
of prices.
2
This research led Gonedes and Dopuch
(1974), among others, to argue that experimental
research on individual behavior could have only
limited importance for ?nancial accounting.
In the late 1980s and 1990s, however, numerous
studies reported market ine?ciencies.
3
One line of
research provides direct support for the assumptions
underlying early ?nancial accounting research:
accounting policies a?ect pricing, even when they
have no true economic e?ects (e.g. Andrade, 1999;
Hand, 1990; Sloan, 1996; Vincent, 1997). Another
line of research indicates more generally that fun-
damental analysis of public ?nancial statement
information can lead to higher stock returns (e.g.
Frankel & Lee, 1998; Lee, Myers, & Swami-
nathan, 1999; Ou & Penman, 1989). A third line of
research suggests that even sell-side analysts —
generally recognized as among the most sophisti-
cated users of ?nancial statements — are pre-
dictably biased (DeBondt & Thaler, 1990; Dechow
& Sloan, 1997; La Porta, 1996).
The best-known lines of e?ciency research focus
on momentum in earnings and prices. A volumi-
nous literature on post-earnings-announcement
drift shows that markets underreact to large earn-
ings surprises (Ball & Bartov, 1996; Bernard &
Thomas, 1989, 1990; Bhushan, 1994; Brown &Han,
2000; Foster, Olsen, & Shevlin, 1984). Another lit-
erature, primarily published in ?nance journals,
shows that after adjusting for risk, stock returns
are positively autocorrelated over periods of sev-
eral months (e.g. Chan, Jegadeesh, & Lakonishok,
1996), but negatively autocorrelated over periods
of several years (DeBondt & Thaler, 1985, 1987).
The literature on market ine?ciency is con-
troversial, and many of the papers alleging ine?-
ciency have been criticized on methodological
grounds (Ball, 1992; Fama, 1998; Kothari, 2000).
Nevertheless, many researchers now doubt whe-
ther markets satisfy the requirements of the semi-
strong form of the e?cient markets hypothesis
(that markets respond e?ciently to all publicly
available information), or even the weak form
(that markets respond e?ciently to information
contained in past market prices). Even some of the
most skeptical seem to be convinced that post-
earnings-announcement drift is not simply an
artifact of research design (Ball, 1992). Recent
research on e?ciency has also led theorists to
examine how the assumptions underlying the e?-
cient markets hypothesis might be relaxed to
account for archival results. (We discuss these
models more in Section 2.3). As a result, experi-
mental researchers can more easily argue that
individual behavior can be an important element
in determining market behavior, even in the pre-
sence of competitive forces.
2
Watts and Zimmerman (1986) also provided particularly
in?uential arguments.
3
See Fama (1998), Kothari (2000), and Thaler (1999) for
more comprehensive reviews of this literature.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 777
2.2. The comparative advantage of ?nancial
accounting experiments
Earlier ?nancial accounting experiments typically
sought to determine whether speci?c accounting
policy choices would a?ect investors’ decisions.
Answers to such research questions call for esti-
mates of the magnitude of an e?ect (or error) by
representative actors in representative circum-
stances, a task ill suited to experiments. Such a
task is more appropriate for archival-empirical
research, which examines large representative
samples of naturally occurring phenomena.
More recent experimental research strives to use
experimentalists’ comparative advantage to focus
on disentangling the e?ects of variables that are
confounded in natural settings and determining
under what circumstances and through which
processes speci?c phenomena arise. Experiments
are well suited to this task because they construct
their own research setting. In a constructed
research setting, one can manipulate the indepen-
dent variables, control for other potentially in?u-
ential variables by holding them constant or
through randomisation, and measure the inter-
vening processes (such as information search or
the path players take to equilibrium outcomes in
strategic settings) and mental states (such as
knowledge, beliefs, or con?dence) that a?ect ?nal
outcomes. This allows an experimentalist to dis-
entangle the e?ects of variables that are con-
founded in the environment to draw strong causal
inferences, and to test the e?ects of conditions that
do not yet exist or do not exist in su?cient quan-
tity in the natural environment (Libby & Luft,
1993). Experiments testing how and why (rather
than whether or not) ?nancial accounting phe-
nomena occur can be based on theories of psy-
chological, economic or institutional processes.
We discuss these theories next.
2.3. Theoretical advances in psychology, ?nance,
and economics
Earlier experimental research was criticized for
the lack of psychological or economic theory that
speci?ed the mechanisms through which e?ects of
accounting disclosures would occur. Recent
experiments in ?nancial accounting can rely on
well-developed psychological theories of judgment
and decision making
4
that were in their infancy
when the studies reviewed by Gonedes and Dopuch
(1974) were conducted. Recent research can also
rely on economic models that describe more care-
fully when and how equilibrium outcomes arise.
The major idea underlying much research on
judgment and decision making is that decision
makers are boundedly rational (Simon, 1957).
Decision makers often have limited information
on which to base their judgments and decisions,
limited ability to retain and retrieve that informa-
tion from memory, limited ability to process and
use that information, and limited insight into their
own decision processes and future preferences.
Studies over the last 25 years have focused on how
various attributes of human cognition determine
exactly what humans do well and what they do
poorly. A number of their ?ndings have in?uenced
recent thinking in ?nancial accounting and the
study of ?nancial markets.
Many decision-making studies emphasize the
role of heuristics (Tversky & Kahneman, 1974).
Heuristics are simpli?ed decision rules developed
to deal with complex situations. These heuristics
are e?cient and often work well. But in some cir-
cumstances they may lead to systematic biases
such as over- and under-con?dence in judgment
(Gri?n & Tversky, 1992) and misperceptions of
the covariation between signals and events (Lipe,
1991), which can systematically a?ect the manner
in which individuals react to ?nancial accounting
information and the manner in which that infor-
mation is impounded in prices. Learning to over-
come biases is di?cult because of the uncertainty
and poor feedback inherent in complex environ-
ments. Often what we learn from experience is not
valid (Einhorn, 1980).
The importance of (imperfect) storage and
retrieval of information from memory has also
been recognized in recent ?nancial accounting
experiments. Some of these studies rely on models
4
Syntheses of the key constructs or ideas that drive psycho-
logical theories of judgment and decision making have been
provided by Carroll and Johnson (1990), Hogarth (1993),
Bazerman (1998), and others.
778 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
of memory organization (e.g. Smith & Medin,
1981) that indicate how knowledgeable decision
makers e?ciently organize and retrieve data.
Other studies recognize that memory for events is
in?uenced by factors that are normatively rele-
vant, such as their frequency of occurrence, and
factors that are normatively irrelevant, such as
primacy, recency, and contrast e?ects (e.g.
Hogarth & Einhorn, 1992). Still others recognize
that the limited capacity of working memory
a?ects our ability to consider multiple factors in
making a judgment or choice. Consequently, even
normatively relevant factors that decision makers
are aware of often have limited in?uence on their
judgments and decisions.
Recent research in accounting and ?nance also
relies on psychological models of risk (e.g. Kahne-
man & Tversky, 1979) and ambiguity (e.g. Einhorn
& Hogarth, 1986) that characterize individuals’
responses to risk and reward in ways that deviate
from standard expected utility theory.
5
This more
recent psychology literature provides greater abil-
ity to predict under what circumstances behavior
will be more or less likely to di?er from the pre-
dictions of standard economic theory (e.g. in
earnings predictions versus trading behavior, in
di?erent information environments). A large lit-
erature on social psychology could also be used to
understand interaction between participants in
?nancial accounting settings. For example, research
related to accountability (e.g. Tetlock, 1992),
motivated reasoning (e.g. Kunda, 1990) and group
decision processes (e.g. Yetton & Bottger, 1982)
has signi?cantly in?uenced auditing studies.
Other ?nancial accounting studies use advances
in ?nancial economics to test the assertion that
biased traders will be driven out of the market
through systematic trading losses. Some of these
models focus on how biases might in?uence mar-
ket outcomes. For example, Barberis, Shleifer, and
Vishny (1998) use psychological models of how
people perceive random-walk sequences in a
model with a representative investor. Daniel,
Hirshleifer, and Subrahmanyam (1998), Gervais
and Odean (1997) and Odean (1998) incorporate
overcon?dence into trading models. Other models
focus on forces that keep unbiased traders from
exploiting price errors. For example, De Long,
Shleifer, Summers, and Waldmann (1991) show
that traders who respond irrationally to irrelevant
information (‘‘sentiment’’) create enough noise in
prices to keep rational traders from exploiting the
resulting price errors. Fischer and Verrecchia
(1999) and Kyle and Wang (1997) show that
overcon?dence, although irrational, can actually
give traders higher payo?s than their rational
compatriots. These results make it di?cult to
argue that some form of natural selection will
eliminate irrational traders in dynamic equilibria,
and provide accounting researchers with speci?c
models of how and when individual biases might
in?uence market prices.
Experiments focusing on game theoretic models
of ?nancial accounting settings can now rely on
new economic models that move beyond the tra-
ditional equilibrium view. Rather than simply
identifying an equilibrium and assuming that it
will occur, many economists have examined in
detail what assumptions about rationality must be
satis?ed for equilibria to have predictive power
(Bernheim, 1984; Pearce, 1984; Tan & Werlang,
1988). Other models have examined the process by
which equilibria are achieved, using either psy-
chological theories based on behaviorism (Herrn-
stein & Vaughn, 1980) or evolutionary theories of
natural selection (Maynard Smith, 1982). In a simi-
lar vein, Gode and Sunder (1993, 1997) used such
ideas to show that ‘‘zero-intelligence’’ traders, who
do nothing more than avoid obviously horrible
strategies, can achieve e?cient security allocations
in some markets. By focusing on processes by
which equilibria are achieved, these studies provide
indications of when equilibria will and will not
predict behavior in ?nancial accounting settings.
2.4. Key institutional features of ?nancial
accounting settings
Most early experimental studies in ?nancial
accounting took relatively narrow views of ?nan-
cial accounting institutions. They typically focused
on the set of rules governing how accounting infor-
mation could be reported in ?nancial statements,
5
See Hodder, Koonce, and McAnally (2001) for further
discussion of risk in ?nancial accounting settings.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 779
implicitly assuming that reporting choices (and
interpretations of those choices) were made neu-
trally, rather than being in?uenced by the incen-
tives of a strategic manager or auditor. Early
studies also implicitly assumed that responses to
?nancial accounting information would be inde-
pendent of the expertise or incentives of the user,
and that interactions among users and reporters
would not alter outcomes.
Consistent with the advice of Libby and Luft
(1993), recent experimental research in ?nancial
accounting has considered institutional features
more broadly, and has also focused on the inter-
action between individual and environmental
characteristics. Two key individual characteristics
are the knowledge and motivation of information
reporters and users. These determine the parties’
goals, and how they use ?nancial accounting to
achieve those goals. Key environmental char-
acteristics include the complex regulations govern-
ing reporting, the existence of ?nancial markets,
and the strategic interactions between reporters
and users, as well as between di?erent sets of
users. Regulations determine the set of choices
open to managers and auditors, and may also
determine the results of those actions (e.g. lawsuit
outcomes). Financial markets a?ect how indivi-
dual decisions result in aggregate market out-
comes, such as stock prices, liquidity and trading
volume, and may also determine wealth transfers
among di?erent sets of investors. Strategic inter-
actions capture the intertwining of the incentives
and actions of the many parties to ?nancial
accounting decisions. Financial accounting set-
tings include managers, auditors, investors and
information intermediaries (analysts and the
press) who may all interact strategically. Man-
agers and auditors negotiate to determine the
contents of the ?nancial statement and audit
report. Investors draw inferences about managers’
and analysts’ information and incentives from
observing reports. Managers may choose reports
in an attempt to ‘‘fool’’ investors, but the investors
may be able to anticipate these attempts.
6
Focusing explicitly on individual and environ-
mental characteristics allows experimental
researchers to shed light on how and when
experimental results will generalize to target set-
tings, and also indicate how variations in these
institutions will alter behavior. In this way, an
institutional focus helps researchers to exploit the
comparative advantage of experimentation. In the
next section, we describe how speci?c streams of
experimental ?nancial accounting research have
done so, and indicate how future research could
extend those streams.
3. Key ?nancial accounting questions and
experimental evidence
The goals of the literature that we review are
similar to those of the broader ?nancial account-
ing literature: to increase our understanding of the
?nancial reporting process and its e?ects. While all
of the studies that we examine share the same
general goal, they focus on di?erent elements of
the interactions of boundedly rational managers,
auditors, information intermediaries, and inves-
tors. These di?erences in emphasis led us to divide
the studies into four related categories described
by the following questions.
1. How do managers’ and auditors’ incentives
and ?nancial accounting regulations deter-
mine how they report events?
2. How do knowledge of accounting regula-
tions, managers’ incentives, and the infor-
mation content of accounting reports a?ect
users’ (investors and information inter-
mediaries) interpretations of accounting
reports?
3. How do individual responses to information
a?ect market-level phenomena?
4. How do strategic interactions between
reporters and users of information a?ect
reporting and market outcomes?
We focus primarily on papers published since
the publication of Maines’s (1995) review of this
literature.
6
Financial accounting information is also used for con-
tracting and stewardship purposes, but that has not been the
focus of signi?cant experimental research.
780 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
3.1. How do managers’ and auditors’ incentives
and ?nancial accounting regulations determine how
they report events?
Reporting performance is fundamental to ?nan-
cial accounting. Discretion provided by ?nancial
accounting regulations, coupled with the inherent
subjectivity of much accounting measurement,
allows managers some ?exibility to opportunisti-
cally report or manage earnings. Consequently,
much archival and experimental research has
focused on this area.
Archival studies typically examine opportunistic
reporting by identifying whether earnings or
accruals di?er fromexpectation in a manner favored
by managers’ incentives (see Healy & Wahlen,
1999 for a review). While these studies have
demonstrated numerous instances of apparent
earnings management, their conclusions are some-
times criticized because of methodological di?-
culties, including poor incentive proxies, misstated
discretionary accruals models, or potential omit-
ted variables such as operating choices that have
non-earnings-management rationales but that
a?ect discretionary accruals (Bernard & Skinner,
1996; Dechow, Sloan, & Sweeney, 1995). Also,
archival studies of earnings management focus on
post-audit ?nancial statements that are a joint
product of the negotiations between managers and
auditors, which makes it di?cult to distinguish the
separate contributions of managers and auditors
to earnings management or to determine how
managers’ and auditors’ separate incentives in?u-
ence their reporting and attesting behavior (Nel-
son, Elliott, & Tarpley, 2000).
Experimental studies avoid these problems by
manipulating incentives and assessing treatment
e?ects rather than attempting to measure unex-
pected accruals, and by holding constant task
characteristics that create potential omitted vari-
ables problems. Experiments can examine man-
agers’ and auditors’ judgments separately, but can
also examine auditor–client interactions. These
characteristics of experimental work have led to a
growing experimental literature that complements
the archival work in this area.
The largest group of experimental earnings-
management studies focuses on auditors’ incentives
and the circumstances under which they allow
managers to take aggressive accounting positions.
Consistent with the general auditing literature (e.g.
Kinney & Martin, 1994), results indicate that audi-
tors reduce the aggressiveness of ?nancial reports.
For example, Hirst (1994) provides evidence that
auditors consider management competence and
objectivity when evaluating management-provided
evidence. Phillips (1999) demonstrates that, after
auditors receive evidence of aggressive reporting in
high-risk accounts, they are more likely to attend
to it elsewhere, even in accounts they typically
consider to be of low risk. Kinney and Nelson
(1996) demonstrate a circumstance in which
auditors make audit-reporting judgments that
are as conservative as thought appropriate by
even those investors who are evaluating the
audit report in the presence of negative outcome
information.
However, other studies indicate that auditors
are more likely to allow their clients to take
aggressive accounting positions when the relevant
evidence or precedents o?er more room for inter-
pretation. For example, Nelson and Kinney (1997)
provide evidence that auditors are more (less)
conservative than users required when the relevant
evidence was precise (ambiguous). Similarly, Salt-
erio and Koonce (1997) provide evidence that
auditors’ treatment of clients’ capitalization versus
deferral decisions depends on whether the relevant
precedents unanimously favor one alternative.
When the precedents favor one alternative, audi-
tors follow the precedents, but when the pre-
cedents are mixed, auditors tend to follow their
client’s preference. Mayhew, Schatzberg, and Sev-
cik (2000) provide consistent evidence in experi-
mental markets. When participants in the role of
auditor were sure of the appropriate disclosure,
they made that disclosure, but as their uncertainty
about appropriate disclosure increased, they ten-
ded to misreport in favor of their client.
Other studies have focused on the role of spe-
ci?c incentives in auditors’ reporting decisions.
For example, Hackenbrack and Nelson (1996)
provide evidence that auditors are more likely to
allow their clients to take aggressive accounting
positions if the auditors’ litigation risk is reduced,
and that auditors justify the aggressive position
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 781
with aggressive interpretations of the relevant
?nancial accounting regulations. Hackenbrack
and Nelson hold constant the underlying audit
evidence while varying auditors’ incentives and
whether those incentives favored accrual or foot-
note disclosure of a contingency, allowing them to
infer with high con?dence that incentives were
driving the e?ects they observed. Using the same
case materials, Kennedy, Kleinmuntz, and Peecher
(1997) provide evidence that, even when litigation
risk is relatively high, auditors may tend to take
aggressive reporting positions when they can dif-
fuse personal responsibility by consulting other
experts within the ?rm. Wilks (2001) provides evi-
dence that auditors’ interpretations of evidence
and decisions are a?ected by the views of more
senior auditors. Beeler and Hunton (2001) provide
evidence that incentives from lowballing or man-
agement-advisory services a?ect audit partners’
going concern judgments. Bazerman, Morgan,
and Loewenstein (1997) suggest that auditors
cannot be independent because of the unconscious
e?ect of such incentives, or even because of a sense
of auditor–client a?liation that occurs through
multiple interactions. However, Dopuch and King
(1996) provide evidence that competitive pressures
can reduce the e?ect of incentives like lowballing,
and King (2001) provides evidence that, holding
constant economic incentives, professional–group
a?liation can o?set the in?uence of auditor–cli-
ent a?liation, demonstrating that o?setting
a?liations can have o?setting e?ects on auditors’
independence.
A smaller group of studies examines how man-
agers’ incentives a?ect the aggressiveness of their
reporting decisions. These studies take two
approaches. One approach is to elicit managers’
judgments directly. For example, Cloyd, Pratt,
and Stock (1996) gather data from corporate
?nancial executives at both public and private
manufacturing ?rms. They provide evidence
that, when a manager has selected an aggressive
tax treatment, the manager tends to choose a
?nancial accounting method that conforms to
the tax choice in hopes of better defending the
appropriateness of the tax choice if it is later
questioned by the IRS. Managers of public ?rms
were less likely to choose conformity than were
managers of private ?rms, presumably because
managers of public ?rms face more disincentives
for making income-decreasing ?nancial accounting
disclosures.
The second approach is to elicit the joint pro-
duct of the manager–auditor negotiation indirectly
from auditors. Three di?erent studies use di?erent
versions of this approach. Libby and Kinney
(2000) manipulate factors that a?ect managers’
incentives and ask auditors to determine how the
audited ?nancial statements would appear. They
provide evidence that correction of quantitatively
immaterial errors is much less likely if the correc-
tion would cause the ?rm to miss analysts’ EPS
forecasts (i.e. is qualitatively material), and that
the recently promulgated SAS 89 has little e?ect
on this behavior. Gibbins, Salterio, and Webb
(2000) develop a model of auditor–client negotia-
tion and support their model by surveying audi-
tors concerning their experiences negotiating
contentious accounting issues with their clients.
Nelson, Elliott, and Tarpley (2000) survey audi-
tors concerning their experiences with clients’
attempts to manage earnings, and provide evidence
concerning managers’ incentives for attempting
earnings management, the ?nancial accounting
areas in which managers attempt earnings man-
agement, and the circumstances under which
auditors pass or thwart managers’ attempts.
Overall, these studies provide direct evidence
that managers and auditors use the ?exibility
inherent in accounting rules to make disclosures
that are favored by their incentives. Holding con-
stant amount of ?exibility, changes in incentives
move disclosure in the direction favored by those
incentives. Holding incentives constant, increasing
?exibility increases the degree to which incentives
a?ect decisions.
Certainly one direction for future research is to
continue examining how managers’ and auditors’
incentives a?ect their decisions. In addition, the
literature could work more to identify the pro-
cesses through which these e?ects occur. To what
extent are these e?ects intentional and strategic
versus the unintended results of cognitive limita-
tions? Wilks (2001) provides evidence that incen-
tives a?ect decisions more when the incentives are
made apparent to subjects prior to evaluating
782 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
evidence, suggesting that incentive e?ects in?uence
the evaluation process as well as the decisions that
result from that process. Beeler and Hunton
(2001) provide evidence that incentives a?ect both
the favorability and weighting of evidence, and
that auditors believe that incentives a?ect other
auditors’ judgments, but not their own. A fruitful
direction for future research is to further under-
stand how and when such incentive e?ects occur.
Another useful direction is to examine how
changes in regulations or other interventions
might a?ect the aggressiveness of ?nancial report-
ing. For example, Libby and Kinney (2000), Hirst
and Hopkins (1998), and Maines and McDaniel
(2000) provide evidence of recent regulatory
changes that do not appear to prevent managers
from making aggressive reporting decisions. Cuc-
cia, Hackenbrack, and Nelson (1995) provide
evidence in a tax context that increasing the pre-
cision of a standard does not prevent aggressive
reporting when the underlying evidence also pro-
vides latitude for interpretation. When coupled
with evidence of the e?ect of incentives on report-
ing judgments, ?ndings indicating the ine?ective-
ness of some regulatory interventions suggest that
regulators might reduce aggressiveness more
e?ectively by addressing incentives directly via
changes in penalties. Alternatively, other approa-
ches like improvements in audit-evidence sequen-
cing (Phillips, 1999) or within-?rm consultation
(Kennedy et al., 1998) might also a?ect the
aggressiveness of ?nancial reports, by a?ecting the
extent to which auditors discourage aggressive
reporting.
Finally, future research could focus more on the
interaction among participants in the ?nancial
reporting process. Researchers are only beginning
to consider the process by which auditors negoti-
ate with their clients to produce the joint product
that investors consume. Also, the increasing role
of audit committees in this process remains lar-
gely uninvestigated. Addressing these issues via
experiments (e.g. Libby & Kinney, 2000), surveys
(e.g. Gibbins et al., 2000; Nelson, Elliot, & Tarp-
ley, 2000), and laboratory markets (e.g. Mayhew
et al., 2000) appear to be useful directions for
future research. These issues are discussed more in
Section 3.4.
3.2. How do information users interpret reports,
given their knowledge of the regulations governing
those reports, and their knowledge of the reporters’
incentives?
Three streams of literature address distinct
facets of this question:
1. How do accounting methods and disclosure
alternatives a?ect earnings predictions and
value estimates of investors and information
intermediaries?
2. How do investors and analysts use the time-
series properties of earnings to predict future
earnings?
3. What determines analysts’ forecasting and
valuation performance?
We discuss each in turn.
3.2.1. How do accounting methods and disclosure
alternatives a?ect earnings predictions and value
estimates of investors and information intermediaries?
The earliest experimental research in ?nancial
accounting tended to be motivated by the need for
evidence to address speci?c accounting policy
debates. These studies focused on whether inves-
tors and others adjusted appropriately for the
e?ects of accounting methods and disclosure
alternatives (e.g. Dyckman, 1964; Jensen, 1966).
Looking back on the earlier literature, it is readily
apparent that the answer to this question is
‘‘sometimes.’’ Some participants in nearly every
study of this type demonstrate some degree of
functional ?xation; they do not fully adjust for
di?erences in the e?ects of accounting alternatives
on the bottom line (Maines, 1995, p. 90, 91). As a
consequence, ?rms that are in identical economic
circumstances except for their choice of accounting
alternatives are sometimes judged to be di?erent.
These speci?c policy-oriented studies did little to
tell us how the extent of functional ?xation will
vary across types of decision makers or economic
circumstances, or what psychological processes
underlie insu?cient adjustments to accounting
policies. Consistent with this concern, much
recent research has heeded the advice of Maines
(1994) to focus on the dimensions of disclosure,
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 783
environmental factors, and processes that deter-
mine the degree to which appropriate adjustments
are made. In response to a recent call for more
speci?c policy-oriented experiments (Beresford,
1994), Maines (1994) noted that ‘‘Psychological and
sociological research may be most productively used
to guide behavioral accounting research on gen-
eral issues that underlie many di?erent accounting
standards, rather than focusing on issues relevant
to only one standard.’’ Understanding the e?ects
of these general factors will dramatically broaden
the relevance of this research.
Three groups of studies demonstrate progressive
re?nement in the manner in which this research
question has been addressed. The ?rst group
focuses on the mechanisms through which place-
ment and classi?cation of accounting disclosures
a?ect the use and interpretation of the disclosures.
The second group explicitly or implicitly recog-
nizes that managers issuing accounting reports
have their own strategic interests and will report
opportunistically, and examines how users
respond to voluntary disclosures by managers.
The third recognizes that analysts respond to their
own strategic interests and examines how users
respond to potential relationship induced bias in
analysts’ reports. We discuss each in turn.
3.2.2. General issues underlying functional ?xation
The development of category structures in
memory plays a major role in allowing expert
decision makers to respond e?ectively and e?-
ciently in complex decision environments. In these
structures, attributes are associated with cate-
gories as opposed to individual instances of the
category. An individual instance or event is then
interpreted based in part on its category member-
ship. This allows for e?cient and often e?ective
processing of attributes of the environment, but
sometimes produces errors when the particular
instance does not match the typical category
attributes well. A number of recent papers have
recognized that classi?cation issues like the
assignment of a ?nancial disclosure to a particular
?nancial statement, to a speci?c subsection within
a statement, or to the notes, will a?ect decision
makers’ categorization of that disclosure and
interpretation of its relevance and meaning.
Existing studies have examined three dimensions
of classi?cation. Hopkins (1996) examined the
e?ects of classi?cation of items on the right side of
the balance sheet as debt, equity, or mezzanine
?nancing on judgments of the stock price e?ects of
new ?nancing. He found that experienced buy-side
analysts who had knowledge of the di?erential
stock price e?ect of debt and equity issuances
found in ?nancial economics research responded
to the issuance of hybrid securities based on their
categorization. When the securities were classi?ed
as mezzanine, for which the analysts had no well-
de?ned category, they responded based on the
attributes of the individual security. Similarly,
Hopkins, Houston, and Peters (2000) examined
issues related to categorization of costs as operat-
ing expenses, one-time charges, or note disclosure.
Experienced buy-side analysts treated the account-
ing acquisition premium in a merger in part based
on its classi?cation. One-time charges and note
disclosures were treated as less relevant to stock
valuation than operating expenses. Finally, Hirst
and Hopkins (1998) and Maines and McDaniel
(2000) examined whether placement of elements of
comprehensive income on the income statement
versus the statement of stockholders’ equity a?ec-
ted the ability to detect earnings management and
changes in earnings volatility. Information placed
on the income statement (the primary perfor-
mance statement) was much more likely to be
treated as relevant to future performance esti-
mates by the experienced analysts in Hirst and
Hopkins (1998) as well as by the evening MBA
students in Maines and McDaniel (2000).
Maines and McDaniel (2000) also present the
beginnings of a theory of format e?ects. Their
theory lists ?ve factors that a?ect the degree to
which investors will rely on a particular disclosure
in assessments of corporate performance: place-
ment, labeling as income, linkage (to net income),
isolation, and degree of aggregation. Such a the-
ory holds the promise of allowing predictions of
e?ects beyond the scope of individual studies, as
Maines (1994) recommends. Future research can
re?ne and test the model in other circumstances.
Other studies identify the stage in the decision
process where any failure to adjust for accounting
or disclosure di?erences occurs. Following prior
784 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
credit analysis and auditing research (e.g. Abdel-
Khalik & El-Sheshi, 1980; Bonner, 1990), Lipe
(1998) uses a series of debrie?ng questions to
separate the e?ects of measurement from weight-
ing. She examines whether investors can accurately
assess the variance and covariance of returns in
making risk assessments and whether they use
those assessments in their investment decisions.
7
Maines and McDaniel (2000) use a combination
of debrie?ng questions and regression analysis to
determine whether di?erences in accessing the
information cues, interpreting or measuring the
cues, or weighting the cues caused their results.
They suggest that participants in all disclosure
conditions accessed and interpreted the cues in the
same manner, but weighted them more heavily in
the income statement presentation condition.
Another set of studies uses improved theories of
functional ?xation to de?ne ‘‘superior’’ disclosure
methods. Early studies only determined if di?erent
judgments or decisions are made and ignore the
issue of determining the superior disclosure
method. Many of the newer studies specify sub-
tasks necessary for successful ?nal judgments or
decisions, such as detection of earnings manage-
ment (Hirst & Hopkins, 1998), assessment of
variability in underlying ‘‘core’’ earnings (Maines
& McDaniel, 2000), or covariance assessment (Lipe,
1998). Alternatively, Maines, Mautz, Wright, Gra-
ham, Rosman, and Yardley (2000) approach the
question of assessing which disclosure method is
superior in a way similar to the training and deci-
sion aids literature in auditing. They suggest that
high quality reporting methods (1) allow novice
decision makers to perform like expert decision
makers and (2) allow the same decisions to be
made as completely disaggregated disclosures.
They apply their approach in a study of joint-ven-
ture ?nancial reporting standards. The approach is
consistent with the SEC and FASB’s concern for
the naive investor, as well as e?ciency concerns
and Hand’s (1990) suggestion of investor sophisti-
cation e?ects as a partial explanation for market
ine?ciencies. This study, Maines, McDaniel, and
Harris’s (1997) study of segment standards, and a
number of the above-mentioned are motivated in
part by a particular policy issue of current interest.
Again, we believe that their impact is determined
by their ability to relate the particular policy issue
of interest to more general phenomena that inform
a wider array of policy questions.
3.2.3. Responses to voluntary disclosures
The studies discussed above implicitly assume
that disclosures are generated by a neutral process.
However, managers issuing accounting reports
generally have their own strategic interests and
will report opportunistically. A number of studies
address how this strategic element a?ects users’
decisions.
The ?rst two studies examine the e?ects of the
form of disclosures. Kennedy, Mitchell, and Sefcik
(1998) examine how investors interpret the di?er-
ent allowable forms of contingent environmental
liability disclosure: minimum, best estimate, max-
imum, or range of the distribution. Experienced
?nancial executive, manager, banker, and MBA
student participants’ assessments of the distribu-
tion of possible losses implied by each disclosure
did not match the commonly accepted meaning of
the terms. For example, when the ‘‘best estimate’’
was disclosed by management, the participants
interpreted it as the minimum, and when a range
was disclosed, the participants’ estimates of the
expected value were well above the midpoint of
the range. The participants clearly believed that
managers bias their disclosures downward.
8
It also
indicates that accounting information has di?erent
e?ects on di?erent judgments, in this case, man-
agement credibility and ?rm value.
Hirst, Koonce, and Miller (1999) examine
investors’ interpretation of point versus range
forecasts and historic forecast accuracy on earnings
7
She also examines how they react when market and
accounting measures con?ict. Her study is unique at this point
in jointly examining the role of accounting and non-accounting
information. It also suggests the possibility that the weight
placed on normatively relevant information may change with
the inclusion of less-relevant information and presents a
potential explanation for the lack of diversi?cation of indivi-
dual portfolios.
8
Participants also believed that managers that decided to
disclose the minimum were the least credible, yet they valued
their ?rms the most highly. This suggests that the accounting
standard provides managers with a perverse incentive to pro-
vide the least informative disclosure.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 785
forecasts and con?dence in forecasts (which they
relate to trading). If both of these forecast attri-
butes indicate precision of the forecast, they both
should a?ect forecasts and con?dence. However,
only prior accuracy had an e?ect on earnings
forecasts, while both factors a?ected con?dence
and trading. This again indicates that normatively
relevant attributes of accounting information may
a?ect some judgments and decisions but not others.
Libby and Tan (1999) and Tan, Libby, and
Hunton (2000) investigate the e?ects of earnings
warnings or preannouncements on sell-side ana-
lysts’ forecasts of future periods’ earnings. Libby
and Tan provide a demonstration of the process
through which the same disclosure can have dif-
ferential e?ects on di?erent judgments and deci-
sions. They examine why analysts say in the press
that they reward ?rms that warn, yet punish them
in their forecasts. They demonstrate that this
inconsistency results from the simultaneous pro-
cessing of the warning and earnings announce-
ment in answers to press questions versus the
sequential processing of the same signals in the
forecasting setting. Tan, Libby, and Hunton (2000)
demonstrate that ?rms that low-ball preanno-
uncements of both positive and negative earnings
surprises will receive higher forecasts for future
period’s earnings, even though the reporting man-
agers themselves are judged as having lower
integrity and competence. Also, analysts are aware
of management’s tendency to low-ball the pre-
announcements, but do not adjust their estimates
of earnings of ?rst time preannouncers in light of
this base rate knowledge. This again indicates that
known attributes of accounting information do
not a?ect all judgments in the same manner.
3.2.4. Responses to analyst’s forecasts
Hirst, Koonce, and Simko (1995) and Ackert,
Church, and Shehata (1997) investigate the e?ects
of potential bias in analysts’ reports on investors’
use of those reports. MBA student subjects in
Hirst, Koonce, and Simko (1995) expected ana-
lysts whose employers also provide investment
banking services to the company to be more
biased than those that do not. However, this per-
ceived bias only a?ected their reliance on the report
when the report gave a negative recommendation.
Similarly, the strength of the analysts’ arguments
had an e?ect only for negative recommendations.
Ackert, Church, and Shehata (1997) extend this
study to a multiperiod setting where subjects have
the option to acquire forecasts from analysts, and
also observe actual earnings. Individuals were
much less willing to acquire analysts’ forecasts
that proved to be biased in the past, even when the
forecast information was useful. Both studies sug-
gest the need to better understand the processes
that determine when reports from analysts and
other information intermediaries will be purchased
and relied upon.
A general picture emerges from the above stud-
ies. First, management’s often cited (Beresford,
1994) preoccupation with the bottomline, and more
speci?cally with potential penalties for earnings
volatility and e?ects of cosmetic di?erences,
appears at least in part well founded. Second, we
have begun to understand that placement, categor-
ization, and labeling all play a role in the simpli?-
cations that even professional analysts apply when
evaluating accounting information. Future research
on the knowledge structures developed by experts
for di?erent types of companies and di?erent types
of ?nancial judgments and decisions promises to
increase our understanding of these e?ects.
It is also clear from the above results that the
information that decision makers rely upon in
their judgments is limited, and the information
emphasized clearly changes depending on the
?nancial judgment being made and other elements
of the environment. In fact, awareness of cosmetic
di?erences (and ability to ‘‘do the math’’) does not
ensure full consideration of their implications for
valuation. The same is true of knowledge of man-
agement’s tendency to opportunistically employ
vague reporting standards or analysts’ tendency to
bias their reports. There appear to be many cases
where the same normatively relevant factors are
ignored in one circumstance, but adequately
weighted in another by the same decision maker.
The fact that results here tie closely to archival
data gathered in prior studies adds to the cred-
ibility of the results. Future studies should focus
on systematically determining the circumstances in
which di?erent classes of information receive ?rst-
order consideration.
786 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
Earlier research on the e?ect of task complexity
on the use of alternative decision rules in credit
decisions (e.g. Biggs, Bedard, Gaber, & Linsmeier,
1985; Paquette & Kida, 1988; see Payne, Bettman,
& Johnson, 1992 for a review of psychological
studies) will provide some guidance in this area.
However, it appears that the determinants of which
information items receive ?rst order consideration
in particular judgment situations involves more
than task complexity. Findings of the importance
of cue-response compatibility (Slovic & Lichten-
stein, 1968) and other task determinants of cue
usage in early judgment and decision making
research (e.g. Einhorn & Hogarth, 1981; Slovic &
Lichtenstein, 1971) may provide useful directions
for future research in this area. Furthermore, the
interplay between these factors, investor sophistica-
tion and e?ort, and various market attributes dis-
cussed in Section 3.3 appear critical in determining
the importance of cosmetic disclosure di?erences.
3.2.5. How do investors and analysts use the time-
series properties of earnings to predict future
earnings?
Post-earnings-announcement drift has become a
very active stream of archival research. Bernard
and Thomas (1990) provide evidence that drift
arises because investors misperceive the time-series
of earnings. Speci?cally, quarterly earnings follow a
Brown–Roze? model, which has two key elements.
One element is the autoregressive component —
changes from one quarter of one year to the same
quarter of the next tend to be positively auto-
correlated. The other element is the ‘‘moving
average’’ component — the di?erences between
actual and predicted earnings tend to be negatively
correlated from one quarter to the same quarter of
the next year. Research by Bernard and Thomas
(1990) and Ball and Bartov (1996) indicate that
investors underestimate both the autoregressive
and moving-average components of quarterly
earnings; results from Abarbanell and Bernard
(1992) indicate that analysts make a similar mistake.
Recent studies have used the advantages of the
experimental approach to understand the psycho-
logical nature of investors’ and analysts’ time-ser-
ies prediction errors. Calegari and Fargher (1997)
provides a logical starting point — they attempt to
replicate drift in the laboratory, using experi-
mental controls to rule out the possibility that
prediction errors are driven by factors other than
judgment errors.
9
Just as archival studies focus
only on ?rms with extreme earnings surprises,
Calegari and Fargher use time series that exhibit
unusually large earnings changes in the most
recent quarter. Their results are largely consistent
with archival research — both individual traders
and market prices underreact to earnings surprises.
Maines and Hand (1996) extend this ?nding in
two ways. First, they present MBA students with
two di?erent 40-quarter time-series. One series has
strong autoregressive and moving-average com-
ponents. Another is simply a seasonal random
walk with no such components. Subjects under-
react to both elements when they are present, but
also act as if the autoregressive element is present
when it is not. This suggests that drift may arise in
the target environment simply because it is too
di?cult for investors to discern the autoregressive
and moving average terms. Drift may therefore be
less severe for ?rms that adhere more closely to a
seasonal random walk. Second, Maines and Hand
directly test Bernard’s (1993) hypothesis that
investors anchor too strongly on earnings from the
same quarter of the previous year, perhaps
because it is stressed in the reporting format used
in the popular press. Maines and Hand test this
supposition by presenting a new set of subjects
with a Brown–Roze? time-series, and reporting
earnings relative to earnings from four quarters
ago. The results raise doubts about Bernard and
Thomas’s (1990) hypothesis, because these sub-
jects place even more weight on the autoregressive
component of the time series. These results suggest
the need to test for alternative causes.
Bloom?eld, Libby, and Nelson (2000a) argue
that drift may arise because people naturally over-
rely on unreliable information (Bloom?eld, Libby,
& Nelson, 2000b; Gri?n & Tversky, 1992), and
old earnings numbers tend to be unreliable pre-
dictors of future earnings, once more current
9
For example, investors and analysts could appear to make
prediction errors in archival studies because they respond to
information other than earnings, because they have incentives
for something other than prediction accuracy, or because they
are attempting to manage risk.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 787
earnings are known. They test this hypothesis by
manipulating information about old earnings per-
formance, holding recent earnings performance
constant. Student subjects rely much too heavily
on old earnings numbers, and generate errors
consistent with post-earnings-announcement drift,
even when they are presented with a time series
that is much simpler than that used in other
experiments. This suggests that drift may not arise
merely because the time-series properties of earn-
ings are so complex.
Future research in time-series perceptions might
follow several directions. One direction is to inte-
grate the di?erent research approaches described
above. The realistic time-series used by Calegari
and Fargher (1997) and Maines and Hand (1996)
allow them to generalize their results readily to
archival settings, but make it di?cult for them to
ascertain how aspects of the time-series data
interact with psychological processes to cause
prediction errors. The simpler time-series data
used in Bloom?eld, Libby, and Nelson (2000a)
poses precisely the opposite problem. Future
research might attempt to work toward the middle
of these two approaches, either by using time-
series that are progressively simpler than in the
former studies, or progressively more complex
than in the latter study.
Future research might also investigate the model
of Barberis, Shleifer, and Vishny (1998). That
model assumes that earnings follow a random
walk, but that investors believe that earnings
switch between regimes of positive autocorrelation
and regimes of negative autocorrelation. This
misperception results in both underreactions to
recent earnings changes and overreactions to long-
termtrends. While such misperceptions are broadly
consistent with psychological ?ndings indicating
representativeness and conservatism biases, no
single study supports its assumptions, and their
predictions are not entirely consistent with archi-
val evidence (e.g. Lee and Swaminathan, 2000).
Finally, future studies might attempt to inte-
grate research on time-series predictions with
other research streams that consider earnings pre-
diction more broadly. For example, how might
knowledge of earnings components (accruals, cash
?ows) alter subjects’ time-series predictions?
3.2.6. What personal and process attributes
determine analysts’ forecasting and valuation
performance?
As Maines (1995) notes, a number of studies in
the 1970s and 1980s examined the manner in
which expert and novice analysts process
accounting information (e.g. Mear & Firth, 1987;
Panko? & Virgil, 1970; Slovic, Fleissner, & Bau-
man, 1972; Wright, 1977). The studies assessed
various characteristics of information search, cue
weighting, judgment consistency and consensus,
and self-insight into information processing. A
number of the more recent studies in this group
used detailed process tracing techniques in an
attempt to tie individual or process attributes to
judgment accuracy (e.g. Anderson, 1988; Biggs,
1984; Bouwman, 1984). However, most studies
were only able to relate process attributes to
experience because of subject sample constraints
or di?culty in measuring judgment performance.
These earlier experiments also did not focus on the
e?ects of analysts’ incentives, which have received
a great deal of attention in recent archival studies.
Three recent studies have added substantially to
our understanding of the relationship of personal
and process variables to forecast accuracy as well
as the impact of relationship incentives on bias in
forecasts. Hunton and McEwen (1997) emphasize
both process measurement and disentangling
variables that are confounded in natural settings.
They address whether sell-side analysts’ search
strategies and incentives (in the form of their rela-
tionship to the company) a?ected the accuracy
and bias of their earnings forecasts. Information
search strategy was assessed with an eye move-
ment measurement system that eliminates most
concerns about the reactivity and validity of ver-
bal protocols. The authors measured the accuracy
of the forecasts made in the experiment as well as
historical accuracy from company archives, which
assures external validity. Analysts that followed a
more directed (as opposed to sequential) search
strategies were more accurate both in the experi-
mental task and in practice. The analysts in the
underwriting condition gave higher (more biased)
forecasts than those in the following condition,
which were higher than those in the no relation-
ship condition. Careful use of controls eliminates
788 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
concerns about omitted variables such as informa-
tion availability, time on task, and some forms of
selection that could have explained similar ?ndings
in archival studies (see Kothari, 2000 for a review).
Few studies have examined the knowledge and
abilities that lead to successful performance by
analysts. Ghosh and Whitecotton (1997) present
evidence that two standard psychometric measures
of information processing ability (perceptual abil-
ity and tolerance for ambiguity) were correlated
with forecast accuracy. But, as in Hunton and
McEwen (1997), experience was unrelated to
accuracy. However, Whitecotton (1996) reports
that experienced analysts outperformed MBA
students, who outperformed undergraduate stu-
dents, though the experienced analysts were the
most over optimistic.
Like similar work in auditing, these ?ndings are
potentially relevant to the selection and training of
analysts, as well as the interpretation of their
forecasts and reports. Again, the fact that results
here tie closely to archival data, gathered either in
the same study in the case of Hunton and McE-
wen’s (1997) accuracy measures, or in prior studies
in the case of their incentives ?ndings, adds to the
credibility of the results. Recent archival studies
by Mikhail, Walther, and Willis (1997), Clement
(1999), and Jacob, Lys, and Neale (1999) have
documented di?erences in the experiences of more
and less accurate analysts that may indicate direc-
tions for future research. In the auditing literature,
expertise studies have re?ned such ?ndings in studies
that specify the knowledge necessary to complete
various tasks, when it is acquired, and the
mechanisms through which knowledge content and
structure a?ect performance. These studies can
provide guidance for future ?nancial accounting
research in this area. Other recent work has begun
to look at how these individual responses a?ect
market-level performance and the characteristics
of markets that will a?ect information dissemina-
tion. This research is discussed in the next section.
3.3. How do individual responses to information
a?ect market-level phenomena?
Early experimental research in ?nancial account-
ing implicitly assumed that individual behavior
would a?ect market-level prices in some straight-
forward manner (e.g. the price might be simply the
average of all investors’ beliefs), and that some
investors would lose money to more sophisticated
investors by trading unwisely at market prices.
Counter-arguments by proponents of the e?cient
markets hypothesis have led many experimental
researchers to make these assumptions explicit and
subject them to testing. We divide this literature
into three lines: those that address di?erences
between individual and aggregate behavior, infor-
mation aggregation, and excess trading volume.
3.3.1. Di?erences between individual and
aggregate behavior
A number of papers examine whether or not
individual responses to information extend to the
market level. Two papers examine whether indivi-
dual responses to risk extend to the market level.
Coller (1996) shows that both individual traders
and market prices respond to uncertainty in public
disclosures in a manner roughly consistent with
Bayesian rationality. Bloom?eld and Wilks (2000)
show that, consistent with theoretical and archival
work on disclosure, more accurate disclosures
increase individual and market prices relative to
expected values, and also increase individual and
market liquidity. A larger number of papers show
that biases in individual decisions result in biased
market prices as well. For example, Calegari and
Fargher (1997) show that post-earnings-announce-
ment drift persists in a double auction market, and
Bloom?eld, Libby, and Nelson (2000a) show that
over-reliance on previous years’ earnings persists
in a clearinghouse market. Tuttle, Coller, and
Burton (1997) show that recency e?ects extend to
the market level.
Dietrich, Kachelmeier, Kleinmuntz, and Lins-
meier (2000) conduct a study closely related to the
functional ?xation (e.g. Hopkins, 1996) and volun-
tary disclosure (e.g. Kennedy et al., 1998) studies
discussed in Section 3.2.1. They demonstrate that
more explicit disclosure of accounting information
about oil-producing properties leads to more e?-
cient market prices even though the same infor-
mation can be inferred from the balance sheet and
income statement. Di?erent disclosure forms
either mitigate or exacerbate biases in prices. The
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 789
authors test their process explanation by tying
individual participant’s behavior to prices to
ensure that the market price results are the result
of individual information processing biases.
Other research investigates how competitive for-
ces might allow less biased traders to have more
in?uence on price, and use that explanation to
guide examination of when this is more likely to
occur. Of particular interest is the ‘‘smart-trader’’
hypothesis, which states that traders who are less
susceptible to the bias trade more actively than
other traders, driving prices to unbiased levels
(Camerer, 1987, 1992). The intuition behind this
hypothesis underlies the strong-form of the e?-
cient markets hypothesis, which states that prices
will fully re?ect information even if it is held only
by a small number of traders.
Anderson and Sunder (1995) provide evidence
that the smart-trader hypothesis might be more
predictive among professional traders than among
student traders. They compare the extent of base-
rate neglect in markets involving student subjects
with the bias in markets involving professional
traders. They report that price biases in markets of
professional traders exhibit less base-rate neglect
over time, while price biases in markets of students
do not. This is so even though the professional
traders’ individual value estimates do not appear
to di?er from the students’ estimates. This suggests
that the professional traders are able to trade in a
way that reduces bias more (or increases it less).
Bloom?eld, Libby, and Nelson (1996) provide
evidence favoring the smart-trader hypothesis in a
market in which security values are determined by
the answer to general business knowledge ques-
tions. Traders with more accurate answers do
indeed trade more actively than other traders.
When prices are in?uenced by trading volume,
prices become more accurate than the simple
average of all traders’ value estimates. (Prices are
no more accurate than average estimates when
they are not in?uenced by trading volume.) This
study might support the smart-trader hypothesis
more strongly than the studies above because
inaccurate traders are not biased, but merely
uninformed. It is possible that uninformed peo-
ple are more likely to know that their answers
are inaccurate (and therefore trade less aggres-
sively) than biased people, because biases are
unconscious.
Kachelmeier (1996a) uses an analysis of bids
and asks to show the di?culty in determining
exactly how markets can debias prices. He induces
a sunk-cost fallacy that signi?cantly increases sell-
ers’ asking prices and buyers’ bidding prices.
However, these biases have no e?ect on transac-
tion prices, because the higher bids and asks cause
more trades to take place at the bids, which keeps
prices low.
Other recent studies show that market structure
can be important in determining when the smart-
trader hypothesis is likely to be supported.
Ganguly, Kagel, and Moser (1994) present student
subjects with a problem that leads to base-rate
neglect. They ?nd that, because traders are not
allowed to sell shares they do not own (short-sell-
ing is prohibited), market prices are set by the
traders with the highest valuation. As a result,
market prices exhibit base-rate neglect most
strongly (weakly) when the biased prices are
higher (lower) than the Bayesian expected values.
Bloom?eld and Wilks (2000) ?nd strong indivi-
dual evidence of an ‘‘endowment’’ e?ect — incon-
sistent with Bayesian optimization, traders choose
higher ask (selling) prices for riskier securities,
even as they simultaneously enter lower bid (buy-
ing) prices. However, higher risk does not cause
the market ask price to rise. This form of irra-
tionality at the individual level is eliminated at the
market level because the market ask is determined
by the lowest individual ask. The market ask,
therefore, re?ects the selling price of the investor
who succumbs least to the endowment e?ect. In
this way, the structure of the market combines
with the nature of the bias to mitigate the bias at
the market level.
Future research could examine the foundations
of the smart-trader hypothesis more directly. In
particular, what factors might induce less-biased
traders to exploit biases, or keep them from doing
so? What factors might make more-biased traders
curtail their trading activity? How might changes
in market structure, or the degree of market depth
and liquidity, a?ect bias mitigation? (Archival
studies routinely show larger biases in less liquid
stocks.) Future research could also examine how
790 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
the nature of ?nancial accounting information will
a?ect the di?erence between individual and
aggregate behavior. To the extent that informa-
tion induces biases, rather than degrees of
informedness that di?er across traders, prices
would seem more likely to represent an average of
all traders’ beliefs.
3.3.2. Information aggregation and underreaction
A di?erent stream of research examines the
ability of ?nancial markets to aggregate informa-
tion held by di?erent traders. Like studies of the
smart-trader hypothesis, aggregation studies are
motivated by the belief that traders who know a
security value does not re?ect their own informa-
tion will trade aggressively to exploit that fact,
thereby revealing their information to the market.
Early studies on information aggregation
showed that markets do often aggregate informa-
tion. They do so most e?ectively when security
values are tied to states of nature in very simple
ways (O’Brien & Srivastava, 1991; Plott & Sunder,
1988), and when experienced traders have com-
mon knowledge regarding the information envir-
onment (Forsythe & Lundholm, 1990).
More recent studies have examined how uncer-
tainty a?ects information aggregation. In a series
of double-auction markets, Lundholm (1991)
manipulates the ‘‘aggregate uncertainty’’ that
remains after combining investors’ information
about security value. He ?nds that markets with
aggregate uncertainty aggregate information much
less e?ciently than those with aggregate certainty.
Imperfect aggregation can lead markets to under-
react to information, because prices will be too
high when the aggregate information indicates a
very low value, and too low when the aggregate
information indicates a very high value. Bloom-
?eld (1996a, 1996b) shows a similar type of
underreaction in a setting which allows aggregate
certainty, but in which the information structure is
su?ciently complex that information aggregation
is still very di?cult.
Other papers show that market prices can even
underreact to public information that need not be
aggregated. Gillette, Stevens, Watts and Williams
(1999) construct a market in which security values
are determined by a sequence of random dividends.
The authors analyze the market’s reactions as the
dividends are announced publicly one-by-one.
They ?nd that the individual traders’ estimates of
value underreact slightly to the dividend announce-
ments, possibly because they erroneously believe
that random events tend to reverse over time (the
‘‘gambler’s fallacy’’). More interesting is the fact
that market prices underreact substantially more
than individual value estimates. The reason for
this sluggishness in market prices is not clear, but
the authors replicate it in both double-auctions
and call markets. Bloom?eld, Libby, and Nelson
(2000b) also observe a similar e?ect in clearing-
house markets. Bloom?eld (1996a) shows that
markets react to a public signal when it is subject
to manipulation by a self-interested seller, but not
when the signal is purely random. These results
raise the possibility that post-earnings-announce-
ment drift and underreactions to other informa-
tion (e.g. fundamental values, analysts’ estimates)
may arise simply due to a generic underreaction of
market prices to information, rather than infor-
mation-speci?c biases.
Several future directions for research in this area
entail making endogenous the distribution of
information among subjects. All of the aggrega-
tion studies described above manipulate informa-
tion distribution by exogenously altering who is
given information and who is not. Future studies
might relax this assumption by recognizing that
collection of information is an intentional action
that is driven in part by the perceived bene?t of
becoming informed, as in Tucker (1997). Alter-
natively, one might recognize that some informa-
tion may be e?ectively widely distributed because
it is more easily analyzed. For example, Sloan’s
(1996) archival evidence that prices are too high
(low) when ?rms have high (low) accruals might
simply re?ect an underreaction to ?nancial state-
ment information that is not widely known. This
result is consistent with Bloom?eld and Libby’s
(1996) ?nding that laboratory markets respond
more strongly to information that is more widely
available. However, a more direct test of this
hypothesis would be to give all traders the same
information (e.g. a complete ?nancial statement),
and vary the ease with which the information can
be analyzed (as in Dietrich et al., 2000), as well as
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 791
the traders’ knowledge and training that would
help with such analysis.
More generally, researchers might start with the
features we argue are essential for progress in
functional ?xation research — explicitly under-
standing how people process and interpret the
information in ?nancial statements, and then con-
sidering how di?erences in that processing might
alter market behavior.
3.3.3. Trading volume
A third line of research examines the determi-
nants of trading volume in laboratory markets.
Many of these studies are motivated by a general-
ization of the ‘‘no-trade’’ theorem (Milgrom &
Stokey, 1982), which shows that under fairly gen-
eral conditions, information releases should not
induce any trading between traders. The intuition
is that if one trader expects to make money trad-
ing at a given price, the trader on the other side of
the transaction must expect to lose money (since
trading is a zero-sum game).
Gillette et al. (1999) ?nd routine violations of
the no-trade theorem: trading volume is generally
quite high, and is even higher after very high or
low dividend announcements. These results are
consistent with archival evidence on trading
volume (e.g. Bamber, 1987; Bamber, Barron, &
Stober, 1997), which have generated a number of
theoretical models that generate trade through
complex interactions between public and private
information (e.g. Kim & Verrecchia, 1994). How-
ever, the simplicity of the market in Gillette et al.
(1999) makes such explanations unlikely.
Excess trading is a puzzle in Gillette et al.
(1999), but it has few welfare implications because
all traders are identical, and therefore wealth
transfers can be ignored (or are at best impossible
to interpret). Bloom?eld, Libby, and Nelson
(1999) examine excess trading that has very clear
welfare implications. They create markets in which
less-informed traders hold a subset of the infor-
mation available to better-informed traders. Less-
informed traders unwisely trade with — and lose
money to — the more-informed traders. However,
additional instructions that clarify to less-
informed investors the extent of their informa-
tional disadvantages reduce these wealth transfers
(although it has no apparent e?ect on price bia-
ses). These results have regulatory implications:
less sophisticated individual investors (who have
less information than more sophisticated indivi-
duals or institutional investors) can be protected
by regulations that emphasize the extent of their
informational disadvantage.
There appear to be a number of open questions
related to trading volume. Archival papers have
examined volume in response to earnings
announcements, or tie volume to pricing anoma-
lies (Lee & Swaminathan, 2000; Swaminathan &
Lee, 2000). These ?ndings may be caused by fac-
tors indicated in economic models (e.g. Kim &
Verrecchia, 1994) or by psychological factors. The
literature on motivated reasoning seems particu-
larly promising, because it examines how initial
variations in beliefs and preferences can be mag-
ni?ed by ambiguous public disclosures of infor-
mation (Wilks, 2001).
3.4. How do strategic interactions between
reporters and users of information a?ect reporting
and market outcomes?
Game theory has been exceptionally useful in
modeling the strategic interactions between sellers
(who can make reports about their value) and
buyers who rely on those reports in making their
trading decisions. These models potentially have
regulatory implications, because they show that
seemingly reasonable regulations may be unneces-
sary or unwise when one considers the joint
response of buyers and sellers to the regulation.
The models are very di?cult to test with archival
methods, because their predictions are derived in
settings that are far simpler than natural markets.
However, a number of experimental researchers
have chosen to examine behavior in settings that
closely resemble those described in the models. In
this section, we brie?y review some of these
experiments.
One line of research examines voluntary dis-
closure models, in which sellers choose between
honestly disclosing the exact value of the security
they are selling, and not disclosing anything at all.
Two papers by King and Wallin ?nd strong sup-
port for the qualitative predictions of the models
792 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
of Jung and Kwon (1988), and Wagenhofer
(1990). King and Wallin (1991b) ?nd that increas-
ing the probability that the seller is informed leads
sellers to disclose more often, and also leads buyers
to draw more unfavorable inferences when they do
not observe disclosure (making disclosure a wise
strategy for sellers). King and Wallin (1995) show
that disclosure is also limited by introducing a cost
to disclosing favorable information (a competitor
who will take advantage of favorable disclosures
to enter the seller’s product market), because even
high-value ?rms might choose not to disclose. In
both cases, however, results deviate substantially
from the point predictions of the models.
Forsythe, Lundholm, and Reitz (1999) show
how disclosure regulations a?ect the welfare of
buyers and sellers in a simple market with volun-
tary disclosure. When sellers are not permitted to
disclose their information about value, many sur-
plus-enhancing transactions do not occur, and
both buyers and sellers su?er. Allowing sellers to
disclose any value (even a false one) increases
market surplus, but these gains accrue almost
entirely to the sellers. Requiring sellers’ reports to
include the true value shifts part of this surplus
from the sellers to the buyers.
King (1996) examines whether disclosure pat-
terns change when sellers have an opportunity to
develop reputations. He permits sellers to report
any value they wish, but imposes a cost on buyers
when the seller’s report is inaccurate. This setting
includes two equilibria. In an ‘‘in?ation’’ equili-
brium, sellers always report the highest value, and
buyers pay expected value net of the cost of inac-
curacy. In a ‘‘reputation’’ equilibrium, the seller
reports honestly, and the buyers believe the reports
until the seller reports dishonestly; at that point,
the players revert to the in?ation equilibrium.
King ?nds that an exogenous cost for inaccuracy
does permit reputation formation, but that the
reputation equilibrium arises only in a few cases.
There are several natural directions for research
in strategic disclosure. There is certainly no short-
age of new disclosure models to test. However, it is
probably more important for researchers to begin
to delve into how and why various equilibria do
and do not have predictive power. Some research-
ers have begun doing so by asking whether
‘‘adaptive’’ strategies (doing more of strategies
that performed better in the past) lead to a given
equilibrium. For example, King and Wallin (1995)
?nd little support for an ‘‘adaptively unstable’’
equilibrium that is not the end result of adaptive
behavior. Other researchers focus more directly on
the players’ thought processes. For example,
experiments by Bloom?eld and Hales (2000)
examine how sellers’ abilities to form reputations
for honest reporting are in?uenced by buyers’ and
sellers’ expectations of one another’s likely beha-
vior and beliefs.
Future research might also begin to integrate
disclosure research with the other literatures
described in this section. For example, Bloom?eld
(1996a) integrates the disclosure literature with the
information aggregation literature by showing that
sellers are willing to pay a fee to in?ate a public
signal, even though the information available to
the market as a whole is unchanged. They are will-
ing to do this because markets tend to react more
strongly to information held by more investors.
Researchers might also integrate economics-
based disclosure research with the psychology-based
literature described in Section 3.1. That research
focuses on how investors could use ?nancial
reporting choices to draw inferences about man-
agers’ incentives and information, but ignores the
fact that managers should anticipate investors’
reactions. On the other hand, the psychology-
based research presents a more comprehensive
treatment of ?nancial accounting institutions, by
allowing managers to choose how to classify and
report accounting information. We believe it
would be worthwhile — though di?cult — to
examine fully strategic interactions in more complex
accounting institutions. Researchers in ?nancial
accounting might also attempt to integrate game
theory and social psychology, as has been done
successfully in the auditing context by King (2001).
4. E?ective and e?cient research design:
methodological considerations in experiments
Section 3 presented a number of directions for
future experiments. In this section, we discuss
how these experiments can be designed to be
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 793
both e?cient and e?ective. An experiment is e?-
cient if it achieves a given level of e?ectiveness as
economically as possible. An experiment is e?ec-
tive if it provides evidence of su?cient internal
validity that readers should believe the results of
hypothesis tests, while being of su?cient external
validity that it bears on a signi?cant part of the
?nancial accounting issue of interest.
10
Both
internal and external validity are key to e?ective-
ness. An experiment that lacks internal validity
fails by providing a misleading indication of the
relation between the dependent and independent
variable, while an experiment that lacks external
validity produces results that are (or at least
should be) divorced from the motivation of the
study. We do not provide an exhaustive treatment
of research design (see Kinney, 1986; Runkel &
McGrath, 1972; Trotman, 1996 for more compre-
hensive discussions). Rather, we focus on issues
that we believe are particularly important or are
often misunderstood. Section 4.1 addresses tech-
niques for maximizing e?ectiveness through care-
ful hypothesis development and research design.
Section 4.2 addresses when it is (and is not) possi-
ble to improve e?ciency by consuming fewer
resources without sacri?cing e?ectiveness. We
address the number and type of subjects used in
the experiment, the payment of monetary incen-
tives, the use of within-subject designs, and the
decision to use single-person tasks rather than
interactive tasks (such as ?nancial markets or
strategic reporting settings).
4.1. Increasing experimental e?ectiveness
We organize our discussion of experimental
e?ectiveness around the predictive validity model
(Libby, 1981; Runkel & McGrath, 1972). This
model provides a useful description of the
hypothesis testing process, and focuses our atten-
tion on the key determinants of the internal and
external validity of a research design.
Fig. 1 illustrates the predictive validity model as
it applies to Hypothesis H1b from Hunton and
McEwen (1997; hereafter, HM). As noted earlier,
based on prior theory and evidence HM hypothe-
sized that sell-side analysts’ relationship-based
incentives would decrease their forecast accuracy.
Analysts’ relationship-based incentives were oper-
ationally de?ned as a three-level independent
variable: an ‘‘underwriting relationship’’ that has a
direct impact on fees, a ‘‘following relationship’’
that creates the need for future access to private
information, or ‘‘no future relationship.’’ HM
expect analysts in the underwriting condition to
provide the most optimistic forecasts, those who
follow the ?rm to be next most optimistic, and
analysts who do not follow the ?rm to be the least
optimistic. They operationally de?ne optimism
(the dependent variable) as the analysts’ forecast
minus the actual earnings outcome. HM also con-
trolled for a number of other potentially in?uential
variables including subject background, experi-
ence, time on task, and information availability.
In Fig. 1, link 1 depicts the relationship in HM’s
underlying theory. No theory can be tested
directly; rather, a theory is tested by assessing the
relationship between the operational de?nitions of
key concepts in the theory (i.e. by assessing link 4).
For this test to be valid, the links between the
concepts and the operational de?nitions (links 2
and 3) must be valid, and other factors that might
a?ect the dependent variable (link 5) must be
controlled or have no e?ect. A study’s internal and
external validity is determined by the validity of
these ?ve links. We now discuss ways in which
researchers can strengthen each of these links.
4.1.1. Link 1: theory and hypotheses
The ?rst determinant of experimental e?ective-
ness is speci?cation of a good research question. A
good research question addresses the relation
between two or more concepts, can be stated
clearly and unambiguously as a question, implies
the possibility of empirical testing, and is impor-
tant to the researcher and others (Kinney, 1986).
Experimental tests of research questions must
rely on some theory depicting forces that in?uence
behavior in the experimental setting. Theories may
range from highly speci?c numerical models (such
10
Internal validity is the degree to which you can be sure
that observed e?ects are the result of the independent variables.
External validity is the degree to which results can be general-
ized beyond the speci?c tasks, measurement methods, and par-
ticipants employed in the study.
794 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
as those derived from economics or arti?cial-
intelligence cognition models) to more general
qualitative predictions based on prior evidence
(such as systematic evidence that people use a
certain heuristic in a given setting). Regardless of
its nature, the theory suggests the expected answer
to the research question, and serves to guide the
many decisions and tradeo?s that must be made
during the design and administration of an experi-
ment. Whereas archival researchers analyze data
from secondary sources,
11
the experimental setting
is speci?cally designed to gather data relevant to
the hypotheses. Consequently, all stages of the
design of experiments are profoundly a?ected by
the need for a well-formulated research question
and hypotheses. In this section, we emphasize four
issues that are particularly important in develop-
ing good research questions and hypotheses in
experimental ?nancial accounting research.
First, the hypotheses must have external valid-
ity; that is, readers must believe that the theore-
tical concepts and the relationships between them
capture important aspects of the target environ-
ment. Although people often speak of external
validity as an aspect of experimental stimuli, we
consider it an element of theory as well. If the
theory and hypotheses are appropriately capturing
relationships among elements of the target envir-
onment, an internally valid experiment will test
that theory in a manner that generalizes to the tar-
get environment. External validity is established
empirically by extensions of the research that test
additional hypotheses concerning environmental
contingencies that de?ne the limits of generality of
the initial hypotheses (Trotman, 1996).
For example, HM’s research question of ‘‘Do
sell-side analysts’ relationships with the ?rms they
cover decrease their forecast accuracy?’’ relates an
antecedent (analysts’ relationships) and consequ-
ence (forecast accuracy) that clearly maps into ?rst
order concerns indicated by theory and prior evi-
dence. If the experiment operationalizes those
concepts well and provides an internally valid test
of their relation, it will provide insight into the
real-world e?ect of analysts’ incentives on their
judgments. Future research can then test the
extent to which those insights can be generalized.
Second, experimental research questions in
?nancial accounting should focus on how theories
drawn from fundamental disciplines (such as psy-
chology and economics) interact with details of
?nancial accounting institutions (as discussed in
Section 2.4). As Gibbins and Swieringa (1995)
suggest, accounting experiments should be ‘‘both
theory driven and setting sensitive.’’
Tying the accounting institution to theory from
a fundamental discipline allows hypotheses to
have relevance beyond the very speci?c practice
context that motivated the experiment (as recom-
mended by Maines, 1994). It also allows experi-
menters to contribute to both ?nancial accounting
and the fundamental discipline. For example,
Nelson and Kinney (1997) apply Einhorn and
Hogarth’s (1986) ambiguity model to predict how
Fig. 1. Predictive validity framework.
11
That is, the data is initially gathered for a di?erent purpose.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 795
ambiguity a?ects ?nancial statement auditors’ and
users’ judgments of appropriate contingent-liabi-
lity disclosure. Their study shows how the di?er-
ences between auditors’ and users’ incentives lead
auditors to use the discretion provided by ambig-
uous evidence to justify lower levels of disclosure
than users desire. This result is of clear interest to
?nancial accounting researchers, and also con-
tributes to psychologists’ understanding of how
incentives interact with ambiguity.
12
A more ambitious approach is to use funda-
mental disciplines to develop and experimentally
test a general theory that is applied to the ?nancial
accounting phenomenon of interest. For example,
Maines and McDaniel (2000) identify various
general dimensions of formats that signal infor-
mation importance or that a?ect the cognitive cost
of processing information (see also Lipe, 1998).
They apply their theory when testing whether
information-disclosure format a?ects considera-
tion of the volatility of unrealized gains and losses,
but their theory is much broader than the parti-
cular practice context that they examine.
Third, researchers should frame their theories at
the least speci?c level that can account for the data
expected to arise from the experiment. Stating the
theory with greater speci?city will simply encou-
rage readers to argue that the results are driven by
a slightly di?erent theory (such as a di?erent the-
ory of categorization) that yields identical predic-
tions in the experimental setting. Such debates are
rarely productive. If the distinction is likely to be
important in accounting settings, researchers inter-
ested in accounting issues should consider what
other experiments might illustrate this importance.
If the distinction is unlikely to have important
rami?cations for accounting settings, experiments
discriminating between such theories are more
appropriately seen as contributions to the funda-
mental disciplines from which the theory is drawn.
Finally, experimental research questions should
be based on a theory that describes causal rela-
tionships between concepts. As discussed above,
the key advantage of the experimental method lies
in its ability to disentangle factors that are con-
founded in natural settings, and thus provide
indications of how and why phenomena arise. A
causal theory also improves external validity,
because causal forces are more likely to generalize
to di?erent settings. This also leads to a preference
for research questions that focus on a directional
prediction of di?erences, as opposed to a single
point prediction. As Trotman (1996) indicates,
‘‘the basis of any experimental design is that one
or more independent variables are manipulated
and the e?ect on the dependent variable(s) is
observed.’’ Since experiments require abstraction
from the real world, any number of di?erences
between the experimental and real-world environ-
ments could a?ect the particular levels of observed
measures. Consequently, evidence consistent with
point predictions (e.g. ‘‘the market price will be
$5.00’’) and particular parameter estimates (e.g.
‘‘managers will weight current year’s earnings
twice as heavily as prior year’s earnings’’) are
unlikely to generalize to real-world environments.
Directional e?ects are more likely to generalize,
because di?erences between the experimental set-
ting and the target setting are more likely to alter
the magnitude of an e?ect than its direction. A
focus on directional e?ects also makes it much
easier to design an experiment that controls for
competing explanations. We discuss this latter
issue further in Section 4.1.3.
4.1.2. Links 2 and 3: operationalizing dependent
and independent variables
Link 2 relates the antecedent theoretical concept
A to the independent variable(s) operationalized
in the experiment. Link 3 relates the consequential
concept B to the dependent variable operationalized
in the experiment. An internally valid test requires
manipulation of each independent variable in a
way that changes only one theoretical antecedent
at a time. At the same time, they must construct
an operational dependent variable that measures
the conceptual variable, and that variable alone.
This section discusses three particularly di?cult
12
Of course, the theory should entail some element of doubt
before testing. Experiments applying psychology to accounting
settings can be uninteresting if readers are certain that the
results obtained in psychology will readily extend to accounting
even without seeing the experimental results. Experiments
applying economics to accounting settings can be uninteresting
if they are little more than complex ways of showing that peo-
ple prefer more money to less (Kachelmeier, 1996b).
796 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
issues in operationalizing variables: (1) choosing
the appropriate realism of the stimuli presented to
participants; (2) choosing the appropriate levels of
independent variables; and (3) using measured
independent variables.
4.1.2.1. Realism of stimuli. A common challenge
in operationalizing independent variables is decid-
ing how realistic the stimuli should be. The
appropriate level of realism in the operationaliza-
tion of an independent variable is determined by
the role of realism in the theory to be tested.
Experiments testing psychological theories typi-
cally present participants with more realistic sti-
muli than experiments testing economic theory,
because psychology-based experiments are typi-
cally focused on how participants make decisions
using cognitive processes and knowledge that
developed in response to their real-world educa-
tion, training, and experience. Without relatively
realistic stimuli, participants may not rely on the
cognitive processes and knowledge of interest. For
example, HM’s theory relates analysts’ knowledge
of their incentives to their earnings estimates. In
order to test this theory, the experiment must
provide the participants with a su?ciently realistic
stimulus to activate that knowledge. Similarly,
Hopkins (1996) tests the theory that classi?cation
of debt-equity hybrid securities alters analysts’
inferences about ?rm value; this theory can be
tested only with relatively realistic stimuli and
value-assessment tasks.
Experiments testing economic theories typically
present participants with less rich information and
less realistic stimuli, because they focus on how par-
ticipants make decisions using economic informa-
tion given particular preferences, constraints, and
incentives. The decision processes depicted in these
theories are not hypothesized to depend on task
realism, so these studies are less concerned with it.
For example, King and Wallin (1991a) test theories
relating the probability that a seller knows the
asset value to the sellers’ disclosure strategies and
buyers’ responses to those disclosures. That study
does not require realism or knowledge of parti-
cular real-world institutions, so it uses abstract
stimuli and tasks to avoid introducing extraneous
factors that might compromise internal validity.
This discussion should not be construed as indi-
cating that all experiments testing theories drawn
from psychology (economics) must have high (low)
stimulus realism. Experiments testing very general
psychological theories (such as the relation between
short-term memory and optimism) could contribute
to ?nancial accounting research with stimuli and
tasks that possess very low degrees of realism.
Similarly, experiments testing the e?ects of super-
ior accounting knowledge on trading pro?ts
would require high degrees of realism. It is the
goal of the experiment that determines whether
realism adds to or detracts from internal validity.
Stimulus realism can also provide bene?ts
beyond that required for an internally valid test of
the underlying theory. First, realism can help
authors convey to readers the ways in which the
results relate to prior research. For example,
Hopkins (1996) and Tan, Libby, and Hunton
(2000) are able to compare their pricing and earn-
ings-forecast di?erence results for some treatments
directly to prior archival studies, which increases
con?dence in the generality of the results of treat-
ment combinations for which no (or insu?cient)
archival data are available. Second, realism can
help subjects understand the task they are being
asked to perform, thereby reducing noise in the
data. This may be particularly important in eco-
nomics-based experiments, which place high
demands on participants’ attention.
However, it is important not to exaggerate the
bene?ts that stimulus realism provides when it is
not directly enhancing internal or external valid-
ity. Such realism may not substantially increase
external validity, which is determined mainly by
the theory itself and how e?ectively the theoretical
constructs have been operationalized. Similarly, it
is important not to exaggerate its costs. Experi-
mental economists often worry that realism may
in?uence behavior in ways that lie outside their
theories, and thus reduce internal validity
(Camerer, 1997; Smith, 1976), but as we will dis-
cuss in Section 4.1.3, these concerns typically can
be dealt with through good experimental design.
4.1.2.2. Choosing levels of independent variables.
After choosing the nature of independent vari-
ables, the researchers must choose their levels. A
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 797
general goal is to choose levels that are di?erent
enough that the experiment has su?cient power to
yield strong e?ects, yet be within the relevant range.
As indicated above, in some cases it is appro-
priate to choose levels that depict real-world con-
ditions. For example, HM’s independent variable
consists of treatment levels that re?ect what ana-
lysts might experience in practice. Given that their
theory is testing the relation between those real-
world incentives and analysts’ behavior, this rea-
listic depiction provides a strong test of the theory.
However, it is usually di?cult to ensure a repre-
sentative sample of independent variable values,
which limits the interpretability of levels of e?ects
and parameter estimates in most experiments.
Choosing realistic versions of naturally occurring
phenomena can also make it di?cult to manip-
ulate only a single theoretical antecedent while
holding all others constant. This is particularly
true in studies of alternative accounting methods
or disclosures, where di?erences in method or dis-
closure (the experimental treatments in these
studies) can convey unintended information about
the nature of the underlying transactions that
a?ect the dependent variable but are not included
in the theory being tested. Experimental controls
discussed under link 5 can be employed to reduce
this concern (e.g. Hopkins, 1996).
In other cases, it can be wise to create levels that
are unrealistically extreme. For example, For-
sythe, Lundholm, and Reitz (1999) compare a
regulatory regime that prohibits disclosure with
one that allows any disclosure (even fraudulent
statements). While these levels are unrealistic, they
allow a very powerful test of e?ects that would
likely generalize to milder changes in disclosure
regulations.
It can even be useful to specify at least one level
of the independent variables that cannot occur in
practice, to enable a cleaner test of the underlying
theory. One example of this approach is provided
by Libby and Tan (1999). They seek to understand
how analysts can say they reward ?rms for issuing
early warning of negative earnings surprises, while
actually punishing them in their forecast revisions.
Libby and Tan address this question by oper-
ationalizing three ‘‘warning’’ conditions. Two
conditions are realistic: one in which no warning
occurs prior to an earnings announcement, and
one in which the warning is followed by the nega-
tive earnings announcement. A third condition
cannot exist in practice: the warning and negative
earnings announcement occur simultaneously.
This ‘‘simultaneous warning’’ condition allows
them to separate the e?ect of the warning from the
sequential processing of two signals by creating
two comparisons (each treatment compared to the
simultaneous warning condition) that manipulate
only one antecedent. The other two settings
enhance external validity by mapping naturally
into the institutional setting and archival ?ndings
the authors seek to inform.
Regardless of how one chooses the levels of the
independent variables, it is usually advisable to
conduct manipulation checks. These are measures,
often taken during debrie?ng, which seek to deter-
mine whether subjects noticed and interpreted cor-
rectly the independent variable(s). Manipulation
checks test link 2 of the predictive validity frame-
work. Manipulation checks are particularly useful
when analyses reveal no signi?cant treatment
e?ect, since one alternative explanation for the
lack of a signi?cant e?ect is ine?ective oper-
ationalization of the independent variable (a link 2
problem). However, it is critical that the manip-
ulation check tests recognition and comprehension
of the independent variable, as opposed to serving
as another test of the treatment e?ect. Otherwise,
the manipulation check is really just a second
measure of the dependent variable (testing link 4
rather than link 2).
4.1.2.3. Measured independent variables. Some
independent variables in accounting experiments
are observed, rather than manipulated. Because
subjects are not assigned randomly to measured
treatment levels, measuring independent variables
gives up some of the experimentalist’s comparative
advantage. Such studies are subject to the same
correlated-omitted variables problems that com-
promise internal validity in archival research.
Therefore, it is typically preferable to manipulate
important independent variables whenever possi-
ble, rather than measuring them.
However, there are at least four circumstances
where measuring independent variables is useful.
798 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
The ?rst is that it is impossible or impractical to
manipulate an antecedent. For example, HM
hypothesize that analysts that are considered by
their ?rms to be more accurate forecasters tend to
use a more directive, hypothesis-driven evidential
search strategy. Because HM cannot randomly
assign analysts to ‘‘high historical accuracy classi-
?cation’’ and ‘‘low historical accuracy classi?ca-
tion’’ treatments, it is possible that historic
accuracy classi?cation is correlated with some
other variable (such as age or intelligence) that
determines use of a directive search strategy. As a
consequence, HM include a number of control
variables to test these alternative explanations for
results, and are careful to discuss these results in
terms of ‘‘associations’’ rather than ‘‘causes.’’ A
second reason to use measured independent vari-
ables is that the theory relating the antecedent to
the consequence involves mediating variables (a
sequence of links through intervening variables).
For example, Hopkins (1996) predicts that the bal-
ance-sheet classi?cation of manditorily redeemable
preferred stock (concept a) a?ects analysts’ beliefs
concerning the total amount of equity outstanding
(concept b), which in turn a?ects their stock price
estimates (concept c). Because analysts’ beliefs
about outstanding equity are actually a dependent
variable in a part of his theory (a a?ects b), Hop-
kins cannot manipulate it directly. Those beliefs
become a measured independent variable when
testing the second part of the theory (b a?ects c).
Similarly, almost every multi-person task involves
intervening variables, because the behavior of one
person is determined by the (necessarily endogen-
ous) behavior of another. For example, King (1996)
tests whether imposing exogenous costs on buyers
for inaccurate value estimates induces sellers to
report values accurately. One simple breakdown
of this theory is that exogenous costs (concept a)
reduce the prices buyers are willing to pay when
the seller has previously reported inaccurately
(concept b), which leads the seller to choose higher
reporting accuracy (concept c). Because equili-
brium models involve many forces acting simulta-
neously (e.g. the seller should anticipate the
buyers’ response to his reports, and the buyers
should anticipate the seller’s response to their
likely price-setting behavior), it is di?cult to
measure all of those forces simultaneously in one
experiment. Thus, King measured some potential
intervening variables (he chose to examine how
sellers’ reporting accuracy a?ects buyers’ reliance
on those reports), but not others.
One way to avoid measured independent vari-
ables is to construct separate experiments testing
the separate parts of the theory. Hopkins could
have tested the ‘‘a,b’’ and ‘‘b,c’’ links separately or
in sequence, reasoning that ?nding support for
both links suggests (but does not demonstrate) an
‘‘a,c’’ link. However, he chose to provide a clean
test of the ‘‘a,c’’ link by testing it directly, and
using subsequent measurement of ‘‘b’’ to provide
comfort that subjects behaved as predicted. Simi-
larly, King could have separately tested buyers’
responses to seller decisions. However, we believe
that both authors were justi?ed in focusing their
cleanest tests on the primary antecedent and con-
sequence concepts in their theory. A full under-
standing of the causal path may be somewhat
encumbered by the problems associated with
measured independent variables, but remaining
problems can be addressed in future research. For
example, Bloom?eld and Hales (2000) use a series
of experiments to understand more of the linkages
in King’s study.
Third, it is sometimes much less interesting to
examine reactions to a manipulated variable than
a naturally occurring one. For example, it would
have been less interesting for Hopkins (1996) to
test whether analysts who are told that there are
more shares outstanding would place a lower value
on a ?rm’s stock, all else held equal. It seems much
more reasonable to ask whether the same analysts
would use that belief to assess stock value when the
belief arises naturally. This type of concern is even
more salient in tests of equilibrium models.
Fourth, measured variables often provide the
keys to understanding underlying processes that
produce the e?ects of interest. For example,
Maines and McDaniel (2000) make a contribution
by demonstrating e?ects of format on judgments
of management e?ectiveness and stock risk (an
‘‘a,b’’ link), even though their lack of signi?cant
e?ects of format on valuation could be viewed as
an insigni?cant ‘‘a,c’’ link. After all, each inter-
vening successive link adds noise and diminishes
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 799
the experimenter’s ability to detect an e?ect of ‘‘a’’
on a later consequence (particularly when the later
consequence is a very complicated judgment like
stock valuation). Only by eliciting intervening
variables does a clear pattern of results emerge.
Hirst, Koonce, and Miller (1999) demonstrate the
importance of specifying the correct causal path.
They show that the form of a forecast will a?ect
trading decisions, not through estimates of future
earnings, but through con?dence in estimates. This
further highlights the need to elicit intervening
dependent variables that aid in interpreting results
with respect to tests of complex theories. We
encourage researchers to measure potential inter-
vening variables whenever possible, if only after
they measure their primary dependent variable.
13
4.1.3. Links 4 and 5: statistics and other
potentially in?uential variables
As noted earlier, internal validity refers to the
degree to which variation in the dependent variable
can be attributed to variation in the independent
variable. Link 4 assesses the relations between the
operational independent and dependent variables.
Link 5 captures ‘‘other potentially in?uential’’ or
‘‘extraneous’’ variables besides the independent
variable that could a?ect the dependent variable. A
key advantage of the experimental approach is that
the e?ects of extraneous variables can be controlled
for primarily by holding them constant or through
randomization. As a result, statistical analyses in
experiments are typically straightforward, often
consisting of simple t-tests, ANOVAs, or non-
parametric equivalents. Extraneous variables can
also be measured as in archival studies, and used
to enhance the power of analyses by accounting
for variation in the dependent variable that is not
related to the theory being tested. Finally, extra-
neous variables can be manipulated to directly test
their e?ect. Given the expense typically associated
with this approach, it should only be used when
the experimenter believes the extraneous variables
cannot be dealt with another way.
Very complex statistics are typically necessary in
experiments only when they rely heavily on mea-
sured independent variables, or when researchers
must try to boost power when subject resources
are scarce. When those circumstances are not
apparent, complicated statistical tests may signal
poor experimental design — the experimenter is
trying to grapple after the fact with concerns that
should have been headed o? with good experi-
mental design.
This section describes some of the powerful
array of techniques experimenters can use to deal
with extraneous variables. The most important
technique available to the experimentalist to con-
trol for extraneous variables is to assign subjects
randomly to treatments. Random assignment,
combined with manipulation of independent vari-
ables, enables experimentalists to ensure that their
results are not biased by factors of which they are
aware, as well as factors of which they are not
aware. For example, HM randomly assign ana-
lysts to incentive-treatment conditions. This
results in an unbiased distribution of industry
familiarity, age, experience, prior accuracy, etc.
across the three levels of the incentive treatment.
Thus, HM can conclude, with a speci?ed level of
statistical con?dence, that these variables, and
other unspeci?ed variables such as motivation or
breakfast size, did not account for the results. In
fact, had HM not chosen to measure analysts’
experience and use it as a covariate to reduce var-
iance in their analysis, they could have ignored
experience and expected that it would not a?ect
their mean results because of random assignment
across treatment conditions.
More generally, random assignment to treat-
ment conditions allows experimentalists to avoid
many of the omitted variable concerns that limit
causality inferences in archival studies. For exam-
ple, Kothari (2000) notes that the direction of
cause and e?ect between relationship and forecast
optimism documented in the archival literature is
not clear. It could as easily result from managers’
selection of investment banks whose analysts pro-
vide a more optimistic forecast as from opportu-
nistic forecasting by analysts with relationships.
This selection alternative explanation is eliminated
in HM by random assignment of analyst subjects.
13
Of course, the experimenter needs to worry about carry-
over e?ects (i.e. earlier measurements a?ecting later behavior).
Sometimes the order in which successive dependent variables
are elicited is manipulated between subjects to reduce this con-
cern. This is discussed further in Section 4.1.3.
800 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
As a result of random assignment, the expected
value of analyst optimism prior to the treatment is
unbiased across the three incentive treatment
groups.
Second, experimentalists can hold extraneous
variables constant at a particular level. For exam-
ple, HM hypothesize that analysts who will be
underwriting securities exhibit di?erent forecast
bias than analysts who do not, because they face
di?erent incentives. However, compared to non-
underwriting analysts, underwriting analysts could
also have larger amounts of information available
about a ?rm, or spend di?erent amounts of time
forecasting earnings. HM deal with these potential
alternative explanations for changes in their
dependent variable (forecast accuracy) by holding
constant across treatments the amount of infor-
mation analysts have available and the amount of
time analysts can spend on the experimental task.
More generally, experimentalists typically hold
constant aspects of the institutional setting that
they believe are potentially important but that are
not part of the portion of the research question
examined in that particular study.
A third way to deal with extraneous variables is
to measure them (typically during debrie?ng).
These measurements can be used as covariates or
measured independent variables to account for
their e?ects. For example, HM identi?ed prior
research that indicated that analysts’ forecast
accuracy changes as they become more experi-
enced. Since a general experience e?ect was not
part of their hypotheses, but might a?ect their
dependent variable, HM measured experience by
eliciting years spent as a ?nancial analyst and used
it as a covariate in their analysis. Years of experi-
ence cannot have been in?uenced by HM’s treat-
ment e?ect, so they use it as a covariate to reduce
noise in their analyses without fear that it is actu-
ally capturing some element of the e?ect of the
independent variable on the dependent variable
(link 4). Similarly, Hirst, Koonce, and Miller
(1999) use a pretest measure of forecasted earnings
taken before the treatment was administered to
reduce noise and increase power.
Measurements of extraneous variables are also
useful for testing competing explanations for
experimental results. For example, Hopkins (1996)
tests whether subjects infer management signaling
or di?erential tax treatment from the balance
sheet classi?cation of the hybrid security. Either of
these inferences could explain an e?ect of classi?-
cation on forecast error, but neither is included in
Hopkins’ theory. Hopkins provides evidence
against these explanations by eliciting in debrie?ng
subjects’ inferences about the underlying transac-
tion and demonstrating a lack of signi?cant dif-
ference in inference between treatment conditions.
Such measures operate much like a manipulation
check, but rather than providing evidence that the
independent variable operationalizes the ante-
cedent concept the experimenter intended, they
provide evidence that the independent variable did
not operationalize antecedent concepts other than
those intended by the experimenter. The assurance
they provide is limited (in that they provide evi-
dence by ?nding an insigni?cant di?erence), but it
is assurance nonetheless.
A fourth way to deal with extraneous variables
is to manipulate them and test their e?ects. For
example, Bloom?eld, Libby, and Nelson (2000b)
present their subjects with a number of securities,
and vary between subjects the order in which
securities are presented. They test for order e?ects
and ?nd none, allowing them to discount order of
presentation as a potential explanation for their
results. Even if they did not test for such e?ects,
manipulating order in a balanced design would
reduce the risk that results are speci?c to a parti-
cular order. In general, manipulating factors
unrelated to the hypotheses can be useful, but
expensive in terms of use of subjects.
Finally, experimentalists can deal with link-5
factors by ignoring them. By ‘‘ignore’’ we really
mean ‘‘abstract from,’’ because those factors will
not be included in the experimental environment.
Ignoring some extraneous variables is necessary
because it is not practical to mimic all elements of
reality in an experiment; some abstraction is
necessary for the experiment to be conducted in a
timely manner. To the extent that subjects make
assumptions about information that is not inclu-
ded in the experimental environment, those
assumptions are randomly distributed across
treatment conditions, and do not a?ect inter-
pretation of results, as long as the treatments do
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 801
not di?erentially a?ect subjects’ assumptions
about extraneous variables.
It is important to note that these methods of
accounting for extraneous variables are e?ective
only when the experimental design manipulates the
variables of primary interest to test e?ects of
directional predictions. For example, Tuttle, Col-
ler, and Burton (1997) wish to examine how
security prices are in?uenced by the order in which
information is revealed to investors. They provide
investors with rich ?rm-speci?c information about
market conditions and corporate events, rather
than the abstract information used in many mar-
kets experiments. Because the authors cannot
know exactly what knowledge investors bring to
bear in interpreting this rich information, it could
have a number of unknown e?ects on stock price,
and might lead prices on average to be higher and
lower than they should be. However, rather than
comparing prices to a point prediction of true
value, they examine whether the order of infor-
mation release causes a di?erence in prices. This
di?erence cannot be a?ected by extraneous vari-
ables created by the rich information (although
they surely exist), because the total information is
held constant across the settings being compared.
As discussed by Bloom?eld and Libby (1996),
this type of ‘‘paired securities’’ design can generally
be used to eliminate concerns about unanticipated
e?ects of realism in experiments. Experiments that
attempt to compare behavior to point predictions
sacri?ce this powerful form of experimental con-
trol. Even apparently innocuous variables in an
experimental setting (such as the color of a com-
puter screen or the time of day at which data col-
lection occurs) could cause deviations of behavior
from a point prediction, but are unlikely to cause
those deviations to vary across levels of the
manipulated independent variables.
4.2. Increasing experimental e?ciency (without
compromising e?ectiveness)
Experimenters make many choices that a?ect
the amount of resources consumed by their
experiments. This section discusses four such
choices: whether to use professional subjects
(which are di?cult to obtain); whether to provide
those subjects with monetary incentives (which are
expensive); whether to use between-subjects designs
(which use more subjects than within-subjects
designs); and whether to place subjects in a labora-
tory market (which requires more subjects than
would a study of individual judgments). Choosing
to consume more resources does not necessarily
increase experimental e?ectiveness. Rather, it
increases e?ectiveness in some circumstances,
reduces it in others, and has a small enough e?ect
in others that it is not justi?ed from a cost/bene?t
perspective. We discuss each choice in turn.
4.2.1. Subject selection
When should experiments use professional sub-
jects? Our advice is to match subjects to the goals
of the experiment, but to avoid using more
sophisticated subjects than is necessary to achieve
those goals.
Experiments that examine the e?ects of some
attribute subjects have developed before entering
the experiment must use subjects who possess the
necessary attribute. Many studies use experiments
to ‘‘peer into the minds’’ of speci?c groups of
experienced professionals to determine what they
have learned about relevant concepts and events
and how that learning a?ects decisions. Hopkins
(1996) examines how knowledge of the di?erential
e?ects of debt and equity o?erings determines how
classi?cation of debt-equity hybrids a?ects ana-
lysts’ judgments. Libby and Kinney (2000) seek to
explore how auditors’ beliefs about managers and
their own incentives determine the e?ect of old
and new regulations. In both of these cases, the
experimenter is interested in how subjects’ use of
some type of knowledge learned in the real world
causes treatment e?ects, so they must use subjects
with the requisite knowledge. Thus, these studies
use professionals as subjects.
In some cases, the experimenter can train stu-
dent subjects to possess an attribute (e.g. knowl-
edge) that the experimenter is interested in
examining. This approach is cost-e?ective given
students’ greater availability than professional
subjects, and is well suited for testing the e?ects of
speci?c features of the learning environment and
elements of the resulting knowledge (cf. Bonner &
Walker, 1994). However, this must be done with
802 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
care since recently acquired knowledge is unlikely
to be of the same depth and breadth, or integrated
as well with subjects’ pre-existing knowledge.
Student subjects are also entirely appropriate in
studies that focus on general cognitive abilities, or
responses to economic institutions or ?nancial-
market forces that are expected to be learned
within the experimental setting. Maines and Hand
(1996) provide an example of the former; they
examine the e?ects of general tendencies in the
processing of time-series information on forecast-
ing behavior. Any of the reporting studies by King
and Wallin (1991a, 1991b, 1995) provide examples
of the latter; those studies examine how subjects
respond to the strategic forces in disclosure games.
Other experiments focus on the judgments of
general or novice investors, and so require subjects
who possess only basic familiarity with accounting
and investing. Student populations that have such
basic familiarity are appropriate here as well.
MBA students and executive-program partici-
pants are particularly useful, as they often have
some accounting knowledge and investing experi-
ence. Studies of this type employing student sub-
jects include Bloom?eld, Libby, and Nelson (1999,
2000a), Hirst, Koonce, and Miller (1999), Hirst,
Koonce, and Simko (1995), Kennedy, Mitchell,
and Sefcik (1998), Lipe (1998), Bloom?eld and
Libby (1996), Maines and McDaniel (2000), and
Nelson, Krische, and Bloom?eld (2000).
In general, experimenters should avoid using
professional subjects unless it is necessary to
achieve their research goals. In addition to
increasing the experimenters’ own time and
expense, inappropriate use of professional subjects
has negative externalities — they may make it
more di?cult for other experimenters to gain
access to this very valuable resource.
4.2.2. Monetary incentives
When is it appropriate to provide explicit
monetary incentives in ?nancial accounting
experiments? As in subject selection, the answer
should be driven primarily by the goals of the
experiment.
First, as noted above, experiments that focus on
incentives rely on participating professionals to
bring their knowledge of and behavior learned in
response to real world incentives to the experi-
ment. Such experiments attempt to examine how
professional practice has provided professionals
with incentives that a?ect their behavior in parti-
cular ways. For example, HM studied the e?ect of
analysts’ incentives on their forecast accuracy,
with those incentives determined by the analysts’
perceptions and understanding of the relationship
that the analyst has with the ?rm whose perfor-
mance is being forecasted. Providing performance-
contingent incentives in this type of experiment
would distort or interfere with the e?ects of the
real world incentives, and is therefore inappropri-
ate. While the e?ects of professionals’ perceived
incentives might be diminished in the experimental
setting, their direction should not be altered, so
their directional e?ects should not be altered.
Experiments testing responses to economic the-
ory (such as those described in Section 3.4) need to
provide performance-contingent incentives in
order to induce subjects to possess the incentives
assumed by the economic model (Smith, 1976).
Without such incentives, a fundamental causal
element of the model may not be present, and
there is no reason to expect theoretical predictions
to hold. Performance-contingent incentives are
almost always appropriate in laboratory market
experiments that examine how individual biases
can be mitigated by competitive forces. For
example, the ‘‘smart trader’’ hypothesis relies on
an assumption that more accurate traders trade
more actively because they will earn money by
doing so.
A researcher who has concluded that perfor-
mance-contingent incentives are appropriate must
then decide on how sensitive payments should be
to variations in performance. Our casual observa-
tions suggest that most experimental tests of eco-
nomic theories pay subjects an average of $8 to
$20 per hour, with payments ranging from $5/
hour to $100/hour (or sometimes more). These
numbers re?ect tradition and resource limitations
more than any reasoned theory. These incentives
are obviously much less than most agents in
?nancial accounting target environments would
expect. However, we doubt behaviors would be
substantially di?erent with larger incentives. Past
experiments show little evidence that biases are
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 803
eliminated by incentive compensation, just as
?nancial rewards have not allowed athletes to run
a 3-min mile. Limitations on abilities, rather than
a lack of reward, drive these results. More gen-
erally, larger monetary incentives might reduce the
size of biases, but are unlikely to alter their basic
nature and direction.
14
Thus, larger incentives
would probably not change the inferences drawn
from directional hypothesis tests.
4.2.3. Within- vs. between-subjects designs
When should experiments use between-subjects
designs, rather than within-subjects designs?
Within-subjects (or ‘‘repeated-measures’’ designs)
where subjects provide more than one observa-
tion, generally enhance statistical power by allow-
ing control of between-subjects di?erences (i.e.
there is a ‘‘subject factor’’ in the analyses that
accounts for subject-speci?c noise). This approach
has the added advantage of using fewer subjects.
However, repeated measures designs can also
a?ect results by making treatment e?ects more
salient, which may signal to subjects that the
experimenter wants them to respond to the
manipulation (the familiar ‘‘demand e?ect’’ con-
cern). Also, repeated measures are vulnerable to
carryover e?ects from the elicitation of one
measure to the next. Therefore, these designs are
most e?ective when increased salience of manipu-
lated variables is desirable from the standpoint of
the experiment’s goals and/or when any carryover
e?ect is desired or can be minimized via manip-
ulation of the order in which measures occur.
As noted earlier, Hirst, Koonce, and Miller
(1999) use one type of repeated measures design,
the pretest–posttest design. Their subjects ?rst
forecast earnings and assess con?dence in that
forecast, given only company background infor-
mation and the prior years’ ?nancial data. The
subjects were then provided with the experimental
treatments (management forecast and information
about management forecast accuracy), and again
forecasted earnings and assessed con?dence. This
pretest–posttest design allows Hirst, Koonce, and
Miller to increase power by using the pretest as a
covariate in their analyses or by analyzing the
change in forecasts caused by the treatment. Since
they want their subjects to attend carefully to the
information contained in their treatments, and
their analyses are based on comparisons between
treatment conditions (which hold treatment sal-
ience constant), they are not concerned about
drawing extra attention to the treatment.
Within-subject treatments are particularly com-
mon in laboratory markets and games. For exam-
ple, Bloom?eld and Wilks (2000) create a setting
in which each group of subjects participates in
eight di?erent treatments (every cell of a 2Â2Â2
design) over the course of two trading sessions.
Such repetition reduces noise in the data, which is
often high in early repetitions because the envir-
onment is so complex. Repetition also uses sub-
jects’ time very e?ciently, which reduces the
already high cash cost of running such experi-
ments. However, repetition also requires Bloom-
?eld and Wilks to balance the orders of the
treatments, to ensure that treatment e?ects are not
confounded with order e?ects.
Tan, Libby, and Hunton (2000) also suggest the
use of a combination of between- and within-sub-
jects designs as a method of partitioning the e?ects
of unintentional biases from intentional judgment
policies. Following Kahneman and Tversky (1996),
they suggest that the between-subjects design pro-
vides a clean test of the subject’s natural reasoning
process, while the within-subjects design draws
attention to the independent variable of interest
and thus gives the subject a chance to detect and
correct errors and inconsistencies in their respon-
ses. Comparison of results under the two approa-
ches highlights how subjects address any con?ict
between what they do and what they know. Evi-
dence of di?erences using between-subjects treat-
ments, but not using within-subjects treatments,
suggests that the between-subjects di?erences are
unintentional. On the other hand, evidence of dif-
ferences using within-subjects treatments, but not
using between-subjects treatments, suggests that
subjects are aware of the implications of the di?er-
ences in the stimuli, but that, in their natural rea-
soning process, the stimuli were ignored or subjects’
related knowledge was not accessed and used. This
method should be useful in other studies that
14
See Kachelmeier and Shehata (1992) for a study on how
very large incentives in?uence responses to risk.
804 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
attempt to distinguish between the e?ects of judg-
ment heuristics versus knowledge.
The choice of between- versus within-subjects
designs a?ects analyses, since within-subjects mani-
pulation (i.e. repeated measures) yields observa-
tions that are not independent. For example,
Bloom?eld and Wilks (2000) observe well over a
thousand closing prices in their study. However,
since there are only eight distinct groups of sub-
jects, their repeated-measures analyses e?ectively
compute the average treatment e?ect (a signed
di?erence) for each group, and then perform a t-
test on the eight di?erences. This design is more
powerful than it might seem, because each of the
eight numbers is the average of a large number of
observations, and therefore has very little noise.
As discussed in Section 4.1.2, most laboratory
markets conduct supplementary analyses that
break a theory into parts using measured inter-
vening variables. For example, Bloom?eld and
Wilks (2000) examine how disclosure quality
a?ects market price through its e?ects on market
liquidity, which is measured. It is more di?cult to
apply pure repeated-measures statistical techni-
ques to such analyses. However, experimenters
should be aware that inappropriate statistical
methods overstate sample size (and therefore
understate P-values), and should be interpreted
with caution. More importantly, researchers must
make every attempt to use repeated-measures
analyses for their main hypothesis tests.
4.2.4. Using laboratory ?nancial markets
When is it necessary to place individuals in
laboratory markets? Critics of individual decision-
making experiments often suggest that biases and
suboptimal behavior would be driven away by
market forces. In our view, this criticism alone
rarely justi?es the cost of a market experiment. As
discussed in Section 3.3, few experiments have
shown that market forces eliminate biases; even
when they mitigate a bias, they tend to a?ect its
magnitude, but not its sign (e.g. market prices are
still too high, but not by as much). Because only
directional e?ects are easily generalized from
experiments to target settings, using a ?nancial
market does not substantially alter an experiment’s
e?ectiveness. On the other hand, the market does
dramatically increase the cost of the experiment. A
group of 50 subjects will yield 50 judgments that
are statistically independent of one another.
Forming those subjects into 10 separate ?ve-trader
markets yields only 10 judgments (market prices)
that are statistically independent of one another.
As a result, the use of a market either reduces
power or increases the costs of the study.
Laboratory markets are most appropriate when
examining particular forces within the market that
might a?ect bias mitigation (such as the smart-
trader hypothesis), or when examining dependent
variables that are simply unde?ned at the indivi-
dual level (such as trading volume or market
liquidity). Even in these cases, however, one can
sometimes address experimental goals in indivi-
dual decision-making tasks. For example, Nelson,
Krische, and Bloom?eld (2000) use an individual
decision-making task to examine how con?dence
in one’s own ability to ‘‘pick winners,’’ relative to
con?dence in large-sample anomalies (such as
post-earnings-announcement drift) can a?ect tra-
ders’ willingness to rely on a disciplined trading
strategy. They do not have traders transact with
each other, but rather examine the number of
shares that each trader o?ers to transact. This
approach allows researchers to examine the rela-
tion between judgment and trading behavior, but
does not allow researchers to capture strategic
interactions between market participants.
Given that one chooses to conduct a ?nancial
market, there are many decisions that can reduce
the cost of each observation. One method used
almost universally in laboratory ?nancial markets
and laboratory games (as in Sections 3.3. and 3.4)
is to have each group provide many observations
(a repeated-measures design). As noted in Section
4.2.3, repeated-measures designs o?er many
advantages, but a?ect the statistical analyses that
must be performed.
5. Conclusions
This paper discusses how recent experimental
research in ?nancial accounting has responded to
past criticisms, discusses how the recent literature
has developed and how it can be extended, and
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 805
provides our perspective on how future experiments
can be designed to maximize both e?ectiveness and
e?ciency. Our comments are driven by our belief
that experiments — whether based on psychologi-
cal or economic theory — must exploit the primary
advantages of the experimental method. Those
advantages include the ability to construct an envir-
onment in which a causal theory of phenomena can
be tested with a maximum of internal validity.
Experimental research is still only a small part
of empirical ?nancial accounting research. This
raises the question of how ?nancial accounting
experiments should relate to the more dominant
archival-empirical work. One of the most notable
characteristics of the better studies that we have
reviewed is their close tie to formal or informal
empirical observation. These observations often
provide part of the motivation for the experi-
mental studies, and are relied upon to demonstrate
the external validity of experimental results.
Future research can relate even more closely to
this literature by testing alternative potential
explanations for archival ?ndings when there are
natural confounds, measurement problems, or
where causality is unclear, by explaining contra-
dictory ?ndings, and by examining conditions
where large samples are unavailable. Experiments
can also point to directions for future archival-
empirical studies by specifying either the limits to
the generality of existing ?ndings or other ?ndings
that should exist in further archival studies.
Acknowledgements
Prepared for the Accounting, Organizations and
Society 25th Anniversary Conference, Oxford Uni-
versity, July 2000. We thank Ron King, Lisa
Koonce, Laureen Maines, Greg Waymire, the par-
ticipants at the AOS 25th Anniversary Conference
and the Emory Behavioral Financial Accounting
Research Conference for their comments and sug-
gestions, and Bernadine Low for her assistance.
References
Abarbanell, J. S., & Bernard, V. L. (1992). Tests of analysts’
overreaction/underreaction to earnings information as an
explanation for anomalous stock price behavior. Journal of
Finance, 47(3), 1181–1208.
Abdel-Khalik, A. R., & El-Sheshi, K. (1980). Information
choice and cue utilization in an experiment on default pre-
diction. Journal of Accounting Research, 18(2), 325–342.
Ackert, L. F., Church, B. K., & Shehata, M. (1997). An
experimental examination of the e?ects of forecast bias on
individuals’ use of forecasted information. Journal of
Accounting Research, 35(1), 25–42.
Anderson, M. J. (1988). A comparative analysis of information
and evaluation behavior of professional and non-profes-
sional ?nancial analysts. Accounting, Organizations and
Society, 13(5), 431–446.
Anderson, M. J., & Sunder, S. (1995). Professional traders as
intuitive Bayesians. Organizational Behavior and Human
Decision Processes, 64(2), 185–202.
Andrade, G. (1999). Do appearances matter? The impact of EPS
accretion and dilution on stock prices. Working Paper, Har-
vard Business School.
Ball, R. (1992). The earnings–price anomaly. Journal of
Accounting and Economics, 15(2), 319–345.
Ball, R., & Bartov, E. (1996). How naive is the stock market’s
use of earnings information? Journal of Accounting and Eco-
nomics, 21(3), 319–337.
Bamber, L. (1987). Unexpected earnings, ?rm size, and trading
volume around quarterly earnings announcements. The
Accounting Review, 62(3), 510–532.
Bamber, L., Barron, O., & Stober, T. (1997). Trading volume
and di?erent aspects of disagreement coincident with earn-
ings announcements. The Accounting Review, 72(4), 575–597.
Barberis, N., Shleifer, A., & Vishny, R. (1998). A model of
investor sentiment. Journal of Financial Economics, 49(3),
307–343.
Bazerman, M. H. (1998). Judgment in managerial decision
making. New York: John Wiley.
Bazerman, M. H., Morgan, K. P., & Loewenstein, G. F. (1997).
The impossibility of auditor independence. Sloan Manage-
ment Review, Summer, 89–94.
Beeler, J., & Hunton, J. E. (2001). Contingent economic rents:
insidious threats to auditor independence. Working Paper,
South Florida University.
Beresford, D. R. (1994). A request for more research to support
?nancial accounting standard setting AAA — accounting,
behavior and organization section. Behavioral Research in
Accounting, 6(Supplement), 190–203.
Berg, J., Dickhaut, J., & McCabe, K. (1995). The individual
versus the aggregate. In R. H. Ashton, & A. H. Ashton
(Eds.), Judgment and decision-making research in accounting
and auditing (pp. 102–134). New York: Cambridge.
Bernard, V. L. (1993). Stock price reactions to earnings
announcements: a summary of recent anomalous evidence
and possible explanations. In R. Thaler (ed.), Advances in
behavioral ?nance (pp. 303–340).
Bernard, V. L., & Skinner, D. J. (1996). What motivates man-
agers’ choice of discretionary accruals? Journal of Accounting
and Economics, 22(1–3), 313–325.
806 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
Bernard, V. L., & Thomas, J. (1989). Post-earnings announce-
ment drift: delayed price response or risk premium? Journal
of Accounting Research, 27(1), 1–48.
Bernard, V. L., & Thomas, J. (1990). Evidence that stock prices
do not fully re?ect the implications of current earnings for
future earnings. Journal of Accounting and Economics, 13(4),
305–340.
Bernheim, D. (1984). Rationalizable strategic behavior. Econo-
metrica, 52(5), 1007–1028.
Bhushan, R. (1994). An informational e?ciency perspective on
the post-earnings-announcement drift. Journal of Accounting
and Economics, 18(1), 45–65.
Biggs, S. F. (1984). Financial analysts’ information search in
the assessment of corporate earning power. Accounting,
Organizations and Society, 9(3), 313–323.
Biggs, S. F., Bedard, J. C., Gaber, B. G., & Linsmeier, T. J.
(1985). The e?ects of task size and similarity on the decision
behavior of bank loan o?cers. Management Science, 31(8),
970–987.
Bloom?eld, R. (1996a). The interdependence of reporting dis-
cretion and informational e?ciency in laboratory markets.
The Accounting Review, 71(4), 493–511.
Bloom?eld, R. (1996b). Quotes, prices and estimates of value in
a laboratory market. Journal of Finance, 51(5), 1791–1808.
Bloom?eld, R., & Hales, J. (2000). Developing reputations for
reliable reporting: the role of expectations. Working Paper,
Cornell University.
Bloom?eld, R., & Libby, R. (1996). Market reactions to di?er-
entially available information in the laboratory. Journal of
Accounting Research, 34(2), 183–207.
Bloom?eld, R., Libby, R., & Nelson, M. W. (1996). Commu-
nication of con?dence as a determinant of group judgment
accuracy. Organizational Behavior and Human Decision Pro-
cesses, 68(3), 287–300.
Bloom?eld, R., Libby, R., & Nelson, M. W. (1999). Con?dence
and the welfare of less-informed investors. Accounting,
Organizations and Society, 24(8), 623–647.
Bloom?eld, R., Libby, R., & Nelson, M. W. (2000a). Over-
reliance on previous years’ earnings. Working Paper, Cornell
University.
Bloom?eld, R., Libby, R., & Nelson, M. W. (2000b). Under-
reactions, over-reactions, and moderated con?dence. Journal
of Financial Markets, 3, 113–137.
Bloom?eld, R., & Wilks, T. J. (2000). Disclosure e?ects in the
laboratory: liquidity, depth and the cost of capital. The
Accounting Review, 75(1), 13–42.
Bonner, S. E. (1990). Experience e?ects in auditing: the role of
task-speci?c knowledge. The Accounting Review, 65(1), 72–92.
Bonner, S. E., & Walker, P. L. (1994). The e?ects of instruction
and experience on the acquisition of auditing knowledge. The
Accounting Review, 69(1), 157–178.
Bouwman, M. J. (1984). Expert vs. novice decision making in
accounting: a summary. Accounting, Organizations and
Society, 9(3), 325–327.
Brown, L., & Han, J. (2000). Do stock prices re?ect the impli-
cations of current earnings for future earnings for AR1
?rms? Journal of Accounting Research (in preparation).
Calegari, M., & Fargher, N. L. (1997). Evidence that prices do
not fully re?ect the implications of current earnings for
future earnings: an experimental markets approach. Con-
temporary Accounting Research, 14(3), 397–433.
Camerer, C. (1987). Do biases in probability judgment matter
in markets, experimental evidence. American Economic
Review, 77(5), 981–997.
Camerer, C. (1992). The rationality of prices and volume in
experimental markets. Organizational Behavior and Human
Decision Processes, 51(2), 237–272.
Camerer, C. (1997). Rules for experimenting in psychology and
economics, and why they di?er. In Van Dam et al., Under-
standing strategic interaction: essays in honor of R Selten.
Berlin, New York: Springer.
Carroll, J. S. & Johnson, E. (1990). Decision research: a ?eld
guide. Sage
Chan, L., Jegadeesh, K. C., & Lakonishok, J. (1996). Momen-
tum strategies. Journal of Finance, 51(5), 1681–1713.
Clement, M. (1999). Analyst forecast accuracy: do ability,
resources, and portfolio complexity matter? Journal of
Accounting and Economics, 27(3), 285–303.
Cloyd, C. B., Pratt, J., & Stock, T. (1996). The use of ?nancial
accounting choice to support aggressive tax positions: public
and private ?rms. Journal of Accounting Research, 34(1), 23–43.
Coller, M. (1996). Information, noise, and asset prices: an
experimental study. Review of Accounting Studies, 1, 35–50.
Cuccia, A. D., Hackenbrack, K., & Nelson, M. W. (1995). The
ability of professional standards to mitigate aggressive
reporting. The Accounting Review, 70(2), 227–248.
Daniel, K., Hirshleifer, D., & Subrahmanyam, A. (1998).
Investor psychology and security market under- and over-
reactions. Journal of Finance, 53(6), 1839–1885.
DeBondt, W., & Thaler, R. (1985). Does the stock market
overreact. Journal of Finance, 40(3), 793–818.
DeBondt, W., & Thaler, R. (1987). Further evidence of investor
overreaction and stock market seasonality. Journal of
Finance, 42(3), 557–581.
DeBondt, W., & Thaler, R. (1990). Do security analysts over-
react? American Economic Review, 80(2), 52–57.
Dechow, P., & Sloan, R. (1997). Returns to contrarian invest-
ment strategies: tests of naive expectation hypotheses. Jour-
nal of Financial Economics, 43(1), 3–27.
Dechow, P. M., Sloan, R. G., & Sweeney, A. P. (1995).
Detecting earnings management. The Accounting Review,
70(2), 193–225.
De Long, J. B., Shleifer, A., Summers, L. H., & Waldmann,
R. J. (1991). The survival of noise traders in ?nancial mar-
kets. The Journal of Business, 64(1), 1–19.
Dietrich, J. R., Kachelmeier, S. J., Kleinmuntz, D. N., & Lins-
meier, T. J. (2000). Market e?ciency, bounded rationality,
and supplemental business reporting disclosures. Journal of
Accounting Research (in preparation).
Dopuch, N., & King, R. R. (1996). The e?ects of lowballing on
audit quality: an experimental markets study. Journal of
Accounting, Auditing and Finance, 11, 45–69.
Dyckman, T. R. (1964). On the investment decision. The
Accounting Review, 39(2), 285–295.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 807
Einhorn, H. J. (1980). Learning From experience and sub-
optimal rules in decision making. In T. Wallsten (Ed.), Cog-
nitive processes in choice and decision behavior (pp. 1–20).
Hillsdale, NJ: Erlbaum.
Einhorn, H. J., & Hogarth, R. M. (1981). Behavioral decision
theory: processes of judgment and choice. Annual Review of
Psychology, 32, 53–88.
Einhorn, H. J., & Hogarth, R. M. (1986). Decision making
under ambiguity. Journal of Business, 59(4), S225–S250.
Fama, E. F. (1970). E?cient capital markets: a review of theory
and empirical work. Journal of Finance, 25(2), 383–417.
Fama, E. F. (1998). Market e?ciency, long-termreturns, andbeha-
vioral ?nance. Journal of Financial Economics, 49(3), 283–306.
Fischer, P., &Verrecchia, R. (1999). Public information and heur-
istic trade. Journal of Accounting and Economics, 27(1), 89–124.
Forsythe, R., & Lundholm, R. (1990). Information aggregation
in an experimental market. Econometrica, 58(2), 309–348.
Forsythe, R., Lundholm, R., & Reitz, T. (1999). Cheap talk,
fraud and adverse selection in ?nancial markets: some experi-
mental evidence. Review of Financial Studies, 12, 518–581.
Foster, G., Olsen, C., & Shevlin, T. (1984). Earnings releases,
anomalies, and the behavior of security returns. The
Accounting Review, 59(4), 574–603.
Frankel, R., & Lee, C. (1998). Accounting valuation, market
expectation, and cross-sectional stock returns. Journal of
Accounting and Economics, 25(3), 283–319.
Ganguly, A. R., Kagel, J. H., & Moser, D. V. (1994). The
e?ects of biases in probability judgments on market prices.
Accounting, Organizations and Society, 19(8), 675–700.
Gervais, S., & Odean, T. (1997). Learning to be overcon?dent.
Unpublished Working Paper, University of Pennsylvania.
Ghosh, D., & Whitecotton, S. M. (1997). Some determinants of
analysts’ forecast accuracy. Behavioral Research in Account-
ing, 9(Supplement), 50–68.
Gibbins, M., Salterio, S., & Webb, A. (2000). Evidence about
auditor–client management negotiation concerning client’s
?nancial reporting. Journal of Accounting Research (in
preparation).
Gibbins, M., & Swieringa, R. J. (1995). Twenty years of judgment
research in accounting and auditing. In R. H. Ashton, & A. H.
Ashton (Eds.), Judgment and decision-making research in
accounting and auditing (pp. 231–249). New York: Cambridge.
Gillette, A. B., Stevens, D. E., Watts, S. G., & Williams, A. W.
(1999). Price and volume reactions to public information relea-
ses: an experimental approach incorporating traders’ subjective
beliefs. Contemporary Accounting Research, 16(3), 437–479.
Gode, D., & Sunder, S. (1993). Allocative e?ciency of markets
with zero-intelligence traders: market as a partial substitute
for individual rationality. The Journal of Political Economy,
101(1), 119–140(February).
Gode, D., &Sunder, S. (1997). What makes markets allocationally
e?cient? The Quarterly Journal of Economics, 112(2), 603–630.
Gonedes, N., & Dopuch, N. (1974). Capital market equilibrium,
information production, and selecting accounting techniques:
theoretical framework and review of empirical work. Journal
of Accounting Research, 12(Supplement), 48–129.
Gri?n, D., & Tversky, A. (1992). The weighing of evidence and
the determinants of con?dence. Cognitive Psychology, 24(3),
411–435.
Hackenbrack, K., & Nelson, M. W. (1996). Auditors’ incen-
tives and their application of ?nancial accounting standards.
The Accounting Review, 71(1), 43–59.
Hand, J. (1990). A test of the extended functional ?xation
hypothesis. The Accounting Review, 65(4), 740–763.
Haynes, C. M., & Kachelmeier, S. J. (1998). The e?ects of
accounting contexts on accounting decisions: a synthesis of
cognitive and economic perspectives in accounting experi-
mentation. Journal of Accounting Literature, 17, 97–136.
Healy, P. M., & Wahlen, J. M. (1999). A review of the earnings
management literature and its implications for standard set-
ting. Accounting Horizons, 13(4), 365–383.
Herrnstein, R., &Vaughn, W. (1980). Melioration and behavioral
allocation. In J. Staddon (Ed.), Limits to action: the allocation
of individual behavior (pp. 143–176). NewYork, NY: Academic
Press.
Hirst, D. E. (1994). Auditor sensitivity to earnings manage-
ment. Contemporary Accounting Research, 11(1), 405–422.
Hirst, D. E., & Hopkins, P. E. (1998). Comprehensive income
reporting and analysts’ valuation judgments. Journal of
Accounting Research, 36(Supplement), 47–75.
Hirst, D. E., Koonce, L., & Miller, J. (1999). The joint e?ect of
management’s prior forecast accuracy and the form of its
?nancial forecasts on investor judgment. Journal of Account-
ing Research, 37(Supplement), 101–124.
Hirst, D. E., Koonce, L., & Simko, P. J. (1995). Investor reac-
tions to ?nancial analysts’ research reports. Journal of
Accounting Research, 33(2), 335–351.
Hodder, L., Koonce, L., & McAnally, M. L. (2001). SEC mar-
ket risk disclosures: implications for judgment and decision
making. Accounting Horizons (in preparation).
Hogarth, R. M. (1993). Accounting for decisions and decisions
for accounting. Accounting, Organizations and Society, 18(5),
407–424.
Hogarth, R. M., & Einhorn, H. J. (1992). Order a?ects in belief
updating: the belief-adjustment model. Cognitive Psychology,
24(1), 1–55.
Hopkins, P. E. (1996). The e?ect of ?nancial statement classi-
?cation of hybrid ?nancial instruments on ?nancial analysts’
stock price judgments. Journal of Accounting Research,
34(Supplement), 33–50.
Hopkins, P. E., Houston, R. W., & Peters, M. F. (2000). Pur-
chase, pooling, and equity analysts’ valuation judgments.
The Accounting Review, 75(3), 257–281.
Hunton, J. E., & McEwen, R. A. (1997). An assessment of the
relation between analysts’ earnings forecast accuracy, moti-
vational incentives and cognitive information search strat-
egy. The Accounting Review, 72(4), 497–515.
Jacob, J., Lys, T., & Neale, M. (1999). Expertise in forecasting
performance of security analysts. Journal of Accounting and
Economics, 28(1), 51–82.
Jensen, R. (1966). An experimental design for study of e?ects
of accounting variations in decision making. Journal of
Accounting Research, 4(2), 224–238.
Jung, W., & Kwon, Y. (1988). Disclosure when the market is
808 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
unsure of information endowment of managers. Journal of
Accounting Research, 26(1), 146–153.
Kachelmeier, S. (1996a). Do cosmetic reporting variations
a?ect market behavior? A laboratory study of the accounting
emphasis on unavoidable costs. Review of Accounting Stud-
ies, 1, 115–140.
Kachelmeier, S. (1996b). Discussion of ‘‘tax advice and report-
ing under uncertainty: theory and experimental evidence.’’.
Contemporary Accounting Research, 13, 81–90.
Kachelmeier, S., & Shehata, M. (1992). Examining risk pre-
ferences under high monetary incentives: experimental evi-
dence from the People’s Republic of China. The American
Economic Review, 82, 1120–1141.
Kahneman, D., & Tversky, A. (1979). Prospect theory: an
analysis of decision under risk. Econometrica, 47(2), 263–291.
Kahneman, D., & Tversky, A. (1996). On the reality of cogni-
tive illusions. Psychological Review, 103(3), 582–588.
Kennedy, J., Kleinmuntz, D. N., & Peecher, M. E. (1997).
Determinants of the justi?ability of performance in ill-struc-
tured audit tasks. Journal of Accounting Research, 35(Sup-
plement), 105–123.
Kennedy, J., Mitchell, T., & Sefcik, S. E. (1998). Disclosure of
contingent environmental liabilities: some unintended con-
sequences? Journal of Accounting Research, 36(Autumn),
257–277.
Kim, O., & Verrecchia, R. (1994). Market liquidity and volume
around earnings announcements. Journal of Accounting and
Economics, 17(1), 41–67.
King, R. R. (1996). Reputation formation for reliable report-
ing: an experimental investigation. The Accounting Review,
71(3), 375–396.
King, R. R. (2001). An experimental investigation of self-serving
biases in an auditing trust game: the e?ect of group a?liation.
Working Paper, Washington University.
King, R. R., & Wallin, D. E. (1991a). Market-induced infor-
mation disclosures: an experimental markets investigation.
Contemporary Accounting Research, 8(1), 170–197.
King, R. R., & Wallin, D. E. (1991b). Voluntary disclosures
when seller’s level of information is unknown. Journal of
Accounting Research, 29(1), 96–108.
King, R. R., & Wallin, D. E. (1995). Experimental tests of dis-
closure with an opponent. Journal of Accounting and Eco-
nomics, 19(1), 139–168.
Kinney, W. R. (1986). Empirical accounting research design for
PhD students. The Accounting Review, 61(2), 338–350.
Kinney, W. R., & Martin, R. D. (1994). Does auditing reduce
bias in ?nancial reporting? A review of audit-related adjust-
ment studies. Auditing: A Journal of Practice and Theory,
13(1), 149–156.
Kinney, W. R., & Nelson, M. W. (1996). Outcome information
and the ‘expectations gap’: the case of loss contingencies.
Journal of Accounting Research, 34(2), 281–299.
Kothari, S. P. (2000). Capital markets research in accounting.
Journal of Accounting and Economics (in preparation).
Kunda, Z. (1990). The case for motivated reasoning. Psycholo-
gical Bulletin, 108(3), 480–498.
Kyle, A. S., & Wang, F. A. (1997). Speculation duopoly with
agreement to disagree. Can overcon?dence survive the mar-
ket test?. Journal of Finance, 52(5), 2073–2090.
LaPorta, R. (1996). Expectations and the cross-section of stock
returns. Journal of Finance, 51(5), 1715–1742.
Lee, C., Myers, J., & Swaminathan, B. (1999). What is the intrin-
sic value of the dow? Journal of Finance, 54(5), 1693–1741.
Lee, C., & Swaminathan, B. (2000). Price momentum and
trading volume. Journal of Finance (in preparation).
Libby, R. (1981). Accounting and human information processing:
theory and applications. Englewood Cli?s: Prentice-Hall.
Libby, R., & Kinney, W. R. (2000). Earnings management,
audit di?erences, and analysts’ forecasts. The Accounting
Review (in preparation).
Libby, R., & Luft, J. (1993). Determinants of judgment per-
formance in accounting settings: ability, knowledge, motiva-
tion, and environment. Accounting, Organizations and
Society, 18(5), 425–450.
Libby, R., & Tan, H-T. (1999). Analysts’ reactions to warnings
of negative earnings surprises. Journal of Accounting
Research, 37(2), 415–436.
Lipe, M. G. (1991). Counterfactual reasoning as a framework for
attribution theories. Psychological Bulletin, 109(3), 456–471.
Lipe, M. G. (1998). Individual investors’ risk judgments and
investment decisions: the impact of accounting and market
data. Accounting, Organizations and Society, 23(7), 625–640.
Lundholm, R. J. (1991). What a?ects the e?ciency of a mar-
ket? Some answers from the laboratory. The Accounting
Review, 66(3), 486–515.
Maines, L. A. (1994). The role of behavioral accounting
research in ?nancial accounting standard setting. Behavioral
Research in Accounting, 6(Supplement), 204–212.
Maines, L. A. (1995). Judgment and decision-making research
in ?nancial accounting: a review and analysis.
In R. H. Ashton, & A.H Ashton (Eds.), Judgment and deci-
sion-making research in accounting and auditing (pp. 76–101).
New York: Cambridge.
Maines, L. A., & Hand, J. R. M. (1996). Individuals’ percep-
tions and misperceptions of time series properties of quar-
terly earnings. The Accounting Review, 71(3), 317–336.
Maines, L. A., Mautz, R. D., Wright, G. B., Graham, L. E., Ros-
man, A. J., &Yardley, J. A. (2000). Implications of international
diversity in joint venture ?nancial-reporting standards for ?nan-
cial analysts’ stock values. Working Paper, Indiana University.
Maines, L. A., & McDaniel, L. S. (2000). E?ects of compre-
hensive income volatility on nonprofessional investors’ judg-
ments: the role of presentation format. The Accounting
Review (in preparation).
Maines, L. A., McDaniel, L. S., & Harris, M. S. (1997). Impli-
cations of proposed segment reporting standards for ?nan-
cial analysts’ investment decisions. Journal of Accounting
Research, 35(Supplement), 1–24.
Mayhew, B. W., Schatzberg, J. W., & Sevcik, G. R. (2000). The
e?ect of accounting uncertainty and auditor reputation on
auditor independence. Working Paper, University of
Wisconsin — Madison.
Maynard Smith, J. (1982). Evolution and the theory of games.
Cambridge, UK: Cambridge University Press.
R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810 809
Mear, R., & Firth, M. (1987). Cue usage and self-insight of
?nancial analysts. The Accounting Review, 62(1), 176–182.
Mikhail, M., Walther, B., & Willis, R. (1997). Do security
analysts improve their performance with experience? Journal
of Accounting Research, 35(Supplement), 131–157.
Milgrom, P., & Stokey, N. (1982). Information, trade and com-
mon knowledge. Journal of Economic Theory, 26(1), 17–27.
Moser, D. V. (1998). Using an experimental economics
approach in behavioral accounting research. Behavioral
Research in Accounting, 10(Supplement), 94–110.
Nelson, M. W., Elliott, J. A., & Tarpley, R. L. (2000). Where
do companies attempt earnings management, and when do
auditors prevent it? Working Paper, Cornell University.
Nelson, M. W., & Kinney, W. R. (1997). The e?ect of ambi-
guity on auditors’ loss contingency reporting judgments. The
Accounting Review, 72(2), 257–274.
Nelson, M. W., Krische, S. D., &Bloom?eld, R. (2000). Sticking
with the program: why investors don’t exploit anomalies shown
in large-sample studies. Working Paper, Cornell University.
O’Brien, J., & Srivastava, S. (1991). Dynamic stock markets
with multiple assets: an experimental analysis. Journal of
Finance, 46(5), 1811–1838.
Odean, T. (1998). Volume, volatility, price, and pro?t when all
traders are above average. Journal of Finance, 53(6), 1887–1934.
Ou, J., & Penman, S. (1989). Financial statement analysis and
the prediction of stock returns. Journal of Accounting and
Economics, 11(4), 295–329.
Panko?, L. D., & Virgil, R. L. (1970). Some preliminary ?nd-
ings from a laboratory experiment on the usefulness of
?nancial accounting information to security analysts. Journal
of Accounting Research, 8(Supplement), 1–48.
Paquette, L., & Kida, T. (1988). The e?ect of decision strategy
and task complexity on decision performance. Organizational
Behavior and Human Decision Processes, 41(1), 128–142.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1992). Beha-
vioral decision research: a constructive processing perspec-
tive. Annual Review of Psychology, 43, 87–131.
Pearce, D. G. (1984). Rationalizable strategic behavior and the
problem of perfection. Econometrica, 52(5), 1029–1050.
Phillips, F. (1999). Auditor attention to and judgments of
aggressive ?nancial reporting. Journal of Accounting
Research, 37(1), 167–189.
Plott, C., & Sunder, S. (1988). Rational expectations and the
aggregation of diverse information in laboratory security
markets. Econometrica, 56(5), 1085–1118.
Runkel, P., & McGrath, J. (1972). Research on human behavior:
a systematic guide to method. New York: Holt, Rinehart and
Winston.
Salterio, S., & Koonce, L. (1997). The persuasiveness of audit
evidence: the case of accounting policy decisions. Accounting,
Organizations and Society, 22(6), 573–587.
Simon, H. A. (1957). Models of man. New York: Wiley.
Sloan, R. (1996). Do stock prices fully re?ect information in
accruals and cash ?ows about future earnings. The Account-
ing Review, 71(3), 289–315.
Slovic, P., Fleissner, D., & Bauman, W. S. (1972). Analyzing
the use of information in investment decision making: a
methodological perspective. Journal of Business, 45, 283–
301.
Slovic, P., & Lichtenstein, S. C. (1968). The relative importance
of probabilities and payo?s in risk taking. Journal of
Experimental Psychology Monograph Supplement, 78.
Slovic, P., & Lichtenstein, S. C. (1971). Comparison of baye-
sian and regression approaches to the study of information
processing in judgment. Organizational Behavior and Human
Performance, 6, 649–744.
Smith, E. E., & Medin, D. L. (1981). Categories and concepts
(pp. 1–17). Harvard: Cambridge.
Smith, V. (1976). Experimental economics: induced value the-
ory. American Economic Review, 66, 274–279.
Swaminathan, B., & Lee, C. (2000). Do stock prices overreact to
earnings news? Working Paper, Cornell University.
Tan, H. T., Libby, R., & Hunton, J. (2000). Analysts’ reactions
to earnings preannouncement strategies. Working Paper,
Cornell University.
Tan, T. C., & Werlang, S. (1988). The bayesian foundations of
solution concepts of games. Journal of Economic Theory,
45(2), 379–391.
Tetlock, P. (1992). The impact of accountability on judgment and
choice: towardasocial contingencymodel. InL. Berkowitz(Ed.),
Advances in experimental social psychology 25 (pp. 331–376).
NewYork: Academic Press.
Thaler, R. H. (1999). The end of behavioral ?nance. Financial
Analysts’ Journal, 55(November/December), 12–17.
Trotman, K. T. (1996). Research methods for judgment and
decision making studies in auditing. Melbourne, Australia:
Coopers and Lybrand.
Tucker, R. R. (1997). The relationship between public and pri-
vate information: an experimental markets study. Behavioral
Research in Accounting, 9, 219–249.
Tuttle, B., Coller, M., & Burton, F. G. (1997). An examination
of market e?ciency: information order e?ects in a laboratory
market. Accounting, Organizations and Society, 22(1), 89–103.
Tversky, A., & Kahneman, D. (1974). Judgment under uncer-
tainty: heuristics and biases. Science, 185, 1124–1131.
Vincent, L. (1997). Equity valuation implications of purchase
versus pooling accounting. The Journal of Financial State-
ment Analysis, 2(4), 5–19.
Wagenhofer, A. (1990). Voluntary disclosure with a strategic
opponent. Journal of Accounting and Economics, 12(4), 341–363.
Watts, R., & Zimmerman, J. (1986). Positive accounting
research. Englewood Cli?s, NJ: Prentice Hall.
Whitecotton, S. M. (1996). The e?ects of experience and con-
?dence on decision aid reliance: a causal model. Behavioral
Research in Accounting, 8, 194–216.
Wilks, J. (2001). Predecisional distortion of evidence as a con-
sequence of real-time audit review. Working Paper, Brigham
Young University.
Wright, W. F. (1977). Financial information processing models:
an empirical study. The Accounting Review, 52(3), 676–689.
Yetton, P. W., & Bottger, P. C. (1982). Individual versus group
problem solving: an empirical test of a best-member strategy.
Organizational Behavior and Human Decision Processes, 29,
307–321.
810 R. Libby et al. / Accounting, Organizations and Society 27 (2002) 775–810
doc_261777888.pdf