Performance evaluation inflation and compression

jasminepvk · Feb 6, 2016

Description
We provide a behavioral account of subjective performance evaluation inflation (i.e.,
leniency bias) and compression (i.e., centrality bias). When a manager observes noisy signals
of employee performance and the manager strives to produce accurate ratings but
feels worse about unfavorable errors than about favorable errors, the manager’s selfishly
optimal ratings will be biased upwards. Both the uncertainty about performance and the
asymmetry in the manager’s utility are necessary conditions for performance evaluation
inflation. Moreover, the extent of the bias is increasing in the variance of the performance
signal and in the asymmetry in aversion to unfair ratings. Uncertainty about performance
also leads to compressed ratings. These results suggest that performance appraisals based
on well-defined unambiguous criteria will have less bias. Additionally, we demonstrate
that employer and employee can account for biased performance evaluations when they
agree to a contract, and thus, to the extent leniency bias and centrality bias persist, these
biases hurt employee performance and lower firm productivity.

Performance evaluation inflation and compression
Russell Golman*, Sudeep Bhatia
Department of Social and Decision Sciences, Carnegie Mellon University, 5000 Forbes Ave., Pittsburgh, PA 15213, USA
Abstract
We provide a behavioral account of subjective performance evaluation inflation (i.e., leniency bias) and compression (i.e., centrality bias). When a manager observes noisy signals of employee performance and the manager strives to produce accurate ratings but feels worse about unfavorable errors than about favorable errors, the manager’s selfishly optimal ratings will be biased upwards. Both the uncertainty about performance and the asymmetry in the manager’s utility are necessary conditions for performance evaluation inflation. Moreover, the extent of the bias is increasing in the variance of the performance signal and in the asymmetry in aversion to unfair ratings. Uncertainty about performance also leads to compressed ratings. These results suggest that performance appraisals based on well-defined unambiguous criteria will have less bias. Additionally, we demonstrate that employer and employee can account for biased performance evaluations when they agree to a contract, and thus, to the extent leniency bias and centrality bias persist, these biases hurt employee performance and lower firm productivity.
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
Subjective performance evaluation is a powerful informational tool for an organization. It allows employers to determine compensation and job assignments, and to provide feedback when objective measures are costly, inaccurate or unavailable (Baker, Gibbons, & Murphy, 1994; Murphy, 1999; Prendergast, 1999). More than 70% of firms utilize a formal employee performance appraisal mechanism (Murphy & Cleveland, 1991). Subjective evaluations also play an important role in worker recruitment, with roughly half of all workers finding jobs through external references (Montgomery, 1991).
Despite its importance for human resource accounting and management, subjective performance evaluation has a number of problems. Researchers in psychology, accounting and organizational behavior have found that subjective evaluations suffer from severe leniency effects. Performance appraisal ratings display an upward bias (Bol, 2011; Saal & Landy, 1977), with 60–70% of those being assessed rated in the top two categories of five-point rating scales (Bretz, Milkovich, & Read, 1992). This effect is more pronounced in settings where subjective performance ratings are used to determine worker compensation (Jawahar & Williams, 1997), in settings where information about the employee’s true competence is scarce (Bol, 2011), and in settings where the manager and employee have a particularly strong relationship (Bol, 2011; Lawler, 1990; Murphy & Cleveland, 1991). Performance evaluations also are shown to display a centrality bias, with supervisors compressing ratings so that they differ little from the norm (Moers, 2005; Prendergast, 1999).
These biases[1] have been documented through surveys of organizations and practitioners (Murphy & Cleveland, 1995), laboratory studies (Bernardin, Cooke, & Villanova, 2000; Kane, Bernardin, Villanova, & Peyrefitte, 1995) and archival data sets of firms (Bol, 2011; Moers, 2005). In general, they can generate a Lake Wobegon Effect with almost everyone rated above average (Moran & Morgan, 2003). Besides reducing the informational value of performance evaluations, such biases also distort wages and can impact worker effort and firm productivity.

[1] In this paper ‘bias’ refers to the aforementioned leniency and centrality biases rather than to the more pernicious demographic biases that also plague performance evaluation (Castilla & Benard, 2010).

* Corresponding author. Tel.: +1 412 268 9861; fax: +1 412 268 6938. E-mail address: [email protected] (R. Golman).

Accounting, Organizations and Society 37 (2012) 534–543
0361-3682/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.aos.2012.09.001
Given the prevalence of leniency and centrality biases, it has been suggested that managers willfully alter ratings in order to help workers, improve training or avoid conflict (Levy & Williams, 2004; Longenecker, Sims, & Gioia, 1987; Prendergast & Topel, 1996; Prendergast, 2002). It is not the case, however, that managers necessarily have explicit preferences for inflated evaluations. In this paper, we provide an alternative model of performance evaluation, which assumes that managers prefer to issue accurate ratings, but also have an asymmetry in their aversion to undeservedly high and undeservedly low evaluations. This asymmetry can stem from a number of different causes: the manager may be sympathetic towards the employee, as assumed in recent work (Giebe & Gürtler, 2012; Grund & Przemeck, 2012), or the manager may be indifferent towards the employee’s wellbeing, but may not want to discourage the employee with unfairly low ratings. In either case, managers prefer accurate ratings, and biases arise only in the face of uncertainty.
Uncertainty about worker evaluation plays a crucial role in our model. Job performance measures suffer from some imprecision or measurement error (Landy & Farr, 1980; Murphy, 2008), and in order to make a subjective evaluation, managers must aggregate noisy signals of job performance with prior information about employee competence (Banker & Datar, 1989). If the manager feels worse about unfavorable errors than about favorable errors, then despite a desire to get it right, the manager’s selfishly optimal evaluation will be higher than the best estimate given the employee’s signal and the manager’s prior beliefs. Consequently, more than half the population will be rated as above the actual average competence. Moreover, leniency bias, measured as the difference between the average rating in the population and the actual average competence, will increase with the noisiness of the performance signal and the asymmetry in the manager’s fairness preferences, as documented in empirical work on subjective performance evaluation (Bol, 2011; Lawler, 1990; Murphy & Cleveland, 1991).[2] In addition to this leniency bias, the manager’s evaluations will also display a centrality bias. The distribution of the assigned ratings will have lower variance than the underlying distribution of actual competence. This compression is also due to the inherent noise in the signal. To make the best estimate of employee competence, the manager discounts the magnitude of the signal to account for this noise. Leniency and centrality biases both arise from the same assumptions in our model, and the combined effect is that the least competent employees get the most inflated ratings.
After proposing our account for the leniency and centrality effects in subjective performance evaluations, we consider incentive contracts involving sophisticated principals and agents. We assume that the firm derives organizational capital from more accurate (more informative) subjective evaluations, as well as, of course, profits from the employee’s effort. We consider managers with intrinsic motivation to provide an accurate evaluation, but also some degree of altruism towards the employee. In equilibrium the manager thus exhibits the aforementioned asymmetric aversion to undeservedly high and undeservedly low ratings.
In anticipation of performance evaluation bias, employers adjust the compensation package they offer their employees. We show that the leniency and centrality biases are detrimental to employee performance and thus costly to the firm in equilibrium. Additionally, employees’ total wages decrease due to the manager’s leniency as well as to the manager’s rating compression. Moreover, consistent with empirical research (Jawahar & Williams, 1997), we show that leniency bias exists when the manager’s evaluation is used to determine the employee’s pay and is increasing with the manager’s altruism towards the employee.
The rest of the paper is organized as follows. Section 2 discusses subjective performance evaluation and its associated biases in more detail. It also outlines recent research on these biases, and highlights the ways that this paper expands on and complements previous work. Section 3 provides a general mathematical model of performance evaluation. Section 4 identifies leniency and centrality biases in the manager’s ratings. Section 5 embeds this model within an incentive contract framework and derives implications for employee performance and compensation. Section 6 concludes.
2. Subjective performance evaluation
Management researchers and economists have long been concerned with the role of performance evaluation in incentive design and optimal contracting (Dutta, 2008; Gibbons, 2005; Giebe & Gürtler, 2012; Holmstrom, 1979). Traditionally it was assumed that contracts specify compensation as a function of a single, verifiable (possibly noisy) performance measure. This performance measure is generally required to be objective, as auditors consider objective measures to be reliable and verifiable. Objective performance measures alone may distort incentives, however, leading agents to choose selfishly optimal behavior that is harmful to the employer (Baker, 1992; Holmstrom, 1979; Holmstrom & Milgrom, 1991). Firms should use performance measures that capture the full value of the employee’s actions, including measures that are based on opinions and other subjective judgments. For this reason, subjective evaluation is often incorporated as part of an optimal contract (Baiman & Rajan, 1995; Baker et al., 1994). Indeed, a firm’s future performance can be predicted by previous discretionary compensation, suggesting that agents are rewarded for good work even in settings where the objective returns are delayed into the future (Hayes & Schaefer, 2000).
Subjective performance evaluation, however, suffers from two important limitations. In the absence of well-defined, objective criteria, raters are vulnerable to behavioral biases. Their evaluations are liable to both inflation and compression. The former generates a leniency bias, according to which too many employees are rated above average, whereas the latter generates a centrality bias, i.e., there is too little variation in employee ratings. A stark example of these biases can be seen in Merck & Co., Inc’s performance rating system during the 1980s. Murphy (1992) finds that the vast majority of subjects at Merck received a rating in the top five categories of a thirteen category rating scale. Furthermore, over 70% of these employees occupied just three of the thirteen performance categories (see also Prendergast, 1999).

[2] While our model accounts for the common finding that ratings are biased upwards, a natural generalization could also describe raters who would prefer under-reporting performance to over-reporting performance (Cheatham, Davis, & Cheatham, 1996).
There have been two recent attempts to explain these performance evaluation biases in the economics literature. Grund and Przemeck (2012) capture the leniency and centrality effects by assuming that managers are altruistic, that workers are inequality averse, and that managers trade off the benefits of helping their workers against the costs of distorting their evaluations. Giebe and Gürtler (2012) similarly assume that managers are altruistic towards the employees, and explore settings in which optimal contracts can generate the leniency bias. More broadly, models of social preference, such as inequality aversion (Fehr & Schmidt, 1999; Bolton & Ockenfels, 2000), have also been shown to account for a range of other anomalies in worker and employer behavior, including increased worker effort in trust and gift-exchange settings (Fehr, Kirchler, Weichbold, & Gachter, 1998), the use of unenforceable bonus or trust contracts, or incomplete contracts, instead of standard incentive contracts (Fehr & Schmidt, 2007), and the popularity of team-based incentives (Bartling, 2011; Englmaier & Wambach, 2010).
In line with these models, as well as with empirical work demonstrating that social factors influence a manager’s perceptions of employee performance (Johnson, Erez, Kiker, & Motowidlo, 2002; Judge & Ferris, 1993; Levy & Williams, 2004), we too assume an altruistic manager, but (unlike earlier theory papers) we still consider the manager to be intrinsically motivated to produce accurate (fair) evaluations. Moreover, while we focus in Section 5 on altruism as the source of an asymmetry in the manager’s aversion to unfair evaluations, we acknowledge that other motives, such as a desire not to discourage the employee or a desire to avoid conflict, could play a similar role and thus could also be responsible for leniency bias. In Sections 3 and 4 we analyze leniency and centrality bias due to noise in the performance signal, taking the asymmetry in the manager’s utility as a primitive rather than assuming a particular source for it. Our premise is that managers may want to be fair (Maas, van Rinsum, & Towry, 2009), but are still affected by considerations of workplace harmony or employee sympathy (Harris, 1994; Murphy, Cleveland, Skattebo, & Kinney, 2004). Recognizing that uncertainty about employee performance might be necessary for both leniency and centrality bias helps us understand why these biases are often observed together. Additionally, by analyzing leniency and centrality bias within an incentive contract framework, our model makes predictions about the impact of these biases on compensation contracts, as well as on worker effort and firm productivity. We derive comparative statics describing how the extent of these biases and their adverse effects on employee effort, performance, and compensation depend on contextual factors such as the strength of manager–employee relationships or the amount of uncertainty in the performance measure. Our primary contribution in this light is bringing together a behavioral economic theory of prosocial preferences with a standard Bayesian learning model and standard contract theory to explain robust empirical findings on subjective evaluations.
3. A mathematical model of performance evaluation
A manager is tasked with evaluating a heterogeneous distribution of employees who vary in their levels of competence. For simplicity, assume an employee (arbitrarily, employee $i$) has true competence $x_i \in \mathbb{R}$ (expressible as a real number). Obviously, $x_i$ is unknown to the manager, but the manager does know that $x_i \sim N(\bar{x}, h^2)$, i.e., that competence is normally distributed in the population with mean $\bar{x}$ and variance $h^2$. This knowledge serves as the manager’s prior. The manager then observes a signal of the employee’s performance $y_i \sim N(x_i, r^2)$. The signal depends of course on the employee’s true competence, but has unbiased noise with variance $r^2$. Thus, $r$ captures the uncertainty in the performance measure.
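The data-generating process just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameter values (mean 50, h = 8, r = 6) are borrowed from the numerical example in Section 4, and the sample size and seed are arbitrary choices, not part of the model.

```python
import random
from statistics import fmean, pvariance

random.seed(0)                       # fixed seed so the sketch is repeatable
x_bar, h, r = 50.0, 8.0, 6.0         # illustrative values, not estimates from data
n = 100_000                          # arbitrary sample size

# Competence is drawn from the prior; the signal adds unbiased noise to it.
competence = [random.gauss(x_bar, h) for _ in range(n)]   # x_i ~ N(x_bar, h^2)
signals = [random.gauss(x, r) for x in competence]        # y_i ~ N(x_i, r^2)

print(round(fmean(signals), 1))      # close to x_bar, since the noise is unbiased
print(round(pvariance(signals), 1))  # close to h^2 + r^2 = 100
```

The signals are more dispersed than the underlying competence, which is the feature of the model that later drives rating compression.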
After observing the signal of employee performance, the manager issues a rating $z_i \in \mathbb{R}$ to employee $i$. We assume the utility of the manager takes the form

$$U_M(z_i) = \begin{cases} -k\,(x_i - z_i) & \text{if } z_i < x_i \\ -(z_i - x_i) & \text{otherwise,} \end{cases} \tag{1}$$

with $k > 1$. This reflects a scenario in which the manager would like to issue a rating equal to the employee’s true competence, but the manager feels worse about issuing a rating that is undeservedly low than about issuing a rating undeservedly high.[3] Presumably, the manager’s primary goal is to assign ratings fairly, but the manager may also sympathize somewhat with the employee or may not want to discourage the employee, or may wish to ingratiate himself or herself with the employee. These secondary goals produce an asymmetry in the manager’s utility function that is captured by the factor $k$. If the performance evaluation is used to determine employee compensation in a tournament or in some other incentive contract, then Eq. (1) with $k > 1$ is appropriate when the manager is not the residual claimant of the employee’s value added. Usually this is the case (Prendergast & Topel, 1993). In Section 5 we derive Eq. (1) in the context of incentive contracting by assuming the manager’s secondary preference is due to altruism for the employee, but by assuming Eq. (1) for now we are not yet committing to any one particular source for the asymmetry in the manager’s utility.
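As a minimal sketch of Eq. (1), assuming Python and a hypothetical function name `manager_utility` (neither comes from the paper), the asymmetric loss can be written as follows. The numbers in the usage lines are illustrative only.

```python
def manager_utility(z, x, k):
    """Eq. (1): utility from issuing rating z when true competence is x.

    With k > 1, an undeservedly low rating hurts k times as much as an
    equally large undeservedly high one. The function name is illustrative.
    """
    if z < x:
        return -k * (x - z)   # underrating: penalized at rate k
    return -(z - x)           # overrating (or an exact rating): penalized at rate 1


# Missing by two points in each direction, with k = 6:
low = manager_utility(48.0, 50.0, k=6.0)    # underrating by 2
high = manager_utility(52.0, 50.0, k=6.0)   # overrating by 2
print(low, high)
```

Because the penalty is piecewise linear rather than quadratic, the rating that maximizes expected utility is a quantile of the manager's belief about competence rather than its mean.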
Finally, note that if there were no uncertainty in the performance measure, i.e., if $r = 0$, the manager would know the employee’s true competence $x_i$ and would issue a perfectly accurate and unbiased rating $z_i = y_i = x_i$.

[3] In cases where the manager feels worse about undeserved high ratings than undue low ones, we would have $k < 1$, and we would predict rating deflation. This might occur if, for example, the manager is personally responsible for the compensation package determined by the evaluation.
4. Leniency bias and centrality bias
A manager’s rating strategy is a function $f : \mathbb{R} \to \mathbb{R}$ where $z_i = f(y_i)$. The manager updates her belief about the employee’s true competence using Bayesian inference. She then chooses a rating contingent on this inference. Her selfishly optimal rating maximizes her expected utility given this belief.
Theorem 1. The rule for determining the rating $z_i$ is

$$f(y_i) = \bar{x} + (y_i - \bar{x})\,\frac{h^2}{r^2 + h^2} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\;\operatorname{erf}^{-1}\!\left(\frac{k - 1}{k + 1}\right). \tag{2}$$

A straightforward proof is in the appendix. Note that $\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-s^2}\,ds$ is the error function, which is necessary to express the cumulative distribution function of a normal distribution.
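The rating rule in Eq. (2) is straightforward to evaluate numerically. The sketch below assumes Python's standard library only; the function names `erfinv` and `rating` are illustrative, and the inverse error function is obtained from the standard normal quantile function through the identity erf^{-1}(u) = Phi^{-1}((u + 1)/2) / sqrt(2).

```python
from math import sqrt
from statistics import NormalDist


def erfinv(u):
    """Inverse error function via the standard normal quantile function."""
    return NormalDist().inv_cdf((u + 1.0) / 2.0) / sqrt(2.0)


def rating(y, x_bar, h, r, k):
    """Eq. (2): the manager's utility-maximizing rating for signal y."""
    weight = h**2 / (r**2 + h**2)                  # shrinkage toward the prior mean
    best_estimate = x_bar + (y - x_bar) * weight   # posterior mean of competence
    posterior_var = r**2 * h**2 / (r**2 + h**2)
    inflation = sqrt(2.0 * posterior_var) * erfinv((k - 1.0) / (k + 1.0))
    return best_estimate + inflation


# With symmetric preferences (k = 1) the inflation term vanishes;
# with k > 1 the rating sits above the best estimate.
print(rating(60.0, x_bar=50.0, h=8.0, r=6.0, k=1.0))
print(rating(60.0, x_bar=50.0, h=8.0, r=6.0, k=6.0))
```

Setting `k = 1` is a convenient sanity check, since the rule then reduces to the ordinary Bayesian posterior mean.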
The second term in Eq. (2) depends on the performance signal that the manager observes. The normalization factor of $\frac{h^2}{r^2 + h^2}$ appears because the performance signal is inherently noisy and the manager should discount the magnitude of the signal to account for this noise. The manager knows that the variance of performance signals in the population is $\operatorname{var}(y_i) = r^2 + h^2$ whereas the variance of competence is only $\operatorname{var}(x_i) = h^2$. Thus, to balance dispersion caused by the noisy signal, an employee’s expected rating conditional on his true competence is compressed towards the mean. While this compression is in accordance with Bayes’ rule, it nevertheless generates centrality bias. The manager’s ratings, as we will see, end up with less variance than the actual employee competence.
The last term in Eq. (2) is the source of the leniency bias. The manager’s best estimate of the employee’s true competence after seeing the performance signal is $\bar{x} + (y_i - \bar{x})\,\frac{h^2}{r^2 + h^2}$, but the manager adds into the rating this additional term that is positive for $k > 1$. The amount of inflation is increasing in $r$, in $h$, and in $k$. Intuitively, the greater the aversion to underrating the employee (relative to overrating him), the more the manager will inflate the rating. Similarly, the more uncertain the manager is about employee competence, the more she will inflate the rating to reduce the chance of an underrating.
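For readers who want to see where the two terms come from, here is a sketch of one route to Eq. (2) under the model's assumptions. The paper's own proof is in its appendix and may proceed differently; the symbols $\mu_i$ and $s$ for the posterior mean and standard deviation are introduced here for convenience and are not the paper's notation.

```latex
% Posterior after observing y_i (normal prior, normal noise):
x_i \mid y_i \sim N\!\left(\mu_i,\; s^2\right), \qquad
\mu_i = \bar{x} + (y_i - \bar{x})\,\frac{h^2}{r^2 + h^2}, \qquad
s^2 = \frac{r^2 h^2}{r^2 + h^2}.

% Expected utility of a rating z under Eq. (1), and its first-order condition:
\frac{d}{dz}\,\mathbb{E}\!\left[U_M(z) \mid y_i\right]
  = k \Pr(x_i > z \mid y_i) - \Pr(x_i < z \mid y_i) = 0
  \;\Longrightarrow\;
  \Pr(x_i < z \mid y_i) = \frac{k}{k + 1}.

% Solving with the normal CDF, \Phi(t) = \tfrac{1}{2}\left[1 + \operatorname{erf}(t/\sqrt{2})\right]:
z = \mu_i + s\,\sqrt{2}\;\operatorname{erf}^{-1}\!\left(\frac{k - 1}{k + 1}\right).
```

In words, the optimal rating is the $k/(k+1)$ quantile of the manager's posterior belief, which lies above the posterior mean exactly when $k > 1$.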
We thus obtain the following predictions (for k > 1):
Corollary 1. An employee’s expected rating, conditional on his true competence $x_i$, is increasing linearly in his competence, with compression towards the mean $\bar{x}$ and inflation (addition of a positive constant).
Corollary 2. Leniency Bias: The average rating in the population exceeds average competence and is increasing in signal noisiness $r$, in employee heterogeneity $h$, and in preference asymmetry $k$.
Corollary 3. The Lake Wobegon Effect: The fraction of the population rated above average competence is greater than one half and is increasing in signal noisiness $r$ and in preference asymmetry $k$, but decreasing[4] in employee heterogeneity $h$.
Corollary 4. Centrality Bias: The distribution of assigned ratings has lower variance than the underlying distribution of competence. The variance in ratings is actually decreasing with the signal noisiness $r$.
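The population-level claims in Corollaries 2 to 4 can be checked numerically. The sketch below is an illustration, not the paper's derivation: it uses the fact that, under Eq. (2), ratings are a linear function of normally distributed signals and are therefore themselves normally distributed. The function name `population_summary` and the dictionary keys are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def population_summary(x_bar, h, r, k):
    """Distribution of ratings implied by Eq. (2) across the population."""
    total_var = r**2 + h**2                    # var(y_i)
    weight = h**2 / total_var                  # shrinkage factor in Eq. (2)
    # Inflation term of Eq. (2): sqrt(2) * erfinv((k-1)/(k+1)) equals the
    # standard normal quantile at k / (k + 1).
    inflation = sqrt(r**2 * h**2 / total_var) * NormalDist().inv_cdf(k / (k + 1.0))
    # z_i = x_bar + weight * (y_i - x_bar) + inflation, with y_i ~ N(x_bar, total_var)
    ratings = NormalDist(x_bar + inflation, weight * sqrt(total_var))
    return {
        "mean_rating": ratings.mean,
        "share_above_avg_competence": 1.0 - ratings.cdf(x_bar),
        "rating_variance": ratings.variance,
    }


base = population_summary(x_bar=50.0, h=8.0, r=6.0, k=6.0)
noisier = population_summary(x_bar=50.0, h=8.0, r=9.0, k=6.0)
print(base)
print(noisier)
```

Comparing `base` with `noisier` shows the comparative statics in the noise level: a noisier signal raises the mean rating and the share rated above average competence, while lowering the variance of ratings.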
Theorem 1 and Corollaries 1–4 capture the leniency and centrality biases, as documented by Bretz et al. (1992), Bol (2011), Jawahar and Williams (1997), Moers (2005) and Prendergast (1999). These results also accord with additional empirical findings characterizing these effects. For example, Corollary 2 indicates that leniency bias depends on the noise in the performance signal provided by the employee, as documented by Bol (2011). Likewise, the dependence of leniency bias on $k$ matches the empirical finding that leniency bias increases with the strength of the manager–employee relationship (Bol, 2011; Jawahar & Williams, 1997; Lawler, 1990; Murphy & Cleveland, 1991).
Note that leniency bias is not simply a trivial implication of this preference asymmetry. We could construct a bimodal distribution of competence, with low-competence types more common than high-competence types, such that for a sufficiently noisy signal the average rating would be below average competence, despite the aversion to unfairly low ratings. Of course, such a peculiar distribution of competence would have no empirical basis.
Example. To illustrate how ratings become compressed and inflated despite the manager’s preference for an accurate evaluation, we provide a numerical example with convenient parameter values. We take the average competence in the population to be $\bar{x} = 50$ and the distribution to have standard deviation $h = 8$. We consider the manager, Alice, to be using a noisy performance measure with standard deviation $r = 6$ and to have an aversion to undeservedly low ratings (relative to undeservedly high ones) captured by $k = 6$. Alice would prefer her ratings of her employees to match their competence levels, but she does not know their actual competences, and she considers it six times worse to underrate than to overrate. Suppose she observes one employee, Barry, and his performance appears to reflect a competence of $y_{\text{Barry}} = 60$. Barry appears to be a good worker: Alice finds his performance signal to be one standard deviation above the average. Suppose another employee, Bob, appears to be a poor worker, with $y_{\text{Bob}} = 40$. Clearly, Alice judges Barry to be better than Bob. But how much of Barry’s strong performance (and Bob’s weak one) can she attribute to his competence as opposed to good (or, in Bob’s case, bad) luck? And, moreover, what if she is wrong?
[4] While leniency bias is increasing in employee heterogeneity, so is the distance of a below-average employee from average competence ratings. Intuitively, with greater variance in competence, less of the population should be able to make the jump to above average.
As a Bayesian, Alice knows the best estimate (given the standard deviations above) is to attribute 36% of the variation in signals to noise, leaving 64% to be explained by differences in competence. Maybe she caught Barry on a good day and Bob on a bad day. Her best estimate of Barry’s competence would be 56.4 and Bob’s would be 43.6. Both of these estimates are compressed towards the mean, 50. But Alice does not use only her best estimate in order to determine a rating. It is possible her best estimate is too high or too low. If it is too high, that is bad, but if it is too low, that is much worse. She would like to decrease the chance that she underrates her employees even though that means increasing the chance that she overrates them. To maximize her expected utility, she inflates each estimate by 5.1, rating Barry at $z_{\text{Barry}} \approx 61.5$ and Bob at $z_{\text{Bob}} \approx 48.7$. That is, even though it’s more likely that noise helped Barry rather than hurt him, Alice considers both scenarios possible and is concerned enough about the latter scenario that she issues a rating even better than the signal that she observed. But Bob’s rating is boosted even higher, relative to his performance signal, because it is probable that his signal did not do him justice.
For symmetry, we chose Barry to generate a performance signal one standard deviation above the average and Bob one standard deviation below. We can now observe leniency bias in their ratings. While their (expected) average competence is 50, the average of their ratings is 55.1. We can also observe centrality bias in their ratings. Their ratings differ from the average rating by 6.4 in each direction. This is less than one standard deviation in the distribution of competence, which we took to be 8. If Alice could somehow introduce a better accounting system and reduce the noise in her performance measure, she would be able to reduce the leniency bias and the centrality bias distorting her evaluations. As we will see in the subsequent section, this would improve employee performance and firm productivity.
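The arithmetic of this example can be reproduced with a short script. This is a sketch using Python's standard library; the variable names are illustrative, and the inflation term is computed from the normal quantile at k/(k+1), which is algebraically the same as the inverse error function expression in Eq. (2).

```python
from math import sqrt
from statistics import NormalDist

x_bar, h, r, k = 50.0, 8.0, 6.0, 6.0            # parameter values from the example

weight = h**2 / (r**2 + h**2)                    # share of signal variation attributed to competence
posterior_sd = sqrt(r**2 * h**2 / (r**2 + h**2))
inflation = posterior_sd * NormalDist().inv_cdf(k / (k + 1.0))

signals = {"Barry": 60.0, "Bob": 40.0}
estimates = {name: x_bar + (y - x_bar) * weight for name, y in signals.items()}
ratings = {name: est + inflation for name, est in estimates.items()}

for name in signals:
    print(f"{name}: estimate {estimates[name]:.1f}, rating {ratings[name]:.1f}")

average_rating = sum(ratings.values()) / len(ratings)
print(f"weight {weight:.2f}, inflation {inflation:.1f}, average rating {average_rating:.1f}")
```

Running it should reproduce the figures in the text: best estimates of 56.4 and 43.6, an inflation of about 5.1, and ratings of roughly 61.5 and 48.7 that average about 55.1.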
5. Incentive contracts
When offering an incentive contract with compensation based on a manager’s subjective evaluation, a sophisticated employer accounts for the manager’s biased ratings. The employer (the principal) utilizes an incentive contract to align the incentives of an employee (the agent) facing moral hazard in deciding how much effort to exert on the job. When effort is not observable objectively, it is not directly contractible, and an additional agent (the manager) may be tasked with evaluating the employee’s performance. The employment contract may specify that pay depends on this subjective evaluation,[5] and this evaluation may also inform the employer about ongoing training and development and hiring needs, thereby directly contributing to organizational capital.[6] Employees may vary in their ability, and it may be impossible to distinguish ability from effort generally.
Now, suppose employee $i$’s competence $x_i \in \mathbb{R}$ is the (weighted) sum of ability and effort, $x_i = a_i + q e_i$. Ability is normally distributed in the population with mean $\bar{a}$ and variance $h^2$, i.e., $a_i \sim N(\bar{a}, h^2)$. Effort $e_i \in \mathbb{R}$ is a choice variable for the employee. As in Section 3, the manager observes a noisy signal of the employee’s performance $y_i \sim N(x_i, r^2)$ and issues a rating $z_i \in \mathbb{R}$ to maximize her own utility.
The rating $z_i$ is a contractible performance measure, whereas neither competence nor effort is contractible, and their effect on total firm value is too diffuse to be a useful measure.[7] A contract between the employer and the employee will specify the wage as a function of the manager’s rating, $w_i = f(z_i)$. For illustration, we suppose the value to the firm of the employee’s contributions is $V(x_i) = e^{k x_i}$. We adopt this exponential functional form because it seems reasonable that value created is convex in competence, as the marginal productivity of effort should be increasing in ability, and because it guarantees that the employee’s value is positive. We also suppose there is loss in organizational capital from an inaccurate performance evaluation. This loss is increasing in the magnitude of the error, captured as $L(|z_i - x_i|)$ for some “well-behaved”[8] increasing function $L$. The employer’s profit is then $P = V(x_i) - L(|z_i - x_i|) - w_i$.
Employee utility is assumed to be an additively separable function of wealth and effort exertion. We consider risk averse employees with Bernoulli utility for wealth $\ln(w_i)$ satisfying constant relative risk aversion. The cost of effort $c(e)$ is increasing and convex, with $\lim_{e \to e_{\min}} c'(e) = 0$, $\lim_{e \to e_{\max}} c'(e) = \infty$, and $c''(e) > 0$ for all $e$.[9] Thus, $U_E = \ln(w_i) - c(e_i)$. We suppose the manager is intrinsically motivated to do a good job (Likert, 1961), i.e., to report an accurate evaluation, but is also altruistic towards the employee. (As discussed earlier, there could be many reasons for the asymmetry in aversion to unfair ratings, from the desire to boost employee morale to the desire to avoid conflict, but for the sake of parsimony we focus on the manager’s sympathy for the employee.) We have $U_M = -|z_i - x_i| + g\,U_E$, where $g > 0$ indicates the degree of altruism (relative to the degree of intrinsic motivation).
We assume the employee (and obviously the employer as well) does not know his own ability when agreeing to a contract (as in Tsoulouhas & Marinakis, 2007). The employee then discovers his type after agreeing to a contract, but before deciding how much effort to exert on the job. If agents knew their type before agreeing to a contract, there would be adverse selection in choosing from a menu of contracts, with low-ability types trying to imitate high-ability types and high-ability types trying to distinguish themselves (Bhattacharaya & Guasch, 1988; Levy & Vukina, 2002; O’Keeffe, Viscusi, & Zeckhauser, 1984; Riis, 2010). It would be efficient for employees to sort themselves and avoid exposure to the uncertainty surrounding their true ability, and we expect employees would obtain credentials to signal their ability (Lazear & Rosen, 1981). As we are interested in retaining heterogeneity in employee performance, we consider the case in which sorting contracts by ability is impossible. While we would generally assume that agents know their own type when one’s type determines one’s preferences, in models in which one’s type refers to one’s quality, it is quite reasonable to assume a lack of self-knowledge (Kruger & Dunning, 1999).

[5] Details about the implementation of an optimal contract, i.e., whether it may be incomplete or implicit, are beyond our scope.
[6] See Prescott and Visscher (1980) and Jovanovic (1979), seminal works modeling organizational capital derived from employee job fit.
[7] See Section 2 for a discussion of the limitations of objective measures of performance in employee appraisal. Additionally, for analysis demonstrating the inefficiency of constructing incentives based on total firm value, see Feltham and Xie (1994).
[8] For tractability we impose the technical condition that $L$ increases without bound, but the convolution of $L(|\cdot|)$ with a normal distribution always exists. We could allow $L$ to asymptote, but then we would need to restrict some other parameters (e.g., taking $g$ (introduced later) to be small enough or the cost function $c$ to be growing quickly enough) in order to guarantee that employer’s profit has a well-defined maximum.
[9] We can think of the cost of effort as net of intrinsic motivation, but by assuming utility is additively separable in wealth and effort, we would then be disregarding the possibility that monetary incentives might crowd out intrinsic motivation, as suggested by Deci (1972), Benabou and Tirole (2006), Gneezy, Meier, and Rey-Biel (2011), Heyman and Ariely (2004), to name a few.
We consider contracts of the form $w_i = a e^{b z_i}$ with $a \ge 0$ and $b \ge 0$. An exponential contract of this form would be optimal if the manager was observing perfect noiseless signals of employee performance (Edmans & Gabaix, 2011). When employees cannot predict their eventual compensation precisely due to noisy performance signals, some such functional form assumption is necessary for tractability. The common linear contract is only appropriate given rigid assumptions about the employee’s utility from money (Holmstrom & Milgrom, 1987), which we find less reasonable for many reasons. Structuring compensation through a promotion tournament or with stock options leads to convex, not linear, incentives. Murphy (1999) argues that a log-linear relationship between compensation and performance (i.e., an exponential contract) is empirically more relevant (Edmans & Gabaix, 2011), as it is a percentage change in pay, not an absolute change, that is best correlated with a percentage change in firm value. Also, in our setting a linear contract would allow for unbounded negative wages.¹⁰ Given this exponential contract form, the manager’s utility $U_M$ can, after a positive linear transformation, be expressed as in Eq. (1) with $k = \frac{1+gb}{1-gb}$. Of course, it remains for us to show that in equilibrium $gb < 1$.
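As a brief aside, the following algebra uses only the expression for $k$ given here and the argument $\frac{k-1}{k+1}$ of $\mathrm{erf}^{-1}$ that appears in the proofs in Appendix A. It is a worked check, not part of the original derivation, of why the leniency term in the rating function can be written directly in terms of $gb$:

```latex
\[
\frac{k-1}{k+1}
= \frac{\frac{1+gb}{1-gb} - 1}{\frac{1+gb}{1-gb} + 1}
= \frac{(1+gb) - (1-gb)}{(1+gb) + (1-gb)}
= \frac{2gb}{2}
= gb .
\]
```

In particular, $k = 1$ (symmetric preferences over rating errors) when $gb = 0$, $k > 1$ exactly when $gb > 0$, and $k$ stays finite only while $gb < 1$.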
We suppose that the labor market is competitive and
there is just a single employer. In equilibrium employees
are indifferent between accepting the contract or taking
an outside option with utility normalized to 0. The employer offers the contract that maximizes its profit subject to this constraint.
Theorem 2. In equilibrium, restricting to contracts of the form $w_i = a e^{b z_i}$, the employer offers (and the employee accepts) a contract with

\[
b = \arg\max_{\hat{b} \ge 0} \left\{ \exp\!\left[ \tfrac{1}{2} k^2 h^2 + k \left( \bar{a} + q\,(c')^{-1}\!\left( \hat{b} q \frac{h^2}{r^2 + h^2} \right) \right) \right] - \exp\!\left[ \tfrac{1}{2} \hat{b}^2 \frac{h^4}{r^2 + h^2} + c\!\left( (c')^{-1}\!\left( \hat{b} q \frac{h^2}{r^2 + h^2} \right) \right) \right] - E[L(|D|)] \right\} \tag{3}
\]

(introducing the random variable $D \sim N\!\left( \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(g\hat{b}),\ \frac{r^2 h^2}{r^2 + h^2} \right)$ in Eq. (3) above) and

\[
a = \exp\!\left[ c(e^*) - b \left( \bar{a} + q e^* + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb) \right) \right]. \tag{4}
\]

All employees exert effort $e^* = (c')^{-1}\!\left( b q \frac{h^2}{r^2 + h^2} \right)$. The optimal effort level is independent of ability. Employee competence is normally distributed, $x_i \sim N(\bar{x}, h^2)$, with mean $\bar{x} = \bar{a} + q e^*$. The manager’s rating function for determining $z_i$ is given by

\[
f(y_i) = \bar{x} + (y_i - \bar{x}) \frac{h^2}{r^2 + h^2} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb), \tag{5}
\]

in accordance with Eq. (2) from Theorem 1.
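To make the rating function in Eq. (5) concrete, here is a minimal Python sketch. The function names `inv_erf` and `rating` and all parameter values are illustrative assumptions rather than anything specified in the paper; the inverse error function is obtained from the standard normal quantile, since the Python standard library has no direct `erfinv`.

```python
from math import sqrt
from statistics import NormalDist


def inv_erf(u: float) -> float:
    """Inverse error function on (-1, 1), via the standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + u) / 2.0) / sqrt(2.0)


def rating(y: float, x_bar: float, h: float, r: float, g: float, b: float) -> float:
    """Rating z = f(y) as written in Eq. (5)."""
    if not 0.0 <= g * b < 1.0:
        raise ValueError("Eq. (5) requires 0 <= g*b < 1")
    weight = h**2 / (r**2 + h**2)            # compression toward x_bar
    post_var = r**2 * h**2 / (r**2 + h**2)   # posterior variance of competence
    leniency = sqrt(2.0 * post_var) * inv_erf(g * b)
    return x_bar + (y - x_bar) * weight + leniency


if __name__ == "__main__":
    # Hypothetical values: a signal two units above mean competence.
    for b in (0.0, 0.5, 1.0):
        print(b, rating(y=12.0, x_bar=10.0, h=1.0, r=1.0, g=0.5, b=b))
```

With $b = 0$ the leniency term vanishes and only the compression toward $\bar{x}$ remains; raising $b$ (with $g > 0$) shifts every rating upward by the same amount.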
The proof, which relies on backward induction, is in the
appendix. The fact that optimal effort is independent of
ability may be surprising, considering that ability and ef-
fort interact here. However, it turns out that the employ-
ee’s marginal utility of increased wages from increased
effort is constant, so the optimal effort level is the same
for all employees, depending only on the marginal disutil-
ity of effort.
Theorem 2 has many implications that accord with sub-
stantial empirical evidence. For example, it is well known
that managers are more likely to distort their evaluations
when money is on the line (Jawahar & Williams, 1997;
Landy & Farr, 1980; Murphy & Cleveland, 1991). Indeed,
the assumptions of Theorem 2 do not pin down whether
in fact pay will depend on performance (b > 0) or not
(b = 0) – it depends on parameter values – but the theorem
does imply that subjective evaluations will be inflated
when they affect wages whereas leniency bias will vanish
when pay is independent of performance.
The leniency bias and centrality bias in the manager’s
performance rating affect the contract that the employer
and employee agree upon. Centrality bias has a straightfor-
ward consequence. Knowing that the eventual evaluation
will be compressed toward mean competence, the employ-
ee has less incentive to put in effort. The employer may
mitigate this problem by offering a contract with stronger
or weaker performance incentives (a new value of b), but
the net result is lower effort, lower performance, and a
lower expected wage. In addition to the decline in wage
from poorer job performance, there is also a decline be-
cause compression of ratings decreases the variance in
the wage, and the employee then requires less compensa-
tion for exposure to risk.
Leniency bias has a somewhat more complicated effect. The direct impact is to increase the expected wage, but recognizing that there will be leniency bias, the employer and employee agree to a contract that pays the employee proportionately less, i.e., $a$ decreases with leniency bias. These two effects precisely balance out. Additionally, though, leniency bias makes performance evaluation less useful for research and employee development purposes, causing a decline in organizational capital. The employer, to the extent it cares about the employee development objective, mitigates the loss in organizational capital by decreasing the performance incentives in the employee’s contract (lower $b$), thereby reducing the altruistic manager’s desire to be lenient. The weaker performance incentives naturally lead to lower effort, worse employee performance, and a lower expected wage.
10 Additionally, an exponential contract generates a lognormal distribution of wages in our model. This too has empirical support (Lydall, 1968).
We thus obtain the following comparative statics.
Corollary 5. While b > 0:
1. The greater the manager’s altruism, the more leniency bias we should observe, and in turn, weaker performance incentives, poorer employee performance, and lower overall wages. (As $g$ increases, $E[z_i - x_i]$ increases, but $b$, $x_i$, and $w_i$ decrease.¹¹)
2. The more the employer cares about the organizational capital accruing from performance evaluation, the weaker the employee performance incentives, and in turn there is less leniency bias, poorer employee performance, and lower overall wages. (As we magnify the function $L$ (i.e., $L \to cL$ for $c > 1$), we find that $b$, $E[z_i - x_i]$, $x_i$, and $w_i$ all decrease.)
3. The more valuable the employee’s work, the stronger the performance incentives, and in turn there is more leniency bias, better employee performance, and higher overall wages. (As $k$ increases, $b$, $E[z_i - x_i]$, $x_i$, and $w_i$ all increase.)
4. Uncertainty in the performance measure exacerbates both centrality bias and leniency bias. (As $r$ increases, $E[z_i - x_i]$ increases and $\mathrm{var}(z_i)$ decreases.) Depending on parameter values, the employer may respond to a noisier performance signal by specifying stronger or weaker performance incentives in the contract. In either case, the actual incentive to exert effort is weaker (because even if the contract ramps up incentives, it does not completely counterbalance the manager’s compression of ratings), so employee performance becomes worse with noisier performance measures. (As $r$ increases, $x_i$ decreases.) In the case that performance incentives become weaker in more uncertain environments, then of course the expected wage decreases as well. (If $r$ increases and $b$ decreases, then $E[w_i]$ decreases.)
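The direct effects of signal noise in item 4 can be illustrated numerically. The sketch below evaluates the average inflation, the variance of ratings, and the marginal incentive to exert effort at several noise levels $r$ while holding the contract slope $b$ fixed. That is a simplifying assumption: in equilibrium the employer re-optimizes $b$, which this sketch does not model. All parameter values and function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def leniency_shift(h: float, r: float, g: float, b: float) -> float:
    """Average inflation E[z_i - x_i] = sqrt(2 r^2 h^2 / (r^2 + h^2)) * erfinv(g b)."""
    post_sd = sqrt(r**2 * h**2 / (r**2 + h**2))
    # sqrt(2) * erfinv(u) equals the standard normal quantile at (1 + u) / 2.
    return post_sd * NormalDist().inv_cdf((1.0 + g * b) / 2.0)


def rating_variance(h: float, r: float) -> float:
    """var(z_i) = h^4 / (r^2 + h^2)."""
    return h**4 / (r**2 + h**2)


def effort_incentive(h: float, r: float, q: float, b: float) -> float:
    """Marginal benefit of effort, b q h^2 / (r^2 + h^2)."""
    return b * q * h**2 / (r**2 + h**2)


H, G, B, Q = 1.0, 0.5, 1.0, 1.0          # hypothetical parameter values
NOISE_LEVELS = [0.25, 0.5, 1.0, 2.0]     # increasing signal noise r

table = [
    (r, leniency_shift(H, r, G, B), rating_variance(H, r), effort_incentive(H, r, Q, B))
    for r in NOISE_LEVELS
]

if __name__ == "__main__":
    for r, shift, var_z, incentive in table:
        print(f"r={r:<5} inflation={shift:.4f} var(z)={var_z:.4f} incentive={incentive:.4f}")
```

Reading down the rows, inflation rises while the spread of ratings and the effort incentive fall, which is the fixed-$b$ part of the mechanism described in item 4.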
We should emphasize that we find an ambiguous relationship between uncertainty about performance and the degree of pay-for-performance in employee compensation. Whereas the traditional model identifies a negative relationship due to the tradeoff between incentivizing the employee and exposing him to risk, we might obtain such a
negative relationship for the same reason or because of
the tradeoff between incentivizing the employee and dis-
torting performance evaluation (by exacerbating leniency
bias), but we might also obtain a positive relationship
due to centrality bias – in more uncertain environments
it takes stronger incentives to get the employee to put in
even close to the same level of effort. Indeed, various
empirical studies have found a positive relationship, a neg-
ative relationship, or the absence of any clear relationship
between uncertainty and incentives, depending on the do-
main (see Prendergast, 2002).
Some of the comparative statics that we obtain are
straightforward. It is not surprising that more valuable
employees would be offered stronger incentives and that
these incentives would improve performance. We also
identify a tradeoff for the employer between using the
evaluation to incentivize the employee to perform or get-
ting more accurate evaluations to inform human resources
management. This tradeoff has natural consequences,
although we suspect that if we made explicit how this
organizational capital contributed to firm value (instead
of treating the mechanism as a black box), we might well
have found that employee performance actually improves
the more the employer values organizational capital.
Our ?nding that altruism on the manager’s part has the
perverse effects of hurting employee performance and
decreasing wages is counterintuitive. It should be
acknowledged that the manager’s altruism does not make
the employee worse off in utility terms. Market forces
drive employees’ expected utility to that of their outside
options, regardless of altruism. We would chalk up to
the law of unintended consequences that the manager’s
desire to help the employee turns out, in equilibrium, to
make the employee no better off and to actually harm
the employer.
Of all the comparative statics described in Corollary 5,
the pernicious effects of performance measure uncertainty
are especially worth recognizing. By exacerbating the
biases that plague subjective performance evaluation,
noise in the performance signal not only costs the employ-
er a direct loss in organizational capital from lost informa-
tion, but also demotivates employees leading to poorer
performance and a loss in productivity. Coming up with
better subjective performance measures would thus
directly, and indirectly, benefit the firm.
6. Conclusion
Noise in the performance signal and a stronger aversion to unfairly low ratings than to overly high ones together bring a manager to inflate performance evaluation ratings. We need not assume the manager desires an inflated profile of ratings. The manager may well wish to have accurate, unbiased ratings, but if noisy signals are inevitable, the manager may still introduce an upward bias to counteract the inherent imprecision of the performance signal. Experimental evidence that asymmetric fairness considerations boost performance evaluations provides some degree of corroboration for our proposed model (Bol & Smith, 2011). Further support comes from a field study finding that higher information gathering costs in conjunction with strong employee–manager relationships increase performance evaluation bias (Bol, 2011). In our model both the amount of noise in measuring performance and the degree of asymmetry in preferences over ratings error contribute to the size of the bias in the manager’s selfishly optimal rating. This suggests that extreme leniency in performance evaluation can be mitigated by defining more concrete, unambiguous evaluation criteria.¹²
11 When we say that $x_i$ or $w_i$ increases (decreases), we mean that the distribution (of $x_i$ or $w_i$ respectively) shifts so that the new distribution first-order stochastically dominates (is dominated by) the old one.
12 Indeed, a due-process performance appraisal system can reduce the ambiguity in performance measures and thereby decrease leniency bias (Taylor, Tracy, Renard, Harrison, & Carroll, 1995).
While an objectively measurable performance signal
may occasionally determine compensation directly, often
a subjective judgment of employee performance must
be made (Baker et al., 1994). In such cases, a manager’s
incentives will influence the rating given to the employee
(Prendergast & Topel, 1993). An incentive compatible
mechanism for the manager to report unbiased
performance signals must go beyond simply creating a
preference for accurate ratings. A manager with other-
regarding preferences will still introduce bias into the
performance evaluation to the extent that the perfor-
mance signal is imprecise and noisy and the extent that
the evaluation will affect compensation. The biases
introduced into a subjective performance evaluation can
be accounted for by a sophisticated employer and
employee and thus affect the contract determining
compensation. In turn, leniency bias and centrality bias
end up demotivating the employee, leading to poorer
performance and lower profit.
The framework we present highlights the relevance of
two non-standard variables in subjective performance
evaluation. Ratings are sensitive to both the relationship between the rater and the employee and the purpose of the evaluation itself. Higher altruism for the employee generates increased inflation. Likewise, ratings that affect the employee’s welfare (such as those that determine wages) will be more inflated than ratings that serve some other informational purpose for the firm (such as those for research or employee development). While these
variables are largely ignored in most theoretical work
on performance evaluation and contracting, they do
impact actual evaluations.
Our framework also emphasizes the role of noise in the performance signal, as a determinant of performance evaluation distortion. Instead of merely adding variance to the manager’s estimates of the employee’s competence, dispersion in the signal generates both systematic inflation and compression. One effective way to combat these distortions is thus simply to reduce the noise in the performance signals. Raters who are certain of the true value of the employee’s competence will not inflate or compress their ratings and will provide the firm with accurate performance evaluations.
Our first comparative static in Corollary 5 raises the possibility that firms might try to hire managers who specifically are less altruistic in order to reduce leniency bias. We caution that this comparative static relies on managers having a fixed intrinsic motivation to do a good job, but we have no reason to believe that altruism and workplace conscientiousness are uncorrelated traits. We might perhaps recommend firms look for conscientious managers, but we expect they already do so. In the other direction, of course, malicious managers with utility decreasing in employee compensation would also produce biased evaluations (deflation of ratings rather than inflation), so these types should indeed be avoided. We are most comfortable in recommending the development of more precise subjective performance evaluation systems, acknowledging altruism as inevitable. Setting clear guidelines for managers and carefully specified criteria for employees could reduce noise in evaluations and thus mitigate performance evaluation biases.
Although we have focused on the simple case where
managers provide a single subjective rating of the
employee’s competence, the evaluation biases we study
apply to more complex domains as well. For example,
settings in which managers are able to specify the
performance measures that will objectively determine
employee evaluation and compensation are also suscep-
tible to leniency and centrality biases. Managers would
systematically bias not the rating itself, but the weights
placed on the measures that determine the rating.
Indeed Ittner, Larcker, and Meyer (2003) document
leniency bias for subjectively weighted balanced score-
card measures.
The formal model we develop could also be applied
outside the context of employee performance evaluations.
In a laboratory study of motivated communication, there is more inflation of reported evaluations when there is greater uncertainty about the true value of a noisy variable (Schweitzer & Hsee, 2002). Moving outside of the lab, financial analysts exhibit leniency bias when rating securities (Michaely & Womack, 1999), and this bias is known to be increasing in the uncertainty of earnings forecasts (Ackert & Athanassakos, 1997; Das, Levine, & Sivaramakrishnan, 1998). Our model is consistent with these findings, though certainly not conclusive as we have less intuition supporting fairness as the primary motive of financial analysts (Fischer & Verrecchia, 2000). Jurors also exhibit a leniency bias when reaching
unanimous verdicts, as compared with solitary decisions,
with a reasonable-doubt standard of proof, but not with
a preponderance-of-evidence standard (MacCoun &
Kerr, 1988). Exposure to contrasting opinions during
group deliberation might increase uncertainty about the
correct verdict, and if we associate a reasonable-doubt
standard with a stronger aversion to convicting an
innocent person than to acquitting a guilty one and
a preponderance-of-evidence standard to symmetric
fairness preferences, then our model predicts leniency
bias here as well.
Appendix A
A.1. Proof of Theorem 1
Straightforward application of Bayes’ Law yields the posterior density

\[
p(x_i \mid y_i) \sim N\!\left( \bar{x} + (y_i - \bar{x}) \frac{h^2}{r^2 + h^2},\ \frac{r^2 h^2}{r^2 + h^2} \right)
\]

(see Gelman, Carlin, Stern, & Rubin (2004), p. 46). For a given $y_i$, the expected utility from rating $z_i$ is

\[
E[U(z_i)] = \int_{-\infty}^{z_i} -(z_i - x_i)\, p(x_i \mid y_i)\, dx_i + \int_{z_i}^{\infty} -k (x_i - z_i)\, p(x_i \mid y_i)\, dx_i .
\]

The first order condition for a selfishly optimal rating is then

\[
\frac{d}{dz_i} E[U(z_i)] = -\int_{-\infty}^{z_i} p(x_i \mid y_i)\, dx_i + k \int_{z_i}^{\infty} p(x_i \mid y_i)\, dx_i
= k - (k + 1)\, \frac{1}{2} \left[ 1 + \mathrm{erf}\!\left( \frac{ z_i - \bar{x} - (y_i - \bar{x}) \frac{h^2}{r^2 + h^2} }{ \sqrt{ \frac{2 r^2 h^2}{r^2 + h^2} } } \right) \right] = 0 .
\]

Eq. (2) is found by inverting to solve for $z_i$. □
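The first order condition above says that the posterior probability of competence lying below the chosen rating equals $k/(k+1)$, so the selfishly optimal rating is the $k/(k+1)$ quantile of the posterior. The Python sketch below checks this numerically for one hypothetical set of parameter values, and also compares expected utility at the candidate optimum with nearby ratings using a simple midpoint-rule integral. Function names, the integration settings, and the numbers are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def inv_erf(u: float) -> float:
    """Inverse error function on (-1, 1), via the standard normal quantile."""
    return NormalDist().inv_cdf((1.0 + u) / 2.0) / sqrt(2.0)


def posterior(y: float, x_bar: float, h: float, r: float) -> NormalDist:
    """Posterior of competence x_i given signal y_i (normal prior, normal noise)."""
    mean = x_bar + (y - x_bar) * h**2 / (r**2 + h**2)
    sd = sqrt(r**2 * h**2 / (r**2 + h**2))
    return NormalDist(mean, sd)


def optimal_rating(y: float, x_bar: float, h: float, r: float, k: float) -> float:
    """Posterior mean plus the asymmetry-driven shift, as in the inverted condition."""
    post = posterior(y, x_bar, h, r)
    return post.mean + sqrt(2.0) * post.stdev * inv_erf((k - 1.0) / (k + 1.0))


def expected_utility(z: float, post: NormalDist, k: float, steps: int = 4000) -> float:
    """Midpoint-rule approximation of E[U(z)] over +/- 8 posterior standard deviations."""
    lo = post.mean - 8.0 * post.stdev
    dx = 16.0 * post.stdev / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * dx
        loss = (z - x) if z >= x else k * (x - z)   # unfavorable errors weighted by k
        total -= loss * post.pdf(x) * dx
    return total


Y, X_BAR, H, R, K = 12.0, 10.0, 1.0, 1.0, 3.0   # hypothetical values
post = posterior(Y, X_BAR, H, R)
z_star = optimal_rating(Y, X_BAR, H, R, K)

if __name__ == "__main__":
    print("optimal rating:", z_star)
    print("posterior CDF at optimum:", post.cdf(z_star), "target:", K / (K + 1.0))
```

Viewing the rating as a posterior quantile also makes the two necessary conditions visible: with $k = 1$ the quantile is the median (no inflation), and with no posterior uncertainty every quantile coincides with the mean.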
A.2. Proof of Corollaries 1–4
1. Integrate over the performance signal to find that an employee’s expected rating conditional on his true competence $x_i$ is

\[
E[f(y_i) \mid x_i] = \bar{x} + (x_i - \bar{x}) \frac{h^2}{r^2 + h^2} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}\!\left( \frac{k - 1}{k + 1} \right).
\]

2. Integrating over the performance signal and the competence level reveals that the average rating in the population is $\bar{x} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}\!\left( \frac{k - 1}{k + 1} \right)$.
3. As there is a normal distribution of performance signals across the population, the fraction of employees rated above average competence is

\[
\Pr(z_i > \bar{x}) = \Phi\!\left( \sqrt{2}\, \frac{r}{h}\, \mathrm{erf}^{-1}\!\left( \frac{k - 1}{k + 1} \right) \right) = \frac{1}{2} \left[ 1 + \mathrm{erf}\!\left( \frac{r}{h}\, \mathrm{erf}^{-1}\!\left( \frac{k - 1}{k + 1} \right) \right) \right].
\]

4. The variance in ratings is $\mathrm{var}(z_i) = \frac{h^4}{r^2 + h^2} < h^2$. □
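The four expressions above can be checked by simulation. The following Python sketch draws competence and signals from the normal distributions assumed in the model, applies the rating function, and compares sample moments with the closed forms. The sample size, seed, parameter values, and function names are all illustrative assumptions; with $r = h$ the predicted fraction rated above average reduces to $k/(k+1)$.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulate_ratings(n: int, x_bar: float, h: float, r: float, k: float, seed: int = 0):
    """Draw (x_i, y_i) pairs and return the ratings z_i = f(y_i) plus the leniency shift."""
    rng = random.Random(seed)
    weight = h**2 / (r**2 + h**2)
    post_sd = sqrt(r**2 * h**2 / (r**2 + h**2))
    # sqrt(2) * erfinv((k-1)/(k+1)) equals the standard normal quantile at k/(k+1).
    shift = post_sd * NormalDist().inv_cdf(k / (k + 1.0))
    ratings = []
    for _ in range(n):
        x = rng.gauss(x_bar, h)      # true competence
        y = rng.gauss(x, r)          # noisy performance signal
        ratings.append(x_bar + (y - x_bar) * weight + shift)
    return ratings, shift


X_BAR, H, R, K = 0.0, 1.0, 1.0, 3.0   # hypothetical values
ratings, shift = simulate_ratings(100_000, X_BAR, H, R, K)

mean_rating = fmean(ratings)
var_rating = fmean((z - mean_rating) ** 2 for z in ratings)
frac_above = sum(z > X_BAR for z in ratings) / len(ratings)

predicted_mean = X_BAR + shift
predicted_var = H**4 / (R**2 + H**2)
predicted_frac = NormalDist().cdf((R / H) * NormalDist().inv_cdf(K / (K + 1.0)))

if __name__ == "__main__":
    print("mean rating:", mean_rating, "predicted:", predicted_mean)
    print("variance:", var_rating, "predicted:", predicted_var)
    print("fraction above average:", frac_above, "predicted:", predicted_frac)
```

The simulated variance falling below $h^2$ is the compression result, and a fraction above one half rated better than average is the inflation result.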
A.3. Proof of Theorem 2
Given that $x_i \sim N(\bar{x}, h^2)$ and $b < \frac{1}{g}$, we obtain the manager’s rating function in Theorem 1 with $k = \frac{1+gb}{1-gb}$, which is explicitly given by Eq. (5). (If $b \ge \frac{1}{g}$, then the manager would give every employee the maximal (infinite) rating. This obviously does not take place.)
Given the contract parameters and the manager’s rating function, an employee with ability $a_i$ chooses effort $e_i$ (and thus competence $x_i = a_i + q e_i$) to maximize $E[\ln(a e^{b f(y_i)})] - c(e_i)$, where $y_i$ is of course a stochastic function with mean $x_i$. The first order condition is then $b q \frac{h^2}{r^2 + h^2} = c'(e_i)$. The convexity of $c(\cdot)$ implies this is indeed a maximum of expected utility, and the range of $c'(\cdot)$ from 0 to $\infty$ guarantees that there is a solution: $e_i = (c')^{-1}\!\left( b q \frac{h^2}{r^2 + h^2} \right)$ for all $i$. We denote this equilibrium effort level $e^*$. Because ability is normally distributed across employees and all employees choose the same effort level regardless of ability, competence is then also normally distributed.
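A small numerical illustration of the first order condition may help. The sketch below assumes a quadratic effort cost $c(e) = e^2/2$, which is a hypothetical choice made only so that $(c')^{-1}(t) = t$; the paper leaves $c$ general. It maximizes the employee's expected utility over a grid of effort levels for several abilities and compares the result with the closed form $b q \frac{h^2}{r^2 + h^2}$. The constant `shift` stands in for the leniency term, which does not depend on effort, and all numbers are illustrative.

```python
from math import log


def expected_utility(e, ability, *, a, b, q, h, r, x_bar, shift):
    """E[ln(a * exp(b * f(y)))] - c(e), with the hypothetical cost c(e) = e^2 / 2."""
    weight = h**2 / (r**2 + h**2)
    x = ability + q * e                                   # competence
    expected_rating = x_bar + (x - x_bar) * weight + shift
    return log(a) + b * expected_rating - 0.5 * e**2


def best_effort_on_grid(ability, *, step=0.001, e_max=2.0, **params):
    """Brute-force maximizer over effort levels 0, step, 2*step, ..., e_max."""
    grid = [j * step for j in range(round(e_max / step) + 1)]
    return max(grid, key=lambda e: expected_utility(e, ability, **params))


# Hypothetical contract and model parameters.
PARAMS = dict(a=1.0, b=0.8, q=1.0, h=1.0, r=1.0, x_bar=0.0, shift=0.3)

closed_form = PARAMS["b"] * PARAMS["q"] * PARAMS["h"] ** 2 / (PARAMS["r"] ** 2 + PARAMS["h"] ** 2)
efforts = {ability: best_effort_on_grid(ability, **PARAMS) for ability in (-1.0, 0.0, 2.0)}

if __name__ == "__main__":
    print("closed form effort:", closed_form)
    for ability, effort in efforts.items():
        print("ability", ability, "-> grid-optimal effort", effort)
```

Ability enters expected utility only through an additive term, so it shifts the level of utility without moving the maximizer, which is why the grid search returns the same effort for every ability.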
An employee’s expected utility if he agrees to the contract (not knowing his own ability) is

\[
U_E = \ln\!\left( a \exp\!\left[ b \left( \bar{x} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb) \right) \right] \right) - c(e^*),
\]

after averaging over his ability and the signal the manager receives. In a competitive labor market with an outside option yielding 0 utility, $U_E = 0$. Thus,

\[
\ln(a) + b \left( \bar{x} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb) \right) - c(e^*) = 0 .
\]

Solving for $a$ yields Eq. (4).
The employer determines $b$ (and implicitly $a$) to maximize expected profit. We integrate with respect to the cumulative distribution functions $P_{y|x}(y_i \mid x_i)$ and $P_a(a_i)$ for the manager’s signal given employee competence and for employee ability to find expected profit:

\[
\begin{aligned}
P(b) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \Bigg[ e^{k (a_i + q e^*(b))} - a(b) \exp\!\left[ b \left( (y_i - \bar{x}) \frac{h^2}{r^2 + h^2} + \bar{x} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb) \right) \right] \\
&\qquad - L\!\left( \left| (y_i - \bar{x}) \frac{h^2}{r^2 + h^2} + \bar{x} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb) - (a_i + q e^*(b)) \right| \right) \Bigg] \, dP_{y|x}(y_i \mid a_i + q e^*)\, dP_a(a_i) \\
&= e^{\frac{1}{2} k^2 h^2 + k (\bar{a} + q e^*(b))} - a(b) \exp\!\left[ \tfrac{1}{2} b^2 \frac{h^4}{r^2 + h^2} + b \left( \bar{x} + \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb) \right) \right] - E[L(|D|)],
\end{aligned}
\]

where we have introduced the random variable $D \sim N\!\left( \sqrt{\frac{2 r^2 h^2}{r^2 + h^2}}\, \mathrm{erf}^{-1}(gb),\ \frac{r^2 h^2}{r^2 + h^2} \right)$. Plugging in for the functions $e^*(b)$ and $a(b)$, we have

\[
P(b) = \exp\!\left[ \tfrac{1}{2} k^2 h^2 + k \left( \bar{a} + q\,(c')^{-1}\!\left( b q \frac{h^2}{r^2 + h^2} \right) \right) \right] - \exp\!\left[ \tfrac{1}{2} b^2 \frac{h^4}{r^2 + h^2} + c\!\left( (c')^{-1}\!\left( b q \frac{h^2}{r^2 + h^2} \right) \right) \right] - E[L(|D|)] .
\]

To guarantee $P(b)$ attains a maximum on $b \in \left[ 0, \frac{1}{g} \right)$, we observe that $\lim_{b \to 1/g} P(b) = -\infty$ because the expected loss in organizational capital blows up. Thus, Eq. (3) is well-defined. □
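To see how the maximization in Eq. (3) behaves, the sketch below evaluates the profit expression on a grid of contract slopes under two functional forms that are assumptions of this example, not of the paper: a quadratic effort cost $c(e) = e^2/2$ and a quadratic organizational-capital loss $L(d) = \ell d^2$, for which $E[L(|D|)] = \ell\,(E[D]^2 + \mathrm{var}(D))$ in closed form. The parameter values and the names `profit`, `best_incentive`, and `loss_scale` are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def profit(b, *, k, a_bar, q, h, r, g, loss_scale):
    """Expected profit P(b) with c(e) = e^2/2 and L(d) = loss_scale * d^2 (both assumed)."""
    weight = h**2 / (r**2 + h**2)
    post_var = r**2 * h**2 / (r**2 + h**2)
    effort = b * q * weight                    # (c')^{-1}(t) = t for quadratic cost
    cost = 0.5 * effort**2
    output = exp(0.5 * k**2 * h**2 + k * (a_bar + q * effort))
    wage_bill = exp(0.5 * b**2 * h**4 / (r**2 + h**2) + cost)
    # Mean of D: sqrt(2 * post_var) * erfinv(g b), written via the normal quantile.
    mean_d = sqrt(post_var) * NormalDist().inv_cdf((1.0 + g * b) / 2.0)
    capital_loss = loss_scale * (mean_d**2 + post_var)
    return output - wage_bill - capital_loss


def best_incentive(n_grid=2000, **params):
    """Grid search for the profit-maximizing b on 0 <= b < 1/g."""
    g = params["g"]
    grid = [j / (n_grid * g) for j in range(n_grid)]
    return max(grid, key=lambda b: profit(b, **params))


PARAMS = dict(k=1.0, a_bar=0.0, q=1.0, h=1.0, r=1.0, g=0.5, loss_scale=1.0)
b_star = best_incentive(**PARAMS)

if __name__ == "__main__":
    print("profit-maximizing b on the grid:", b_star)
    print("profit at b*:", profit(b_star, **PARAMS))
    print("profit at b = 0:", profit(0.0, **PARAMS))
```

Because $\mathrm{erf}^{-1}(gb)$ diverges as $gb \to 1$, the loss term dominates near the upper end of the grid, which is the numerical counterpart of the limit argument in the proof.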
References
Ackert, L., & Athanassakos, G. (1997). Prior uncertainty, analyst bias, and
subsequent abnormal returns. The Journal of Financial Research, 20,
263–273.
Baiman, S., & Rajan, M. (1995). The informational advantages of
discretionary bonus schemes. The Accounting Review, 70, 557–579.
Baker, G. (1992). Incentive contracts and performance measurement.
Journal of Political Economy, 100, 598–614.
Baker, G., Gibbons, R., & Murphy, K. (1994). Subjective performance
measures in optimal incentive contracts. The Quarterly Journal of
Economics, 109, 1125–1156.
Banker, R., & Datar, S. (1989). Sensitivity, precision, and linear aggregation
of signals for performance evaluation. Journal of Accounting Research,
17, 21–39.
Bartling, B. (2011). Relative performance or team evaluation? Optimal
contracts for other-regarding agents. Journal of Economic Behavior and
Organization, 79, 183–193.
Benabou, R., & Tirole, J. (2006). Incentives and prosocial behavior.
American Economic Review, 96, 1652–1678.
Bernardin, H., Cooke, D., & Villanova, P. (2000). Conscientiousness and
agreeableness as predictors of rating leniency. Journal of Applied
Psychology, 85, 232–236.
Bhattacharaya, S., & Guasch, J. (1988). Heterogeneity, tournaments, and
hierarchies. Journal of Political Economy, 96, 867–881.
Bol, J. (2011). The determinants and performance effects of managers’
performance evaluation biases. The Accounting Review, 86,
1549–1575.
Bol, J., & Smith, S. (2011). Spillover effects in subjective performance
evaluation: Bias, fairness, and controllability. The Accounting Review,
86, 1213–1230.
Bolton, G., & Ockenfels, A. (2000). ERC: A theory of equity, reciprocity, and
competition. American Economic Review, 90, 166–193.
Bretz, R., Milkovich, G., & Read, W. (1992). The current state of
performance appraisal research and practice: Concerns, directions,
and implications. Journal of Management, 18, 321–352.
Castilla, E., & Benard, S. (2010). The paradox of meritocracy in
organizations. Administrative Science Quarterly, 55, 543–576.
Cheatham, C., Davis, D., & Cheatham, L. (1996). Hollywood profits: Gone
with the wind? The CPA Journal, 66, 32–35.
Das, S., Levine, C., & Sivaramakrishnan, K. (1998). Earnings predictability
and bias in analysts’ earnings forecasts. The Accounting Review, 73,
277–294.
Deci, E. (1972). The effects of contingent and noncontingent rewards and
controls on intrinsic motivation. Organizational Behavior and Human
Performance, 8, 217–229.
Dutta, S. (2008). Managerial expertise, private information, and pay-
performance sensitivity. Management Science, 54, 429–442.
Edmans, A., & Gabaix, X. (2011). Tractability in incentive contracting.
Working Paper.
Englmaier, F., & Wambach, A. (2010). Optimal incentive contracts under
inequality aversion. Games and Economic Behavior, 69, 312–328.
Fehr, E., Kirchler, E., Weichbold, A., & Gachter, S. (1998). When social
norms overpower competition: Gift exchange in experimental labor
markets. Journal of Labor Economics, 16, 324–351.
Fehr, E., & Schmidt, K. (1999). A theory of fairness, competition and
cooperation. The Quarterly Journal of Economics, 114, 817–868.
Fehr, E., & Schmidt, K. (2007). Adding a stick to the carrot? The interaction
of bonuses and fines. American Economic Review, 97, 177–181.
Feltham, G., & Xie, J. (1994). Performance measure congruity and diversity
in multi-task principal/agent relations. The Accounting Review, 69,
429–453.
Fischer, P., & Verrecchia, R. (2000). Reporting bias. The Accounting Review,
75, 229–245.
Gelman, A., Carlin, J., Stern, H., & Rubin, D. (2004). Bayesian data analysis
(2nd ed.). Boca Raton, FL: Chapman and Hall/CRC Press.
Gibbons, R. (2005). Incentives between firms (and within). Management
Science, 51, 2–17.
Giebe, T., & Gürtler, O. (2012). Optimal contracts for lenient supervisors.
Journal of Economic Behavior and Organization, 81, 403–420.
Gneezy, U., Meier, S., & Rey-Biel, P. (2011). When and why incentives
(don’t) work to modify behavior. Journal of Economic Perspectives, 25,
1–21.
Grund, C., & Przemeck, J. (2012). Subjective performance appraisal and
inequality aversion. Applied Economics, 44, 2149–2155.
Harris, M. (1994). Rater motivation in the performance appraisal context:
A theoretical framework. Journal of Management, 20, 737–756.
Hayes, R., & Schaefer, S. (2000). Implicit contracts and the explanatory
power of top executive compensation for future performance. RAND
Journal of Economics, 31, 273–293.
Heyman, J., & Ariely, D. (2004). Effort for payment. Psychological Science,
15, 787–793.
Holmstrom, B. (1979). Moral hazard and observability. Bell.
Holmstrom, B., & Milgrom, P. (1987). Aggregation and linearity in the
provision of intertemporal incentives. Econometrica, 55, 308–328.
Holmstrom, B., & Milgrom, P. (1991). Multitask principal-agent analyses:
Incentive contracts, asset ownership, and job design. Journal of Law,
Economics, and Organization, 7, 24–52.
Ittner, C., Larcker, D., & Meyer, M. (2003). Subjectivity and the weighting
of performance measures: Evidence from a balanced scorecard. The
Accounting Review, 78, 725–758.
Jawahar, I., & Williams, C. (1997). Where all the children are above
average: The performance appraisal purpose effect. Personnel
Psychology, 50, 905–925.
Johnson, D., Erez, A., Kiker, D., & Motowidlo, S. (2002). Liking and
attributions of motives as mediators of the relationships between
individuals’ reputations, helpful behaviors and raters’ reward
decisions. Journal of Applied Psychology, 87, 808–815.
Jovanovic, B. (1979). Firm-specific capital and turnover. Journal of Political
Economy, 87, 1246–1260.
Judge, T., & Ferris, G. (1993). Social context of performance evaluation
decisions. Academy of Management Journal, 36, 80–105.
Kane, J., Bernardin, J., Villanova, P., & Peyrefitte, J. (1995). Stability of rater
leniency: Three studies. The Academy of Management Journal, 38,
1036–1051.
Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How
difficulties in recognizing one’s own incompetence lead to inflated
self-assessments. Journal of Personality and Social Psychology, 77,
1121–1134.
Landy, F., & Farr, J. (1980). Performance rating. Psychological Bulletin, 87,
72–107.
Lawler, E. E. (1990). Strategic pay: Aligning organizational strategies and pay
systems. San Francisco: Jossey-Bass.
Lazear, E., & Rosen, S. (1981). Rank-order tournaments as optimum labor
contracts. The Journal of Political Economy, 89, 841–864.
Levy, A., & Vukina, T. (2002). Optimal linear contracts with heterogeneous
agents. European Review of Agricultural Economics, 29, 205–217.
Levy, P., & Williams, J. (2004). The social context of performance
appraisal: A review and framework for the future. Journal of
Management, 30, 881–905.
Likert, R. (1961). New patterns of management. New York: McGraw-Hill.
Longenecker, C., Sims, H., & Gioia, D. (1987). Behind the mask: The politics
of employee appraisal. Academy of Management Executive, 1, 183–
193.
Lydall, H. (1968). The structure of earnings. London: Oxford University
Press.
Maas, V., van Rinsum, M., & Towry, K. (2009). In search of informed
discretion: An experimental investigation of fairness and trust reciprocity.
Working Paper.
MacCoun, R., & Kerr, N. (1988). Asymmetric influence in mock jury
deliberation: Jurors’ bias for leniency. Journal of Personality and Social
Psychology, 54, 21–33.
Michaely, R., & Womack, K. (1999). Conflict of interest and the credibility
of underwriter analyst recommendations. Review of Financial Studies,
12, 653–686.
Moers, F. (2005). Discretion and bias in performance evaluation: The
impact of diversity and subjectivity. Accounting, Organizations and
Society, 30, 67–80.
Montgomery, J. D. (1991). Social networks and labor market outcomes.
American Economic Review, 81, 1407–1418.
Moran, J., & Morgan, J. (2003). Employee recruiting and the lake Wobegon
effect. Journal of Economic Behavior & Organization, 50, 165–182.
Murphy, K. (1992). Performance measurement and appraisal: Motivating
managers to identify and reward performance. In W. Bruns (Ed.),
Performance measurement, evaluation, and incentives (pp. 37–62).
Harvard Business School.
Murphy, K. (1999). Executive compensation. Handbook of Labor
Economics, 3, 2485–2563.
Murphy, K. (2008). Explaining the weak relationship between job
performance and ratings of job performance. Industrial and
Organizational Psychology, 1, 148–160.
Murphy, K., & Cleveland, J. (1991). Performance appraisal: An
organizational perspective. Needham Heights, MA: Allyn & Bacon.
Murphy, K., & Cleveland, J. (1995). Understanding performance appraisal.
Thousand Oaks, CA: Sage.
Murphy, K., Cleveland, J., Skattebo, A., & Kinney, T. (2004). Raters who
pursue different goals give different ratings. Journal of Applied
Psychology, 89, 158–164.
O’Keeffe, M., Viscusi, W., & Zeckhauser, R. (1984). Economic contests:
Comparative rewards scheme. Journal of Labor Economics, 2, 27–56.
Prendergast, C. (1999). The provision of incentives in firms. Journal of
Economic Literature, 37, 7–63.
Prendergast, C. (2002). Uncertainty and incentives. Journal of Labor
Economics, 20, 115–137.
Prendergast, C., & Topel, R. (1993). Discretion and bias in performance
evaluation. European Economic Review, 37, 355–365.
Prendergast, C., & Topel, R. (1996). Favoritism in organizations. Journal of
Political Economy, 104, 958–978.
Prescott, E., & Visscher, M. (1980). Organization capital. Journal of Political
Economy, 88, 446–461.
Riis, C. (2010). Efficient contests. Journal of Economics and Management
Strategy, 19, 643–665.
Saal, F. E., & Landy, F. J. (1977). The mixed standard rating scale: An
evaluation. Organizational Behavior and Human Performance, 18,
19–35.
Schweitzer, M., & Hsee, C. (2002). Stretching the truth: Elastic justification
and motivated communication of uncertain information. Journal of
Risk and Uncertainty, 25, 185–201.
Taylor, M. S., Tracy, K., Renard, M., Harrison, J. K., & Carroll, S. (1995). Due
process in performance appraisal: A quasi-experiment in procedural
justice. Administrative Science Quarterly, 40, 495–523.
Tsoulouhas, T., & Marinakis, K. (2007). Tournaments with ex post
heterogeneous agents. Economics Bulletin, 4, 1–9.
R. Golman, S. Bhatia / Accounting, Organizations and Society 37 (2012) 534–543 543
