Accounting Research Journal
A Commentary on Sample Design Issues in Behavioral Accounting Experiments
Freddie Choo and Kim Tan
Article information:
To cite this document: Freddie Choo and Kim Tan (2006), "A Commentary on Sample Design Issues in Behavioral Accounting Experiments", Accounting Research Journal, Vol. 19 Iss 2, pp. 153-158
Permanent link to this document: http://dx.doi.org/10.1108/10309610680000685
Downloaded on: 24 January 2016, At: 21:00 (PT)
Downloaded by PONDICHERRY UNIVERSITY At 21:00 24 January 2016 (PT)
A Commentary on Sample Design Issues in
Behavioral Accounting Experiments
Freddie Choo
Department of Accounting & International Business
College of Business
San Francisco State University
and
Kim Tan
Department of Accounting & Finance
College of Business
California State University Stanislaus
Abstract
Behavioral research in accounting deals with
the behavior of accountants. As such, it uses
accounting subjects. Accounting subjects are
very difficult to come by because of the nature
of the accounting environment. First,
professional accountants operate in a pressured
environment in which they have little or no time
to participate in behavioral research. Second,
professional accountants operate in an
environment of high service charges and have
little or no interest in participating in behavioral
experiments free or for a token remuneration.
Third, professional accountants are usually
inaccessible because behavioral researchers
have few or no opportunities for contacts within
a CPA firm. Finally, professional accountants
operate in the real world in which they perceive
behavioral research as too abstract to have
practical value for them to participate in. Given
the difficulties in getting accounting subjects,
behavioral researchers often lament that the
pool of available accounting subjects is very
small. As such, they cannot rely on
conventional research strategies that assume,
among other things, normal distribution and
homogeneity of variances. In this paper, we
suggest a broad range of research strategies,
including sampling, design, measurement, and
analysis, to deal specifically with a very small
pool of available accounting subjects. We cite
some prior behavioral accounting studies and
refer to some statistics textbooks deemed best
for the application of these research strategies.
Our suggestions should benefit anyone doing
behavioral research in accounting.

Acknowledgement: We are indebted to the Editor Tim
Brailsford for very helpful comments and to an anonymous
reviewer who contributed significantly to the revision of this
paper. We thank Dr. Harriet Blodgett for proofreading this
paper. We also thank the participants at the 2004 Asian
Pacific Interdisciplinary Research Conference in Singapore
for their feedback on this paper.
1. Introduction
Behavioral research in accounting deals with
the behavior of accountants. As such, it uses
accounting subjects. Accounting subjects are
defined here as professional accountants who
participate in behavioral experiments at a
prearranged time and location. Professional
accountants who participate in mail-
questionnaire surveys and accounting students
who participate in behavioral experiments are
excluded from the definition. Accounting
subjects, as defined here, are very difficult to
come by because of the nature of the accounting
environment. First, professional accountants
operate in a pressured environment in which
they have little or no time to participate in
behavioral research. Second, professional
accountants operate in an environment of high
service charges and have little or no interest in
participating in behavioral experiments free or
for a token remuneration. Third, professional
accountants are usually inaccessible because
behavioral researchers have few or no
opportunities for contacts within a CPA firm.
Finally, professional accountants operate in the
real world in which they perceive behavioral
research as too abstract to have practical value
for them to participate in. Given the difficulties
in getting accounting subjects, behavioral
researchers often lament that the pool of
available accounting subjects is very small. As
such, they cannot rely on conventional research
strategies that assume, among other things,
normal distribution and homogeneity of
variances. In this paper, we suggest a broad
range of research strategies including sampling,
design, measurement, and analysis to deal
specifically with a very small pool of available
accounting subjects. We cite some prior
behavioral accounting studies and refer to some
statistics books deemed best for the application
of these research strategies. Our suggestions
should benefit anyone doing behavioral
research in accounting.
2. Sampling
In formulating research proposals, behavioral
researchers in accounting often face the
question, “With how small a sample is it
reasonable to proceed?” We suggest a sample
size of as small as 20 but no fewer than 10
accounting subjects in a treatment group. Two
key factors, power and cost, govern our
suggestion here.
2.1 Power and Cost
Statistical power is related to Type I and Type II
errors. A Type I error (α), or false positive, is
claiming a relationship between two variables
that does not in fact exist. Conversely, a Type II
error (β), or false negative, is failing to claim a
relationship that does exist. Power is defined as
the probability of not making a Type II error, i.e.,
of not overlooking a relationship that is there;
therefore, power = 1 − β. A more detailed
discussion of statistical power is available in
Maxwell and Delaney (2000).
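The power calculation behind this trade-off can be sketched in a few lines. The following Python fragment is our illustration, not code from the literature: it uses the normal approximation to the two-sample t-test, whereas exact small-sample power (as tabulated in Cohen 1988) uses the noncentral t distribution and is somewhat lower for very small groups.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample t-test.

    d: Cohen's d (standardized difference between the group means)
    n_per_group: number of subjects in each treatment group
    """
    z_crit = Z.inv_cdf(1 - alpha / 2)      # critical value, e.g. 1.96
    z_effect = d * sqrt(n_per_group / 2)   # noncentrality (normal approx.)
    return Z.cdf(z_effect - z_crit)        # P(not making a Type II error)
```

For d = 1.0 the approximation gives roughly 0.6 at 10 subjects per group and roughly 0.9 at 20, illustrating how quickly power rises, and then plateaus, as group size grows.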
In planning research, behavioral
researchers in accounting use power to
determine the sample size needed to reach a
given alpha (α) level (e.g., a conventional p
value of 0.05) for a particular effect size that
might be expected. Effect size refers to the
strength of the relationship between two
variables, or the magnitude of an experimental
effect. Although there are many different ways
of representing an effect size, Cohen’s d (Cohen
1988) is most common because he has provided
a large number of tables that are indispensable
to behavioral researchers in accounting.
The first column in Table 1 shows the
number of accounting subjects per treatment
group (rounded sample sizes) required to detect
effect sizes (d) of 1.0, 0.70, and 0.50 at a p
value of 0.05. Columns 2, 3, and 4 in Table 1
show the power of the various effect sizes and
columns 6, 7, and 8 show their respective
incremental increases in power. Table 1 shows
that a base level of 10 accounting subjects per
treatment group can yield reasonable
statistical power at various effect sizes. Each
successive increase of 10 accounting subjects
“buys” less power. For example, in raising
sample size from 10 to 20, an accounting
researcher gains a substantial 30% in power
Table 1
Trade-Off between Power and Cost

Number of       Power*                                      Incremental Increase in Power
Accounting   ES = 0.50   ES = 0.70   ES = 1.0     Cost    ES = 0.50   ES = 0.70   ES = 1.0
Subjects
10             0.15        0.15        0.40      $1,000       –           –           –
20             0.30        0.50        0.70      $2,000      15%         35%         30%
30             0.40        0.70        0.85      $3,000      10%         20%         15%
40             0.50        0.80        0.90      $4,000      10%         10%         10%

* The relationship between the number of subjects and power is based on a two-sample t-test, using a p
value of 5% with effect sizes of 1.0, 0.70, and 0.50.
when the effect size is 1.0 (i.e., group means
separated by one standard deviation). The next
increase of 10 (20 to 30) subjects gains a smaller
15%, and the next increase of 10 (30 to 40)
subjects gains an even smaller 10%. Each
succeeding increase of 10 (40 to 50, 50 to 60,
etc.) subjects contributes very little gain in
power.
Table 1 also shows the relationship between
cost and power. If each accounting subject costs
$100, say, to recruit and process, then an
accounting researcher’s payment of $1,000 to
raise the sample size from 10 to 20 is clearly a
good decision as there is a gain of 30% in power
when the effect size is 1.0. The next $1,000 to
raise the sample size from 20 to 30 is more
questionable, as there is a gain of only 15% in
power, and successive increases are even less
cost-effective. In sum, the trade-off between
power and cost leads us to recommend a sample
size of as small as 20 but no fewer than 10
accounting subjects in a treatment group when
only a very small pool of subjects is available.
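The diminishing power-per-dollar pattern described above is easy to reproduce. The sketch below is our illustration: the $100-per-subject cost is the hypothetical figure used in the text, and the power values come from a normal approximation rather than the exact tables behind Table 1, so the magnitudes differ while the diminishing-returns pattern is the same.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def power(d, n, alpha=0.05):
    """Normal-approximation power of a two-sided, two-sample t-test."""
    return Z.cdf(d * sqrt(n / 2) - Z.inv_cdf(1 - alpha / 2))

# Each block of 10 subjects costs a hypothetical $1,000 ($100 per subject).
prev = None
for n in (10, 20, 30, 40):
    p = power(1.0, n)                      # effect size d = 1.0
    gain = "" if prev is None else f"  gain={p - prev:+.2f}"
    print(f"n={n:2d}  cost=${n * 100:,}  power={p:.2f}{gain}")
    prev = p
```

Each successive $1,000 buys a smaller increment of power, which is exactly the argument for stopping at a modest group size when subjects are scarce.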
2.2 Determining Entry Criteria
Behavioral researchers in accounting not only
have to contend with a very small sample size,
but also with accounting subjects who may
not be willing to do the experiment or be
committed to doing it appropriately; that is,
researchers face a very small subject pool of
convenience. For a very small subject pool
of convenience, the conventional wisdom is
that inclusion/exclusion entry criteria should be
strictly set because heterogeneity of response
tends to attenuate the power of statistical tests.
Thus imposition of strict entry criteria improves
the homogeneity of response, which leads to
increased power of statistical tests. But the
imposition of strict entry criteria exponentially
increases the logistical problems of behavioral
research in accounting because with fewer
subjects eligible for entry, it may be difficult to
realize a sample size of as small as 20 but no
fewer than 10 accounting subjects in a treatment
group. Thus we suggest setting rather liberal
entry criteria for a very small subject pool of
convenience, as long as the nature of the
subject-task match is reasonably good. For example,
Choo (1996) used a liberal entry criterion to
eliminate accounting subjects who blatantly
refused to follow the instructions of the
experimental task.
2.3 Assigning Subjects to Treatment
Groups
In assigning subjects to the treatment and
control groups, conventional wisdom described
in Trotman (1996, pp.76-79) is that subjects
should be randomly assigned,1 using, for
example, coin flips, random-number tables, or
shuffling of cards, and the assignment should be
as unobtrusive as possible. This procedure
assumes that the subjects will not violate
randomization in an accounting experiment. In
reality, this is never assured. For example, Choo
and Firth (1998) randomly assigned accounting
subjects to two groups: time-pressure vs. self-
pace. A subject in the time-pressure group
exceeded the time-pressure condition and thus
transferred himself to the self-paced group.
Conversely, a subject in the self-paced group
sped through the experimental task and thus
transferred himself to the time-pressure group.
In the long run, random assignment will
result in the sample sizes being near equal
(balanced) in the treatment and control groups.
However, an experiment with a very small pool
of available subjects typically does not operate
“in the long run.” In the short run, random
assignment is more likely to end with
unbalanced group sizes. This is because subjects
usually enter the study sequentially, depending
on whenever and wherever they are available to
participate in an accounting experiment. Thus,
by the time it is clear that the imbalance caused
by sequential random assignment is not just
temporary, the study may be near completion.
For example, accounting subjects entered Libby
and Kinney’s (2000) study sequentially at
various times and at various offices of a Big 5
CPA firm; and by the time the study was
completed, the sequentially and randomly
1 Random assignment is different from random sampling
in which every subject in a population has the same
chance of being included. Random sampling is seldom
possible in behavioral accounting research. Rather, a
convenience sample based on the availability of the
subjects is used. This convenience sample gets even
smaller as behavioral accounting researchers often
require subjects to have certain specialized knowledge
such as financial accounting, tax, or auditing. Unlike a
random sample that produces random errors, a
convenience sample produces systematic errors.
Careful planning of the experiment can prevent
these systematic errors. Sachs (1984) provides a
more detailed discussion of random and
systematic errors.
assigned group sizes were unbalanced at a 75-
43 split. We recommend a simple solution
suggested by Efron (1971): the first subject who
enters the study, or any subject who is assigned
when group sizes are equal, is assigned
randomly. Any subject who is assigned when
group sizes are unequal has a one-third chance
of entry into the majority group and a
two-thirds chance of entry into the minority
group. This randomization procedure
will exert a constant pressure to balance group
sizes in an experiment with a very small pool of
available subjects.
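Efron's (1971) biased-coin rule is straightforward to implement. The sketch below is our illustration, not code from the paper: when the groups are balanced, a fair coin decides; when they are unbalanced, the next subject enters the minority group with probability 2/3.

```python
import random

def efron_biased_coin(n_subjects, p_minority=2/3, rng=None):
    """Sequentially assign subjects to groups "A" and "B" using
    Efron's (1971) biased-coin design."""
    rng = rng or random.Random()
    counts = {"A": 0, "B": 0}
    assignments = []
    for _ in range(n_subjects):
        if counts["A"] == counts["B"]:
            group = rng.choice("AB")              # balanced: fair coin
        else:
            minority = min(counts, key=counts.get)
            majority = max(counts, key=counts.get)
            # biased coin: favor the minority group with probability 2/3
            group = minority if rng.random() < p_minority else majority
        counts[group] += 1
        assignments.append(group)
    return assignments, counts
```

Compared with pure randomization, this keeps the running group sizes close even when subjects trickle in sequentially, which is the constant balancing pressure the text describes.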
3. Design
For a very small pool of available subjects, we
do not recommend an endpoint design in which
an accounting researcher evaluates response
data only at the end of an experiment. This is
because the response data from a very small
number of subjects increase the variability
associated with individual differences, not only
because of true behavioral differences but also
because of differences in the ways subjects
evaluate experimental procedures, the variation
in response styles, and the effects of the
environment on perception. A recent web-based
research methodology (Reips 2000) has
enhanced the application of the endpoint design
for a very small pool of available subjects. For
example, Kadous et al. (2003) used a web-based
endpoint design to reduce the variability
associated with individual subjects’ aversion to
the researchers’ experimental conditions by
allowing the subjects easy and anonymous
withdrawal from the web-based endpoint
experimental design.
For a very small pool of available subjects,
we also do not recommend a changed design in
which an accounting researcher evaluates
response data both pre- and post-treatment in
both treatment and control groups. This is
because under the changed design, half of what
is already a very small number of subjects is
assigned to the control group. If one or more
subjects drop out of the experiment, the design
may be substantially imbalanced, a situation
that yields weak statistical inferences.
We recommend a repeated-measures (or
crossover) design, in which an accounting
researcher puts each subject through a period of
repeated treatment and control in random order,
for a very small pool of available subjects.
However, this design tends to report false
positive results when the number of subjects is
very small.2 Therefore, it is necessary to adjust
the repeated-measures ANOVA. One example
is Libby and Frederick (1990, p. 357, footnote
11), who used a Bonferroni procedure to
produce conservative significance levels in order
to improve the validity of their response data
from three very small groups of subjects. An
in-depth discussion of the parametric Bonferroni
procedure can be found in Neter et al. (1985).
Another example is Mayper (1982), who used
the non-parametric Kendall's coefficient of
concordance to adjust his 3-factorial repeated-
measures design consisting of 34 subjects.
Siegel and Castellan (1988) provide more
discussion of Kendall's coefficient and other
non-parametric statistics.
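The Bonferroni adjustment itself is simple: with m simultaneous comparisons, either divide the family-wise α by m or, equivalently, multiply each p-value by m. A minimal sketch (our illustration, not Libby and Frederick's code):

```python
def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction for m simultaneous comparisons.

    Returns the adjusted p-values (capped at 1.0) and, for each test,
    whether it remains significant at the family-wise alpha level.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    significant = [p < alpha / m for p in p_values]
    return adjusted, significant
```

With three comparisons, an individual p-value must fall below 0.05 / 3 ≈ 0.0167 to count as significant, which is what makes the procedure conservative.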
We have covered a majority of the
experimental designs that have been used in
behavioral accounting research except for a few
complex experimental designs. For example,
Spires (1991) employed a complex repeated-
measures incomplete Latin Square design to
gather his research data from ten subjects in six
CPA firms. A detailed discussion on the more
complex experimental designs can be found in
Kirk (1968).
2 This is because in a repeated-measures ANOVA, all that
is common to the repeated measures on a subject is
attributed to a "subject effect," and all that is common to
the repeated measures at a point in time is attributed to a
"time effect." What remains when the "subject effect"
and "time effect" are removed is attributed to the error
term. When the number of subjects is very small, there is
a high propensity for serial correlations between the error
terms. Thus there is a common thread linking the error
terms that will mistakenly be attributed to the "subject
effect" and hence deducted from the error terms. As a
result, the error mean square (MSe) tends to
underestimate the true error variance (σε²). Moreover, only
when the error terms are independent is the sum of
squares for error distributed approximately as σε²·χ² with
(n − 1)(t − 1) degrees of freedom (n subjects, t time
points). With serial correlations, not only is the multiplier
no longer σε², but if χ² approximates the distribution at
all, it is with fewer than (n − 1)(t − 1) degrees of freedom.
The cumulative effect of underestimating the degrees of
freedom associated with the error term is to exaggerate
the significance of all the test results. Rosenthal and
Rosnow (1991) provide much more explanation of this issue.
4. Measurement
4.1 Validity
For a very small pool of available subjects,
behavioral researchers in accounting should
emphasize the validity (a “right” measure)
instead of the reliability (a “good” measure) of a
research instrument. In general, a research
instrument is valid if it measures what it is
intended to measure. On the other hand, a
research instrument is reliable if its
measurements are repeatable. Nunnally (1978,
p.178) pointed out that a large proportion of
journal articles on psychological measurement
have been overly devoted to the issue of
reliability. He explained that high reliability
does not necessarily mean high validity and that
reliability is a necessary but not sufficient
condition for validity. We suggest behavioral
researchers in accounting emphasize the validity
of a research instrument in three ways. First, use
research instruments with known validity or
those that have been validated in a pilot
study. Second, discard any measurements of
questionable validity, or any that are distinctly
lower in validity than the others, and combine
the remaining measurements to increase
validity. For example, in measuring audit
expertise, Bonner and Lewis (1990) emphasized
the validity of their instrument by combining
the subjects’ self-reported level of knowledge,
ability, and years of general audit experience
with their test scores on questions from auditing
textbooks, questions from the CPA exam, and
questions from the Graduate Record Exam.
Finally, use an interactive computer program as
a research instrument when it is practical to do
so. For example, Nelson et al. (1995) used an
interactive computer program to enhance
validity by standardizing the timing of
presenting the measurement and providing
immediate feedback on the measurement; at the
same time, they used the interactive computer
program to enhance reliability by measuring a
large number of realistic cases across blocks of
trials.
5. Analysis
5.1 Variance-Stabilizing Transformations
Common statistical procedures, such as the
t-test, ANOVA, regression, and correlation, are
specifically designed for normally distributed
data. Although a very small pool of available
subjects usually does not provide normally
distributed data, we do not recommend
normalizing them. This is because correct
applications of the common statistical
procedures do not depend on the normality of
the data per se; rather, they depend on the
homogeneity of variance of the data (Kraemer
1980). Therefore, we recommend variance-
stabilizing transformations of the data for a very
small pool of available subjects. For example,
Frederick (1991) encountered persistent
violations of the homogeneity of variance in his
small sample of data. He stabilized the data with
an arcsin-square-root transformation before
applying the common statistical procedures.
Detailed procedures for the arcsin-square-root
transformation can be found in Press (1972).
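For proportion data, the transformation maps each proportion p to arcsin(√p), which makes the variance of a binomial proportion roughly constant (about 1/(4n)) across the 0-1 range. A minimal stdlib sketch (ours, not Frederick's code):

```python
import math

def arcsin_sqrt(proportions):
    """Variance-stabilizing arcsine-square-root transformation.

    Each proportion p in [0, 1] is mapped to arcsin(sqrt(p)),
    which lies in [0, pi/2].
    """
    return [math.asin(math.sqrt(p)) for p in proportions]
```

Apply the transformation first, then run the ordinary t-test or ANOVA on the transformed values.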
5.2 Non-Parametric Techniques
When the assumptions of normality and
homogeneity of variance are not met in a very
small pool of available subjects, behavioral
researchers in accounting may use non-
parametric techniques to test their hypotheses.
Some of these techniques include the following:
1. Sign Test and Wilcoxon Signed Rank Test –
These two non-parametric techniques are
applicable to a very small single data set or
data collected as pairs. The inferences are
concerned with a measure of central
tendency, which is specified to be the median
M of the population for the single-sample case
and the median MD of the population of
differences for the paired-sample case. The
parametric counterparts of these techniques
are those based on a Z statistic or a Student’s
t statistic that requires the assumption of a
normal distribution.
2. Mann-Whitney U Test and Wilcoxon Rank
Sum Test – These two non-parametric
techniques are for comparing the central
tendency (the medians) of two mutually
independent small samples. The parametric
counterparts of these techniques are those
based on a Student’s t statistic or an ANOVA
statistic that requires the assumption of a
normal distribution with equal variances.
3. Kruskal-Wallis Test – This non-parametric
technique extends the two-sample Mann-
Whitney U Test and Wilcoxon Rank Sum
Test to the case of three or more mutually
independent small samples. The parametric
counterparts of this technique are one-way
ANOVA and Bonferroni multiple
comparisons procedures that require the
assumption of a normal distribution with
equal variances.
4. Friedman Test – This non-parametric
technique extends the Kruskal-Wallis Test to
the case of three or more non-mutually
independent (i.e., related or matched) small
samples. The parametric counterparts of this
technique are two-way ANOVA and one-
factor ANOVA with repeated measures that
require the assumption of a normal
distribution with equal variances.
Gibbons (1993) provides detailed procedures
for all these non-parametric techniques.
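To make item 1 concrete, the exact two-sided sign test needs nothing beyond the binomial distribution; library routines cover the rest (e.g., SciPy's wilcoxon, mannwhitneyu, kruskal, and friedmanchisquare). A stdlib-only sketch (our illustration):

```python
from math import comb

def sign_test(differences):
    """Exact two-sided sign test for paired differences.

    Under H0 the median difference is zero, so each non-zero
    difference is equally likely to be positive or negative,
    and the smaller sign count follows Binomial(n, 1/2).
    """
    pos = sum(d > 0 for d in differences)
    neg = sum(d < 0 for d in differences)
    n = pos + neg                  # zero differences (ties) are dropped
    k = min(pos, neg)
    # two-sided p-value: 2 * P(X <= k) for X ~ Binomial(n, 1/2)
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, p)
```

For eight pairs with seven positive differences and one negative, the p-value is 2 × (1 + 8) / 256 ≈ 0.07, so the median difference is not significant at the 5% level despite the lopsided signs, a typical very-small-sample outcome.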
5.3 Exploratory Analysis and Meta-
Analysis
We think that a behavioral research project with
a very small pool of available subjects should be
an exploratory experience and a generator of
new hypotheses. Accordingly, we suggest using
exploratory analyses to learn from the data and
to generate new hypotheses for extension and
replication. For example, Trotman and Yetton
(1985) first learnt about auditors’ group-
judgments in small treatment groups of 8 to 15
subjects. Their exploratory study generated
seven published extensions and replications
by various accounting researchers.
Finally, we also suggest using meta-analyses
to pool across related studies with very small
pools of subjects to obtain statistical power
equivalent to a study with a large pool of
subjects. Meta-analysis procedures for pooling
studies are beyond the scope of this paper. For
more detail about these procedures, the reader
may wish to consult Cooper and Hedges (1994).
References
Bonner, S.E. and Lewis, B.L. (1990), ‘Determinants of
Auditor Expertise’, Journal of Accounting Research,
Supplement, pp. 1–20.
Choo, F. (1996), ‘Auditors’ Knowledge Content and
Judgment Performance: A Cognitive Script Approach’,
Accounting, Organizations and Society, vol.21,
pp. 339–359.
Choo, F. and Firth, M. (1998), ‘The Effect of Time Pressure
on Auditor’s Configural Information Processing’,
International Journal of Auditing, vol. 2, pp. 21–33.
Cohen, J. (1988), Statistical Power Analysis for the
Behavioral Sciences, New Jersey: Erlbaum Associates,
Inc.
Cooper, H.M. and Hedges, L.V. (1994), The Handbook of
Research Synthesis, Newbury Park, California: Sage
Publications.
Efron, B. (1971), ‘Forcing a Sequential Experiment to be
Balanced’, Biometrika, vol.58, pp. 403–417.
Frederick, D.M. (1991), ‘Auditors’ Representation and
Retrieval of Internal Control Knowledge’, The
Accounting Review, Spring, pp. 240–258.
Gibbons, J.D. (1993), Non-Parametric Statistics, Newbury
Park, California: Sage Publications.
Kadous, K., Kennedy, S.J. and Peecher, M.E. (2003), ‘The
Effect of Quality Assessment and Directional Goal
Commitment on Auditors’ Acceptance of Client-
Preferred Accounting Methods’, The Accounting Review,
Fall, pp.759–778.
Kirk, R. E. (1968), Experimental Design: Procedures for the
Behavioral Sciences, California: Wadsworth Publishing
Company, Inc.
Kraemer, H.C. (1980), ‘Robustness of the Distribution
Theory of the Product-Moment Correlation Coefficient’,
Journal of Educational Statistics, vol.5, pp. 115–128.
Libby, R. and Frederick, D.M. (1990), ‘Experience and the
Ability to Explain Audit Findings’, Journal of
Accounting Research, Fall, pp. 348–366.
Libby, R. and Kinney, W.R. Jr. (2000), ‘Does Mandated
Audit Communication Reduce Opportunistic Corrections
to Manage Earnings to Forecasts?’ The Accounting
Review, Fall, pp. 383–404.
Mayper, A.G. (1982), ‘Consensus of Auditors’ Materiality
Judgments of Internal Accounting Control Weaknesses’,
Journal of Accounting Research, Fall, pp. 773–783.
Maxwell S.E. and Delaney, H.D. (2000), Designing
Experiments and Analyzing Data, New Jersey: Lawrence
Erlbaum Associates, Inc.
Nelson, M.W., Libby, R. and Bonner, S.E. (1995),
‘Knowledge Structures and the Estimation of
Conditional Probabilities in Audit Planning’, The
Accounting Review, Spring, pp. 27–47.
Neter, J., Wasserman, W. and Kutner, M.H. (1985), Applied
Linear Statistical Models, Illinois: Irwin, Inc.
Nunnally, J.C. (1978). Psychometric Theory (Second
Edition), New York: McGraw-Hill, Inc.
Press, S.J. (1972). Applied Multivariate Analysis, New York:
Holt, Rinehart, and Winston.
Reips, U. (2000), ‘The Web Experiment Method:
Advantages, Disadvantages, and Solutions’ in
Psychological Experiments on the Internet, edited by
M.H. Birnbaum, pp. 89–117. San Diego, California:
Academic Press.
Rosenthal, R. and Rosnow, R. (1991), Essentials of
Behavioral Research: Methods and Data Analysis, New
York: McGraw-Hill, Inc.
Sachs, L. (1984), Applied Statistics: A Handbook of
Techniques, New York: Springer-Verlag.
Siegel, S. and Castellan, N.J. (1988), Nonparametric
Statistics for the Behavioral Sciences, New York:
McGraw-Hill, Inc.
Spires, E.E. (1991), ‘Auditors’ evaluation of test-of-control
strength’, The Accounting Review, Spring, pp. 259–276.
Trotman, K. (1996), Research Methods for Judgement and
Decision Making Studies in Auditing, Melbourne,
Australia: Coopers & Lybrand.
Trotman, K. and Yetton, P. (1985), ‘The Effect of the
Review Process on Auditor Judgments’, Journal of
Accounting Research, Spring, pp. 256–267.
doc_970903092.pdf
Behavioral research in accounting deals with
the behavior of accountants. As such, it uses
accounting subjects. Accounting subjects are
very difficult to come by because of the nature
of the accounting environment. First,
professional accountants operate in a pressured
environment in which they have little or no time
to participate in behavioral research. Second,
professional accountants operate in an
environment of high service charges and have
little or no interest in participating in behavioral
experiments free or for a token remuneration
Accounting Research Journal
A Commentary on Sample Design Issues in Behavioral Accounting Experiments
Freddie Choo, Kim Tan
Article information:
To cite this document:
Freddie Choo, Kim Tan, (2006), "A Commentary on Sample Design Issues in Behavioral Accounting Experiments", Accounting Research Journal, Vol. 19 Iss 2 pp. 153-158
Permanent link to this document: http://dx.doi.org/10.1108/10309610680000685
Downloaded on: 24 January 2016, At: 21:00 (PT)
The fulltext of this document has been downloaded 450 times since 2006.
Users who downloaded this article also downloaded:
Callum Scott, (2006), "Measuring Contagion in the South-East Asian Economic Crisis: An Exploration Using Artificial Neural Networks", Accounting Research Journal, Vol. 19 Iss 2 pp. 139-152. http://dx.doi.org/10.1108/10309610680000684
Pak K. Auyeung, Ron Dagwell, Chew Ng, John Sands, (2006), "Educators’ Epistemological Beliefs of Accounting Ethics Teaching: A Cross-Cultural Study", Accounting Research Journal, Vol. 19 Iss 2 pp. 122-138. http://dx.doi.org/10.1108/10309610680000683
Kevin Clarke, Jack Flanagan, Sharron O'Neill, (2011), "Winning ARC grants: comparing accounting with other commerce-related disciplines", Accounting Research Journal, Vol. 24 Iss 3 pp. 213-244. http://dx.doi.org/10.1108/10309611111186984
A Commentary on Sample Design Issues in
Behavioral Accounting Experiments
Freddie Choo
Department of Accounting & International Business
College of Business
San Francisco State University
and
Kim Tan
Department of Accounting & Finance
College of Business
California State University Stanislaus
Abstract
Behavioral research in accounting deals with
the behavior of accountants. As such, it uses
accounting subjects. Accounting subjects are
very difficult to come by because of the nature
of the accounting environment. First,
professional accountants operate in a pressured
environment in which they have little or no time
to participate in behavioral research. Second,
professional accountants operate in an
environment of high service charges and have
little or no interest in participating in behavioral experiments for free or for a token remuneration.
Third, professional accountants are usually
inaccessible because behavioral researchers
have few or no opportunities for contacts within
a CPA firm. Finally, professional accountants
operate in the real world in which they perceive
behavioral research as too abstract to have
practical value for them to participate in. Given
the difficulties in getting accounting subjects,
behavioral researchers often lament that the
pool of available accounting subjects is very
small. As such, they cannot rely on
conventional research strategies that assume, among other things, normal distribution and homogeneity of variances. In this paper, we suggest a broad range of research strategies, including sampling, design, measurement, and analysis, to deal specifically with a very small pool of available accounting subjects. We cite some prior behavioral accounting studies and refer to some statistics textbooks deemed best for the application of these research strategies. Our suggestions should benefit anyone doing behavioral research in accounting.
Acknowledgement: We are indebted to the Editor Tim Brailsford for very helpful comments and to an anonymous reviewer who contributed significantly to the revision of this paper. We thank Dr. Harriet Blodgett for proofreading this paper. We also thank the participants at the 2004 Asian Pacific Interdisciplinary Research Conference in Singapore for their feedback on this paper.
1. Introduction
Behavioral research in accounting deals with
the behavior of accountants. As such, it uses
accounting subjects. Accounting subjects are
defined here as professional accountants who
participate in behavioral experiments at a
prearranged time and location. Professional
accountants who participate in mail-
questionnaire surveys and accounting students
who participate in behavioral experiments are
excluded from the definition. Accounting
subjects, as defined here, are very difficult to
come by because of the nature of the accounting
environment. First, professional accountants
operate in a pressured environment in which
they have little or no time to participate in
behavioral research. Second, professional
accountants operate in an environment of high
service charges and have little or no interest in participating in behavioral experiments for free or for a token remuneration. Third, professional
accountants are usually inaccessible because
behavioral researchers have few or no
opportunities for contacts within a CPA firm.
Finally, professional accountants operate in the
real world in which they perceive behavioral
research as too abstract to have practical value
for them to participate in. Given the difficulties
in getting accounting subjects, behavioral
researchers often lament that the pool of
available accounting subjects is very small. As
such, they cannot rely on conventional research
strategies that assume, among other things,
normal distribution and homogeneity of
variances. In this paper, we suggest a broad
range of research strategies including sampling,
design, measurement, and analysis to deal
specifically with a very small pool of available
accounting subjects. We cite some prior
behavioral accounting studies and refer to some
statistics books deemed best for the application
of these research strategies. Our suggestions
should benefit anyone doing behavioral
research in accounting.
2. Sampling
In formulating research proposals, behavioral
researchers in accounting often face the
question, “With how small a sample is it
reasonable to proceed?” We suggest a sample
size of as small as 20 but no fewer than 10
accounting subjects in a treatment group. Two key factors, power and cost, govern our suggestion here.
2.1 Power and Cost
Statistical power is related to type I and type II errors. A type I error (α), or false positive, is claiming a relationship between two variables that does not in fact exist. Conversely, a type II error (β), or false negative, is failing to claim a relationship that does exist. Power is defined as the probability of not making a type II error, i.e., of not overlooking a relationship that is there, and therefore equals 1 minus the probability of a type II error: power = 1 − β. A more detailed
discussion on statistical power is available in
Maxwell and Delaney (2000).
In planning a study, behavioral researchers in accounting use power to determine the sample size needed to reach a given alpha (α) level (e.g., a conventional p value of 0.05) for a particular effect size that
might be expected. Effect size refers to the
strength of the relationship between two
variables, or the magnitude of an experimental
effect. Although there are many different ways
of representing an effect size, Cohen’s d (Cohen
1988) is most common because he has provided
a large number of tables that are indispensable
to behavioral researchers in accounting.
The first column in Table 1 shows the
number of accounting subjects per treatment
group (rounded sample sizes) required to detect
effect sizes (d) of 1.0, 0.70, and 0.50 at a p
value of 0.05. Columns 2, 3, and 4 in Table 1
show the power of the various effect sizes and
columns 6, 7, and 8 show their respective
incremental increases in power. Table 1 shows that a base level of 10 accounting subjects per treatment group can yield reasonable statistical power at various effect sizes. Each
successive increase of 10 accounting subjects
“buys” less power.

Table 1
Trade-Off between Power and Cost

Number of                 Power*                        Incremental Increase in Power
Accounting    Effect     Effect     Effect             Effect     Effect     Effect
Subjects      Size=0.50  Size=0.70  Size=1.0   Cost    Size=0.50  Size=0.70  Size=1.0
10            0.15       0.15       0.40       $1,000  –          –          –
20            0.30       0.50       0.70       $2,000  15%        35%        30%
30            0.40       0.70       0.85       $3,000  10%        20%        15%
40            0.50       0.80       0.90       $4,000  10%        10%        10%

* The relationship between the number of subjects and power is based on a two-sample t-test, using a p value of 5% with effect sizes of 1.0, 0.70, and 0.50.

For example, in raising the sample size from 10 to 20, an accounting researcher gains a substantial 30% in power
when the effect size is 1.0 (i.e., group means
separated by one standard deviation). The next
increase of 10 (20 to 30) subjects gains a smaller
15%, and the next increase of 10 (30 to 40)
subjects gains an even smaller 10%. Each
succeeding increase of 10 (40 to 50, 50 to 60,
etc.) subjects contributes very little gain in
power.
Table 1 also shows the relationship between
cost and power. If each accounting subject costs
$100, say, to recruit and process, then an
accounting researcher’s payment of $1,000 to
raise the sample size from 10 to 20 is clearly a
good decision as there is a gain of 30% in power
when the effect size is 1.0. The next $1,000 to
raise the sample size from 20 to 30 is more
questionable as there is a gain of only 15% in
power, and successive increases are even less
cost-effective. In sum, the trade-offs between
power and cost lead us to recommend a sample
size of as small as 20 but no fewer than 10
accounting subjects in a treatment group when
only a very small pool of subjects is available.
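The power figures behind a table like Table 1 can be approximated in a few lines. The sketch below is our illustration (the name `approx_power` is ours): it uses a simple normal approximation to two-sample t-test power, so its values are rough and need not match the rounded table entries, but it reproduces the diminishing-returns pattern the table illustrates.

```python
from statistics import NormalDist

def approx_power(n_per_group: int, effect_size: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample t-test via the normal
    approximation: power ~ Phi(d * sqrt(n/2) - z_crit)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)               # two-sided critical value
    noncentrality = effect_size * (n_per_group / 2) ** 0.5
    return z.cdf(noncentrality - z_crit)            # the far-tail term is negligible

# Each extra block of 10 subjects "buys" less additional power.
gains = [approx_power(n, 1.0) for n in (10, 20, 30, 40)]
```

Under this approximation the successive power increments shrink monotonically, which is exactly the power-versus-cost trade-off discussed above.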
2.2 Determining Entry Criteria
Behavioral researchers in accounting not only
have to contend with a very small sample size,
but also with accounting subjects who may
not be willing to do the experiment or be
committed to doing it appropriately; that is,
researchers face a very small subject pool of
convenience. For a very small subject pool
of convenience, the conventional wisdom is
that inclusion/exclusion entry criteria should be
strictly set because heterogeneity of response
tends to attenuate the power of statistical tests.
Thus imposition of strict entry criteria improves
the homogeneity of response, which leads to
increased power of statistical tests. But the
imposition of strict entry criteria exponentially
increases the logistical problems of behavioral
research in accounting because with fewer
subjects eligible for entry, it may be difficult to
realize a sample size of as small as 20 but no
fewer than 10 accounting subjects in a treatment
group. Thus we suggest setting rather liberal
entry criteria for a very small subject pool of
convenience, as long as the subjects and the nature of the experimental task match reasonably well. For example,
Choo (1996) used a liberal entry criterion, eliminating only those accounting subjects who blatantly refused to follow the instructions of the experimental task.
2.3 Assigning Subjects to Treatment
Groups
In assigning subjects to the treatment and
control groups, conventional wisdom described
in Trotman (1996, pp. 76-79) is that subjects should be randomly assigned,1 using, for example, coin flips, random-number tables, or shuffling of cards, and that the assignment procedure should be as unobtrusive as possible. This procedure
assumes that the subjects will not violate
randomization in an accounting experiment. In
reality, this is never assured. For example, Choo
and Firth (1998) randomly assigned accounting
subjects to two groups: time-pressure vs. self-paced. A subject in the time-pressure group exceeded the allotted time and thus, in effect, transferred himself to the self-paced group. Conversely, a subject in the self-paced group sped through the experimental task and thus, in effect, transferred himself to the time-pressure group.
In the long run, random assignment will
result in the sample sizes being near equal
(balanced) in the treatment and control groups.
However, an experiment with a very small pool
of available subjects typically does not operate
“in the long run.” In the short run, random
assignment is more likely to end with
unbalanced group sizes. This is because subjects
usually enter the study sequentially, depending
on whenever and wherever they are available to
participate in an accounting experiment. Thus
by the time it is clear that the imbalance caused by sequential random assignment is not just temporary, the study may be near completion.
For example, accounting subjects entered Libby
and Kinney’s (2000) study sequentially at
various times and at various offices of a Big 5
CPA firm; and by the time the study was
completed, the sequentially and randomly
1 Random assignment is different from random sampling
in which every subject in a population has the same
chance of being included. Random sampling is seldom
possible in behavioral accounting research. Rather, a
convenience sample based on the availability of the
subjects is used. This convenience sample gets even
smaller as behavioral accounting researchers often
require subjects to have certain specialized knowledge
such as financial accounting, tax, or auditing. Unlike a
random sample that produces random errors, a
convenience sample produces systematic errors. Careful planning of the experiment can prevent these systematic errors. Sachs (1984) provides a more detailed
discussion on random and systematic errors.
assigned group sizes were unbalanced at a 75-
43 split. We recommend a simple solution
suggested by Efron (1971): the first subject who
enters the study, or any subject who is assigned
when group sizes are equal, is assigned
randomly. Any subject who is assigned when group sizes are unequal has a 1/3 chance of entry into the majority group and a 2/3 chance of entry into the minority group. This randomization procedure
will exert a constant pressure to balance group
sizes in an experiment with a very small pool of
available subjects.
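Efron's biased-coin rule is straightforward to operationalize. The sketch below is our illustration (the function names are ours, not from Efron 1971): when group sizes are equal the next subject is assigned at random; otherwise the minority group is favored with probability 2/3.

```python
import random

def biased_coin_assign(n_a: int, n_b: int, u: float) -> str:
    """Efron-style biased coin. u is a uniform(0, 1) draw; n_a and n_b are
    the current group sizes. Favors the minority group with probability 2/3."""
    if n_a == n_b:
        return "A" if u < 0.5 else "B"
    minority = "A" if n_a < n_b else "B"
    majority = "B" if minority == "A" else "A"
    return minority if u < 2 / 3 else majority

def randomize(n_subjects: int, rng: random.Random) -> list:
    """Sequentially assign subjects as they enter the study."""
    counts = {"A": 0, "B": 0}
    assignments = []
    for _ in range(n_subjects):
        group = biased_coin_assign(counts["A"], counts["B"], rng.random())
        counts[group] += 1
        assignments.append(group)
    return assignments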
3. Design
For a very small pool of available subjects, we
do not recommend an endpoint design in which
an accounting researcher evaluates response
data only at the end of an experiment. This is
because the response data from a very small
number of subjects increase the variability
associated with individual differences, not only
because of true behavioral differences but also
because of differences in the ways subjects
evaluate experimental procedures, the variation
in response styles, and the effects of the
environment on perception. A recent web-based
research methodology (Reips 2000) has
enhanced the application of the endpoint design
for a very small pool of available subjects. For
example, Kadous et al. (2003) used a web-based
endpoint design to reduce the variability
associated with individual subjects’ aversion to the experimental conditions by allowing the subjects easy and anonymous withdrawal from the experiment.
For a very small pool of available subjects,
we also do not recommend a changed design in
which an accounting researcher evaluates
response data both pre- and post-treatment in
both treatment and control groups. This is
because under the changed design, half of what
is already a very small number of subjects is
assigned to the control group. If one or more
subjects drop out of the experiment, the design
may be substantially imbalanced, a situation
that yields weak statistical inferences.
We recommend a repeated-measures (or
crossover) design, in which an accounting
researcher puts each subject through a period of
repeated treatment and control in random order,
for a very small pool of available subjects.
However, this design tends to report false positive results when the number of subjects is very small.2 Therefore, it is necessary to adjust
the repeated-measures ANOVA. One example
is Libby and Frederick (1990, p.357, footnote
11), who used a Bonferroni procedure to produce conservative significance levels in order to improve the validity of their response data from three very small groups of subjects. An in-depth discussion of the parametric Bonferroni procedure can be found in Neter et al. (1985). Another example is Mayper (1982), who used the non-parametric Kendall’s coefficient of concordance to adjust his three-factor repeated-measures design consisting of 34 subjects. Siegel and Castellan (1988) provide further discussion of Kendall’s coefficient and other non-parametric statistics.
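A plain Bonferroni adjustment of the kind Libby and Frederick applied divides the nominal alpha by the number of comparisons. A minimal sketch (our illustration, not their actual computation; the function name is ours):

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return, for each of the m p-values, whether it remains significant
    after a Bonferroni correction (compare each against alpha / m)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

# With two comparisons, the effective per-test threshold drops to 0.025.
flags = bonferroni_significant([0.01, 0.04])
```

The correction is deliberately conservative: it controls the family-wise error rate at the cost of power, which is why it yields the "conservative significance levels" the text describes.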
We have covered a majority of the
experimental designs that have been used in
behavioral accounting research except for a few
complex experimental designs. For example,
Spires (1991) employed a complex repeated-
measures incomplete Latin Square design to
gather his research data from ten subjects in six
CPA firms. A detailed discussion on the more
complex experimental designs can be found in
Kirk (1968).
2 This is because in a repeated-measures ANOVA, all that is common to the repeated measures on a subject is attributed to a “subject effect,” and all that is common to the repeated measures at a point in time is attributed to a “time effect.” What remains when the “subject effect” and “time effect” are removed is attributed to the error term. When the number of subjects is very small, there is a high propensity for serial correlations between the error terms. Thus there is a common thread linking the error terms that will mistakenly be attributed to the “subject effect” and hence deducted from the error terms. As a result, the error mean square (MSe) tends to underestimate the true error variance (σε²). Moreover, only when the error terms are independent is the sum of squares for error distributed approximately as σε²·χ² with (n − 1)(t − 1) degrees of freedom (n subjects, t time points). With serial correlations, not only is the multiplier no longer σε², but if χ² approximates the distribution at all, it is with fewer than (n − 1)(t − 1) degrees of freedom. The cumulative effect of underestimating the degrees of freedom associated with the error term is to exaggerate the significance of all the test results. Rosenthal and Rosnow (1991) provide a fuller explanation of this issue.
4. Measurement
4.1 Validity
For a very small pool of available subjects,
behavioral researchers in accounting should
emphasize the validity (a “right” measure)
instead of the reliability (a “good” measure) of a
research instrument. In general, a research
instrument is valid if it measures what it is
intended to measure. On the other hand, a
research instrument is reliable if its
measurements are repeatable. Nunnally (1978,
p.178) pointed out that a large proportion of
journal articles on psychological measurement
have been overly devoted to the issue of
reliability. He explained that high reliability
does not necessarily mean high validity and that
reliability is a necessary but not sufficient
condition for validity. We suggest behavioral
researchers in accounting emphasize the validity
of a research instrument in three ways. First, use
research instruments with known validity or
those that have been validated in a pilot
study. Second, discard any measurements of questionable validity, or any that are distinctly lower in validity than the others, and combine the remaining measurements to strengthen validity. For example, in measuring audit
expertise, Bonner and Lewis (1990) emphasized
the validity of their instrument by combining
the subjects’ self-reported level of knowledge,
ability, and years of general audit experience
with their test scores on questions from auditing
textbooks, questions from the CPA exam, and
questions from the Graduate Record Exam.
Finally, use an interactive computer program as
a research instrument when it is practical to do
so. For example, Nelson et al. (1995) used an
interactive computer program to enhance
validity by standardizing the timing of
presenting the measurement and providing
immediate feedback on the measurement; at the
same time, they used the interactive computer
program to enhance reliability by measuring a
large number of realistic cases across blocks of
trials.
5. Analysis
5.1 Variance-Stabilizing Transformations
Common statistical procedures such as t test,
ANOVA, regression, correlation, etc., are
specifically designed for normally distributed
data. Although a very small pool of available
subjects usually does not provide normally
distributed data, we do not recommend
normalizing them. This is because correct
applications of the common statistical
procedures do not depend on the normality of
the data per se; rather, they depend on the
homogeneity of variance of the data (Kraemer
1980). Therefore, we recommend variance-
stabilizing transformations of the data for a very
small pool of available subjects. For example,
Frederick (1991) encountered persistent
violations of the homogeneity of variance in his
small sample of data. He stabilized the data with
an arcsin-square-root transformation before
applying the common statistical procedures.
Detailed procedures for the arcsin-square-root transformation can be found in Press (1972).
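For proportion data, the arcsin-square-root transform Frederick used is essentially a one-liner. The sketch below is our illustration of the standard transform (the function name is ours), not his actual code:

```python
import math

def arcsin_sqrt(p: float) -> float:
    """Variance-stabilizing arcsine-square-root transform for a
    proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))

# Apply to a column of proportions before running t-tests or ANOVA.
stabilized = [arcsin_sqrt(p) for p in (0.0, 0.25, 0.50, 1.0)]
```

The transform compresses the variance of proportions near 0 and 1, which is what makes the homogeneity-of-variance assumption of the common parametric procedures more tenable.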
5.2 Non-Parametric Techniques
When the assumptions of normality and
homogeneity of variance are not met in a very
small pool of available subjects, behavioral
researchers in accounting may use non-
parametric techniques to test their hypotheses.
Some of these techniques include the following:
1. Sign Test and Wilcoxon Signed Rank Test –
These two non-parametric techniques are
applicable to a very small single data set or
data collected as pairs. The inferences are
concerned with a measure of central
tendency, which is specified to be the median M of the population for the single-sample case and the median MD of the population of differences for the paired-sample case. The parametric counterparts of these techniques are those based on a Z statistic or a Student’s t statistic that requires the assumption of a normal distribution.
2. Mann-Whitney U Test and Wilcoxon Rank
Sum Test – These two non-parametric
techniques are for comparing the central
tendency (the medians) of two mutually
independent small samples. The parametric
counterparts of these techniques are those
based on a Student’s t statistic or an ANOVA
statistic that requires the assumption of a
normal distribution with equal variances.
3. Kruskal-Wallis Test – This non-parametric
technique extends the two-sample Mann-
Whitney U Test and Wilcoxon Rank Sum
Test to the case of three or more mutually
independent small samples. The parametric
counterparts of this technique are one-way
ANOVA and Bonferroni multiple
comparisons procedures that require the
assumption of a normal distribution with
equal variances.
4. Friedman Test – This non-parametric
technique extends the Kruskal-Wallis Test to
the case of three or more non-mutually
independent (i.e., related or matched) small
samples. The parametric counterparts of this
technique are two-way ANOVA and one-
factor ANOVA with repeated measures that
require the assumption of a normal
distribution with equal variances.
Gibbons (1993) provides detailed procedures for all these non-parametric techniques.
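As one concrete example, the Mann-Whitney U statistic for two small independent samples can be computed directly from midranks. This is a minimal sketch of the standard statistic (our illustration; the significance lookup against exact small-sample tables, as covered in Gibbons, is omitted):

```python
def midranks(values):
    """Rank the pooled observations, assigning midranks to ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                          # extend the block of tied values
        midrank = (i + j) / 2 + 1           # 1-based average rank of the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """U statistic for sample x against sample y (rank-sum form)."""
    ranks = midranks(list(x) + list(y))
    rank_sum_x = sum(ranks[: len(x)])
    return rank_sum_x - len(x) * (len(x) + 1) / 2
```

U ranges from 0 (every x below every y) to len(x) * len(y) (the reverse), with ties pulled toward the middle by the midranks.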
5.3 Exploratory Analysis and Meta-
Analysis
We think that a behavioral research project with
a very small pool of available subjects should be
an exploratory experience and a generator of
new hypotheses. Accordingly, we suggest using
exploratory analyses to learn from the data and
to generate new hypotheses for extension and
replication. For example, Trotman and Yetton
(1985) first learnt about auditors’ group-
judgments in small treatment groups of 8 to 15
subjects. Their exploratory study generated
seven published extensions or replications by various accounting researchers.
Finally, we also suggest using meta-analyses
to pool across related studies with very small
pools of subjects to obtain statistical power
equivalent to a study with a large pool of
subjects. Meta-analysis procedures for pooling
studies are beyond the scope of this paper. For
more detail about these procedures, the reader
may wish to consult Cooper and Hedges (1994).
References
Bonner, S.E. and Lewis, B.L. (1990), ‘Determinants of
Auditor Expertise’, Journal of Accounting Research,
Supplement, pp. 1–20.
Choo, F. (1996), ‘Auditors’ Knowledge Content and
Judgment Performance: A Cognitive Script Approach’,
Accounting, Organizations and Society, vol.21,
pp. 339–359.
Choo, F. and Firth, M. (1998), ‘The Effect of Time Pressure
on Auditor’s Configural Information Processing’,
International Journal of Auditing, vol. 2, pp. 21–33.
Cohen, J. (1988), Statistical Power Analysis for the
Behavioral Sciences, New Jersey: Erlbaum Associates,
Inc.
Cooper, H.M. and Hedges, L.V. (1994), The Handbook of
Research Synthesis, Newbury Park, California: Sage
Publications.
Efron, B. (1971), ‘Forcing a Sequential Experiment to be
Balanced’, Biometrika, vol.58, pp. 403–417.
Frederick, D.M. (1991), ‘Auditors’ Representation and
Retrieval of Internal Control Knowledge’, The
Accounting Review, Spring, pp. 240–258.
Gibbons, J.D. (1993), Non-Parametric Statistics, Newbury
Park, California: Sage Publications.
Kadous, K., Kennedy, S.J. and Peecher, M.E. (2003), ‘The
Effect of Quality Assessment and Directional Goal
Commitment on Auditors’ Acceptance of Client-
Preferred Accounting Methods’, The Accounting Review,
Fall, pp.759–778.
Kirk, R. E. (1968), Experimental Design: Procedures for the
Behavioral Sciences, California: Wadsworth Publishing
Company, Inc.
Kraemer, H.C. (1980), ‘Robustness of the Distribution
Theory of the Product-Moment Correlation Coefficient’,
Journal of Educational Statistics, vol.5, pp. 115–128.
Libby, R. and Frederick, D.M. (1990), ‘Experience and the
Ability to Explain Audit Findings’, Journal of
Accounting Research, Fall, pp. 348–366.
Libby, R. and Kinney, W.R. Jr. (2000), ‘Does Mandated
Audit Communication Reduce Opportunistic Corrections
to Manage Earnings to Forecasts?’ The Accounting
Review, Fall, pp. 383–404.
Mayper, A.G. (1982), ‘Consensus of Auditors’ Materiality
Judgments of Internal Accounting Control Weaknesses’,
Journal of Accounting Research, Fall, pp. 773–783.
Maxwell S.E. and Delaney, H.D. (2000), Designing
Experiments and Analyzing Data, New Jersey: Lawrence
Erlbaum Associates, Inc.
Nelson, M.W., Libby, R. and Bonner, S.E. (1995),
‘Knowledge Structures and the Estimation of
Conditional Probabilities in Audit Planning’, The
Accounting Review, Spring, pp. 27–47.
Neter, J., Wasserman, W. and Kutner, M.H. (1985), Applied
Linear Statistical Models, Illinois: Irwin, Inc.
Nunnally, J.C. (1978), Psychometric Theory (Second Edition), New York: McGraw-Hill, Inc.
Press, S.J. (1972), Applied Multivariate Analysis, New York: Holt, Rinehart, and Winston.
Reips, U. (2000), ‘The Web Experiment Method:
Advantages, Disadvantages, and Solutions’ in
Psychological Experiments on the Internet, edited by
M.H. Birnbaum, pp. 89–117. San Diego, California:
Academic Press.
Rosenthal, R. and Rosnow, R. (1991), Essentials of
Behavioral Research: Methods and Data Analysis, New
York: McGraw-Hill, Inc.
Sachs, L. (1984), Applied Statistics: A Handbook of
Techniques, New York: Springer-Verlag.
Siegel, S. and Castellan, N.J. (1988), Nonparametric
Statistics for the Behavioral Sciences, New York:
McGraw-Hill, Inc.
Spires, E.E. (1991), ‘Auditors’ evaluation of test-of-control
strength’, The Accounting Review, Spring, pp. 259–276.
Trotman, K. (1996), Research Methods for Judgement and
Decision Making Studies in Auditing, Melbourne,
Australia: Coopers & Lybrand.
Trotman, K. and Yetton, P. (1985), ‘The Effect of the
Review Process on Auditor Judgments’, Journal of
Accounting Research, Spring, pp. 256–267.