ABSTRACT
Title of Dissertation:
BEYOND RESPONSE RATES: THE EFFECT OF PREPAID INCENTIVES ON MEASUREMENT ERROR
Rebecca Medway, Doctor of Philosophy, 2012
Dissertation directed by:
Dr. Roger Tourangeau
As response rates continue to decline, survey researchers increasingly offer incentives as a way to motivate sample members to take part in their surveys. Extensive prior research demonstrates that prepaid incentives are an effective tool for doing so. If prepaid incentives influence behavior at the stage of deciding whether or not to participate, they also may alter the way that respondents behave while completing surveys. Nevertheless, most research has focused narrowly on the effect that incentives have on response rates. Survey researchers need a better empirical basis for assessing the potential tradeoffs associated with the higher response rates yielded by prepaid incentives. This dissertation describes the results of three studies aimed at expanding our understanding of the impact of prepaid incentives on measurement error. The first study explored the effect that a $5 prepaid cash incentive had on twelve indicators of respondent effort in a national telephone survey. The incentive led to significant reductions in item nonresponse and interview length. However, it had little effect on the other indicators, such as response order effects and responses to open-ended items. The second study evaluated the effect that a $5 prepaid cash incentive had on responses to
sensitive questions in a mail survey of registered voters. The incentive resulted in a significant increase in the proportion of highly undesirable attitudes and behaviors to which respondents admitted and had no effect on responses to less sensitive items. While the incentive led to a general pattern of reduced nonresponse bias and increased measurement bias for the three voting items where administrative data was available for the full sample, these effects generally were not significant. The third study tested for measurement invariance in incentive and control group responses to four multi-item scales from three recent surveys that included prepaid incentive experiments. There was no evidence of differential item functioning; however, full metric invariance could not be established for one of the scales. Generally, these results suggest that prepaid incentives had minimal impact on measurement error. Thus, these findings should be reassuring for survey researchers considering the use of prepaid incentives to increase response rates.
BEYOND RESPONSE RATES: THE EFFECT OF PREPAID INCENTIVES ON MEASUREMENT ERROR
By Rebecca Medway
Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2012
Advisory Committee:
Dr. Roger Tourangeau, Chair
Professor Frauke Kreuter
Professor Stanley Presser
Professor John Robinson
Professor Eleanor Singer
© Copyright by Rebecca Medway 2012
ACKNOWLEDGMENTS

The research presented in this dissertation was made possible through the support of several organizations. I am indebted to JPSM for allowing me to include my research in the 2011 JPSM Practicum survey. The results of this study became a focal point of both Chapters 2 and 4 of my dissertation. Additionally, I am grateful to the U.S. Census Bureau for awarding me a Dissertation Fellowship, which allowed me to conduct the mail survey featured in Chapters 3 and 4 of this dissertation. I also am grateful to the Survey Research Center at the University of Michigan for providing me access to data from the Survey of Consumers. There also are several individuals whose help was invaluable in creating this dissertation. First, I’d like to thank the members of my committee, particularly my Chair Roger Tourangeau, for their helpful feedback and unwavering support over the past two years. And finally, thank you to my parents, Marc and Cathy: for nurturing a love of learning from my earliest days and for always pushing me to be the best version of myself – both in and out of the classroom.
TABLE OF CONTENTS
List of Tables ..................................................................................................................... iv List of Figures ................................................................................................................... vii Chapters 1. Incentives and Survey Research ..........................................................................1 2. Satisficing in Telephone Surveys: Do Prepaid Cash Incentives Make a Difference? .........................................................................................................29 3. The Effect of Prepaid Cash Incentives on Responses to Sensitive Items ..........90 4. Testing For Measurement Invariance in the Responses of Incentive and Control Group Respondents .............................................................................124 5. Summary and Conclusions ..............................................................................159 Appendices A. Survey Materials, JPSM Practicum Survey ...................................................168 B. Survey Items Used for Each Effort Indicator (Chapter 2) .............................204 C. Maryland Mail Survey Materials ...................................................................205 D. Item Sensitivity Ratings (Chapter 3)..............................................................218 E. Consumer Sentiment Items ............................................................................222 F. Additional Analyses for Chapter 4 ................................................................224 References ........................................................................................................................239
LIST OF TABLES 1.1. Effect of Incentive on Accuracy of Self-Reports ........................................................21 2.1. Outcome Rates, by Incentive Condition .....................................................................52 2.2. Costs (Dollars), by Incentive Condition .....................................................................53 2.3. Sample Composition, by Incentive Condition ............................................................55 2.4. Item Nonresponse, by Incentive Condition ................................................................57 2.5. Mean Number of Words Provided, by Incentive Condition .......................................59 2.6. Number of Unique Ideas Provided for Consent Explanation, by Incentive Condition................................................................................................................59 2.7. Straight-lining and Non-differentiation, by Incentive Condition ...............................60 2.8. Mean Number of Affirmative Responses, by Incentive Condition ............................62 2.9. Odds Ratios from Logistic Regressions Predicting Acquiescence .............................63 2.10. Proportion of Items for Which Respondents Displayed Response Order Effects, by Incentive Condition .............................................................................64 2.11. Proportion of Respondents Selecting First Two Options in Original Order, by Incentive Condition...........................................................................................65 2.12. Logistic Regressions Predicting Selection of First Two Options in Original Order ......................................................................................................................66 2.13. Mean Proportion of Items for which Respondents Provided Round or Prototypical Responses, by Incentive Condition ...................................................67 2.14. Proportion of Respondents Reporting Each Strategy, by Incentive Condition ........69 2.15. Logistic Regressions Predicting Affirmative Response to Filter Items ....................71 2.16. Mean Seconds per Question, by Incentive Condition ...............................................73 2.17. Accuracy of Length of Residence Reports, by Incentive Condition ........................74 2.18. Interviewer Ratings of How Often Respondent Answered Questions to Best of Ability, by Incentive Condition .................................................................74 2.19. Mean Proportion of Items for Which Respondent Satisficed, by Incentive Condition................................................................................................................75 2.20. Prevalence of Cognitive Ability Indicators, by Incentive Condition ........................77 2.21. Advance Letter and Incentive Awareness, by Incentive Condition ..........................78 2.22. Linear Models Predicting Respondent Effort ...........................................................80 2.23. Linear Model Predicting Satisficing .........................................................................83 3.1. Sample Design ............................................................................................................97 3.2. Mailing Schedule ........................................................................................................97 3.3. Disposition Codes and Response Rate, by Incentive Condition ...............................101 3.4. 
Costs (Dollars), by Incentive Condition ...................................................102 3.5. Demographics, by Incentive Condition ....................................................103 3.6. Means from Linear Models Predicting Impression Management Scores .................104
3.7. Means from Linear Models Predicting Proportion of Undesirable Responses.........105 3.8. Means from Linear Models Predicting Proportion of Items Skipped .......................107 3.9. Estimated Percentage of Voters and Bias Estimates, by Incentive Condition ..........109 3.10. Estimates from Logistic Regressions Predicting Survey Participation...................113 3.11. Estimates from Logistic Regressions Predicting Inaccurate Voting Reports .........115 3.12. Estimates from Logistic Regressions Predicting Survey Participation...................117 3.13. Estimates from Logistic Regressions Predicting Inaccurate Voting Reports .........119 4.1. Proportion of Cases with Complete Data, by Incentive Condition ...........................140 4.2. Mean Index Score, by Incentive Condition ..............................................................141 4.3. Cronbach’s Alpha, by Incentive Condition ..............................................................142 4.4. Differential Item Functioning: Patriotism .................................................................144 4.5. Differential Item Functioning: Conscientiousness....................................................145 4.6. Differential Item Functioning: Impression Management .........................................146 4.7. Differential Item Functioning: Consumer Sentiment: $10 vs. 0 ...............................147 4.8. Differential Item Functioning: Consumer Sentiment: $5 vs. 0 .................................147 4.9. Differential Item Functioning: Consumer Sentiment: $10 vs. $5 .............................148 4.10. Fit Indices for Initial Measurement Models, by Incentive Condition .....................149 4.11. Fit Indices for Configural Invariance Models .........................................................150 4.12. Fit Indices for Metric Invariance Models ...............................................................151 4.13. Modification Indices Produced by Metric Invariance Models ...............................153 4.14. Fit Indices for Metric Invariance Models with One Equality Constraint Removed ..............................................................................................................153 4.15. Modification Indices After Releasing One Equality Constraint .............................154 4.16. Unstandardized Factor Loadings: Consumer Sentiment.........................................155 4.17. Standardized Factor Loadings: Consumer Sentiment .............................................155 5.1. Key Findings .............................................................................................................161 D.1. Mean Respondent Sensitivity Ratings .....................................................................219 D.2. Grouping Items by Sensitivity .................................................................................221 F.1A. Correlation Matrix for Patriotism Items ................................................................225 F.1B. Correlation Matrix for Conscientiousness Items ...................................................225 F.1C. Correlation Matrix for Impression Management Items .........................................225 F.1D. Correlation Matrix for Consumer Sentiment Items: $10 vs. Control ....................226 F.1E. Correlation Matrix for Consumer Sentiment Items: $5 vs. Control .......................226 F.1F. Correlation Matrix for Consumer Sentiment Items: $10 vs. $5 .............................226 F.2A. 
Covariance Matrix for Patriotism Items: Incentive Condition ..............227 F.2B. Covariance Matrix for Patriotism Items: Control Condition .................227 F.2C. Covariance Matrix for Conscientiousness Items: Incentive Condition .................227 F.2D. Covariance Matrix for Conscientiousness Items: Control Condition ....................228
F.2E. Covariance Matrix for Impression Management Items: Incentive Condition........228 F.2F. Covariance Matrix for Impression Management Items: Control Condition ..........228 F.2G. Covariance Matrix for Consumer Sentiment Items: $10 Condition ......................229 F.2H. Covariance Matrix for Consumer Sentiment Items: $5 Condition ........................229 F.2I. Covariance Matrix for Consumer Sentiment Items: Control Condition .................229 F.3A. Cronbach’s Alpha Coefficient, by Incentive Condition ........................................230 F.3B. Differential Item Functioning: Patriotism ..............................................................231 F.3C. Differential Item Functioning: Conscientiousness .................................................232 F.3D. Differential Item Functioning: Impression Management (n=1010) ......................233 F.3E. Differential Item Functioning: Consumer Sentiment: $10 vs. 0 (n=529) ..............234 F.3F. Differential Item Functioning: Consumer Sentiment: $5 vs. 0 (n=514) ................234 F.3G. Differential Item Functioning: Consumer Sentiment: $10 vs. $5 (n=579) ............235 F.3H. Fit Indices for CFA Models, by Incentive Condition: Patriotism..........................235 F.3I. Fit Indices for CFA Models, by Incentive Condition: Conscientiousness ..............236 F.3J. Fit Indices for CFA Models, by Incentive Condition: Impression Management ....236 F.3K. Fit Indices for CFA Models, by Incentive Condition: Consumer Sentiment ........237 F.3L. Modification Indices Produced by Metric Invariance Models: Consumer Sentiment ..........................................................................................................237 F.3M. Fit Indices for Metric Invariance Models with One Equality Constraint Removed: Consumer Sentiment .......................................................................238 F.3N. Modification Indices After Releasing One Equality Constraint: Consumer Sentiment ..........................................................................................................238
LIST OF FIGURES 2.1. Demographic Characteristics, by Survey ....................................................................54 2.2. Item Nonresponse for Open-Ended Items, by Incentive Condition ............................58 2.3. Difference between No-Exclusion Mean Response and Exclusion Mean Response, by Incentive Condition .........................................................................70 3.1. Mean Proportion Undesirable Behaviors and Attitudes, by Incentive Condition and Item Sensitivity ............................................................................106 3.2. Mean Item Nonresponse, by Incentive Condition and Item Sensitivity ...................107 3.3. Proportion of Respondents that Provided Inaccurate Voting Reports, by Incentive Condition ..............................................................................................108 3.4. Proportion of Nonvoters Participating in Survey, by Incentive Condition ...............112 3.5. Proportion of Voters Participating in Survey, by Incentive Condition .....................112 3.6. Proportion of Nonvoters Providing Inaccurate Voting Report, by Incentive Condition..............................................................................................................114 3.7. Proportion of Voters Providing Inaccurate Voting Report, by Incentive Condition..............................................................................................................114 3.8. Proportion of Sample Members that Participated, by Voting History and Incentive Condition ..............................................................................................116 3.9. Proportion of Sample Members that Provided Inaccurate Voting Reports for 2010 Election, by Voting History and Incentive Condition ................................118 3.10. Proportion of Sample Members that Provided Inaccurate Voting Reports for 2008 Election, by Voting History and Incentive Condition ................................118 3.11. Proportion of Sample Members that Provided Inaccurate Voting Reports for 2004 Election, by Voting History and Incentive Condition ................................118
CHAPTER 1
INCENTIVES AND SURVEY RESEARCH

1.1 INTRODUCTION

As survey response rates continue to decline, incentives are increasingly used as a way to motivate sample members to participate (Cantor, O’Hare, & O’Connor, 2008; Kulka, Eyerman, & McNeeley, 2005; Singer, Van Hoewyk, & Maher, 2000). Extensive research shows that incentives can increase response rates (e.g., Church, 1993; Hopkins & Gullickson, 1992; Singer, Van Hoewyk, Gebler, Raghunathan, & McGonagle, 1999; Trussell & Lavrakas, 2004); they clearly convince some sample members to participate who otherwise would not have done so. If they influence behavior at the stage of deciding whether or not to participate, it is reasonable to believe that incentives also may alter the way that respondents act during the survey interview. Thus, it is important to determine whether the use of incentives influences the magnitude of measurement error in survey estimates. Nevertheless, as pointed out by Singer and Ye (forthcoming), the majority of incentives research has focused narrowly on the effect that incentives have on response rates. Groves (2008) voices similar concerns and urges researchers to “re-conceptualize the focus away from response rates”. Likewise, Cantor et al. (2008) speak to the need to improve our understanding of the impact that incentives have on data quality. Incentives conceivably could lead to an increase or a decrease in measurement error. On one hand, they could reduce measurement error if they create a sense of obligation to the researcher that causes respondents to make greater effort and provide more thorough, thoughtful responses to questions. Such a result would be reassuring to
survey practitioners who are enticed by the promise of higher response rates but lack sufficient empirical evidence of other benefits to justify the costs that can be associated with incentives. On the other hand, incentives could increase measurement error if they convince otherwise uninterested sample members, who lack intrinsic motivation, to participate. As Brehm (1994) argues, “If we happen to get 100 percent of our respondents, but they all told us lies, we get 100 percent garbage” (p. 59). Survey practitioners need a better empirical basis for assessing the potential tradeoffs associated with the higher response rates yielded by prepaid incentives. This dissertation aims to expand our understanding of the impact of prepaid incentives on measurement error. In this chapter, I begin by reviewing the existing literature assessing the effect of incentives on both nonresponse and measurement error. In the three analytical chapters that follow, I address the following questions in turn: Do incentives affect the level of effort that respondents put into completing surveys? Do they influence self-presentation concerns, thereby altering responses to sensitive questions? Finally, does measurement invariance exist between responses received with an incentive and those received without one?
1.2 INCENTIVES AND NONRESPONSE

1.2.1 Declining Survey Response Rates

Survey response rates have declined considerably over the past several decades
(Brick & Williams, forthcoming; Steeh, Kirgis, Cannon, & DeWitt, 2001). For example, the response rate for the National Immunization Survey decreased by fourteen percentage points between 1995 and 2004 (Battaglia et al., 2008), while the response rates for the Consumer Expenditure Diary Survey and the National Health Interview Survey declined by twelve and eight percentage points, respectively, in the 1990s (Atrostic, Bates, Burt & Silberstein, 2001). De Leeuw and de Heer (2002) demonstrate that this is an international phenomenon; reviewing a multi-national sample of household surveys, they report that response rates have decreased by an average of half a percentage point per year over the past twenty years. Furthermore, the speed of this decline may be increasing; the response rate for the Survey of Consumer Attitudes decreased by one and a half percentage points per year from 1996 to 2003 – double the average annual decline observed from 1979 to 1996 (Curtin, Presser, & Singer, 2005). Low response rates can be problematic for several reasons. First, although the response rate is not always a good predictor of nonresponse bias (Groves, 2006), lower response rates may increase the potential for nonresponse bias. Nonresponse bias is a function of both the response rate and the difference between respondents and nonrespondents on survey estimates; if those individuals who respond are not representative of the larger sample on the variables of interest, the estimates will be biased (Groves & Couper, 1998). Furthermore, low response rates may increase survey costs, as they mean that larger initial samples are required to attain the number of
respondents necessary to achieve desired levels of precision in survey estimates (Groves, Dillman, Eltinge, & Little, 2002). Survey nonresponse generally can be broken into two major components: inability to reach sample members (“noncontacts”) and failure to persuade them to complete the survey once they have been contacted (“refusals”). Research on surveys in modes where we can more easily disentangle these two components, such as face-to-face and telephone, repeatedly demonstrates that refusals account for a considerably larger proportion of nonresponse than do noncontacts (Brick & Williams, forthcoming; Curtin et al., 2005; Smith, 1995). Typical reasons provided for refusing include being too busy, not being interested in the survey topic, privacy concerns (such as not wanting to share personal information with a stranger), or negative reactions to aspects of the survey (such as its length) (Brehm, 1993; Bates, Dalhamer & Singer, 2008; Couper, Singer, Conrad, & Groves, 2008). Incentives may be an effective tool for reducing some of these refusals – either as an expression of gratitude for respondents’ time or as a way of overcoming a lack of interest in the survey topic. In fact, Couper et al. (2008) found that, following altruistic desires to be helpful or to influence policy, receiving money was one of the most common reasons provided for agreeing to respond to a (hypothetical) survey request. The results of several studies suggest that incentives’ effect on the response rate is largely a function of reduced refusal rates (Holbrook, Krosnick, & Pfent, 2008; Shettle & Mooney, 1999; Tourangeau, Groves, & Redline, 2010; Willimack, Schuman, Pennell, & Lepkowski, 1995). However, many other studies have not disentangled the effect of incentives on noncontact from their effect on refusals (Singer & Ye, forthcoming) –
possibly because such a large proportion of the existing experiments are part of mail surveys, where it can be difficult to determine whether nonresponse is caused by lack of contact or lack of willingness to participate.

1.2.2 Effect of Incentives on Response Rates

Survey practitioners are searching continually for ways to combat declining response rates. Several tools, such as pre-notification, multiple follow-up contacts, and incentives, have proven effective and have become part of common survey practice. For example, in a meta-analysis of 251 mail surveys, Edwards and colleagues found that offering cash incentives doubled the odds of response, while pre-notification and follow-up contacts each multiplied the odds of response by about 1.5 (Edwards et al., 2002). Numerous experimental studies have demonstrated that incentives are an effective tool for increasing survey response rates (e.g., James & Bolstein, 1990; Shettle & Mooney, 1999; Petrolia & Bhattacharjee, 2009; Yammarino, Skinner, & Childers, 1991; Yu & Cooper, 1983). Several meta-analyses have shown that the effectiveness of incentives spans all survey modes. For example, Church (1993) found that offering an incentive in mail surveys increases the response rate by an average of 13 percentage points. Similarly, incentives multiply the odds of response to Internet surveys by 1.3 on average (Göritz, 2006). Finally, a meta-analysis of both prepaid and promised incentive experiments in interviewer-administered surveys confirmed that incentives have a positive, but smaller, impact on response rates in these types of surveys as well; in these experiments, each dollar that was given to respondents increased the response rate by about one-third of a percentage point on average (Singer et al., 1999).
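To make these meta-analytic effect sizes concrete, a brief worked example may help; the 20 percent baseline used here is hypothetical and serves only to illustrate the estimates cited above. Multiplying the odds of response by 1.3, as in Göritz's (2006) estimate for Internet surveys, gives

\[
\text{odds} = \frac{0.20}{1 - 0.20} = 0.25, \qquad 0.25 \times 1.3 = 0.325, \qquad \text{RR} = \frac{0.325}{1 + 0.325} \approx 0.245,
\]

an increase of roughly 4.5 percentage points over the assumed baseline. By comparison, the Singer et al. (1999) estimate for interviewer-administered surveys of about one-third of a percentage point per dollar implies that a $5 prepaid incentive would be expected to raise the response rate by only about \(5 \times 0.33 \approx 1.7\) percentage points.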
Certain types of incentives have proven to be more effective than others. Prepaid incentives tend to be more successful than promised ones contingent on completion of the survey (Armstrong, 1975; Berk, Mathiowetz, Ward, & White, 1987; Church, 1993; James & Bolstein, 1992; Petrolia & Bhattacharjee, 2009), and monetary incentives tend to be more effective than non-monetary ones (Hansen, 1980; Petrolia & Bhattacharjee, 2009; Warriner, Goyder, Gjertsen, Hohner, & McSpurren, 1996). As these findings imply, prepaid monetary incentives generally have the greatest impact on response rates. Two separate meta-analyses of incentive experiments in mail surveys both concluded that prepaid cash incentives increase mail survey response rates by 19 percentage points on average (Church, 1993; Hopkins & Gullickson, 1992). Replicating Singer et al.’s (1999) finding that incentives have a smaller impact in interviewer-administered surveys, Cantor et al. (2008) found that prepaid cash incentives of up to $10 led to a median increase of six percentage points in RDD surveys. While some studies observe a linear relationship between the value of the incentive and the increase in the response rate (e.g., Church, 1993; Trussell & Lavrakas, 2004; Yu & Cooper, 1983), others conclude that increases in the incentive value may have diminishing influence on the response rate (Cantor et al., 2008; Fox, Crask, & Kim, 1988; James & Bolstein, 1992). Finally, although this dissertation generally focuses on the use of incentives at initial contact in cross-sectional surveys, incentives also have proven effective in other contexts. For example, incentives may reduce attrition in longitudinal studies (Creighton, King, & Martin, 2007; Goetz, Tyler, & Cook, 1984), and they may be an effective tool for refusal conversion (Brick, Montaquila, Hagedorn, Roth, & Chapman, 2005).
1.2.3 Theoretical Explanations for the Effectiveness of Incentives

Multiple theories of survey response provide potential explanations for
incentives’ success at increasing response rates. For example, utility theory suggests that individuals weigh the costs and benefits of completing a task and will take action when the benefits of doing so exceed the costs (Groves & Couper, 1998). Offering an incentive is one way that researchers can make the perceived benefits of taking part in survey research greater than the perceived costs. Under such a framework, respondents may see the incentive as payment or reimbursement for their time and effort (Biner & Kidd, 1994). Conceptualizing the incentive as an economic exchange helps to explain why larger incentives have at times been found to be more effective than smaller ones (e.g., Trussell & Lavrakas, 2004). Other researchers have suggested that the effectiveness of incentives is not due to an economic exchange but a social one. Under social exchange theory (Blau, 1964; Homans, 1961), rewards and costs remain important decision-making factors, and individuals still choose to take action only when they feel it is in their self-interest to do so. However, social exchange is different from economic exchange in two main ways (Dillman, Smyth, & Christian, 2009). First, in social exchange, the definitions of rewards and costs are more flexible; namely, the rewards do not have to be monetary. Second, the importance of trust is much greater in social exchanges. Social exchanges typically are not bound by contracts, and so individuals have to trust that the other party will provide a reward in the future that will be worth whatever cost they must bear. Actors in such exchanges are able to trust one another due to several rules and norms of exchange by which they can assume the other party will abide. One of the
central rules of social exchange is the norm of reciprocity; this rule suggests that when an individual takes an action that benefits you, you are expected to respond in kind (Gouldner, 1960). Incentives can be seen as a benefit that the survey sponsor provides to the sample member; when sample members receive an incentive they may feel obligated to return the kindness by responding to the survey. This may explain the effectiveness of prepaid incentives (Dillman et al., 2009). However, the mixed success of promised incentives suggests that sample members do not trust survey researchers enough to incur the costs of participation without having received their reward in advance. Other researchers have suggested a related explanation for the effectiveness of prepaid cash incentives, based on cognitive dissonance theory (Festinger, 1957). According to this theory, once respondents have received a prepaid incentive, the idea of keeping it without completing the survey creates a feeling of dissonance (Furse & Stewart, 1982). Sample members have two options for resolving this unpleasant feeling. The first is to dispose of the incentive; however, Furse and Stewart (1982) argue that most people will not choose this option because throwing money away also makes them feel uncomfortable, and because sending the money back to the researcher may involve almost as much effort as simply completing the survey. Therefore, most people will choose the second option – participating in the survey. Finally, leverage-saliency theory suggests that the impact of various design features on the participation decision differs across sample members (Groves, Singer, & Corning, 2000). According to this theory, the influence of each feature on an individual’s decision to respond depends on three factors: (1) how important the feature is to the sample member (leverage), (2) whether the sample member sees this as a positive or
negative feature (valence), and (3) the degree to which the feature is highlighted in the survey request (salience). For example, some sample members may choose to respond because they are interested in the survey topic described in the survey cover letter. Other sample members may lack such an interest but may be convinced to participate by a cash incentive included in the envelope. Thus, incentives may convince certain sample members to respond who are not drawn to other survey features such as the topic, and they may have little or no effect on other sample members’ willingness to participate (e.g., Baumgartner & Rathbun, 1996).

1.2.4 Effect of Incentives on Sample Composition and Nonresponse Error

Several experimental studies in both mail and interviewer-administered modes have found that incentives, whether prepaid or promised, do not have much of an effect on sample composition (e.g., Brick et al., 2005; Cantor et al., 2008; Furse & Stewart, 1982; Goetz et al., 1984; James & Bolstein, 1990; Shettle & Mooney, 1999; Warriner et al., 1996; Willimack et al., 1995). However, the results of other experiments suggest that incentives can have two types of effects on sample composition. First, incentives may improve representation of traditionally underrepresented groups, such as young people (Dillman, 1996; Miller, 1996; Storms & Loosveldt, 2004), minorities (Berlin et al., 1992; Mack, Huggins, Keathley, & Sundukchi, 1998), and those with lower incomes (Mack et al., 1998) or less education (Berlin et al., 1992; Nederhof, 1983; Petrolia & Bhattacharjee, 2009). Second, the use of incentives may alter the characteristics of the respondent pool along dimensions other than the typical demographic variables measured in surveys. For example, as leverage-saliency theory might predict, incentives may help attract
respondents who are less interested in the survey topic (Baumgartner & Rathbun, 1996; Coogan & Rosenberg, 2004; Petrolia & Bhattacharjee, 2009). However, in a series of prepaid cash incentive experiments embedded in mail and telephone surveys, Groves and colleagues found only mixed support for this hypothesis (Groves et al., 2006; Groves, Presser, & Dipko, 2004). Additionally, Moyer and Brown (2008) actually found the reverse effect: promising a cash incentive for completing the National Cancer Institute’s telephone-administered Health Information National Trends Survey (HINTS)
significantly increased the proportion of respondents who had had cancer. The use of incentives also may reduce the proportion of respondents with certain personality traits or values, such as altruism or selflessness, due to an influx of more selfish respondents. Altruistic or selfless sample members are likely to respond to surveys even without an incentive, while incentives may serve as a motivating factor for sample members low in these traits (Storms & Loosveldt, 2004). For example, in a mail followup to the Detroit Area Study (DAS), Groves et al. (2000) found that a $5 prepaid cash incentive had a significantly greater impact on the response rate among DAS respondents who had reported low levels of community involvement than it did among those who had reported being more involved. Medway and colleagues found that offering a $5 prepaid incentive increased the proportion of respondents to a mail survey who had not volunteered in the past year – although this same effect was not found in an equivalent experiment conducted as part of a telephone survey (Medway, Tourangeau, Viera, Turner, & Marsh, 2011). To the extent that incentives improve representation of groups that are underrepresented when incentives are not used, they may lead to a reduction in
nonresponse error. This seems particularly likely in cases where incentives improve representation of individuals who lack interest in the survey topic. For example, Tourangeau et al. (2010) found that offering a prepaid cash incentive of $5 improved representation of nonvoters and reduced the nonresponse bias in reports of voting behavior in two recent elections by about six percentage points – although these differences did not reach statistical significance. However, improved representation of demographic groups that traditionally are underrepresented in surveys will reduce nonresponse bias only if these groups also differ from better-represented groups on key survey variables. For example, in an experiment that assigned sample members to receive either $5 cash or a pass to a local park, Ryu, Couper, and Marans (2005) found that the two types of incentives resulted in differences in respondents’ education level, marital status, and work status; however, they did not find differences in the other response distributions for the two groups. Finally, in their meta-analysis of nonresponse bias analysis studies, Groves and Peytcheva (2008) reported that, overall, the use of an incentive did not have a significant impact on the magnitude of nonresponse bias – though very few of the studies included in their analysis made use of incentives.
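The link between response rates, sample composition, and nonresponse error that runs through this section can be summarized with the standard deterministic expression for nonresponse bias (e.g., Groves & Couper, 1998):

\[
\mathrm{Bias}(\bar{y}_r) = \bar{Y}_r - \bar{Y} = \frac{M}{N}\left(\bar{Y}_r - \bar{Y}_m\right),
\]

where \(\bar{Y}_r\) is the mean of the survey variable among respondents, \(\bar{Y}_m\) the mean among nonrespondents, and \(M/N\) the nonresponse rate. Viewed this way, an incentive reduces nonresponse bias only to the extent that it lowers the nonresponse rate without widening the gap between respondents and nonrespondents; this is the mechanism at work when the improved representation of nonvoters in Tourangeau et al. (2010) shrank the estimated bias in voting reports, albeit not significantly.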
1.3 INCENTIVES AND MEASUREMENT ERROR

Measurement error is any inaccuracy in survey responses that is due to the
process of measurement; this type of error can be differentiated from nonresponse error, discussed earlier, which arises from the failure to get some sample members to respond in the first place. Measurement error exists when the measured value in a survey differs from the corresponding unobserved “true” value (Bohrnstedt, 2010), although it may be difficult, or even impossible, for the researcher to know this true value. Several potential sources of measurement error have been identified in the literature, including the interviewer, the respondent, and features of the survey design, such as mode of administration or question wording (Groves, 1989). Offering an incentive is an additional design decision that could have repercussions for the magnitude of measurement error in the resulting estimates. However, this possibility has received relatively little attention in the literature as compared to the effect of incentives on nonresponse.

1.3.1 Theory-Based Expectations for Effect of Incentives on Measurement Error

Incentives conceivably have the potential to either increase or decrease measurement error through their influence on respondent behavior. The theories used to explain why incentives convince sample members to respond have conflicting implications for the effect of incentives on the quality of the answers provided during the interview. For example, according to social exchange theory, offering prepaid incentives is potentially the first step toward building a positive relationship between the researcher and the respondent; giving sample members a reward before receiving their responses implies that the researcher trusts and respects them. If respondents are motivated by a sense of community with the researchers, they may feel more comfortable while completing the survey, and, as a result, they may put forth more effort than they would
have otherwise. They also may be more willing to respond honestly to questions that are typically subject to social desirability biases. For example, a review of 74 incentive experiments in laboratory studies suggested that self-presentation concerns were reduced among the participants who had received incentives (Camerer & Hogarth, 1999). However, this feeling of having a positive relationship with the researcher could also lead respondents to focus too heavily on pleasing the researcher; as a result, respondents may provide more positive ratings, either generally across all items or specifically for items referring to the survey sponsor, than they would have otherwise. Offering respondents incentives also could affect their motivations for completing the survey; in particular, it may lead them to focus on extrinsic motivations instead of intrinsic ones. For example, according to social exchange theory, incentives may create a sense of obligation toward the researcher, and this feeling may be what motivates sample members to respond. Another possibility, as suggested by leverage-saliency theory, is that incentives may convince otherwise uninterested sample members to respond. In both cases, respondents are focused on an extrinsic motivation, as opposed to an intrinsic one. It seems reasonable that people who are motivated by extrinsic factors such as monetary rewards may put forth less effort than those who are focused on intrinsic ones, such as interest in the survey topic or enjoyment of sharing one’s opinions. Research on the importance of intrinsic motivation to academic success has supported this assumption (e.g., Bolkan, Goodboy, & Griffin, 2011; Fransson, 1977). Research on the quality of responses provided by reluctant respondents, who can be assumed to have low levels of motivation to participate, suggests that such respondents sometimes provide lower quality data than do more eager respondents
(Cannell and Fowler, 1963; Triplett, Blair, Hamilton, & Kang, 1996; Fricker & Tourangeau, 2010); however, other studies have failed to find a clear relationship between reluctance and data quality (Kaminska, McCutcheon, & Billiet, 2010; Yan, Tourangeau, & Arens, 2004). A final possibility is that, once sample members have committed to taking part in the survey, the incentive has little to no further impact on their behavior. Social exchange theory suggests that sample members are driven to respond by a sense of obligation to the researcher, while cognitive dissonance theory suggests they are driven by the desire to avoid the dissonance associated with refusal once they have accepted the incentive. If agreeing to participate in the survey satisfies these needs, then any further behaviors taken during data collection may not be influenced by the fact that the respondent has received an incentive. Consistent with the idea that incentives have little bearing on respondent behavior during the survey itself, Camerer and Hogarth’s (1999) review of incentive experiments in laboratory studies found that incentives typically do not affect performance in such studies.

1.3.2 Comparison of Incentive and Control Group Response Distributions

In the incentives literature, the presence of measurement error is typically assessed in one of three ways. The first is to compare the response distributions of two or more groups of respondents who have been randomly assigned to different experimental conditions. Differences between the groups’ responses suggest that there may be a greater amount of error in one of the groups. However, it can be difficult to know whether these differences are caused by a change in who responds (nonresponse error) or by a change in how they respond (measurement error). Furthermore, in the absence of some gold
standard to which we can compare the survey responses, we cannot easily tell which of the groups exhibits more error. There is some evidence that offering incentives can affect survey response distributions. Generally, these differences have been observed for attitudinal items in studies that have offered prepaid cash incentives - suggesting that incentives can lead to more positive survey responses. For example, respondents who received a prepaid cash incentive in a mail survey offered more positive comments about the sponsor in openended items than did those who did not receive an incentive; the researchers argue this occurred because receiving the incentive led to increased favorability toward the sponsor (James & Bolstein, 1990). In the telephone-administered Survey of Consumers, offering a $5 prepaid cash incentive had a significant effect on responses to four of seventeen key attitudinal questions; the authors suggest this happened because receiving the incentive put the respondents in a good mood (Singer et al., 2000). Brehm (1994) also found that offering a prepaid cash incentive led to more positive responses to several political attitude questions in a telephone survey. The use of prepaid cash incentives led to greater reported levels of concern about social issues for six of ten items in a mail survey, though the incentives did not increase respondents’ willingness to pay to improve the condition of these social issues (Wheeler, Lazo, Heberling, Fisher, & Epp, 1997). Finally, respondents who had been promised a $1 reduction in their hotel rate in exchange for completing a questionnaire were less likely to provide negative comments about their stay, as compared to a control group (Trice, 1984). I am aware of only two studies where incentives were found to affect response distributions to non-demographic factual items. In these cases, providing prepaid cash
incentives resulted in lower estimates of community involvement (Groves et al., 2000) and volunteering (Medway et al., 2011). However, it is impossible to know whether these differences were caused by changes in sample composition (those who volunteer their time for community activities also may be the type of people who are willing to do surveys without incentives, while those who do not do so may require an incentive to motivate them to take part in survey research) or by an increased obligation to be honest about not taking part in these socially desirable behaviors. Several other studies have concluded that incentives do not affect response distributions. Offering a prepaid cash incentive led to significantly different responses for only five percent of the questions in a mail study (Shettle & Mooney, 1999). Similarly, overall, James and Bolstein (1990) did not find significant differences in the response distributions of 28 closed questions in a mail survey when prepaid cash incentives were offered. Offering a prepaid non-monetary incentive did not have a significant effect on responses to ten items in the face-to-face DAS (Willimack et al., 1995). Finally, offering a contingent cash incentive between $10 and $40 did not affect response distributions in two government-sponsored face-to-face studies on substance abuse (the National Survey on Drug Use & Health (NSDUH) and the Alcohol and Drug Services Study (ADSS)) (Eyerman, Bowman, Butler, & Wright, 2005; Krenzke, Mohadjer, Ritter, & Gadzuk, 2005). It is not clear why incentives affect response distributions in some surveys and not in others. One reason may be that researchers have been inconsistent across studies in their selection of items to analyze. For example, some studies focus only on responses to the key survey questions (e.g., Curtin, Singer, & Presser, 2007; Singer et al., 2000), while
others consider all of the survey items as one large group (e.g., Shettle & Mooney, 1999). It is difficult to know whether restricting the analysis (or not doing so) is what led to these divergent results. Moving forward, the literature would benefit from a more systematic examination of the importance of various item characteristics. For example, does it matter if the item is a “key” measure that is directly related to the stated survey topic? Are attitude questions more likely to be affected by incentives than factual ones? Does the sensitivity of the item matter? How about placement in the questionnaire? Many other recent incentive experiments fail to discuss the potential effect of incentives on response distributions. In others, the possibility of an effect is mentioned but quickly dismissed without analyzing the data; this decision is, at times, based on the results of a handful of older studies that found offering incentives did not affect survey responses. However, these older studies exhibit features that prevent their results from generalizing to all surveys offering incentives. For example, several of them used very specialized, highly educated populations and surveyed them about topics that were of specific interest to them (Goodstadt, Chung, Kronitz, & Cook, 1977; Hansen, 1980; Mizes, Fleece, & Roos, 1984). Furthermore, several of these studies differed from more recent studies in that they were able to achieve response rates of over 60%, even for the groups that did not receive an incentive (Goodstadt et al., 1977; Mizes et al., 1984; Nederhof, 1983).

1.3.3 Comparison of Survey Responses to Validation Data

A weakness of comparing response distributions is that, even when differences are observed between incentive and control group responses, it often is impossible to tell which group’s responses exhibit less error. A second approach, which overcomes this
limitation, is to compare survey responses to validation data – often administrative records. In incentives research, this means that the relative accuracy of responses provided by those who have received an incentive can be compared to that of respondents who have not received one. This method can be challenging to implement because of the difficulty of obtaining access to administrative records; however, a fair number of studies have successfully used validation data to demonstrate that respondents often provide inaccurate answers. For example, this method has been used to demonstrate underreporting of socially desirable behaviors, such as voting (Traugott & Katosh, 1979), or respondents’ difficulty recalling certain types of events, such as their children’s vaccination history (Luman, Ryman, & Sablan, 2009). This method also has been used to demonstrate the impact of design features, such as survey mode, on the accuracy of survey responses to sensitive questions (Kreuter, Presser, & Tourangeau, 2008). However, using a record check to determine the accuracy of survey responses has rarely been done in conjunction with incentive experiments; in fact, in their review of the incentives literature, Singer and Ye (forthcoming) specifically point to the lack of research investigating the impact of incentives on the validity of survey responses. I am aware of only four incentive studies that have compared the relative accuracy of reports given by respondents who have received an incentive and those who have not received one. Two of these studies offered prepaid cash incentives. The first was a mail survey of people who had bought a major appliance at one of five stores in the Midwest; sample members were randomly assigned to receive 25 cents prepaid or no incentive. McDaniel and Rao (1980) asked respondents factual questions about their
purchase (such as model name, price paid, and date of purchase) and compared respondents’ reports with store records. They found that respondents who had received the prepaid incentive committed significantly fewer errors on average than members of the control group.1 The second study that included a prepaid cash incentive experiment was a survey of registered voters in which one half of the sample members received $5 and the other half did not receive an incentive; respondents also were randomly assigned to either mail or telephone administration. Tourangeau et al. (2010) compared respondents’ self-reports of voting status in two elections to records from the Aristotle database of registered voters. They found that, for both elections, the incentive did not significantly affect the proportion of respondents that misreported. However, the direction of the effect was the same for both items – in the 2004 election the incentive led to a ten percentage point increase in the prevalence of misreporting among those who had received an incentive, and in the 2006 election it led to a five percentage point increase in the prevalence of misreporting. The other two studies offered incentives contingent on completion of the survey. The first of these was a mail survey of elites, such as university professors and cabinet ministers, in 60 countries; the topic of the survey was family planning and population growth (Godwin, 1979). One third of the sample was offered a promised incentive of $25, one third was offered a promised incentive of $50, and the final third was not offered an incentive. For 28 factual questions such as, “Are contraceptives available in clinics in [country]?” Godwin compared the survey responses to published records and
1. The authors do not mention whether there were any significant differences in the sample composition of the incentive and control groups on variables such as length of time since purchase, so we cannot be certain whether the difference in response quality was driven by changes of this nature.
the responses of any other respondents from the same country. He grouped respondents into “low”, “medium”, and “high” accuracy groups and found that being offered an incentive significantly increased the proportion of correct responses. This was particularly true for the group that was offered $50; 50% of these respondents fell into the “high” accuracy category, as compared to 26% of those offered $25 and only 20% of those in the control group.2 The final instance was an incentive experiment in the Alcohol and Drug Services Study (ADSS); this study interviewed individuals who were recently discharged from substance abuse treatment facilities. In this experiment, there were three incentive conditions and one control group. Two of these incentive groups were offered either $15 or $25 contingent on completion of a face-to-face interview, and all three groups were offered $10 in return for submitting to a urine drug test. In their analysis of this experiment, Krenzke et al. (2005) utilized the $15/$10 group as the comparison group. The researchers reported two efforts to compare respondents’ self-reports with validation data. First, 20 survey responses, mostly asking about drug use, were compared to records from the treatment facility (Table 1.1). Next, respondents’ self-reports of drug use in the past seven days and past 24 hours were compared to the results of urine tests (Table 1.1).3 Overall, these results suggest that the $15 incentive led to limited improvements in accuracy; responses to four of twenty survey items were significantly more likely to be accurate as compared to facility records, and self-reports of drug use as compared to urine tests were significantly more likely to be accurate for one of six items.
2. The author does not discuss whether these differences may have been driven by differences in sample composition between the incentive and control groups.
3. Respondents were told that they would be subject to a drug test before they provided the self-report; therefore, respondents who may have otherwise lied about their drug use may have decided to be honest in this particular case.
Furthermore, offering a $25 contingent incentive led to significant reductions in accuracy for two of the four survey items where we saw improvements with a $15 incentive.

Table 1.1. Effect of Incentive on Accuracy of Self-Reports (Krenzke et al., 2005)

$15 Contingent Incentive vs. No Contingent Incentive
  Compared to Treatment Facility Records: Significant improvement for 4 of 20 items; significant reduction for 1 item
  Compared to Urine Test: Significant improvement for 1 of 6 items

$25 Contingent Incentive vs. $15 Contingent Incentive
  Compared to Treatment Facility Records: Significant reduction for 2 of 20 items
  Compared to Urine Test: No effect
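In their simplest form, the record-check comparisons summarized in Table 1.1 come down to comparing the proportion of accurate reports in the incentive and control conditions. The sketch below illustrates that comparison with a two-proportion z-test; the counts are hypothetical, and published studies typically use more elaborate analyses that adjust for sample composition.

```python
from math import sqrt, erfc

def two_proportion_ztest(accurate_1, n_1, accurate_2, n_2):
    """Compare the share of record-confirmed (accurate) reports in two groups.

    accurate_1, accurate_2: respondents whose answers matched the records
    n_1, n_2: respondents checked against records in each condition
    Returns the z statistic and a two-sided p-value (normal approximation).
    """
    p1, p2 = accurate_1 / n_1, accurate_2 / n_2
    pooled = (accurate_1 + accurate_2) / (n_1 + n_2)      # accuracy rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / se
    return z, erfc(abs(z) / sqrt(2))                      # p = 2 * (1 - Phi(|z|))

# Hypothetical record check: 160 of 200 incentive respondents and 140 of 200
# control respondents gave a report that matched the administrative record.
z, p = two_proportion_ztest(160, 200, 140, 200)
print(f"accuracy: incentive = {160/200:.2f}, control = {140/200:.2f}")
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```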
Thus, these four studies report conflicting results. The two studies finding an effect differ from those that did not on several dimensions. First, the two studies finding an increase in accuracy were published quite a while ago (1979, 1980), while those finding no effect were published more recently (2005, 2010). Second, the two studies that found an incentive effect were both mail studies, whereas at least some of the respondents in the two studies that did not find an effect utilized an interviewer-administered mode. Finally, the studies finding an improvement in quality looked at the accuracy of non-sensitive factual questions, while the two that found no effect looked at the accuracy of somewhat sensitive topics. Because only four studies have been conducted, it is difficult to know which of these dimensions is the most important.

1.3.4 Comparison of Effort Indicators

The prevalence of measurement error also may be assessed in a third way; in this method, respondents who have received an incentive again are compared with those who have not received one. However, this method examines indirect indicators of data quality, such as missing data rates, thought to reflect respondents’ level of effort. Although effort indicators are only indirect measures of data quality, respondents who put forth greater
effort also may provide responses that have less measurement error. This method has frequently been employed in mode comparisons; for example, researchers have found that telephone respondents are more likely to satisfice than are face-to-face respondents (Holbrook, Green, & Krosnick, 2003; but see Roberts, Jäckle, & Lynn, 2007) or Internet respondents (Chang & Krosnick, 2009) and that cell phone respondents are no more likely than landline ones to take cognitive shortcuts (Kennedy & Everett, 2011). Much of the existing literature investigating the impact of incentives on respondent effort focuses on item nonresponse or on the length of responses to open-ended questions. Many of these studies have concluded that incentives do not have a significant impact on the prevalence of item nonresponse (e.g., Berk et al., 1987; Berlin et al., 1992; Cantor et al., 2008; Curtin et al., 2007; Furse & Stewart, 1982; Peck & Dresch, 1981; Shettle & Mooney, 1999). This conclusion has been reached across a variety of incentive types and a multitude of survey characteristics. For example, sending prepaid cash incentives in a mail survey of cable subscribers did not significantly affect the proportion of items that respondents skipped (James & Bolstein, 1990). Dirmaier and colleagues came to the same conclusion in a mail survey of psychotherapy patients (Dirmaier, Harfst, Koch & Schulz, 2007). Similarly, offering a non-contingent voucher in the in-person Survey of Income and Program Participation (SIPP) did not have a significant impact on the proportion of cases that were considered “mostly complete” (Davern, Rockwell, Sherrod, & Campbell, 2003). Finally, offering a contingent incentive in the National Adult Literacy Survey did not have a significant effect on the proportion of items that respondents attempted (Berlin et al., 1992).
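Item nonresponse, the indicator examined in most of these studies, is simple to operationalize: compute the share of administered items each respondent left unanswered, then compare the average share across incentive conditions. A minimal sketch with hypothetical data follows; real analyses would add a significance test and, often, separate rates for open-ended, closed, or sensitive items.

```python
# Illustrative computation of an item nonresponse effort indicator;
# the respondents and answer codes below are hypothetical.
MISSING = None  # code used when a respondent skipped an item

respondents = [
    {"condition": "incentive", "answers": [3, 1, MISSING, 2, 4, 5]},
    {"condition": "incentive", "answers": [2, 2, 3, 1, 4, 4]},
    {"condition": "control",   "answers": [MISSING, 1, MISSING, 2, 4, MISSING]},
    {"condition": "control",   "answers": [5, MISSING, 3, 2, 4, 1]},
]

def item_nonresponse_rate(answers):
    """Proportion of administered items left unanswered by one respondent."""
    return sum(answer is MISSING for answer in answers) / len(answers)

def mean_rate(condition):
    rates = [item_nonresponse_rate(r["answers"])
             for r in respondents if r["condition"] == condition]
    return sum(rates) / len(rates)

for condition in ("incentive", "control"):
    print(f"{condition}: mean item nonresponse rate = {mean_rate(condition):.2f}")
```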
However, several other studies have observed a reduction in item nonresponse when incentives are utilized; again, these studies have used both prepaid and promised incentives, have been conducted in a variety of modes, and have asked respondents about a wide range of topics. For example, in a mail survey of people who had bought a major appliance at one of five stores in the Midwest, sending a prepaid cash incentive of 25 cents significantly reduced the mean number of items that respondents skipped (McDaniel & Rao, 1980). Similarly, sending a prepaid debit card worth $40 in the first wave of the face-to-face Consumer Expenditure Quarterly Interview Survey significantly reduced the number of items that respondents skipped in both the first wave and subsequent waves; the use of a $20 debit card also slightly reduced item nonresponse but not significantly so (Goldenberg, McGrath, & Tan, 2009). Offering a promised incentive of either $20 or $50 significantly reduced item nonresponse in an international mail survey of elites (Godwin, 1979). Finally, offering a promised incentive of $10 in a telephone survey of Chicago residents significantly reduced the number of items that respondents skipped; this decrease was driven by a reduction in the number of "don't know" responses (Goetz et al., 1984). None of the incentive experiments I found resulted in a significant overall increase in item nonresponse.
The studies listed above provided information about how the incentive affected item nonresponse across all items for all respondents; however, it is possible that the effect of the incentive was restricted to certain subgroups of respondents or particular types of items. Only a few studies have considered either of these possibilities, and those that have done so have tended to find conflicting results. For example, Singer et al. (2000) found that receiving an incentive in the Survey of Consumers led to a significant
reduction in item nonresponse for two particular subgroups – older respondents and non-Whites; however, this result was not replicated in a similar experiment conducted in a subsequent administration (Curtin et al., 2007). The possibility that incentives affect open and closed items differently has been considered in two studies, with conflicting results. Hansen (1980) found that providing a prepaid incentive of either 25 cents or a ballpoint pen led to a significant increase in the proportion of open-ended items that were skipped but had no effect on closed items. However, McDaniel and Rao (1980) found that offering a prepaid incentive of 25 cents significantly reduced item nonresponse for both open-ended and closed items. Two face-to-face studies considered the possibility that the effect of the incentive on item nonresponse might differ by item sensitivity, again with conflicting results. Providing a prepaid monetary incentive of three to five pounds in the British Social Attitudes Survey reduced item nonresponse for non-sensitive questions but increased it for sensitive ones (Tzamourani & Lynn, 1999). However, in a study of recently-released clients of drug and alcohol treatment facilities, offering a promised incentive of $15 to $25 did not have a significant impact on the proportion of items respondents skipped, regardless of item sensitivity (Krenzke et al., 2005). In this same study, the researchers hypothesized that the effect of the incentive would be greater in the final section of the interview, when respondents were tired, but this prediction was not supported by the data.
The other effort-related outcome that frequently has been analyzed in incentive studies is the quality of open-ended responses, generally operationalized as the number of words or number of ideas included in the response. As with those that have looked at item nonresponse, these studies have tended to find either an improvement in quality with
an incentive or no effect. For example, in a telephone follow-up to the National Election Studies, respondents who had received a prepaid incentive of either $1 or a pen provided significantly more ideas on average in response to two open-ended questions (Brehm, 1994). Similarly, respondents to a telephone survey who had been promised $10 provided significantly more words on average in response to open items (Goetz et al., 1984). In a contradictory finding, Hansen (1980) found that mail respondents who were given a prepaid incentive of either 25 cents or a pen provided significantly fewer words on average; coders also rated the incentive groups’ responses to be of lower quality on average. Interestingly, several studies that have offered more than one value of monetary incentive have found that only the larger amount has resulted in improved response quality. For example, in the British Social Attitudes Survey, respondents who were given five pounds provided significantly longer responses to open-ended items as compared to a control group – but receiving three pounds did not have a significant effect on response length (Tzamourani & Lynn, 1999). James and Bolstein (1990) conducted an incentive experiment as part of a mail survey in which respondents were given prepaid cash incentives of either 25 cents, 50 cents, $1, or $2. They found that only those respondents who had received at least 50 cents wrote significantly more words than the control group for an open-ended question. For a short-answer question where respondents were given space to write up to four comments, they also found that only those respondents who had received at least $1 provided a significantly greater number of comments. In a mail survey of elites, respondents who were promised $50 provided significantly more
detailed responses to an open item, but there was not a significant improvement in quality among respondents who were promised $25 (Godwin, 1979). It is rare for incentive studies to have considered data quality indicators beyond item nonresponse and responses to open-ended items; again, the studies that have done so have generally found that incentives either improve effort or have no effect. For example, the number of events recorded in a diary increased when an incentive was provided (Lynn & Sturgis, 1997). Receiving a prepaid voucher worth either $10 or $20 did not have a significant effect on the number of imputations or edits required for 40 items in the SIPP (Davern et al., 2003). In two other recent surveys, prepaid cash incentives did not affect the number of responses selected for a check-all-that-apply item, the proportion of respondents who provided at least one pair of inconsistent responses, or the proportion of respondents who provided round numerical responses (Medway et al., 2011). Finally, a few studies have found that respondents who have received incentives have been more willing to submit to requests that imply potential additional burden or may raise privacy concerns; for example, respondents who received prepaid incentives were more likely to provide additional contact information (Shettle & Mooney, 1999; Medway et al., 2011), and respondents who were offered promised incentives were more likely to agree to a urine drug test (Krenzke et al., 2005).
1.4 SUMMARY AND CONCLUSIONS
Survey response rates have been declining in recent years. Because incentives
repeatedly have been found to increase survey response rates, they are utilized increasingly in surveys. In particular, prepaid incentives are more effective than promised ones, and monetary incentives are more effective than non-monetary ones. There is some evidence that larger incentives yield greater increases in the response rate; however, there may be diminishing returns from each additional dollar spent. Incentives may be effective at improving the representation of groups that are traditionally hard to reach, such as youth or minorities, as well as people who lack interest in the survey topic or a general interest in participating in research; however, this effect has not been observed across the board. Furthermore, there is mixed evidence as to the utility of incentives for reducing nonresponse bias. Given the widespread use of incentives, it is important to determine whether incentives affect the level of measurement error in surveys. Fewer studies have looked at measurement effects than at effects on nonresponse, but those that have looked at this issue have generally taken one of three approaches: (1) comparing response distributions, (2) comparing responses to validation data, or (3) comparing respondent effort indicators. These studies typically have concluded that incentives improve the quality of survey data or have no effect on it. To improve our understanding of incentives’ effect on measurement error, we need to move beyond the types of analyses that traditionally have been conducted. For example, comparisons of effort indicators typically have only considered the effect on item nonresponse and responses to open-ended questions; in Chapter 2, I report on the
impact that prepaid cash incentives had on the prevalence of a wider array of satisficing behaviors in a telephone survey. Furthermore, comparisons of response distributions usually have considered all of the survey items as one large group, without any differentiation between types of items; in Chapter 3, I hypothesize that the effect of incentives on responses may vary by item sensitivity and discuss the results of a prepaid cash incentive experiment in a mail survey designed to test this possibility. Additionally, few studies have compared survey responses to validation data; in Chapter 3, I also report on the accuracy of responses to three survey items as compared to administrative records. Furthermore, the existing literature rarely examines whether the incentive had a differential impact on measurement error across subgroups of the sample; in these two analytical chapters, I discuss whether the effect of the incentive was restricted to individuals with particular characteristics, such as younger respondents or those with more education. Finally, existing studies typically report on the incentive's effect on each item in isolation; in Chapter 4, I discuss whether prepaid cash incentives affected the relationships between survey responses intended to measure latent characteristics in several recent surveys by testing for measurement invariance between incentive and control group responses.
CHAPTER 2
SATISFICING IN TELEPHONE SURVEYS: DO PREPAID CASH INCENTIVES MAKE A DIFFERENCE?
2.1 INTRODUCTION
Telephone survey response rates have declined considerably in recent years (Brick & Williams, forthcoming; Curtin et al., 2005; Steeh et al., 2001). Incentives are one tool for stemming this decline (Cantor et al., 2008; Curtin et al., 2007; Goetz et al., 1984; Moyer & Brown, 2008; Singer et al., 2000), and, as a result, survey practitioners are often eager to utilize them. However, several studies have found that offering prepaid incentives in telephone surveys can increase the cost per completed interview (Brick et al., 2005; Curtin et al., 2007; Gelman, Stevens, & Chan, 2003; but see Singer et al., 2000). Additional positive outcomes beyond increased response rates may be needed to justify these costs.
If incentives motivate some respondents to participate who otherwise would have declined, they also may influence respondents' behavior during the survey interview. Respondents seem more prone to satisfice in telephone surveys than in other modes (Chang & Krosnick, 2009; Hall, 1995; Holbrook et al., 2003), so the potential effect of incentives on respondent effort in telephone surveys is of particular interest. Existing research investigating the effect of incentives on respondent effort in telephone surveys suggests that incentives either result in increased effort or have no effect (e.g., Brehm, 1994; Goetz et al., 1984; Singer et al., 2000). However, these studies are limited in number and examine relatively few indicators of respondent effort. It would be useful to have more evidence about the effect of incentives on respondent effort in telephone surveys.
This chapter describes the methods and results of an experiment using a prepaid cash incentive in a telephone survey. It aims to overcome two limitations of prior research. First, the current study examines the impact of incentives on a wider array of effort indicators than has been considered in the earlier studies. Second, it assesses whether incentives' effect on effort varies according to respondent or item characteristics, whereas most prior research has only discussed their effect on all respondents or items in the aggregate.
2.1.1 Satisficing Theory: Respondent Motivation as a Predictor of Effort
Completing survey interviews can be cognitively taxing for respondents. Though researchers may hope that respondents carefully proceed through all four components of the response process (Tourangeau, Rips, & Rasinski, 2000), cognitive fatigue or lack of interest may lead them to take shortcuts when responding. Instead of responding carefully, respondents may not process survey questions thoroughly and may provide acceptable responses instead of optimal ones (Krosnick, 1991). Satisficing theory proposes a framework for understanding the conditions under which respondents take these cognitive shortcuts (Simon, 1956; Krosnick & Alwin, 1987). According to this theory, the probability that a respondent will satisfice for any given task is a function of three factors – task difficulty, respondent ability to complete the task, and respondent motivation to do so:

P(satisficing) ∝ (task difficulty) / [(respondent ability) × (respondent motivation)]
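To make the direction of each term concrete, here is a minimal Python sketch of the proportionality expressed above; the function name, the constant of proportionality, and the numeric scales of the three inputs are illustrative assumptions, not part of Krosnick's formulation.

    def satisficing_propensity(task_difficulty, ability, motivation, k=1.0):
        # Illustrative only: the propensity rises with task difficulty and
        # falls as ability and motivation increase (inputs assumed positive).
        return k * task_difficulty / (ability * motivation)

    # The same difficult item yields a higher propensity for a less
    # motivated respondent than for a highly motivated one.
    print(satisficing_propensity(0.8, ability=0.9, motivation=0.9))  # ~0.99
    print(satisficing_propensity(0.8, ability=0.9, motivation=0.4))  # ~2.22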
Respondents are more likely to satisfice when the task at hand is difficult; however, the greater their ability and motivation, the less likely they are to do so (Krosnick, 1991). Several indicators of satisficing have been proposed, including response order effects,
straight-lining, item nonresponse, and acquiescence (Krosnick, 1991; Krosnick, 1999; Krosnick, Narayan, & Smith, 1996). Research comparing the quality of responses provided by reluctant respondents with that of the responses provided by those who participate more readily supports the hypothesis that respondents with more motivation may provide higher quality responses than those with less motivation (Cannell & Fowler, 1963; Triplett et al., 1996; Fricker & Tourangeau, 2010; Friedman, Clusen, & Hartzell, 2003; but see Kaminska et al., 2010; Yan et al., 2004). Prepaid cash incentives clearly increase respondents' motivation to take part in surveys; if they also affect respondents' motivation during the interview, they may alter the prevalence of satisficing behaviors.4
2.1.2 Theoretical Expectations for Effect of Incentives on Respondent Motivation
Several theories have been offered to explain why prepaid incentives increase response rates. Typically, when researchers have proposed these theories, they have not discussed their implications for the effect of incentives on respondent behavior during the survey interview. Although these theories all agree that incentives should increase response rates, extending them to predict the effect of incentives on respondent motivation and effort during the interview leads to inconsistent predictions. The theories suggest four possible effects of incentives on effort: (1) greater effort due to respondents' sense of obligation to the survey researcher, (2) reduced effort due to the influx of uninterested, unmotivated respondents, (3) reduced effort due
4. Incentives also may affect the likelihood of satisficing by altering the average cognitive ability of the respondent pool. For example, incentives may reduce the proportion of respondents who are older (Dillman, 1996; Miller, 1996; Storms & Loosveldt, 2004); satisficing has been found to be more common among older respondents than among younger ones (Krosnick & Alwin, 1987). Thus, changes in the distribution of cognitive ability need to be taken into consideration when comparing the prevalence of satisficing behaviors among those who have received an incentive and those who have not received one.
to a focus on extrinsic motivations instead of intrinsic ones, and (4) no effect beyond the point of agreeing to participate. Consider the first of these possibilities – that incentives lead to greater effort due to respondents’ sense of obligation to the survey researcher. According to social exchange theory, prepaid incentives may invoke the norm of reciprocity (Gouldner, 1960). When sample members receive an incentive they may feel obligated to return the kindness by responding to the survey (Dillman et al., 2009). A related notion is that incentives create a positive attitude toward the sponsor; some studies find more favorable ratings of the survey sponsor, supporting the hypothesis that prepaid incentives help foster a positive relationship between the sample member and the researcher (e.g., James & Bolstein, 1990). These positive feelings toward the sponsor also could lead respondents to make greater effort than they would have otherwise. Alternatively, incentives could result in reduced effort due to an influx of uninterested, unmotivated respondents. Incentives may convince certain sample members to respond who are not drawn to other survey features, such as the topic (e.g., Groves et al., 2004). Such respondents may lack sufficient interest and motivation to provide high quality responses. Additionally, offering incentives could result in reduced effort due to a focus on extrinsic motivations instead of intrinsic ones. The sense of obligation posited by social exchange theory may actually reduce motivation if it causes respondents to focus too heavily on extrinsic reasons for completing the survey. People who are motivated by intrinsic factors, such as interest in the survey topic or enjoyment of sharing one’s opinions, may put forth more effort than those who are focused on extrinsic ones, such as
monetary rewards or a sense of obligation. Research on the importance of intrinsic motivation to academic success has supported this assumption (e.g., Bolkan et al., 2011; Fransson, 1977). Similarly, Rush, Phillips, and Panek (1978) report that paid subjects were "striving more for task completion rather than for success" (p. 448). Thus, respondents who see the incentive as their main, or only, motivation for participating may put forth less effort than those who take part for other reasons.
A final possibility is that once sample members have committed to taking part in the survey, the incentive has little to no further impact on their behavior. Agreeing to participate in the survey may be sufficient for most sample members to feel they have resolved any cognitive dissonance or met the obligations of the norm of reciprocity. If this is the case, then any further behaviors taken during data collection may not be influenced by the fact that the respondent has received an incentive. Camerer and Hogarth's (1999) review of incentive experiments in laboratory studies found that incentives typically do not affect performance, supporting the view that incentives may not affect behavior during survey interviews. Additional empirical evidence is needed to help determine which of these expectations is the most accurate, and whether our expectations should vary according to survey design features or respondent characteristics.
2.1.3 Existing Studies Investigating the Impact of Incentives on Respondent Effort
Most studies investigating the impact of incentives on respondent effort focus on item nonresponse. Several have observed a reduction in item nonresponse when incentives are utilized (Godwin, 1979; Goetz et al., 1984; Goldenberg et al., 2009; James & Bolstein, 1990; McDaniel & Rao, 1980). However, many others have concluded that
incentives do not have a significant impact on the rate of item nonresponse (e.g., Berk et al., 1987; Berlin et al., 1992; Cantor et al., 2008; Curtin et al., 2007; Davern et al., 2003; Furse & Stewart, 1982; Peck & Dresch, 1981; Shettle & Mooney, 1999). It is unclear what design characteristics lead to these divergent results; the studies in both groups have used prepaid and promised incentives, have been conducted in a variety of modes, and have asked respondents about a wide range of topics. None of the incentive experiments I found resulted in a significant overall increase in item nonresponse. The other effort-related outcome that frequently has been analyzed in incentive studies is the quality of open-ended responses, generally operationalized as the number of words or number of ideas included in the response. Several studies have concluded that incentives lead to an improvement in quality (Brehm, 1994; Goetz et al., 1984; Willimack et al., 1995), although multiple studies that have offered more than one value of monetary incentive have found that only the larger amount has resulted in improved response quality (Godwin, 1979; James & Bolstein, 1990; Tzamourani & Lynn, 1999). I only came across one study where the incentive led to a significant reduction in response quality (Hansen, 1980). Only a few incentive studies have considered effort indicators beyond item nonresponse and responses to open-ended items; again, the studies that have done so have generally found that incentives either improve effort or have no effect. For example, the number of events recorded in a diary was increased when an incentive was provided (Lynn & Sturgis, 1997). However, a prepaid voucher did not have a significant effect on the number of imputations or edits required for 40 items in the Survey of Income and Program Participation (Davern et al., 2003). In two other recent surveys, prepaid cash
incentives did not affect the number of responses selected for a check-all-that-apply item, the proportion of respondents who provided at least one pair of inconsistent responses, or the proportion of respondents who provided round numerical responses (Medway et al., 2011). Finally, a few studies have found that respondents who have received incentives have been more willing to submit to requests that imply potential additional burden; for example, respondents who received prepaid incentives were more likely to provide additional contact information (Shettle & Mooney, 1999; Medway et al., 2011), and respondents who were offered promised incentives were more likely to agree to a urine drug test (Krenzke et al., 2005). These studies provide information about how incentives affected effort across all items for all respondents; however, it is possible that the effect of the incentive is restricted to certain subgroups of respondents or particular types of items. Only a few studies have examined these possibilities. For example, Singer et al. (2000) found that incentives in the Survey of Consumers led to a significant reduction in item nonresponse within two particular subgroups – older respondents and non-Whites; however, this result was not replicated in a similar experiment in a subsequent administration (Curtin et al., 2007). Similarly, McDaniel and Rao (1980) found that a prepaid incentive significantly reduced item nonresponse for both open-ended and closed items, while Hansen (1980) found that the effect of incentives was limited to open-ended items. Finally, Tzamourani and Lynn (1999) concluded that providing a prepaid monetary incentive reduced item nonresponse for non-sensitive questions but increased it for sensitive ones, while Krenzke and colleagues (2005) found that offering a promised cash incentive did not have a significant impact on the proportion of items respondents skipped, regardless of item
sensitivity. In this same study, the researchers hypothesized that the effect of the incentive would be greater in the final section of the interview, when respondents were tired, but this prediction was not supported by the data.
2.1.4 Extending the Literature
As this review shows, most of the existing incentive experiments have focused on item nonresponse as the primary indicator of respondent effort. Item nonresponse rates are an attractive indicator of data quality in the sense that they are easily calculated and compared across studies. Furthermore, reducing item nonresponse is desirable because it decreases the amount of imputation that must be done. However, while the level of item nonresponse is a widely used indicator of respondent effort in survey research, it only captures the absolute minimum amount of information about respondent effort; it tells researchers that the respondent took the time to provide an answer, but it tells them nothing about the actual quality of that response. Several other indicators of effort that hold respondents to a higher standard, such as response order effects and nondifferentiation, have been utilized in other areas of survey research but have not been applied to incentives research. Measuring the impact of incentives on such indicators would improve researchers' knowledge of the degree to which incentives influence respondent effort. The current study includes measures of twelve indicators of respondent effort; the operationalization of each indicator is discussed at greater length in the Methods section, but most of them are derived from the literature on survey satisficing.
Furthermore, the current study examines the possibility that the effect of incentives varies according to respondent or item characteristics. Significant effects at the subgroup level may be masked at the aggregate level. I hypothesize that an incentive will
increase effort among respondents who recall receiving it but not among other respondents. I also hypothesize that incentives will have a greater impact on respondents with higher levels of cognitive ability and lower levels of conscientiousness. Indicators of these characteristics are included in the current study and are discussed in further detail in the Methods section of this chapter. Finally, I hypothesize that the effect of the incentive will be greater for two types of items. First, due to cognitive fatigue, the incentive will have a greater impact on effort in the second half of the interview than it will in the first half. Second, because answers to attitude items are more affected by context than answers to factual or behavioral questions, the incentive will have a greater effect on responses to the former than on responses to the latter.
2.2 RESEARCH METHODS
2.2.1 Sampling Frame and Experimental Conditions
The data for this study come from the 2011 JPSM Practicum survey. As part of
this study, a telephone survey was conducted in the summer of 2011 by Princeton Survey Research Associates International (PSRAI). The target population was noninstitutionalized persons age 18 and older living in the continental United States. Survey Sampling International (SSI) provided a sample of 9,500 individuals. SSI creates its directory-listed files by merging directory-listed residential telephone numbers with a variety of secondary sources, such as birth records, voter registrations, and motor vehicle registrations. SSI selected the sample for this study so that the number of records selected for each state and county was in line with Census population distributions. A listed sample was chosen because it included a name, address, and phone number for each case. An address and a phone number were needed to send each sample member an advance letter and interview him or her on the telephone. Having full name information for each sample member increased the power of the incentive treatment; the advance letter was addressed to the specifically-named individual listed in the sample file, and only this individual was eligible to complete the telephone interview. This increased the likelihood that the individual who completed the interview also was the household member who had read the advance letter and received the cash incentive. 7,200 sample members were randomly selected from the initial list of 9,500 cases received from SSI. Just over one percent of the 9,500 cases did not include a first name; because of the difficulty of requesting to speak with a person for whom we did not have a first name, these cases were dropped from the file prior to selecting the sample. SSI
indicated that an additional one percent of the 9,500 cases were ported numbers; to avoid inadvertently autodialing wireless numbers, these cases also were removed from the file before the sample was selected.
All 7,200 sample members were sent an advance letter. The letters were released in two replicates. The first batch was sent to 3,400 sample members on July 14-15, 2011, and the second was sent to 3,800 sample members on July 28-29, 2011. As part of this experiment, 40% of the sample members were randomly assigned to receive a $5 prepaid incentive with the advance letter. The other 60% of the sample received an advance letter without an incentive. Both replicates were included in the experiment. The exact wording of the advance letter is included in Appendix A. Interviews were conducted between July 18 and August 17, 2011. PSRAI made up to six attempts to reach sample members. Nine hundred interviews were completed. The median interview length was 19.5 minutes. The survey covered several topics, including health, employment, and current social issues.
2.2.2 Indicators of Respondent Effort
The survey questionnaire included measures of several indicators of respondent effort. First, it included measures of the two indicators most commonly studied in prior incentive experiments: (1) item nonresponse and (2) responses to open-ended items. The survey also included measures of other traditional satisficing indicators originally proposed by Krosnick and colleagues (Krosnick, 1991; Krosnick, 1999; Krosnick et al., 1996): (3) straight-lining and non-differentiation, (4) acquiescence, and (5) response order effects. Finally, the survey included indicators that survey researchers have used to determine respondents' level of effort in other contexts: (6) lack of attention to important
exclusions, (7) use of round or prototypical values for numerical responses, (8) use of estimation strategies to answer questions requiring numerical responses, (9) underreporting to filter items, (10) interview length, and (11) accuracy of survey reports as compared to frame data. After the call was completed, (12) the interviewers provided observations about each respondent’s level of effort during the interview. With the exception of accuracy of survey reports as compared to frame data, these are indirect indicators of measurement error. Using round numbers for numerical responses and providing brief responses to open-ended items does not prove that an individual’s responses are prone to extensive measurement error, but it does imply that he or she may be making less effort; by extension such individuals may also provide less accurate responses. In the section below, I provide more information about each of these indicators, including how they were measured in the questionnaire and how I analyzed the data. Exact question wordings can be found in Appendix A, while information about which items were included in each indicator is located in Appendix B. Item nonresponse. When respondents feel that a survey item is too cognitively burdensome, they may decline to provide an answer instead of formulating a response. To determine the effect of the incentive on item nonresponse in the current study, I calculated the proportion of the items that each respondent declined to answer.5 If respondents who received an incentive made greater effort than those who did not receive one, they should have skipped a smaller proportion of the items.
5. The questionnaire included several experiments and skip patterns. As a result, most of the respondents were not asked all of the survey questions. Because of this, I often report outcomes as proportions – the number of times the respondent did a certain behavior divided by the number of times they had the opportunity to do so.
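As a concrete illustration of this calculation, the following Python sketch computes each respondent's item nonresponse proportion over only the items he or she was actually asked; the long-format data frame and the column names (resp_id, asked, answered) are hypothetical stand-ins rather than the Practicum survey's actual file layout.

    import pandas as pd

    # Hypothetical long-format file: one row per respondent-item pair.
    # 'asked' flags items the respondent was routed to; 'answered' flags
    # items with a substantive response.
    responses = pd.DataFrame({
        "resp_id":  [1, 1, 1, 2, 2, 2],
        "asked":    [1, 1, 0, 1, 1, 1],
        "answered": [1, 0, 0, 1, 1, 1],
    })

    asked = responses[responses["asked"] == 1]
    # Proportion of asked items that each respondent declined to answer.
    prop_skipped = 1 - asked.groupby("resp_id")["answered"].mean()
    print(prop_skipped)  # respondent 1: 0.5; respondent 2: 0.0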
Responses to open-ended items. Open-ended items, in which respondents are asked to come up with their own response instead of choosing one from a pre-selected list of options, can be especially burdensome. The current study included three such items. To determine the effect of the incentive on responses to open-ended items, I calculated the number of words that each respondent provided in response to each item. For one of the items I also was able to calculate and compare the number of ideas that respondents provided as part of their answer. Finally, I determined the proportion of the respondents who skipped each of these items. If respondents who received an incentive made greater effort than those who did not receive one, they should have provided more words and ideas on average and fewer of them should have skipped these items. Straight-lining or non-differentiation. Respondents may be presented with a series of items that have the same response options. Straight-lining occurs when respondents select the same response for all of the items in such series, and nondifferentiation occurs when respondents select very similar responses for all of the items (Krosnick, 1991). To measure the effect of the incentive on these behaviors, I included three multi-item batteries in the questionnaire. Respondents were considered to have straight-lined if they provided the same response to all of the items in a particular battery. To determine the degree of differentiation in responses, I calculated the standard deviation of each respondent’s answers to each battery. If respondents in the incentive group made greater effort, a significantly smaller proportion of them should have straight-lined, and there should be significantly more differentiation in their responses to batteries of items.
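Both battery-level indicators can be computed directly from a respondent-by-item matrix. The Python sketch below is a minimal illustration; the five-item battery, the 1-to-5 response scale, and the example values are invented for the example rather than taken from the questionnaire.

    import pandas as pd

    # Hypothetical responses to one five-item battery (1-5 scale); one row
    # per respondent.
    battery = pd.DataFrame(
        [[3, 3, 3, 3, 3],   # identical answers to every item
         [4, 2, 5, 1, 3],   # well-differentiated answers
         [4, 4, 3, 4, 4]],  # weakly differentiated answers
        columns=[f"q{i}" for i in range(1, 6)],
        index=["r1", "r2", "r3"],
    )

    # Straight-lining: the same response was given to every item.
    straight_lined = battery.nunique(axis=1) == 1

    # Non-differentiation: the standard deviation of each respondent's
    # answers; smaller values mean less differentiation.
    differentiation = battery.std(axis=1)

    print(straight_lined)   # r1 True, r2 False, r3 False
    print(differentiation)  # r1 0.00, r2 ~1.58, r3 ~0.45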
Acquiescence bias. Respondents are often asked questions that require either an affirmative or negative response. These statements typically ask about things that are reasonable for respondents to believe or do. Individuals who are not processing these questions deeply may find it easier to find reasons to provide an affirmative response than to provide a negative one (Krosnick, 1991). The current study included 29 such items. I calculated the number of affirmative responses each respondent gave to these items. If respondents who received an incentive made greater effort than those who did not, they should have provided a smaller number of affirmative responses to these items. A drawback to this approach is that differences between the incentive and control group in the average number of affirmative responses could be due to true differences in the prevalence of the attitudes or behaviors asked about in these items. As a result, I included wording experiments for four of the items in which one half of the respondents was asked the original version of the question and the other half was read a question asking about the opposite attitude. For example, one half of the respondents was asked whether they agree or disagree that, "Increasing government involvement in healthcare will improve the quality of care," while the other half was asked if this will hurt the quality of care. If respondents in the incentive group made greater effort, they should be less likely than control group respondents to agree with both versions of the item. Response order effects. Respondents who are satisficing may also demonstrate response order effects. In aural modes such as the telephone, both recency effects and primacy effects may occur (Krosnick & Alwin, 1987). To assess the effect of the incentive on response order effects, I examined the 27 items in the questionnaire that had at least four response options. I determined the proportion of these items for which
each respondent did each of the following: selected one of the first two options, selected the first option, selected one of the last two options, or selected the final option. If respondents who have received an incentive made greater effort than those who did not, then they should have selected the first or last options less frequently. Again, the drawback to this approach is that differences in the observed prevalence of primacy or recency effects between the incentive and control group could potentially be due to true differences in the prevalence of the attitudes or behaviors asked about in these items. Therefore, four experiments were included in the questionnaire that varied the order of the response options; one half of the respondents received the initial order and the other half received the reverse order. If respondents who received an incentive made greater effort than those who did not receive one, then they should have selected the first or last options less often, regardless of presentation order. Lack of attention to question wording. If respondents are taking cognitive shortcuts they may not listen very carefully to the reading of survey items and may miss important details such as instructions to include or exclude certain categories from consideration in their response. In the current study, four experiments were conducted to determine the effect of the incentive on attention to question wording. For each item, respondents were randomly assigned to either the standard wording or the exclusion wording. For example, respondents in the standard condition were asked, “In the past seven days, how many servings of vegetables did you eat?” The exclusion condition respondents also were told, “Please do not include carrots, beans, or lettuce”. For each item, I determined the mean response provided in the standard and exclusion conditions and calculated the difference between these two means. If respondents who received an
incentive made greater effort than those who did not receive one, the difference between the standard and exclusion groups should be significantly greater among respondents in the incentive group. Use of round or prototypical values for numerical responses. Respondents often are asked to provide numerical responses to open-ended items. Respondents who are taking cognitive shortcuts may report round or prototypical numbers (such as multiples of 5 or 10) instead of making the extra effort to report an exact numerical response. The current study included 15 items that requested numerical responses. For all of the items, respondents who provided a value that was a multiple of five were considered to have given a round response. In addition, some items asked how often respondents did something in a week or a year; for these items, multiples of seven or twelve also were considered round responses. I determined the proportion of items for which each respondent provided a round response. If respondents who received an incentive made greater effort than those who did not receive one, then they should have provided round responses for a smaller proportion of the items. Estimation strategy for numerical responses. When respondents are asked to provide a number that indicates how often they perform a certain behavior, there are several approaches they can use to come up with an answer that require varying levels of effort (Tourangeau et al., 2000). Respondents who are trying to conserve cognitive energy may estimate their response instead of making the effort to recall each particular episode (Conrad, Brown, & Cashman, 1998). To assess the effect of the incentive on recall strategy, I asked respondents to indicate how they came up with their answer to one such question, in which they were asked how many times they had seen a doctor or other
health professional in 2010. Respondents who said they had seen a doctor at least once were then asked, “Which of the following describes how you came up with your answer: did you estimate based on a general impression; did you think about types of visits; did you think about how often you usually go to the doctor; or did you think about each individual visit?” Respondents could select multiple responses. Any response other than thinking about each individual visit suggested that respondents had estimated their response. If respondents who received an incentive made greater effort than those who did not receive one, then fewer of them should have estimated their response. Underreporting to filter items. Surveys often include groups of questions that consist of an initial filter question and a set of follow-up questions for those whose answers to the filter question indicate that the follow-up items apply to them. Interleafed formats are a relatively popular way to organize such questions in survey questionnaires. In this format, the follow-up items come immediately after the relevant filter question. With this format, however, respondents may learn that negative answers to the filter questions help them end the interview more quickly. Respondents who are trying to reduce their cognitive burden may then provide false negative responses to these filter items (Jensen, Watanabe, & Richter, 1999; Kreuter, McCulloch, Presser, & Tourangeau, 2011; Lucas et al., 1999). To determine the effect of the incentive on reports to filter items, I included a section in the questionnaire that consisted of six filter questions (and their follow-ups) in an interleafed format. Respondents were asked whether they had ever been diagnosed with six medical conditions, such as hypertension or diabetes. The order of these questions was randomized. If respondents replied affirmatively they were asked at what
age they had been diagnosed. This question may be burdensome enough that some respondents may have wanted to avoid answering it. As a result, they may have started responding negatively to the diagnosis questions that were presented later in the list. If respondents who received an incentive made greater effort than those who did not receive one, then presentation order should have a significantly smaller impact on their likelihood of providing an affirmative response. Interview length. Respondents who are fatigued may try to rush through the survey interview. Such respondents may listen to the survey questions less carefully and make less effort when formulating a response. For example, Malhotra (2008) found that overall completion time is a significant negative predictor of primacy effects for some subgroups of respondents. To determine the effect of the incentive on interview length, the start and end time of each interview was recorded. If respondents who received an incentive made greater effort than those who did not receive one, then their mean response time should be longer than that of the control group. Accuracy of survey reports as compared to frame data. All of the other indicators of respondent effort are indirect indicators of measurement error. In comparing respondent reports to data from the frame, we have a more direct indicator of measurement error – whether the self-report matches the frame, and, if it does not, the size of the discrepancy between the two pieces of information. To determine whether the incentive affected the accuracy of self-reports, I included one question that asked respondents for information that was also available on the frame – how many years they have lived in their current residence. Although there is likely some error in the frame data, we can assume that these errors are equally present for the incentive and control
groups due to the random assignment of sample members to experimental conditions. I compared survey reports to the frame data to determine the proportion of respondents whose answers matched the frame exactly. I also calculated the mean discrepancy between self-reports and frame data in the two groups. If respondents who received an incentive made greater effort than those who did not receive one, there should be a smaller number of discrepancies in the reports of incentive respondents and the mean discrepancy in reports should be smaller. Interviewer reports of effort. Through their interaction with the respondent throughout the interview, interviewers can formulate an impression of the amount of effort made by respondents. These impressions can incorporate effort indicators that are not captured by survey responses – such as respondents' decision to consult records during the interview or indications that the respondent is distracted by other things happening in the room. To determine whether the incentive affected interviewers' impression of respondent effort, the interviewers rated each respondent's effort after the call was completed; they recorded the extent to which "the respondent answered the survey questions to the best of their ability," on a five-point scale from "not at all" to "very often". If respondents who received an incentive made greater effort than those who did not receive one, interviewers should have rated their effort as being greater than that of the control respondents.
2.2.3 Respondent Characteristics
Prior studies investigating the impact of incentives on data quality have found weak effects or no effects at all. This absence of findings may be due to the fact that existing studies have considered the effect of incentives on all respondents in the
aggregate. The effect of incentives on effort may be limited to certain subgroups of respondents. Analyzing all respondents at once may mask the effect the incentive has on these subgroups. The current study's questionnaire included measures of several respondent characteristics hypothesized to impact whether an incentive would affect the level of effort put forth by respondents. Cognitive ability. I hypothesized that receiving an incentive would have little effect on the prevalence of satisficing among respondents with lower cognitive ability – because of their lower cognitive ability, these respondents cannot help satisficing; in contrast, the incentive should have a greater impact on the behavior of respondents who have fewer cognitive constraints. This study included three indicators of cognitive ability. The first was age; respondents over the age of 65 were considered to have lower cognitive ability. The second was education; respondents who reported having a high school diploma or less were considered to have lower cognitive ability. The final indicator was interviewer reports of respondent difficulty in answering the survey questions. After the interview was completed, the interviewers were asked to report how often the respondent had trouble understanding the survey questions on a five-point scale from "not at all" to "very often"; respondents who had trouble understanding the questions somewhat often, pretty often, or very often were considered to have lower cognitive ability. Conscientiousness. I hypothesized that the more internally motivated respondents were, the smaller the effect of the incentive would be on their effort during the interview. In the current study, I used self-reported conscientiousness as a proxy for intrinsic motivation. Conscientiousness is one of the five personality traits on which individuals
are said to vary on the “Big 5” personality inventory; people who rate highly on conscientiousness tend to be thorough, self-disciplined, and pay attention to details (Costa & McCrae, 1992). Respondents who are conscientious should be motivated to complete the questionnaire at a high level of quality regardless of whether they have received an incentive; conversely, the motivation and dedication of respondents who are not very conscientious may be affected by whether or not they have received an incentive. The questionnaire included 10 items intended to measure conscientiousness; these items were adapted from a measure developed by Buchanan, Johnson, and Goldberg (2005). Respondents were asked to what extent they agreed or disagreed that each item described them on a five-point scale from “strongly disagree” to “strongly agree”. Example items include, “I am always prepared,” and, “I do just enough work to get by”. I calculated the mean score that respondents gave across the items, with higher means indicating greater levels of conscientiousness. Incentive recall. Finally, I hypothesized that the incentive would have little impact on the behavior of respondents who did not recall receiving it. Although the advance letter was addressed to a specific individual and this same person was asked to complete the telephone interview, it remained possible that some of the respondents in the incentive group were not aware of the incentive. To measure incentive recall, respondents first were asked, “A letter describing this study may have been sent to your home recently. Do you remember seeing the letter?” Respondents who recalled the letter were asked an open-ended follow-up, “Do you happen to remember if there was anything else in the envelope with the letter?” If respondents said “yes”, interviewers were instructed to probe to determine what the respondent remembered being in the envelope.
PSRAI also kept track of advance letters that were returned because they were undeliverable; the respondents to whom these letters were sent were considered to be unaware of the advance letter and incentive.
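The subgroup indicators described in this section reduce to a few derived variables. The Python sketch below shows one way they could be coded; the variable names, the 1-to-5 coding of the interviewer rating, and the example values are assumptions for illustration, and whether any conscientiousness items were reverse-keyed before averaging is not specified here.

    import pandas as pd

    # Hypothetical respondent-level data; names and coding are illustrative.
    df = pd.DataFrame({
        "age":                [72, 45],
        "educ_hs_or_less":    [1, 0],
        "iwer_difficulty":    [4, 1],   # 1 = "not at all" ... 5 = "very often"
        "recalled_incentive": [1, 0],
    })
    consc_items = pd.DataFrame(
        [[5, 4, 5, 4, 5, 4, 5, 5, 4, 5],
         [3, 2, 4, 3, 3, 2, 3, 4, 3, 3]],
        columns=[f"consc{i}" for i in range(1, 11)],
    )

    # Lower-cognitive-ability flags as defined in the text: over age 65,
    # high school diploma or less, or interviewer-reported trouble
    # understanding the questions at least "somewhat often" (coded >= 3).
    df["low_ability_age"] = (df["age"] > 65).astype(int)
    df["low_ability_educ"] = df["educ_hs_or_less"]
    df["low_ability_iwer"] = (df["iwer_difficulty"] >= 3).astype(int)

    # Conscientiousness: mean of the ten items (any reverse-keyed items
    # would be recoded before this step).
    df["conscientiousness"] = consc_items.mean(axis=1)

    print(df)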
2.3 RESULTS
This section presents the results of the incentive experiment described above.
First, I discuss the effect of the incentive on the response rate, cost per complete, and sample composition. Then, I review the effect of the incentive on several indicators of respondent effort. Finally, I discuss whether the effect differed depending on respondent characteristics.
2.3.1 Outcome Rates
Table 2.1 provides an overview of the outcomes of the data collection effort. The overall response rate was 15.7% (AAPOR RR1). The response rate was significantly higher in the incentive condition (22.8%) than it was in the control condition (10.9%). The cooperation rate also was significantly higher in the incentive condition than in the control condition (39.2% and 20.0%, respectively), while the refusal rate was significantly lower (32.4% and 41.0%, respectively). Although not anticipated, the contact rate also was significantly higher in the incentive condition than it was in the control condition (58.1% and 54.7%, respectively). This may have been the result of significantly increased advance letter recall in the incentive group (86.5% and 66.0%, respectively); the advance letter mentioned the name of the survey research firm that would be calling, and improved recall of the letter may have led more of the incentive sample members to recognize the name when it appeared on caller ID.
Table 2.1. Outcome Rates, by Incentive Condition

                                  Total        Incentive    Control      χ²(1)
Response Rate (RR1)               15.7%        22.8%        10.9%        145.58***
Refusal Rate (REF1)               37.6%        32.4%        41.0%        44.43***
Contact Rate (CON1)               56.1%        58.1%        54.7%        6.42*
  Sample size                     (7,199)A     (2,880)      (4,319)
Cooperation Rate (COOP1)          28.0%        39.2%        20.0%        142.61***
  Sample size                     (3,216)      (1,337)      (1,879)
Remembered advance letter         77.9%        86.5%        66.0%        53.38***
  Sample size                     (900)        (524)        (376)

A Due to an error with the call scheduling software, one sample case was never called. *p
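The χ²(1) statistics in Table 2.1 come from comparing each outcome across the two conditions. As a rough sketch of one such comparison, the Python fragment below tests completed interviews against all other outcomes by condition, using the case counts shown in the table (524 completes among 2,880 incentive cases and 376 among 4,319 control cases); because the AAPOR rates incorporate eligibility adjustments, this simple 2×2 test is illustrative and will not reproduce the table's figures exactly.

    from scipy.stats import chi2_contingency

    # Rows: incentive, control; columns: completed interview, not completed.
    table = [[524, 2880 - 524],
             [376, 4319 - 376]]

    chi2, p, dof, expected = chi2_contingency(table, correction=False)
    print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4g}")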
ii
TABLE OF CONTENTS
List of Tables ..................................................................................................................... iv List of Figures ................................................................................................................... vii Chapters 1. Incentives and Survey Research ..........................................................................1 2. Satisficing in Telephone Surveys: Do Prepaid Cash Incentives Make a Difference? .........................................................................................................29 3. The Effect of Prepaid Cash Incentives on Responses to Sensitive Items ..........90 4. Testing For Measurement Invariance in the Responses of Incentive and Control Group Respondents .............................................................................124 5. Summary and Conclusions ..............................................................................159 Appendices A. Survey Materials, JPSM Practicum Survey ...................................................168 B. Survey Items Used for Each Effort Indicator (Chapter 2) .............................204 C. Maryland Mail Survey Materials ...................................................................205 D. Item Sensitivity Ratings (Chapter 3)..............................................................218 E. Consumer Sentiment Items ............................................................................222 F. Additional Analyses for Chapter 4 ................................................................224 References ........................................................................................................................239
LIST OF TABLES
1.1. Effect of Incentive on Accuracy of Self-Reports .... 21
2.1. Outcome Rates, by Incentive Condition .... 52
2.2. Costs (Dollars), by Incentive Condition .... 53
2.3. Sample Composition, by Incentive Condition .... 55
2.4. Item Nonresponse, by Incentive Condition .... 57
2.5. Mean Number of Words Provided, by Incentive Condition .... 59
2.6. Number of Unique Ideas Provided for Consent Explanation, by Incentive Condition .... 59
2.7. Straight-lining and Non-differentiation, by Incentive Condition .... 60
2.8. Mean Number of Affirmative Responses, by Incentive Condition .... 62
2.9. Odds Ratios from Logistic Regressions Predicting Acquiescence .... 63
2.10. Proportion of Items for Which Respondents Displayed Response Order Effects, by Incentive Condition .... 64
2.11. Proportion of Respondents Selecting First Two Options in Original Order, by Incentive Condition .... 65
2.12. Logistic Regressions Predicting Selection of First Two Options in Original Order .... 66
2.13. Mean Proportion of Items for Which Respondents Provided Round or Prototypical Responses, by Incentive Condition .... 67
2.14. Proportion of Respondents Reporting Each Strategy, by Incentive Condition .... 69
2.15. Logistic Regressions Predicting Affirmative Response to Filter Items .... 71
2.16. Mean Seconds per Question, by Incentive Condition .... 73
2.17. Accuracy of Length of Residence Reports, by Incentive Condition .... 74
2.18. Interviewer Ratings of How Often Respondent Answered Questions to Best of Ability, by Incentive Condition .... 74
2.19. Mean Proportion of Items for Which Respondent Satisficed, by Incentive Condition .... 75
2.20. Prevalence of Cognitive Ability Indicators, by Incentive Condition .... 77
2.21. Advance Letter and Incentive Awareness, by Incentive Condition .... 78
2.22. Linear Models Predicting Respondent Effort .... 80
2.23. Linear Model Predicting Satisficing .... 83
3.1. Sample Design .... 97
3.2. Mailing Schedule .... 97
3.3. Disposition Codes and Response Rate, by Incentive Condition .... 101
3.4. Costs (Dollars), by Incentive Condition .... 102
3.5. Demographics, by Incentive Condition .... 103
3.6. Means from Linear Models Predicting Impression Management Scores .... 104
3.7. Means from Linear Models Predicting Proportion of Undesirable Responses .... 105
3.8. Means from Linear Models Predicting Proportion of Items Skipped .... 107
3.9. Estimated Percentage of Voters and Bias Estimates, by Incentive Condition .... 109
3.10. Estimates from Logistic Regressions Predicting Survey Participation .... 113
3.11. Estimates from Logistic Regressions Predicting Inaccurate Voting Reports .... 115
3.12. Estimates from Logistic Regressions Predicting Survey Participation .... 117
3.13. Estimates from Logistic Regressions Predicting Inaccurate Voting Reports .... 119
4.1. Proportion of Cases with Complete Data, by Incentive Condition .... 140
4.2. Mean Index Score, by Incentive Condition .... 141
4.3. Cronbach's Alpha, by Incentive Condition .... 142
4.4. Differential Item Functioning: Patriotism .... 144
4.5. Differential Item Functioning: Conscientiousness .... 145
4.6. Differential Item Functioning: Impression Management .... 146
4.7. Differential Item Functioning: Consumer Sentiment: $10 vs. 0 .... 147
4.8. Differential Item Functioning: Consumer Sentiment: $5 vs. 0 .... 147
4.9. Differential Item Functioning: Consumer Sentiment: $10 vs. $5 .... 148
4.10. Fit Indices for Initial Measurement Models, by Incentive Condition .... 149
4.11. Fit Indices for Configural Invariance Models .... 150
4.12. Fit Indices for Metric Invariance Models .... 151
4.13. Modification Indices Produced by Metric Invariance Models .... 153
4.14. Fit Indices for Metric Invariance Models with One Equality Constraint Removed .... 153
4.15. Modification Indices After Releasing One Equality Constraint .... 154
4.16. Unstandardized Factor Loadings: Consumer Sentiment .... 155
4.17. Standardized Factor Loadings: Consumer Sentiment .... 155
5.1. Key Findings .... 161
D.1. Mean Respondent Sensitivity Ratings .... 219
D.2. Grouping Items by Sensitivity .... 221
F.1A. Correlation Matrix for Patriotism Items .... 225
F.1B. Correlation Matrix for Conscientiousness Items .... 225
F.1C. Correlation Matrix for Impression Management Items .... 225
F.1D. Correlation Matrix for Consumer Sentiment Items: $10 vs. Control .... 226
F.1E. Correlation Matrix for Consumer Sentiment Items: $5 vs. Control .... 226
F.1F. Correlation Matrix for Consumer Sentiment Items: $10 vs. $5 .... 226
F.2A. Covariance Matrix for Patriotism Items: Incentive Condition .... 227
F.2B. Covariance Matrix for Patriotism Items: Control Condition .... 227
F.2C. Covariance Matrix for Conscientiousness Items: Incentive Condition .... 227
F.2D. Covariance Matrix for Conscientiousness Items: Control Condition .... 228
F.2E. Covariance Matrix for Impression Management Items: Incentive Condition .... 228
F.2F. Covariance Matrix for Impression Management Items: Control Condition .... 228
F.2G. Covariance Matrix for Consumer Sentiment Items: $10 Condition .... 229
F.2H. Covariance Matrix for Consumer Sentiment Items: $5 Condition .... 229
F.2I. Covariance Matrix for Consumer Sentiment Items: Control Condition .... 229
F.3A. Cronbach's Alpha Coefficient, by Incentive Condition .... 230
F.3B. Differential Item Functioning: Patriotism .... 231
F.3C. Differential Item Functioning: Conscientiousness .... 232
F.3D. Differential Item Functioning: Impression Management (n=1010) .... 233
F.3E. Differential Item Functioning: Consumer Sentiment: $10 vs. 0 (n=529) .... 234
F.3F. Differential Item Functioning: Consumer Sentiment: $5 vs. 0 (n=514) .... 234
F.3G. Differential Item Functioning: Consumer Sentiment: $10 vs. $5 (n=579) .... 235
F.3H. Fit Indices for CFA Models, by Incentive Condition: Patriotism .... 235
F.3I. Fit Indices for CFA Models, by Incentive Condition: Conscientiousness .... 236
F.3J. Fit Indices for CFA Models, by Incentive Condition: Impression Management .... 236
F.3K. Fit Indices for CFA Models, by Incentive Condition: Consumer Sentiment .... 237
F.3L. Modification Indices Produced by Metric Invariance Models: Consumer Sentiment .... 237
F.3M. Fit Indices for Metric Invariance Models with One Equality Constraint Removed: Consumer Sentiment .... 238
F.3N. Modification Indices After Releasing One Equality Constraint: Consumer Sentiment .... 238
LIST OF FIGURES
2.1. Demographic Characteristics, by Survey .... 54
2.2. Item Nonresponse for Open-Ended Items, by Incentive Condition .... 58
2.3. Difference between No-Exclusion Mean Response and Exclusion Mean Response, by Incentive Condition .... 70
3.1. Mean Proportion Undesirable Behaviors and Attitudes, by Incentive Condition and Item Sensitivity .... 106
3.2. Mean Item Nonresponse, by Incentive Condition and Item Sensitivity .... 107
3.3. Proportion of Respondents that Provided Inaccurate Voting Reports, by Incentive Condition .... 108
3.4. Proportion of Nonvoters Participating in Survey, by Incentive Condition .... 112
3.5. Proportion of Voters Participating in Survey, by Incentive Condition .... 112
3.6. Proportion of Nonvoters Providing Inaccurate Voting Report, by Incentive Condition .... 114
3.7. Proportion of Voters Providing Inaccurate Voting Report, by Incentive Condition .... 114
3.8. Proportion of Sample Members that Participated, by Voting History and Incentive Condition .... 116
3.9. Proportion of Sample Members that Provided Inaccurate Voting Reports for 2010 Election, by Voting History and Incentive Condition .... 118
3.10. Proportion of Sample Members that Provided Inaccurate Voting Reports for 2008 Election, by Voting History and Incentive Condition .... 118
3.11. Proportion of Sample Members that Provided Inaccurate Voting Reports for 2004 Election, by Voting History and Incentive Condition .... 118
CHAPTER 1
INCENTIVES AND SURVEY RESEARCH
1.1 INTRODUCTION
As survey response rates continue to decline, incentives are increasingly used as a way to motivate sample members to participate (Cantor, O'Hare, & O'Connor, 2008; Kulka, Eyerman, & McNeeley, 2005; Singer, Van Hoewyk, & Maher, 2000). Extensive research shows that incentives can increase response rates (e.g., Church, 1993; Hopkins & Gullickson, 1992; Singer, Van Hoewyk, Gebler, Raghunathan, & McGonagle, 1999; Trussell & Lavrakas, 2004); they clearly convince some sample members to participate who otherwise would not have done so. If they influence behavior at the stage of deciding whether or not to participate, it is reasonable to believe that incentives also may alter the way that respondents act during the survey interview. Thus, it is important to determine whether the use of incentives influences the magnitude of measurement error in survey estimates. Nevertheless, as pointed out by Singer and Ye (forthcoming), the majority of incentives research has focused narrowly on the effect that incentives have on response rates. Groves (2008) voices similar concerns and urges researchers to "re-conceptualize the focus away from response rates." Likewise, Cantor et al. (2008) speak to the need to improve our understanding of the impact that incentives have on data quality. Incentives conceivably could lead to an increase or a decrease in measurement error. On one hand, they could reduce measurement error if they create a sense of obligation to the researcher that causes respondents to make greater effort and provide more thorough, thoughtful responses to questions. Such a result would be reassuring to
survey practitioners who are enticed by the promise of higher response rates but lack sufficient empirical evidence of other benefits to justify the costs that can be associated with incentives. On the other hand, incentives could increase measurement error if they convince otherwise uninterested sample members, who lack intrinsic motivation, to participate. As Brehm (1994) argues, "If we happen to get 100 percent of our respondents, but they all told us lies, we get 100 percent garbage" (p. 59). Survey practitioners should have a better empirical basis for assessing the potential tradeoffs associated with the higher response rates yielded by prepaid incentives. This dissertation aims to expand our understanding of the impact of prepaid incentives on measurement error. In this chapter, I begin by reviewing the existing literature assessing the effect of incentives on both nonresponse and measurement error. In the three analytical chapters that follow, I address the following questions in turn: Do incentives affect the level of effort that respondents put into completing surveys? Do they influence self-presentation concerns, thereby altering responses to sensitive questions? Finally, does measurement invariance exist between responses received with an incentive and those received without one?
1.2 INCENTIVES AND NONRESPONSE
1.2.1 Declining Survey Response Rates
Survey response rates have declined considerably over the past several decades
(Brick & Williams, forthcoming; Steeh, Kirgis, Cannon, & DeWitt, 2001). For example, the response rate for the National Immunization Survey decreased by fourteen percentage points between 1995 and 2004 (Battaglia et al., 2008), while the response rates for the Consumer Expenditure Diary Survey and the National Health Interview Survey declined by twelve and eight percentage points, respectively, in the 1990s (Atrostic, Bates, Burt & Silberstein, 2001). De Leeuw and de Heer (2002) demonstrate that this is an international phenomenon; reviewing a multi-national sample of household surveys, they report that response rates have decreased by an average of half a percentage point per year over the past twenty years. Furthermore, the speed of this decline may be increasing; the response rate for the Survey of Consumer Attitudes decreased by one and a half percentage points per year from 1996 to 2003 – double the average annual decline observed from 1979 to 1996 (Curtin, Presser, & Singer, 2005). Low response rates can be problematic for several reasons. First, although the response rate is not always a good predictor of nonresponse bias (Groves, 2006), lower response rates may increase the potential for nonresponse bias. Nonresponse bias is a function of both the response rate and the difference between respondents and nonrespondents on survey estimates; if those individuals who respond are not representative of the larger sample on the variables of interest, the estimates will be biased (Groves & Couper, 1998). Furthermore, low response rates may increase survey costs, as they mean that larger initial samples are required to attain the number of
respondents necessary to achieve desired levels of precision in survey estimates (Groves, Dillman, Eltinge, & Little, 2002). Survey nonresponse generally can be broken into two major components: inability to reach sample members (“noncontacts”) and failure to persuade them to complete the survey once they have been contacted (“refusals”). Research on surveys in modes where we can more easily disentangle these two components, such as face-to-face and telephone, repeatedly demonstrates that refusals account for a considerably larger proportion of nonresponse than do noncontacts (Brick & Williams, forthcoming; Curtin et al., 2005; Smith, 1995). Typical reasons provided for refusing include being too busy, not being interested in the survey topic, privacy concerns (such as not wanting to share personal information with a stranger), or negative reactions to aspects of the survey (such as its length) (Brehm, 1993; Bates, Dalhamer & Singer, 2008; Couper, Singer, Conrad, & Groves, 2008). Incentives may be an effective tool for reducing some of these refusals – either as an expression of gratitude for respondents’ time or as a way of overcoming a lack of interest in the survey topic. In fact, Couper et al. (2008) found that, following altruistic desires to be helpful or to influence policy, receiving money was one of the most common reasons provided for agreeing to respond to a (hypothetical) survey request. The results of several studies suggest that incentives’ effect on the response rate is largely a function of reduced refusal rates (Holbrook, Krosnick, & Pfent, 2008; Shettle & Mooney, 1999; Tourangeau, Groves, & Redline, 2010; Willimack, Schuman, Pennell, & Lepkowski, 1995). However, many other studies have not disentangled the effect of incentives on noncontact from their effect on refusals (Singer & Ye, forthcoming) –
possibly because such a large proportion of the existing experiments are part of mail surveys, where it can be difficult to determine whether nonresponse is caused by lack of contact or lack of willingness to participate.
1.2.2 Effect of Incentives on Response Rates
Survey practitioners are searching continually for ways to combat declining response rates. Several tools, such as pre-notification, multiple follow-up contacts, and incentives, have proven effective and have become part of common survey practice. For example, in a meta-analysis of 251 mail surveys, Edwards and colleagues found that offering cash incentives doubled the odds of response, while pre-notification and follow-up contacts each multiplied the odds of response by about 1.5 (Edwards et al., 2002). Numerous experimental studies have demonstrated that incentives are an effective tool for increasing survey response rates (e.g., James & Bolstein, 1990; Shettle & Mooney, 1999; Petrolia & Bhattacharjee, 2009; Yammarino, Skinner, & Childers, 1991; Yu & Cooper, 1983). Several meta-analyses have shown that the effectiveness of incentives spans all survey modes. For example, Church (1993) found that offering an incentive in mail surveys increases the response rate by an average of 13 percentage points. Similarly, incentives multiply the odds of response to Internet surveys by 1.3 on average (Göritz, 2006). Finally, a meta-analysis of both prepaid and promised incentive experiments in interviewer-administered surveys confirmed that incentives have a positive, but smaller, impact on response rates in these types of surveys as well; in these experiments, each dollar that was given to respondents increased the response rate by about one-third of a percentage point on average (Singer et al., 1999).
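Because several of these meta-analytic effects are reported as odds ratios, it is worth noting that doubling the odds of response does not double the response rate itself. The short Python sketch below converts an odds-ratio effect into the implied response rate; the 40% baseline rate is a hypothetical value chosen only for illustration, not a figure from the studies cited above.

    # Illustrative conversion of an odds-ratio effect into response-rate terms.
    # The 40% baseline is a hypothetical value, not taken from any cited study.
    def apply_odds_ratio(baseline_rate: float, odds_ratio: float) -> float:
        """Return the response rate implied by multiplying the odds of response by odds_ratio."""
        odds = baseline_rate / (1.0 - baseline_rate)
        new_odds = odds * odds_ratio
        return new_odds / (1.0 + new_odds)

    baseline = 0.40                            # hypothetical control-group response rate
    print(apply_odds_ratio(baseline, 2.0))     # odds doubled (cash incentive): about 0.57
    print(apply_odds_ratio(baseline, 1.5))     # odds multiplied by 1.5: exactly 0.50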
Certain types of incentives have proven to be more effective than others. Prepaid incentives tend to be more successful than promised ones contingent on completion of the survey (Armstrong, 1975; Berk, Mathiowetz, Ward, & White, 1987; Church, 1993; James & Bolstein, 1992; Petrolia & Bhattacharjee, 2009), and monetary incentives tend to be more effective than non-monetary ones (Hansen, 1980; Petrolia & Bhattacharjee, 2009; Warriner, Goyder, Gjertsen, Hohner, & McSpurren, 1996). As these findings imply, prepaid monetary incentives generally have the greatest impact on response rates. Two separate meta-analyses of incentive experiments in mail surveys both concluded that prepaid cash incentives increase mail survey response rates by 19 percentage points on average (Church, 1993; Hopkins & Gullickson, 1992). Replicating Singer et al.'s (1999) finding that incentives have a smaller impact in interviewer-administered surveys, Cantor et al. (2008) found that prepaid cash incentives of up to $10 led to a median increase of six percentage points in RDD surveys. While some studies observe a linear relationship between the value of the incentive and the increase in the response rate (e.g., Church, 1993; Trussell & Lavrakas, 2004; Yu & Cooper, 1983), others conclude that increases in the incentive value may have diminishing influence on the response rate (Cantor et al., 2008; Fox, Crask, & Kim, 1988; James & Bolstein, 1992). Finally, although this dissertation generally focuses on the use of incentives at initial contact in cross-sectional surveys, incentives also have proven effective in other contexts. For example, incentives may reduce attrition in longitudinal studies (Creighton, King, & Martin, 2007; Goetz, Tyler, & Cook, 1984), and they may be an effective tool for refusal conversion (Brick, Montaquila, Hagedorn, Roth, & Chapman, 2005).
1.2.3 Theoretical Explanations for the Effectiveness of Incentives
Multiple theories of survey response provide potential explanations for
incentives' success at increasing response rates. For example, utility theory suggests that individuals weigh the costs and benefits of completing a task and will take action when the benefits of doing so exceed the costs (Groves & Couper, 1998). Offering an incentive is one way that researchers can make the perceived benefits of taking part in survey research greater than the perceived costs. Under such a framework, respondents may see the incentive as payment or reimbursement for their time and effort (Biner & Kidd, 1994). Conceptualizing the incentive as an economic exchange helps to explain why larger incentives have at times been found to be more effective than smaller ones (e.g., Trussell & Lavrakas, 2004). Other researchers have suggested that the effectiveness of incentives is not due to an economic exchange but a social one. Under social exchange theory (Blau, 1964; Homans, 1961), rewards and costs remain important decision-making factors, and individuals still choose to take action only when they feel it is in their self-interest to do so. However, social exchange is different from economic exchange in two main ways (Dillman, Smyth, & Christian, 2009). First, in social exchange, the definitions of rewards and costs are more flexible; namely, the rewards do not have to be monetary. Second, the importance of trust is much greater in social exchanges. Social exchanges typically are not bound by contracts, and so individuals have to trust that the other party will provide a reward in the future that will be worth whatever cost they must bear. Actors in such exchanges are able to trust one another due to several rules and norms of exchange by which they can assume the other party will abide. One of the
central rules of social exchange is the norm of reciprocity; this rule suggests that when an individual takes an action that benefits you, you are expected to respond in kind (Gouldner, 1960). Incentives can be seen as a benefit that the survey sponsor provides to the sample member; when sample members receive an incentive they may feel obligated to return the kindness by responding to the survey. This may explain the effectiveness of prepaid incentives (Dillman et al., 2009). However, the mixed success of promised incentives suggests that sample members do not trust survey researchers enough to incur the costs of participation without having received their reward in advance. Other researchers have suggested a related explanation for the effectiveness of prepaid cash incentives, based on cognitive dissonance theory (Festinger, 1957). According to this theory, once respondents have received a prepaid incentive, the idea of keeping it without completing the survey creates a feeling of dissonance (Furse & Stewart, 1982). Sample members have two options for resolving this unpleasant feeling. The first is to dispose of the incentive; however, Furse and Stewart (1982) argue that most people will not choose this option because throwing money away also makes them feel uncomfortable, and because sending the money back to the researcher may involve almost as much effort as simply completing the survey. Therefore, most people will choose the second option – participating in the survey. Finally, leverage-saliency theory suggests that the impact of various design features on the participation decision differs across sample members (Groves, Singer, & Corning, 2000). According to this theory, the influence of each feature on an individual’s decision to respond depends on three factors: (1) how important the feature is to the sample member (leverage), (2) whether the sample member sees this as a positive or
negative feature (valence), and (3) the degree to which the feature is highlighted in the survey request (salience). For example, some sample members may choose to respond because they are interested in the survey topic described in the survey cover letter. Other sample members may lack such an interest but may be convinced to participate by a cash incentive included in the envelope. Thus, incentives may convince certain sample members to respond who are not drawn to other survey features such as the topic, and they may have little or no effect on other sample members' willingness to participate (e.g., Baumgartner & Rathbun, 1996).
1.2.4 Effect of Incentives on Sample Composition and Nonresponse Error
Several experimental studies in both mail and interviewer-administered modes have found that incentives, whether prepaid or promised, do not have much of an effect on sample composition (e.g., Brick et al., 2005; Cantor et al., 2008; Furse & Stewart, 1982; Goetz et al., 1984; James & Bolstein, 1990; Shettle & Mooney, 1999; Warriner et al., 1996; Willimack et al., 1995). However, the results of other experiments suggest that incentives can have two types of effects on sample composition. First, incentives may improve representation of traditionally underrepresented groups, such as young people (Dillman, 1996; Miller, 1996; Storms & Loosveldt, 2004), minorities (Berlin et al., 1992; Mack, Huggins, Keathley, & Sundukchi, 1998), and those with lower incomes (Mack et al., 1998) or less education (Berlin et al., 1992; Nederhof, 1983; Petrolia & Bhattacharjee, 2009). Second, the use of incentives may alter the characteristics of the respondent pool along dimensions other than the typical demographic variables measured in surveys. For example, as leverage-saliency theory might predict, incentives may help attract
respondents who are less interested in the survey topic (Baumgartner & Rathbun, 1996; Coogan & Rosenberg, 2004; Petrolia & Bhattacharjee, 2009). However, in a series of prepaid cash incentive experiments embedded in mail and telephone surveys, Groves and colleagues found only mixed support for this hypothesis (Groves et al., 2006; Groves, Presser, & Dipko, 2004). Additionally, Moyer and Brown (2008) actually found the reverse effect: promising a cash incentive for completing the National Cancer Institute’s telephone-administered Health Information National Trends Survey (HINTS)
significantly increased the proportion of respondents who had had cancer. The use of incentives also may reduce the proportion of respondents with certain personality traits or values, such as altruism or selflessness, due to an influx of more selfish respondents. Altruistic or selfless sample members are likely to respond to surveys even without an incentive, while incentives may serve as a motivating factor for sample members low in these traits (Storms & Loosveldt, 2004). For example, in a mail followup to the Detroit Area Study (DAS), Groves et al. (2000) found that a $5 prepaid cash incentive had a significantly greater impact on the response rate among DAS respondents who had reported low levels of community involvement than it did among those who had reported being more involved. Medway and colleagues found that offering a $5 prepaid incentive increased the proportion of respondents to a mail survey who had not volunteered in the past year – although this same effect was not found in an equivalent experiment conducted as part of a telephone survey (Medway, Tourangeau, Viera, Turner, & Marsh, 2011). To the extent that incentives improve representation of groups that are underrepresented when incentives are not used, they may lead to a reduction in
nonresponse error. This seems particularly likely in cases where incentives improve representation of individuals who lack interest in the survey topic. For example, Tourangeau et al. (2010) found that offering a prepaid cash incentive of $5 improved representation of nonvoters and reduced the nonresponse bias in reports of voting behavior in two recent elections by about six percentage points – although these differences did not reach statistical significance. However, improved representation of demographic groups that traditionally are underrepresented in surveys will reduce nonresponse bias only if these groups also differ from better-represented groups on key survey variables. For example, in an experiment that assigned sample members to receive either $5 cash or a pass to a local park, Ryu, Couper, and Marans (2005) found that the two types of incentives resulted in differences in respondents’ education level, marital status, and work status; however, they did not find differences in the other response distributions for the two groups. Finally, in their meta-analysis of nonresponse bias analysis studies, Groves and Peytcheva (2008) reported that, overall, the use of an incentive did not have a significant impact on the magnitude of nonresponse bias – though very few of the studies included in their analysis made use of incentives.
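The nonresponse bias that these studies try to measure has a simple deterministic form: the bias in a respondent mean equals the nonresponse rate multiplied by the difference between respondents and nonrespondents on the variable of interest. The Python sketch below is a minimal illustration of that identity for a voting estimate of the kind described above; all of the input values are hypothetical and are not figures from Tourangeau et al. (2010) or any other study cited here.

    # Minimal sketch of the nonresponse bias identity:
    #   bias(ybar_r) = (1 - response_rate) * (ybar_respondents - ybar_nonrespondents)
    # All numbers below are hypothetical and serve only to illustrate the calculation.
    def nonresponse_bias(response_rate: float,
                         mean_respondents: float,
                         mean_nonrespondents: float) -> float:
        return (1.0 - response_rate) * (mean_respondents - mean_nonrespondents)

    # Example: proportion of registered voters who actually voted (from records),
    # computed separately for survey respondents and nonrespondents.
    rr = 0.35                # hypothetical response rate
    voted_resp = 0.78        # hypothetical voting rate among respondents
    voted_nonresp = 0.64     # hypothetical voting rate among nonrespondents
    print(nonresponse_bias(rr, voted_resp, voted_nonresp))   # 0.091: a 9-point overstatement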
1.3 INCENTIVES AND MEASUREMENT ERROR
Measurement error is any inaccuracy in survey responses that is due to the
process of measurement; this type of error can be differentiated from nonresponse error, discussed earlier, which arises from the failure to get some sample members to respond in the first place. Measurement error exists when the measured value in a survey differs from the corresponding unobserved "true" value (Bohrnstedt, 2010), although it may be difficult, or even impossible, for the researcher to know this true value. Several potential sources of measurement error have been identified in the literature, including the interviewer, the respondent, and features of the survey design, such as mode of administration or question wording (Groves, 1989). Offering an incentive is an additional design decision that could have repercussions for the magnitude of measurement error in the resulting estimates. However, this possibility has received relatively little attention in the literature as compared to the effect of incentives on nonresponse.
1.3.1 Theory-Based Expectations for Effect of Incentives on Measurement Error
Incentives conceivably have the potential to either increase or decrease measurement error through their influence on respondent behavior. The theories used to explain why incentives convince sample members to respond have conflicting implications for the effect of incentives on the quality of the answers provided during the interview. For example, according to social exchange theory, offering prepaid incentives is potentially the first step toward building a positive relationship between the researcher and the respondent; giving sample members a reward before receiving their responses implies that the researcher trusts and respects them. If respondents are motivated by a sense of community with the researchers, they may feel more comfortable while completing the survey, and, as a result, they may put forth more effort than they would
have otherwise. They also may be more willing to respond honestly to questions that are typically subject to social desirability biases. For example, a review of 74 incentive experiments in laboratory studies suggested that self-presentation concerns were reduced among the participants who had received incentives (Camerer & Hogarth, 1999). However, this feeling of having a positive relationship with the researcher could also lead respondents to focus too heavily on pleasing the researcher; as a result, respondents may provide more positive ratings, either generally across all items or specifically for items referring to the survey sponsor, than they would have otherwise. Offering respondents incentives also could affect their motivations for completing the survey; in particular, it may lead them to focus on extrinsic motivations instead of intrinsic ones. For example, according to social exchange theory, incentives may create a sense of obligation toward the researcher, and this feeling may be what motivates sample members to respond. Another possibility, as suggested by leverage-saliency theory, is that incentives may convince otherwise uninterested sample members to respond. In both cases, respondents are focused on an extrinsic motivation, as opposed to an intrinsic one. It seems reasonable that people who are motivated by extrinsic factors such as monetary rewards may put forth less effort than those who are focused on intrinsic ones, such as interest in the survey topic or enjoyment of sharing one's opinions. Research on the importance of intrinsic motivation to academic success has supported this assumption (e.g., Bolkan, Goodboy, & Griffin, 2011; Fransson, 1977). Research on the quality of responses provided by reluctant respondents, who can be assumed to have low levels of motivation to participate, suggests that such respondents sometimes provide lower quality data than do more eager respondents
(Cannell & Fowler, 1963; Triplett, Blair, Hamilton, & Kang, 1996; Fricker & Tourangeau, 2010); however, other studies have failed to find a clear relationship between reluctance and data quality (Kaminska, McCutcheon, & Billiet, 2010; Yan, Tourangeau, & Arens, 2004). A final possibility is that, once sample members have committed to taking part in the survey, the incentive has little to no further impact on their behavior. Social exchange theory suggests that sample members are driven to respond by a sense of obligation to the researcher, while cognitive dissonance theory suggests they are driven by the desire to avoid the dissonance associated with refusal once they have accepted the incentive. If agreeing to participate in the survey satisfies these needs, then any further behaviors taken during data collection may not be influenced by the fact that the respondent has received an incentive. Consistent with the idea that incentives matter little for respondent behavior once the survey is underway, Camerer and Hogarth's (1999) review of incentive experiments in laboratory studies found that incentives typically do not affect performance in such studies.
1.3.2 Comparison of Incentive and Control Group Response Distributions
In the incentives literature, the presence of measurement error is typically assessed in one of three ways. The first is to compare the response distributions of two or more groups of respondents who have been randomly assigned to different experimental conditions. Differences between the groups' responses suggest that there may be a greater amount of error in one of the groups. However, it can be difficult to know whether these differences are caused by a change in who responds (nonresponse error) or by a change in how they respond (measurement error). Furthermore, in the absence of some gold
standard to which we can compare the survey responses, we cannot easily tell which of the groups exhibits more error. There is some evidence that offering incentives can affect survey response distributions. Generally, these differences have been observed for attitudinal items in studies that have offered prepaid cash incentives - suggesting that incentives can lead to more positive survey responses. For example, respondents who received a prepaid cash incentive in a mail survey offered more positive comments about the sponsor in openended items than did those who did not receive an incentive; the researchers argue this occurred because receiving the incentive led to increased favorability toward the sponsor (James & Bolstein, 1990). In the telephone-administered Survey of Consumers, offering a $5 prepaid cash incentive had a significant effect on responses to four of seventeen key attitudinal questions; the authors suggest this happened because receiving the incentive put the respondents in a good mood (Singer et al., 2000). Brehm (1994) also found that offering a prepaid cash incentive led to more positive responses to several political attitude questions in a telephone survey. The use of prepaid cash incentives led to greater reported levels of concern about social issues for six of ten items in a mail survey, though the incentives did not increase respondents’ willingness to pay to improve the condition of these social issues (Wheeler, Lazo, Heberling, Fisher, & Epp, 1997). Finally, respondents who had been promised a $1 reduction in their hotel rate in exchange for completing a questionnaire were less likely to provide negative comments about their stay, as compared to a control group (Trice, 1984). I am aware of only two studies where incentives were found to affect response distributions to non-demographic factual items. In these cases, providing prepaid cash
incentives resulted in lower estimates of community involvement (Groves et al., 2000) and volunteering (Medway et al., 2011). However, it is impossible to know whether these differences were caused by changes in sample composition (those who volunteer their time for community activities also may be the type of people who are willing to do surveys without incentives, while those who do not do so may require an incentive to motivate them to take part in survey research) or by an increased obligation to be honest about not taking part in these socially desirable behaviors. Several other studies have concluded that incentives do not affect response distributions. Offering a prepaid cash incentive led to significantly different responses for only five percent of the questions in a mail study (Shettle & Mooney, 1999). Similarly, overall, James and Bolstein (1990) did not find significant differences in the response distributions of 28 closed questions in a mail survey when prepaid cash incentives were offered. Offering a prepaid non-monetary incentive did not have a significant effect on responses to ten items in the face-to-face DAS (Willimack et al., 1995). Finally, offering a contingent cash incentive between $10 and $40 did not affect response distributions in two government-sponsored face-to-face studies on substance abuse (the National Survey on Drug Use & Health (NSDUH) and the Alcohol and Drug Services Study (ADSS)) (Eyerman, Bowman, Butler, & Wright, 2005; Krenzke, Mohadjer, Ritter, & Gadzuk, 2005). It is not clear why incentives affect response distributions in some surveys and not in others. One reason may be that researchers have been inconsistent across studies in their selection of items to analyze. For example, some studies focus only on responses to the key survey questions (e.g., Curtin, Singer, & Presser, 2007; Singer et al., 2000), while
others consider all of the survey items as one large group (e.g., Shettle & Mooney, 1999). It is difficult to know whether restricting the analysis (or not doing so) is what led to these divergent results. Moving forward, the literature would benefit from a more systematic examination of the importance of various item characteristics. For example, does it matter if the item is a "key" measure that is directly related to the stated survey topic? Are attitude questions more likely to be affected by incentives than factual ones? Does the sensitivity of the item matter? How about placement in the questionnaire? Many other recent incentive experiments fail to discuss the potential effect of incentives on response distributions. In others, the possibility of an effect is mentioned but quickly dismissed without analyzing the data; this decision is, at times, based on the results of a handful of older studies that found offering incentives did not affect survey responses. However, these older studies exhibit features that prevent their results from generalizing to all surveys offering incentives. For example, several of them used very specialized, highly educated populations and surveyed them about topics that were of specific interest to them (Goodstadt, Chung, Kronitz, & Cook, 1977; Hansen, 1980; Mizes, Fleece, & Roos, 1984). Furthermore, several of these studies differed from more recent studies in that they were able to achieve response rates of over 60%, even for the groups that did not receive an incentive (Goodstadt et al., 1977; Mizes et al., 1984; Nederhof, 1983).
1.3.3 Comparison of Survey Responses to Validation Data
A weakness of comparing response distributions is that, even when differences are observed between incentive and control group responses, it often is impossible to tell which group's responses exhibit less error. A second approach, which overcomes this
limitation, is to compare survey responses to validation data – often administrative records. In incentives research, this means that the relative accuracy of responses provided by those who have received an incentive can be compared to that of respondents who have not received one. This method can be challenging to implement because of the difficulty of obtaining access to administrative records; however, a fair number of studies have successfully used validation data to demonstrate that respondents often provide inaccurate answers. For example, this method has been used to demonstrate underreporting of socially desirable behaviors, such as voting (Traugott & Katosh, 1979), or respondents' difficulty recalling certain types of events, such as their children's vaccination history (Luman, Ryman, & Sablan, 2009). This method also has been used to demonstrate the impact of design features, such as survey mode, on the accuracy of survey responses to sensitive questions (Kreuter, Presser, & Tourangeau, 2008). However, using a record check to determine the accuracy of survey responses has rarely been done in conjunction with incentive experiments; in fact, in their review of the incentives literature, Singer and Ye (forthcoming) specifically point to the lack of research investigating the impact of incentives on the validity of survey responses. I am aware of only four incentive studies that have compared the relative accuracy of reports given by respondents who have received an incentive and those who have not received one. Two of these studies offered prepaid cash incentives. The first was a mail survey of people who had bought a major appliance at one of five stores in the Midwest; sample members were randomly assigned to receive 25 cents prepaid or no incentive. McDaniel and Rao (1980) asked respondents factual questions about their
purchase (such as model name, price paid, and date of purchase) and compared respondents’ reports with store records. They found that respondents who had received the prepaid incentive committed significantly fewer errors on average than members of the control group.1 The second study that included a prepaid cash incentive experiment was a survey of registered voters in which one half of the sample members received $5 and the other half did not receive an incentive; respondents also were randomly assigned to either mail or telephone administration. Tourangeau et al. (2010) compared respondents’ self-reports of voting status in two elections to records from the Aristotle database of registered voters. They found that, for both elections, the incentive did not significantly affect the proportion of respondents that misreported. However, the direction of the effect was the same for both items – in the 2004 election the incentive led to a ten percentage point increase in the prevalence of misreporting among those who had received an incentive, and in the 2006 election it led to a five percentage point increase in the prevalence of misreporting. The other two studies offered incentives contingent on completion of the survey. The first of these was a mail survey of elites, such as university professors and cabinet ministers, in 60 countries; the topic of the survey was family planning and population growth (Godwin, 1979). One third of the sample was offered a promised incentive of $25, one third was offered a promised incentive of $50, and the final third was not offered an incentive. For 28 factual questions such as, “Are contraceptives available in clinics in [country]?” Godwin compared the survey responses to published records and
[Footnote 1: The authors do not mention whether there were any significant differences in the sample composition of the incentive and control groups on variables such as length of time since purchase, so we cannot be certain whether the difference in response quality was driven by changes of this nature.]
the responses of any other respondents from the same country. He grouped respondents into “low”, “medium”, and “high” accuracy groups and found that being offered an incentive significantly increased the proportion of correct responses. This was particularly true for the group that was offered $50; 50% of these respondents fell into the “high” accuracy category, as compared to 26% of those offered $25 and only 20% of those in the control group.2 The final instance was an incentive experiment in the Alcohol and Drug Services Study (ADSS); this study interviewed individuals who were recently discharged from substance abuse treatment facilities. In this experiment, there were three incentive conditions and one control group. Two of these incentive groups were offered either $15 or $25 contingent on completion of a face-to-face interview, and all three groups were offered $10 in return for submitting to a urine drug test. In their analysis of this experiment, Krenzke et al. (2005) utilized the $15/$10 group as the comparison group. The researchers reported two efforts to compare respondents’ self-reports with validation data. First, 20 survey responses, mostly asking about drug use, were compared to records from the treatment facility (Table 1.1). Next, respondents’ self-reports of drug use in the past seven days and past 24 hours were compared to the results of urine tests (Table 1.1).3 Overall, these results suggest that the $15 incentive led to limited improvements in accuracy; responses to four of twenty survey items were significantly more likely to be accurate as compared to facility records, and self-reports of drug use as compared to urine tests were significantly more likely to be accurate for one of six items.
[Footnote 2: The author does not discuss whether these differences may have been driven by differences in sample composition between the incentive and control groups.]
[Footnote 3: Respondents were told that they would be subject to a drug test before they provided the self-report; therefore, respondents who may have otherwise lied about their drug use may have decided to be honest in this particular case.]
Furthermore, offering a $25 contingent incentive led to significant reductions in accuracy for two of the four survey items where we saw improvements with a $15 incentive.
Table 1.1. Effect of Incentive on Accuracy of Self-Reports (Krenzke et al., 2005)
$15 contingent incentive vs. no contingent incentive:
  Compared to treatment facility records: significant improvement for 4 of 20 items; significant reduction for 1 item
  Compared to urine test: significant improvement for 1 of 6 items
$25 contingent incentive vs. $15 contingent incentive:
  Compared to treatment facility records: significant reduction for 2 of 20 items
  Compared to urine test: no effect
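As an illustration of the record-check logic used in these four studies, the Python sketch below compares the proportion of respondents whose survey report disagrees with the administrative record ("misreporters") across an incentive and a control group and tests the difference with a two-proportion z-test. The counts are placeholders for illustration, not data from any of the studies discussed here.

    # Hedged sketch of a record-check comparison: does the misreporting rate
    # (survey report disagrees with the administrative record) differ between
    # the incentive and control groups? Counts below are placeholders only.
    from math import sqrt
    from scipy.stats import norm

    def two_proportion_ztest(misreports_a, n_a, misreports_b, n_b):
        p_a, p_b = misreports_a / n_a, misreports_b / n_b
        pooled = (misreports_a + misreports_b) / (n_a + n_b)
        se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (p_a - p_b) / se
        p_value = 2 * norm.sf(abs(z))   # two-sided test
        return p_a - p_b, z, p_value

    # Hypothetical counts: 42 of 300 incentive respondents and 30 of 290
    # control respondents misreport relative to the records.
    diff, z, p = two_proportion_ztest(42, 300, 30, 290)
    print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p:.3f}")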
Thus, these four studies report conflicting results. The two studies finding an effect differ from those that did not on several dimensions. First, the two studies finding an increase in accuracy were published quite a while ago (1979, 1980), while those finding no effect were published more recently (2005, 2010). Second, the two studies that found an incentive effect were both mail studies, whereas at least some of the respondents in the two studies that did not find an effect utilized an interviewer-administered mode. Finally, the studies finding an improvement in quality looked at the accuracy of non-sensitive factual questions, while the two that found no effect looked at the accuracy of somewhat sensitive topics. Because only four studies have been conducted, it is difficult to know which of these dimensions is the most important.
1.3.4 Comparison of Effort Indicators
The prevalence of measurement error also may be assessed in a third way; in this method, respondents who have received an incentive again are compared with those who have not received one. However, this method examines indirect indicators of data quality, such as missing data rates, thought to reflect respondents' level of effort. Although effort indicators are only indirect measures of data quality, respondents who put forth greater
effort also may provide responses that have less measurement error. This method has frequently been employed in mode comparisons; for example, researchers have found that telephone respondents are more likely to satisfice than are face-to-face respondents (Holbrook, Green, & Krosnick, 2003; but see Roberts, Jäckle, & Lynn, 2007) or Internet respondents (Chang & Krosnick, 2009) and that cell phone respondents are no more likely than landline ones to take cognitive shortcuts (Kennedy & Everett, 2011). Much of the existing literature investigating the impact of incentives on respondent effort focuses on item nonresponse or on the length of responses to open-ended questions. Many of these studies have concluded that incentives do not have a significant impact on the prevalence of item nonresponse (e.g., Berk et al., 1987; Berlin et al., 1992; Cantor et al., 2008; Curtin et al., 2007; Furse & Stewart, 1982; Peck & Dresch, 1981; Shettle & Mooney, 1999). This conclusion has been reached across a variety of incentive types and a multitude of survey characteristics. For example, sending prepaid cash incentives in a mail survey of cable subscribers did not significantly affect the proportion of items that respondents skipped (James & Bolstein, 1990). Dirmaier and colleagues came to the same conclusion in a mail survey of psychotherapy patients (Dirmaier, Harfst, Koch & Schulz, 2007). Similarly, offering a non-contingent voucher in the in-person Survey of Income and Program Participation (SIPP) did not have a significant impact on the proportion of cases that were considered "mostly complete" (Davern, Rockwell, Sherrod, & Campbell, 2003). Finally, offering a contingent incentive in the National Adult Literacy Survey did not have a significant effect on the proportion of items that respondents attempted (Berlin et al., 1992).
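A common way to operationalize these item nonresponse comparisons is to compute, for each respondent, the share of eligible items left unanswered and then compare the group means. The sketch below simulates placeholder data purely to show the mechanics; the column names, the coding of skipped items as missing values, and the sample sizes are all assumptions for illustration.

    # Hedged sketch: per-respondent item nonresponse rate compared across
    # incentive conditions with a two-sample t-test. The data are simulated
    # placeholders; column names and coding choices are illustrative assumptions.
    import numpy as np
    import pandas as pd
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    items = [f"q{i}" for i in range(1, 21)]
    df = pd.DataFrame(rng.choice([1.0, 2.0, 3.0, np.nan], size=(600, 20),
                                 p=[0.3, 0.3, 0.3, 0.1]), columns=items)
    df["condition"] = np.repeat(["incentive", "control"], 300)

    # Proportion of the 20 items each respondent skipped (skips coded as NaN).
    df["item_nr_rate"] = df[items].isna().mean(axis=1)

    incentive = df.loc[df["condition"] == "incentive", "item_nr_rate"]
    control = df.loc[df["condition"] == "control", "item_nr_rate"]
    t_stat, p_value = ttest_ind(incentive, control)
    print(incentive.mean(), control.mean(), p_value)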
However, several other studies have observed a reduction in item nonresponse when incentives are utilized; again, these studies have used both prepaid and promised incentives, have been conducted in a variety of modes, and have asked respondents about a wide range of topics. For example, in a mail survey of people who had bought a major appliance at one of five stores in the Midwest, sending a prepaid cash incentive of 25 cents significantly reduced the mean number of items that respondents skipped (McDaniel & Rao, 1980). Similarly, sending a prepaid debit card worth $40 in the first wave of the face-to-face Consumer Expenditure Quarterly Interview Survey significantly reduced the number of items that respondents skipped in both the first wave and subsequent waves; the use of a $20 debit card also slightly reduced item nonresponse but not significantly so (Goldenberg, McGrath, & Tan, 2009). Offering a promised incentive of either $25 or $50 significantly reduced item nonresponse in an international mail survey of elites (Godwin, 1979). Finally, offering a promised incentive of $10 in a telephone survey of Chicago residents significantly reduced the number of items that respondents skipped; this decrease was driven by a reduction in the number of "don't know" responses (Goetz et al., 1984). None of the incentive experiments I found resulted in a significant overall increase in item nonresponse. The studies listed above provided information about how the incentive affected item nonresponse across all items for all respondents; however, it is possible that the effect of the incentive was restricted to certain subgroups of respondents or particular types of items. Only a few studies have considered either of these possibilities, and those that have done so have tended to find conflicting results. For example, Singer et al. (2000) found that receiving an incentive in the Survey of Consumers led to a significant
reduction in item nonresponse for two particular subgroups – older respondents and non-Whites; however, this result was not replicated in a similar experiment conducted in a subsequent administration (Curtin et al., 2007). The possibility that incentives affect open and closed items differently has been considered in two studies, with conflicting results. Hansen (1980) found that providing a prepaid incentive of either 25 cents or a ballpoint pen led to a significant increase in the proportion of open-ended items that were skipped but had no effect on closed items. However, McDaniel and Rao (1980) found that offering a prepaid incentive of 25 cents significantly reduced item nonresponse for both open-ended and closed items. Two face-to-face studies considered the possibility that the effect of the incentive on item nonresponse might differ by item sensitivity, again with conflicting results. Providing a prepaid monetary incentive of three to five pounds in the British Social Attitudes Survey reduced item nonresponse for non-sensitive questions but increased it for sensitive ones (Tzamourani & Lynn, 1999). However, in a study of recently-released clients of drug and alcohol treatment facilities, offering a promised incentive of $15 to $25 did not have a significant impact on the proportion of items respondents skipped, regardless of item sensitivity (Krenzke et al., 2005). In this same study, the researchers hypothesized that the effect of the incentive would be greater in the final section of the interview, when respondents were tired, but this prediction was not supported by the data. The other effort-related outcome that frequently has been analyzed in incentive studies is the quality of open-ended responses, generally operationalized as the number of words or number of ideas included in the response. As with those that have looked at item nonresponse, these studies have tended to find either an improvement in quality with
an incentive or no effect. For example, in a telephone follow-up to the National Election Studies, respondents who had received a prepaid incentive of either $1 or a pen provided significantly more ideas on average in response to two open-ended questions (Brehm, 1994). Similarly, respondents to a telephone survey who had been promised $10 provided significantly more words on average in response to open items (Goetz et al., 1984). In a contradictory finding, Hansen (1980) found that mail respondents who were given a prepaid incentive of either 25 cents or a pen provided significantly fewer words on average; coders also rated the incentive groups’ responses to be of lower quality on average. Interestingly, several studies that have offered more than one value of monetary incentive have found that only the larger amount has resulted in improved response quality. For example, in the British Social Attitudes Survey, respondents who were given five pounds provided significantly longer responses to open-ended items as compared to a control group – but receiving three pounds did not have a significant effect on response length (Tzamourani & Lynn, 1999). James and Bolstein (1990) conducted an incentive experiment as part of a mail survey in which respondents were given prepaid cash incentives of either 25 cents, 50 cents, $1, or $2. They found that only those respondents who had received at least 50 cents wrote significantly more words than the control group for an open-ended question. For a short-answer question where respondents were given space to write up to four comments, they also found that only those respondents who had received at least $1 provided a significantly greater number of comments. In a mail survey of elites, respondents who were promised $50 provided significantly more
detailed responses to an open item, but there was not a significant improvement in quality among respondents who were promised $25 (Godwin, 1979). It is rare for incentive studies to have considered data quality indicators beyond item nonresponse and responses to open-ended items; again, the studies that have done so have generally found that incentives either improve effort or have no effect. For example, the number of events recorded in a diary increased when an incentive was provided (Lynn & Sturgis, 1997). Receiving a prepaid voucher worth either $10 or $20 did not have a significant effect on the number of imputations or edits required for 40 items in the SIPP (Davern et al., 2003). In two other recent surveys, prepaid cash incentives did not affect the number of responses selected for a check-all-that-apply item, the proportion of respondents who provided at least one pair of inconsistent responses, or the proportion of respondents who provided round numerical responses (Medway et al., 2011). Finally, a few studies have found that respondents who have received incentives have been more willing to submit to requests that imply potential additional burden or may raise privacy concerns; for example, respondents who received prepaid incentives were more likely to provide additional contact information (Shettle & Mooney, 1999; Medway et al., 2011), and respondents who were offered promised incentives were more likely to agree to a urine drug test (Krenzke et al., 2005).
1.4 SUMMARY AND CONCLUSIONS
Survey response rates have been declining in recent years. Because incentives
repeatedly have been found to increase survey response rates, they are utilized increasingly in surveys. In particular, prepaid incentives are more effective than promised ones, and monetary incentives are more effective than non-monetary ones. There is some evidence that larger incentives yield greater increases in the response rate; however, there may be diminishing returns from each additional dollar spent. Incentives may be effective at improving the representation of groups that are traditionally hard to reach, such as youth or minorities, as well as people who lack interest in the survey topic or a general interest in participating in research; however, this effect has not been observed across the board. Furthermore, there is mixed evidence as to the utility of incentives for reducing nonresponse bias. Given the widespread use of incentives, it is important to determine whether incentives affect the level of measurement error in surveys. Fewer studies have looked at measurement effects than at effects on nonresponse, but those that have looked at this issue have generally taken one of three approaches: (1) comparing response distributions, (2) comparing responses to validation data, or (3) comparing respondent effort indicators. These studies typically have concluded that incentives improve the quality of survey data or have no effect on it. To improve our understanding of incentives’ effect on measurement error, we need to move beyond the types of analyses that traditionally have been conducted. For example, comparisons of effort indicators typically have only considered the effect on item nonresponse and responses to open-ended questions; in Chapter 2, I report on the
impact that prepaid cash incentives had on the prevalence of a wider array of satisficing behaviors in a telephone survey. Furthermore, comparisons of response distributions usually have considered all of the survey items as one large group, without any differentiation between types of items; in Chapter 3, I hypothesize that the effect of incentives on responses may vary by item sensitivity and discuss the results of a mail-survey prepaid cash incentive experiment designed to test this possibility. Additionally, few studies have compared survey responses to validation data; in Chapter 3, I also report on the accuracy of responses to three survey items as compared to administrative records. Furthermore, the existing literature rarely examines whether the incentive had a differential impact on measurement error across subgroups of the sample; in these two analytical chapters, I discuss whether the effect of the incentive was restricted to individuals with particular characteristics, such as younger respondents or those with more education. Finally, existing studies typically report on the incentive’s effect on each item in isolation; in Chapter 4, I discuss whether prepaid cash incentives affected the relationships between survey responses intended to measure latent characteristics in several recent surveys by testing for measurement invariance between incentive and control group responses.
CHAPTER 2
SATISFICING IN TELEPHONE SURVEYS: DO PREPAID CASH INCENTIVES MAKE A DIFFERENCE?
2.1 INTRODUCTION
Telephone survey response rates have declined considerably in recent years (Brick & Williams, forthcoming; Curtin et al., 2005; Steeh et al., 2001). Incentives are one tool for stemming this decline (Cantor et al., 2008; Curtin et al., 2007; Goetz et al., 1984; Moyer & Brown, 2008; Singer et al., 2000), and, as a result, survey practitioners are often eager to utilize them. However, several studies have found that offering prepaid incentives in telephone surveys can increase the cost per completed interview (Brick et al., 2005; Curtin et al., 2007; Gelman, Stevens, & Chan, 2003; but see Singer et al., 2000). Additional positive outcomes beyond increased response rates may be needed to justify these costs. If incentives motivate some respondents to participate who otherwise would have declined, they also may influence respondents’ behavior during the survey interview. Respondents seem more prone to satisfice in telephone surveys than in other modes (Chang & Krosnick, 2009; Hall, 1995; Holbrook et al., 2003), so the potential effect of incentives on respondent effort in telephone surveys is of particular interest. Existing research investigating the effect of incentives on respondent effort in telephone surveys suggests that incentives either result in increased effort or have no effect (e.g., Brehm, 1994; Goetz et al., 1984; Singer et al., 2000). However, these studies are limited in number and examine relatively few indicators of respondent effort. It would be useful to have more evidence about the effect of incentives on respondent effort in telephone surveys.
This chapter describes the methods and results of an experiment using a prepaid cash incentive in a telephone survey. It aims to overcome two limitations of prior research. First, the current study examines the impact of incentives on a wider array of effort indicators than has been considered in the earlier studies. Second, it assesses whether incentives’ effect on effort varies according to respondent or item characteristics, whereas most prior research has only discussed their effect on all respondents or items in the aggregate.
2.1.1 Satisficing Theory: Respondent Motivation as a Predictor of Effort
Completing survey interviews can be cognitively taxing for respondents. Though researchers may hope that respondents carefully proceed through all four components of the response process (Tourangeau, Rips, & Rasinski, 2000), cognitive fatigue or lack of interest may lead them to take shortcuts when responding. Instead of responding carefully, respondents may not process survey questions thoroughly and may provide acceptable responses instead of optimal ones (Krosnick, 1991). Satisficing theory proposes a framework for understanding the conditions under which respondents take these cognitive shortcuts (Simon, 1956; Krosnick & Alwin, 1987). According to this theory, the probability that a respondent will satisfice for any given task is a function of three factors – task difficulty, respondent ability to complete the task, and respondent motivation to do so:

P(satisficing) ∝ Task difficulty / (Ability × Motivation)
Respondents are more likely to satisfice when the task at hand is difficult; however, the greater their ability and motivation, the less likely they are to do so (Krosnick, 1991). Several indicators of satisficing have been proposed, including response order effects,
straight-lining, item nonresponse, and acquiescence (Krosnick, 1991; Krosnick, 1999; Krosnick, Narayan, & Smith, 1996). Research comparing the quality of responses provided by reluctant respondents with that of the responses provided by those who participate more readily supports the hypothesis that respondents with more motivation may provide higher quality responses than those with less motivation (Cannell & Fowler, 1963; Triplett et al., 1996; Fricker & Tourangeau, 2010; Friedman, Clusen, & Hartzell, 2003; but see Kaminska et al., 2010; Yan et al., 2004). Prepaid cash incentives clearly increase respondents’ motivation to take part in surveys; if they also affect respondents’ motivation during the interview, they may alter the prevalence of satisficing behaviors.4
4. Incentives also may affect the likelihood of satisficing by altering the average cognitive ability of the respondent pool. For example, incentives may reduce the proportion of respondents who are older (Dillman, 1996; Miller, 1996; Storms & Loosveldt, 2004); satisficing has been found to be more common among older respondents than among younger ones (Krosnick & Alwin, 1987). Thus, changes in the distribution of cognitive ability need to be taken into consideration when comparing the prevalence of satisficing behaviors among those who have received an incentive and those who have not received one.
2.1.2 Theoretical Expectations for Effect of Incentives on Respondent Motivation
Several theories have been offered to explain why prepaid incentives increase response rates. Typically, when researchers have proposed these theories, they have not discussed their implications for the effect of incentives on respondent behavior during the survey interview. Although these theories all are in agreement that incentives should increase response rates, extending them to predict the effect of incentives on respondent motivation and effort during the interview leads to inconsistent predictions about the effects of incentives. The theories suggest four possible effects of incentives on effort: (1) greater effort due to respondents’ sense of obligation to the survey researcher, (2) reduced effort due to the influx of uninterested, unmotivated respondents, (3) reduced effort due
to a focus on extrinsic motivations instead of intrinsic ones, and (4) no effect beyond the point of agreeing to participate. Consider the first of these possibilities – that incentives lead to greater effort due to respondents’ sense of obligation to the survey researcher. According to social exchange theory, prepaid incentives may invoke the norm of reciprocity (Gouldner, 1960). When sample members receive an incentive they may feel obligated to return the kindness by responding to the survey (Dillman et al., 2009). A related notion is that incentives create a positive attitude toward the sponsor; some studies find more favorable ratings of the survey sponsor, supporting the hypothesis that prepaid incentives help foster a positive relationship between the sample member and the researcher (e.g., James & Bolstein, 1990). These positive feelings toward the sponsor also could lead respondents to make greater effort than they would have otherwise. Alternatively, incentives could result in reduced effort due to an influx of uninterested, unmotivated respondents. Incentives may convince certain sample members to respond who are not drawn to other survey features, such as the topic (e.g., Groves et al., 2004). Such respondents may lack sufficient interest and motivation to provide high quality responses. Additionally, offering incentives could result in reduced effort due to a focus on extrinsic motivations instead of intrinsic ones. The sense of obligation posited by social exchange theory may actually reduce motivation if it causes respondents to focus too heavily on extrinsic reasons for completing the survey. People who are motivated by intrinsic factors, such as interest in the survey topic or enjoyment of sharing one’s opinions, may put forth more effort than those who are focused on extrinsic ones, such as
monetary rewards or a sense of obligation. Research on the importance of intrinsic motivation to academic success has supported this assumption (e.g., Bolkan et al., 2011; Fransson, 1977). Similarly, Rush, Phillips, and Panek (1978) report that paid subjects were “striving more for task completion rather than for success” (p. 448). Thus, respondents who see the incentive as their main, or only, motivation for participating may put forth less effort than those who take part for other reasons. A final possibility is that once sample members have committed to taking part in the survey the incentive has little to no further impact on their behavior. Agreeing to participate in the survey may be sufficient for most sample members to feel they have resolved any cognitive dissonance or met the obligations of the norm of reciprocity. If this is the case, then any further behaviors taken during data collection may not be influenced by the fact that the respondent has received an incentive. Camerer and Hogarth’s (1999) review of incentive experiments in laboratory studies found that incentives typically do not affect performance, supporting the view that incentives may not affect behavior during survey interviews. Additional empirical evidence is needed to help determine which of these expectations is the most accurate, and whether our expectations should vary according to survey design features or respondent characteristics.
2.1.3 Existing Studies Investigating the Impact of Incentives on Respondent Effort
Most studies investigating the impact of incentives on respondent effort focus on item nonresponse. Several have observed a reduction in item nonresponse when incentives are utilized (Godwin, 1979; Goetz et al., 1984; Goldenberg et al., 2009; James & Bolstein, 1990; McDaniel & Rao, 1980). However, many others have concluded that
incentives do not have a significant impact on the rate of item nonresponse (e.g., Berk et al., 1987; Berlin et al., 1992; Cantor et al., 2008; Curtin et al., 2007; Davern et al., 2003; Furse & Stewart, 1982; Peck & Dresch, 1981; Shettle & Mooney, 1999). It is unclear what design characteristics lead to these divergent results; the studies in both groups have used prepaid and promised incentives, have been conducted in a variety of modes, and have asked respondents about a wide range of topics. None of the incentive experiments I found resulted in a significant overall increase in item nonresponse. The other effort-related outcome that frequently has been analyzed in incentive studies is the quality of open-ended responses, generally operationalized as the number of words or number of ideas included in the response. Several studies have concluded that incentives lead to an improvement in quality (Brehm, 1994; Goetz et al., 1984; Willimack et al., 1995), although multiple studies that have offered more than one value of monetary incentive have found that only the larger amount has resulted in improved response quality (Godwin, 1979; James & Bolstein, 1990; Tzamourani & Lynn, 1999). I only came across one study where the incentive led to a significant reduction in response quality (Hansen, 1980). Only a few incentive studies have considered effort indicators beyond item nonresponse and responses to open-ended items; again, the studies that have done so have generally found that incentives either improve effort or have no effect. For example, the number of events recorded in a diary was increased when an incentive was provided (Lynn & Sturgis, 1997). However, a prepaid voucher did not have a significant effect on the number of imputations or edits required for 40 items in the Survey of Income and Program Participation (Davern et al., 2003). In two other recent surveys, prepaid cash
incentives did not affect the number of responses selected for a check-all-that-apply item, the proportion of respondents who provided at least one pair of inconsistent responses, or the proportion of respondents who provided round numerical responses (Medway et al., 2011). Finally, a few studies have found that respondents who have received incentives have been more willing to submit to requests that imply potential additional burden; for example, respondents who received prepaid incentives were more likely to provide additional contact information (Shettle & Mooney, 1999; Medway et al., 2011), and respondents who were offered promised incentives were more likely to agree to a urine drug test (Krenzke et al., 2005). These studies provide information about how incentives affected effort across all items for all respondents; however, it is possible that the effect of the incentive is restricted to certain subgroups of respondents or particular types of items. Only a few studies have examined these possibilities. For example, Singer et al. (2000) found that incentives in the Survey of Consumers led to a significant reduction in item nonresponse within two particular subgroups – older respondents and non-Whites; however, this result was not replicated in a similar experiment in a subsequent administration (Curtin et al., 2007). Similarly, McDaniel and Rao (1980) found that a prepaid incentive significantly reduced item nonresponse for both open-ended and closed items, while Hansen (1980) found that the effect of incentives was limited to open-ended items. Finally, Tzamourani and Lynn (1999) concluded that providing a prepaid monetary incentive reduced item nonresponse for non-sensitive questions but increased it for sensitive ones, while Krenzke and colleagues (2005) found that offering a promised cash incentive did not have a significant impact on the proportion of items respondents skipped, regardless of item
sensitivity. In this same study, the researchers hypothesized that the effect of the incentive would be greater in the final section of the interview, when respondents were tired, but this prediction was not supported by the data.
2.1.4 Extending the Literature
As this review shows, most of the existing incentive experiments have focused on item nonresponse as the primary indicator of respondent effort. Item nonresponse rates are an attractive indicator of data quality in the sense that they are easily calculated and compared across studies. Furthermore, reducing item nonresponse is desirable because it decreases the amount of imputation that must be done. However, while the level of item nonresponse is a widely used indicator of respondent effort in survey research, it only captures the absolute minimum amount of information about respondent effort; it tells researchers that the respondent took the time to provide an answer, but it tells them nothing about the actual quality of that response. Several other indicators of effort that hold respondents to a higher standard, such as response order effects and non-differentiation, have been utilized in other areas of survey research but have not been applied to incentives research. Measuring the impact of incentives on such indicators would improve researchers’ knowledge of the degree to which incentives influence respondent effort. The current study includes measures of twelve indicators of respondent effort; the operationalization of each indicator is discussed at greater length in the Methods section, but most of them are derived from the literature on survey satisficing. Furthermore, the current study examines the possibility that the effect of incentives varies according to respondent or item characteristics. Significant effects at the subgroup level may be masked at the aggregate level. I hypothesize that an incentive will
increase effort among respondents who recall receiving it but not among other respondents. I also hypothesize that incentives will have a greater impact on respondents with higher levels of cognitive ability and lower levels of conscientiousness. Indicators of these characteristics are included in the current study and are discussed in further detail in the Methods section of this chapter. Finally, I hypothesize that the effect of the incentive will be greater for two types of items. First, due to cognitive fatigue, the incentive will have a greater impact on effort in the second half of the interview than it will in the first half. Second, because answers to attitude items are more affected by context than answers to factual or behavioral questions, the incentive will have a greater effect on responses to the former than on responses to the latter.
2.2 RESEARCH METHODS
2.2.1 Sampling Frame and Experimental Conditions
The data for this study come from the 2011 JPSM Practicum survey. As part of
this study, a telephone survey was conducted in the summer of 2011 by Princeton Survey Research Associates International (PSRAI). The target population was noninstitutionalized persons age 18 and older living in the continental United States. Survey Sampling International (SSI) provided a sample of 9,500 individuals. SSI creates its directory-listed files by merging directory-listed residential telephone numbers with a variety of secondary sources, such as birth records, voter registrations, and motor vehicle registrations. SSI selected the sample for this study so that the number of records selected for each state and county was in line with Census population distributions. A listed sample was chosen because it included a name, address, and phone number for each case. An address and a phone number were needed to send each sample member an advance letter and interview him or her on the telephone. Having full name information for each sample member increased the power of the incentive treatment; the advance letter was addressed to the specifically-named individual listed in the sample file, and only this individual was eligible to complete the telephone interview. This increased the likelihood that the individual who completed the interview also was the household member who had read the advance letter and received the cash incentive. 7,200 sample members were randomly selected from the initial list of 9,500 cases received from SSI. Just over one percent of the 9,500 cases did not include a first name; because of the difficulty of requesting to speak with a person for whom we did not have a first name, these cases were dropped from the file prior to selecting the sample. SSI
indicated that an additional one percent of the 9,500 cases were ported numbers; to avoid inadvertently autodialing wireless numbers, these cases also were removed from the file before the sample was selected. All 7,200 sample members were sent an advance letter. The letters were released in two replicates. The first batch was sent to 3,400 sample members on July 14-15, 2011, and the second was sent to 3,800 sample members on July 28-29, 2011. As part of this experiment, 40% of the sample members were randomly assigned to receive a $5 prepaid incentive with the advance letter. The other 60% of the sample received an advance letter without an incentive. Both replicates were included in the experiment. The exact wording of the advance letter is included in Appendix A. Interviews were conducted between July 18 and August 17, 2011. PSRAI made up to six attempts to reach sample members. Nine hundred interviews were completed. The median interview length was 19.5 minutes. The survey covered several topics, including health, employment, and current social issues.
2.2.2 Indicators of Respondent Effort
The survey questionnaire included measures of several indicators of respondent effort. First, it included measures of the two indicators most commonly studied in prior incentive experiments: (1) item nonresponse and (2) responses to open-ended items. The survey also included measures of other traditional satisficing indicators originally proposed by Krosnick and colleagues (Krosnick, 1991; Krosnick, 1999; Krosnick et al., 1996): (3) straight-lining and non-differentiation, (4) acquiescence, and (5) response order effects. Finally, the survey included indicators that survey researchers have used to determine respondents’ level of effort in other contexts: (6) lack of attention to important
exclusions, (7) use of round or prototypical values for numerical responses, (8) use of estimation strategies to answer questions requiring numerical responses, (9) underreporting to filter items, (10) interview length, and (11) accuracy of survey reports as compared to frame data. After the call was completed, (12) the interviewers provided observations about each respondent’s level of effort during the interview. With the exception of accuracy of survey reports as compared to frame data, these are indirect indicators of measurement error. Using round numbers for numerical responses and providing brief responses to open-ended items does not prove that an individual’s responses are prone to extensive measurement error, but it does imply that he or she may be making less effort; by extension such individuals may also provide less accurate responses. In the section below, I provide more information about each of these indicators, including how they were measured in the questionnaire and how I analyzed the data. Exact question wordings can be found in Appendix A, while information about which items were included in each indicator is located in Appendix B. Item nonresponse. When respondents feel that a survey item is too cognitively burdensome, they may decline to provide an answer instead of formulating a response. To determine the effect of the incentive on item nonresponse in the current study, I calculated the proportion of the items that each respondent declined to answer.5 If respondents who received an incentive made greater effort than those who did not receive one, they should have skipped a smaller proportion of the items.
5. The questionnaire included several experiments and skip patterns. As a result, most of the respondents were not asked all of the survey questions. Because of this, I often report outcomes as proportions – the number of times the respondent did a certain behavior divided by the number of times they had the opportunity to do so.
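To make the proportion-based scoring described above (and in the footnote) concrete, the sketch below shows one way such a per-respondent rate could be computed. The variable names and data layout are illustrative assumptions, not the study's actual data files.

# Minimal sketch of per-respondent item nonresponse scoring (illustrative data
# layout; a value of None stands in for a refusal or "don't know", and items a
# respondent was never asked due to skip patterns are simply absent).
def item_nonresponse_rate(responses: dict) -> float:
    """Proportion of administered items that the respondent declined to answer."""
    if not responses:
        return 0.0
    skipped = sum(1 for answer in responses.values() if answer is None)
    return skipped / len(responses)

# Example: four items administered, one declined -> 0.25
print(item_nonresponse_rate({"q1": 3, "q2": None, "q3": "some text", "q4": 1}))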
Responses to open-ended items. Open-ended items, in which respondents are asked to come up with their own response instead of choosing one from a pre-selected list of options, can be especially burdensome. The current study included three such items. To determine the effect of the incentive on responses to open-ended items, I calculated the number of words that each respondent provided in response to each item. For one of the items I also was able to calculate and compare the number of ideas that respondents provided as part of their answer. Finally, I determined the proportion of the respondents who skipped each of these items. If respondents who received an incentive made greater effort than those who did not receive one, they should have provided more words and ideas on average and fewer of them should have skipped these items. Straight-lining or non-differentiation. Respondents may be presented with a series of items that have the same response options. Straight-lining occurs when respondents select the same response for all of the items in such a series, and non-differentiation occurs when respondents select very similar responses for all of the items (Krosnick, 1991). To measure the effect of the incentive on these behaviors, I included three multi-item batteries in the questionnaire. Respondents were considered to have straight-lined if they provided the same response to all of the items in a particular battery. To determine the degree of differentiation in responses, I calculated the standard deviation of each respondent’s answers to each battery. If respondents in the incentive group made greater effort, a significantly smaller proportion of them should have straight-lined, and there should be significantly more differentiation in their responses to batteries of items.
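A sketch of how these two battery-level indicators might be scored follows. The battery values are hypothetical, and whether the population or sample form of the standard deviation was used is not stated in the text, so the population form here is an assumption.

# Illustrative scoring of straight-lining and response differentiation for one
# battery of items that share a response scale (hypothetical ratings on 1-5).
from statistics import pstdev

def straight_lined(battery):
    """True if the respondent gave the identical response to every item."""
    return len(set(battery)) == 1

def differentiation(battery):
    """Standard deviation of the battery responses; lower values indicate
    less differentiation across items."""
    return pstdev(battery)

print(straight_lined([4, 4, 4, 4, 4]))   # True: same response to all items
print(differentiation([4, 2, 5, 3, 4]))  # > 0: responses are differentiated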
Acquiescence bias. Respondents are often asked questions that require either an affirmative or negative response. These statements typically ask about things that are reasonable for respondents to believe or do. Individuals who are not processing these questions deeply may find it easier to find reasons to provide an affirmative response than to provide a negative one (Krosnick, 1991). The current study included 29 such items. I calculated the number of affirmative responses each respondent gave to these items. If respondents who received an incentive made greater effort than those who did not, they should have provided a smaller number of affirmative responses to these items. A drawback to this approach is that differences between the incentive and control group in the average number of affirmative responses could be due to true differences in the prevalence of the attitudes or behaviors asked about in these items. As a result, I included wording experiments for four of the items in which one half of the respondents was asked the original version of the question and the other half was read a question asking about the opposite attitude. For example, one half of the respondents was asked whether they agree or disagree that, “Increasing government involvement in healthcare will improve the quality of care,” while the other half was asked if this will hurt the quality of care. If respondents in the incentive group made greater effort, they should be less likely than control group respondents to agree with both versions of the item. Response order effects. Respondents who are satisficing may also demonstrate response order effects. In aural modes such as the telephone, both recency effects and primacy effects may occur (Krosnick & Alwin, 1987). To assess the effect of the incentive on response order effects, I examined the 27 items in the questionnaire that had at least four response options. I determined the proportion of these items for which
each respondent did each of the following: selected one of the first two options, selected the first option, selected one of the last two options, or selected the final option. If respondents who received an incentive made greater effort than those who did not, then they should have selected the first or last options less frequently. Again, the drawback to this approach is that differences in the observed prevalence of primacy or recency effects between the incentive and control group could potentially be due to true differences in the prevalence of the attitudes or behaviors asked about in these items. Therefore, four experiments were included in the questionnaire that varied the order of the response options; one half of the respondents received the initial order and the other half received the reverse order. If respondents who received an incentive made greater effort than those who did not receive one, then they should have selected the first or last options less often, regardless of presentation order. Lack of attention to question wording. If respondents are taking cognitive shortcuts they may not listen very carefully to the reading of survey items and may miss important details such as instructions to include or exclude certain categories from consideration in their response. In the current study, four experiments were conducted to determine the effect of the incentive on attention to question wording. For each item, respondents were randomly assigned to either the standard wording or the exclusion wording. For example, respondents in the standard condition were asked, “In the past seven days, how many servings of vegetables did you eat?” The exclusion condition respondents also were told, “Please do not include carrots, beans, or lettuce.” For each item, I determined the mean response provided in the standard and exclusion conditions and calculated the difference between these two means. If respondents who received an
incentive made greater effort than those who did not receive one, the difference between the standard and exclusion groups should be significantly greater among respondents in the incentive group. Use of round or prototypical values for numerical responses. Respondents often are asked to provide numerical responses to open-ended items. Respondents who are taking cognitive shortcuts may report round or prototypical numbers (such as multiples of 5 or 10) instead of making the extra effort to report an exact numerical response. The current study included 15 items that requested numerical responses. For all of the items, respondents who provided a value that was a multiple of five were considered to have given a round response. In addition, some items asked how often respondents did something in a week or a year; for these items, multiples of seven or twelve also were considered round responses. I determined the proportion of items for which each respondent provided a round response. If respondents who received an incentive made greater effort than those who did not receive one, then they should have provided round responses for a smaller proportion of the items. Estimation strategy for numerical responses. When respondents are asked to provide a number that indicates how often they perform a certain behavior, there are several approaches they can use to come up with an answer that require varying levels of effort (Tourangeau et al., 2000). Respondents who are trying to conserve cognitive energy may estimate their response instead of making the effort to recall each particular episode (Conrad, Brown, & Cashman, 1998). To assess the effect of the incentive on recall strategy, I asked respondents to indicate how they came up with their answer to one such question, in which they were asked how many times they had seen a doctor or other
health professional in 2010. Respondents who said they had seen a doctor at least once were then asked, “Which of the following describes how you came up with your answer: did you estimate based on a general impression; did you think about types of visits; did you think about how often you usually go to the doctor; or did you think about each individual visit?” Respondents could select multiple responses. Any response other than thinking about each individual visit suggested that respondents had estimated their response. If respondents who received an incentive made greater effort than those who did not receive one, then fewer of them should have estimated their response. Underreporting to filter items. Surveys often include groups of questions that consist of an initial filter question and a set of follow-up questions for those whose answers to the filter question indicate that the follow-up items apply to them. Interleafed formats are a relatively popular way to organize such questions in survey questionnaires. In this format, the follow-up items come immediately after the relevant filter question. With this format, however, respondents may learn that negative answers to the filter questions help them end the interview more quickly. Respondents who are trying to reduce their cognitive burden may then provide false negative responses to these filter items (Jensen, Watanabe, & Richter, 1999; Kreuter, McCulloch, Presser, & Tourangeau, 2011; Lucas et al., 1999). To determine the effect of the incentive on reports to filter items, I included a section in the questionnaire that consisted of six filter questions (and their follow-ups) in an interleafed format. Respondents were asked whether they had ever been diagnosed with six medical conditions, such as hypertension or diabetes. The order of these questions was randomized. If respondents replied affirmatively they were asked at what
age they had been diagnosed. This question may be burdensome enough that some respondents may have wanted to avoid answering it. As a result, they may have started responding negatively to the diagnosis questions that were presented later in the list. If respondents who received an incentive made greater effort than those who did not receive one, then presentation order should have a significantly smaller impact on their likelihood of providing an affirmative response. Interview length. Respondents who are fatigued may try to rush through the survey interview. Such respondents may listen to the survey questions less carefully and make less effort when formulating a response. For example, Malhotra (2008) found that overall completion time is a significant negative predictor of primacy effects for some subgroups of respondents. To determine the effect of the incentive on interview length, the start and end time of each interview was recorded. If respondents who received an incentive made greater effort than those who did not receive one, then their mean response time should be longer than that of the control group. Accuracy of survey reports as compared to frame data. All of the other indicators of respondent effort are indirect indicators of measurement error. In comparing respondent reports to data from the frame, we have a more direct indicator of measurement error – whether the self-report matches the frame, and, if it does not, the size of the discrepancy between the two pieces of information. To determine whether the incentive affected the accuracy of self-reports, I included one question that asked respondents for information that was also available on the frame – how many years they have lived in their current residence. Although there is likely some error in the frame data, we can assume that these errors are equally present for the incentive and control
groups due to the random assignment of sample members to experimental conditions. I compared survey reports to the frame data to determine the proportion of respondents whose answers matched the frame exactly. I also calculated the mean discrepancy between self-reports and frame data in the two groups. If respondents who received an incentive made greater effort than those who did not receive one, there should be a smaller number of discrepancies in the reports of incentive respondents and the mean discrepancy in reports should be smaller. Interviewer reports of effort. Through their interaction with the respondent throughout the interview, interviewers can formulate an impression of the amount of effort made by respondents. These impressions can incorporate effort indicators that are not captured by survey responses – such as respondents’ decision to consult records during the interview or indications that the respondent is distracted by other things happening in the room. To determine whether the incentive affected interviewers’ impression of respondent effort, the interviewers rated each respondent’s effort after the call was completed; they recorded the extent to which “the respondent answered the survey questions to the best of their ability,” on a five-point scale from “not at all” to “very often”. If respondents who received an incentive made greater effort than those who did not receive one, interviewers should have rated their effort as being greater than that of the control respondents.
2.2.3 Respondent Characteristics
Prior studies investigating the impact of incentives on data quality have found weak effects or no effects at all. This absence of findings may be due to the fact that existing studies have considered the effect of incentives on all respondents in the
aggregate. The effect of incentives on effort may be limited to certain subgroups of respondents. Analyzing all respondents at once may mask the effect the incentive has on these subgroups. The current study’s questionnaire included measures of several respondent characteristics hypothesized to impact whether an incentive would affect the level of effort put forth by respondents. Cognitive ability. I hypothesized that receiving an incentive would have little effect on the prevalence of satisficing among respondents with lower cognitive ability – because of their lower cognitive ability, these respondents cannot help satisficing; in contrast, the incentive should have a greater impact on the behavior of respondents who have fewer cognitive constraints. This study included three indicators of cognitive ability. The first was age; respondents over the age of 65 were considered to have lower cognitive ability. The second was education; respondents who reported having a high school diploma or less were considered to have lower cognitive ability. The final indicator was interviewer reports of respondent difficulty in answering the survey questions. After the interview was completed, the interviewers were asked to report how often the respondent had trouble understanding the survey questions on a five-point scale from “not at all” to “very often”; respondents who had trouble understanding the questions somewhat often, pretty often, or very often were considered to have lower cognitive ability. Conscientiousness. I hypothesized that the more internally motivated respondents were, the smaller the effect of the incentive would be on their effort during the interview. In the current study, I used self-reported conscientiousness as a proxy for intrinsic motivation. Conscientiousness is one of the five personality traits on which individuals
are said to vary on the “Big 5” personality inventory; people who rate highly on conscientiousness tend to be thorough and self-disciplined and to pay attention to details (Costa & McCrae, 1992). Respondents who are conscientious should be motivated to complete the questionnaire at a high level of quality regardless of whether they have received an incentive; conversely, the motivation and dedication of respondents who are not very conscientious may be affected by whether or not they have received an incentive. The questionnaire included 10 items intended to measure conscientiousness; these items were adapted from a measure developed by Buchanan, Johnson, and Goldberg (2005). Respondents were asked to what extent they agreed or disagreed that each item described them on a five-point scale from “strongly disagree” to “strongly agree”. Example items include, “I am always prepared,” and, “I do just enough work to get by”. I calculated the mean score that respondents gave across the items, with higher means indicating greater levels of conscientiousness. Incentive recall. Finally, I hypothesized that the incentive would have little impact on the behavior of respondents who did not recall receiving it. Although the advance letter was addressed to a specific individual and this same person was asked to complete the telephone interview, it remained possible that some of the respondents in the incentive group were not aware of the incentive. To measure incentive recall, respondents first were asked, “A letter describing this study may have been sent to your home recently. Do you remember seeing the letter?” Respondents who recalled the letter were asked an open-ended follow-up, “Do you happen to remember if there was anything else in the envelope with the letter?” If respondents said “yes”, interviewers were instructed to probe to determine what the respondent remembered being in the envelope.
PSRAI also kept track of advance letters that were returned because they were undeliverable; the respondents to whom these letters were sent were considered to be unaware of the advance letter and incentive.
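For the conscientiousness measure described earlier in this section, scoring might look like the sketch below. The text states only that a mean was taken across the ten items; whether negatively keyed items such as “I do just enough work to get by” were reverse-scored first is not specified, so the reverse-coding step and the item ratings shown here are assumptions.

# Illustrative scoring of a 10-item conscientiousness scale rated 1-5
# (strongly disagree to strongly agree). Which items are negatively keyed,
# and the reverse-coding step itself, are assumptions for this sketch.
def conscientiousness_score(ratings, reverse_keyed=frozenset()):
    """Mean rating across items, reverse-scoring negatively keyed items (1 <-> 5)."""
    rescored = [(6 - r) if i in reverse_keyed else r for i, r in enumerate(ratings)]
    return sum(rescored) / len(rescored)

# Example: ten hypothetical ratings; items 5 and 9 (0-indexed) assumed reverse-keyed.
print(conscientiousness_score([4, 5, 3, 4, 4, 2, 5, 4, 3, 1], reverse_keyed={5, 9}))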
2.3 RESULTS
This section presents the results of the incentive experiment described above.
First, I discuss the effect of the incentive on the response rate, cost per complete, and sample composition. Then, I review the effect of the incentive on several indicators of respondent effort. Finally, I discuss whether the effect differed depending on respondent characteristics.
2.3.1 Outcome Rates
Table 2.1 provides an overview of the outcomes of the data collection effort. The overall response rate was 15.7% (AAPOR RR1). The response rate was significantly higher in the incentive condition (22.8%) than it was in the control condition (10.9%). The cooperation rate also was significantly higher in the incentive condition than in the control condition (39.2% and 20.0%, respectively), while the refusal rate was significantly lower (32.4% and 41.0%, respectively). Although not anticipated, the contact rate also was significantly higher in the incentive condition than it was in the control condition (58.1% and 54.7%, respectively). This may have been the result of significantly increased advance letter recall in the incentive group (86.5% and 66.0%, respectively); the advance letter mentioned the name of the survey research firm that would be calling, and improved recall of the letter may have led more of the incentive sample members to recognize the name when it appeared on caller ID.
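The significance tests reported in Table 2.1 are chi-square tests with one degree of freedom. The sketch below shows such a test on a 2 x 2 table of respondents versus nonrespondents by condition; the counts are hypothetical placeholders and do not reproduce the table's figures, whose denominators follow AAPOR outcome-rate definitions not detailed here.

# Illustrative chi-square test comparing response rates between the incentive
# and control conditions. Counts are hypothetical, not the study's actual
# case dispositions.
from scipy.stats import chi2_contingency

#         [responded, did not respond]
table = [[650, 2230],   # incentive condition (hypothetical)
         [470, 3849]]   # control condition (hypothetical)

chi2, p_value, dof, expected = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.2f}, p = {p_value:.4g}")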
Table 2.1. Outcome Rates, by Incentive Condition

                              Total        Incentive    Control      χ²(1)
Response Rate (RR1)           15.7%        22.8%        10.9%        145.58***
Refusal Rate (REF1)           37.6%        32.4%        41.0%         44.43***
Contact Rate (CON1)           56.1%        58.1%        54.7%          6.42*
  Sample size                 (7,199)A     (2,880)      (4,319)
Cooperation Rate (COOP1)      28.0%        39.2%        20.0%        142.61***
  Sample size                 (3,216)      (1,337)      (1,879)
Remembered advance letter     77.9%        86.5%        66.0%         53.38***
  Sample size                 (900)        (524)        (376)

A Due to an error with the call scheduling software, one sample case was never called.
*p