Evaluating public employees' performance is essential to inducing greater work effort. We use an experiment in two provinces of China to explore how to design such evaluations. Results show that the incentive effect of evaluation can be larger if employees do not know ex ante who the evaluator will be, which reduces attempts to personally influence the evaluator and instead enhances job achievement.
Subjective performance evaluation is widely used by organizations to incentivize their employees. This is especially true in the public sector, as most civil service jobs are inherently multidimensional and vaguely defined, making it almost impossible to set precise and objective measures of work performance. Economists have long conjectured that “leader subjective evaluations” could induce “influence activities”: under such evaluations, agents might make efforts to please the evaluating leader, rather than perform tasks that could benefit the organization (Milgrom and Roberts 1988; Milgrom 1988). But despite the prevalence of subjective evaluations and rich theoretical insights on the topic, there has been little rigorous empirical evidence on the existence and implications of influence activities. We set out to fill this gap by conducting a large-scale field experiment among rural civil servants in China.
In our experiment, we collaborated with two provincial governments in China and randomized two different subjective performance evaluation schemes across more than 3,700 CGCSs. The experimental design is illustrated in Figure 1. For every CGCS in our sample, we randomly selected one of the two township leaders as the evaluator, leaving the other township leader as the non-evaluator. For a CGCS randomized into the “revealed scheme,” we revealed the identity of the evaluator at the beginning of the evaluation year, so the CGCS knew exactly where to direct influence activities in order to improve the evaluation outcomes. For a CGCS randomized into the “masked scheme,” we masked the identity of the evaluator until the end of the evaluation year, so the CGCS did not know which leader to influence in order to improve the evaluation outcomes. The two township leaders themselves were never informed whether they had been chosen as the evaluator or the non-evaluator.
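The assignment logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used in the experiment: the function name, the record fields, the identifiers, the equal assignment probabilities, and the choice to randomize the scheme independently for each CGCS are all hypothetical.

```python
import random

SCHEMES = ("revealed", "masked")


def assign_evaluation(leaders_by_cgcs, seed=0):
    """Randomly pick an evaluator and a scheme for each CGCS.

    leaders_by_cgcs maps a CGCS id to a pair of township leader ids.
    Assumption: both draws are independent and equally likely; the
    source does not state the unit or the probabilities of assignment.
    """
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    assignments = {}
    for cgcs_id, (leader_a, leader_b) in sorted(leaders_by_cgcs.items()):
        # A random ordering of the two leaders: first is the evaluator.
        evaluator, non_evaluator = rng.sample([leader_a, leader_b], 2)
        scheme = rng.choice(SCHEMES)
        assignments[cgcs_id] = {
            "evaluator": evaluator,
            "non_evaluator": non_evaluator,
            "scheme": scheme,
            # Revealed: the CGCS learns the evaluator at the start of the year.
            # Masked: the identity is withheld until the end of the year.
            "identity_disclosed_at_start": scheme == "revealed",
        }
    return assignments


# Hypothetical ids, for illustration only.
example = assign_evaluation(
    {"cgcs_001": ("leader_A", "leader_B"), "cgcs_002": ("leader_C", "leader_D")},
    seed=42,
)
```

Because the evaluator is drawn at random in both schemes, any systematic gap between the two leaders' assessments can be attributed to what the CGCS does after learning (or not learning) the evaluator's identity.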
In the revealed scheme, since the evaluator and non-evaluator were randomly assigned, in the absence of any evaluator-specific influence activities, both leaders should on average be equally positive about the performance of the CGCS. However, as shown in Figure 2, when we collected the CGCS performance assessments from both leaders in the exit survey, the randomly chosen evaluator was significantly more positive about the CGCS than his or her (randomized) non-evaluating counterpart. This suggests that the CGCS was indeed willing and able to engage in evaluator-specific influence activities to improve the evaluation outcomes. In comparison, in the masked scheme, the CGCS no longer knew where to direct influence activities, and accordingly we observed no asymmetry in the assessments of the two leaders.
Note: “Evaluator edge” is defined as the difference between “evaluator assessment score” and “non-evaluator assessment score,” both variables being on a scale of 1 to 7. If both leaders were equally positive about the CGCS, “evaluator edge” should equal 0.
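As a rough illustration of the definition in the note, the following Python sketch computes the evaluator edge for a single CGCS and averages it by scheme. The function names and the toy records are hypothetical and invented purely to show the arithmetic; they are not data or results from the experiment.

```python
from statistics import mean


def evaluator_edge(evaluator_score, non_evaluator_score):
    """Evaluator assessment minus non-evaluator assessment (both on 1 to 7)."""
    for score in (evaluator_score, non_evaluator_score):
        if not 1 <= score <= 7:
            raise ValueError("assessment scores must lie between 1 and 7")
    return evaluator_score - non_evaluator_score


def mean_edge_by_scheme(records):
    """Average the evaluator edge within each scheme.

    records is an iterable of (scheme, evaluator_score, non_evaluator_score).
    """
    edges = {}
    for scheme, ev_score, non_ev_score in records:
        edges.setdefault(scheme, []).append(evaluator_edge(ev_score, non_ev_score))
    return {scheme: mean(values) for scheme, values in edges.items()}


# Invented toy records, NOT experimental data.
toy_records = [
    ("revealed", 6, 5),
    ("revealed", 5, 5),
    ("masked", 5, 5),
    ("masked", 4, 5),
]
toy_means = mean_edge_by_scheme(toy_records)
```

A mean edge close to 0 within a scheme is what one would expect if both leaders were equally positive about the CGCS; a clearly positive mean edge indicates that evaluators rate more favorably than non-evaluators.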
If the CGCSs under the revealed scheme did engage in evaluator-specific influence activities, some of these activities would be noticed by their colleagues in the same office, who observe CGCS behavior on a daily basis. Therefore, to corroborate the existence of influence activities, at the end we asked the colleagues to guess which of the two leaders would be more positive about the CGCS. Since we never informed the colleagues whom we had chosen as the evaluator, if they observed no influence activities by the CGCS, they should on average think that both leaders are equally likely to be more positive about the CGCS. However, as shown in Figure 3, in the revealed scheme, colleagues were more likely to (correctly) predict that the randomly chosen evaluator would be more positive about CGCS performance. In comparison, under the masked scheme, when the CGCS no longer knew which leader to try to influence, colleagues could no longer predict which leader would be more positive. These results suggest that the CGCS’s influence activities to improve evaluation outcomes could be partially observed by their colleagues.
When we switched from the revealed scheme to the masked scheme, from the CGCS’s perspective, leader-specific influence activities became riskier and less beneficial, because the leader who was influenced might not be the evaluator. This incentivized the CGCS to reallocate effort from “leader-specific influence activities” to “common productive dimensions” that would be appreciated by both leaders. Therefore, we expected work performance to improve when we masked evaluator identity. Consistent with this hypothesis, we found that the average colleague assessment of CGCS performance was significantly higher in the masked scheme than in the revealed scheme (Figure 4), and so was the average leader assessment. In addition, we found that CGCSs under the masked scheme earned higher monthly performance pay. Since performance pay is directly determined by simple objective indicators such as overtime work and nighttime shifts, this finding suggests that the improved work performance could also be benchmarked with objective measures. In the paper, we further rule out potential alternative explanations for our findings.
Note: The outcome is the average assessment score of CGCS performance given by colleagues, on a scale of 1 to 7.
Our paper provides the first rigorous empirical evidence of the existence and implications of influence activities. We found that state employees are able to use evaluator-specific influence to affect evaluation outcomes, and that this process is partly observed by their co-workers. Moreover, our findings have important policy implications. We found that by randomizing the identity of the evaluating supervisor, which has minimal implementation cost, the government could significantly improve the job performance of its employees. Tailored versions of our intervention could potentially be applied to the more than 50 million state employees in China, and might be useful in other contexts where high-stakes rewards depend on subjective evaluations in the corresponding hierarchy.
de Janvry, Alain, Guojun He, Elisabeth Sadoulet, Shaoda Wang, and Qiong Zhang. 2019. Influence Activities and Bureaucratic Performance: Evidence from a Large-Scale Field Experiment in China. New York: Mimeo.
Milgrom, Paul. 1988. “Employment Contracts, Influence Activities, and Efficient Organization Design.” Journal of Political Economy 96 (1): 42–60. https://doi.org/10.1086/261523.
Milgrom, Paul, and John Roberts. 1988. “An Economic Approach to Influence Activities in Organizations.” American Journal of Sociology 94: S154–S179. https://doi.org/10.1086/228945.