Working Papers

A Classroom Observer Like Me: The Effect of Demographic Congruence Between Teachers and Raters on Observation Scores (Job Market Paper: Current Draft)

Abstract: Over the past decade, U.S. states and districts have come to rely on teacher evaluations – and in particular, classroom observations – as a key lever for teacher accountability. However, important questions remain about the extent to which observation-based ratings are influenced by factors beyond teachers’ control. In this paper, I use longitudinal data from a large district in the southeastern United States to examine the effects of demographic congruence between teachers and observers on teachers’ observation scores. In the district I study, teachers receive multiple rounds of classroom observations each school year, and teachers may be observed by different raters across rounds. I exploit the data generated by this classroom observation scheme to identify demographic interactions, estimating models that include both teacher-year fixed effects and observer-round-year fixed effects. I find that teachers, on average, experience small positive effects on their scores from sharing race or gender with their observer. The results raise fairness concerns for teachers whose demographics are not reflected by any of their raters, and they suggest that those who use observation scores in decision-making should carefully consider the circumstances and context in which the scores were generated.

Can a "Big Data" Commercial Screening Tool Help Select Better Teachers? (with Matthew A. Lenard)


Teacher Skill Development: Evidence from Performance Ratings by Principals (with Matthew A. Kraft and John P. Papay). 2020. Journal of Policy Analysis and Management, 39(2): 315-347. Also available as EdWorkingPaper No. 19-97.

Abstract: We examine the dynamic nature of teacher skill development using panel data on principals’ subjective performance ratings of teachers. Past research on teacher productivity improvement has focused primarily on one important but narrow measure of performance: teachers’ value-added to student achievement on standardized tests. Unlike value-added, subjective performance ratings provide detailed information about specific skill dimensions and are available for teachers in non-tested grades and subjects. Using a within-teacher returns-to-experience framework, we find, on average, large and rapid improvements in teachers’ instructional practices throughout their first 10 years on the job as well as substantial differences in improvement rates across individual teachers. We also document that subjective performance ratings contain important information about teacher effectiveness. In the district we study, principals appear to differentiate teacher performance throughout the full distribution instead of just in the tails. Furthermore, prior performance ratings and gains in these ratings provide additional information about teachers’ ability to improve test scores that is not captured by prior value-added scores. Taken together, our study provides new insights on teacher performance improvement and variation in teacher development across instructional skills and individual teachers.