Can Racial Bias in Policing Be Credibly Estimated Using Data Contaminated by Post-Treatment Selection?

Abstract

Studies of racial bias in policing often rely on data contaminated by selection, for example using records of stops or arrests (which may themselves be a product of racial bias) to estimate discrimination in subsequent actions such as use of force. This feature raises the threat of post-treatment selection bias, which recent work shows can lead to severe underestimates of discrimination. However, prominent studies continue to ignore this issue, employing standard regression techniques with contaminated data. In this paper, we formally analyze the key identifying assumption undergirding these studies, “subset ignorability,” and show it corresponds to the measure-zero set of knife-edge conditions in which differing biases happen to sum to zero. Because there is no substantive reason to believe such accidental cancellation would occur, we conclude this approach is not reliable in applied research, and we emphasize the need for continued caution and increased rigor in high-stakes analyses of discriminatory policing with contaminated data.
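The selection problem described above can be illustrated with a small toy calculation. This is only a sketch under assumptions that are not taken from the paper: a latent "suspicion" score U is uniform on [0, 1], officers stop a civilian when U exceeds a race-specific threshold, and the probability of force is `a * U + b` for minority civilians and `a * U` for white civilians. The names `naive_gap`, `a`, `b`, `t_white`, and `t_minority` are hypothetical. Under these assumptions, the naive comparison among stopped civilians recovers the bias `b` only when the stop thresholds happen to coincide.

```python
def naive_gap(a, b, t_white, t_minority):
    """Minority-minus-white difference in force rates among stopped civilians.

    Toy model (an assumption, not the paper's model):
      U ~ Uniform(0, 1) is latent suspicion,
      a civilian is stopped when U exceeds the group's threshold,
      P(force) = a * U + b for minority civilians, a * U for white civilians.
    """
    # For U ~ Uniform(0, 1), E[U | U > t] = (1 + t) / 2.
    mean_u_white = (1 + t_white) / 2
    mean_u_minority = (1 + t_minority) / 2
    force_white = a * mean_u_white
    force_minority = a * mean_u_minority + b
    return force_minority - force_white


true_bias = 0.10  # b: extra force probability applied to minority civilians

# Biased stops: minority civilians are stopped at a lower suspicion threshold.
biased_stops = naive_gap(a=0.4, b=true_bias, t_white=0.8, t_minority=0.5)

# Knife-edge case: identical thresholds, so stopped groups are comparable.
equal_stops = naive_gap(a=0.4, b=true_bias, t_white=0.8, t_minority=0.8)

print("true bias in force:       ", true_bias)
print("naive gap, biased stops:  ", round(biased_stops, 4))
print("naive gap, equal stops:   ", round(equal_stops, 4))
```

In this sketch the naive gap equals `b - a * (t_white - t_minority) / 2`, so any difference in stop thresholds pulls the estimate away from `b`. A lower threshold for minority civilians pulls it downward, which matches the direction of the underestimate described above.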

Full text