Algorithmic Fairness and Bias
Saturday, Jan. 4, 2020 10:15 AM - 12:15 PM (PST)
- Chair: Bo Cowgill, Columbia University
Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics
Abstract: Why does "algorithmic bias" occur? The two most frequently cited reasons are "biased programmers" and "biased training data." We quantify the effects of these using a field experiment on a diverse group of AI practitioners. In our experiment, machine learning programmers are asked to predict math literacy scores for a representative sample of OECD residents. One group is given perfectly representative training data, and the other is given a "dataset of convenience": a biased training sample containing people who conform to common expectations about who is good at math. Using this field experiment, we quantify the benefits of employing programmers who are diversity-aware vs. obtaining more representative training data. We also measure the effectiveness of training interventions to reduce algorithmic bias, including both reminders and technical guidance.
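The contrast between representative training data and a "dataset of convenience" can be illustrated with a small synthetic sketch. Everything below is an assumption made for illustration and is not the authors' data or method: the two groups "A" and "B", the score distribution, the `conforms` selection rule, and the group-mean "model" are all hypothetical. The true scores are drawn identically for both groups, so any gap learned from the convenience sample comes from how the sample was selected.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Synthetic population (assumption): both groups share the same score distribution.
population = [
    {"group": g, "score": random.gauss(500, 100)}
    for g in ("A", "B")
    for _ in range(1000)
]

def conforms(person):
    """Hypothetical stereotype: group A is 'good at math', group B is not."""
    if person["group"] == "A":
        return person["score"] >= 500
    return person["score"] < 500

# Representative sample: drawn uniformly from the whole population.
representative = random.sample(population, 400)

# Convenience sample: drawn only from people who conform to the stereotype.
convenience = random.sample([p for p in population if conforms(p)], 400)

def fit_group_means(sample):
    """A deliberately naive 'model': predict each group's mean training score."""
    return {
        g: mean(p["score"] for p in sample if p["group"] == g)
        for g in ("A", "B")
    }

def predicted_gap(sample):
    """Predicted score for group A minus predicted score for group B."""
    means = fit_group_means(sample)
    return means["A"] - means["B"]

rep_gap = predicted_gap(representative)
conv_gap = predicted_gap(convenience)
print(f"gap learned from representative data: {rep_gap:.1f}")
print(f"gap learned from convenience data:    {conv_gap:.1f}")
```

In this toy setup the model trained on the convenience sample learns a large group gap that does not exist in the population, while the representative sample yields a gap near zero.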
Regulating Discrimination in the Presence of Algorithms
Abstract: The ambiguity of human decision-making often makes it extraordinarily hard for the legal system to know whether anyone has actually discriminated. So, to fully understand how algorithms affect discrimination, we must also understand how they affect the problem of detecting discrimination. In this paper, we present a model that illustrates how the introduction of algorithmic decision-making changes the problem of detecting discrimination in markets. We focus on a regulator that wishes to detect discriminatory hiring practices. At its core, the regulator faces an asymmetric information problem: if the regulator observes that a firm hires few minority workers, the regulator cannot know whether this is due to unobserved differences in productivity or discriminatory hiring practices. We then provide two main results on how algorithmic decision-making interacts with this asymmetric information problem. First, without proper safeguards in place, the introduction of algorithmic decision-making may make it more difficult for the regulator to detect discriminatory practices. In this case, the algorithm becomes a black box that can serve to justify existing discriminatory practices. Second, if the regulator can "audit" the firm's algorithm by querying its outputs, algorithmic decision-making makes it easier for the regulator to detect discrimination, as it provides a new source of transparency that is otherwise not available. Our results highlight that the rise of algorithmic decision-making does not necessarily worsen or alleviate discrimination; its effects depend crucially on the existing regulatory environment.
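One way to picture an "audit" by querying outputs is a paired test: the regulator submits applicant profiles that are identical except for group membership and compares the scores. The sketch below is a minimal illustration under stated assumptions, not the model from the paper. The function `paired_audit`, the two scoring functions, the `skill` and `group` fields, and the group labels "A" and "B" are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Applicant = Dict[str, object]

def paired_audit(
    score: Callable[[Applicant], float],
    applicants: List[Applicant],
    group_key: str = "group",
    groups: Tuple[str, str] = ("A", "B"),
) -> float:
    """Mean score gap between otherwise-identical applicants.

    For each profile, query the algorithm twice, changing only the group
    label, and average score(groups[0]) - score(groups[1]).
    """
    gaps = []
    for a in applicants:
        first = dict(a, **{group_key: groups[0]})
        second = dict(a, **{group_key: groups[1]})
        gaps.append(score(first) - score(second))
    return sum(gaps) / len(gaps)

# Two hypothetical firm algorithms, treated as black boxes by the regulator.
def neutral_score(a: Applicant) -> float:
    return 0.5 * a["skill"]

def biased_score(a: Applicant) -> float:
    penalty = 0.2 if a["group"] == "B" else 0.0
    return 0.5 * a["skill"] - penalty

applicants = [{"skill": s, "group": "A"} for s in (1.0, 2.0, 3.0)]

neutral_gap = paired_audit(neutral_score, applicants)
biased_gap = paired_audit(biased_score, applicants)
print("neutral algorithm gap:", neutral_gap)
print("biased algorithm gap: ", biased_gap)
```

A gap of zero for every paired query is what a group-blind algorithm produces here; a persistent nonzero gap is the kind of signal that is not available when the same decision is made by a human whose reasoning cannot be queried.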
Algorithmic Risk Assessment in the Hands of Humans
Abstract: We evaluate the impacts of adopting algorithmic predictions of future offending (risk assessments) as an aid to judicial discretion in felony sentencing. We find that judges' decisions are influenced by the risk score, leading to longer sentences for defendants with higher scores and shorter sentences for those with lower scores. However, we find no robust evidence that this reshuffling led to a decline in recidivism, and, over time, judges appeared to use the risk scores less. Risk assessment's failure to reduce recidivism is at least partially explained by judicial discretion in its use. Judges systematically grant leniency to young defendants, despite their high risk of reoffending. This is in line with a long-standing practice of treating youth as a mitigator in sentencing, due to lower perceived culpability. Such a conflict in goals may have led prior studies to overestimate the extent to which judges make prediction errors. Since one of the most important inputs to the risk score is effectively off-limits, risk assessment's expected benefits are curtailed. We find no evidence that risk assessment affected racial disparities statewide, although there was a relative increase in sentences for black defendants in courts that appeared to use risk assessment most. We conduct simulations to evaluate how race and age disparities would have changed if judges had fully complied with the sentencing recommendations associated with the algorithm. Racial disparities might have increased slightly, but the largest change would have been higher relative incarceration rates for defendants under the age of 23. In the context of contentious public discussions about algorithms, our results highlight the importance of thinking about how man and machine interact.
University of Toronto
Massachusetts Institute of Technology
Harvard Business School
- J7 - Labor Discrimination
- C6 - Mathematical Methods; Programming Models; Mathematical and Simulation Modeling