Auditing and Regulating AI Systems
Friday, Jan. 7, 2022 10:00 AM - 12:00 PM (EST)
- Chair: Adair Morse, University of California-Berkeley
Characterizing Fairness Over the Set of Good Models Under Selective Labels
Abstract: Algorithmic risk assessments are increasingly used to make and inform decisions in a wide variety of high-stakes settings. In practice, there is often a multitude of predictive models that deliver similar overall performance, an empirical phenomenon commonly known as the “Rashomon Effect.” While many competing models may perform similarly overall, they may have different properties over various subgroups, and therefore have drastically different predictive fairness properties. In this paper, we develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance, or “the set of good models.” We provide tractable algorithms to compute the range of attainable group-level predictive disparities and the disparity-minimizing model over the set of good models. We extend our framework to address the empirically relevant challenge of selectively labeled data in the setting where the selection decision and outcome are unconfounded given the observed data features. We illustrate our methods in two empirical applications. In a real-world credit-scoring task, we build a model with lower predictive disparities than the benchmark model, and demonstrate the benefits of properly accounting for the selective labels problem. In a recidivism risk prediction task, we audit an existing risk score, and find that it generates larger predictive disparities than any model in the set of good models.
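The "set of good models" idea from the abstract can be illustrated with a minimal sketch: enumerate candidate models, keep those whose overall accuracy is within a tolerance epsilon of the best, and report the range of group-level predictive disparities attainable inside that set. Everything below (the toy data, the threshold-rule candidates, the epsilon value) is an invented illustration, not the paper's actual algorithm.

```python
# Hypothetical sketch: disparity range over an epsilon-Rashomon set.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: features X, binary outcome y, binary group indicator g.
n = 2000
X = rng.normal(size=(n, 3))
g = rng.integers(0, 2, size=n)
y = (X[:, 0] + 0.5 * g + rng.normal(size=n) > 0).astype(int)

def accuracy(pred, y):
    return np.mean(pred == y)

def disparity(pred, g):
    # Difference in positive-prediction rates between the two groups.
    return abs(pred[g == 1].mean() - pred[g == 0].mean())

# Candidate models: threshold rules on random linear scores.
candidates = []
for _ in range(500):
    w = rng.normal(size=3)
    score = X @ w
    pred = (score > np.median(score)).astype(int)
    candidates.append((accuracy(pred, y), disparity(pred, g)))

best_acc = max(a for a, _ in candidates)
eps = 0.02  # tolerance defining "similar overall performance"
good = [d for a, d in candidates if a >= best_acc - eps]

print(f"disparity range over the good set: [{min(good):.3f}, {max(good):.3f}]")
```

A wide reported range means an auditor has real slack: some equally accurate models are far less disparate than others, which is exactly what the paper's audit of an existing risk score exploits.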
Mitigating Bias in Algorithmic Hiring: Evaluating Claims and Practices
Abstract: There has been rapidly growing interest in the use of algorithms in hiring, especially as a means to address or mitigate bias. Yet, to date, little is known about how these methods are used in practice. How are algorithmic assessments built, validated, and examined for bias? In this work, we document and analyze the claims and practices of companies offering algorithms for employment assessment. In particular, we identify vendors of algorithmic pre-employment assessments (i.e., algorithms to screen candidates), document what they have disclosed about their development and validation procedures, and evaluate their practices, focusing particularly on efforts to detect and mitigate bias. Our analysis considers both technical and legal perspectives. Technically, we consider the various choices vendors make regarding data collection and prediction targets, and explore the risks and trade-offs that these choices pose. We also discuss how algorithmic de-biasing techniques interface with, and create challenges for, antidiscrimination law.
Adaptive maximization of social welfare
Abstract: We consider the problem of repeatedly choosing policy parameters in order to maximize social welfare, defined as the weighted sum of private utility and public revenue. The outcomes of earlier policy choices inform later choices. In contrast to multi-armed bandit models, utility is not observed, but needs to be indirectly inferred as equivalent variation. In contrast to standard optimal tax theory, response functions need to be learned through policy choices.
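The adaptive structure described in the abstract, where earlier policy outcomes inform later choices, can be sketched as a simple epsilon-greedy loop over a discrete grid of tax rates. All functional forms below are invented for illustration: demand, the surplus proxy standing in for equivalent variation, and the welfare weight are assumptions, not the paper's model.

```python
# Illustrative-only sketch: adaptively choosing a tax rate to maximize
# an estimated welfare = weighted private surplus + public revenue.
import random

random.seed(0)

TAX_RATES = [0.1, 0.2, 0.3, 0.4, 0.5]
ALPHA = 0.5  # hypothetical welfare weight on private utility vs. revenue

def observe_demand(t):
    # Response function unknown to the planner; must be learned from data.
    return max(0.0, 1.0 - t) + random.gauss(0, 0.05)

def welfare_estimate(t, q):
    # Revenue is t * q; private surplus is proxied by q**2 / 2 under a
    # linear-demand assumption (a crude stand-in for equivalent variation).
    return ALPHA * (q ** 2 / 2) + (1 - ALPHA) * t * q

# Epsilon-greedy loop: earlier outcomes inform later policy choices.
counts = {t: 0 for t in TAX_RATES}
means = {t: 0.0 for t in TAX_RATES}
for step in range(2000):
    if random.random() < 0.1:
        t = random.choice(TAX_RATES)                 # explore
    else:
        t = max(TAX_RATES, key=lambda r: means[r])   # exploit
    w = welfare_estimate(t, observe_demand(t))
    counts[t] += 1
    means[t] += (w - means[t]) / counts[t]           # running-mean update

best = max(TAX_RATES, key=lambda r: means[r])
print("estimated welfare-maximizing rate:", best)
```

The contrast the abstract draws is visible here: unlike a standard bandit, the reward is not observed directly but computed from an inferred utility term, and unlike static optimal tax theory, the demand response is learned only through the rates actually tried.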
- G0 - General
- C1 - Econometric and Statistical Methods and Methodology: General