Machine Learning, Prediction Errors, and Causal Inference

Paper Session

Sunday, Jan. 4, 2026 2:30 PM - 4:30 PM (EST)

Philadelphia Convention Center, 204-B
Hosted By: American Economic Association
  • Chair: Matthew Gordon, Paris School of Economics

Remote Control: Debiasing Machine Learning Predictions for Causal Inference

Matthew Gordon, Paris School of Economics
Megan Ayers, Yale University
Eliana Stone, Yale University
Luke Sanford, Yale University

Abstract

Advances in machine learning and the increasing availability of high-dimensional data have led to the proliferation of social science research that uses the predictions of machine learning models as proxies for measures of human activity or environmental outcomes. However, prediction errors can lead to bias when estimating regression coefficients. In this paper, we show how this bias can arise, and demonstrate the use of an adversarial machine learning algorithm to debias predictions. These methods are applicable to any setting where machine-learned predictions are the dependent variable in a regression. We conduct simulations and empirical exercises using ground-truth and satellite data on forest cover in Africa. Using the predictions from a standard machine learning model leads to biased parameter estimates, while the predictions from the adversarial model give precise estimates of the true effects. Finally, we replicate a study of the effects of artisanal gold mining on deforestation in Africa, and we find that after correcting for bias using a novel sample of hand-labeled points, standard confidence intervals cannot rule out a null effect, even though our confidence intervals are 19% smaller than those obtained using alternative bias correction methods.
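The bias mechanism the abstract describes can be illustrated with a minimal simulation (not the paper's adversarial method; a hypothetical setup where the ML model's prediction error is correlated with the regressor, e.g. the model systematically over-predicts forest cover where the covariate is large):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta = 2.0
x = rng.normal(size=n)
y = beta * x + rng.normal(size=n)          # true outcome

# Prediction error correlated with the regressor: the ML proxy
# y_hat is not just y plus noise, its error loads on x as well.
y_hat = y + 0.5 * x + rng.normal(scale=0.5, size=n)

def ols_slope(x, y):
    """Simple bivariate OLS slope: Cov(x, y) / Var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x)

print(ols_slope(x, y))      # close to 2.0 (ground-truth outcome)
print(ols_slope(x, y_hat))  # close to 2.5 (ML proxy: biased)
```

Classical (mean-zero, covariate-independent) prediction error in the dependent variable would only inflate standard errors; it is the correlation between error and regressor that shifts the coefficient, which is what debiasing methods target.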

Program Evaluation with Remotely Sensed Outcomes

Ashesh Rambachan, Massachusetts Institute of Technology
Rahul Singh, Harvard University
Davide Viviano, Harvard University

Abstract

While traditional program evaluations typically rely on surveys to measure outcomes, certain economic outcomes such as living standards or environmental quality may be infeasible or costly to collect. As a result, recent empirical work estimates treatment effects using remotely sensed variables (RSVs), such as mobile phone activity or satellite images, instead of ground-truth outcome measurements. Common practice predicts the economic outcome from the RSV, using an auxiliary sample of labeled RSVs, and then uses such predictions as the outcome in the experiment. We prove that this approach leads to biased estimates of treatment effects when the RSV is a post-outcome variable. We nonparametrically identify the treatment effect, using an assumption that reflects the logic of recent empirical research: the conditional distribution of the RSV remains stable across both samples, given the outcome and treatment. Our results do not require researchers to know or consistently estimate the relationship between the RSV, outcome, and treatment, which is typically mis-specified with unstructured data. We form a representation of the RSV for downstream causal inference by predicting the outcome and predicting the treatment, with better predictions leading to more precise causal estimates. We re-evaluate the efficacy of a large-scale public program in India, showing that the program’s measured effects on local consumption and poverty can be replicated using satellite imagery.

Inference for Regression with Variables Generated by AI or Machine Learning

Laura Battaglia, University of Oxford
Timothy Christensen, Yale University
Stephen Hansen, University College London
Szymon Sacher, Meta

Abstract

It has become common practice for researchers to use AI-powered information retrieval algorithms or other machine learning methods to estimate variables of economic interest, then use these estimates as covariates in a regression model. We show both theoretically and empirically that naively treating AI- and ML-generated variables as “data” leads to biased estimates and invalid inference. We propose two methods to correct bias and perform valid inference: (i) an explicit bias correction with bias-corrected confidence intervals, and (ii) joint maximum likelihood estimation of the regression model and the variables of interest. Through several applications, we demonstrate that the common approach generates substantial bias, while both corrections perform well.
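The generated-covariate problem has a textbook analogue in classical errors-in-variables: regressing on a noisy estimate of the variable of interest attenuates the coefficient, and an explicit correction rescales by the reliability ratio. The sketch below illustrates that attenuation-and-correction logic only; it is not the paper's estimator, and the reliability ratio is assumed known (in practice it would come from something like a validation sample):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta = 1.5
x = rng.normal(size=n)                 # latent variable of economic interest
y = beta * x + rng.normal(size=n)

# The researcher observes only an ML-generated estimate of x.
x_hat = x + rng.normal(scale=0.6, size=n)

# Naive OLS on the generated covariate is attenuated toward zero.
naive = np.cov(x_hat, y)[0, 1] / np.var(x_hat)

# Classical correction: divide by the reliability ratio Var(x)/Var(x_hat)
# (assumed known here for illustration).
reliability = np.var(x) / np.var(x_hat)
corrected = naive / reliability
```

Uncertainty in the estimated reliability would also have to be propagated into the confidence intervals, which is the kind of inference problem the paper's two proposed methods address.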

Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling

Kerri Lu, Massachusetts Institute of Technology
Dan Kluger, Massachusetts Institute of Technology
Tijana Zrnic, Stanford University
Sherrie Wang, Massachusetts Institute of Technology
Stephen Bates, Massachusetts Institute of Technology

Abstract

Machine learning models are increasingly used to produce predictions that serve as input data in subsequent statistical analyses. For example, computer vision predictions of economic and environmental indicators based on satellite imagery are used in downstream regressions; similarly, language models are widely used to approximate human ratings and opinions in social science research. However, failure to properly account for errors in the machine learning predictions renders standard statistical procedures invalid. Prior work uses what we call the Predict-Then-Debias estimator to give valid confidence intervals when machine learning algorithms impute missing variables, assuming a small complete sample from the population of interest. We expand the scope by introducing bootstrap confidence intervals that apply when the complete data is a nonuniform (i.e., weighted, stratified, or clustered) sample and to settings where an arbitrary subset of features is imputed. Importantly, the method can be applied to many settings without requiring additional calculations. We prove that these confidence intervals are valid under no assumptions on the quality of the machine learning model and are no wider than the intervals obtained by methods that do not use machine learning predictions.
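In its simplest form (estimating a population mean, with a uniform complete sample; the paper's contribution extends this to nonuniform sampling and imputed covariates), a Predict-Then-Debias estimate averages the model's predictions over the large sample and corrects with the prediction residuals from the small labeled sample. A minimal sketch with simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: many units with only an ML prediction, plus a
# small complete sample with both the prediction and the true outcome.
N, n = 50_000, 500
true_mean = 1.0
model_bias = 0.2
pred_unlabeled = true_mean + model_bias + rng.normal(scale=0.5, size=N)
y_labeled = true_mean + rng.normal(scale=0.5, size=n)
pred_labeled = y_labeled + model_bias + rng.normal(scale=0.3, size=n)

# Naive: average the predictions; inherits the model's bias.
naive = pred_unlabeled.mean()

# Predict-Then-Debias: add the mean residual from the labeled sample.
debiased = pred_unlabeled.mean() + (y_labeled - pred_labeled).mean()
```

The correction term is unbiased regardless of how poor the ML model is, which is why validity requires no assumptions on prediction quality; better predictions simply shrink the residual variance and tighten the interval.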

Discussant(s)
Sylvia Klosin, University of California-Davis
Paul Goldsmith-Pinkham, Yale University
Simon Ramirez Amaya, University of California-Berkeley
Ed Rubin, University of Oregon
JEL Classifications
  • C4 - Econometric and Statistical Methods: Special Topics
  • Q0 - General