Machine Learning, Prediction Errors, and Causal Inference
Paper Session
Sunday, Jan. 4, 2026 2:30 PM - 4:30 PM (EST)
- Chair: Matthew Gordon, Paris School of Economics
Program Evaluation with Remotely Sensed Outcomes
Abstract
While traditional program evaluations typically rely on surveys to measure outcomes, certain economic outcomes, such as living standards or environmental quality, may be infeasible or costly to measure this way. As a result, recent empirical work estimates treatment effects using remotely sensed variables (RSVs), such as mobile phone activity or satellite images, instead of ground-truth outcome measurements. Common practice predicts the economic outcome from the RSV using an auxiliary sample of labeled RSVs, then uses these predictions as the outcome in the experiment. We prove that this approach leads to biased estimates of treatment effects when the RSV is a post-outcome variable. We nonparametrically identify the treatment effect under an assumption that reflects the logic of recent empirical research: the conditional distribution of the RSV, given the outcome and treatment, remains stable across both samples. Our results do not require researchers to know or consistently estimate the relationship between the RSV, outcome, and treatment, which is typically mis-specified with unstructured data. We form a representation of the RSV for downstream causal inference by predicting the outcome and predicting the treatment, with better predictions leading to more precise causal estimates. We re-evaluate the efficacy of a large-scale public program in India, showing that the program's measured effects on local consumption and poverty can be replicated using satellite imagery.

Inference for Regression with Variables Generated by AI or Machine Learning
Abstract
It has become common practice for researchers to use AI-powered information retrieval algorithms or other machine learning methods to estimate variables of economic interest, then use these estimates as covariates in a regression model. We show both theoretically and empirically that naively treating AI- and ML-generated variables as "data" leads to biased estimates and invalid inference. We propose two methods to correct bias and perform valid inference: (i) an explicit bias correction with bias-corrected confidence intervals, and (ii) joint maximum likelihood estimation of the regression model and the variables of interest. Through several applications, we demonstrate that the common approach generates substantial bias, while both corrections perform well.

Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling
Abstract
Machine learning models are increasingly used to produce predictions that serve as input data in subsequent statistical analyses. For example, computer vision predictions of economic and environmental indicators based on satellite imagery are used in downstream regressions; similarly, language models are widely used to approximate human ratings and opinions in social science research. However, failure to properly account for errors in the machine learning predictions renders standard statistical procedures invalid. Prior work uses what we call the Predict-Then-Debias estimator to give valid confidence intervals when machine learning algorithms impute missing variables, assuming a small complete sample from the population of interest. We expand the scope by introducing bootstrap confidence intervals that apply when the complete data are a nonuniform (i.e., weighted, stratified, or clustered) sample and when an arbitrary subset of features is imputed. Importantly, the method can be applied in many settings without requiring additional calculations. We prove that these confidence intervals are valid under no assumptions on the quality of the machine learning model and are no wider than the intervals obtained by methods that do not use machine learning predictions.

Discussant(s)
- Sylvia Klosin, University of California-Davis
- Paul Goldsmith-Pinkham, Yale University
- Simon Ramirez Amaya, University of California-Berkeley
- Ed Rubin, University of Oregon
JEL Classifications
- C4 - Econometric and Statistical Methods: Special Topics
- Q0 - General