Machine Learning for Policy Research

Paper Session

Friday, Jan. 5, 2018 2:30 PM - 4:30 PM

Pennsylvania Convention Center, 204-C
Hosted By: American Economic Association
  • Chair: Susan Athey, Stanford University

Choosing Among Machine Learning Estimators In Empirical Economics

Alberto Abadie
Massachusetts Institute of Technology
Maximilian Kasy
Harvard University


Empirical economists often confront problems that require estimation of many parameters. Examples include estimation of teacher, location, or worker and firm fixed effects, causal effects for many subgroups, and forecasting with many predictors. In such settings, machine learning methods based on shrinkage, such as ridge or lasso, may considerably increase the precision of the estimation. Which shrinkage estimator is best in practice depends on the distribution of the parameters being estimated in each particular instance. We demonstrate how different estimators are preferable depending on the application, and discuss a general-purpose method for constructing shrinkage estimators geared toward specific empirical applications.
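
The dependence on the parameter distribution can be seen in a toy many-means simulation. The sketch below is illustrative only and is not the authors' procedure: it assumes each parameter is observed once with standard normal noise, uses scalar multiplication as a stand-in for ridge and soft thresholding as a stand-in for lasso, and tunes each estimator with an oracle grid search against the true parameters (which would not be available in practice). The function names, grid ranges, and the two parameter distributions are all hypothetical choices.

```python
import numpy as np

def ridge_type(x, c):
    """Linear shrinkage toward zero: scale every noisy estimate by c in [0, 1]."""
    return c * x

def lasso_type(x, lam):
    """Soft thresholding: move every estimate toward zero by lam, stopping at zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def best_mse(estimator, x, mu, grid):
    """Smallest mean squared error over a grid of tuning values (oracle tuning)."""
    return min(np.mean((estimator(x, t) - mu) ** 2) for t in grid)

def compare(mu, rng):
    """Return (ridge-type MSE, lasso-type MSE) for one draw of noisy estimates."""
    x = mu + rng.standard_normal(mu.size)  # one unit-variance estimate per parameter
    ridge = best_mse(ridge_type, x, mu, np.linspace(0.0, 1.0, 101))
    lasso = best_mse(lasso_type, x, mu, np.linspace(0.0, 5.0, 101))
    return ridge, lasso

rng = np.random.default_rng(0)
n = 5000

# Dense case: every parameter is nonzero and drawn from a normal distribution.
dense_mu = rng.standard_normal(n)
# Sparse case: about 5% of parameters are large, the rest are exactly zero.
sparse_mu = np.where(rng.random(n) < 0.05, 6.0, 0.0)

dense_ridge, dense_lasso = compare(dense_mu, rng)
sparse_ridge, sparse_lasso = compare(sparse_mu, rng)

print(f"dense : ridge-type {dense_ridge:.3f}, lasso-type {dense_lasso:.3f}")
print(f"sparse: ridge-type {sparse_ridge:.3f}, lasso-type {sparse_lasso:.3f}")
```

In this setup the ridge-type rule should do better when the parameters are dense and normally distributed, and the lasso-type rule should do better when most parameters are exactly zero.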

Efficient Policy Learning

Susan Athey
Stanford University
Stefan Wager
Stanford University


We consider the problem of using observational data to learn treatment assignment policies that satisfy certain constraints specified by a practitioner, such as budget, fairness, or functional form constraints. This problem has previously been studied in economics, statistics, and computer science, and several regret-consistent methods have been proposed. However, several key analytical components are missing, including a characterization of optimal methods for policy learning, and sharp bounds for minimax regret. In this paper, we derive lower bounds for the minimax regret of policy learning under constraints, and propose a method that attains this bound asymptotically up to a constant factor. Whenever the class of policies under consideration has a bounded Vapnik-Chervonenkis dimension, we show that the problem of minimax-regret policy learning can be asymptotically reduced to first efficiently evaluating how much each candidate policy improves over a randomized baseline, and then maximizing this value estimate. Our analysis relies on uniform generalizations of classical semiparametric efficiency results for average treatment effect estimation, paired with sharp concentration bounds for weighted empirical risk minimization that may be of independent interest.
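
The two-step reduction described above (score each candidate policy, then maximize the estimated value) can be sketched on simulated data. This is a minimal illustration under stated assumptions, not the authors' estimator: the abstract does not give the scoring formula, so the sketch assumes an augmented inverse-propensity-weighted (doubly robust) score, a known propensity, per-arm linear outcome regressions without cross-fitting, and a hypothetical policy class of threshold rules on a single covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000

# Simulated observational data (all functional forms here are assumptions).
x = rng.uniform(-1.0, 1.0, n)
propensity = 1.0 / (1.0 + np.exp(-x))           # treatment probability, assumed known
w = (rng.random(n) < propensity).astype(float)  # treatment indicator
true_effect = x - 0.2                           # treatment helps only when x > 0.2
y = 0.5 * x + w * true_effect + 0.5 * rng.standard_normal(n)

def fit_linear(x_arm, y_arm):
    """Least-squares line for one treatment arm, returned as a prediction function."""
    slope, intercept = np.polyfit(x_arm, y_arm, 1)
    return lambda z: intercept + slope * z

m1 = fit_linear(x[w == 1], y[w == 1])  # outcome model for treated units
m0 = fit_linear(x[w == 0], y[w == 0])  # outcome model for control units

# Doubly robust score: model-based effect plus propensity-weighted residuals.
scores = (m1(x) - m0(x)
          + w * (y - m1(x)) / propensity
          - (1.0 - w) * (y - m0(x)) / (1.0 - propensity))

def value_gain(threshold):
    """Estimated value of 'treat if x > threshold', measured against a fair
    coin-flip assignment (the estimate is twice that gain)."""
    policy = (x > threshold).astype(float)
    return np.mean((2.0 * policy - 1.0) * scores)

# Step 1: evaluate every candidate policy. Step 2: pick the maximizer.
thresholds = np.linspace(-1.0, 1.0, 201)
gains = np.array([value_gain(t) for t in thresholds])
best_threshold = thresholds[np.argmax(gains)]

print(f"learned threshold: {best_threshold:.2f} (simulated optimum is 0.20)")
```

Constraints such as a budget could be imposed in this sketch by restricting the set of candidate thresholds before maximizing; the threshold class itself is an example of a functional form constraint.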

Discovering Heterogeneous Effects Using Generic Machine Learning Tools

Victor Chernozhukov
Massachusetts Institute of Technology
Esther Duflo
Massachusetts Institute of Technology


We propose inference methods and tools for characterizing heterogeneous treatment effects in randomized experiments and observational studies, which are applicable in conjunction with any high-quality modern prediction method from machine learning. We provide point and interval estimators, where the latter quantify the uncertainty associated with point estimates. We demonstrate the utility of the approach in a variety of empirical examples.

Deep IV: A Flexible Approach for Counterfactual Prediction

Matt Taddy
Microsoft Research


Counterfactual prediction requires understanding causal relationships between so-called treatment and outcome variables. This paper provides a recipe for augmenting deep learning methods to accurately characterize such relationships in the presence of instrument variables (IVs): sources of treatment randomization that are conditionally independent from the outcomes. Our IV specification resolves into two prediction tasks that can be solved with deep neural nets: a first-stage network for treatment prediction and a second-stage network whose loss function involves integration over the conditional treatment distribution. This Deep IV framework allows us to take advantage of off-the-shelf supervised learning techniques to estimate causal effects by adapting the loss function. Experiments show that it outperforms existing machine learning approaches.

Discussant(s)

John N. Friedman
Brown University
Keisuke Hirano
Pennsylvania State University
JEL Classifications
  • C1 - Econometric and Statistical Methods and Methodology: General
  • C8 - Data Collection and Data Estimation Methodology; Computer Programs