
Machine Learning Methods for Heterogeneous Treatment Effects

Paper Session

Friday, Jan. 7, 2022 12:15 PM - 2:15 PM (EST)

Hosted By: Econometric Society
  • Chair: Rocío Titiunik, Princeton University

Causal Inference in Possibly Nonlinear Factor Models

Yingjie Feng, Tsinghua University

Abstract

This paper develops a general causal inference method for treatment effects models with noisily measured confounders. The key feature is that a large set of noisy measurements are linked with the underlying latent confounders through an unknown, possibly nonlinear factor structure. The main building block is a local principal subspace approximation procedure that combines K-nearest neighbors matching and principal component analysis. Estimators of many causal parameters, including average treatment effects and counterfactual distributions, are constructed based on doubly-robust score functions. Large-sample properties of these estimators are established, which only require relatively mild conditions on the principal subspace approximation. The results are illustrated with an empirical application studying the effect of political connections on stock returns of financial firms, and a Monte Carlo experiment. The main technical and methodological results regarding the general local principal subspace approximation method may be of independent interest.
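The doubly-robust score underlying these estimators can be illustrated numerically. Below is a minimal sketch of the augmented inverse-probability-weighting (AIPW) form of a doubly-robust ATE estimator on simulated data with observed (noiseless) confounders; it is not the paper's local principal subspace procedure, and the DGP, basis, and nuisance estimators are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)                      # confounder (observed here for simplicity)
e_true = 1 / (1 + np.exp(-x))               # true propensity score
d = rng.binomial(1, e_true)                 # treatment
y = 2.0 * d + x + rng.normal(size=n)        # outcome; true ATE = 2

# Nuisance 1: propensity e(x) by logistic regression (Newton's method)
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    w = p * (1 - p)
    beta += np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (d - p))
e_hat = 1 / (1 + np.exp(-X @ beta))

# Nuisance 2: outcome regressions m1(x), m0(x) by OLS on each treatment arm
b1, *_ = np.linalg.lstsq(X[d == 1], y[d == 1], rcond=None)
b0, *_ = np.linalg.lstsq(X[d == 0], y[d == 0], rcond=None)
m1, m0 = X @ b1, X @ b0

# Doubly-robust (AIPW) score: consistent if either nuisance model is correct
psi = m1 - m0 + d * (y - m1) / e_hat - (1 - d) * (y - m0) / (1 - e_hat)
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(n)
```

The sample mean of the score estimates the ATE, and its sample standard deviation gives a plug-in standard error; the double robustness comes from the correction terms canceling first-order errors in either nuisance fit.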

De-Biased Machine Learning of Global and Local Parameters Using Regularized Riesz Representers

Victor Chernozhukov, Massachusetts Institute of Technology
Whitney Newey, Massachusetts Institute of Technology
Rahul Singh, Massachusetts Institute of Technology

Abstract

We provide novel adaptive inference methods, based on L1 regularization, for regular (semi-parametric) and non-regular (nonparametric) linear functionals of the conditional expectation function. Examples of regular functionals include average treatment effects, policy effects from covariate distribution shifts and stochastic transformations, and average derivatives. Examples of non-regular functionals include local linear functionals, defined as local averages that approximate perfectly localized quantities: average treatment effects, average policy effects, and average derivatives, conditional on a covariate subvector fixed at a point. Our construction relies on building Neyman orthogonal equations for the target parameter that are approximately invariant to small perturbations of the nuisance parameters. To achieve this property, we include the linear Riesz representer for the functionals in the equations as an additional nuisance parameter.

We use L1-regularized methods to learn approximations to the linear representer and the regression function in settings where the dimension p of a (possibly overcomplete) dictionary of basis functions is much larger than the sample size n. We then estimate the linear functional by the solution to the empirical analog of the orthogonal equations. Our key result is that, under weak assumptions, the estimator of the functional concentrates in a root-n neighborhood of the target with deviations controlled by the Gaussian law. The key regularity condition requires that the operator norm L of the functional be small compared to root-n; that is, the functional must not lose its regularity too fast. L diverges for local functionals, and even for global functionals under weak identification. Further conditions are needed to control bias when perfectly localized quantities are the target. For L1 regularization methods, our construction and analysis yield a weak "double sparsity robustness": either the approximation to the regression function or the approximation to the representer can be "completely dense" as long as the other is sufficiently "sparse". Our main results are non-asymptotic and imply asymptotic uniform validity over large classes of models, translating into honest confidence bands for both global and local parameters. As far as we know, this is the first non-asymptotic Gaussian approximation result for de-biased machine learning.
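The Riesz representer can be learned directly, without modeling the propensity score, by minimizing the empirical Riesz loss over a dictionary. The sketch below uses the ATE functional m(W; g) = g(1, X) - g(0, X) and solves the minimization by least squares for illustration; the paper's estimators use L1-regularized versions of this problem, and the DGP and four-term dictionary are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=n)
d = rng.binomial(1, 1 / (1 + np.exp(-x)))
y = 2.0 * d + x + rng.normal(size=n)        # true ATE = 2

def b(d, x):
    # small dictionary of basis functions (illustrative; the paper allows p >> n)
    return np.column_stack([np.ones_like(x), d, x, d * x])

B = b(d, x)
# Riesz representer: minimize E[a^2] - 2 E[a(1,X) - a(0,X)] over a = B @ rho.
# First-order condition: (B'B/n) rho = mean of b(1,X) - b(0,X).
target = (b(np.ones(n), x) - b(np.zeros(n), x)).mean(axis=0)
rho = np.linalg.solve(B.T @ B / n, target)
a_hat = B @ rho

# Regression function by OLS on the same dictionary
g_coef, *_ = np.linalg.lstsq(B, y, rcond=None)
g1 = b(np.ones(n), x) @ g_coef
g0 = b(np.zeros(n), x) @ g_coef

# Debiased estimate from the Neyman-orthogonal moment
theta = (g1 - g0 + a_hat * (y - B @ g_coef)).mean()
```

The representer term a_hat * (residual) corrects the plug-in estimate g1 - g0 for first-order regularization bias in the regression fit, which is what makes the moment orthogonal.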

Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments

Kyle Colangelo, University of California-Irvine
Ying-Ying Lee, University of California-Irvine

Abstract

We propose a nonparametric inference method for causal effects of continuous treatment variables, under unconfoundedness and in the presence of high-dimensional or nonparametric nuisance parameters. Our double debiased machine learning (DML) estimators for the average dose-response function (or the average structural function) and the partial effects are asymptotically normal with nonparametric convergence rates. The nuisance estimators for the conditional expectation function and the conditional density can be nonparametric or ML methods. Utilizing a kernel-based doubly robust moment function and cross-fitting, we give high-level conditions under which the nuisance estimators do not affect the first-order large sample distribution of the DML estimators. We further provide sufficient low-level conditions for kernel and series estimators, as well as modern ML methods such as generalized random forests and deep neural networks. We justify the use of a kernel to localize the continuous treatment at a given value via the Gateaux derivative. We implement various ML methods in Monte Carlo simulations and an empirical application evaluating a job training program.
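The kernel-localized doubly robust moment can be sketched numerically. In the simulation below the conditional treatment density is known by construction and the outcome regression is a correctly specified OLS fit; both are stand-ins for the ML nuisance estimators and cross-fitting the paper actually uses, and the DGP and bandwidth are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50000
x = rng.normal(size=n)
t = 0.5 * x + rng.normal(size=n)        # continuous treatment; T | X ~ N(0.5x, 1)
y = t**2 + x + rng.normal(size=n)       # true dose-response: beta(t) = t^2

# Nuisance 1: gamma(t, x) = E[Y | T = t, X = x] by OLS on a simple (correct) basis
def basis(t, x):
    return np.column_stack([np.ones_like(t), t, t**2, x])
coef, *_ = np.linalg.lstsq(basis(t, x), y, rcond=None)
gamma = lambda tt, xx: basis(tt, xx) @ coef

# Nuisance 2: conditional density f(T | X) -- known here by construction;
# in practice it is estimated nonparametrically or by ML
f_tx = np.exp(-0.5 * (t - 0.5 * x)**2) / np.sqrt(2 * np.pi)

def beta_hat(t0, h=0.3):
    # kernel-localized doubly robust moment for the dose-response at t0
    k = np.exp(-0.5 * ((t - t0) / h)**2) / (h * np.sqrt(2 * np.pi))
    return np.mean(gamma(np.full(n, t0), x) + k / f_tx * (y - gamma(t, x)))
```

The Gaussian kernel K_h(T - t0) replaces the indicator 1{T = t0} that would appear with a discrete treatment; the ratio K_h / f(T | X) plays the role of the inverse propensity weight.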

Unconditional Quantile Regression with High-Dimensional Data

Yuya Sasaki, Vanderbilt University
Takuya Ura, University of California-Davis
Yichong Zhang, Singapore Management University

Abstract

This paper considers estimation and inference for heterogeneous counterfactual effects with high-dimensional data. We propose a novel robust score for debiased estimation of the unconditional quantile regression (Firpo, Fortin, and Lemieux, 2009) as a measure of heterogeneous counterfactual marginal effects. We propose a multiplier bootstrap inference procedure and develop asymptotic theory to guarantee size control in large samples. Simulation studies support our theory. Applying the proposed method to Job Corps survey data, we find that extending the duration of exposure to the Job Corps training program would be effective, especially for the targeted subpopulations of lower potential wage earners.
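The building block of unconditional quantile regression is the recentered influence function (RIF) of a quantile, which is then regressed on covariates. The sketch below shows the low-dimensional Firpo-Fortin-Lemieux construction on simulated data; it omits the high-dimensional debiasing and multiplier bootstrap that are this paper's contribution, and the DGP and bandwidth rule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50000
x = rng.normal(size=n)
y = x + rng.normal(size=n)              # outcome; the median of Y is 0

tau = 0.5
q = np.quantile(y, tau)
# density of Y at the quantile via a Gaussian kernel density estimate
h = 1.06 * y.std() * n**(-1 / 5)        # rule-of-thumb bandwidth
f_q = np.mean(np.exp(-0.5 * ((y - q) / h)**2)) / (h * np.sqrt(2 * np.pi))

# recentered influence function of the tau-quantile:
# RIF(y) = q + (tau - 1{y <= q}) / f_Y(q)
rif = q + (tau - (y <= q)) / f_q

# unconditional quantile regression: OLS of the RIF on covariates
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, rif, rcond=None)
# b[1] estimates the unconditional quantile partial effect of x
```

In this location-shift DGP, shifting the distribution of x by a small amount shifts the median of y one-for-one, so the RIF-regression slope should be close to 1.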
JEL Classifications
  • C1 - Econometric and Statistical Methods and Methodology: General