# Machine Learning in Econometrics

Paper Session

#### Friday, Jan. 6, 2017 10:15 AM – 12:15 PM

Hyatt Regency Chicago, Crystal B

- Chair: Victor Chernozhukov, Massachusetts Institute of Technology

### Testing-Based Forward Model Selection

#### Abstract

This work introduces a theoretical foundation for a procedure called "testing-based forward model selection" in regression problems. Forward selection is a general term for model selection procedures that inductively add covariates with predictive power to a working statistical model. This paper considers the use of testing procedures, derived from traditional statistical hypothesis testing, as a criterion for deciding which variable to include next and when to stop including variables. Probabilistic bounds for the prediction error and the number of selected covariates are proved for the proposed procedure. The general result is illustrated by an example with heteroskedastic data in which Huber-Eicker-White standard errors are used to construct the tests. The performance of testing-based forward selection is compared to that of Lasso and Post-Lasso in simulation studies. Finally, testing-based forward selection is illustrated with an application to estimating the effects of institution quality on aggregate economic output.

### L2-Boosting for Economic Applications

#### Abstract

Boosting is one of the most significant developments in machine learning. This paper studies the statistical properties of L2-Boosting, a variant tailored for regression, in a high-dimensional setting. Moreover, we introduce so-called "post-Boosting", a post-selection estimator that applies ordinary least squares to the variables selected in the first stage by L2-Boosting. Another variant is orthogonal boosting, in which an orthogonal projection is conducted after each step. We analyse these variants and apply them to economic applications such as IV estimation and treatment effect estimation in a high-dimensional setting. We derive results for inference in these settings, highlight the performance of Boosting in real applications, and compare the results to Lasso.

### Core Determining Class: Construction, Approximation and Inference

#### Abstract

The relations between unobserved events and observed outcomes in partially identified models can be characterized by a bipartite graph. We estimate the probability measure on the events given observations of the outcomes based on the graph. The feasible set of the probability measure on the events is defined by a set of linear inequality constraints, and the number of inequalities is often much larger than the number of observations. The set of irredundant inequalities is known as the Core Determining Class. We propose an algorithm that exploits the structure of the graph to construct the exact Core Determining Class when data noise is not taken into consideration. We prove that if the graph and the measure on the observed outcomes are non-degenerate, the Core Determining Class does not depend on the probability measure of the outcomes but only on the structure of the graph. For the more general problem of selecting linear inequalities under noise, we investigate sparsity assumptions on the full set of inequalities, i.e., that only a few inequalities are truly binding. We show that these sparsity assumptions are equivalent to certain sparsity conditions on the dual problems. We propose a procedure similar to the Dantzig Selector to select the truly informative constraints, analyze its properties, and show that the feasible set defined by the selected constraints is a nearly sharp estimator of the true feasible set. Under our sparsity assumptions, we prove that such a procedure can significantly reduce the number of inequalities without throwing away too much information. We apply the procedure to the Core Determining Class problem and obtain a stronger theorem by taking advantage of the structure of the bipartite graph.

### Estimating Average Treatment Effects in Settings with Many Covariates: Supplementary Analyses and Remaining Challenges

#### Abstract

There is a large literature in econometrics and statistics on semiparametric estimation of average treatment effects under the assumption of unconfounded treatment assignment. Recently, this literature has focused on settings with many covariates, where regularization of some kind is required. In this article we discuss some of the lessons from the earlier literature and their relevance for the many-covariate setting.

##### Discussant(s)

Panagiotis Toulis, University of Chicago

Adam M. Rosen, University College London

Hai Wang, Singapore Management University

##### JEL Classifications

- C1 - Econometric and Statistical Methods and Methodology: General
- C3 - Multiple or Simultaneous Equation Models; Multiple Variables