Econometrics of Randomized Experiments

Paper Session

Sunday, Jan. 8, 2017 3:15 PM – 5:15 PM

Hyatt Regency Chicago, Water Tower
Hosted By: Econometric Society
  • Chair: Alexander Torgovitsky, Northwestern University

Extrapolation of Instrumental Variables Estimators Towards Externally Valid Parameters

Alexander Torgovitsky, Northwestern University
Magne Mogstad, University of Chicago
Andres Santos, University of California-San Diego

Abstract

We reconsider the empirical content of instrumental variables (IV) estimators under the common instrument monotonicity assumption that there is a weakly separable selection equation. It is well-known that if there is unobserved heterogeneity in the effect of the treatment on the outcome of interest, then the IV estimand can be interpreted as a weighted average of causal effects. It is also well-known that these weights depend on the specific instrument being used and that, as a result, the IV estimand may not be equal to an externally valid parameter that answers an analyst’s counterfactual question of interest. We show that, despite this, the IV estimand still contains substantial empirical content for externally valid parameters. Our argument uses the insight of Heckman and Vytlacil (2005) that both externally valid parameters and the IV estimand can be expressed as weighted averages of the same underlying marginal treatment effects (MTEs). Since both sets of weights are known or identified, knowledge of the IV estimand generally places some restrictions on the MTEs, and hence on the logically permissible values of externally valid parameters. We provide a simple computational method for deriving these implied bounds.
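The bounding logic in the abstract can be sketched numerically. The following is an illustrative toy example, not the authors' code: the MTE function is discretized on a grid, the observed IV estimand pins down one known weighted average of the MTE values, and linear programming then traces out the smallest and largest values of a differently weighted target parameter. All weights, bounds, and the value of the IV estimand below are made up for illustration.

```python
# Illustrative sketch of bounding a target parameter given an IV estimand,
# when both are weighted averages of the same marginal treatment effects
# (MTEs). Weights and numbers are hypothetical, not from the paper.
import numpy as np
from scipy.optimize import linprog

K = 100                          # grid points for u in (0, 1)
u = (np.arange(K) + 0.5) / K

w_iv = np.ones(K) / K            # weights defining the IV estimand (assumed)
w_target = 2 * (1 - u) / K       # weights defining the target (assumed)
beta_iv = 0.3                    # the observed IV estimand (assumed)

# A priori bounds on the MTE, e.g. bounded outcomes imply MTE in [-1, 1].
bnds = [(-1.0, 1.0)] * K

# The IV estimand restricts one linear functional of the MTE vector m:
#   sum_k w_iv[k] * m[k] = beta_iv
A_eq, b_eq = w_iv.reshape(1, -1), np.array([beta_iv])

# Minimize and maximize the target functional subject to that restriction.
lo = linprog(w_target, A_eq=A_eq, b_eq=b_eq, bounds=bnds).fun
hi = -linprog(-w_target, A_eq=A_eq, b_eq=b_eq, bounds=bnds).fun
print(f"bounds on target parameter: [{lo:.3f}, {hi:.3f}]")
```

The interval [lo, hi] is the set of target-parameter values consistent with the IV estimand; with richer data, additional equality rows in `A_eq` (one per identified IV-type estimand) would tighten it further.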

Inference With Covariate-Adaptive Randomization

Azeem M. Shaikh, University of Chicago
Federico Andres Bugni, Duke University
Ivan Canay, Northwestern University

Abstract

This paper studies inference for the average treatment effect in randomized controlled trials with covariate-adaptive randomization. Here, by covariate-adaptive randomization, we mean randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve "balance" within each stratum. Such schemes include, for example, Efron's biased-coin design and stratified block randomization. When testing the null hypothesis that the average treatment effect equals a pre-specified value in such settings, we first show that the usual two-sample $t$-test is conservative in the sense that it has limiting rejection probability under the null hypothesis no greater than and typically strictly less than the nominal level. In a simulation study, we find that the rejection probability may in fact be dramatically less than the nominal level. We show further that these same conclusions remain true for a naive permutation test, but that a modified version of the permutation test yields a test that is non-conservative in the sense that its limiting rejection probability under the null hypothesis equals the nominal level for a wide variety of randomization schemes. The modified version of the permutation test has the additional advantage that it has rejection probability exactly equal to the nominal level for some distributions satisfying the null hypothesis and some randomization schemes. Finally, we show that the usual $t$-test (on the coefficient on treatment assignment) in a linear regression of outcomes on treatment assignment and indicators for each of the strata yields a non-conservative test as well under even weaker assumptions on the randomization scheme. In a simulation study, we find that the non-conservative tests have substantially greater power than the usual two-sample $t$-test.
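A modified permutation test of the kind the abstract describes can be illustrated by permuting treatment labels within each stratum, so that the permutation distribution respects the covariate-adaptive design. The simulation design, function name, and all numbers below are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch: within-stratum permutation test for a stratified design.
# Data-generating process and parameters are made up for illustration.
import numpy as np

def stratified_perm_test(y, d, strata, n_perm=999, seed=0):
    """Permute treatment labels within each stratum and compare the
    difference-in-means statistic to its permutation distribution."""
    rng = np.random.default_rng(seed)
    stat = y[d == 1].mean() - y[d == 0].mean()
    perm_stats = np.empty(n_perm)
    for b in range(n_perm):
        d_perm = d.copy()
        for s in np.unique(strata):
            idx = np.where(strata == s)[0]
            d_perm[idx] = rng.permutation(d_perm[idx])
        perm_stats[b] = y[d_perm == 1].mean() - y[d_perm == 0].mean()
    # two-sided permutation p-value
    return (1 + np.sum(np.abs(perm_stats) >= abs(stat))) / (1 + n_perm)

# Toy data: two strata, balanced assignment within each (block randomization)
rng = np.random.default_rng(0)
n = 200
strata = np.repeat([0, 1], n // 2)
d = np.concatenate([rng.permutation(np.repeat([0, 1], n // 4))
                    for _ in range(2)])
y = 0.5 * strata + rng.normal(size=n)   # null holds: zero treatment effect
p = stratified_perm_test(y, d, strata)
print(f"permutation p-value: {p:.3f}")
```

Permuting only within strata is what keeps the test exact under some randomization schemes; a naive permutation across all units would ignore the stratification and, per the abstract, behave conservatively.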

Optimal Data Collection for Randomized Control Trials

Sokbae Lee, Institute for Fiscal Studies
Pedro Carneiro, University College London
Daniel Wilhelm, University College London

Abstract

In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data such as a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on a modification of an orthogonal greedy algorithm that is conceptually simple and easy to implement in the presence of a large number of potential covariates, and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, measured either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator.
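The orthogonal greedy idea can be sketched as follows. This is a generic orthogonal-matching-pursuit-style selector under a budget constraint, written as an illustration of the concept; the covariate costs, data, and function name are hypothetical and the paper's actual procedure (which also chooses the sample size) is richer.

```python
# Illustrative orthogonal-greedy selection of covariates under a budget.
# All costs and data are made up; this is not the authors' algorithm.
import numpy as np

def greedy_select(X, y, costs, budget):
    """Repeatedly add the affordable covariate most correlated with the
    current residual, refit by least squares, and stop when no remaining
    covariate fits within the budget. No tuning parameters are needed."""
    n, p = X.shape
    selected, spent = [], 0.0
    resid = y - y.mean()
    while True:
        candidates = [j for j in range(p)
                      if j not in selected and spent + costs[j] <= budget]
        if not candidates:
            break
        scores = [abs(X[:, j] @ resid) / np.linalg.norm(X[:, j])
                  for j in candidates]
        j_star = candidates[int(np.argmax(scores))]
        selected.append(j_star)
        spent += costs[j_star]
        # Orthogonal step: refit on all selected covariates, then recompute
        # the residual before scoring the next candidate.
        Xs = np.column_stack([np.ones(n), X[:, selected]])
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta
    return selected, spent

# Toy usage: the outcome loads on covariates 0 and 3; unit costs, budget 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
y = X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=500)
costs = np.ones(10)
selected, spent = greedy_select(X, y, costs, budget=2.0)
print(selected, spent)
```

The refitting step is what makes the algorithm "orthogonal": each new candidate is scored against the residual from the full current fit, so redundant covariates that merely echo an already-selected one score poorly.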

Combining RCTs and Selection Models for External Validity

Edward J. Vytlacil, New York University

Abstract

Randomized Control Trials (RCTs) and observational data are typically viewed as substitutes for the evaluation of treatment effects. The estimates from RCTs are commonly seen as the "gold standard" for evaluating treatment effects, to be relied upon exclusively when available, while evidence from observational data is commonly seen as a poor substitute for experimental evidence, to be used, if at all, only when evidence from an RCT is unavailable. With few exceptions, the literature on combining evidence from RCTs and observational studies when both are available has done so not to better evaluate treatment effects, but rather as a way to evaluate the validity of the nonexperimental approaches applied to the observational data.
In contrast, in this paper we develop a methodology to combine evidence from an RCT with results from observational studies to leverage strengths from both approaches. In particular, this study considers the nonparametric selection model/Local Instrumental Variables approach of Heckman and Vytlacil (2005) applied to observational data, combined with analysis from an RCT. We demonstrate that, by combining the two approaches on the two types of data, one can obtain a deeper understanding of the connection between selection and treatment effects than would be possible with either approach in isolation. In addition, combining the two approaches yields greater external validity than would be possible with the RCT alone and more robust analysis than would be possible with the selection model alone, and it solves the problem of identification-at-infinity within selection models.
JEL Classifications
  • C0 - General