The Future of Economic Research Under Rising Risks and Costs of Information Disclosure

Paper Session

Saturday, Jan. 5, 2019 2:30 PM - 4:30 PM

Atlanta Marriott Marquis, A706
Hosted By: American Economic Association & Committee on Economic Statistics
  • Chair: Ben Casselman, New York Times

Why the Economics Profession Cannot Cede the Discussion of Privacy Protection to Computer Scientists

John Abowd, U.S. Census Bureau

Abstract

Economists rely heavily on designed data and administrative records from government agencies to conduct critical analytical research. These studies are often done under the supervision of a statistical agency exercising its dual mandate to disseminate information and to protect the privacy and confidentiality of respondent data. We have long recognized the tension between these mandates. Cryptographers established in the early 2000s that there is a hard limit on the amount of fully accurate information that can be published from any finite confidential database (Dinur and Nissim 2003): a budget constraint stated in terms of confidential information leakage. New methods of confidentiality protection, known in computer science as formal privacy, quickly followed. Although the database reconstruction theorem was well understood in cryptography, its implications for the work of statistical agencies were largely unexplored before the U.S. Census Bureau announced its research program (Census Scientific Advisory Committee Meeting, September 2016) and its decision to implement differential privacy, the leading variant of formal privacy models, for the 2020 Census of Population (Census Scientific Advisory Committee Meeting, September 2017). The Commission on Evidence-Based Policymaking likewise explicitly recommended in its report (September 2017) that statistical agencies embrace privacy-enhancing data analysis methods. Because these methods enforce an explicit production-function relationship between privacy protection and statistical accuracy, they must be implemented in a manner that is fully cognizant of the analyses being performed with the released information. Moreover, an explicit choice, outside the domain of computer science but integral to economics, must be made: what is the optimal accuracy-privacy point for a given problem? This social choice is constrained by the formal privacy technology introduced by cryptographers; the preference mapping, on the other hand, must be expressed based on the uses of the published information and the attendant confidentiality risk. Social scientists have behaved as if they could always have maximum accuracy in every published statistic. We must now redesign many of our analysis protocols to accommodate the constraints of provably effective privacy protection. And we are not the only ones: Google, Apple, Microsoft, and many other information technology giants face the same constraints.
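As a concrete illustration of the production-function relationship the abstract describes, the sketch below applies the Laplace mechanism, the canonical differentially private primitive, to a toy counting query. The counts, epsilon values, and error summary are illustrative assumptions, not the Census Bureau's implementation.

```python
import numpy as np

def laplace_count(true_count, epsilon, rng):
    """Release a count under epsilon-differential privacy via the Laplace mechanism.

    A counting query has sensitivity 1 (adding or removing one person changes the
    count by at most 1), so noise drawn from Laplace(0, 1/epsilon) suffices.
    """
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

rng = np.random.default_rng(0)
true_count = 1_000
for epsilon in (0.1, 1.0, 10.0):
    # The expected absolute error of Laplace noise equals its scale, 1/epsilon:
    # stronger privacy (small epsilon) buys less accuracy, and vice versa.
    releases = [laplace_count(true_count, epsilon, rng) for _ in range(10_000)]
    mean_abs_err = np.mean([abs(r - true_count) for r in releases])
    print(f"epsilon={epsilon:5}: mean |error| ~ {mean_abs_err:.2f}")
```

Tightening the privacy-loss budget by a factor of ten raises the expected error by the same factor; choosing where to sit on that frontier is exactly the social choice the abstract assigns to economists rather than computer scientists.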

Differential Privacy and Census Data: Implications for Social and Economic Research

Steven Ruggles, University of Minnesota

Abstract

The Census Bureau has announced a new set of standards and methods for disclosure control in public use data products. The new approach, known as differential privacy, represents a radical departure from current practice. In its pure form, differential privacy may make the release of useful microdata impossible and severely limit the utility of tabular small-area data. Adoption of differential privacy will have far-reaching consequences for research. It is possible, even likely, that scientists, planners, and the public will lose the free access to reliable public Census Bureau data describing American social and economic change that we have enjoyed for six decades. We believe that the differential privacy approach is inconsistent with the statutory obligations, history, and core mission of the Census Bureau.

Reconciling Access and Privacy: Building a Sustainable Model for the Future

Katharine G. Abraham, University of Maryland

Abstract

In recent decades, social science research has benefited greatly from access to survey and Census data collected and disseminated by the federal statistical agencies, and the research made possible by this access has generated invaluable insights. An important factor in the government's ability to collect data from individuals and businesses is its promise to data subjects that their information will be kept private, a promise the statistical agencies are vigilant in honoring. Given the explosion of outside data about individuals that is increasingly available in electronic form, however, the risk that data products released by the federal government could compromise the privacy of data subjects has grown. It seems to me unavoidable that, in order to honor the promises of privacy made to data subjects, current modes of disseminating information based on survey and Census data will need to be rethought. The challenge will be to do this in a way that also enables effective use of the data. Tiered access is likely to be a central element of any new model: the information needs of many data users can be satisfied through publicly disseminated tabulations, while the capacity to provide behind-the-firewall access to other data users is expanded. The same mix of approaches can also be used to increase access to administrative records for research purposes. While the broad outlines of a new system seem clear, many important practical questions about its implementation remain to be addressed.

Privacy-Protection for Economics Research in Small Cells

Raj Chetty, Stanford University
John Friedman, Brown University
Nathaniel Hendren, Harvard University

Abstract

Public release of disaggregated statistics holds tremendous value for research and policy but presents challenges for privacy protection. This paper presents a new approach to protecting privacy when publishing statistics from data in small cells. The noise-infusion algorithm draws on ideas from the privacy literature and is straightforward to apply in almost any setting, even for complex statistical models. After describing the details of our approach, we discuss its benefits relative to common privacy-protection procedures, such as count-based suppression.
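The abstract does not spell out the algorithm, but a minimal sketch can contrast noise infusion with count-based suppression in general terms. Everything below (the cell data, the suppression threshold, and the 1/sqrt(n) noise calibration) is a hypothetical illustration of the broad idea, not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical small-cell statistics: cell name -> (mean outcome, cell count).
cells = {"tract_A": (52_300.0, 4), "tract_B": (47_800.0, 35), "tract_C": (61_100.0, 120)}

SUPPRESSION_THRESHOLD = 10  # illustrative cutoff, not from the paper

def count_suppression(stat, n):
    """Traditional approach: withhold any cell whose count falls below the threshold."""
    return stat if n >= SUPPRESSION_THRESHOLD else None

def noise_infusion(stat, n, noise_scale=5_000.0):
    """Publish every cell, adding noise whose spread shrinks with cell size:
    small cells get more protection, large cells stay accurate.
    The noise_scale and 1/sqrt(n) calibration are assumptions for illustration."""
    return stat + rng.normal(0.0, noise_scale / np.sqrt(n))

for cell, (stat, n) in cells.items():
    print(f"{cell}: suppressed={count_suppression(stat, n)}, "
          f"noise-infused={noise_infusion(stat, n):,.0f}")
```

The trade-off the abstract highlights is visible here: suppression discards the smallest cells outright, while noise infusion publishes every cell at the cost of added noise, preserving coverage of exactly the disaggregated statistics the paper is concerned with.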
JEL Classifications
  • C4 - Econometric and Statistical Methods: Special Topics