
Big Data: At Scale Methods and Applications

Paper Session

Friday, Jan. 6, 2023 8:00 AM - 10:00 AM (CST)

Hilton Riverside, Fulton
Hosted By: American Economic Association
  • Chair: Matthew Harding, University of California-Irvine

Homophily and Community Structure at Scale: An Application to a Large Professional Network

Juan Nelson Martinez Dahbura, Sansan, Inc.
Shota Komatsu, Sansan, Inc.
Takanori Nishida, Sansan, Inc.
Angelo Mele, Johns Hopkins University

Abstract

Professional networks affect labor market outcomes, efficiency, and knowledge diffusion. We study a large business card exchange network from Eight, a contact and career management app popular in Japan. Our empirical analysis is guided by a structural model of equilibrium network formation, with observable and unobservable heterogeneity, estimated via a two-step approach that reduces computational challenges. In the first step, we recover the unobservable types; in the second step, we estimate the structural parameters, conditioning on the estimated unobservables. Our results highlight the role of shared contacts and homophily in observables and unobservables in shaping the network of professional contacts.

Online Advertising Auctions for Diverse Customers

Nils Breitmar, Columbia University
Matthew Harding, University of California-Irvine
Carlos Lamarche, University of Kentucky

Abstract

The online advertising industry has become a major economic force: digital advertising spending is projected to surpass $560bn in 2022. In this paper we explore the estimation of online advertising auction models in a real-world setting where the publisher has only limited information on the bidding process. We use data on billions of sequential auctions and introduce new econometric methods to address binned data. We estimate the model using data from a social network for the LGBTQIA+ community with 35 million users worldwide. We discuss revenue and consumer welfare implications in an environment addressing the needs of a diverse customer base in the presence of restrictions on both consumer information and observable auction features.

Born to be (Sub)Prime

Helena Bach, University of Geneva
Pietro Campa, University of Geneva
Giacomo De Giorgi, University of Geneva
Jaromir Nosal, Boston College
Davide Pietrobon, University of Geneva

Abstract

We document the presence of large lifecycle persistence in the U.S. consumer credit market. We construct profiles of the evolution of individuals' credit outcomes for a considerable part of their life using a panel representative of the U.S. population with credit history. Differences in credit scores around the time of entry into the credit market are persistent and predict distinctive life-cycle trajectories in crucial credit outcomes, such as mortgages and revolving credit. The initial inequality persists over almost 20 years.

Delivering Public Services in Data-Scarce Environments: Using Open Data to Solve the Facility-Location Problem at Scale 

Jonathan Hersh, Chapman University
Joshua Anderson, University of Pittsburgh
Luis Inaki Alberro Encinas, World Bank
Tina George, World Bank
Kushal Kumar Reddy, Cornell University

Abstract

In data-poor environments, governments may struggle to deliver public services when they lack knowledge of where the households they intend to serve are located. Examples include vaccine distribution, infrastructure provisioning, school site selection, and the delivery of aid during times of crisis. We present a simple and scalable method of selecting sites to optimize the delivery of public services in environments where governments do not have reliable information about the location of households or distribution centers. To identify the location of households, we use publicly available data on households derived from satellite imagery with global coverage. To identify the locations of possible distribution centers, we show how candidate distribution locations may be derived from OpenStreetMap, an open-source mapping website. Knowing the location of households and possible distribution sites, the problem becomes one of choosing the optimal distribution sites from the full set of candidate sites, a problem known as the Facility-Location Problem. As a test case of this method, we present the problem of finding optimal registration sites to distribute foundational IDs in Western Africa. Finally, we discuss the welfare implications of adapting this method across a variety of countries and contexts.
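
The site-selection step described in the abstract can be illustrated with a small sketch. The abstract does not say which solver or objective the authors use, so everything here is an assumption made for illustration: the function name `greedy_facility_location`, the choice of a greedy heuristic, the objective (total straight-line distance from each household to its nearest chosen site), and the toy planar coordinates. Real applications would likely use geographic or travel-time distances and may use an exact solver instead.

```python
from math import hypot


def greedy_facility_location(households, candidates, k):
    """Greedy heuristic (an assumption, not the paper's method): pick up to k
    candidate sites, one at a time, each time adding the site that most reduces
    the total distance from every household to its nearest chosen site."""
    chosen = []
    # best[i] is the distance from household i to its nearest chosen site so far.
    best = [float("inf")] * len(households)
    for _ in range(min(k, len(candidates))):
        best_site, best_cost, best_dists = None, float("inf"), None
        for j, (cx, cy) in enumerate(candidates):
            if j in chosen:
                continue
            # Distances if candidate j were added to the chosen set.
            dists = [
                min(b, hypot(hx - cx, hy - cy))
                for b, (hx, hy) in zip(best, households)
            ]
            cost = sum(dists)
            if cost < best_cost:
                best_site, best_cost, best_dists = j, cost, dists
        chosen.append(best_site)
        best = best_dists
    return chosen, sum(best)


# Toy data (hypothetical): two clusters of households and three candidate sites.
households = [(0, 0), (0, 1), (10, 10), (10, 11)]
candidates = [(0, 0), (10, 10), (50, 50)]
sites, total_distance = greedy_facility_location(households, candidates, k=2)
print(sites, total_distance)
```

The greedy rule is only a heuristic: it is fast and simple, but it is not guaranteed to find the best set of sites, which is one reason an exact integer-programming formulation may be preferred when the candidate set is small enough.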
JEL Classifications
  • G3 - Corporate Finance and Governance