Economic Applications of Machine Learning
Friday, Jan. 5, 2018 8:00 AM - 10:00 AM
- Chair: Daniel Björkegren, Brown University
Behavior Revealed in Mobile Phone Usage Predicts Loan Repayment
Abstract: Many households in developing countries lack formal financial histories, making it difficult for banks to allocate capital, and for potential borrowers to obtain loans. However, many unbanked households have mobile phones, and even prepaid phones generate rich data about their behavior. This project shows that behavioral signatures in mobile phone data predict default with accuracy approaching that of credit scoring methods that rely on financial histories. The method is demonstrated using call records matched to loan outcomes for a sample of borrowers in a Caribbean country. Individuals in the highest quartile of risk by our measure are 5.5 times more likely to default than those in the lowest quartile. We obtain this performance despite the fact that our sample is poor and uses phones infrequently. We outline several ways our method could be practically implemented.
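The quartile comparison reported in this abstract can be illustrated with a minimal sketch. It assumes each borrower has a numeric risk score and a 0/1 default flag; the function name `quartile_default_ratio` and the toy data are hypothetical and are not taken from the paper, whose actual scoring model is not described here.

```python
def quartile_default_ratio(scores, defaulted):
    """Default rate in the highest-risk quartile divided by the rate in the lowest.

    scores    -- predicted risk per borrower (higher means riskier); hypothetical input
    defaulted -- 1 if the borrower defaulted, 0 otherwise
    """
    # Rank borrowers from lowest to highest predicted risk.
    ranked = sorted(zip(scores, defaulted), key=lambda pair: pair[0])
    k = len(ranked) // 4  # number of borrowers per quartile
    if k == 0:
        raise ValueError("need at least four borrowers")

    low_rate = sum(d for _, d in ranked[:k]) / k    # lowest-risk quartile
    high_rate = sum(d for _, d in ranked[-k:]) / k  # highest-risk quartile
    if low_rate == 0:
        raise ValueError("no defaults in the lowest quartile; ratio undefined")
    return high_rate / low_rate


# Toy data for illustration only (not from the paper).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
toy_defaults = [0, 1, 0, 0, 0, 1, 1, 1]
ratio = quartile_default_ratio(toy_scores, toy_defaults)
```

A ratio of 5.5, as in the abstract, would mean the default rate among the riskiest quarter of borrowers is 5.5 times the rate among the safest quarter.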
Estimating Poverty and Wealth From Mobile Phone Data
Abstract: Accurate estimates of population demographics are a critical input to social and economic research. Here, we show that it is possible to predict an individual's wealth from their history of mobile phone calls, and that phone-based predictions for millions of citizens can be aggregated into accurate national statistics. The approach is first demonstrated on a sample of 856 phone survey respondents in Rwanda, and separately validated through 1,234 face-to-face interviews in Afghanistan. In resource-constrained environments where censuses and household surveys are rare, this creates an option for gathering timely information on population statistics at a tiny fraction of the cost of traditional methods.
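The aggregation step mentioned in this abstract, rolling individual predictions up into area-level statistics, might look like the following minimal sketch. It assumes each subscriber already has a predicted wealth value and a region label; the function name `mean_wealth_by_region` and the sample values are hypothetical, and the paper's own prediction and aggregation procedure is not specified here.

```python
from collections import defaultdict


def mean_wealth_by_region(predictions):
    """Average predicted wealth per region.

    predictions -- iterable of (region, predicted_wealth) pairs; hypothetical input
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, wealth in predictions:
        totals[region] += wealth
        counts[region] += 1
    # Divide each regional total by the number of subscribers in that region.
    return {region: totals[region] / counts[region] for region in totals}


# Toy data for illustration only (not from the paper).
toy_predictions = [("North", 1.0), ("North", 3.0), ("South", 2.0)]
regional_means = mean_wealth_by_region(toy_predictions)
```

A simple mean is used here only for illustration; any weighting to correct for who owns a phone would be an additional design choice not shown in this sketch.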
Forecasting Economic Activity With Yelp Data
Abstract: Measuring and forecasting economic activity is a central component of policymaking and policy research. Statistics released by government agencies such as the Bureau of Labor Statistics and Census Bureau have been the backbone of much of this work, providing insight about a wide set of policy questions. While valuable, these sources have important limitations: they are published at low frequency with large reporting lags that can stretch back two years, and lack consistent data at granular levels of analysis such as cities or neighborhoods. While more granular data can be made available to researchers, there is often an additional waiting period of one to two years to receive this access.
These factors impose practical limitations on the data’s ability to shed light on real-time trends and policy. Pairing user-generated data on local business activity from Yelp with government data sources including the Quarterly Census of Employment and Wages (QCEW) and housing price data, we examine the potential and limitations of using Yelp data to improve the measurement of real-time economic activity, as well as economic forecasts. We investigate the ways in which Yelp data can provide a useful complement to QCEW, by making reliable predictions on local business patterns well before the release of official statistics. However, the ability to make meaningful predictions lies not only in data gathering but also in data cleaning and model selection, and we explore these decisions as well. Lastly, we expand this analysis by forecasting other economic outcomes, such as housing prices at the local level.
Harvard Business School
- C1 - Econometric and Statistical Methods and Methodology: General
- O3 - Innovation; Research and Development; Technological Change; Intellectual Property Rights