Special Session: Current State of AI in Finance
Paper Session
Monday, Jan. 5, 2026 10:15 AM - 12:15 PM (EST)
- Chair: Gerard Hoberg, University of Southern California
The Memorization Problem: Can We Trust LLMs’ Economic Forecasts?
Abstract
Large language models (LLMs) cannot be trusted for economic forecasts during periods covered by their training data. Counterfactual forecasting ability is non-identified when the model has seen the realized values: any observed output is consistent with both genuine skill and memorization. Any evidence of memorization represents only a lower bound on encoded knowledge. We demonstrate LLMs have memorized economic and financial data, recalling exact values before their knowledge cutoff. Instructions to respect historical boundaries fail to prevent recall-level accuracy, and masking fails as LLMs reconstruct entities and dates from minimal context. Post-cutoff, we observe no recall. Memorization extends to embeddings.
Technology and Labor Markets: Past, Present, and Future; Evidence from Two Centuries of Innovation
Abstract
We use recent advances in natural language processing and large language models to construct novel measures of technology exposure for workers that span almost two centuries. Combining our measures with Census data on occupation employment, we show that technological progress over the 20th century has led to economically meaningful shifts in labor demand across occupations: it has consistently increased demand for occupations with higher education requirements, occupations that pay higher wages, and occupations with a greater fraction of female workers. Using these insights and a calibrated model, we then explore different scenarios for how advances in artificial intelligence (AI) are likely to impact employment trends in the medium run. The model predicts a reversal of past trends, with AI favoring occupations that are lower-educated, lower-paid, and more male-dominated.
The Household Impact of Generative AI: Evidence from Internet Browsing Behavior
Abstract
This paper studies the impact of generative AI on U.S. households using detailed Internet browsing data from over 200,000 households' home devices during 2021–2024. Our analyses of households' de facto adoption and usage of ChatGPT reveal several new findings. First, based on households' browsing of other websites during ChatGPT usage sessions, we find that households tend to use ChatGPT for productive non-market activities, such as education or job searches, rather than for leisure. Second, based on households' pre-ChatGPT browsing patterns, we show that adopting ChatGPT reduces households' overall productive activities and increases leisure activities on home devices. Together, these findings suggest that generative AI increases households' time spent on leisure by making productive activities more efficient. Finally, we find a substantial "generative AI divide" among households, as high-income and younger households adopt generative AI more than low-income and older households.
JEL Classifications
- G0 - General