Generative AI and Machine Learning
Paper Session
Sunday, Jan. 4, 2026 10:15 AM - 12:15 PM (EST)
- Chair: Junbo Wang, Louisiana State University
Out of the (Black)Box: AI as Conditional Probability
Abstract
The core technology powering modern Large Language Models (LLMs) estimates the distribution of probable answers conditional on the prompt. Using a financial news and returns dataset, we find that these conditional probabilities are interpretable and contain valuable economic information. Conversely, measures of declared confidence used in the literature are opaque, structurally biased, unstable, and more model-dependent, indicating that LLMs cannot assess their own confidence. Using conditional probabilities, we analyze LLM biases and provide insights into the internal mechanisms driving model decisions. Our results indicate that conditional probabilities provide a reliable and transparent reflection of LLM priors, particularly for economic applications.
What Does ChatGPT Make of Historical Stock Returns? Extrapolation and Miscalibration in LLM Stock Return Forecasts
Abstract
We examine how large language models (LLMs) interpret historical stock returns and price charts when prompted to forecast returns over short horizons. While stock returns exhibit short-term reversals, LLM forecasts overextrapolate, placing excessive weight on recent performance. Simulations indicate that LLM extrapolation is stronger for less persistent series, similar to humans, and difficult to eliminate through prompt engineering. LLM forecasts also appear optimistic relative to historical and future returns. When prompted for 80% confidence interval predictions, LLM forecasts are better calibrated than survey evidence. The findings suggest LLMs manifest common behavioral biases but are better at gauging risks than humans.
The Power of the Common Task Framework
Abstract
The “Common Task Framework” (CTF) is a collaborative and competitive process in which researchers solve a task using shared data, a predefined success metric, and a leaderboard. Using an economic model, we show that the CTF incentivizes effort, increases innovation, and curbs misrepresentation by reducing research costs and improving comparability. Historical examples from computer science underscore its effectiveness. To demonstrate its broader applicability, we propose a CTF for financial economics: a platform open to all researchers designed to identify the pricing kernel and systematically evaluate asset pricing models, from traditional factor-based approaches to modern machine learning techniques.
Discussant(s)
- Alejandro Lopez-Lira, University of Florida
- Rohit Allena, University of Houston
- Shumiao Ouyang, University of Oxford
- Winston Wei Dou, University of Pennsylvania
JEL Classifications
- G1 - General Financial Markets