
Artificial Intelligence and Human Beliefs

Paper Session

Monday, Jan. 5, 2026 1:00 PM - 3:00 PM (EST)

Philadelphia Marriott Downtown, Grand Ballroom C&D
Hosted By: American Economic Association
  • Chair: Yannai A. Gonczarowski, Harvard University

Human Misperception of Generative-AI Alignment: A Laboratory Experiment

Kevin He, University of Pennsylvania
Ran Shorrer, Pennsylvania State University
Mengjia Xia, University of Pennsylvania

Abstract

We conduct an incentivized laboratory experiment to study people’s perception of generative artificial intelligence (GenAI) alignment in the context of economic decision-making. Using a panel of economic problems spanning the domains of risk, time preference, social preference, and strategic interactions, we ask human subjects to make choices for themselves and to predict the choices made by GenAI on behalf of a human user. We find that people overestimate the degree of alignment between GenAI’s choices and human choices. In every problem, human subjects’ average prediction about GenAI’s choice is substantially closer to the average human-subject choice than it is to the GenAI choice. At the individual level, different subjects’ predictions about GenAI’s choice in a given problem are highly correlated with their own choices in the same problem. We explore the implications of people overestimating GenAI alignment in a simple theoretical model.

Human Learning about AI

Bnaya Dreyfuss, Harvard University
Raphaël Raux, Harvard University

Abstract

We study how humans form expectations about the performance of artificial intelligence (AI) and the consequences for AI adoption. Our main hypothesis is that people project human-relevant task features onto AI. People then over-infer from AI failures on human-easy tasks, and from AI successes on human-difficult tasks. Lab experiments provide strong evidence for projection of human difficulty onto AI, predictably distorting subjects’ expectations. The resulting adoption can be sub-optimal, since for AI, failing human-easy tasks need not imply poor overall performance. A field experiment with an AI giving parenting advice shows evidence for projection of human textual similarity. Users infer strongly from answers that are equally uninformative but textually less similar to the answers they expect, significantly reducing trust and engagement. The results suggest that AI “anthropomorphism” can backfire by increasing projection and misaligning human expectations with AI performance.

The ABC's of Who Benefits from Working with AI: Ability, Beliefs, and Calibration

Andrew Caplin, New York University
David Deming, Harvard University
Shangwen Li, New York University
Daniel J. Martin, University of California-Santa Barbara
Philip Marx, Louisiana State University

Abstract

We use a controlled experiment to show that ability and belief calibration jointly determine the benefits of working with Artificial Intelligence (AI). AI improves performance more for people with low baseline ability. However, holding ability constant, AI assistance is more valuable for people who are calibrated, meaning they have accurate beliefs about their own ability. People who know they have low ability gain the most from working with AI. In a counterfactual analysis, we show that eliminating miscalibration would cause AI to reduce performance inequality nearly twice as much as it already does.

Are Foundation Models Foundational? Using Synthetic Tasks to Reveal World Models

Ashesh Rambachan, Massachusetts Institute of Technology

Abstract

Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much as Kepler's predictions of planetary motion led to the discovery of fundamental physical laws. However, evaluating whether these models truly capture underlying principles remains a challenge. We present a framework for testing foundation models by analyzing how they adapt to synthetic tasks that share mechanisms with their training domain. We introduce metrics that measure a model's inductive bias toward known models of reality. Across multiple domains, we find that models can excel at their training tasks yet fail to develop inductive biases toward the true mechanisms when adapted to new tasks. For instance, models trained on orbital trajectories fail to consistently apply Newtonian mechanics when adapted to new physics tasks. Our analysis reveals that rather than learning parsimonious representations that accord with reality, these models often develop task-specific heuristics that do not generalize.
JEL Classifications
  • D8 - Information, Knowledge, and Uncertainty
  • D9 - Micro-Based Behavioral Economics