ActTraitBench: Quantifying the Knowledge-Decision Gap in Large Language Models via Human-Grounded Behavioral Validation

Source: Computation and Language · Field: Technology & Digital - Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

ActTraitBench is a human-grounded evaluation framework that quantifies the Knowledge-Decision Gap ($G_{\text{KD}}$) in Large Language Models (LLMs): the discrepancy between what a model explicitly reports about itself and what it implicitly decides in behavioral tasks. Existing benchmarks often fall short because of limited construct validity and biases. ActTraitBench establishes one-to-one mappings between psychometric facets and behavioral paradigms, and uses a Distributional Calibration via Quantile Mapping procedure to align LLM-judge scores with human norms. Experiments on 14 mainstream LLMs reveal a pervasive knowledge-decision asymmetry: larger and more capable models often show stronger behavioral divergence despite highly consistent self-reports. To mitigate this gap, the authors introduce the Chain of Cognitive Alignment (CoCA), a plug-and-play inference-time intervention that improves alignment in reasoning-capable frontier models while exposing limitations in smaller architectures.

Key takeaway

AI scientists and machine learning engineers evaluating LLM persona consistency should consider ActTraitBench for its human-grounded approach to quantifying the Knowledge-Decision Gap. The framework can surface behavioral divergence even in larger models whose self-reports are highly consistent. The Chain of Cognitive Alignment (CoCA) can be applied as a plug-and-play intervention to improve alignment in reasoning-capable frontier models, with the caveat that it is less effective on smaller architectures.

Key insights

ActTraitBench quantifies the Knowledge-Decision Gap in LLMs, revealing behavioral divergence despite consistent self-reports.

Method

ActTraitBench uses psychometric-behavioral mappings and Distributional Calibration via Quantile Mapping. CoCA is an inference-time intervention for alignment.
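
The calibration step named above can be illustrated with a generic empirical quantile mapping. This is a minimal sketch of the general technique, not the paper's implementation: the function name `quantile_map`, the score scales, and the sample data are all hypothetical, and the summary does not specify how ActTraitBench estimates or interpolates quantiles.

```python
from bisect import bisect_right


def quantile_map(judge_scores, human_norms, new_scores):
    """Map scores from the LLM-judge distribution onto the human-norm distribution.

    Assumption (not from the source): both distributions are represented by
    plain samples, and quantiles are estimated empirically.
    """
    judge_sorted = sorted(float(s) for s in judge_scores)
    human_sorted = sorted(float(s) for s in human_norms)
    if not judge_sorted or not human_sorted:
        raise ValueError("both reference samples must be non-empty")

    calibrated = []
    for score in new_scores:
        # Empirical CDF of the judge distribution at this score, in [0, 1].
        q = bisect_right(judge_sorted, score) / len(judge_sorted)
        # Inverse empirical CDF of the human distribution at quantile q,
        # using linear interpolation between neighbouring order statistics.
        pos = q * (len(human_sorted) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(human_sorted) - 1)
        frac = pos - lo
        calibrated.append(
            human_sorted[lo] + frac * (human_sorted[hi] - human_sorted[lo])
        )
    return calibrated


if __name__ == "__main__":
    # Hypothetical data: judge scores on a 1-5 scale, human norms on a 10-50 scale.
    judge = [1, 2, 2, 3, 4, 4, 5]
    human = [12, 18, 25, 31, 36, 44, 50]
    print(quantile_map(judge, human, [1, 3, 5]))
```

The mapping is monotone, so it preserves the rank order of judge scores while reshaping their distribution to match the human reference sample.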

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.