CoRA: Confidence-Rationale Alignment for Reliable Chain-of-Thought Reasoning

Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

CoRA (Confidence-Rationale Alignment) is a framework that targets misleadingly high confidence in Chain-of-Thought (CoT) reasoning by Large Language Models (LLMs), where a rationale can read as plausible yet give little substantive support for the answer. It uses GRPO-based reinforcement learning to jointly optimize answer correctness, committed-answer probability, and rubric-based rationale support. The rubric scores a rationale's grounding, coherence, task match, and connection to the selected answer, without access to the gold answer. On MedQA, MathQA, and OpenBookQA, with three open-weight LLMs, CoRA reduced confidence-rationale alignment error by up to 26.51% compared with untuned checkpoints, SFT, and correctness-only GRPO. It also kept accuracy competitive and often improved calibration, which suggests that reliable CoT reasoning requires rationales that genuinely support confident answers.

Key takeaway

For machine learning engineers deploying Chain-of-Thought (CoT) LLMs: prioritize confidence-rationale alignment. High answer confidence is not enough on its own; the rationale must substantively justify it. Consider training approaches like CoRA's GRPO-based framework, which explicitly rewards rationale quality alongside correctness and confidence. The aim is a model whose stated confidence is backed by its reasoning, which reduces misleading outputs in critical applications.

Key insights

Reliable Chain-of-Thought reasoning requires aligning model confidence with the substantive support provided by its generated rationale.
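
To make the idea of alignment concrete, here is a minimal Python sketch. The summary does not say how the confidence-rationale alignment error is computed, so the definition below (mean absolute gap between answer confidence and a rationale support score, both in [0, 1]) is an assumption for illustration only, and the name `alignment_error` is hypothetical.

```python
def alignment_error(confidences, supports):
    """Mean absolute gap between answer confidence and rationale support.

    ASSUMPTION: this definition is illustrative; the source summary does not
    specify the metric. Both inputs are sequences of floats in [0, 1].
    """
    if len(confidences) != len(supports):
        raise ValueError("confidences and supports must have the same length")
    if not confidences:
        raise ValueError("at least one example is required")
    gaps = [abs(c - s) for c, s in zip(confidences, supports)]
    return sum(gaps) / len(gaps)


# Example: the first answer is highly confident but weakly supported,
# the second is moderately confident and equally well supported.
error = alignment_error([0.95, 0.6], [0.3, 0.6])
```

Under this reading, a model that is confident only when its rationale is well supported scores near zero, while confident answers resting on weak rationales push the error up.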

Method

A GRPO-based reinforcement learning framework jointly rewards answer correctness, committed-answer probability, and rubric-based rationale support. The rubric assesses grounding, coherence, task match, and connection to the selected answer.
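
A rough sketch of how such a joint reward could feed GRPO's group-relative advantages is shown below. The summary names the three reward signals but not how they are combined, so the weights, the equal averaging of rubric criteria, and the "confidence should match support" term are all assumptions; `Rollout`, `joint_reward`, and `group_advantages` are hypothetical names. Only the group normalization (reward minus group mean, divided by group standard deviation) reflects how GRPO is generally described.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# The four rubric criteria named in the summary; scoring each in [0, 1]
# is an assumption made for this sketch.
RUBRIC_KEYS = ("grounding", "coherence", "task_match", "answer_connection")


@dataclass
class Rollout:
    correct: bool         # committed answer matches the gold answer
    answer_prob: float    # model probability of the committed answer, in [0, 1]
    rubric_scores: dict   # criterion name -> score in [0, 1]


def rationale_support(rollout):
    """Average the rubric criteria into one support score (assumed equal weights)."""
    return mean(rollout.rubric_scores[k] for k in RUBRIC_KEYS)


def joint_reward(rollout, w_correct=1.0, w_align=1.0):
    """Reward correctness plus agreement between confidence and rationale support.

    ASSUMPTION: the combination rule and weights are illustrative, not the paper's.
    """
    alignment = 1.0 - abs(rollout.answer_prob - rationale_support(rollout))
    return w_correct * float(rollout.correct) + w_align * alignment


def group_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of rollouts for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two rollouts for the same prompt: both correct and confident, but only
# the first has a rationale that the rubric rates as well supported.
group = [
    Rollout(True, 0.9, {k: 0.9 for k in RUBRIC_KEYS}),
    Rollout(True, 0.9, {k: 0.25 for k in RUBRIC_KEYS}),
]
advantages = group_advantages([joint_reward(r) for r in group])
```

In this toy group, the well-supported rollout receives the positive advantage even though both answers are correct, which is the behavior a correctness-only reward cannot express.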

Best for: AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.