FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG

2026-06-06 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

FIDES (Faithful Inference via Deep Evidence Signals) is a training-free decoding method for retrieval-memory conflict in Retrieval-Augmented Generation (RAG), where a language model ignores retrieved context that contradicts its parametric memory. FIDES counters this by applying contrastive pressure token by token. Whereas existing methods assume the parametric bias is uniform across tokens, FIDES starts from the observation that conflict concentrates on a small fraction of answer-critical tokens. It fuses three internal signals, Opposition (output surface), Shift (hidden representations), and Noise (prediction trajectory), to set the intervention strength at each decoding step. Across three benchmarks and six backbones (7B to 70B parameters), FIDES achieved the best context fidelity in all 18 settings, outperforming the strongest training-free baseline (AdaCAD) by 3 to 13 points. On LLaMA3-70B, fidelity reached 92–94% and F1 reached 62–63%, at 8–11% additional overhead relative to CAD.

Key takeaway

For AI scientists and ML engineers building RAG systems, FIDES offers a training-free way to improve context fidelity when retrieved evidence contradicts model memory. Adjusting contrastive pressure per token, rather than uniformly, targets the answer-critical tokens where conflict concentrates, and the reported gains extend to models as large as LLaMA3-70B. It is worth evaluating wherever RAG output must follow the provided context, particularly in high-stakes factual domains.

Key insights

Retrieval-memory conflict in RAG concentrates on individual tokens, so faithful decoding needs contrastive strength that adapts per token and is driven by the model's internal signals.

Principles

Parametric bias is heterogeneous across tokens.
Deep internal signals reveal conflict intensity.
Token-level contrast improves fidelity and utility.

Method

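At each decoding step, FIDES runs two forward passes, one with the retrieved context and one without. From these it extracts three signals: Opposition (Jensen-Shannon divergence, JSD), Shift (L2 distance between hidden states), and Noise (KL divergence over intermediate-layer logits). It fuses the three and maps the result to a token-specific contrastive coefficient α_t.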
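
The signal-to-coefficient pipeline can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the summary does not say which distributions each divergence compares, how the signals are normalized or fused, or how the fused score maps to α_t. The sketch assumes JSD between the context and no-context next-token distributions, L2 distance between the two passes' hidden states, mean KL divergence between consecutive intermediate-layer distributions, an x/(1+x) squash, an unweighted average, and a hypothetical cap `alpha_max`. All function and parameter names are invented for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    """KL(p || q) in nats; eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence in nats (bounded above by ln 2)."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def squash(x):
    """ASSUMPTION: map a nonnegative signal into [0, 1) via x / (1 + x)."""
    return x / (1.0 + x)

def token_alpha(ctx_logits, mem_logits, ctx_hidden, mem_hidden,
                layer_logits, alpha_max=1.0):
    """Hypothetical fusion of the three signals into a per-token alpha_t.

    ctx_* come from the pass with retrieved context, mem_* from the pass
    without it. layer_logits holds per-layer logits from one pass, ordered
    from shallow to deep.
    """
    # Opposition: disagreement between the two output distributions.
    opposition = jsd(softmax(ctx_logits), softmax(mem_logits))
    # Shift: distance between the two passes' hidden states.
    shift = l2(ctx_hidden, mem_hidden)
    # Noise: ASSUMED to be the mean KL between consecutive layers.
    probs = [softmax(l) for l in layer_logits]
    steps = [kl(deep, shallow) for shallow, deep in zip(probs, probs[1:])]
    noise = sum(steps) / len(steps) if steps else 0.0
    # ASSUMPTION: unweighted average of squashed signals, scaled by alpha_max.
    fused = (squash(opposition) + squash(shift) + squash(noise)) / 3.0
    return alpha_max * fused
```

Under these assumptions, a token on which both passes agree gets a coefficient of zero, and the coefficient grows toward `alpha_max` as the passes diverge. A real implementation would read logits and hidden states from the model's forward passes and would likely calibrate or weight the signals differently.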

In practice

Implement dual-path decoding for conflict detection.
Monitor JSD, L2 distance, and KL divergence for token-level risk.
Apply adaptive contrastive weights based on signal fusion.
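
The last step, applying an adaptive contrastive weight, can be illustrated with the combination rule used by context-aware decoding (CAD): the context-pass logits are amplified and the no-context logits subtracted. Whether FIDES uses exactly this form is an assumption here. The function names, the two-token vocabulary, and the logit values are hypothetical, and the per-token weight `alpha` is taken as given.

```python
def contrastive_logits(ctx_logits, mem_logits, alpha):
    """CAD-style adjustment: (1 + alpha) * with-context - alpha * without-context.

    alpha = 0 leaves the with-context logits unchanged; a larger alpha
    pushes harder against what the model predicts from memory alone.
    """
    return [(1.0 + alpha) * c - alpha * m
            for c, m in zip(ctx_logits, mem_logits)]

def greedy_token(logits):
    """Index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Toy example: the retrieved context supports "Lyon", memory prefers "Paris".
vocab = ["Paris", "Lyon"]
mem_logits = [4.0, 1.0]   # no-context pass: strong memory prior for "Paris"
ctx_logits = [3.0, 2.5]   # context pass: still leans slightly toward "Paris"

for alpha in (0.0, 1.0):
    adjusted = contrastive_logits(ctx_logits, mem_logits, alpha)
    print(alpha, adjusted, vocab[greedy_token(adjusted)])
# 0.0 [3.0, 2.5] Paris
# 1.0 [2.0, 4.0] Lyon
```

In this toy case a zero weight leaves the memory-driven answer in place, while a weight of 1.0 flips the choice to the context-supported token. A token-specific weight lets strong pressure be reserved for the conflicted tokens rather than applied uniformly.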

Topics

Retrieval-Augmented Generation
Contrastive Decoding
Hallucination Mitigation
Context Fidelity
Large Language Models
Token-level Control

Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.