CREDENCE: Claim Reduction for Decomposition & Enhanced Credibility -- Semantic Metrics and Convergence Analysis

2026-06-18 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Credence is a framework for decomposing compound sentences into atomic, verifiable claims, a key step in automated fact-checking. It targets two limitations of prior methods: they scored decompositions with token-overlap (Jaccard) metrics, which underestimate the quality of paraphrastic claims, and they lacked a formal termination analysis for repair loops. Credence introduces Semantic-F1, a fidelity metric based on cosine similarity of BGE-large embeddings, which improves downstream fact-checking accuracy by 15-32 percentage points over Jaccard-F1. The framework also provides convergence theorems: rule-based repair is characterized as monotone and finitely terminating, while LLM-based self-repair is non-monotone and therefore requires an early-exit guard. Credence is evaluated on three benchmarks (SocialClaimSplit, WikiSplitBench, ClaimDecompBench) with four decomposer models (3.8B-12B parameters). Performance is robust across these settings, and rule-based repair reduces the Atomicity Violation Rate by 47-100% without degrading fidelity.

Key takeaway

For NLP engineers building automated fact-checking systems: token-overlap metrics such as Jaccard-F1 may significantly underestimate claim decomposition quality, especially for paraphrastic claims. Use a semantic similarity metric, such as Credence's Semantic-F1, to assess decomposition fidelity. When designing iterative repair pipelines, formally characterize their convergence properties, and add early-exit guards to LLM-based self-repair so that the loop terminates reliably.

Key insights

Credence improves automated fact-checking by semantically evaluating decomposed claims and formalizing repair loop convergence.

Principles

Semantic similarity metrics are crucial for paraphrastic claim evaluation.
Formal termination analysis is vital for iterative repair pipelines.
LLM-based self-repair requires early-exit mechanisms.
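
The early-exit principle can be illustrated with a minimal sketch. It assumes a repair step that may make a decomposition worse (as a non-monotone LLM self-repair can) and a violation counter. The names `repair_with_early_exit`, `repair_step`, `count_violations`, and `max_rounds` are hypothetical and do not come from the original article; this is one plausible guard, not the framework's actual implementation.

```python
from typing import Callable, List

Claims = List[str]


def repair_with_early_exit(
    claims: Claims,
    repair_step: Callable[[Claims], Claims],   # hypothetical: e.g. one LLM self-repair call
    count_violations: Callable[[Claims], int],  # hypothetical: atomicity violation counter
    max_rounds: int = 5,                        # assumed hard cap on repair rounds
) -> Claims:
    """Run a possibly non-monotone repair loop, keeping the best result so far.

    The loop stops as soon as a round fails to strictly reduce the number of
    violations, so a step that makes things worse cannot run away.
    """
    best = list(claims)
    best_violations = count_violations(best)
    for _ in range(max_rounds):
        if best_violations == 0:
            break  # nothing left to repair
        candidate = repair_step(best)
        candidate_violations = count_violations(candidate)
        if candidate_violations >= best_violations:
            break  # early exit: no strict improvement, keep the previous best
        best, best_violations = candidate, candidate_violations
    return best
```

Because each accepted round strictly lowers a non-negative integer count, the loop terminates even without `max_rounds`; the cap only bounds the cost of calling the repair step.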

Method

Credence measures claim fidelity with Semantic-F1, a cosine similarity metric over BGE-large embeddings, and pairs it with rule-based or LLM-based repair loops whose convergence and atomicity behavior are formally analyzed.
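
The gap between a token-overlap score and an embedding-based score can be seen in a small sketch. The summary does not give the exact Semantic-F1 formula, so this assumes a common soft-matching form: each predicted claim is matched to its most similar reference claim for precision, and vice versa for recall. The toy embedder `toy_embed` and its `TOY_CONCEPTS` table are hypothetical stand-ins for a real BGE-large encoder, used only so the example runs without a model download.

```python
import math
from collections import Counter
from typing import Callable, Dict, List

Similarity = Callable[[str, str], float]

# Hypothetical stand-in for a sentence encoder: maps paraphrases to a shared concept.
TOY_CONCEPTS: Dict[str, str] = {"bought": "buy", "acquired": "buy", "firm": "company"}


def jaccard_sim(a: str, b: str) -> float:
    """Token-overlap similarity: intersection over union of lowercased tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def toy_embed(text: str) -> Counter:
    """Sparse bag-of-concepts vector (assumption: replaces BGE-large embeddings)."""
    return Counter(TOY_CONCEPTS.get(tok, tok) for tok in text.lower().split())


def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between the toy embeddings of two claims."""
    u, v = toy_embed(a), toy_embed(b)
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def soft_f1(predicted: List[str], reference: List[str], sim: Similarity) -> float:
    """Assumed soft F1: best-match similarity averaged in both directions."""
    if not predicted or not reference:
        return 0.0
    precision = sum(max(sim(p, r) for r in reference) for p in predicted) / len(predicted)
    recall = sum(max(sim(r, p) for p in predicted) for r in reference) / len(reference)
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


predicted = ["Acme bought the firm"]
reference = ["Acme acquired the company"]
jaccard_f1 = soft_f1(predicted, reference, jaccard_sim)
semantic_f1 = soft_f1(predicted, reference, cosine_sim)
```

On this paraphrase pair the Jaccard-based score is 1/3, since only two of six distinct tokens overlap, while the toy semantic score is 1.0. With a real encoder the similarity would be less clear-cut, but the direction of the gap is the point the summary makes about paraphrastic claims.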

In practice

Use Semantic-F1 for claim decomposition evaluation.
Implement early-exit guards for LLM-based repair.
Apply rule-based repair to reduce atomicity violations.
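
As a toy illustration of the last point, the sketch below applies a single splitting rule until no violations remain and reports a violation rate before and after. Everything here is an assumption made for illustration: the summary does not list Credence's actual rules or define the Atomicity Violation Rate precisely, so a claim is flagged simply when it contains the conjunction " and ", and the rate is the share of flagged claims.

```python
from typing import List


def is_violation(claim: str) -> bool:
    """Toy atomicity check (assumption): a claim joined by ' and ' is not atomic."""
    return " and " in claim


def violation_rate(claims: List[str]) -> float:
    """Assumed definition: fraction of claims flagged as non-atomic."""
    return sum(is_violation(c) for c in claims) / len(claims) if claims else 0.0


def split_once(claim: str) -> List[str]:
    """Split a claim at its first ' and ', dropping empty fragments."""
    left, _, right = claim.partition(" and ")
    return [part.strip() for part in (left, right) if part.strip()]


def rule_repair(claims: List[str]) -> List[str]:
    """Apply the split rule until no claim is flagged.

    Each split removes one ' and ' and never creates a new one, so the total
    count of conjunctions strictly decreases and the loop must terminate.
    """
    current = list(claims)
    while any(is_violation(c) for c in current):
        repaired: List[str] = []
        for claim in current:
            repaired.extend(split_once(claim) if is_violation(claim) else [claim])
        current = repaired
    return current


claims = ["Acme bought Beta and Beta's CEO resigned", "Shares fell"]
rate_before = violation_rate(claims)
repaired_claims = rule_repair(claims)
rate_after = violation_rate(repaired_claims)
```

A naive conjunction split like this would damage claims such as "salt and pepper are seasonings", which is why fidelity should be checked after repair rather than assumed.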

Topics

Claim Decomposition
Automated Fact-Checking
Semantic-F1 Metric
LLM Repair Pipelines
Evaluation Benchmarks
Convergence Analysis

Best for: Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.