CREDENCE: Claim Reduction for Decomposition & Enhanced Credibility -- Semantic Metrics and Convergence Analysis

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Credence is a new framework for decomposing compound sentences into atomic, verifiable claims, a key step in automated fact-checking. It addresses two limitations of prior methods: they relied on token-overlap (Jaccard) metrics, which underestimate the quality of paraphrastic claims, and they lacked a formal termination analysis for repair loops. Credence introduces Semantic-F1, a fidelity metric based on BGE-large cosine similarity, which is reported to improve downstream fact-checking accuracy by 15-32 percentage points over Jaccard-F1. The framework also provides convergence theorems that characterize rule-based repair as monotone and finitely terminating, and LLM-based self-repair as non-monotone and therefore in need of an early-exit guard. Credence is evaluated on three benchmarks (SocialClaimSplit, WikiSplitBench, ClaimDecompBench) with four decomposer models (3.8B-12B parameters), where rule-based repair reduces the Atomicity Violation Rate by 47-100% without degrading fidelity.
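To make the token-overlap limitation concrete, here is a minimal sketch of token-set Jaccard similarity applied to a paraphrase. The tokenizer and the example sentences are illustrative assumptions, not taken from the original article, and the article's Jaccard-F1 may be computed differently.

```python
import re


def tokens(text):
    """Lowercased alphanumeric word tokens (a deliberately simple tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-set Jaccard similarity: |A & B| / |A | B|."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty strings are treated as identical
    return len(ta & tb) / len(ta | tb)


# Hypothetical example sentences, chosen only to illustrate the effect.
reference = "The company laid off 200 employees in March."
paraphrase = "In March, the firm dismissed 200 workers."

verbatim_score = jaccard(reference, reference)
paraphrase_score = jaccard(reference, paraphrase)  # 4 shared tokens out of 11
```

The paraphrase carries the same claim as the reference but shares only four of eleven distinct tokens, so a token-overlap metric scores it well below a verbatim copy.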

Key takeaway

If you are an NLP engineer building automated fact-checking systems and you judge claim decomposition with token-overlap metrics such as Jaccard-F1, you may be significantly underestimating decomposition quality, especially for paraphrastic claims. Use a semantic similarity metric, such as Credence's Semantic-F1, to assess decomposition fidelity more accurately. When designing iterative repair pipelines, formally characterize their convergence properties, and add an early-exit guard to LLM-based self-repair so that the loop terminates reliably.
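One way an early-exit guard could look is sketched below. This is not the article's algorithm: the function names (`repair_with_guard`, `count_conjunctions`, `split_first_conjunction`) are hypothetical, and counting occurrences of " and " is only a toy stand-in for an atomicity-violation measure. The guard keeps the best candidate seen so far and stops as soon as a repair round fails to reduce the violation count.

```python
from typing import Callable, List, Tuple

Claims = List[str]


def repair_with_guard(
    claims: Claims,
    repair_step: Callable[[Claims], Claims],
    violations: Callable[[Claims], int],
    max_rounds: int = 5,
) -> Tuple[Claims, int]:
    """Run repair rounds and return (best_claims, rounds_run).

    Stops when no violations remain, when a round does not reduce the
    violation count (the early-exit guard), or after max_rounds.
    """
    best, best_score = claims, violations(claims)
    rounds = 0
    while best_score > 0 and rounds < max_rounds:
        candidate = repair_step(best)
        rounds += 1
        score = violations(candidate)
        if score >= best_score:
            break  # no improvement: discard the candidate and exit
        best, best_score = candidate, score
    return best, rounds


def count_conjunctions(claims: Claims) -> int:
    """Toy violation measure (assumption): total occurrences of ' and '."""
    return sum(c.count(" and ") for c in claims)


def split_first_conjunction(claims: Claims) -> Claims:
    """Toy rule-based repair: split the first claim that contains ' and '."""
    out, done = [], False
    for c in claims:
        if not done and " and " in c:
            out.extend(c.split(" and ", 1))
            done = True
        else:
            out.append(c)
    return out


compound = ["Paris is in France and Berlin is in Germany and Rome is in Italy"]

# Each rule application removes exactly one conjunction, so the count
# decreases strictly and the loop terminates on its own.
repaired, rule_rounds = repair_with_guard(
    compound, split_first_conjunction, count_conjunctions
)

# A repair step that makes things worse, standing in for a non-monotone
# self-repair round, is rejected by the guard after a single round.
guarded, noisy_rounds = repair_with_guard(
    compound, lambda cs: cs + ["x and y"], count_conjunctions
)
```

With the toy rule, the violation count is a non-negative integer that drops on every round, which is the intuition behind monotone, finitely terminating repair. The second call shows why a non-monotone step needs the guard: without it, the loop would accept a worse candidate and keep running until `max_rounds`.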

Key insights

Credence improves automated fact-checking by semantically evaluating decomposed claims and formalizing repair loop convergence.

Principles

Method

Credence measures claim fidelity with Semantic-F1, a metric based on BGE-large cosine similarity, and pairs it with rule-based or LLM-based repair loops whose convergence and atomicity behavior are formally analyzed.
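The summary does not give the exact definition of Semantic-F1, so the following is only one plausible reading: a soft precision and recall in which each claim is matched to its most similar counterpart by cosine similarity, combined as a harmonic mean. The matching scheme, the function names, and the `toy_embed` bag-of-words embedder are all assumptions; in practice the embedder would be replaced by a sentence encoder such as a BGE-large model.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def semantic_f1(
    predicted: List[str],
    reference: List[str],
    embed: Callable[[str], Vector],
) -> float:
    """Soft F1 over best-match cosine similarities (assumed formulation)."""
    if not predicted or not reference:
        return 0.0
    p_vecs = [embed(t) for t in predicted]
    r_vecs = [embed(t) for t in reference]
    # Precision: how well each predicted claim is supported by some reference.
    precision = sum(max(cosine(p, r) for r in r_vecs) for p in p_vecs) / len(p_vecs)
    # Recall: how well each reference claim is covered by some prediction.
    recall = sum(max(cosine(r, p) for p in p_vecs) for r in r_vecs) / len(r_vecs)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy embedder (assumption): word counts over a tiny fixed vocabulary.
# It stands in for a real sentence encoder and is NOT semantic.
VOCAB = ["paris", "france", "berlin", "germany", "capital"]


def toy_embed(text: str) -> Vector:
    counts = Counter(text.lower().replace(".", "").split())
    return [float(counts[w]) for w in VOCAB]


reference_claims = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
]

full_score = semantic_f1(reference_claims, reference_claims, toy_embed)
partial_score = semantic_f1(reference_claims[:1], reference_claims, toy_embed)
```

Passing the embedder as a parameter keeps the metric independent of any particular model. Dropping one of the two reference claims leaves precision at 1 but lowers recall, so the partial decomposition scores below the complete one.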

Best for: Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.