Explicit Evidence Grounding via Structured Inline Citation Generation

Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

FullCite is a framework for factual, faithful generation that attaches structured inline citations to model output. Unlike prior methods, it links each generated claim to both its source document and the specific supporting evidence. It supports three strategies for generating inline citations: prompt-based generation, constrained decoding with a citation grammar, and post-hoc span alignment. The framework was evaluated on three question answering benchmarks (ASQA, BioASQ, and ExpertQA) along three axes: document-level correctness, evidence span identification, and claim-citation faithfulness. The results show that Large Language Models (LLMs) identify relevant source documents effectively but consistently struggle to pinpoint the precise supporting spans within them. Accurate evidence span identification therefore remains the main gap to close for faithful attributed question answering.

Key takeaway

If you build factual generation or question answering systems, prioritize robust evidence span identification. LLMs find relevant documents effectively, but because they struggle with precise span attribution, your system still risks generating unfaithful claims. Techniques such as constrained decoding or post-hoc alignment can improve citation granularity and help keep output verifiably grounded.

Key insights

LLMs identify relevant source documents well but struggle to pinpoint the precise evidence spans needed for faithful inline citations.
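The gap between document-level and span-level quality can be made concrete with a small scoring sketch. The metric choices here (exact match on document ID, token-level F1 on the evidence span) and the names `token_f1` and `score_citations` are illustrative assumptions, not the metrics or code used in the FullCite evaluation.

```python
import re
from collections import Counter


def token_f1(predicted_span, gold_span):
    """Token-level F1 between a predicted and a gold evidence span."""
    pred = re.findall(r"\w+", predicted_span.lower())
    gold = re.findall(r"\w+", gold_span.lower())
    # Multiset intersection counts each shared token at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def score_citations(predictions, gold):
    """Return (document accuracy, mean span F1).

    Both arguments are lists of (doc_id, span) pairs aligned by claim.
    A span only earns credit when the cited document is also correct.
    """
    doc_hits = [p[0] == g[0] for p, g in zip(predictions, gold)]
    span_f1 = [
        token_f1(p[1], g[1]) if p[0] == g[0] else 0.0
        for p, g in zip(predictions, gold)
    ]
    n = len(gold)
    return sum(doc_hits) / n, sum(span_f1) / n


# Toy data (hypothetical): both citations name the right document,
# but the first one points at the wrong sentence inside it.
predictions = [
    ("doc1", "Aspirin inhibits cyclooxygenase enzymes."),
    ("doc2", "It is sold over the counter."),
]
gold = [
    ("doc1", "It was first synthesized in 1897."),
    ("doc2", "It is sold over the counter."),
]
doc_accuracy, mean_span_f1 = score_citations(predictions, gold)
print(doc_accuracy, mean_span_f1)
```

In this toy case document accuracy is perfect while span quality is not, which is the pattern the insight describes.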

Method

FullCite generates structured inline citations that link each claim to its source document and supporting evidence. It does so through three strategies: prompt-based generation, constrained decoding with a citation grammar, and post-hoc span alignment.
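As a rough illustration of the third strategy, the sketch below aligns a generated claim to the best-matching sentence across a set of source documents and then appends an inline citation. Everything specific here is an assumption made for the example: the Jaccard token-overlap scoring, the sentence splitter, the function names `align_span` and `cite`, and the `[doc_id: "span"]` citation format. The summary does not specify how FullCite performs alignment or what its citation syntax looks like.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def align_span(claim, documents):
    """Find the source sentence that best overlaps the claim.

    documents maps doc_id -> full text. Returns (doc_id, sentence, score),
    or (None, "", 0.0) when nothing overlaps at all.
    """
    claim_tokens = tokens(claim)
    best = (None, "", 0.0)
    for doc_id, text in documents.items():
        # Naive sentence split on terminal punctuation (assumption).
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            sent_tokens = tokens(sentence)
            union = claim_tokens | sent_tokens
            # Jaccard similarity between claim and candidate sentence.
            score = len(claim_tokens & sent_tokens) / len(union) if union else 0.0
            if score > best[2]:
                best = (doc_id, sentence, score)
    return best


def cite(claim, documents):
    """Append a hypothetical structured inline citation to the claim."""
    doc_id, span, _ = align_span(claim, documents)
    if doc_id is None:
        return claim  # leave unsupported claims uncited
    return f'{claim} [{doc_id}: "{span}"]'


documents = {
    "doc1": "Aspirin inhibits cyclooxygenase enzymes. It was first synthesized in 1897.",
    "doc2": "Ibuprofen is a propionic acid derivative. It is sold over the counter.",
}
claim = "Aspirin was first synthesized in 1897."
cited = cite(claim, documents)
print(cited)
```

Lexical overlap is only a stand-in: it fails on paraphrased claims, which is one reason precise span identification is hard. A production system would more plausibly score candidates with an entailment or embedding model.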

Best for: Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.