TraceBack: Multi-Agent Decomposition for Fine-Grained Table Attribution
Summary
TraceBack is a modular multi-agent framework that provides scalable, cell-level attribution for question answering (QA) over structured tables. Existing table QA systems often lack fine-grained attribution, which limits how far their answers can be trusted in high-stakes applications. TraceBack prunes the table to the relevant rows and columns, decomposes complex questions into semantically coherent sub-questions, and aligns each answer span with its supporting cells. This captures both the explicit evidence and the implicit evidence used in intermediate reasoning steps. To support systematic evaluation, the authors introduce CITEBench, a benchmark with phrase-to-cell annotations derived from ToTTo, FetaQA, and AITQA. They also propose FairScore, a reference-less metric that estimates attribution precision and recall by comparing atomic facts extracted from the predicted cells with atomic facts extracted from the answer, so no human cell labels are needed. In the reported experiments, TraceBack outperforms strong baselines across datasets and attribution granularities, while FairScore tracks human judgments closely and preserves the relative ranking of methods.
Key takeaway
For research scientists building or evaluating table-based QA systems, TraceBack offers a way to improve transparency and trust through fine-grained cell attribution. Consider combining multi-agent decomposition with the proposed FairScore metric: the first makes model answers easier to verify, and the second lets attribution be evaluated at scale without human cell labels. Together they address the need for verifiable grounding in high-stakes QA applications.
Key insights
TraceBack provides fine-grained, cell-level attribution for table QA using a multi-agent decomposition framework.
Principles
- Decompose complex questions into sub-questions.
- Align answer spans with supporting cells.
- Evaluate attribution without human cell labels.
Method
TraceBack prunes tables, decomposes questions, and aligns answer spans with supporting cells to capture explicit and implicit evidence for fine-grained attribution in single-table QA.
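The three steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: TraceBack uses agents for each step, whereas this toy version substitutes simple keyword and substring heuristics so that the data flow is visible. The function names (`prune_table`, `decompose_question`, `align_answer`), the table representation, and the example data are all assumptions made for illustration, and the alignment step here only finds explicit evidence, not the implicit evidence the framework also targets.

```python
import re

# Assumed representation: a table is a list of rows, each a dict of column -> cell text.
Table = list[dict[str, str]]


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def prune_table(table: Table, question: str) -> list[tuple[int, dict[str, str]]]:
    """Toy pruning: keep rows and columns that share a token with the question.

    Returns (original_row_index, pruned_row) pairs so that cell citations
    can still refer to positions in the original table.
    """
    q = tokens(question)
    kept_rows = [
        (i, row) for i, row in enumerate(table)
        if q & tokens(" ".join(row.values()))
    ]
    kept_cols = [
        col for col in (table[0] if table else {})
        if q & tokens(col) or any(q & tokens(row[col]) for _, row in kept_rows)
    ]
    return [(i, {c: row[c] for c in kept_cols}) for i, row in kept_rows]


def decompose_question(question: str) -> list[str]:
    """Toy decomposition: split a compound question on the word 'and'."""
    parts = re.split(r"\s+and\s+", question.strip().rstrip("?"))
    return [p for p in parts if p]


def align_answer(answer: str, pruned: list[tuple[int, dict[str, str]]]) -> list[tuple[int, str]]:
    """Toy alignment: cite every cell whose text appears verbatim in the answer."""
    text = answer.lower()
    return [
        (i, col)
        for i, row in pruned
        for col, value in row.items()
        if value.strip() and value.lower() in text
    ]


# Hypothetical example data, invented for this sketch.
table = [
    {"player": "Ann Lee", "team": "Hawks", "points": "31", "city": "Reno"},
    {"player": "Bo Kim", "team": "Owls", "points": "18", "city": "Ames"},
]
question = "Which team does Ann Lee play for and how many points did she score?"
answer = "Ann Lee plays for the Hawks and scored 31 points."

pruned = prune_table(table, question)
sub_questions = decompose_question(question)
cited_cells = align_answer(answer, pruned)
```

Here `cited_cells` holds (row index, column name) pairs pointing back into the original table, which is the shape of output a cell-level attribution method needs to produce.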
In practice
- Use CITEBench for systematic evaluation.
- Apply FairScore for reference-less attribution metrics.
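To make the idea of a reference-less precision and recall concrete, here is a minimal Python sketch. It assumes the atomic facts have already been extracted from the predicted cells and from the answer, and it matches facts by exact comparison of normalized strings. Both choices are simplifying assumptions: this summary does not give FairScore's extraction or matching procedure, so the function `attribution_scores` and the example facts are hypothetical and should not be read as the authors' formula.

```python
def normalize(fact: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(fact.lower().split())


def attribution_scores(cell_facts: list[str], answer_facts: list[str]) -> tuple[float, float]:
    """Return (precision, recall) over two sets of atomic facts.

    precision: share of facts from the predicted cells that the answer also states
    recall:    share of facts in the answer that the predicted cells support
    Exact string matching is an assumption of this sketch.
    """
    cells = {normalize(f) for f in cell_facts}
    answers = {normalize(f) for f in answer_facts}
    matched = cells & answers
    precision = len(matched) / len(cells) if cells else 0.0
    recall = len(matched) / len(answers) if answers else 0.0
    return precision, recall


# Hypothetical facts, invented for this sketch.
cell_facts = [
    "Ann Lee plays for Hawks",
    "Ann Lee scored 31 points",
    "Ann Lee lives in Reno",  # cited cell that the answer never uses
]
answer_facts = [
    "ann lee plays for hawks",
    "Ann Lee  scored 31 points",
]

precision, recall = attribution_scores(cell_facts, answer_facts)
```

In this example one cited cell contributes a fact the answer never uses, so precision falls below 1 while recall stays at 1. No human-labeled gold cells are involved, which is the property the metric is designed around.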
Topics
- Table Question Answering
- Fine-Grained Attribution
- Multi-Agent Systems
- Evaluation Metrics
- Benchmarking
Best for: Research Scientist, AI Researcher, AI Scientist, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.