Enhancing Factuality through Consensus and Consistency in Summarization Using Minimum Bayes Risk Decoding
Summary
ConSUM is a proposed system for improving the factuality of model-generated summaries by addressing a limitation of existing reranking methods. Whereas those methods use only the source document for guidance, ConSUM reranks candidate summaries on two factors: consistency with the source document and consensus among the other generated candidates. Consensus is established with Minimum Bayes Risk (MBR) decoding over the set of generated summaries, and consistency is measured with factuality-aware metrics that compare each summary against its source. The reported experiments indicate that ConSUM is competitive with current methods, and human evaluation indicates that its summaries are preferred. The system's code is publicly available at https://github.com/naist-nlp/ConSUM.
Key takeaway
For NLP engineers working on summary factuality, ConSUM offers a reranking strategy that needs no change to the underlying summarization model. Consider combining Minimum Bayes Risk (MBR) decoding with factuality-aware consistency checks in your summarization pipeline, so that candidates are judged against each other as well as against the source. The approach is reported as competitive with current methods and preferred in human evaluation. The released code is a starting point for adapting the method to your own applications.
Key insights
ConSUM enhances summary factuality by combining source consistency with candidate consensus via MBR reranking.
Principles
- Factuality in summarization benefits from multi-candidate evaluation.
- Reranking can integrate source consistency and candidate consensus.
- MBR decoding is effective for establishing summary consensus.
Method
ConSUM reranks candidate summaries by applying Minimum Bayes Risk (MBR) decoding for consensus and factuality-aware metrics for source consistency.
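One way to write such a combined reranking objective is sketched below. The first term is the standard sample-based MBR expected utility over a candidate set. The linear interpolation and the symbols $u$ (pairwise utility between summaries), $f$ (factuality-aware consistency metric), and $\lambda$ (mixing weight) are assumptions made here for illustration; this summary does not state how ConSUM combines the two signals.

```latex
% x: source document, \mathcal{C}: set of candidate summaries generated for x
% u, f and \lambda are illustrative assumptions, not taken from the paper
\hat{y} = \operatorname*{arg\,max}_{y \in \mathcal{C}}
  \left[
    \lambda \cdot \frac{1}{|\mathcal{C}|} \sum_{y' \in \mathcal{C}} u(y, y')
    + (1 - \lambda) \cdot f(x, y)
  \right],
  \qquad \lambda \in [0, 1]
```

With $\lambda = 1$ this reduces to plain MBR decoding (consensus only), and with $\lambda = 0$ it reduces to source-only reranking.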
In practice
- Implement MBR decoding for summary candidate selection.
- Use factuality-aware metrics to check source consistency.
- Explore multi-candidate reranking for improved summary quality.
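The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the ConSUM implementation: `token_f1` and `source_support` are deliberately simple, hypothetical stand-ins for a real MBR utility and a real factuality-aware metric, and the `weight` parameter that mixes the two scores is an assumption.

```python
from collections import Counter
from typing import Callable, List


def token_f1(hyp: str, ref: str) -> float:
    """Unigram F1 between two strings (toy stand-in for an MBR utility)."""
    hyp_tokens, ref_tokens = hyp.lower().split(), ref.lower().split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def source_support(source: str, summary: str) -> float:
    """Fraction of summary tokens that occur in the source (toy consistency proxy)."""
    source_vocab = set(source.lower().split())
    tokens = summary.lower().split()
    if not tokens:
        return 0.0
    return sum(t in source_vocab for t in tokens) / len(tokens)


def consensus_scores(
    candidates: List[str], utility: Callable[[str, str], float] = token_f1
) -> List[float]:
    """Average utility of each candidate against every candidate in the set."""
    if not candidates:
        return []
    return [
        sum(utility(hyp, other) for other in candidates) / len(candidates)
        for hyp in candidates
    ]


def rerank(
    source: str,
    candidates: List[str],
    weight: float = 0.5,  # assumed mixing weight: 1.0 = consensus only
    utility: Callable[[str, str], float] = token_f1,
    consistency: Callable[[str, str], float] = source_support,
) -> List[str]:
    """Sort candidates by weight * consensus + (1 - weight) * consistency."""
    consensus = consensus_scores(candidates, utility)
    scored = [
        (weight * c + (1 - weight) * consistency(source, cand), cand)
        for c, cand in zip(consensus, candidates)
    ]
    return [cand for _, cand in sorted(scored, key=lambda p: p[0], reverse=True)]


source = "the cat sat on the mat in the kitchen"
candidates = [
    "the dog slept on the sofa",
    "a cat sat on the mat",
    "the cat sat on the mat",
]
ranked = rerank(source, candidates)
print(ranked[0])  # the cat sat on the mat
```

In a real pipeline the candidates would come from sampling or beam search over a summarization model, and both callables would be replaced by learned metrics; keeping them as parameters makes that swap a one-line change.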
Topics
- Text Summarization
- Factuality
- Minimum Bayes Risk
- Reranking
- Natural Language Processing
- ConSUM
Best for: Research Scientist, AI Scientist, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.