EGTR-Review: Efficient Evidence-Grounded Scientific Peer Review Generation via Multi-Agent Teacher Distillation
Summary
EGTR-Review is a framework for efficient, evidence-grounded, and traceable scientific peer review generation. It targets three weaknesses of existing Large Language Model (LLM)-based methods: generic comments, weak source traceability, and high inference costs. The framework first constructs a multi-agent teacher that performs structure-aware paper decomposition, key-element extraction, external scholarly evidence retrieval, evidence-state labeling, verification reasoning, and review synthesis. The teacher's intermediate reasoning trajectories and final review comments are then distilled into a lightweight student model using task-prefix-driven multi-task learning and an evidence-weighted objective. In experiments on public peer-review datasets, EGTR-Review (Student) surpasses strong prompt-based, fine-tuned, and structured/agentic baselines across automatic metrics, LLM-as-Judge evaluation, and human evaluation, while maintaining strong factual grounding and source traceability with substantially lower token consumption and inference time.
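The summary does not say how the evidence-weighted objective is defined. As a rough illustration only, one plausible reading is a weighted negative log-likelihood in which tokens from comments backed by stronger evidence count more. The state names, the weights, and the function `evidence_weighted_loss` below are assumptions made for this sketch, not details from the paper.

```python
import math

# Hypothetical evidence states and weights (assumption: the source does not
# specify the label set or how each state is weighted).
EVIDENCE_WEIGHTS = {"supported": 1.0, "contradicted": 1.0, "unverified": 0.5}


def evidence_weighted_loss(token_probs, evidence_states, weights=EVIDENCE_WEIGHTS):
    """Weighted average negative log-likelihood over target tokens.

    token_probs[i]     -- probability the student assigns to the i-th target token
    evidence_states[i] -- evidence state of the comment that token belongs to
    """
    if len(token_probs) != len(evidence_states):
        raise ValueError("token_probs and evidence_states must have equal length")

    total = 0.0  # sum of weight * negative log-likelihood
    norm = 0.0   # sum of weights, used for normalization
    for prob, state in zip(token_probs, evidence_states):
        weight = weights[state]
        total += weight * -math.log(prob)
        norm += weight
    return total / norm if norm > 0 else 0.0
```

With these illustrative weights, a poorly predicted token in an "unverified" comment contributes half as much to the loss as the same token in a "supported" comment. In a real training loop the same weighting would be applied to per-token cross-entropy from a deep learning framework rather than to plain Python floats.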
Key takeaway
For AI Scientists and Machine Learning Engineers developing automated peer review systems, EGTR-Review offers a path to better review quality at lower cost. Consider adopting its multi-agent teacher distillation approach to generate evidence-grounded and traceable feedback. The distilled student reduces token consumption and inference time while maintaining strong factual grounding, allowing your systems to provide more specific and useful auxiliary feedback to human reviewers.
Key insights
EGTR-Review distills a multi-agent teacher's evidence-grounded reasoning into an efficient student model for scientific peer review.
Principles
- Decompose papers into locatable units.
- Ground reviews in external scholarly evidence.
- Distill reasoning trajectories, not just outputs.
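The first principle, decomposing a paper into locatable units, can be sketched as follows. This is a minimal illustration that assumes the paper is already available as (section title, section text) pairs with paragraphs separated by blank lines. The `Unit` class, the `decompose` function, and the `S<section>.P<paragraph>` identifier scheme are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One locatable unit of a paper (hypothetical structure)."""
    unit_id: str   # e.g. "S2.P1" = section 2, paragraph 1
    section: str   # title of the section the paragraph came from
    text: str      # paragraph text


def decompose(sections):
    """Split (title, body) pairs into paragraph-level units with stable IDs."""
    units = []
    for s_idx, (title, body) in enumerate(sections, start=1):
        # Assumption: paragraphs are separated by blank lines.
        paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
        for p_idx, paragraph in enumerate(paragraphs, start=1):
            units.append(Unit(f"S{s_idx}.P{p_idx}", title, paragraph))
    return units
```

A review comment can then carry the `unit_id` of the passage it refers to, which is one simple way to make feedback traceable back to the source text.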
Method
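A multi-agent teacher performs paper decomposition, key-element extraction, evidence retrieval, evidence-state labeling, verification, and review synthesis. Its intermediate reasoning trajectories and final review comments are then distilled into a lightweight student via task-prefix-driven multi-task learning and an evidence-weighted objective.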
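Task-prefix-driven multi-task learning generally means that one student model is trained on several sub-tasks, each marked by a prefix on the input. The sketch below shows how a teacher trajectory might be turned into such training pairs. The prefix strings, the trajectory keys, and `build_training_examples` are assumptions for illustration, since the summary does not list the actual prefixes or data format.

```python
# Hypothetical task prefixes, one per teacher stage that is distilled
# (assumption: the source does not give the real prefix strings).
TASK_PREFIXES = {
    "extract": "[extract]",  # key-element extraction
    "label": "[label]",      # evidence-state labeling
    "verify": "[verify]",    # verification reasoning
    "review": "[review]",    # final review synthesis
}


def build_training_examples(paper_text, trajectory):
    """Turn one teacher trajectory into prefix-tagged (input, target) pairs.

    trajectory maps a task name to the teacher's output for that stage.
    Stages the teacher did not produce are skipped.
    """
    examples = []
    for task, prefix in TASK_PREFIXES.items():
        target = trajectory.get(task)
        if not target:
            continue
        examples.append(
            {"task": task, "input": f"{prefix} {paper_text}", "target": target}
        )
    return examples
```

Because every example shares the same input text and differs only in prefix and target, the student sees both the intermediate steps and the final review during training, while at inference time only the review prefix needs to be used.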
In practice
- Integrate external scholarly evidence for grounding.
- Segment long papers for traceability.
- Apply multi-task learning for process distillation.
Topics
- Scientific Peer Review
- Multi-Agent Systems
- Knowledge Distillation
- Large Language Models
- Evidence-Grounded Generation
- Natural Language Generation
Best for: AI Scientist, Machine Learning Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.