DeepSlide: From Artifacts to Presentation Delivery
Summary
DeepSlide is a human-in-the-loop multi-agent system designed to optimize the entire presentation delivery process rather than only generating visually plausible slide decks. Developed by researchers at Fudan University, it addresses key gaps in existing AI slide generators by focusing on narrative strategy, delivery-time attention guidance, and rehearsal support. It integrates a controllable logical-chain planner with per-node time budgets, a lightweight content-tree retriever for grounding, Markov-style sequential rendering with style inheritance, and sandboxed execution for renderability. The system also introduces a dual-scoreboard benchmark to evaluate both static artifact quality and dynamic delivery quality. Across 20 domains and diverse audience profiles, DeepSlide matches strong baselines in artifact quality while significantly improving delivery metrics such as narrative flow, pacing precision, and slide–script synergy.
Key takeaway
For research scientists preparing presentations, DeepSlide offers a comprehensive approach to enhance delivery quality beyond mere slide creation. You should consider adopting systems that provide granular control over narrative flow, offer content-aware attention guidance during delivery, and include robust rehearsal support. Prioritize tools that allow for time-budgeted planning and offer feedback on pacing and audience engagement, as these features are critical for transforming a good deck into an effective talk.
Key insights
DeepSlide is a multi-agent system that optimizes presentation delivery, not just slide generation, through narrative planning and rehearsal support.
Principles
- Presentation quality hinges on delivery, not just visual appeal.
- Narrative strategy is a controllable design space.
- Content-aware attention guidance improves audience engagement.
Method
DeepSlide employs a four-stage pipeline: requirement elicitation, logical chain editing with evidence-grounded generation, interactive slide refinement with attention augmentation, and rehearsal support with dual-scoreboard evaluation.
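To make the idea of a logical chain with per-node time budgets concrete, here is a minimal sketch. It is not DeepSlide's implementation: the `ChainNode` class, its field names, and the proportional rescaling in `fit_to_duration` are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ChainNode:
    """One step of the narrative chain (hypothetical structure)."""
    title: str
    role: str        # e.g. "motivation", "method", "result"
    minutes: float   # time budget assigned to this node


def fit_to_duration(chain: list[ChainNode], total_minutes: float) -> list[ChainNode]:
    """Rescale every node's budget proportionally so the chain fills the talk slot."""
    planned = sum(node.minutes for node in chain)
    if planned <= 0:
        raise ValueError("chain needs a positive total budget")
    scale = total_minutes / planned
    return [ChainNode(n.title, n.role, round(n.minutes * scale, 2)) for n in chain]


# A 12-minute draft plan squeezed into a 10-minute slot.
chain = [
    ChainNode("Why delivery matters", "motivation", 3),
    ChainNode("Pipeline overview", "method", 6),
    ChainNode("Benchmark results", "result", 3),
]
fitted = fit_to_duration(chain, total_minutes=10)
for node in fitted:
    print(f"{node.minutes:>5.2f} min  [{node.role}] {node.title}")
```

Proportional rescaling is only one possible policy. A human-in-the-loop editor could just as well pin some nodes to fixed budgets and rescale only the rest.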
In practice
- Use time-budgeted narrative logical chains for planning.
- Implement content-tree indexing for targeted evidence retrieval.
- Augment slides with dynamic attention controls like image focus.
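The content-tree indexing mentioned in the list above can be sketched as follows. This is an illustrative assumption, not the paper's retriever: the `TreeNode` class, the `retrieve` function, and the word-overlap scoring are hypothetical stand-ins for whatever ranking DeepSlide actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TreeNode:
    """A section of the source document (hypothetical structure)."""
    heading: str
    text: str = ""
    children: list[TreeNode] = field(default_factory=list)


def tokens(s: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(".,;:()").lower() for w in s.split() if w.strip(".,;:()")}


def retrieve(root: TreeNode, query: str, top_k: int = 2) -> list[tuple[int, str]]:
    """Walk the tree and return (score, path) for the best-matching sections.

    The score is plain word overlap between the query and a node's heading
    plus text; a real system would likely use a stronger ranker.
    """
    q = tokens(query)
    scored = []
    stack = [(root, root.heading)]
    while stack:
        node, path = stack.pop()
        score = len(q & tokens(node.heading + " " + node.text))
        if score > 0:
            scored.append((score, path))
        for child in node.children:
            stack.append((child, path + " > " + child.heading))
    # Highest score first; ties broken alphabetically by path.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top_k]


paper = TreeNode("Paper", children=[
    TreeNode("Method", "four-stage pipeline with a logical chain planner"),
    TreeNode("Experiments", "dual-scoreboard benchmark across 20 domains"),
])
hits = retrieve(paper, "benchmark domains", top_k=1)
print(hits)
```

Returning the path alongside the score keeps each retrieved passage traceable to its place in the source document, which is the point of grounding slide content in evidence.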
Topics
- Presentation Delivery System
- Multi-Agent AI
- Human-in-the-Loop
- Dual-Scoreboard Benchmark
- Content-Tree Retrieval
Best for: Research Scientist, AI Scientist, AI Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.