Assistance to Autonomy: A Systematic Literature Review of Agentic AI across the Software Development Life Cycle
Summary
A systematic literature review of agentic AI in software product development, based on 92 primary studies selected from more than 1,600 candidates, identifies output verifiability as the core enabler of adoption. The review, which covers publications from 2023 onwards, finds that later Software Development Life Cycle (SDLC) phases such as Testing & QA, Maintenance, and Deployment show the highest maturity and industrial presence because their outputs can be evaluated objectively. Earlier phases, such as Requirements Analysis and Design & Architecture, remain largely academic proofs of concept. The dominant architectural pattern is Planner–Executor–Reviewer role specialization, in which the Reviewer agent supplies executable feedback that makes outputs verifiable. Industrial mitigation strategies for challenges such as non-determinism and scalability consistently confine agent actions to bounded, verifiable spaces. The study also introduces a multi-agent screening pipeline, using models such as gpt-5.4 and gpt-5.2, which achieved an estimated false-negative rate of approximately 1%.
Key takeaway
Software engineering teams evaluating agentic AI integration should prioritize phases such as Testing & QA or Deployment, where outputs are objectively verifiable. Design agentic systems with explicit feedback mechanisms, such as the Planner–Executor–Reviewer pattern, so that agents can self-correct and behave reliably. Confine agent actions to bounded, verifiable spaces to mitigate non-determinism and hallucinations, which in turn supports trust and industrial adoption. Consider hybrid vector-graph RAG for robust context management in complex codebases.
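One possible reading of "bounded, verifiable spaces" is an allowlist of actions whose arguments are validated before anything runs. The sketch below is illustrative only and is not taken from the reviewed studies: the action names, `ALLOWED_ACTIONS`, and `dispatch` are hypothetical, and the handlers are placeholders for real tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Action:
    """An action requested by an agent (hypothetical structure)."""
    name: str
    argument: str


def run_tests(target: str) -> str:
    # Placeholder: a real system would invoke a test runner here.
    return f"ran tests in {target}"


def read_file(path: str) -> str:
    # Placeholder: a real system would read from a sandboxed workspace.
    return f"read {path}"


# The bounded action space: anything not listed here is rejected.
ALLOWED_ACTIONS: Dict[str, Callable[[str], str]] = {
    "run_tests": run_tests,
    "read_file": read_file,
}


def dispatch(action: Action) -> str:
    """Execute an action only if it is allowlisted and stays in the workspace."""
    handler = ALLOWED_ACTIONS.get(action.name)
    if handler is None:
        raise PermissionError(f"action not permitted: {action.name}")
    # Assumed policy: reject absolute paths and parent-directory traversal.
    if action.argument.startswith("/") or ".." in action.argument:
        raise PermissionError(f"argument outside workspace: {action.argument}")
    return handler(action.argument)
```

With this shape, a request such as `Action("delete_repo", ".")` fails with `PermissionError` instead of reaching any tool, so every action that does run is one the team has chosen to allow and can check.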
Key insights
Output verifiability is the primary driver for agentic AI adoption and maturity in software development.
Principles
- Agentic AI adoption thrives where outputs are objectively verifiable.
- Decompose complex tasks into specialized Planner–Executor–Reviewer roles.
- Confine agent actions to bounded, verifiable spaces for reliability.
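The Planner–Executor–Reviewer decomposition can be sketched as a small loop in which the Reviewer runs the candidate output and returns executable feedback. This is a minimal sketch under stated assumptions, not the architecture of any specific reviewed system: `planner`, `executor`, `reviewer`, and `run` are hypothetical names, and the executor is a stub standing in for an LLM call.

```python
from typing import List, Optional, Tuple


def planner(task: str) -> List[str]:
    # Hypothetical planner: a real one would break the task into several steps.
    return [f"implement: {task}"]


def executor(step: str, feedback: Optional[str]) -> str:
    # Stub for an LLM call: the first attempt is buggy, the retry is fixed.
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def reviewer(candidate: str) -> Tuple[bool, str]:
    # Executable feedback: run the candidate against a concrete check.
    # exec() is for illustration only; untrusted code needs a sandbox.
    namespace: dict = {}
    try:
        exec(candidate, namespace)
        result = namespace["add"](2, 3)
    except Exception as exc:
        return False, f"candidate raised {exc!r}"
    if result != 5:
        return False, f"add(2, 3) returned {result}, expected 5"
    return True, "all checks passed"


def run(task: str, max_rounds: int = 3) -> List[Tuple[str, int]]:
    """Return (accepted candidate, rounds used) for each planned step."""
    accepted = []
    for step in planner(task):
        feedback: Optional[str] = None
        for round_number in range(1, max_rounds + 1):
            candidate = executor(step, feedback)
            ok, feedback = reviewer(candidate)
            if ok:
                accepted.append((candidate, round_number))
                break
        else:
            raise RuntimeError(f"no verified output for step: {step}")
    return accepted
```

In this toy run, the first candidate fails the Reviewer's check, the failure message is passed back to the Executor, and the second candidate is accepted. The loop is bounded by `max_rounds`, so an unverifiable step surfaces as an error rather than as silently accepted output.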
Method
The study developed a multi-agent screening pipeline for systematic literature reviews. Role-specific LLM agents (Assistant and Evaluator) screen candidate papers through iterative dialogue, resolve conflicts between their verdicts, and curate metadata automatically, with the goal of minimizing false negatives.
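The Assistant and Evaluator dialogue might look roughly like the loop below. This is a sketch under assumptions, not the study's implementation: the keyword rules stand in for LLM calls, the names `Verdict`, `assistant`, `evaluator`, and `screen` are hypothetical, and the policy of keeping unresolved papers for manual review is an assumed way of biasing against false negatives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    include: bool
    rationale: str


def assistant(abstract: str, challenge: Optional[Verdict]) -> Verdict:
    # Stub for an LLM screening call (keyword rule is an assumption).
    if challenge is not None and challenge.include:
        # The assistant accepts the evaluator's objection on the next round.
        return Verdict(True, "revised after evaluator objection")
    text = abstract.lower()
    include = "agent" in text and "software" in text
    return Verdict(include, "keyword match" if include else "no keyword match")


def evaluator(abstract: str, proposal: Verdict) -> Verdict:
    # Stub for a second, independent LLM call using a broader criterion.
    text = abstract.lower()
    include = "agent" in text or "autonomous" in text
    return Verdict(include, "broad criterion met" if include else "out of scope")


def screen(abstract: str, max_rounds: int = 2) -> Verdict:
    """Iterate until both agents agree, or fall back to inclusion."""
    challenge: Optional[Verdict] = None
    for _ in range(max_rounds):
        proposal = assistant(abstract, challenge)
        review = evaluator(abstract, proposal)
        if proposal.include == review.include:
            return proposal
        challenge = review  # feed the disagreement into the next round
    # Assumed tie-break: keep the paper so that a human can decide.
    return Verdict(True, "unresolved conflict; kept for manual review")
```

Defaulting to inclusion on an unresolved conflict trades extra manual screening for a lower chance of discarding a relevant study, which matches the stated goal of minimizing false negatives.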
In practice
- Prioritize agentic AI integration in SDLC phases with clear, executable feedback.
- Implement Planner–Executor–Reviewer architectures for multi-agent systems.
- Use hybrid vector-graph RAG for context-aware agent reasoning over codebases.
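Hybrid vector-graph retrieval can be illustrated with a toy example: rank code chunks by vector similarity, then expand the hits along a dependency graph. This is a sketch under assumptions, not a method from the reviewed studies: the bag-of-words `embed` function stands in for a real embedding model, and `CHUNKS`, `CALLS`, and `retrieve` are hypothetical.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # Toy embedding: word counts. A real system would use a learned model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical code chunks, keyed by symbol name, with short descriptions.
CHUNKS = {
    "auth.login": "validate user password and create session token",
    "auth.hash_password": "hash password with salt",
    "billing.charge": "charge customer credit card invoice",
}

# Hypothetical call graph: caller -> callees.
CALLS = {
    "auth.login": {"auth.hash_password"},
    "auth.hash_password": set(),
    "billing.charge": set(),
}


def retrieve(query: str, top_k: int = 1) -> List[str]:
    """Vector search for seed chunks, then add their direct graph neighbors."""
    query_vector = embed(query)
    ranked = sorted(
        CHUNKS,
        key=lambda name: cosine(query_vector, embed(CHUNKS[name])),
        reverse=True,
    )
    context = ranked[:top_k]
    for name in list(context):
        for neighbor in sorted(CALLS.get(name, ())):
            if neighbor not in context:
                context.append(neighbor)
    return context
```

For the query "create session token for user", vector similarity alone selects `auth.login`; the graph step then adds `auth.hash_password`, which shares no words with the query but is needed to understand the login path.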
Topics
- Agentic AI
- Software Development Life Cycle
- Multi-agent Systems
- Output Verifiability
- Systematic Literature Review
- LLM Agents
Best for: AI Architect, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Software Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.