Assistance to Autonomy: A Systematic Literature Review of Agentic AI across the Software Development Life Cycle

· Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Advanced, extended

Summary

A systematic literature review of agentic AI in software product development, based on 92 primary studies selected from more than 1,600 candidates, finds that output verifiability is the core enabler of adoption. The review, which covers publications from 2023 onwards, reports that later Software Development Life Cycle (SDLC) phases such as Testing & QA, Maintenance, and Deployment show the highest maturity and industrial presence because their outputs can be evaluated objectively. Earlier phases, such as Requirements Analysis and Design & Architecture, remain largely academic proofs of concept. The dominant architectural pattern is Planner–Executor–Reviewer role specialization, in which the Reviewer agent supplies executable feedback that makes the output verifiable. Industrial mitigation strategies for challenges such as non-determinism and scalability consistently confine agent actions to bounded, verifiable spaces. The study also introduces a multi-agent screening pipeline, using models such as gpt-5.4 and gpt-5.2, which achieved an estimated false-negative rate of approximately 1%.
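
The Planner–Executor–Reviewer pattern described in the summary can be sketched as a small feedback loop. This is a minimal illustration under stated assumptions, not the architecture of any reviewed system: `plan`, `execute`, `review`, and `run` are hypothetical names, and the executor is a hard-coded stub standing in for LLM code generation. The point it shows is that the Reviewer's feedback comes from actually running the candidate against checks.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    passed: bool
    feedback: str


def plan(task: str) -> list[str]:
    # Hypothetical planner: a real system would ask an LLM to decompose the task.
    return [f"implement: {task}", "run checks"]


def execute(step: str, feedback: str, attempt: int) -> Callable[[int, int], int]:
    # Hypothetical executor stub standing in for LLM code generation.
    # The first attempt is deliberately wrong so the feedback loop is exercised.
    if attempt == 0:
        return lambda a, b: a - b
    return lambda a, b: a + b


def review(candidate: Callable[[int, int], int]) -> Review:
    # The reviewer produces executable feedback by running the candidate.
    cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
    for args, expected in cases:
        got = candidate(*args)
        if got != expected:
            return Review(False, f"add{args} returned {got}, expected {expected}")
    return Review(True, "all checks passed")


def run(task: str, max_attempts: int = 3) -> tuple[int, Review]:
    steps = plan(task)
    feedback = ""
    result = Review(False, "no attempts made")
    for attempt in range(max_attempts):
        candidate = execute(steps[0], feedback, attempt)
        result = review(candidate)
        if result.passed:
            return attempt + 1, result
        feedback = result.feedback  # fed back to the executor on the next attempt
    return max_attempts, result


attempts, result = run("add two integers")
print(attempts, result.feedback)
```

In this sketch the loop stops as soon as the checks pass, which is what ties reliability to verifiable output: without a runnable check, the Reviewer would have nothing objective to report.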

Key takeaway

Software engineering teams evaluating agentic AI integration should prioritize phases such as Testing & QA or Deployment, where outputs are objectively verifiable. Design agentic systems with explicit feedback mechanisms, such as the Planner–Executor–Reviewer pattern, so that agents can self-correct and behave reliably. Confine agent actions to bounded, verifiable spaces to mitigate non-determinism and hallucinations, which in turn supports trust and industrial adoption. Consider hybrid vector-graph retrieval-augmented generation (RAG) for robust context management in complex codebases.
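
One way to read "bounded, verifiable spaces" is as an allowlist that every proposed agent action must pass before it runs. The sketch below is an assumption for illustration only: the action names, the `repo/` workspace prefix, and the `validate` helper are hypothetical and do not come from the reviewed studies.

```python
from dataclasses import dataclass

# Hypothetical allowlist: action name -> exact set of argument names it accepts.
ALLOWED_ACTIONS = {
    "run_tests": {"path"},
    "read_file": {"path"},
}
WORKSPACE_PREFIX = "repo/"  # assumed sandbox root


@dataclass(frozen=True)
class Action:
    name: str
    args: dict


def validate(action: Action) -> tuple[bool, str]:
    """Reject any proposed action that falls outside the bounded space."""
    if action.name not in ALLOWED_ACTIONS:
        return False, f"action '{action.name}' is not allowlisted"
    expected = ALLOWED_ACTIONS[action.name]
    if set(action.args) != expected:
        return False, f"expected arguments {sorted(expected)}"
    path = action.args["path"]
    if (
        not isinstance(path, str)
        or not path.startswith(WORKSPACE_PREFIX)
        or ".." in path.split("/")
    ):
        return False, "path escapes the workspace"
    return True, "ok"


print(validate(Action("run_tests", {"path": "repo/tests"})))
print(validate(Action("delete_branch", {"name": "main"})))
```

Because the check is deterministic, a non-deterministic or hallucinated proposal is rejected before execution rather than relying on the agent to police itself.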

Key insights

Output verifiability is the primary driver for agentic AI adoption and maturity in software development.

Principles

Method

The study developed a multi-agent screening pipeline for systematic literature reviews, using role-specific LLM agents (Assistant, Evaluator) for iterative dialogue, conflict resolution, and automatic metadata curation to minimize false negatives.
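
The screening loop can be sketched roughly as follows. This is a hedged illustration, not the study's pipeline: the `assistant` and `evaluator` functions are keyword-based stubs standing in for LLM agents, `screen` and `max_rounds` are hypothetical names, and resolving an unresolved conflict in favor of inclusion is an assumption chosen here because it matches the stated goal of minimizing false negatives.

```python
from typing import Callable

Verdict = str  # "include" or "exclude"
Agent = Callable[[str, list[str]], tuple[Verdict, str]]


def assistant(abstract: str, transcript: list[str]) -> tuple[Verdict, str]:
    # Stub for the Assistant role; a real agent would also read the transcript.
    text = abstract.lower()
    if "agent" in text and "software" in text:
        return "include", "mentions agents and software"
    return "exclude", "no agentic software focus"


def evaluator(abstract: str, transcript: list[str]) -> tuple[Verdict, str]:
    # Stub for the Evaluator role, using a stricter criterion.
    text = abstract.lower()
    if "agentic" in text or "multi-agent" in text:
        return "include", "explicitly agentic"
    return "exclude", "agentic framing not explicit"


def screen(abstract: str, max_rounds: int = 2) -> Verdict:
    transcript: list[str] = []
    for _ in range(max_rounds):
        a_verdict, a_reason = assistant(abstract, transcript)
        e_verdict, e_reason = evaluator(abstract, transcript)
        transcript.append(f"assistant: {a_verdict} ({a_reason})")
        transcript.append(f"evaluator: {e_verdict} ({e_reason})")
        if a_verdict == e_verdict:
            return a_verdict
    # Assumption: an unresolved conflict keeps the paper, so that doubtful
    # cases go forward to full-text review instead of being dropped.
    return "include"


print(screen("A multi-agent framework for software testing"))
print(screen("A survey of database indexing"))
print(screen("LLM agents for software repair"))
```

With deterministic stubs the extra rounds change nothing; with real LLM agents, each round would let one role respond to the other's stated reason before the fallback applies.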

In practice

Topics

Best for: AI Architect, AI Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.