Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

2026-05-18 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering · Depth: Expert, extended

Summary

Solvita is an agentic evolution framework that improves large language models (LLMs) at competitive programming through continuous learning, without updating the weights of the underlying LLM. It reorganizes problem-solving into a closed loop of four specialized agents: Planner, Solver, Oracle, and Hacker. Each agent is paired with a trainable, graph-structured knowledge network. Outcome signals (pass/fail verdicts, test certification quality, and adversarial vulnerabilities) are recast as reinforcement learning updates to these network weights, so agents route future queries according to past successes and failures. Evaluated on CodeContests, APPS, AetherCode, and live Codeforces rounds, Solvita sets a new state of the art among code-generation agents: it outperforms existing multi-agent pipelines and nearly doubles the accuracy of single-pass baselines, reaching 82.4% pass@1 on CodeContests with a GPT-5.4 backbone.

Key takeaway

For research scientists developing advanced code generation systems, Solvita demonstrates that pairing a multi-agent framework with trainable, graph-structured knowledge networks can significantly boost performance in competitive programming. Consider adopting similar agentic evolution principles, particularly reinforcement learning from execution feedback, to let a frozen LLM keep improving, potentially reaching "Legendary Grandmaster" level performance in coding challenges.

Key insights

Solvita enhances LLMs for competitive programming through a multi-agent framework whose knowledge networks are updated continuously from experience via reinforcement learning.

Principles

Continuous learning without LLM weight updates.
Specialized agents with graph-structured knowledge networks.
Reinforcement learning from outcome signals.
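
To make these principles concrete, here is a minimal sketch, assuming an agent's knowledge network can be approximated by a table of edge weights from problem tags to strategies. Routing is a softmax over those weights, and an outcome signal is applied as a REINFORCE-style step. The class name, tags, strategies, learning rate, and reward values are all hypothetical; the summary does not specify Solvita's actual update rule, and the LLM itself is never touched.

```python
import math
import random
from collections import defaultdict


class KnowledgeNetwork:
    """Toy graph of (problem_tag -> strategy) edges with trainable weights."""

    def __init__(self, strategies, lr=0.5):
        self.strategies = list(strategies)
        self.lr = lr  # assumed learning rate
        # One weight per edge; unseen edges start at 0.0 (uniform routing).
        self.weights = defaultdict(float)

    def probabilities(self, tag):
        """Softmax over the outgoing edges of `tag`."""
        logits = [self.weights[(tag, s)] for s in self.strategies]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - peak) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def route(self, tag, rng=random):
        """Sample a strategy for `tag` in proportion to its routing probability."""
        return rng.choices(self.strategies, weights=self.probabilities(tag), k=1)[0]

    def update(self, tag, chosen, reward):
        """Policy-gradient step: d log pi(chosen) / d w_s = 1[s == chosen] - pi(s)."""
        probs = self.probabilities(tag)
        for s, p in zip(self.strategies, probs):
            grad = (1.0 if s == chosen else 0.0) - p
            self.weights[(tag, s)] += self.lr * reward * grad


# Hypothetical outcome signals: +1 for an accepted verdict, -1 for a failure.
net = KnowledgeNetwork(["greedy", "dp", "binary_search"])
net.update("knapsack", "dp", reward=1.0)
net.update("knapsack", "greedy", reward=-1.0)
probs = dict(zip(net.strategies, net.probabilities("knapsack")))
```

After the two updates, routing for the "knapsack" tag favors "dp" and disfavors "greedy", while the frozen model that would generate the code is unchanged. Only the small weight table learns.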

Method

Solvita employs a closed-loop system of strategy selection (Planner), program synthesis (Solver), certified supervision (Oracle), and targeted hacking (Hacker), with each agent's knowledge network updated via reinforcement learning based on problem outcomes.
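
The loop described above can be sketched as follows. This is an illustrative toy under stated assumptions, not Solvita's implementation: the four agents are plain functions, the "LLM-generated" programs are two hard-coded lambdas for the task "return n squared", and the Hacker compares against a slow trusted reference. All names here are hypothetical, and the reinforcement learning update to each agent's knowledge network is omitted.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Program = Callable[[int], int]
Test = Tuple[int, int]  # (input, expected output)

# Hypothetical candidate programs standing in for LLM-generated code.
CANDIDATES = {
    "double": lambda n: 2 * n,  # wrong, but agrees with n*n at n=0 and n=2
    "square": lambda n: n * n,  # correct
}


@dataclass
class LoopResult:
    solved: bool
    rounds: int
    strategy: Optional[str]
    log: List[str] = field(default_factory=list)


def planner(tried: List[str]) -> str:
    """Strategy selection: pick the first strategy that has not failed yet."""
    untried = [s for s in CANDIDATES if s not in tried]
    return untried[0] if untried else tried[-1]


def solver(strategy: str) -> Program:
    """Program synthesis: here just a lookup instead of an LLM call."""
    return CANDIDATES[strategy]


def oracle() -> List[Test]:
    """Certified supervision: a small set of trusted tests."""
    return [(0, 0), (2, 4)]


def hacker(program: Program) -> Optional[Test]:
    """Targeted hacking: search small inputs for a counterexample."""
    for n in range(6):
        expected = sum(n for _ in range(n))  # slow but trusted reference
        if program(n) != expected:
            return (n, expected)
    return None


def closed_loop(max_rounds: int = 3) -> LoopResult:
    tests = oracle()
    tried: List[str] = []
    log: List[str] = []
    for round_no in range(1, max_rounds + 1):
        strategy = planner(tried)
        program = solver(strategy)
        failing = next((t for t in tests if program(t[0]) != t[1]), None)
        if failing is None:
            failing = hacker(program)
            if failing is not None:
                tests.append(failing)  # successful hacks become regression tests
        if failing is None:
            return LoopResult(True, round_no, strategy, log)
        # A full system would also turn this outcome into a network update.
        tried.append(strategy)
        log.append(f"round {round_no}: {strategy} failed on input {failing[0]}")
    return LoopResult(False, max_rounds, None, log)


result = closed_loop()
```

In this toy run, the "double" candidate passes the Oracle's tests and is only rejected once the Hacker finds the input 1, which illustrates why the adversarial stage is a separate step from certified testing.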

In practice

Implement patch-based repair for efficient code debugging.
Use adversarial testing to expose subtle code bugs.
Structure agent memory as a learned routing mechanism.
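
As a rough illustration of the first practice, patch-based repair can be sketched as applying a small search/replace edit to a failing program and re-running the tests, rather than regenerating the whole solution. The patch format, the helper names, and the buggy example are assumptions made for this sketch; the summary does not describe Solvita's actual patch representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    search: str   # exact text to find in the source
    replace: str  # text to put in its place


def apply_patch(source: str, patch: Patch) -> str:
    """Apply a search/replace patch; require exactly one match so the edit is unambiguous."""
    count = source.count(patch.search)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(patch.search, patch.replace)


def passes(source: str, tests, entry: str = "solve") -> bool:
    """Execute the source and check every (args, expected) pair.

    Uses exec on the source string, which is only acceptable for trusted toy code.
    """
    namespace = {}
    exec(source, namespace)
    fn = namespace[entry]
    return all(fn(*args) == expected for args, expected in tests)


# Hypothetical buggy submission: max of a list, wrong when all values are negative.
BUGGY = '''
def solve(values):
    best = 0
    for v in values:
        best = max(best, v)
    return best
'''

TESTS = [(([3, 1, 2],), 3), (([-5, -2],), -2)]

# A one-line patch targets the faulty initialization and leaves the rest untouched.
PATCH = Patch(search="best = 0", replace="best = values[0]")
fixed = apply_patch(BUGGY, PATCH)
```

The original program fails the all-negative test, and the patched one passes both. Requiring exactly one match is a design choice for this sketch: it makes an ambiguous or stale patch fail loudly instead of silently editing the wrong line.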

Topics

Solvita
Agentic Evolution Framework
Competitive Programming
Graph-Structured Knowledge Networks
Reinforcement Learning

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.