Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Software Development & Engineering) · Depth: Expert, extended

Summary

Solvita is an agentic evolution framework that improves large language models (LLMs) at competitive programming through continual learning, without updating the weights of the underlying LLM. It reorganizes problem-solving into a closed loop of four specialized agents: Planner, Solver, Oracle, and Hacker, each paired with a trainable, graph-structured knowledge network. Outcome signals (pass/fail verdicts, test certification quality, and adversarial vulnerabilities) are recast as reinforcement learning updates to the network weights, so agents route future queries according to past successes and failures. Evaluated on CodeContests, APPS, AetherCode, and live Codeforces rounds, Solvita sets a new state of the art among code-generation agents: it outperforms existing multi-agent pipelines and nearly doubles the accuracy of single-pass baselines, reaching 82.4% pass@1 on CodeContests with a GPT-5.4 backbone.

Key takeaway

For research scientists developing advanced code generation systems, Solvita demonstrates that pairing a multi-agent framework with trainable, graph-structured knowledge networks can significantly boost performance in competitive programming. Consider adopting similar agentic evolution principles, particularly reinforcement learning from execution feedback, to enable continuous improvement on top of frozen LLMs, potentially reaching "Legendary Grandmaster"-level performance in coding challenges.

Key insights

Solvita enhances LLMs for competitive programming through a multi-agent reinforcement learning framework whose knowledge networks are updated continuously from experience.

Method

Solvita employs a closed-loop system of strategy selection (Planner), program synthesis (Solver), certified supervision (Oracle), and targeted hacking (Hacker), with each agent's knowledge network updated via reinforcement learning based on problem outcomes.
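
The closed loop described above can be illustrated with a toy sketch. This is not Solvita's implementation: the `KnowledgeNetwork` class, the softmax routing, the additive weight update, and the stubbed `solver`, `oracle`, and `hacker` functions are all hypothetical stand-ins, and only the Planner is given a knowledge network here for brevity (the source pairs every agent with one). The frozen LLM is replaced by a fixed lookup of candidate programs, so the only thing that changes between episodes is the edge weights.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class KnowledgeNetwork:
    """Hypothetical stand-in for one agent's graph-structured knowledge network.

    Each edge links a problem tag to a candidate strategy and carries a
    trainable weight; the underlying "LLM" is never modified.
    """
    weights: dict = field(default_factory=dict)  # (tag, strategy) -> weight
    lr: float = 0.5  # assumed step size for the weight update

    def route(self, tag, strategies, rng):
        # Sample a strategy with probability proportional to softmax(weight).
        logits = [self.weights.get((tag, s), 0.0) for s in strategies]
        peak = max(logits)
        probs = [math.exp(x - peak) for x in logits]
        return rng.choices(strategies, weights=probs, k=1)[0]

    def update(self, tag, strategy, reward):
        # Simple additive update: reinforce edges that led to good outcomes.
        key = (tag, strategy)
        self.weights[key] = self.weights.get(key, 0.0) + self.lr * reward


def solver(strategy):
    """Stub for the frozen LLM: returns a program for 'sum of 1..n'."""
    if strategy == "closed_form":
        return lambda n: n * (n + 1) // 2

    def brute_force(n):
        if n > 10_000:  # simulated time limit
            raise TimeoutError("simulated time limit exceeded")
        return sum(range(1, n + 1))

    return brute_force


def oracle(program):
    """Check the program against small certified tests."""
    tests = [(1, 1), (4, 10), (10, 55)]
    return all(program(n) == expected for n, expected in tests)


def hacker(program):
    """Return True if an adversarial input breaks the program."""
    n = 10**9
    try:
        return program(n) != n * (n + 1) // 2
    except TimeoutError:
        return True


def run_episodes(episodes=50, seed=0):
    rng = random.Random(seed)
    planner_net = KnowledgeNetwork()
    strategies = ["brute_force", "closed_form"]
    for _ in range(episodes):
        strategy = planner_net.route("math", strategies, rng)  # Planner
        program = solver(strategy)                             # Solver
        passed = oracle(program) and not hacker(program)       # Oracle, Hacker
        planner_net.update("math", strategy, 1.0 if passed else -1.0)
    return planner_net


if __name__ == "__main__":
    net = run_episodes()
    for edge, weight in sorted(net.weights.items()):
        print(edge, weight)
```

In this toy setup the brute-force program passes the Oracle's small tests but is broken by the Hacker's large input, so its edge only ever receives negative updates, and the Planner increasingly routes to the closed-form strategy.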

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.