Agentic Discovery of Neural Architectures: AIRA-Compose and AIRA-Design

2026-05-18 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

Meta's FAIR team introduces AIRA-Compose and AIRA-Design, two frameworks that use LLM agents to discover and optimize neural architectures autonomously, with recursive self-improvement as the longer-term aim. AIRA-Compose employs 11 agents to search a combinatorial space of computational primitives (Attention, MLP, Mamba) within a 24-hour compute budget, yielding 14 novel architectures. These include AIRAformers (Transformer-based) and AIRAhybrids (Transformer-Mamba-based), which at 1B parameters outperform Llama 3.2 and Composer-found alternatives by up to 3.8% accuracy on downstream tasks. AIRA-Compose also identifies architectures such as AIRAformer-C and AIRAhybrid-C that are reported to scale 54-71% and 23-37% faster, respectively. AIRA-Design tasks up to 20 agents with writing novel attention mechanisms and optimizing training scripts. On the Long Range Arena (LRA) benchmark, agent-designed architectures reach accuracy within 2.3-2.6% of the human state of the art, and on the Autoresearch benchmark, Greedy Opus 4.5 surpasses the published minimum reference with a validation bits-per-byte of 0.968.
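
For readers less familiar with the last metric: bits-per-byte is conventionally obtained by converting the mean per-token cross-entropy to bits and normalizing by the byte length of the evaluated text, which makes scores comparable across tokenizers. The formula below is the standard definition, not one quoted from the paper, and the symbols are introduced here only for illustration.

```latex
% Assumed notation (not from the source):
%   \mathcal{L}  mean cross-entropy per token, in nats
%   N_tok        number of tokens in the validation text
%   N_byte       number of UTF-8 bytes in the same text
\mathrm{bpb} \;=\; \frac{\mathcal{L} \cdot N_{\mathrm{tok}}}{\ln 2 \cdot N_{\mathrm{byte}}}
```

Lower values are better, so "surpassing" a reference here means reaching a lower bits-per-byte.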

Key takeaway

For research scientists working on next-generation foundation model design, these agentic frameworks are worth evaluating. Consider integrating LLM-powered agents into your architecture search and optimization workflows to look for novel hybrid models and more efficient scaling, which could in turn support recursive self-improvement efforts. The findings suggest that agent-driven methods can yield designs that rival, and in some reported cases surpass, human-designed baselines.

Key insights

LLM agents can autonomously discover and optimize neural architectures and training methods, in some reported cases matching or surpassing human-designed baselines.

Principles

Agent-driven search navigates vast combinatorial design spaces efficiently.
Hybrid architectures combining Attention, MLP, and SSM (Mamba) blocks can outperform established baselines in the reported experiments.
Iterative refinement is crucial for low-level code generation tasks.

Method

A dual-framework approach: AIRA-Compose for high-level architecture search using predefined primitives, and AIRA-Design for low-level mechanistic implementation and training script optimization.
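
To make the high-level search space concrete, here is a minimal sketch of composing architectures from the three primitives named in the summary. Everything beyond the primitive names is an assumption for illustration: the fixed-depth stack encoding, the `toy_score` heuristic, and the random search are hypothetical stand-ins for the paper's agents and for real training and evaluation, not the authors' method.

```python
import itertools
import random

# Primitive names come from the summary; the stack encoding is an assumption.
PRIMITIVES = ("attention", "mlp", "mamba")


def enumerate_architectures(depth):
    """Yield every ordered stack of `depth` primitives."""
    return itertools.product(PRIMITIVES, repeat=depth)


def search_space_size(depth):
    """Number of distinct stacks: grows as 3 ** depth."""
    return len(PRIMITIVES) ** depth


def toy_score(arch):
    """Hypothetical proxy score standing in for train-and-evaluate.

    Rewards using several primitive types and alternating between them.
    A real system would train each candidate and measure accuracy.
    """
    diversity = len(set(arch))
    alternations = sum(a != b for a, b in zip(arch, arch[1:]))
    return diversity + 0.1 * alternations


def random_search(depth, budget, seed=0):
    """Sample `budget` stacks and keep the best under `toy_score`."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(budget):
        arch = tuple(rng.choice(PRIMITIVES) for _ in range(depth))
        score = toy_score(arch)
        if score > best_score:
            best, best_score = arch, score
    return best, best_score


if __name__ == "__main__":
    print("stacks at depth 12:", search_space_size(12))
    print("best sampled stack:", random_search(depth=12, budget=200))
```

Even at modest depth the space is far too large to enumerate under a fixed compute budget, which is the motivation for a guided search rather than exhaustive evaluation.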

In practice

Explore hybrid Transformer-Mamba architectures for improved scaling.
Use agentic frameworks to automate training-script optimization and hyperparameter tuning.
Implement iterative debugging for complex code generation tasks.
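
The iterative-debugging advice above can be sketched as a generate, run, and feed-back loop. This is a minimal illustration under stated assumptions, not the paper's pipeline: `scripted_proposer` is a hypothetical stub that replays fixed candidates in place of an LLM call, and the candidate is expected to define a function named `f`.

```python
import traceback


def run_candidate(source, test_input, expected):
    """Execute candidate source that should define `f`; return (ok, feedback)."""
    namespace = {}
    try:
        exec(source, namespace)  # unsafe outside a sandbox; fine for a sketch
        result = namespace["f"](test_input)
    except Exception:
        # Syntax and runtime errors both become textual feedback.
        return False, traceback.format_exc(limit=1)
    if result != expected:
        return False, f"expected {expected!r}, got {result!r}"
    return True, "ok"


def refine(propose, test_input, expected, max_rounds=5):
    """Ask `propose` for code, test it, and pass the feedback to the next round."""
    feedback = None
    history = []
    for round_idx in range(max_rounds):
        source = propose(feedback, round_idx)
        ok, feedback = run_candidate(source, test_input, expected)
        history.append((round_idx, ok, feedback))
        if ok:
            return source, history
    return None, history


# Hypothetical stand-in for an LLM: replays a fixed sequence of attempts.
CANDIDATES = [
    "def f(x):\n    return x +",      # syntax error
    "def f(x):\n    return x * 2",    # runs, wrong answer
    "def f(x):\n    return x ** 2",   # correct
]


def scripted_proposer(feedback, round_idx):
    """Ignore feedback and return the next scripted attempt."""
    return CANDIDATES[min(round_idx, len(CANDIDATES) - 1)]


if __name__ == "__main__":
    source, history = refine(scripted_proposer, test_input=3, expected=9)
    for round_idx, ok, feedback in history:
        print(round_idx, ok, feedback.strip().splitlines()[-1])
```

In a real setup the proposer would condition on `feedback`, and candidates would run in an isolated process with time and memory limits rather than through `exec`.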

Topics

Agentic AI Research
Neural Architecture Search
Foundation Models
Hybrid LLM Architectures
Recursive Self-Improvement

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.