OpenAI's GPT-5.6 Sol launches to rival Claude Mythos under government access rules it calls unsustainable
Summary
OpenAI has launched its GPT-5.6 generation: the flagship Sol plus the cheaper Terra and Luna tiers. OpenAI claims GPT-5.6 Sol matches or surpasses Anthropic's Claude Mythos 5 across various benchmarks, most notably in agentic coding, where it scores 88.8 percent on Terminal-Bench 2.1 (91.9 percent for Sol Ultra) against Mythos 5's 88 percent. On the ExploitBench cybersecurity benchmark, Sol matches Mythos Preview's performance using roughly a third of the output tokens. Despite these advances, the US government currently restricts access to GPT-5.6 Sol to select partners, a policy OpenAI publicly criticizes as detrimental to developers and businesses. The new models come with tiered pricing, with Sol costing $5 per million input tokens and $30 per million output tokens, and are slated for a July launch on Cerebras at up to 750 tokens per second.
Key takeaway
For Directors of AI/ML evaluating next-generation models, OpenAI's GPT-5.6 Sol presents a compelling option, outperforming Claude Mythos 5 in agentic coding and demonstrating superior token efficiency in cybersecurity. While its pricing of $5 input and $30 output per million tokens might seem high, its efficiency could lower your effective task costs. However, current US government restrictions mean you cannot widely deploy Sol yet. Monitor policy developments closely to capitalize on its capabilities once broader access becomes available.
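To make the point about effective task cost concrete, here is a minimal arithmetic sketch. Only Sol's list prices (5 and 30 US dollars per million input and output tokens) and the "roughly a third of output tokens" figure come from the article. The baseline model's prices and the token counts of the task are hypothetical placeholders chosen for illustration, not published numbers for any real model.

```python
def task_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost of one task in US dollars, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Sol list prices reported in the article (USD per million tokens).
SOL_INPUT_PRICE = 5.0
SOL_OUTPUT_PRICE = 30.0

# ASSUMPTION: placeholder prices for a cheaper-per-token baseline model.
BASELINE_INPUT_PRICE = 3.0
BASELINE_OUTPUT_PRICE = 15.0

# ASSUMPTION: a hypothetical task with a 20k-token prompt for which the
# baseline emits 90k output tokens and Sol emits about a third of that.
input_tokens = 20_000
baseline_output_tokens = 90_000
sol_output_tokens = baseline_output_tokens // 3

sol_cost = task_cost_usd(input_tokens, sol_output_tokens,
                         SOL_INPUT_PRICE, SOL_OUTPUT_PRICE)
baseline_cost = task_cost_usd(input_tokens, baseline_output_tokens,
                              BASELINE_INPUT_PRICE, BASELINE_OUTPUT_PRICE)

print(f"Sol:      ${sol_cost:.2f} per task")       # Sol:      $1.00 per task
print(f"Baseline: ${baseline_cost:.2f} per task")  # Baseline: $1.41 per task
```

Under these assumed numbers the model with the higher list price is cheaper per task, because output tokens dominate the bill. The conclusion flips if the token savings are smaller or the baseline is much cheaper, so the comparison should be rerun with measured token counts from your own workloads.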
Key insights
OpenAI's GPT-5.6 Sol rivals Claude Mythos in performance, but government restrictions limit its immediate availability.
Principles
- Agentic coding and cybersecurity are key AI model benchmarks.
- Token efficiency reduces effective LLM operational costs.
- Government policy can restrict advanced AI model access.
Method
GPT-5.6 offers "max" mode for deeper reasoning and "ultra" mode, which dispatches complex tasks to parallel sub-agents for enhanced performance.
In practice
- Utilize Sol for agentic coding tasks.
- Deploy Sol for cybersecurity flaw detection.
- Evaluate Sol's token efficiency for cost savings.
Topics
- OpenAI GPT-5.6 Sol
- Claude Mythos
- AI Benchmarking
- Agentic Coding
- Cybersecurity AI
- AI Regulation
Best for: AI Engineer, Machine Learning Engineer, Investor, Tech Journalist, AI Scientist, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by The Decoder.