not much happened today

2026-05-29 · Source: AINews · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Cloud Computing & IT Infrastructure · Depth: Advanced, long

Summary

The AI news recap for May 28-29, 2026, highlights several key developments across proprietary and open-source AI. Anthropic rolled out Claude Opus 4.8, showing incremental gains in coding and platform-level changes like mid-conversation system instructions, though benchmarks were mixed and pricing remains a concern. In agent infrastructure, a critical "Token-In, Token-Out" bug was identified in multi-turn RL training loops, and new work on Effective Feedback Compute (EFC) demonstrated harness quality's significant impact on agent success. Open-weight models continue to gain momentum, now used by 1 in 3 AI teams and lagging frontier models by about four months, with llama.app launching for easier local deployment. Google expanded its managed agent stack with Gemini Spark and Flow Agent, while OpenAI enhanced Codex with Windows support and mobile remote steering, signaling a trend towards vertically integrated agent environments. Additionally, a Starlette vulnerability (CVE-2026-48710) exposed LLM infrastructure to risks, and new research explored bidirectional search, continual learning, and multimodal world models.

Key takeaway

For AI Engineers deploying agentic systems, you should prioritize robust infrastructure and vigilant cost management. Verify your RL training loops adhere to a "Token-In, Token-Out" rule to avoid subtle re-tokenization bugs, and actively monitor prompt cache hit-rates given Anthropic's reduced TTL. Furthermore, immediately check your Starlette dependency for CVE-2026-48710 to mitigate critical security vulnerabilities in your LLM infrastructure, especially for internet-exposed services.

Key insights

Agentic AI development is converging on integrated stacks, robust infrastructure, and careful token economics for performance and cost.

Principles

"Token-In, Token-Out" prevents RL re-tokenization bugs.
Harness quality significantly impacts agent success (R² up to 0.99).
Local-first inference is critical for responsive voice agents.

Method

The proposed fix for multi-turn RL bugs is a strict "Token-In, Token-Out" rule, maintaining a single token buffer across turns to prevent re-encoding sampled tokens and misapplying gradients.

In practice

Audit prompt cache hit-rates due to reduced TTL (5 min).
Use billing caps/alerts to prevent runaway token consumption.
Check Starlette versions for CVE-2026-48710 exposure.

Topics

AI Agents
Large Language Models
Local AI
RL Training
API Security
Token Economics
Claude Opus

Code references

ggml-org/llama.cpp

Best for: CTO, VP of Engineering/Data, Director of AI/ML, Machine Learning Engineer, AI Engineer, AI Scientist

Related on AIssential

See Counsel's argued verdicts on the open AI decisions leaders are weighing →

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by AINews.