[AINews] The End of Finetuning

2026-05-13 · Source: Latent.Space (www.latent.space) · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Cybersecurity & Data Privacy) · Depth: Expert, long

Summary

OpenAI has deprecated its finetuning APIs, even though it previously promoted finetuning as a key part of the AI engineering toolkit. The move signals a shift in the AI engineering landscape, and it comes during an extreme GPU crunch and as Anthropic's valuation potentially surpasses OpenAI's. While finetuning may be declining across the general industry, top-tier companies such as Cursor and Cognition are increasing their use of RLFT on open models.

The article also covers advances in AI research: new reasoning benchmarks (Soohak's 439 math problems, Medmarks v1.0), agentic systems such as Google DeepMind's AI Co-Mathematician, and specialized retrieval models. Further sections detail progress in training optimization, scaling laws, and inference systems (e.g., Blackwell racks for MoE serving), along with new model releases such as Perceptron Mk1 for video reasoning and Jina's jina-embeddings-v5-omni. The Mini Shai-Hulud supply-chain attack on AI developer tooling highlights operational security concerns.

Key takeaway

For AI Architects evaluating model deployment strategies, OpenAI's finetuning deprecation suggests a pivot towards prompt engineering or larger, more capable base models for general use cases. However, if your team is building frontier applications or leveraging open-source models, investing in RLFT and custom ASIC solutions may still yield significant performance and cost advantages, especially for long-context or specialized tasks. You should assess whether your specific application truly benefits from finetuning or if alternative methods like advanced prompting or agentic orchestration are more efficient.

Key insights

Finetuning is declining for general AI engineering but remains critical for top-tier applications and open models.

Principles

Benchmarks require continuous evolution to challenge frontier models.
Small, specialized models excel in retrieval tasks when paired with generators.

Method

Agentic systems can decompose complex problems into specialized tasks, iteratively refining queries and leveraging external tools for enhanced performance in science and math.
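
As a rough illustration of the decompose, call-a-tool, refine loop described above, here is a minimal Python sketch. It is not taken from the article or from any of the systems it mentions: the function names (`solve`, `decompose`, `pick_tool`, `refine`), the two toy tools, and the `FACTS` table are all hypothetical stand-ins for what would be model calls and real tools in practice.

```python
from typing import Callable, Dict, List

Tool = Callable[[str], str]

# Hypothetical knowledge base used by the toy "lookup" tool.
FACTS = {"capital of france": "Paris"}


def decompose(problem: str) -> List[str]:
    """Toy planner: split a problem into subtasks on ';'."""
    return [part.strip() for part in problem.split(";") if part.strip()]


def pick_tool(query: str) -> str:
    """Toy dispatcher: arithmetic goes to 'add', everything else to 'lookup'."""
    return "add" if "+" in query else "lookup"


def add_tool(query: str) -> str:
    """Sum integers written as 'a+b+c'."""
    return str(sum(int(term) for term in query.split("+")))


def lookup_tool(query: str) -> str:
    """Exact-match lookup; returns 'unknown' on a miss."""
    return FACTS.get(query, "unknown")


def refine(query: str, last_answer: str) -> str:
    """Toy refinement: normalize the query after a failed attempt."""
    return query.lower().strip("?").strip()


def solve(problem: str, tools: Dict[str, Tool], max_rounds: int = 3) -> List[str]:
    """Run each subtask through a tool, refining the query until it succeeds."""
    results = []
    for subtask in decompose(problem):
        query, answer = subtask, "unknown"
        for _ in range(max_rounds):
            answer = tools[pick_tool(query)](query)
            if answer != "unknown":  # accept the first usable answer
                break
            query = refine(query, answer)  # otherwise rewrite and retry
        results.append(answer)
    return results


TOOLS = {"add": add_tool, "lookup": lookup_tool}
print(solve("2+3; Capital of France?", TOOLS))
```

In this toy run the second subtask misses on the first lookup and succeeds only after the query is refined, which is the iterative behavior the Method line refers to.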

In practice

Consider aggressive GPU power caps for efficiency in local inference.
Use small, distilled models as routers for larger LLMs to optimize costs.
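
The routing idea in the second point can be sketched in Python as follows. This is an assumption-laden illustration, not the article's method: `toy_difficulty` is a hand-written heuristic standing in for a small distilled classifier, and the model names `small-model` and `large-model`, the keyword list, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    model: str   # which model should serve the request
    score: float  # difficulty score that drove the decision


def toy_difficulty(prompt: str) -> float:
    """Stand-in for a distilled router model (hypothetical heuristic).

    Scores a prompt in [0, 1]: long prompts and reasoning-style
    keywords are treated as harder.
    """
    keywords = ("prove", "derive", "debug", "multi-step")
    length_score = min(len(prompt.split()) / 200, 1.0)
    keyword_score = 1.0 if any(k in prompt.lower() for k in keywords) else 0.0
    return max(length_score, keyword_score)


def route(
    prompt: str,
    difficulty: Callable[[str], float] = toy_difficulty,
    threshold: float = 0.5,
    small_model: str = "small-model",  # hypothetical cheap model
    large_model: str = "large-model",  # hypothetical expensive model
) -> Route:
    """Send hard prompts to the large model and the rest to the small one."""
    score = difficulty(prompt)
    chosen = large_model if score >= threshold else small_model
    return Route(model=chosen, score=score)


print(route("What is the capital of France?").model)
print(route("Prove that the square root of 2 is irrational.").model)
```

The cost saving depends on how often the router can safely pick the small model, so in a real deployment the threshold would be tuned against a labeled sample of traffic rather than fixed by hand.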

Topics

Finetuning Deprecation
AI Agent Systems
LLM Inference Optimization
Research Benchmarking
Multimodal AI Models

Best for: CTO, AI Architect, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Latent.Space - Www.latent.space.