Researchers say they trained a foundation model from scratch for about $1,500

· Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Advanced, medium

Summary

Sapient researchers have developed HRM-Text, a 1-billion-parameter foundation model trained from scratch for approximately $1,500. It is built on the Hierarchical Reasoning Model (HRM) architecture, introduced in 2025, which replaces the standard Transformer design by decoupling computation into a slow-evolving strategic layer and a fast-evolving execution layer. HRM-Text trains exclusively on 40 billion instruction-response pairs, a significant departure from the trillions of tokens used by conventional LLMs. The model performs competitively with open models of 2B to 7B parameters on benchmarks such as MMLU (60.7%), GSM8K (84.5%), and MATH (56.2%). Training took 1.9 days on 16 GPUs. The stated aim is to make foundation model pretraining affordable for enterprises, so they can build their own specialized reasoning cores.
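To make the slow/fast split concrete, here is a toy sketch of a two-timescale recurrent loop: the fast state is updated several times while the slow state is held fixed, then the slow state is updated once. This is not Sapient's implementation. The scalar states, the tanh updates, the weights, and the names `hierarchical_step` and `run` are all assumptions chosen only for illustration.

```python
import math


def hierarchical_step(slow, fast, x, inner_steps):
    """One outer cycle: several fast updates, then a single slow update.

    All weights below are arbitrary illustrative constants (hypothetical).
    """
    for _ in range(inner_steps):
        # Fast "execution" state: updated every inner step, conditioned on
        # the slow state, which stays fixed during the inner loop.
        fast = math.tanh(0.5 * fast + 0.3 * slow + 0.2 * x)
    # Slow "strategic" state: updated once per cycle from the final fast state.
    slow = math.tanh(0.7 * slow + 0.3 * fast)
    return slow, fast


def run(x, outer_steps=3, inner_steps=4):
    """Run several outer cycles and count how often each state is updated."""
    slow, fast = 0.0, 0.0
    fast_updates = 0
    slow_updates = 0
    for _ in range(outer_steps):
        slow, fast = hierarchical_step(slow, fast, x, inner_steps)
        fast_updates += inner_steps
        slow_updates += 1
    return slow, fast, fast_updates, slow_updates


slow, fast, fast_updates, slow_updates = run(1.0)
print(f"fast updates: {fast_updates}, slow updates: {slow_updates}")
```

With the default settings the fast state is updated four times for every slow update, which is the sense in which one layer evolves slowly and the other quickly.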

Key takeaway

For AI engineering teams evaluating custom foundation model development, HRM-Text presents a compelling alternative to large, general-purpose LLMs. A 1B-parameter reasoning core can reportedly be pretrained for around $1,500, which improves the "economics of iteration" and lowers the infrastructure burden. Consider experimenting with HRM-Text to build specialized models that learn your specific task structures and integrate with proprietary data, rather than relying on expensive models that memorize the internet.
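As a sanity check on the reported figures, the short calculation below derives the GPU-hours and the implied price per GPU-hour from the numbers in the summary (about $1,500, 1.9 days, 16 GPUs). The function name `implied_gpu_hour_rate` is hypothetical, and the result is only what the figures imply, not a price quoted by the source.

```python
def implied_gpu_hour_rate(total_cost_usd, days, num_gpus):
    """Return (gpu_hours, usd_per_gpu_hour) implied by a training run."""
    gpu_hours = days * 24 * num_gpus
    return gpu_hours, total_cost_usd / gpu_hours


# Figures taken from the summary: ~$1,500, 1.9 days, 16 GPUs.
gpu_hours, rate = implied_gpu_hour_rate(1500, 1.9, 16)
print(f"{gpu_hours:.1f} GPU-hours, about ${rate:.2f} per GPU-hour")
```

That works out to roughly 730 GPU-hours at a little over $2 per GPU-hour, so a team planning a similar run can scale the estimate by swapping in its own hourly rate.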

Key insights

HRM-Text offers a cost-effective, sample-efficient architecture for training specialized 1B-parameter reasoning models from scratch.

Method

HRM-Text uses MagicNorm to keep signals stable and a warm-up method that gradually increases the depth of the reasoning sequence during training. Training focuses on task completion, using instruction-response pairs.
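The source does not say how the depth warm-up is scheduled. As a minimal sketch, assuming a simple linear ramp, the reasoning depth could grow from a small starting value to its full value over a fixed number of training steps. The function `depth_at_step`, its parameters, and the default depths are all hypothetical.

```python
def depth_at_step(step, warmup_steps, min_depth=1, max_depth=8):
    """Linearly ramp reasoning depth from min_depth to max_depth.

    Assumed schedule for illustration only; not taken from the source.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return max_depth
    frac = step / warmup_steps
    # Round down so depth only increases once the ramp has passed each level.
    return min_depth + int(frac * (max_depth - min_depth))


for step in (0, 250, 500, 750, 1000):
    print(step, depth_at_step(step, warmup_steps=1000))
```

Starting shallow keeps early training cheap and stable, and the full depth is only reached once the warm-up period has ended.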

Best for: Research Scientist, CTO, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.