Researchers say they trained a foundation model from scratch for about $1,500

2026-06-10 · Source: VentureBeat · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Advanced, medium

Summary

Sapient researchers have developed HRM-Text, a 1-billion-parameter foundation model that they say was trained from scratch for approximately $1,500. It is built on the Hierarchical Recurrent Model (HRM) architecture, introduced in 2025, which replaces the standard Transformer design by decoupling computation into a slow-evolving strategic layer and a fast-evolving execution layer. HRM-Text trains exclusively on 40 billion instruction-response pairs, a significant departure from the trillions of tokens used by conventional LLMs. The model is reported to perform competitively with 2B to 7B parameter open models on benchmarks such as MMLU (60.7%), GSM8K (84.5%), and MATH (56.2%). Training took 1.9 days on 16 GPUs, and the stated aim is to make foundation model pretraining affordable for enterprises so they can build their own specialized reasoning cores.
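As a quick sanity check on the headline figure, the reported numbers can be turned into an implied price per GPU-hour. This is only back-of-the-envelope arithmetic on the figures quoted above. The summary does not state the GPU type or the actual rental rate, so the result is an implied average and not a quoted price.

```python
# Figures as reported in the summary; nothing else is assumed.
gpus = 16
days = 1.9
total_cost_usd = 1500.0

# Total compute consumed, in GPU-hours.
gpu_hours = gpus * days * 24

# Average price per GPU-hour implied by the reported total cost.
implied_rate = total_cost_usd / gpu_hours

print(f"GPU-hours: {gpu_hours:.1f}")                        # GPU-hours: 729.6
print(f"Implied price per GPU-hour: ${implied_rate:.2f}")   # Implied price per GPU-hour: $2.06
```

An implied rate of roughly $2 per GPU-hour is the number to compare against your own cloud or on-premises pricing when estimating what a similar run would cost.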

Key takeaway

For AI engineering teams evaluating custom foundation model development, HRM-Text presents a compelling alternative to large, general-purpose LLMs. According to the researchers, a 1B-parameter reasoning core can be pretrained for around $1,500, which improves the "economics of iteration" and lightens the infrastructure burden. Consider experimenting with HRM-Text to build specialized models that learn your specific task structures and integrate with proprietary data, rather than relying on expensive general-purpose models that largely memorize internet text.

Key insights

HRM-Text offers a cost-effective, sample-efficient architecture for training specialized 1B-parameter reasoning models from scratch.

Principles

Decouple computation into strategic and execution layers.
Train on instruction-response pairs, not raw text.
Focus on task completion, not next-token prediction.
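The first principle, decoupling slow strategic computation from fast execution, can be pictured as a two-timescale recurrent loop. The sketch below is a toy illustration only: the scalar states, the update rules, the coefficients, and the function names (`fast_step`, `slow_step`, `hierarchical_forward`) are all hypothetical and are not taken from the HRM-Text code or paper. It shows just the control flow in which the fast state updates several times for each single update of the slow state.

```python
import math

def fast_step(z_fast, z_slow, x):
    # Hypothetical execution-level update: runs at every inner step,
    # conditioned on the input and the current (frozen) slow state.
    return math.tanh(0.5 * z_fast + 0.3 * z_slow + x)

def slow_step(z_slow, z_fast):
    # Hypothetical strategic-level update: runs once per cycle,
    # after the fast state has finished its inner steps.
    return math.tanh(0.8 * z_slow + 0.4 * z_fast)

def hierarchical_forward(x, cycles=3, fast_steps_per_cycle=4):
    """Toy two-timescale recurrence; returns final slow state and update counts."""
    z_slow, z_fast = 0.0, 0.0
    fast_updates = 0
    slow_updates = 0
    for _ in range(cycles):
        for _ in range(fast_steps_per_cycle):
            z_fast = fast_step(z_fast, z_slow, x)
            fast_updates += 1
        z_slow = slow_step(z_slow, z_fast)
        slow_updates += 1
    return z_slow, fast_updates, slow_updates

z, n_fast, n_slow = hierarchical_forward(0.2)
print(n_fast, n_slow)  # 12 3
```

In a real model the two states would be vectors updated by learned network blocks; the ratio of fast to slow updates here (4 to 1) is an arbitrary choice for illustration.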

Method

HRM-Text uses MagicNorm to keep signals stable and a warm-up method that gradually increases the depth of the reasoning sequence during training. Training focuses on task completion over instruction-response pairs.
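One plausible reading of the depth warm-up is a schedule that starts training with a shallow reasoning sequence and raises the depth as training progresses. The sketch below assumes a simple linear ramp. The function name `reasoning_depth`, the linear shape, and the default values are all assumptions made for illustration, since the summary does not specify the actual schedule. MagicNorm is not sketched here because the summary gives no details of how it works.

```python
def reasoning_depth(step, warmup_steps=1000, min_depth=1, max_depth=8):
    """Hypothetical linear warm-up of reasoning depth over training steps."""
    # After warm-up (or if warm-up is disabled), use the full depth.
    if warmup_steps <= 0 or step >= warmup_steps:
        return max_depth
    # Fraction of warm-up completed, clamped so negative steps count as zero.
    frac = max(step, 0) / warmup_steps
    # Ramp from min_depth toward max_depth, rounding down to a whole depth.
    return min_depth + int(frac * (max_depth - min_depth))

for step in (0, 500, 1000):
    print(step, reasoning_depth(step))
# 0 1
# 500 4
# 1000 8
```

A training loop would call such a function once per step and use the result as the number of recurrent reasoning iterations for that batch.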

In practice

Pretrain custom reasoning models for proprietary data.
Pair compact models with external knowledge stores.
Reduce infrastructure and vendor dependency costs.
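Pairing a compact model with an external knowledge store usually means retrieving relevant facts at query time and placing them in the prompt, so the model does not need to memorize them. The sketch below is a deliberately minimal illustration under stated assumptions: the word-overlap scoring, the prompt layout, and the names `retrieve` and `build_prompt` are hypothetical and are not part of HRM-Text. A production system would more likely use an embedding-based search.

```python
def retrieve(query, store, k=2):
    """Return the k documents sharing the most words with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        store,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, store, k=2):
    """Assemble an instruction prompt with retrieved context (hypothetical layout)."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, store, k))
    return f"Context:\n{context}\n\nInstruction: {query}\nResponse:"

# Hypothetical in-memory knowledge store.
store = [
    "Refund policy allows returns within 30 days",
    "Shipping takes 5 business days",
    "Office closed on public holidays",
]
print(build_prompt("what is the refund policy", store, k=1))
```

The resulting string would be passed to the compact model as its instruction, keeping the proprietary facts in the store where they can be updated without retraining.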

Topics

Hierarchical Recurrent Models
HRM-Text
Foundation Model Training
Low-Cost AI
Enterprise AI
Instruction Tuning
Model Architecture

Code references

sapientinc/HRM-Text

Best for: Research Scientist, CTO, AI Engineer, AI Scientist, Machine Learning Engineer, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by VentureBeat.