[R] Empirical evidence for a primitive layer in small language models — 18 experiments across 4 architectures

· Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation · Depth: Advanced, medium

Summary

Researchers conducted 18 experiments across four small language model architectures (Qwen 2.5, Gemma 3, LLaMA 3.2, SmolLM2) ranging from 360M to 1B parameters. The study identified a consistent activation gap between "scaffolding primitives" (Layer 0a: SOMEONE, TIME, PLACE) and "content primitives" (Layer 0b: FEAR, GRIEF, JOY, ANGER), with Layer 0b consistently showing higher activation. This gap averaged +0.245 across all models. Furthermore, 11 pre-registered primitive compositions, such as WANT + GRIEF → longing/yearning, matched predicted Layer 1 concepts in three out of four models. The study also observed that this activation gap was largest in the smallest models and narrowed with increasing scale, suggesting larger models develop phenomenological access to scaffolding primitives. All experiments are reproducible locally using Ollama, with code and data available.

Key takeaway

For research scientists investigating language model architecture and emergent capabilities, this work suggests that even small LLMs possess a primitive semantic layer. You should consider exploring the identified activation gap between scaffolding and content primitives, as it may offer insights into how larger models achieve capability jumps. The reproducible experiments provide a foundation for further investigation into linguistic relationships and meaning representation.

Key insights

Small language models exhibit a consistent activation gap between scaffolding and content semantic primitives.
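To make the gap concrete, here is a minimal sketch of how such a number could be computed once each primitive has an activation score. The summary does not say how activation was measured, so the score dictionaries, the function names `activation_gap` and `mean_gap`, and the model labels are all assumptions for illustration. The numbers are placeholders, not results from the study.

```python
from statistics import mean

# Primitive groups as named in the summary.
LAYER_0A = ("SOMEONE", "TIME", "PLACE")        # scaffolding primitives
LAYER_0B = ("FEAR", "GRIEF", "JOY", "ANGER")   # content primitives


def activation_gap(scores):
    """Return mean(Layer 0b) minus mean(Layer 0a) for one model.

    `scores` maps a primitive name to an activation score. How that score
    is obtained is NOT specified in the summary; it is an assumed input.
    """
    scaffolding = mean(scores[p] for p in LAYER_0A)
    content = mean(scores[p] for p in LAYER_0B)
    return content - scaffolding


def mean_gap(per_model_scores):
    """Average the per-model gap across all models."""
    return mean(activation_gap(s) for s in per_model_scores.values())


# Placeholder values for illustration only (hypothetical models and scores).
example_scores = {
    "model_a": {"SOMEONE": 0.2, "TIME": 0.3, "PLACE": 0.1,
                "FEAR": 0.6, "GRIEF": 0.5, "JOY": 0.4, "ANGER": 0.5},
    "model_b": {"SOMEONE": 0.4, "TIME": 0.4, "PLACE": 0.4,
                "FEAR": 0.5, "GRIEF": 0.5, "JOY": 0.5, "ANGER": 0.5},
}

if __name__ == "__main__":
    for name, scores in example_scores.items():
        print(name, round(activation_gap(scores), 3))
    print("mean gap", round(mean_gap(example_scores), 3))
```

A positive gap under this definition means the content primitives score higher than the scaffolding primitives, which is the direction the summary reports.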

Method

The study probes small LLMs (360M to 1B parameters) with inputs ranging from random phonemes to universal semantic primitives, measures the activation gap between primitive types, and tests pre-registered primitive compositions against predicted concepts.
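The composition test can be sketched as a lookup against predictions fixed in advance. The summary gives one example, WANT + GRIEF → longing/yearning, but does not describe how a model's output was scored as a match. The exact word-match rule, the function names, and the sample responses below are therefore assumptions, not the study's procedure.

```python
# One pre-registered prediction taken from the summary; the other ten
# compositions are not listed there, so they are not included here.
PREDICTIONS = {
    ("WANT", "GRIEF"): {"longing", "yearning"},
}


def matches_prediction(primitives, response, predictions=PREDICTIONS):
    """Return True if the response contains any predicted concept word.

    Matching by exact lowercase word is an assumed scoring rule.
    """
    expected = predictions[tuple(primitives)]
    words = {w.strip(".,;:!?\"'").lower() for w in response.split()}
    return bool(expected & words)


def match_rate(responses, predictions=PREDICTIONS):
    """Fraction of compositions whose response matches its prediction.

    `responses` maps a tuple of primitives to one model's text output.
    """
    hits = sum(
        matches_prediction(key, text, predictions)
        for key, text in responses.items()
    )
    return hits / len(responses)


if __name__ == "__main__":
    # Hypothetical model output, for illustration only.
    sample = {("WANT", "GRIEF"): "A deep longing for what was lost."}
    print(match_rate(sample))
```

Fixing `PREDICTIONS` before any model is queried is what makes the test pre-registered: the expected concepts cannot be adjusted after seeing the outputs.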

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.