[R] Empirical evidence for a primitive layer in small language models — 18 experiments across 4 architectures
Summary
Researchers conducted 18 experiments across four small language model architectures (Qwen 2.5, Gemma 3, LLaMA 3.2, SmolLM2) ranging from 360M to 1B parameters. The study identified a consistent activation gap between "scaffolding primitives" (Layer 0a: SOMEONE, TIME, PLACE) and "content primitives" (Layer 0b: FEAR, GRIEF, JOY, ANGER), with Layer 0b consistently showing higher activation. This gap averaged +0.245 across all models. Furthermore, 11 pre-registered primitive compositions, such as WANT + GRIEF → longing/yearning, matched predicted Layer 1 concepts in three out of four models. The study also observed that this activation gap was largest in the smallest models and narrowed with increasing scale, suggesting larger models develop phenomenological access to scaffolding primitives. All experiments are reproducible locally using Ollama, with code and data available.
Key takeaway
For research scientists investigating language model architecture and emergent capabilities, this work suggests that even small LLMs possess a primitive semantic layer. You should consider exploring the identified activation gap between scaffolding and content primitives, as it may offer insights into how larger models achieve capability jumps. The reproducible experiments provide a foundation for further investigation into linguistic relationships and meaning representation.
Key insights
Small language models exhibit a consistent activation gap between scaffolding and content semantic primitives.
Principles
- Content primitives activate more strongly than scaffolding primitives.
- Primitive compositions can predict higher-level concepts.
Method
Probing small LLMs (360M-1B params) with inputs ranging from random phonemes to universal semantic primitives, measuring the activation gap between primitive types, and testing pre-registered primitive compositions against predicted concepts.
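As a rough illustration of the gap measurement, the sketch below computes the difference between the mean Layer 0b score and the mean Layer 0a score for each model, then averages across models. The summary does not say how an activation score is obtained, so the function names and the input numbers here are assumptions chosen only to show the arithmetic. They are not values from the study.

```python
from statistics import mean

# Primitive groupings as named in the summary.
LAYER_0A = ("SOMEONE", "TIME", "PLACE")        # scaffolding primitives
LAYER_0B = ("FEAR", "GRIEF", "JOY", "ANGER")   # content primitives


def activation_gap(scores):
    """Return mean(Layer 0b) - mean(Layer 0a) for one model's scores.

    A positive value means content primitives scored higher.
    """
    scaffolding = mean(scores[p] for p in LAYER_0A)
    content = mean(scores[p] for p in LAYER_0B)
    return content - scaffolding


def mean_gap_across_models(per_model_scores):
    """Average the per-model gaps (hypothetical helper, not from the source)."""
    return mean(activation_gap(s) for s in per_model_scores.values())


# Placeholder scores for two hypothetical models; NOT data from the study.
placeholder_scores = {
    "model_a": {"SOMEONE": 0.2, "TIME": 0.2, "PLACE": 0.2,
                "FEAR": 0.5, "GRIEF": 0.5, "JOY": 0.5, "ANGER": 0.5},
    "model_b": {"SOMEONE": 0.4, "TIME": 0.4, "PLACE": 0.4,
                "FEAR": 0.5, "GRIEF": 0.5, "JOY": 0.5, "ANGER": 0.5},
}

if __name__ == "__main__":
    for name, scores in placeholder_scores.items():
        print(name, round(activation_gap(scores), 3))
    print("mean gap", round(mean_gap_across_models(placeholder_scores), 3))
```

Subtracting scaffolding from content keeps the sign convention of the reported figure, where a positive gap means Layer 0b is higher.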
In practice
- Reproduce experiments locally via Ollama.
- Explore primitive compositions for concept generation.
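A minimal sketch of querying a locally running Ollama server for one primitive follows. It assumes Ollama's default local endpoint and its `/api/generate` route. The prompt wording, the function names, and any model tag you pass in are assumptions for illustration and are not the study's actual probing code. The summary does not describe how a response is turned into an activation score, so that step is left out.

```python
import json
import urllib.request

# Default address of a local Ollama server (assumes a standard install).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, primitive):
    """Build the JSON body for one non-streaming generate request."""
    # Hypothetical prompt wording; the study's real prompts are not given here.
    prompt = f"Describe what the word {primitive} brings to mind."
    return {"model": model, "prompt": prompt, "stream": False}


def query_ollama(model, primitive, url=OLLAMA_URL, timeout=120):
    """POST the payload to Ollama and return the generated text."""
    body = json.dumps(build_payload(model, primitive)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        return json.loads(reply.read())["response"]
```

With the server running and a model already pulled, a call such as `query_ollama("<model-tag>", "GRIEF")` would return that model's text for a single primitive. Looping over the Layer 0a and Layer 0b words gives the raw responses to score.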
Topics
- Small Language Models
- Semantic Primitives
- Neural Activation
- Language Model Scaling
- Natural Semantic Metalanguage
Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.