To See the Unseen: on the Generalization Ability of Transformers in Symbolic Reasoning
Summary
A study published on April 23, 2026, investigates how well decoder-only transformer models generalize in abstract symbolic reasoning, specifically on propositional logic problems. Previous research indicated that these models struggle with variable names not encountered during training, partly because they have difficulty copying unseen tokens. This work identifies a representational collapse in which the unembeddings (last-layer weights) of unseen tokens converge to similar vectors. The collapse makes it hard for a model to distinguish between multiple unseen variables, especially when the embedding and unembedding parameters are shared. It also provides a mechanistic explanation for the effectiveness of interventions like "active forgetting." The researchers propose a combination of techniques: a minor architectural change to improve copying, increased data diversity, and freezing or resetting the (un)embeddings. Together, these enable generalization to unseen tokens. Evidence of the same (un)embedding collapse was also observed in the Gemma 3 family of open-weight models, where 99 unused tokens showed correlated embeddings, which makes them a poor initialization for finetuning.
Key takeaway
For research scientists developing or finetuning transformer models for symbolic reasoning tasks, the representational collapse of unseen-token unembeddings is a likely cause of poor generalization to novel variables. Consider an architectural modification that facilitates token copying, more diverse training data, and freezing or resetting the (un)embeddings. The same concern applies to models like Gemma 3, where correlated embeddings of unused tokens give a poor starting point for finetuning.
Key insights
Transformer models exhibit representational collapse of unseen token unembeddings, hindering symbolic reasoning generalization.
Principles
- Unseen token unembeddings collapse to similar vectors.
- Shared embedding/unembedding parameters exacerbate collapse.
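One way to make the collapse concrete is to measure how similar the unembedding rows of unseen tokens are to one another. The sketch below is only a diagnostic on synthetic data, not the paper's procedure: the helper names (`mean_pairwise_cosine`, `make_toy_unembeddings`) and the toy matrix, in which unseen rows are built as one shared vector plus small noise, are assumptions made for illustration.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_cosine(rows):
    """Average cosine similarity over all distinct pairs of rows."""
    sims = [
        cosine(rows[i], rows[j])
        for i in range(len(rows))
        for j in range(i + 1, len(rows))
    ]
    return sum(sims) / len(sims)


def make_toy_unembeddings(n_seen=20, n_unseen=5, dim=16, noise=0.05, seed=0):
    """Hypothetical toy data: seen rows are independent random vectors,
    unseen rows are one shared vector plus small noise (a simulated collapse)."""
    rng = random.Random(seed)
    seen = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_seen)]
    shared = [rng.gauss(0, 1) for _ in range(dim)]
    unseen = [[s + rng.gauss(0, noise) for s in shared] for _ in range(n_unseen)]
    return seen, unseen


seen, unseen = make_toy_unembeddings()
seen_sim = mean_pairwise_cosine(seen)
unseen_sim = mean_pairwise_cosine(unseen)
print(f"mean cosine among seen rows:   {seen_sim:.3f}")
print(f"mean cosine among unseen rows: {unseen_sim:.3f}")
```

On a real model, the same statistic could be computed over the rows of the output matrix that correspond to tokens absent from the training data. A value near 1 for those rows, against a much lower value for seen tokens, would be consistent with the collapse described above.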
Method
A combination of architectural changes for copying, data diversity, and freezing/resetting (un)embeddings improves generalization to unseen tokens in symbolic reasoning.
In practice
- Reset or freeze token (un)embeddings.
- Increase data diversity for training.
- Consider architectural changes for copying.
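The first item above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not say which rows are reset, when, or with what distribution, so the helpers `reset_rows` and `sgd_step`, the choice to match the scale of the seen rows, and the toy matrix are all hypothetical.

```python
import math
import random


def rms(rows):
    """Root-mean-square of all entries, used here as a scale for re-initialization."""
    values = [x for row in rows for x in row]
    return math.sqrt(sum(x * x for x in values) / len(values))


def reset_rows(matrix, token_ids, std, seed=0):
    """Return a copy of matrix with the given rows redrawn from N(0, std^2)."""
    rng = random.Random(seed)
    dim = len(matrix[0])
    out = [row[:] for row in matrix]
    for t in token_ids:
        out[t] = [rng.gauss(0, std) for _ in range(dim)]
    return out


def sgd_step(matrix, grads, lr, frozen_ids=()):
    """One plain SGD update that leaves the rows in frozen_ids untouched."""
    frozen = set(frozen_ids)
    return [
        row[:] if i in frozen else [w - lr * g for w, g in zip(row, grad)]
        for i, (row, grad) in enumerate(zip(matrix, grads))
    ]


# Toy (un)embedding matrix: tokens 2 and 3 are "unseen" and have collapsed
# onto the same vector (assumed values, for illustration only).
matrix = [
    [0.5, -1.0, 0.2],
    [-0.3, 0.8, 1.1],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]
unseen_ids = [2, 3]

# Reset: redraw the collapsed rows at the scale of the seen rows.
reset = reset_rows(matrix, unseen_ids, std=rms(matrix[:2]))

# Freeze: apply an update everywhere except the freshly reset rows.
grads = [[1.0, 1.0, 1.0] for _ in matrix]
stepped = sgd_step(reset, grads, lr=0.1, frozen_ids=unseen_ids)

print("rows distinct after reset:", reset[2] != reset[3])
print("frozen rows unchanged:", stepped[2] == reset[2] and stepped[3] == reset[3])
```

Resetting and freezing are shown together only to keep the example short. In a real training loop they are separate choices, and the frozen set would typically be enforced through the optimizer or a gradient mask rather than a hand-written update.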
Topics
- Transformers
- Symbolic Reasoning
- Generalization Ability
- Representational Collapse
- Unseen Tokens
Best for: Research Scientist, AI Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.