GEMS: Geometric Constraints Enable Multi-Semantic Superposition in LLMs
Summary
GEMS is a training-free method that enables multi-semantic superposition in large language models by addressing two independent sources of collapse: distributional deviation and directional interference. It relies on geometric constraints (norm-preserving weighted superposition, targeted o_proj attention-pathway injection, and real-time orthogonalization) together with a Gaussian envelope for inter-layer strength modulation. On the GSM8K benchmark, GEMS maintained 98% accuracy when injecting three concurrent non-mathematical directions, whereas unconstrained addition collapsed to 4% (baseline: 92%). For language modeling, the same injection on Wikitext-2 incurred only a 2.2% perplexity increase. The method shows qualitative steering effects and transfers across models ranging from 3B to 31B parameters, including Llama-3.2-3B, Qwen3.6-27B, and Gemma-4-31B.
Key takeaway
For machine learning engineers developing LLM applications requiring nuanced, multi-attribute control, GEMS offers a robust, training-free solution. You should integrate its geometric constraints, like norm preservation and orthogonalization, to enable simultaneous steering of multiple semantic directions. This prevents model collapse and preserves core capabilities. Your team can achieve fine-grained behavioral control, balancing factual accuracy with communication style, with minimal overhead.
Key insights
Geometric constraints enable robust multi-directional activation steering in LLMs by preventing norm accumulation and directional interference.
Principles
- Multi-directional steering collapse stems from distributional deviation and directional interference.
- Norm preservation and o_proj injection are prerequisites for stable multi-directional steering.
- Orthogonalization resolves mutual dampening between simultaneously injected semantic vectors.
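The interference principle can be illustrated with a small sketch. This is a toy example, not the authors' implementation: it applies a modified Gram-Schmidt procedure to two hypothetical 3-dimensional directions (`d1`, `d2`) and measures their overlap before and after. The function names and the `eps` threshold are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def gram_schmidt(directions, eps=1e-10):
    """Return orthonormal vectors spanning the same space as `directions`."""
    basis = []
    for d in directions:
        residual = list(d)
        # Remove the component of `d` that lies along each earlier basis vector.
        for b in basis:
            proj = dot(residual, b)
            residual = [r - proj * x for r, x in zip(residual, b)]
        n = norm(residual)
        if n > eps:  # drop directions already covered by the basis
            basis.append([r / n for r in residual])
    return basis

# Two hypothetical unit-length "semantic" directions that partially overlap.
d1 = [1.0, 0.0, 0.0]
d2 = [0.6, 0.8, 0.0]

overlap_before = dot(d1, d2)      # non-zero: the directions share a component
q1, q2 = gram_schmidt([d1, d2])
overlap_after = dot(q1, q2)       # orthogonalized: the shared component is removed

print(overlap_before, overlap_after)
```

After orthogonalization, the second direction no longer carries a component along the first, so scaling one does not change the contribution of the other.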
Method
GEMS applies real-time Gram-Schmidt orthogonalization, norm-constrained weighted superposition at the attention output projection (o_proj), and a Gaussian envelope for inter-layer strength modulation during forward propagation.
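A minimal sketch of the other two ingredients follows, under stated assumptions. The summary does not give the exact formulas, so this example interprets "norm-constrained" as rescaling the steered vector back to the original activation norm, and models the envelope as a Gaussian over the layer index. The names `inject`, `gaussian_envelope`, and `steer_across_layers`, and the values of `center`, `width`, and `peak`, are hypothetical. A plain list stands in for the o_proj output.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def gaussian_envelope(layer, center, width):
    """Strength multiplier in (0, 1] that peaks at layer index `center`."""
    return math.exp(-((layer - center) ** 2) / (2.0 * width ** 2))

def inject(hidden, directions, weights, strength):
    """Add a weighted sum of directions, then rescale to the original norm."""
    steered = list(hidden)
    for w, d in zip(weights, directions):
        steered = [s + strength * w * x for s, x in zip(steered, d)]
    target, current = norm(hidden), norm(steered)
    if current == 0.0:
        return steered
    # Assumed interpretation of norm preservation: keep the activation norm fixed.
    return [s * target / current for s in steered]

def steer_across_layers(hidden, directions, weights, num_layers,
                        center, width, peak):
    """Apply the injection once per layer with a Gaussian-modulated strength."""
    outputs = []
    for layer in range(num_layers):
        strength = peak * gaussian_envelope(layer, center, width)
        outputs.append(inject(hidden, directions, weights, strength))
    return outputs

# Toy activation and two orthonormal steering directions (illustrative values).
hidden = [3.0, 4.0, 0.0]
directions = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
weights = [0.5, 0.5]

outputs = steer_across_layers(hidden, directions, weights,
                              num_layers=8, center=3.5, width=1.5, peak=4.0)
for layer, out in enumerate(outputs):
    print(layer, round(norm(out), 6))
```

Every layer's output keeps the norm of the input activation, while the direction of the change is strongest near the middle layers and fades toward the first and last ones.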
In practice
- Inject multiple semantic directions (e.g., empathy, accountability, minimalism) concurrently.
- Preserve core model capabilities like mathematical reasoning during steering.
- Apply geometric constraints to activation steering for architecture-agnostic control.
Topics
- Activation Steering
- Large Language Models
- Geometric Constraints
- Multi-directional Control
- Model Collapse Prevention
- Inference-time Intervention
Best for: Research Scientist, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.