The Attention Problem Nobody Has Solved — Until Now?
Summary
On May 5, 2026, the startup Subquadratic (SubQ) emerged from stealth, claiming to have solved the O(n²) cost of self-attention, a constraint inherent to the transformer since the 2017 paper "Attention Is All You Need". Because attention cost grows with the square of the sequence length, this scaling has historically capped large language model context windows, so models lose track of context in long documents and retrieval-augmented generation (RAG) pipelines must split documents into fragments. Earlier research explored sliding windows, global tokens, recurrent state spaces, sparse hybrids, and linear approximations, but none resolved the issue without significant tradeoffs. SubQ claims that its new mechanism, Subquadratic Sparse Attention (SSA), is content-dependent, exact, able to attend to any position within a 12-million-token context, and genuinely sub-quadratic end-to-end. If those claims hold, SSA could mark the most significant architectural advance since the original transformer.
Key takeaway
For AI architects designing large language model systems, this announcement signals a potential paradigm shift. Monitor the validation of SubQ's Subquadratic Sparse Attention (SSA) claims closely. If they hold, a genuinely sub-quadratic attention mechanism could remove current context window limitations, greatly simplify RAG strategies, and let models process much longer documents coherently, without fragmentation. That would fundamentally alter future LLM infrastructure decisions.
Key insights
Subquadratic claims to solve the O(n²) attention problem, a fundamental transformer constraint since 2017.
Principles
- Attention cost O(n²) limits context length.
- Prior attention solutions involved tradeoffs.
- Content-dependent sparse attention can, according to SubQ's claim, remain exact.
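To make the first two principles concrete, here is a minimal Python sketch that counts query-key score entries for dense attention versus a causal sliding window. It is only an illustration of the scaling argument: the function names and the window size of 4,096 are assumptions chosen for the example, and the sliding window stands in for the prior approaches mentioned in the summary, not for SSA, whose internals are not described here.

```python
def full_attention_pairs(n: int) -> int:
    """Score entries in dense self-attention over n tokens: n * n."""
    return n * n


def sliding_window_pairs(n: int, window: int) -> int:
    """Score entries when token i attends only to itself and the
    previous window - 1 tokens (a causal sliding window)."""
    if n <= window:
        # Every token sees all earlier tokens: 1 + 2 + ... + n.
        return n * (n + 1) // 2
    # The first `window` tokens see 1..window positions, the rest see `window`.
    return window * (window + 1) // 2 + (n - window) * window


# Hypothetical sequence lengths; 12_000_000 matches the context size in the summary.
for n in (8_000, 128_000, 12_000_000):
    dense = full_attention_pairs(n)
    local = sliding_window_pairs(n, window=4_096)  # window size is an assumption
    print(f"n={n:>10,}  dense={dense:.3e}  sliding_window={local:.3e}")
```

At 12 million tokens, dense attention implies 1.44 × 10^14 score entries per head per layer, which is why the quadratic term limits context length. The sliding window grows only linearly in n, but at the price of never attending to positions outside the window, which is the kind of tradeoff the second principle refers to.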
Topics
- Transformers
- Attention Mechanisms
- Subquadratic Sparse Attention
- Large Language Models
- Context Windows
- O(n²) Complexity
Best for: Research Scientist, AI Engineer, NLP Engineer, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by AIGuys - Medium.