An Algebraic View of the Expressivity of Recurrent Language Models
Summary
This paper presents a unified algebraic framework for analyzing the expressivity of recurrent neural language models (RNNs), resolving conflicting formal results regarding their computational power. It demonstrates that the underlying arithmetic model, including finite precision and evaluation order, fundamentally determines an RNN's capabilities. The framework reduces expressivity questions to algebraic statements, such as whether a network's syntactic monoid divides a specific wreath product. A case study on diagonal state-space models (SSMs) shows that the same architecture cannot implement even-modulus counters with floating-point recurrences but can realize every even-modulus counter under unsigned-integer quantization. This highlights how discretization impacts an architecture's formal language recognition.
Key takeaway
For machine learning engineers designing or evaluating recurrent neural networks, you must rigorously define the arithmetic semantics and evaluation order. Your choice of floating-point versus unsigned-integer quantization directly impacts the model's formal language recognition capabilities, potentially enabling or preventing tasks like modular counting. Consider the algebraic implications of your numerical implementation choices.
Key insights
Arithmetic semantics fundamentally determines recurrent neural network expressivity and recognized formal languages.
Principles
- Arithmetic semantics is part of the object being analyzed.
- RNN expressivity reduces to algebraic divisibility statements.
- Discretization alters algebraic structure, not just approximates.
Method
Develops an algebraic account of RNNs, formalizing arithmetic models and reducing expressivity to monoid division within wreath products.
In practice
- Diagonal SSMs recognize parity with unsigned-integer quantization.
- Krohn–Rhodes decomposition decides language recognition impossibility.
Topics
- Recurrent Neural Networks
- Formal Languages
- Algebraic Automata Theory
- Arithmetic Semantics
- Floating-Point Arithmetic
- Unsigned Integer Quantization
- State Space Models
Best for: Research Scientist, AI Scientist, Machine Learning Engineer
Related on AIssential
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.