An Algebraic View of the Expressivity of Recurrent Language Models

· Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Mathematics & Computational Sciences · Depth: Expert, extended

Summary

This paper presents a unified algebraic framework for analyzing the expressivity of recurrent neural language models (RNNs), resolving conflicting formal results regarding their computational power. It demonstrates that the underlying arithmetic model, including finite precision and evaluation order, fundamentally determines an RNN's capabilities. The framework reduces expressivity questions to algebraic statements, such as whether a network's syntactic monoid divides a specific wreath product. A case study on diagonal state-space models (SSMs) shows that the same architecture cannot implement even-modulus counters with floating-point recurrences but can realize every even-modulus counter under unsigned-integer quantization. This highlights how discretization impacts an architecture's formal language recognition.

Key takeaway

For machine learning engineers designing or evaluating recurrent neural networks, you must rigorously define the arithmetic semantics and evaluation order. Your choice of floating-point versus unsigned-integer quantization directly impacts the model's formal language recognition capabilities, potentially enabling or preventing tasks like modular counting. Consider the algebraic implications of your numerical implementation choices.

Key insights

Arithmetic semantics fundamentally determines recurrent neural network expressivity and recognized formal languages.

Principles

Method

Develops an algebraic account of RNNs, formalizing arithmetic models and reducing expressivity to monoid division within wreath products.

In practice

Topics

Best for: Research Scientist, AI Scientist, Machine Learning Engineer

Related on AIssential

Open in AIssential →

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.