Language Models Learn Universal Representations of Numbers and Here's Why You Should Care

· Source: cs.NE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

This research investigates how large language models (LLMs) process and represent numerical information, addressing the apparent conflict between their accurate internal number embeddings and their tendency to make numerical errors in output. The study finds that diverse LLMs, including OLMo 2, Llama 3, and Phi 4, converge on systematic, highly accurate, and universal sinusoidal representations of numbers across their hidden states and input contexts. These representations are consistent across layers and are maintained primarily by the residual stream, although the input and output embeddings are more distributed than the sparser internal representations. The authors develop universal sinusoidal probes that accurately extract numeric information and attribute up to 94% of arithmetic reasoning errors in models such as Llama 3.2 3B to specific internal layers, particularly for division. The work also shows that multi-token numbers are systematically superposed, with high accuracy for numbers spanning up to three tokens (values up to $10^9$).

Key takeaway

Research scientists developing or fine-tuning LLMs should pay attention to the models' internal sinusoidal representations of numbers. Because these representations are universal, they support more accurate probes that can pinpoint the specific layers responsible for numerical errors. That in turn enables targeted architectural adjustments to improve arithmetic reasoning and numerical accuracy, especially for multi-token numbers, potentially reducing errors by 27-64% in operations such as division.

Key insights

LLMs use universal, systematic sinusoidal representations for numbers, enabling precise error tracing to specific internal layers.
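As a rough illustration of what "sinusoidal" and "universal" could mean in practice, the sketch below builds two toy "models" whose number embeddings are random linear mixtures of sine and cosine features, then checks for periodic structure with a Fourier transform and compares the two models with a simple RSA score. Everything here is synthetic and hypothetical: the periods (10 and 100), the dimensions, the noise level, and the helper names `synthetic_embeddings`, `dominant_periods`, and `rsa_score` are assumptions made for the example, not values or code from the paper.

```python
import numpy as np

def synthetic_embeddings(seed, n_values=200, dim=64, periods=(10, 100), noise=0.05):
    """Toy 'model' (assumption): random linear mix of sin/cos features of n, plus noise."""
    rng = np.random.default_rng(seed)
    n = np.arange(n_values)
    # One cos row and one sin row per period; shape (2 * len(periods), n_values).
    feats = np.stack([f(2 * np.pi * n / T) for T in periods for f in (np.cos, np.sin)])
    mixing = rng.normal(size=(dim, feats.shape[0]))
    return mixing @ feats + noise * rng.normal(size=(dim, n_values))

def dominant_periods(E, k=2):
    """Return the k periods with the most spectral power along the number axis."""
    n_values = E.shape[1]
    # Power spectrum of each embedding dimension, averaged over dimensions.
    power = (np.abs(np.fft.rfft(E, axis=1)) ** 2).mean(axis=0)
    top_bins = np.argsort(power[1:])[-k:] + 1  # skip the DC bin at index 0
    return sorted(float(n_values / b) for b in top_bins)

def rsa_score(E1, E2):
    """Pearson correlation between the two models' pairwise cosine-similarity matrices."""
    def cosine_matrix(E):
        U = E / np.linalg.norm(E, axis=0, keepdims=True)
        return U.T @ U
    iu = np.triu_indices(E1.shape[1], k=1)  # upper triangle, excluding the diagonal
    return float(np.corrcoef(cosine_matrix(E1)[iu], cosine_matrix(E2)[iu])[0, 1])

E_a = synthetic_embeddings(seed=0)  # toy "model A"
E_b = synthetic_embeddings(seed=1)  # toy "model B" with a different random mixing

periods_a = dominant_periods(E_a)
rsa = rsa_score(E_a, E_b)
print("dominant periods (model A):", periods_a)
print("RSA between models:", round(rsa, 3))
```

On real models the embeddings would come from the network rather than from a generator, and the spectrum would be noisier; the point of the sketch is only that periodic structure shows up as peaks in the Fourier spectrum and that shared structure across models shows up as a high RSA score.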

Method

The study uses Representational Similarity Analysis (RSA) and Fourier decompositions to quantify how similar number embeddings are across models and layers. To decode internal representations and trace where errors originate, it employs a sinusoidal probe, defined as $f_{\sin}(\mathbf{x}) = (\mathbf{W}_{\mathrm{out}}\mathbf{S})^{T}(\mathbf{W}_{\mathrm{in}}\mathbf{x})$, where $\mathbf{x}$ is the internal representation being decoded.
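The probe formula can be made concrete with a minimal NumPy sketch. The summary does not spell out the shapes or the training procedure, so the following are assumptions for illustration only: $\mathbf{S}$ is taken to be a fixed sinusoidal basis with one column per candidate number, $\mathbf{W}_{\mathrm{in}}$ is fit by ordinary least squares on synthetic hidden states, $\mathbf{W}_{\mathrm{out}}$ is set to the identity, and the decoded number is the argmax of the probe's scores. The periods, dimensions, and helper names are hypothetical.

```python
import numpy as np

def sinusoidal_basis(n_values, periods):
    """Assumed form of S: cos/sin rows per period, one column per number 0..n_values-1."""
    n = np.arange(n_values)
    rows = []
    for T in periods:
        rows.append(np.cos(2 * np.pi * n / T))
        rows.append(np.sin(2 * np.pi * n / T))
    return np.stack(rows)  # shape (2 * len(periods), n_values)

def sinusoidal_probe(x, W_in, W_out, S):
    """f_sin(x) = (W_out S)^T (W_in x): one score per candidate number."""
    return (W_out @ S).T @ (W_in @ x)

rng = np.random.default_rng(0)
N, d = 100, 64                                     # candidate numbers, hidden size (toy values)
S = sinusoidal_basis(N, periods=(2, 5, 10, 100))   # hypothetical periods
m = S.shape[0]

# Synthetic hidden states (assumption): column i is the "hidden state" for number i.
A = rng.normal(size=(d, m))
X = A @ S + 0.01 * rng.normal(size=(d, N))

# Fit W_in so that W_in @ X approximates S (least squares); W_out is the identity here.
W_in = np.linalg.lstsq(X.T, S.T, rcond=None)[0].T  # shape (m, d)
W_out = np.eye(m)

# Decode each hidden state as the number whose basis column scores highest.
# Note: this is an in-sample check on the same states used to fit W_in.
decoded = np.array([np.argmax(sinusoidal_probe(X[:, i], W_in, W_out, S)) for i in range(N)])
accuracy = float((decoded == np.arange(N)).mean())
print("probe accuracy on toy data:", accuracy)
```

In this toy setup every column of `S` has the same norm, so the correct number always receives the highest score when `W_in @ x` lands close to its sinusoidal features. Applying such a probe layer by layer is one plausible way to see where a number's representation first goes wrong, which is the kind of error tracing the summary describes.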

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.NE updates on arXiv.org.