Scaling Accessible Mathematics on arXiv: HTML Conversion and MathML 4

Source: cs.CL updates on arXiv.org · Field: Technology & Digital (Software Development & Engineering, Artificial Intelligence & Machine Learning, Emerging Technologies & Innovation) · Depth: Advanced, long

Summary

arXiv is advancing its HTML Papers offering, first released in 2023, to improve accessibility and user experience across its more than 3 million articles. Work in 2025 and early 2026 resolved approximately half of 6,000 user-reported issues and added initial MathML 4 Intent annotations for improved speech output. The error-free HTML conversion rate currently stands at 75%, against a target of 90%. A Rust port of LaTeXML is also underway: AI assistance has substantially accelerated the porting work, and the port is intended to reduce compute costs and enable faster submission previews. This project, while experimental, is maturing to better serve arXiv's growing community, whose monthly submissions reached 30,000 in 2026.

Key takeaway

For research scientists and platform developers focused on digital accessibility, arXiv's progress demonstrates that large-scale LaTeX-to-HTML conversion with MathML 4 Intent is viable. You should consider adopting similar structured markup and AI-assisted development for legacy code modernization to improve content accessibility and system efficiency, especially when dealing with complex mathematical notation.

Key insights

arXiv is enhancing document accessibility and conversion efficiency through HTML, MathML 4, and AI-assisted Rust reimplementation.

Principles

Method

arXiv's HTML conversion pipeline uses LaTeXML to transform LaTeX into HTML with MathML Core, and adds MathML 4 Intent annotations to guide speech output. A Rust reimplementation of LaTeXML, whose development is being accelerated with AI assistance, is underway to improve performance and reduce costs.
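
To make the idea of an Intent annotation concrete, the sketch below builds a small MathML fragment for x² carrying the `intent` and `arg` attributes described in the MathML 4 draft, then resolves the intent expression into a flat string that a speech engine could start from. This is an illustration only, not LaTeXML's actual output or arXiv's code; the function names `power_with_intent` and `resolve_intent` are hypothetical, and the resolver handles only simple `$name` references.

```python
import re
import xml.etree.ElementTree as ET


def power_with_intent(base: str, exponent: str) -> ET.Element:
    """Build <math><msup>...</msup></math> annotated with a MathML 4 intent.

    Hypothetical helper for illustration; a real converter derives these
    annotations from the LaTeX source rather than hard-coding them.
    """
    math = ET.Element("math")
    # The intent expression names a concept and refers to children by $name.
    msup = ET.SubElement(math, "msup", {"intent": "power($base,$exp)"})
    base_el = ET.SubElement(msup, "mi", {"arg": "base"})
    base_el.text = base
    exp_el = ET.SubElement(msup, "mn", {"arg": "exp"})
    exp_el.text = exponent
    return math


def resolve_intent(node: ET.Element) -> str:
    """Substitute each $name in the intent with the text of the descendant
    whose arg attribute equals name. Toy resolver: no nesting, no defaults."""
    args = {
        el.get("arg"): (el.text or "")
        for el in node.iter()
        if el.get("arg") is not None
    }
    # Unknown references are left untouched rather than guessed.
    return re.sub(
        r"\$(\w+)",
        lambda m: args.get(m.group(1), m.group(0)),
        node.get("intent", ""),
    )


if __name__ == "__main__":
    tree = power_with_intent("x", "2")
    print(ET.tostring(tree, encoding="unicode"))
    print(resolve_intent(tree.find("msup")))
```

The point of the annotation is that presentation markup alone is ambiguous: a superscript could be a power, an index, or a footnote mark, and the `intent` attribute states which reading the author meant so that assistive technology does not have to guess.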

Best for: Software Engineer, AI Scientist, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.