Scaling Accessible Mathematics on arXiv: HTML Conversion and MathML 4
Summary
arXiv is advancing its HTML Papers offering, first released in 2023, to improve accessibility and the reading experience across its more than 3 million articles. Key developments from 2025 and early 2026 include resolving approximately half of 6,000 user-reported issues, working toward a target of 90% error-free HTML conversion (currently 75%), and implementing initial MathML 4 Intent annotations for improved speech output. A significant ongoing effort is the Rust port of LaTeXML, which AI assistance has substantially accelerated, reducing compute costs and enabling faster submission previews. The project is still experimental but is maturing to keep pace with arXiv's growth, with monthly submissions reaching 30,000 in 2026.
Key takeaway
For research scientists and platform developers focused on digital accessibility, arXiv's progress suggests that large-scale LaTeX-to-HTML conversion with MathML 4 Intent is viable. Consider adopting similar structured markup to make complex mathematical notation accessible, and AI-assisted development to modernize legacy code and improve system efficiency.
Key insights
arXiv is enhancing document accessibility and conversion efficiency through HTML, MathML 4, and AI-assisted Rust reimplementation.
Principles
- Prioritize both acute and frequent errors for scalable improvements.
- Expose mathematical structure for web platform interaction.
- Maintain close alignment between upstream and project-specific codebases.
Method
arXiv's HTML conversion pipeline uses LaTeXML to transform LaTeX into HTML with MathML Core, applying MathML 4 Intent for speech output. A Rust reimplementation of LaTeXML, accelerated by AI, is underway to improve performance and reduce costs.
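To make the Intent idea concrete, here is a minimal Python sketch of what an intent-annotated MathML fragment can look like. It is not LaTeXML's output or arXiv's pipeline. The helper name `power_with_intent` is hypothetical, and the `power($base,$exp)` expression with matching `arg` attributes is an assumption based on the MathML 4 draft's intent syntax.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def power_with_intent(base: str, exponent: str) -> ET.Element:
    """Build <math><msup>...</msup></math> carrying a MathML 4 style intent.

    Hypothetical helper for illustration only; the intent string and the
    `arg` names are assumptions, not the markup arXiv actually emits.
    """
    math = ET.Element("math", {"xmlns": MATHML_NS})
    # The intent attribute names the concept and refers to children by `arg`.
    msup = ET.SubElement(math, "msup", {"intent": "power($base,$exp)"})
    base_el = ET.SubElement(msup, "mi", {"arg": "base"})
    base_el.text = base
    exp_el = ET.SubElement(msup, "mn", {"arg": "exp"})
    exp_el.text = exponent
    return math


markup = ET.tostring(power_with_intent("x", "2"), encoding="unicode")
print(markup)
```

The visual layout stays an ordinary superscript. The intent only tells assistive technology that the construct should be read as a power rather than, say, an index.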
In practice
- Use MathML 4 Intent for accessible speech output.
- Employ agentic AI for large-scale code modernization.
- Provide original TeX source as an annotation in MathML output.
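The last point above can be sketched with the standard MathML `semantics` and `annotation` elements. This is a minimal illustration under stated assumptions, not arXiv's or LaTeXML's implementation: the helper names `with_tex_annotation` and `tex_source` are hypothetical, and `application/x-tex` is assumed as the annotation encoding.

```python
import xml.etree.ElementTree as ET
from typing import Optional

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
TEX_ENCODING = "application/x-tex"  # assumed encoding label for TeX source


def with_tex_annotation(presentation: ET.Element, tex: str) -> ET.Element:
    """Wrap presentation MathML in <semantics> and attach the TeX source."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    semantics = ET.SubElement(math, "semantics")
    # The first child of <semantics> is the part that gets rendered.
    semantics.append(presentation)
    annotation = ET.SubElement(semantics, "annotation", {"encoding": TEX_ENCODING})
    annotation.text = tex
    return math


def tex_source(math: ET.Element) -> Optional[str]:
    """Return the TeX annotation text, or None if the formula has none."""
    for el in math.iter():
        # Strip a "{namespace}" prefix so parsed and hand-built trees both work.
        local_name = el.tag.rsplit("}", 1)[-1]
        if local_name == "annotation" and el.get("encoding") == TEX_ENCODING:
            return el.text
    return None


# Example: the fraction a/b with its original TeX attached.
frac = ET.Element("mfrac")
ET.SubElement(frac, "mi").text = "a"
ET.SubElement(frac, "mi").text = "b"
formula = with_tex_annotation(frac, r"\frac{a}{b}")
print(tex_source(formula))
```

Keeping the source next to the rendered markup lets tools such as copy-as-TeX or a re-conversion step recover the author's input without reverse-engineering the MathML.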
Topics
- arXiv HTML Papers
- MathML 4 Intent
- LaTeXML Rust Reimplementation
- Accessible Mathematics
- AI-Assisted Development
Best for: Software Engineer, AI Scientist, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.