Authorship Attribution in Multilingual Machine-Generated Texts
Summary
A new study introduces the challenge of multilingual authorship attribution (AA) for texts generated by large language models (LLMs), moving beyond traditional binary classification to identify the specific LLM generator or a human author across diverse languages. Existing AA work is largely confined to monolingual settings, primarily English. The researchers tested how well existing monolingual AA methods transfer across 18 languages, spanning multiple language families and writing scripts, and 8 generators (7 LLMs plus the human-authored class). The paper, accepted at ACL 2026, finds that some monolingual AA techniques can be adapted for multilingual use but that substantial limitations remain, particularly when transferring attribution capabilities across language families. The results underscore the complexity of multilingual AA and the need for more robust methods that hold up in real-world scenarios.
Key takeaway
NLP engineers building machine-generated text (MGT) detection systems should expect current authorship attribution methods to struggle in multilingual settings. If your application spans diverse languages, monolingual approaches are not enough: prioritize developing or integrating robust, genuinely multilingual AA models. Adapting existing monolingual methods offers only limited success and can misidentify the generator, especially across language families.
Key insights
Multilingual authorship attribution for LLM-generated text is complex, with current monolingual methods showing limited cross-lingual transferability across diverse language families.
Principles
- LLM fluency complicates MGT detection.
- Fine-grained AA is needed for LLM diversity.
- Cross-lingual transfer remains challenging.
Method
The study assessed whether monolingual AA methods are suitable for multilingual settings by testing their cross-lingual transferability and the impact of the generator across 18 languages and 8 generators (7 LLMs plus human).
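A cross-lingual transferability test of this kind can be sketched as a train-on-one-language, test-on-every-language loop. The sketch below is illustrative only and is not the study's pipeline: the character n-gram Naive Bayes attributor, the `transfer_matrix` helper, the `llm_a` label, and the toy sentences are all assumptions standing in for real AA methods and data.

```python
from collections import Counter, defaultdict
import math


def char_ngrams(text, n=3):
    """Return the character n-grams of a string (script-agnostic features)."""
    padded = f" {text} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Tiny multinomial Naive Bayes over character n-grams.

    Hypothetical stand-in for a real attribution model.
    """

    def fit(self, texts, labels):
        self.counts = defaultdict(Counter)   # n-gram counts per generator
        self.class_docs = Counter(labels)    # document counts per generator
        for text, label in zip(texts, labels):
            self.counts[label].update(char_ngrams(text))
        self.vocab = {g for c in self.counts.values() for g in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, grams in self.counts.items():
            total = sum(grams.values())
            score = math.log(self.class_docs[label] / total_docs)
            for g in char_ngrams(text):
                # Add-one smoothing, with one extra slot for unseen n-grams.
                score += math.log((grams[g] + 1) / (total + len(self.vocab) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def transfer_matrix(data):
    """Train on each source language, evaluate on every target language.

    data: {lang: {"train": [(text, generator)], "test": [(text, generator)]}}
    Returns {(source_lang, target_lang): accuracy}.
    """
    results = {}
    for src, src_split in data.items():
        texts, labels = zip(*src_split["train"])
        model = NgramNaiveBayes().fit(texts, labels)
        for tgt, tgt_split in data.items():
            hits = sum(model.predict(t) == y for t, y in tgt_split["test"])
            results[(src, tgt)] = hits / len(tgt_split["test"])
    return results


# Toy, made-up data: two languages, two "generators" (human and one LLM).
toy = {
    "en": {"train": [("the cat sat on the mat", "human"),
                     ("aaa bbb aaa bbb", "llm_a")],
           "test": [("the cat sat", "human"), ("aaa bbb", "llm_a")]},
    "de": {"train": [("die katze sitzt auf der matte", "human"),
                     ("xxx yyy xxx yyy", "llm_a")],
           "test": [("die katze sitzt", "human"), ("xxx yyy", "llm_a")]},
}
matrix = transfer_matrix(toy)
for (src, tgt), acc in sorted(matrix.items()):
    print(f"train={src} test={tgt} accuracy={acc:.2f}")
```

The diagonal entries (same source and target language) give the monolingual baseline, while the off-diagonal entries show how much attribution quality survives the language switch. In a real evaluation the toy classifier would be replaced by the AA method under test and accuracy by whatever metric the benchmark reports.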
In practice
- Adapt monolingual AA for new languages.
- Test AA methods across language families.
- Develop robust multilingual AA models.
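Testing across language families can be made concrete by grouping transfer scores into same-language, same-family, and cross-family buckets. The following is a minimal sketch under stated assumptions: the `summarize_by_family` helper is hypothetical, the four languages in `FAMILY` are examples rather than the study's language list, and the scores are placeholder numbers, not results from the paper.

```python
from statistics import mean

# Illustrative family map; these languages are examples, not the study's list.
FAMILY = {
    "en": "Indo-European",
    "de": "Indo-European",
    "ar": "Afro-Asiatic",
    "zh": "Sino-Tibetan",
}


def summarize_by_family(scores, family=FAMILY):
    """Average transfer scores by relationship between source and target.

    scores: {(source_lang, target_lang): score}
    Returns the mean score per bucket, or None for an empty bucket.
    """
    buckets = {"same_language": [], "same_family": [], "cross_family": []}
    for (src, tgt), score in scores.items():
        if src == tgt:
            key = "same_language"
        elif family[src] == family[tgt]:
            key = "same_family"
        else:
            key = "cross_family"
        buckets[key].append(score)
    return {k: (mean(v) if v else None) for k, v in buckets.items()}


# Placeholder scores for illustration only (NOT results from the study).
scores = {
    ("en", "en"): 0.9,
    ("en", "de"): 0.7,
    ("en", "zh"): 0.4,
    ("en", "ar"): 0.5,
}
summary = summarize_by_family(scores)
print(summary)
```

A large gap between the `same_family` and `cross_family` averages is the kind of signal that would suggest a method needs multilingual training data rather than simple adaptation.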
Topics
- Authorship Attribution
- Machine-Generated Text
- Large Language Models
- Multilingual NLP
- Cross-Lingual Transfer
- ACL 2026
Best for: Research Scientist, AI Scientist, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.