mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Advanced, quick

Summary

The mdok-style team participated in SemEval-2026 Task 9, a multilingual polarization detection task covering 22 languages and multiple cultural and event contexts. The task characterizes online polarization along three axes (detection, type, and manifestation), with the aim of catching it before it escalates into hate speech and social fragmentation. The team finetuned mid-sized Large Language Models (LLMs) for sequence classification using QLoRA, a parameter-efficient finetuning technique that trains low-rank adapters on top of a quantized base model. To make detection more robust, they augmented the multilingual training data with anonymized, lower-cased, upper-cased, and homoglyphied versions of the texts.
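The four augmentation variants named above can be sketched in a few lines of Python. This is a minimal illustration, not the team's code: the placeholder tokens `[URL]` and `[USER]`, the regular expressions used for anonymization, and the small Latin-to-Cyrillic `HOMOGLYPHS` table are all assumptions made for the example.

```python
import re

# Hypothetical homoglyph table: Latin letters mapped to look-alike Cyrillic letters.
HOMOGLYPHS = {
    "a": "\u0430", "c": "\u0441", "e": "\u0435", "o": "\u043e",
    "p": "\u0440", "x": "\u0445", "y": "\u0443",
}

# Assumed anonymization targets: URLs and @-mentions.
_URL = re.compile(r"https?://\S+")
_MENTION = re.compile(r"@\w+")


def anonymize(text):
    """Replace URLs and user mentions with placeholder tokens."""
    text = _URL.sub("[URL]", text)
    return _MENTION.sub("[USER]", text)


def homoglyph(text):
    """Swap selected Latin letters for visually similar Cyrillic ones."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def augment(text, label):
    """Return the original text plus its augmented variants, all with the same label."""
    variants = [text, anonymize(text), text.lower(), text.upper(), homoglyph(text)]
    seen, rows = set(), []
    for variant in variants:
        if variant not in seen:  # skip variants identical to an earlier one
            seen.add(variant)
            rows.append((variant, label))
    return rows
```

Calling `augment(text, label)` on every training example yields up to five labeled rows per example. Duplicates are dropped, which matters for scripts without letter case, where the lower-cased and upper-cased variants equal the original.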

Key takeaway

For research scientists developing online content moderation systems, the main point of interest is how well QLoRA finetuning on augmented multilingual data works for polarization detection. Consider adopting similar data augmentation strategies and parameter-efficient finetuning techniques to improve the robustness and scalability of your models across diverse linguistic environments.

Key insights

Finetuning mid-sized LLMs with QLoRA and augmented data improves multilingual online polarization detection.

Principles

Method

Finetune mid-sized LLMs for sequence classification using QLoRA, augmenting the multilingual training data with anonymized, lower-cased, upper-cased, and homoglyphied versions.
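A QLoRA setup for sequence classification might look roughly like the sketch below, written against the Hugging Face `transformers` and `peft` libraries (with `bitsandbytes` installed for 4-bit loading). The source does not state the base models or hyperparameters used, so everything here is an assumption for illustration: the rank, alpha, and dropout values, the Llama-style `target_modules` names, the binary `num_labels=2` default, and the `build_qlora_classifier` helper itself.

```python
# Illustrative hyperparameters (assumptions, not values reported in the source).
QLORA_HPARAMS = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    # Assumes Llama-style attention projection names; other architectures differ.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
}


def build_qlora_classifier(model_name, num_labels=2):
    """Load a 4-bit quantized base model and attach LoRA adapters for classification."""
    # Imported lazily so this module can be read and imported without the GPU stack.
    import torch
    from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA).
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    if tokenizer.pad_token is None:
        # Many decoder-only models ship without a pad token; reuse EOS for padding.
        tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_labels,
        quantization_config=bnb_config,
    )
    model.config.pad_token_id = tokenizer.pad_token_id

    # Prepare the quantized model for training, then add trainable low-rank adapters.
    model = prepare_model_for_kbit_training(model)
    lora_config = LoraConfig(task_type=TaskType.SEQ_CLS, **QLORA_HPARAMS)
    return tokenizer, get_peft_model(model, lora_config)
```

The returned model can be passed to a standard training loop or the `transformers` `Trainer`. Only the adapter weights and the classification head are updated, which is what keeps finetuning a mid-sized model feasible on limited hardware.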

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.