PluRule: A Benchmark for Moderating Pluralistic Communities on Social Media

Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

PluRule is a multimodal, multilingual benchmark for evaluating how well AI models moderate pluralistic online communities, using Reddit as its setting. It formalizes moderation as a multiple-choice task: given a comment in its full context, a model must identify which community-defined rule, if any, the comment violates. The benchmark comprises 13,371 moderation instances across 1,989 Reddit communities and 9 languages, spanning 2,885 community-defined rules, 72,675 comments, and 3,643 images. State-of-the-art vision-language models show significant limitations: GPT-5.2 (high reasoning) achieved only 57.7% accuracy, 7.7 percentage points above a trivial 50% baseline. Models performed better on universal rules such as civility (69%) and self-promotion (63%) but struggled with context-dependent rules such as low-effort (43%), relevance (44%), and evidence-based (47%) violations.

Key takeaway

For research scientists developing AI moderation tools, this study highlights a critical gap: current vision-language models cannot effectively handle the contextual nuances of pluralistic community rules. You should prioritize developing models capable of understanding implicit community norms and context-dependent rule interpretations, possibly through fine-tuning on community-specific examples or retrieval-augmented methods, rather than relying on universal rule enforcement.
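The retrieval-augmented direction mentioned above can be illustrated with a minimal sketch. It is not a method from the paper: the `PastDecision` record, the `retrieve_examples` helper, and the Jaccard token-overlap ranking are all assumptions chosen to keep the example dependency-free. A real system would more likely use multilingual embeddings and would also need to handle images.

```python
import re
from dataclasses import dataclass


@dataclass
class PastDecision:
    """Hypothetical record of an earlier moderation decision (not PluRule's schema)."""
    community: str
    comment: str
    violated_rule: str  # empty string when no rule was violated


def tokens(text: str) -> set:
    # \w+ is Unicode-aware in Python 3, so this also splits non-English text.
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: set, b: set) -> float:
    # Size of the intersection over size of the union; 0.0 for two empty sets.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_examples(query: str, community: str, history: list, k: int = 3) -> list:
    """Return up to k past decisions from the same community, most similar first."""
    query_tokens = tokens(query)
    same_community = [d for d in history if d.community == community]
    ranked = sorted(
        same_community,
        key=lambda d: jaccard(query_tokens, tokens(d.comment)),
        reverse=True,
    )
    return ranked[:k]
```

The retrieved decisions would be added to the model's prompt as community-specific precedents. Restricting retrieval to the same community is the point of the sketch: the same comment can be acceptable in one community and a violation in another.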

Key insights

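AI models struggle with context-dependent content moderation in pluralistic online communities: GPT-5.2 (high reasoning) reaches 57.7% accuracy against a trivial 50% baseline, and accuracy falls below 50% on low-effort, relevance, and evidence-based rules.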

Method

PluRule formalizes moderation as a multiple-choice task, providing models with comments, community rules, and full conversational context to identify specific rule violations across diverse subreddits and languages.
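The multiple-choice framing can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark's actual schema or prompt: the `ModerationInstance` fields, the option numbering (0 for "no violation"), the choice to list only the comment's own community rules as options, and the `category` label are all hypothetical, and images are omitted for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModerationInstance:
    """One hypothetical benchmark item; field names are illustrative only."""
    community: str
    rules: list          # community-defined rules, as strings
    context: list        # earlier comments in the thread, oldest first
    comment: str         # the comment to be judged
    gold: Optional[int]  # index into `rules`, or None for "no violation"
    category: str = "none"  # coarse rule category, e.g. "civility"


def build_prompt(inst: ModerationInstance) -> str:
    """Render the instance as a numbered multiple-choice question."""
    lines = [f"Community: {inst.community}", "", "Conversation so far:"]
    lines += [f"  {turn}" for turn in inst.context]
    lines += ["", f"Comment to review: {inst.comment}", ""]
    lines.append("Which rule, if any, does the comment violate?")
    lines.append("(0) No rule is violated")
    lines += [f"({i}) {rule}" for i, rule in enumerate(inst.rules, start=1)]
    lines.append("Answer with a single option number.")
    return "\n".join(lines)


def option_to_gold(option: int) -> Optional[int]:
    """Map an option number back to a rule index; option 0 means no violation."""
    return None if option == 0 else option - 1


def accuracy_by_category(instances: list, chosen_options: list) -> dict:
    """Fraction of correct choices, grouped by the instance's rule category."""
    hits, totals = defaultdict(int), defaultdict(int)
    for inst, option in zip(instances, chosen_options):
        totals[inst.category] += 1
        hits[inst.category] += int(option_to_gold(option) == inst.gold)
    return {cat: hits[cat] / totals[cat] for cat in totals}
```

Grouping accuracy by category mirrors how the summary reports results (for example, civility versus relevance), which is where the gap between universal and context-dependent rules becomes visible.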

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.