RacismoBR: A Manually Annotated Dataset for Racist Discourse Detection in Brazilian Portuguese
Summary
RacismoBR is a new, culturally grounded dataset for detecting racist discourse in Brazilian Portuguese social media, covering both explicit and subtle forms of racism. It was manually annotated exclusively by Black researchers to ensure sociolinguistic validity. The authors used it to evaluate three families of classifiers: classical machine learning, supervised Transformer-based (small) language models, and large language models such as GPT-4.1 under few-shot in-context learning. GPT-4.1 and BERTimbau achieved the highest Macro-F1 scores, but Wilcoxon signed-rank tests found no statistically significant differences across models, which the summary attributes to high variability. Classifiers consistently showed higher precision on non-racist content and higher recall on racist content. Qualitative analysis revealed persistent difficulties with implicit, euphemized, and context-dependent racism, suggesting that culturally informed annotation matters more than architectural complexity for improving racism detection.
Key takeaway
For research scientists developing hate speech detection systems, prioritize culturally grounded dataset annotation over solely pursuing advanced model architectures. Your efforts should focus on ensuring sociolinguistic validity in data collection, especially for nuanced forms of racism like euphemized or context-dependent discourse. This approach will likely yield more robust and accurate classifiers than simply deploying the latest large language models without specialized data.
Key insights
The results suggest that culturally grounded annotation matters more than model architecture for effective racism detection.
Principles
- Racism detection requires sociolinguistic validity.
- Implicit racism remains a significant challenge.
Method
Black researchers manually annotated a Brazilian Portuguese dataset. The study then ran binary classification with classical ML, Transformer-based LMs, and few-shot LLMs, evaluating performance with Macro-F1 and comparing models with Wilcoxon signed-rank tests.
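The evaluation described in this section can be illustrated with a minimal sketch in standard-library Python. It is not the authors' code. The label encoding (1 = racist, 0 = non-racist), the function names, and the per-fold scores are all hypothetical, and the exact Wilcoxon signed-rank test shown here assumes a small number of paired scores with no zero or tied differences.

```python
from itertools import product


def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, F1) for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of per-class F1, so both classes count equally."""
    return sum(per_class_scores(y_true, y_pred, c)[2] for c in labels) / len(labels)


def wilcoxon_exact(scores_a, scores_b):
    """Two-sided exact Wilcoxon signed-rank test on paired scores.

    Assumes no zero differences and no tied absolute differences,
    and enumerates all 2**n sign patterns, so keep n small.
    Returns (statistic, p_value), with statistic = min(W+, W-).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    abs_diffs = [abs(d) for d in diffs]
    if 0 in abs_diffs or len(set(abs_diffs)) != n:
        raise ValueError("zero or tied differences are not handled in this sketch")
    # Rank absolute differences from 1 (smallest) to n (largest).
    order = sorted(range(n), key=lambda i: abs_diffs[i])
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total = n * (n + 1) // 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    stat = min(w_plus, total - w_plus)
    # Under the null hypothesis every rank is positive or negative with
    # probability 1/2; count sign patterns at least as extreme as observed.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(range(1, n + 1), signs) if keep)
        if min(s, total - s) <= stat:
            extreme += 1
    return stat, extreme / 2 ** n


# Hypothetical gold labels and predictions (1 = racist, 0 = non-racist).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
print("Macro-F1:", round(macro_f1(y_true, y_pred), 3))

# Hypothetical per-fold Macro-F1 scores for two models.
model_a = [0.81, 0.74, 0.88, 0.69, 0.79]
model_b = [0.78, 0.76, 0.83, 0.70, 0.72]
print("Wilcoxon (statistic, p):", wilcoxon_exact(model_a, model_b))
```

With only five paired scores, the smallest two-sided p-value this test can produce is 2/32 = 0.0625, so a small number of runs cannot reach the usual 0.05 threshold even when one model wins every comparison. How many paired scores the study actually compared is not stated here.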
In practice
- Prioritize culturally informed dataset creation.
- Target implicit, euphemized, and context-dependent racism when training and evaluating models.
Topics
- RacismoBR Dataset
- Racist Discourse Detection
- Brazilian Portuguese
- Natural Language Processing
- Culturally Grounded Annotation
Best for: Research Scientist, AI Scientist, NLP Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.