Socially Responsible and Explainable Automated Fact-Checking and Hate Speech Detection

2026-04-12 · Source: Paper Index on ACL Anthology · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, medium

Summary

A Ph.D. dissertation by Francielle Vargas, Fabrício Benevenuto, and Thiago A. S. Pardo advances Natural Language Processing (NLP) for Portuguese, focusing on hate speech detection and automated fact-checking. The work introduces several benchmark datasets for Brazilian Portuguese, including HateBR, HateBRXplain, HateBRMoralXplain, MFTCXplain, MOL, and FactNews, which address resource gaps and have been adopted by the research community. The dissertation also proposes novel explainable NLP methods: Sentence-Level Factual Reasoning (SELFAR), Social Stereotype Analysis (SSA), Contextual Bag-of-Words with Interpretable Input and Feature Optimization (B+M), Supervised Rational Attention (SRA), and Supervised Moral Rational Attention (SMRA). These methods reportedly outperform baselines across various Portuguese tasks and datasets, improving interpretability and robustness and showing that explainability and performance can be optimized jointly. The thesis has received national and international recognition, including awards such as Google LARA and the ACL Diversity and Inclusion Award.

Key takeaway

For research scientists developing NLP models for Portuguese, this work highlights the importance of integrating explainability from the outset. You should explore the proposed datasets like HateBRXplain and FactNews to enhance model training and evaluation. Consider implementing methods such as Supervised Rational Attention (SRA) or Supervised Moral Rational Attention (SMRA) to improve both performance and the interpretability of your hate speech detection and fact-checking systems.

Key insights

Explainable NLP methods and high-quality datasets can jointly optimize performance and interpretability for Portuguese hate speech and fact-checking.

Principles

Explainability and performance are jointly optimizable.
High-quality datasets are crucial for NLP advancement.

Method

The dissertation proposes novel post-hoc and self-explaining NLP methods, including SELFAR, SSA, B+M, SRA, and SMRA, for hate speech detection and automated fact-checking in Portuguese.
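
As an illustration of what jointly optimizing explainability and performance can look like for a self-explaining model, the sketch below combines a classification loss with a term that rewards attention on human-annotated rationale tokens. It is a generic rationale-supervised attention objective written under stated assumptions, not the dissertation's SRA or SMRA formulation: the function name `joint_loss`, the weight `lam`, the binary `rationale_mask`, and the choice of cross-entropy for the alignment term are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(class_logits, gold_label, attention_scores, rationale_mask, lam=1.0):
    """Return (total, classification, alignment) losses for one example.

    Assumption: rationale_mask holds 1 for tokens annotators marked as the
    rationale and 0 otherwise; lam is a hypothetical weight on the
    explanation term.
    """
    # Standard cross-entropy on the predicted class distribution.
    probs = softmax(class_logits)
    ce = -math.log(probs[gold_label])

    # Without annotated rationale tokens there is nothing to align to.
    n_rationale = sum(rationale_mask)
    if n_rationale == 0:
        return ce, ce, 0.0

    # Cross-entropy between a uniform distribution over rationale tokens
    # and the model's attention distribution over all tokens.
    attn = softmax(attention_scores)
    target = [m / n_rationale for m in rationale_mask]
    align = -sum(t * math.log(a) for t, a in zip(target, attn) if t > 0)

    return ce + lam * align, ce, align


if __name__ == "__main__":
    mask = [0, 1, 1, 0]  # tokens 1 and 2 are the annotated rationale
    print(joint_loss([2.0, 0.0], 0, [0.0, 3.0, 3.0, 0.0], mask))
    print(joint_loss([2.0, 0.0], 0, [3.0, 0.0, 0.0, 3.0], mask))
```

With identical class logits, the example whose attention concentrates on the rationale tokens gets the lower total loss, so training under such an objective pushes the model toward both correct labels and human-aligned explanations.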

In practice

Use the HateBR and FactNews datasets for Portuguese NLP.
Apply SELFAR for explainable fact-checking, or SRA and SMRA for explainable hate speech detection.

Topics

Natural Language Processing
Hate Speech Detection
Automated Fact-Checking
Explainable AI
Brazilian Portuguese Datasets

Best for: Research Scientist, AI Scientist, NLP Engineer, AI Student

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.