Modeling Linguistic Violence: An Ontology-Based Framework for the Computational Analysis of Violence Manifested in Language
Summary
A new framework addresses the conceptual ambiguity among terms like "hate speech," "toxic speech," and "dangerous speech," which currently hinders both research and automated content moderation. The framework introduces a unified ontology that differentiates forms of linguistic violence, including verbal aggression and cyberbullying, by attributes such as target, intent, and rhetorical hallmarks. It also proposes a computational methodology that operationalizes this ontology as a multi-stage NLP pipeline. The pipeline uses semantic analysis, specifically Semantic Role Labeling and Named Entity Recognition, to break speech acts down into core components such as target and action. This component-based classification aims to distinguish nuanced phenomena, such as general insults versus targeted identity-based attacks. Because it relies on multilingual semantic models, the approach is presented as particularly beneficial for low-resource languages like Portuguese.
Key takeaway
For AI engineers developing content moderation systems, this framework offers a component-based approach to classifying linguistic violence. Consider integrating an ontology-based, multi-stage NLP pipeline that uses Semantic Role Labeling and Named Entity Recognition to move beyond lexical cues, especially when dealing with implicit harm or low-resource languages like Portuguese.
Key insights
An ontology and multi-stage NLP pipeline clarify and classify linguistic violence beyond lexical cues.
Principles
- Linguistic violence requires nuanced, attribute-based differentiation.
- Semantic analysis improves classification of implicit harm.
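To make attribute-based differentiation concrete, the sketch below encodes an utterance as a small record of attributes and maps it to a category. It is only an illustration: the attribute names (`target`, `identity_based`, `hostile_intent`, `repeated`), the category labels, and the decision rules are assumptions made for this example and are not taken from the original article's ontology.

```python
from dataclasses import dataclass
from enum import Enum


class TargetKind(Enum):
    NONE = "none"
    INDIVIDUAL = "individual"
    GROUP = "group"


@dataclass(frozen=True)
class Utterance:
    """Hypothetical attribute record for one utterance (illustrative only)."""
    target: TargetKind
    identity_based: bool    # the attack refers to the target's identity
    hostile_intent: bool    # the speaker intends harm
    repeated: bool = False  # part of sustained behaviour toward one target


def categorize(u: Utterance) -> str:
    """Map attributes to a category using assumed, simplified rules."""
    if not u.hostile_intent:
        return "non_violent"
    if u.identity_based:
        return "identity_based_attack"
    if u.target is TargetKind.INDIVIDUAL and u.repeated:
        return "cyberbullying"
    if u.target is not TargetKind.NONE:
        return "verbal_aggression"
    return "general_insult"


if __name__ == "__main__":
    print(categorize(Utterance(TargetKind.GROUP, True, True)))
    print(categorize(Utterance(TargetKind.NONE, False, True)))
```

The point of the design is that two utterances with similar wording can land in different categories because their attributes differ, which a purely lexical classifier cannot express.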
Method
The proposed method is a multi-stage NLP pipeline. Semantic Role Labeling and Named Entity Recognition deconstruct each speech act into components such as target and action, and these components are then classified at a granular level against the framework's ontology.
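A minimal sketch of how such stages might fit together is shown below. It does not call a real SRL or NER model: the `Frame` and `EntityTypes` inputs stand in for their outputs, the PropBank-style role names (`V`, `ARG0`, `ARG1`), the `IDENTITY_GROUP` entity type, the hostile-verb set, and the category labels are all assumptions for illustration, and the rules are not the article's actual classifier.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Stand-in for SRL output: role label -> text span for one predicate.
Frame = Dict[str, str]
# Stand-in for NER / entity typing output: mention text -> entity type.
EntityTypes = Dict[str, str]


@dataclass
class Components:
    action: str
    target: Optional[str]
    target_type: Optional[str]


def extract_components(frame: Frame, entities: EntityTypes) -> Components:
    """Stage 2: combine SRL roles and entity types into target/action."""
    target = frame.get("ARG1")  # assumed: ARG1 holds the affected party
    target_type = entities.get(target) if target is not None else None
    return Components(action=frame["V"], target=target, target_type=target_type)


def classify(c: Components, hostile_verbs: Set[str]) -> str:
    """Stage 3: assumed, simplified rules over the extracted components."""
    if c.action not in hostile_verbs:
        return "no_violence_detected"
    if c.target_type == "IDENTITY_GROUP":
        return "identity_based_attack"
    if c.target is not None:
        return "targeted_aggression"
    return "general_insult"


if __name__ == "__main__":
    # Hand-written inputs; a real system would obtain them from models.
    frame = {"V": "despise", "ARG0": "I", "ARG1": "those people"}
    entities = {"those people": "IDENTITY_GROUP"}
    hostile = {"despise", "threaten"}
    print(classify(extract_components(frame, entities), hostile))
```

Because the classification stage only sees components, the extraction stage could in principle be swapped for a different model per language without changing the rules.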
In practice
- Apply semantic analysis to detect implicit harm that lexical cues miss.
- Utilize multilingual models for low-resource languages.
Topics
- Linguistic Violence
- Ontology-Based Framework
- Semantic Analysis
- NLP Pipeline
- Automated Moderation
Best for: Research Scientist, AI Engineer, AI Product Manager, AI Scientist, NLP Engineer, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.