TeamHerald@CHIPSAL 2026: Hate Speech Detection and Sentiment Analysis of Nepali Memes using Transformer-based Architectures and Ensemble Learning
Summary
The study analyzes Nepali internet memes for hate speech detection and sentiment analysis, addressing challenges like code-mixing and a lack of established baseline resources. Focusing on a text-centric approach, the research extracted embedded text using an OCR layer and modeled it with six distinct Transformer-based architectures. It investigated the comparative effectiveness of Hard and Soft Voting ensemble strategies across two tasks: binary hate speech detection and three-class sentiment analysis. Experimental results showed that a standalone decoder-only model achieved the highest performance for binary classification. Conversely, the Soft Voting ensemble performed best for the multi-class sentiment task, yielding a 15.8% relative improvement in Macro F1-score over the strongest standalone baseline. These findings highlight that ensemble strategies behave differently across binary and multi-class tasks, underscoring the importance of selecting aggregation methods suited to the classification objective.
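The two quantities the summary leans on, Macro F1-score and relative improvement, can be made concrete with a minimal sketch. The function names, labels, and scores below are hypothetical illustrations, not the paper's code or data; the sketch only assumes the standard definitions (Macro F1 as the unweighted mean of per-class F1, relative improvement as the gain divided by the baseline score).

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (standard Macro F1 definition)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_improvement(baseline, new):
    """Gain expressed as a fraction of the baseline score."""
    return (new - baseline) / baseline


# Hypothetical three-class labels, for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(macro_f1(y_true, y_pred, labels=[0, 1, 2]))

# Hypothetical scores: a baseline of 0.500 and an ensemble of 0.579
# correspond to a relative improvement of about 15.8%, not 15.8 points.
print(relative_improvement(0.500, 0.579))
```

The distinction matters when reading the headline number: a 15.8% relative improvement is a smaller absolute change than a 15.8-point increase in Macro F1.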
Key takeaway
NLP engineers building hate speech or sentiment analysis systems for code-mixed languages such as Nepali should choose ensemble aggregation methods according to the classification objective. In this study, a standalone decoder-only model performed best on the binary hate speech task, while a Soft Voting ensemble performed best on the three-class sentiment task, with a 15.8% relative improvement in Macro F1-score over the strongest standalone baseline. Evaluate both options on your own task before committing to an aggregation strategy.
Key insights
Ensemble strategies for meme analysis perform differently for binary vs. multi-class tasks, requiring task-specific aggregation.
Principles
- Ensemble effectiveness varies by task type.
- Code-mixing complicates Nepali meme analysis.
- A text-centric, OCR-based approach is viable for memes.
Method
The authors extracted embedded text from Nepali memes with an OCR layer, then modeled it using six Transformer-based architectures. They evaluated Hard and Soft Voting ensembles on two tasks: binary hate speech detection and three-class sentiment analysis.
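The difference between the two aggregation strategies can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: Hard Voting is taken as a majority vote over each model's predicted label, Soft Voting as the argmax of the averaged class probabilities, and the function names and the three models' probabilities are hypothetical.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over per-model predicted labels.

    Ties are resolved in favor of the label encountered first.
    """
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-class probabilities across models, then take the argmax.

    Returns the winning class index and the averaged distribution.
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return best, mean


# Hypothetical class probabilities from three models for one meme (three classes)
probs = [
    [0.40, 0.35, 0.25],  # model A: weakly prefers class 0
    [0.45, 0.40, 0.15],  # model B: weakly prefers class 0
    [0.05, 0.90, 0.05],  # model C: strongly prefers class 1
]

# Each model's own top class, as Hard Voting would see it
labels = [max(range(len(p)), key=lambda c: p[c]) for p in probs]

print(hard_vote(labels))    # two of three models vote for class 0
print(soft_vote(probs)[0])  # averaged probabilities favor class 1
```

In this example the two strategies disagree: Hard Voting discards each model's confidence, while Soft Voting lets one confident model outweigh two uncertain ones. With three classes there is more room for such split, low-confidence votes than in a binary task.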
In practice
- Consider a standalone decoder-only model for binary tasks.
- Consider Soft Voting ensembles for multi-class tasks.
- Consider OCR for meme text extraction.
Topics
- Hate Speech Detection
- Sentiment Analysis
- Nepali Memes
- Transformer Architectures
- Ensemble Learning
- Code-mixing NLP
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.