TeamHerald@CHIPSAL 2026: Hate Speech Detection and Sentiment Analysis of Nepali Memes using Transformer-based Architectures and Ensemble Learning

2026-06-07 · Source: Machine Learning · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Computer Vision & Pattern Recognition · Depth: Expert, quick

Summary

The study analyzes Nepali internet memes for hate speech detection and sentiment analysis, addressing challenges like code-mixing and a lack of established baseline resources. Focusing on a text-centric approach, the research extracted embedded text using an OCR layer and modeled it with six distinct Transformer-based architectures. It investigated the comparative effectiveness of Hard and Soft Voting ensemble strategies across two tasks: binary hate speech detection and three-class sentiment analysis. Experimental results showed that a standalone decoder-only model achieved the highest performance for binary classification. Conversely, the Soft Voting ensemble performed best for the multi-class sentiment task, yielding a 15.8% relative improvement in Macro F1-score over the strongest standalone baseline. These findings highlight that ensemble strategies behave differently across binary and multi-class tasks, underscoring the importance of selecting aggregation methods suited to the classification objective.
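As a rough illustration of the metric behind the reported gain, the sketch below computes a Macro F1-score (the unweighted mean of per-class F1) and a relative improvement over a baseline. The function names, the toy labels, and the scores 0.500 and 0.579 are hypothetical placeholders chosen only to make the arithmetic visible; they are not values from the paper.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class that is never true and never predicted contributes F1 = 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_improvement(new_score, baseline_score):
    """Gain expressed as a fraction of the baseline, not in absolute points."""
    return (new_score - baseline_score) / baseline_score


# Toy three-class example (hypothetical labels, not the paper's data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
toy_macro_f1 = macro_f1(y_true, y_pred, labels=[0, 1, 2])

# Hypothetical scores: a baseline of 0.500 and an ensemble of 0.579
# correspond to a 15.8% relative gain (0.079 absolute points).
toy_gain = relative_improvement(0.579, 0.500)
```

A relative gain depends on the baseline: the same absolute difference in Macro F1 yields a larger relative figure when the baseline score is low.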

Key takeaway

NLP engineers building hate speech or sentiment analysis systems for code-mixed languages such as Nepali should match the ensemble aggregation method to the classification objective. In this study, a standalone decoder-only model performed best on binary hate speech detection, while a Soft Voting ensemble performed best on three-class sentiment analysis, with a 15.8% relative Macro F1-score gain over the strongest standalone baseline.

Key insights

Ensemble strategies for meme analysis perform differently for binary vs. multi-class tasks, requiring task-specific aggregation.

Principles

Ensemble effectiveness varies by task type.
Code-mixing complicates Nepali meme analysis.
A text-centric OCR approach is viable for memes.

Method

Extracted embedded text from Nepali memes via OCR, then modeled it using six Transformer-based architectures. Evaluated Hard and Soft Voting ensembles for binary hate speech and three-class sentiment analysis.
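The difference between the two aggregation strategies can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the example probabilities are hypothetical, and each model is assumed to output one probability per class for a single sample.

```python
from collections import Counter


def hard_vote(model_probs):
    """Majority vote over each model's top-1 label for one sample."""
    labels = [max(range(len(p)), key=p.__getitem__) for p in model_probs]
    # Counter.most_common orders ties by first occurrence, so a tie
    # goes to the label predicted by the earliest model in the list.
    return Counter(labels).most_common(1)[0][0]


def soft_vote(model_probs):
    """Average the class probabilities across models, then take the argmax."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    mean = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


# Hypothetical three-class outputs from three models for one meme text.
# Two models weakly prefer class 0; one model strongly prefers class 1.
probs = [
    [0.40, 0.35, 0.25],
    [0.40, 0.35, 0.25],
    [0.05, 0.90, 0.05],
]

hard_label = hard_vote(probs)  # class 0: two of three top-1 votes
soft_label = soft_vote(probs)  # class 1: highest mean probability
```

Hard Voting discards each model's confidence, while Soft Voting lets one confident model outweigh two uncertain ones. That difference is one plausible reason the two strategies can rank differently on binary and multi-class tasks.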

In practice

Use a standalone decoder-only model for binary tasks.
Apply Soft Voting ensembles for multi-class tasks.
Consider OCR for meme text extraction.

Topics

Hate Speech Detection
Sentiment Analysis
Nepali Memes
Transformer Architectures
Ensemble Learning
Code-mixing NLP

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.