LLM-based Detection of Manipulative Political Narratives

Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

A new computational framework leverages Large Language Models (LLMs) to detect and structure manipulative political narratives within large, unfiltered social media datasets. The pipeline processes over 1.2 million posts from X, Reddit, and Telegram, with an 80% German and 20% English split, collected between January and February 2025. It employs a prompt-based filtering step using the Qwen3.5-122B-A10B-FP8 model to differentiate manipulative content from legitimate critique, achieving an F1 score of 0.77 with high recall (0.92). Subsequently, posts are embedded using Qwen3-Embedding-8B, dimensionality-reduced with UMAP, and clustered into 41 distinct narrative groups using HDBSCAN with a minimum cluster size of 400. Finally, the Qwen3.5-397B-A17B-FP8 model extracts detailed strategic narratives for each cluster, identifying themes like "The Great Replacement" and "The Proxy War."
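The summary reports F1 and recall for the filtering step but not precision. As a rough check, precision can be back-calculated from the F1 definition, F1 = 2PR / (P + R). The sketch below assumes both reported figures refer to the same (manipulative) class and treats the rounded values as exact, so the result is only approximate. The function name is illustrative, not taken from the paper.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("inconsistent inputs: F1 must be below 2 * recall")
    return f1 * recall / denom


# Figures as reported in the summary (rounded, so the result is approximate).
precision = implied_precision(f1=0.77, recall=0.92)
print(f"implied precision ~ {precision:.2f}")  # prints: implied precision ~ 0.66
```

Under these assumptions, roughly one in three posts passed by the filter would be a false positive, which fits a recall-oriented first stage whose output is then clustered and labeled downstream.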

Key takeaway

For NLP engineers and research scientists working on disinformation detection, this framework offers a way to identify and structure manipulative narratives in real-world social media data. Consider combining prompt-based LLM filtering with unsupervised clustering to move beyond traditional topic modeling and capture manipulative intent, especially in large, uncurated datasets. This can help surface emerging Foreign Information Manipulation and Interference (FIMI) campaigns without relying on predefined categories.

Key insights

An LLM-driven pipeline can identify and cluster manipulative political narratives in raw social media data: here, the filtering step reached an F1 score of 0.77, and clustering produced 41 narrative groups.

Principles

Method

The method chains five stages to structure manipulative political narratives: prompt-based LLM filtering, intent-driven embedding generation, UMAP dimensionality reduction, HDBSCAN clustering, and LLM-based narrative labeling.
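The order of the stages can be sketched as a small orchestration function. This is a minimal illustration, not the authors' implementation: every stage is passed in as a plain callable, and the names `structure_narratives`, `Narrative`, and the parameter names are hypothetical. The `-1` noise label follows the convention common HDBSCAN implementations use for points that belong to no cluster.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
NOISE = -1  # label commonly assigned by HDBSCAN implementations to unclustered points


@dataclass
class Narrative:
    cluster_id: int
    label: str
    posts: List[str]


def structure_narratives(
    posts: List[str],
    is_manipulative: Callable[[str], bool],                # stage 1: prompt-based LLM filter
    embed: Callable[[List[str]], List[Vector]],            # stage 2: embedding model
    reduce_dims: Callable[[List[Vector]], List[Vector]],   # stage 3: e.g. UMAP
    cluster: Callable[[List[Vector]], List[int]],          # stage 4: e.g. HDBSCAN
    label_cluster: Callable[[List[str]], str],             # stage 5: LLM narrative labeling
) -> List[Narrative]:
    # Stage 1: keep only posts the filter flags as manipulative.
    kept = [p for p in posts if is_manipulative(p)]
    if not kept:
        return []

    # Stages 2-4: embed, reduce dimensionality, assign cluster ids.
    assignments = cluster(reduce_dims(embed(kept)))

    # Group posts by cluster id, discarding noise points.
    groups: Dict[int, List[str]] = defaultdict(list)
    for post, cluster_id in zip(kept, assignments):
        if cluster_id != NOISE:
            groups[cluster_id].append(post)

    # Stage 5: ask the labeling model for one narrative per cluster.
    return [
        Narrative(cluster_id, label_cluster(members), members)
        for cluster_id, members in sorted(groups.items())
    ]
```

Passing the stages as callables keeps the sketch independent of any particular model server or clustering library. In a real run, each argument would wrap the corresponding component named in the summary (the filtering model, the embedding model, UMAP, HDBSCAN with a minimum cluster size of 400, and the labeling model).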

Best for: AI Scientist, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.