On GATE, Text and Social Media Analysis, and Detecting Misinformation Online

· Source: On GATE, Text and Social Media Analysis, and Detecting Misinformation Online · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Advanced, medium

Summary

A Master's thesis project explored Retrieval-Augmented Generation (RAG) systems for newsrooms, aiming to strengthen journalistic integrity and the traceability of AI-assisted work. The researcher, Tasos Galanopoulos, visited the University of Sheffield's GATE team from April 7-17, 2026, to develop and test a configurable RAG system. Built with Streamlit, ChromaDB, and open-access LLMs such as Mistral and DeepSeek, the system lets journalists configure parameters interactively and evaluate the outputs. Experiments covered four journalistic datasets (economic reports, political interviews, newspaper editorials, central bank reports) and four response styles (Strict RAG, Journalistic Style, Analysis & Key Points, Archivist). Performance was measured with Faithfulness, Answer Relevance, Context Precision, and Ground Truth Similarity; the results indicate that dataset structure affects RAG performance more than parameter tuning does.

Key takeaway

AI scientists building tools for newsrooms should recognize that a "one-size-fits-all" RAG assistant is not viable. Prioritize adaptive RAG systems that adjust retrieval and generation parameters to the characteristics of each dataset and the requirements of each journalistic task, so that answers stay both grounded and relevant.
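
One way to picture the adaptive approach is a parameter lookup keyed on dataset type and response style. The sketch below is an illustration only: the names RagParams, DATASET_PROFILES, STYLE_TEMPERATURE, and select_params are hypothetical, and every numeric value is a placeholder rather than a setting or result from the thesis.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RagParams:
    chunk_size: int      # characters per retrieved chunk
    top_k: int           # number of chunks retrieved per query
    temperature: float   # sampling temperature passed to the LLM


# Fallback used when the dataset type is not recognised.
DEFAULT = RagParams(chunk_size=800, top_k=4, temperature=0.2)

# Placeholder values for illustration; NOT figures reported in the thesis.
DATASET_PROFILES = {
    "economic_reports": RagParams(chunk_size=1000, top_k=4, temperature=0.1),
    "political_interviews": RagParams(chunk_size=500, top_k=6, temperature=0.2),
}

# Assumed mapping from response style to temperature (also hypothetical).
STYLE_TEMPERATURE = {"Strict RAG": 0.0, "Journalistic Style": 0.4}


def select_params(dataset_kind: str, style: str) -> RagParams:
    """Pick a base profile for the dataset, then apply any style override."""
    base = DATASET_PROFILES.get(dataset_kind, DEFAULT)
    temperature = STYLE_TEMPERATURE.get(style)
    if temperature is None:
        return base
    return replace(base, temperature=temperature)


params = select_params("political_interviews", "Strict RAG")
```

A real system would more plausibly derive these settings from measured properties of the uploaded documents; the static table only shows where such a decision would plug in.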

Key insights

RAG systems offer traceable AI assistance for journalism, but performance heavily depends on dataset characteristics, not just parameter tuning.

Method

A RAG application was developed using Streamlit, ChromaDB, and open-access LLMs. It allows users to upload documents, configure retrieval/generation parameters, and run quantitative assessments using embedding-based metrics.
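
The retrieve-then-score loop described here can be sketched in a few lines. This is a minimal, standard-library illustration under stated assumptions: a toy bag-of-words vector stands in for the embedding model, an in-memory list stands in for ChromaDB, and the helper names (embed, cosine, retrieve, ground_truth_similarity) are hypothetical. It is not the thesis implementation, and the cosine-based score is only one plausible reading of an embedding-based Ground Truth Similarity metric.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


def ground_truth_similarity(answer: str, reference: str) -> float:
    """Assumed metric: similarity of a generated answer to a reference answer."""
    return cosine(embed(answer), embed(reference))


# Invented example documents, one per "dataset" flavour.
chunks = [
    "The central bank raised interest rates by a quarter point.",
    "The editorial criticised the new housing policy.",
    "Inflation slowed in the third quarter, the economic report said.",
]
context = retrieve("What did the central bank do to interest rates?", chunks, top_k=1)
score = ground_truth_similarity(
    "The central bank raised interest rates.",
    "The central bank raised interest rates by a quarter point.",
)
```

In the application itself, the retrieved context would be passed to the LLM together with the selected response style, and the generated answer would then be scored against a reference.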

Best for: AI Scientist, AI Student, NLP Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by On GATE, Text and Social Media Analysis, and Detecting Misinformation Online.