FRAGATA: Semantic Retrieval of HPC Support Tickets via Hybrid RAG over 20 Years of Request Tracker History

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics, Cloud Computing & IT Infrastructure) · Depth: Advanced, long

Summary

Fragata is a semantic ticket search system developed and deployed at the Galician Supercomputing Center (CESGA) to enhance knowledge reuse from over twenty years of Request Tracker (RT) history. The system addresses RT 4.4.1's limitations, such as poor indexing, case-sensitivity, and lack of semantic understanding, by combining dense retrieval using 384-dimensional embeddings from the paraphrase-multilingual-MiniLM-L12-v2 model with classical BM25 lexical retrieval. It also incorporates query-aware reranking via the mmarco-mMiniLMv2-L12-H384-v1 cross-encoder and supports incremental updates without service interruption. Fragata's architecture offloads expensive ingestion stages to the FinisTerrae III supercomputer and integrates external documentation, providing robust search capabilities across Spanish, English, and Galician queries, including those with typos or morphological variants.

Key takeaway

For AI scientists building knowledge retrieval systems for technical support, Fragata demonstrates one way to overcome the limitations of legacy ticketing systems. Consider a hybrid RAG architecture that combines dense and lexical retrieval, query-aware reranking, and incremental ingestion with hot-swap updates. This design supports semantic search over large, heterogeneous, multilingual ticket histories, with the aim of improving knowledge reuse and shortening resolution times.
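The hot-swap idea mentioned above can be illustrated with a minimal sketch. The summary does not describe how Fragata implements it, so everything here is an assumption: the class name `HotSwapIndex`, the toy dictionary standing in for a real index, and the choice to swap a single reference under a lock. The point is only that a new index is built off to the side and then put in place in one step, so queries are never served from a half-updated structure.

```python
import threading


class HotSwapIndex:
    """Hypothetical wrapper that serves queries while the index is replaced."""

    def __init__(self, index):
        self._index = index            # toy index: term -> list of ticket ids
        self._lock = threading.Lock()  # serializes concurrent swaps

    def search(self, term):
        # Take one snapshot of the reference; a swap that happens mid-query
        # does not affect the index this call is already reading.
        index = self._index
        return index.get(term, [])

    def swap(self, new_index):
        # Replace the whole index in a single assignment and return the old one.
        with self._lock:
            old, self._index = self._index, new_index
        return old


# Usage: build the updated index separately, then swap it in.
service = HotSwapIndex({"slurm": [101]})
before = service.search("slurm")

updated = {"slurm": [101, 202], "mpi": [303]}  # built offline, e.g. by a batch job
service.swap(updated)
after = service.search("slurm")
```

In a real deployment the dictionary would be replaced by the dense and lexical indexes, and the expensive rebuild would run elsewhere (the summary mentions offloading ingestion to FinisTerrae III) before the swap.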

Key insights

Hybrid RAG systems can effectively transform decades of unstructured support ticket data into a semantically searchable knowledge base.

Method

Fragata extracts RT history via SQL, normalizes it, and splits it into chunks. Retrieval is hybrid: FAISS serves the dense index and BM25 the lexical one, and the two ranked lists are merged with Weighted Reciprocal Rank Fusion (WRRF). The fused candidates then go through cross-encoder reranking and domain-specific score adjustments.
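The fusion step can be sketched as follows. This is a generic illustration of weighted reciprocal rank fusion, not Fragata's code: the function name `weighted_rrf`, the smoothing constant `k=60` (a value commonly used for RRF), the weights, and the ticket ids are all assumptions chosen for the example. Each retriever contributes `weight / (k + rank)` for every document it returns, and the contributions are summed per document.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document ids.

    rankings: retriever name -> list of doc ids, best first
    weights:  retriever name -> non-negative weight (hypothetical values)
    k:        smoothing constant; 60 is a common default, assumed here
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights[name]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Illustrative input: ids returned by a dense retriever and by BM25.
rankings = {
    "dense": ["t3", "t1", "t2"],
    "bm25": ["t1", "t4", "t3"],
}
weights = {"dense": 0.6, "bm25": 0.4}

fused = weighted_rrf(rankings, weights)
fused_ids = [doc_id for doc_id, _ in fused]
```

Because the fusion uses only ranks, the dense similarity scores and BM25 scores never need to be put on a common scale. Documents found by both retrievers (here `t1` and `t3`) accumulate two contributions and tend to rise above those found by only one.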

Best for: AI Scientist, MLOps Engineer, AI Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.