FRAGATA: Semantic Retrieval of HPC Support Tickets via Hybrid RAG over 20 Years of Request Tracker History
Summary
Fragata is a semantic ticket search system developed and deployed at the Galician Supercomputing Center (CESGA) to enhance knowledge reuse from over twenty years of Request Tracker (RT) history. The system addresses RT 4.4.1's limitations, such as poor indexing, case-sensitivity, and lack of semantic understanding, by combining dense retrieval using 384-dimensional embeddings from the paraphrase-multilingual-MiniLM-L12-v2 model with classical BM25 lexical retrieval. It also incorporates query-aware reranking via the mmarco-mMiniLMv2-L12-H384-v1 cross-encoder and supports incremental updates without service interruption. Fragata's architecture offloads expensive ingestion stages to the FinisTerrae III supercomputer and integrates external documentation, providing robust search capabilities across Spanish, English, and Galician queries, including those with typos or morphological variants.
Key takeaway
For AI scientists building knowledge retrieval systems for technical support, Fragata shows one way to work around the limitations of legacy ticketing systems: a hybrid RAG architecture that integrates dense and lexical retrieval, query-aware reranking, and incremental ingestion with hot-swap capabilities. This design supports semantic search over large, heterogeneous, multilingual historical data, with the aim of improving knowledge reuse and reducing resolution times.
Key insights
Hybrid RAG systems can effectively transform decades of unstructured support ticket data into a semantically searchable knowledge base.
Principles
- Combine dense and lexical retrieval for robust search.
- Use reranking to refine initial search results.
- Ensure continuous availability with hot-swap mechanisms.
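The hot-swap principle can be illustrated with a minimal sketch. It assumes the index fits behind a single in-process reference and uses a plain dict as a stand-in for the real FAISS and BM25 structures; the `HotSwapIndex` class and its method names are hypothetical and not taken from Fragata.

```python
import threading


class HotSwapIndex:
    """Holds the live index and lets a rebuilt one replace it without downtime."""

    def __init__(self, index):
        self._index = index
        self._lock = threading.Lock()

    def search(self, query):
        # Take a local reference first: a concurrent swap cannot change
        # the index this particular query is reading from.
        index = self._index
        return index.get(query, [])

    def swap(self, new_index):
        # The new index is built elsewhere; only the reference flips here.
        with self._lock:
            old, self._index = self._index, new_index
        return old


# Toy data: a dict mapping a query term to ticket ids (illustrative only).
holder = HotSwapIndex({"gpu": ["ticket-1"]})
before = holder.search("gpu")
previous = holder.swap({"gpu": ["ticket-1", "ticket-2"]})
after = holder.search("gpu")
```

Queries that started before the swap finish against the old index, and later queries see the new one, so the service never has to stop while an updated index is loaded.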
Method
Fragata uses SQL extraction, normalization, and chunking of RT history, then applies hybrid retrieval (FAISS for dense, BM25 for lexical) with Weighted Reciprocal Rank Fusion (WRRF), followed by cross-encoder reranking and domain-specific score adjustments.
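The fusion step can be sketched as follows, assuming the common reciprocal rank fusion form in which each retriever contributes `weight / (k + rank)` per document. The weights, the constant `k=60`, and the function name are illustrative assumptions, not values reported for Fragata.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document ids (best first) into one ranking.

    rankings: dict mapping retriever name -> list of doc ids
    weights:  dict mapping retriever name -> weight (assumed values)
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        weight = weights[name]
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


dense_hits = ["t3", "t1", "t2"]    # e.g. from a FAISS nearest-neighbour search
lexical_hits = ["t1", "t4", "t3"]  # e.g. from a BM25 ranking
fused = weighted_rrf(
    {"dense": dense_hits, "lexical": lexical_hits},
    {"dense": 0.6, "lexical": 0.4},
)
fused_ids = [doc_id for doc_id, _ in fused]
```

Because the fusion uses ranks rather than raw scores, the dense and lexical retrievers do not need comparable score scales. In a pipeline like the one described, the fused list would then be passed to the cross-encoder for reranking.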
In practice
- Implement query variants for multilingual and typo-tolerant search.
- Offload compute-intensive indexing to HPC resources.
- Employ transactional watermarks for reliable incremental updates.
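The watermark idea from the list above can be sketched with SQLite as a stand-in database. The `transactions` and `watermark` tables, their columns, and the `ingest_new` helper are hypothetical and do not reflect the actual RT schema or Fragata's code.

```python
import sqlite3


def ingest_new(conn, index_fn):
    """Index rows newer than the stored watermark, then advance it."""
    (watermark,) = conn.execute("SELECT last_id FROM watermark").fetchone()
    rows = conn.execute(
        "SELECT id, body FROM transactions WHERE id > ? ORDER BY id",
        (watermark,),
    ).fetchall()
    if not rows:
        return 0
    with conn:  # commits on success, rolls back if index_fn raises
        conn.execute("UPDATE watermark SET last_id = ?", (rows[-1][0],))
        index_fn(rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE watermark (last_id INTEGER NOT NULL);
    INSERT INTO watermark VALUES (0);
    INSERT INTO transactions (body) VALUES
        ('job fails on login node'), ('quota exceeded');
    """
)
indexed = []
first_run = ingest_new(conn, indexed.extend)
second_run = ingest_new(conn, indexed.extend)
```

If indexing fails, the watermark update is rolled back and the same rows are picked up on the next run, so the indexing step itself should be idempotent.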
Topics
- Retrieval-Augmented Generation
- HPC Support Tickets
- Semantic Search
- Hybrid Information Retrieval
- Request Tracker
Best for: AI Scientist, MLOps Engineer, AI Engineer, Research Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.