CTIConnect: A Benchmark for Retrieval-Augmented LLMs over Heterogeneous Cyber Threat Intelligence

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy) · Depth: Expert, extended

Summary

CTIArena is introduced as the first benchmark for evaluating large language models (LLMs) on heterogeneous, multi-source cyber threat intelligence (CTI) in knowledge-augmented settings. It addresses limitations of prior efforts by covering nine tasks across structured, unstructured, and hybrid CTI categories, with 691 high-quality QA pairs. An evaluation of ten widely used LLMs, including proprietary models like GPT-5 and open-source models like LLaMA-3-405B, shows that most perform poorly in closed-book scenarios but gain noticeably when augmented with security-specific knowledge through techniques such as CSKG-guided RAG and query-expanded RAG. These findings indicate that scaling model size alone is insufficient for CTI; domain-tailored knowledge augmentation is crucial.

Key takeaway

AI scientists and machine learning engineers developing CTI solutions should prioritize domain-specific knowledge augmentation over relying solely on larger, general-purpose LLMs. Implement tailored retrieval-augmented generation (RAG) strategies, such as CSKG-guided RAG for unstructured data or query-expanded RAG for hybrid tasks, to improve performance and reduce hallucinations. This approach is critical for building robust CTI copilots that can reason across diverse and fragmented intelligence sources.
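
To make the idea of query expansion concrete, here is a minimal Python sketch. It is not the benchmark's implementation: the corpus entries, the identifier `CVE-0000-0001`, the `EXPANSIONS` table, and the function names are all invented for illustration, and a fixed lookup table stands in for whatever expansion step a real system would use (for example, an LLM or a knowledge graph). Scoring is plain token overlap rather than a learned retriever.

```python
import re
from collections import Counter

# Toy corpus with invented entries; these are not real CTI records.
CORPUS = {
    "doc-vuln": "CVE-0000-0001 allows remote code execution in ExampleServer "
                "through unsafe deserialization",
    "doc-weakness": "CWE-502 deserialization of untrusted data can lead to "
                    "remote code execution",
    "doc-report": "Threat report: campaign delivered phishing emails with "
                  "malicious attachments",
}

# Hypothetical expansion table: identifier -> related search terms.
EXPANSIONS = {
    "cve-0000-0001": ["deserialization", "exampleserver"],
}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens, keeping inner hyphens."""
    return re.findall(r"[a-z0-9][a-z0-9\-]*", text.lower())


def expand_query(query):
    """Return the query tokens plus any terms the expansion table links to them."""
    tokens = tokenize(query)
    extra = [term for tok in tokens for term in EXPANSIONS.get(tok, [])]
    return tokens + extra


def retrieve(query_tokens, corpus, k=2):
    """Rank documents by how often they contain the distinct query tokens."""
    unique_terms = set(query_tokens)
    scores = {}
    for doc_id, text in corpus.items():
        counts = Counter(tokenize(text))
        scores[doc_id] = sum(counts[t] for t in unique_terms)
    # Highest score first; ties are broken alphabetically by document id.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [d for d in ranked[:k] if scores[d] > 0]


question = "Which weakness underlies CVE-0000-0001?"
plain_hits = retrieve(tokenize(question), CORPUS)
expanded_hits = retrieve(expand_query(question), CORPUS)
```

In this toy setup the unexpanded query matches only the vulnerability entry, while the expanded query also reaches the weakness entry, which is the kind of cross-source link a hybrid CTI question needs.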

Key insights

LLMs require domain-specific knowledge augmentation and tailored retrieval strategies for effective cyber threat intelligence analysis.

Method

CTIArena uses a three-stage pipeline: seed correlation annotation, factually grounded QA synthesis via templates, and LLM-human collaborative curation for quality control.
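
The three stages can be sketched roughly as follows in Python. This is an illustrative outline under stated assumptions, not the authors' code: the `SeedCorrelation` and `QAPair` types, the relation names, the template wording, and the placeholder identifiers are all hypothetical, and the `reviewer` callback is only a stand-in for the LLM-human curation step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedCorrelation:
    """Stage 1 output: an annotated link between two CTI entities."""
    source_id: str
    relation: str
    target_id: str


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str


# Hypothetical question templates, keyed by relation type.
TEMPLATES = {
    "has_weakness": "Which weakness is associated with {source_id}?",
    "mitigated_by": "Which mitigation addresses {source_id}?",
}


def synthesize(seeds):
    """Stage 2: fill a template per seed so each answer is grounded in the seed."""
    pairs = []
    for seed in seeds:
        template = TEMPLATES.get(seed.relation)
        if template is None:
            continue  # no template for this relation, so no question is produced
        question = template.format(source_id=seed.source_id)
        pairs.append(QAPair(question=question, answer=seed.target_id))
    return pairs


def curate(pairs, reviewer):
    """Stage 3: drop duplicate questions and pairs the reviewer rejects."""
    seen = set()
    kept = []
    for pair in pairs:
        if pair.question in seen or not reviewer(pair):
            continue
        seen.add(pair.question)
        kept.append(pair)
    return kept


# Placeholder seeds with invented identifiers.
seeds = [
    SeedCorrelation("VULN-A", "has_weakness", "WEAKNESS-1"),
    SeedCorrelation("VULN-A", "has_weakness", "WEAKNESS-1"),  # duplicate
    SeedCorrelation("VULN-B", "mitigated_by", ""),            # empty answer
    SeedCorrelation("VULN-C", "exploited_by", "ACTOR-9"),     # no template
]

# Stand-in reviewer: accept only pairs with a non-empty answer.
curated = curate(synthesize(seeds), reviewer=lambda p: bool(p.answer.strip()))
```

Because every answer is copied from an annotated seed rather than generated freely, the synthesized pairs stay tied to source facts; the curation pass then only has to filter, not repair.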

Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, AI Security Engineer, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.