Towards LLM Accelerated Rapid Reviews for Software Tool Discovery -- Case for Log Anomaly Detection

2026-06-16 · Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Software Development & Engineering, Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, extended

Summary

A novel LLM-accelerated rapid review pipeline has been developed to identify and execute software tools from academic literature, with log anomaly detection as the case study. The process began with a Scopus search yielding 3233 papers, which two LLMs (gemini-2.5-flash and openai/gpt-4.1-mini) screened down to 569 relevant abstracts using a 0.90 inclusion probability threshold. From the 470 full papers that could be downloaded, 83 unique and suitable tool links were extracted. An LLM-based coding agent (Claude Code with the Opus 4.6 model) then successfully ran 24 of these tools in a Linux virtual machine. The pipeline required only 4 hours of human effort (primarily for PDF downloading) and 12 hours of LLM processing time. LLM screening accuracy was validated with a Cohen's Kappa of 0.839, precision of 0.933, recall of 0.909, and F1-score of 0.921.
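As a quick consistency check on the figures above, the reported F1-score can be recomputed from the reported precision and recall, and the share of extracted tools that ran can be derived from the counts. This is a minimal sketch using only numbers stated in the summary; the function and variable names are illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the summary.
reported_precision = 0.933
reported_recall = 0.909

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.921, matching the reported F1-score

# Share of extracted tool links that the coding agent ran successfully.
tools_extracted = 83
tools_executed = 24
execution_rate = tools_executed / tools_extracted
print(round(execution_rate, 3))  # 0.289
```

Roughly 29% of the extracted tools ran, which is the number to keep in mind when estimating how many candidate links are needed to end up with a given number of working tools.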

Key takeaway

For Machine Learning Engineers or AI Scientists seeking to quickly identify and validate functional software tools from academic research, adopting an LLM-accelerated rapid review pipeline can drastically cut human effort. You can screen thousands of abstracts and test tool executability in hours, not weeks. Prioritize tools with clear dependency specifications, as they are more likely to run successfully. Consider formalizing your workflow as LLM Agent Skills for future reusability.

Key insights

LLMs can accelerate software tool discovery and validation in rapid reviews: in this case study, the whole pipeline needed about 4 hours of human effort and 12 hours of LLM processing time.

Principles

LLM screening agrees closely with human consensus (Cohen's Kappa of 0.839).
Dedicated requirements files improve tool execution success.
Rapid reviews are ideal for technology transfer.
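The point about dedicated requirements files suggests a cheap pre-check before handing a repository to a coding agent. The sketch below looks for common Python dependency files in a cloned repository. The file list and the function name `find_dependency_specs` are assumptions made for illustration; the summary does not say which files the authors inspected.

```python
from pathlib import Path

# Assumed list of common Python dependency specification files.
DEPENDENCY_FILES = (
    "requirements.txt",
    "environment.yml",
    "pyproject.toml",
    "setup.py",
    "Pipfile",
)


def find_dependency_specs(repo_dir: str) -> list[str]:
    """Return the dependency files present at the top level of repo_dir."""
    root = Path(repo_dir)
    return [name for name in DEPENDENCY_FILES if (root / name).is_file()]


def has_dependency_spec(repo_dir: str) -> bool:
    """True if at least one known dependency file exists in repo_dir."""
    return bool(find_dependency_specs(repo_dir))
```

Repositories for which `has_dependency_spec` returns `True` could be tried first, in line with the advice to prioritize tools with clear dependency specifications.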

Method

The pipeline combines LLM-based abstract screening, regex-based link extraction, and an LLM coding agent that executes each repository in an isolated VM, including dependency resolution and script running.
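The screening step can be sketched as a threshold filter over per-abstract inclusion probabilities. The summary gives the 0.90 threshold and says two LLMs were used, but it does not say how their outputs were combined, so averaging the two probabilities here is purely an assumption. The scorer functions are hypothetical stand-ins for real model calls.

```python
from statistics import mean
from typing import Callable, Sequence

# Hypothetical scorer type: maps an abstract to P(include) in [0, 1].
Scorer = Callable[[str], float]


def screen_abstracts(
    abstracts: Sequence[str],
    scorers: Sequence[Scorer],
    threshold: float = 0.90,
) -> list[str]:
    """Keep abstracts whose mean inclusion probability meets the threshold.

    Averaging across scorers is an assumed combination rule, not the
    method confirmed by the source.
    """
    if not scorers:
        raise ValueError("at least one scorer is required")
    kept = []
    for abstract in abstracts:
        probability = mean(scorer(abstract) for scorer in scorers)
        if probability >= threshold:
            kept.append(abstract)
    return kept


# Keyword-based stubs standing in for the two LLM calls.
def stub_scorer_a(text: str) -> float:
    return 0.97 if "log" in text.lower() else 0.10


def stub_scorer_b(text: str) -> float:
    return 0.93 if "anomaly" in text.lower() else 0.20


abstracts = [
    "Log anomaly detection with transformers",
    "A survey of UI testing",
]
selected = screen_abstracts(abstracts, [stub_scorer_a, stub_scorer_b])
print(selected)
```

With the stubs above, only the first abstract passes. A stricter alternative would require every scorer to exceed the threshold individually, which trades recall for precision.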

In practice

Use compact LLMs (e.g., gpt-4.1-mini) for abstract screening.
Employ regex for efficient tool link extraction from PDFs.
Isolate LLM coding agents in VMs for security.
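The regex-based link extraction mentioned above might look like the following sketch. It assumes the PDF text has already been extracted to a string, and both the URL pattern and the list of repository hosts are illustrative choices; the summary does not state which pattern or hosts the authors used.

```python
import re

# Assumed set of hosts that typically serve code or artifacts.
REPO_HOSTS = ("github.com", "gitlab.com", "bitbucket.org", "zenodo.org")

# Match http(s) URLs up to whitespace or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def extract_tool_links(text: str) -> list[str]:
    """Return unique repository-like URLs from text, in first-seen order."""
    seen: dict[str, str] = {}
    for match in URL_PATTERN.finditer(text):
        # Drop sentence punctuation that often trails a URL in prose.
        url = match.group(0).rstrip(".,;:")
        host = url.split("/")[2].lower()
        if host.startswith("www."):
            host = host[4:]
        if host in REPO_HOSTS:
            # Case-insensitive de-duplication, keeping the first spelling.
            seen.setdefault(url.lower(), url)
    return list(seen.values())


sample = (
    "Code is available at https://github.com/example/tool. "
    "The paper is at https://example.org/paper "
    "(mirror: https://github.com/example/tool)."
)
links = extract_tool_links(sample)
print(links)
```

In the sample, the duplicate GitHub link is collapsed and the non-repository URL is dropped. URLs broken across lines by PDF extraction would not be recovered by this pattern and would need separate handling.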

Topics

LLM Agents
Rapid Reviews
Software Tool Discovery
Log Anomaly Detection
Empirical Software Engineering
Code Execution Automation

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Software Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.