Spoon-feeding a Monster

2026-04-15 · Source: AI Advances - Medium · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Cybersecurity & Data Privacy, Software Development & Engineering · Depth: Advanced, long

Summary

Molly is an automated offensive security testing agent designed to overcome the hallucination problem common in AI-driven security tools. Where other tools let a model merely describe hypothetical vulnerabilities, Molly separates reasoning from execution: a Rust-based safety layer handles critical functions such as scope enforcement, rate limiting, and audit logging, while Python handles orchestration and LLM calls. The system runs a five-phase orchestration loop (Discover, Plan, Research, Coverage, Closing) and uses a 7-stage analytical pipeline for deeper reasoning. In a "shooting range" of six intentionally vulnerable applications, Molly confirmed vulnerabilities including IDOR, Mass Assignment, Role Bypass, Race Condition, and Blind SSRF/XXE, with a reported false positive rate of zero. The project's longer-term goal is for Molly instances to contribute collectively to a central "armory" of attack tactics and to generate new tools automatically.

Key takeaway

If you are a security engineering leader evaluating AI for offensive security, recognize that letting an LLM conduct penetration testing directly is prone to hallucination. Your teams should prioritize architectures like Molly's, which enforce strict safety and execution controls in a low-level language such as Rust while using LLMs for hypothesis generation and evidence evaluation. This approach makes findings reproducible and reduces the legal and ethical risks associated with autonomous attack agents.

Key insights

Autonomous security testing requires separating LLM reasoning from deterministic execution, so that every finding rests on ground truth rather than on a hallucinated claim.

Principles

Security research demands ground truth.
Safety constraints require robust, low-level enforcement.
LLMs should generate hypotheses, not directly attack.

Method

Molly employs a Rust-based safety layer for execution control and a Python layer for LLM orchestration. It uses a five-phase loop (Discover, Plan, Research, Coverage, Closing) and a 7-stage analytical pipeline to confirm vulnerabilities with concrete evidence.
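
To make the split between reasoning and execution concrete, here is a minimal Python sketch of a loop that walks the five named phases in order. It is an illustration under assumptions, not Molly's code: the names Phase, Hypothesis, Finding, and run_loop are hypothetical, the summary does not say what each phase does internally, and the 7-stage pipeline is not modeled. The one idea it shows is that the reasoning side only proposes hypotheses, and a finding is recorded only when a separate deterministic executor returns concrete evidence.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Tuple


class Phase(Enum):
    """The five phases named in the summary, in the order given there."""
    DISCOVER = "discover"
    PLAN = "plan"
    RESEARCH = "research"
    COVERAGE = "coverage"
    CLOSING = "closing"


@dataclass
class Hypothesis:
    """A suspected vulnerability proposed by the reasoning side (hypothetical shape)."""
    kind: str
    target: str


@dataclass
class Finding:
    """A hypothesis that the executor backed with concrete evidence."""
    hypothesis: Hypothesis
    evidence: str


def run_loop(
    propose: Callable[[Phase], List[Hypothesis]],
    execute: Callable[[Hypothesis], Optional[str]],
) -> Tuple[List[Phase], List[Finding]]:
    """Walk the phases; keep only hypotheses the executor can confirm."""
    visited: List[Phase] = []
    findings: List[Finding] = []
    for phase in Phase:
        visited.append(phase)
        for hypothesis in propose(phase):      # reasoning: suggestions only
            evidence = execute(hypothesis)     # execution: deterministic check
            if evidence is not None:           # no evidence, no finding
                findings.append(Finding(hypothesis, evidence))
    return visited, findings


# Stand-ins for the LLM and the executor, for demonstration only.
def fake_propose(phase: Phase) -> List[Hypothesis]:
    if phase is Phase.RESEARCH:
        return [Hypothesis("IDOR", "/api/orders/2"), Hypothesis("SSRF", "/api/fetch")]
    return []


def fake_execute(hypothesis: Hypothesis) -> Optional[str]:
    if hypothesis.kind == "IDOR":
        return "response contained another user's order"
    return None


visited, findings = run_loop(fake_propose, fake_execute)
```

In this sketch the unconfirmed SSRF hypothesis is simply dropped, which is the property that keeps a model's unsupported claims out of the results.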

In practice

Implement a Rust layer for critical security controls.
Use a "shooting range" for continuous validation.
Separate LLM reasoning from attack execution.
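
The three controls assigned to the safety layer (scope enforcement, rate limiting, audit logging) can be sketched as a single gate that every outbound request must pass. The source places this logic in Rust; the version below is written in Python purely to show the decision logic, and everything in it is an assumption for illustration: the SafetyGate class, its parameters, the sliding-window rate limit, and the host name range.local.

```python
import time
from urllib.parse import urlsplit


class SafetyGate:
    """Hypothetical gate: allow a request only if it is in scope and under the rate limit."""

    def __init__(self, allowed_hosts, max_requests, per_seconds, clock=time.monotonic):
        self.allowed_hosts = frozenset(h.lower() for h in allowed_hosts)
        self.max_requests = max_requests
        self.per_seconds = per_seconds
        self.clock = clock          # injectable so the behavior can be tested
        self._sent = []             # timestamps of allowed requests
        self.audit_log = []         # (timestamp, url, decision) for every check

    def check(self, url):
        now = self.clock()
        host = (urlsplit(url).hostname or "").lower()
        # Forget allowed requests that have left the rate-limit window.
        self._sent = [t for t in self._sent if now - t < self.per_seconds]
        if host not in self.allowed_hosts:
            decision = "deny:out_of_scope"
        elif len(self._sent) >= self.max_requests:
            decision = "deny:rate_limited"
        else:
            self._sent.append(now)
            decision = "allow"
        # Denials are logged as well as approvals.
        self.audit_log.append((now, url, decision))
        return decision == "allow"


# Demonstration with a fake clock: at most 2 requests per second to range.local.
ticks = iter([0.0, 0.1, 0.2, 0.3, 1.5])
gate = SafetyGate({"range.local"}, max_requests=2, per_seconds=1.0,
                  clock=lambda: next(ticks))
results = [
    gate.check("http://range.local/a"),
    gate.check("http://range.local/b"),
    gate.check("http://range.local/c"),   # third request inside the window
    gate.check("http://example.com/"),    # host outside the allowed scope
    gate.check("http://range.local/d"),   # window has passed
]
decisions = [entry[2] for entry in gate.audit_log]
```

Keeping the gate free of any LLM involvement is the design point: the orchestration code can ask for anything, but only requests the gate allows are ever sent, and every decision leaves an audit record.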

Topics

Automated Offensive Security
AI Hallucination Mitigation
Rust Safety Layer
LLM-Tool Orchestration
Vulnerability Validation

Code references

PyO3/pyo3

Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Security Engineer, AI Engineer, AI Architect

Editorial summary, takeaway, and curation by AIssential. Original article published by AI Advances - Medium.