Multi-View Decompilation for LLM-Based Malware Classification
Summary
Multi-View Decompilation addresses a weakness of existing LLM-based malware classification pipelines: they rely on a single decompiler view. Because decompilers are lossy, heuristic tools, different decompilers produce different artifacts for the same binary. To test whether this matters, the researchers curated a benchmark of benign and malicious programs, compiled each sample, and decompiled it with both Ghidra and RetDec to obtain matched pseudo-C views. Across the LLMs tested, providing both decompiler views significantly improved malicious-class F1, primarily by increasing recall on malicious samples. Agreement analyses showed that Ghidra and RetDec make partially different errors, which supports the idea that their outputs offer complementary evidence. Multi-decompiler prompting is therefore a simple, training-free way to improve LLM-based malware triage.
Key takeaway
AI security engineers building LLM-based malware classification systems should consider multi-decompiler prompting. A single decompiler view is fragile; supplying both the Ghidra and RetDec outputs to the LLM can significantly improve recall on malicious samples and, with it, malicious-class F1. Because the approach requires no training, it is a practical way to improve detection accuracy in real-world triage.
Key insights
Using multiple decompiler views improves LLM-based malware classification by providing complementary evidence.
Principles
- Decompilers are lossy heuristic tools.
- Different decompilers expose complementary artifacts.
- Multi-decompiler prompting boosts LLM recall.
Method
The method feeds the LLM pseudo-C from multiple decompilers, such as Ghidra and RetDec, for the same binary, so the model classifies the sample from several views rather than one.
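As a rough illustration of the prompting step, the sketch below combines the pseudo-C from each decompiler into one classification prompt. The function name `build_multi_view_prompt`, the prompt wording, the section markers, and the `max_chars` truncation limit are all assumptions made for this example; the summary does not state the prompt the researchers used, and the call to an actual LLM is deliberately left out.

```python
def build_multi_view_prompt(views: dict[str, str], max_chars: int = 6000) -> str:
    """Combine pseudo-C from several decompilers into one prompt.

    `views` maps a decompiler name (e.g. "Ghidra") to its pseudo-C output
    for the same binary. The wording and layout are hypothetical.
    """
    if not views:
        raise ValueError("at least one decompiler view is required")

    sections = []
    for name, pseudo_c in views.items():
        # Truncate each view so that one very long output cannot crowd out
        # the other; the limit is an arbitrary illustrative choice.
        body = pseudo_c.strip()[:max_chars]
        sections.append(f"### View: {name}\n{body}")

    header = (
        "The sections below are decompiler outputs for the SAME binary.\n"
        "Decompilers are lossy, so the views may differ.\n"
        "Classify the program as 'malicious' or 'benign'."
    )
    return header + "\n\n" + "\n\n".join(sections) + "\n\nAnswer:"


if __name__ == "__main__":
    prompt = build_multi_view_prompt(
        {
            "Ghidra": "int main(void) { return 0; }",
            "RetDec": "int32_t main() { return 0; }",
        }
    )
    print(prompt)
```

Passing a dictionary with a single entry produces the single-view baseline with the same template, which keeps the comparison between one view and two views limited to the added evidence.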
In practice
- Integrate Ghidra and RetDec outputs.
- Apply multi-decompiler prompting.
- Enhance LLM-based malware triage.
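Before adopting the second view, a team may want to check whether it helps on its own samples. The sketch below computes malicious-class precision, recall, and F1 for a list of predictions, plus a simple agreement rate between two prediction lists. The function names and the label strings "malicious" and "benign" are assumptions for illustration, and the agreement rate is only a stand-in: the summary does not specify which agreement measure the researchers used.

```python
def malicious_class_metrics(y_true, y_pred, positive="malicious"):
    """Precision, recall, and F1 with `positive` treated as the target class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


def agreement_rate(pred_a, pred_b):
    """Fraction of samples on which two prediction lists give the same label."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    if not pred_a:
        return 0.0
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)
```

Running `malicious_class_metrics` once on single-view predictions and once on two-view predictions for the same labeled samples shows whether any F1 change comes from recall, as the summary reports. A low `agreement_rate` between Ghidra-only and RetDec-only predictions suggests the two views err on different samples.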
Topics
- Malware Classification
- Large Language Models
- Decompilation
- Ghidra
- RetDec
- Cybersecurity
Best for: AI Engineer, Machine Learning Engineer, Research Scientist, AI Security Engineer, NLP Engineer, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.