Agent Orchestration - LLM for Legal Metadata Extraction: A Comparative Analysis of Efficiency and Precision
Summary
JAMEX (Judicial Multi-Agent Metadata Extraction) is a multi-agent pipeline designed to extract structured metadata from Brazilian court decisions, specifically the "Espelho do Acórdão." Researchers evaluated JAMEX against a single-prompt baseline in an Information Retrieval-only setting. A pilot study involved 300 decisions, followed by a main experiment on a stratified dataset of n=1,225 instances, with completion counts ranging from 779 to 1,216. The accuracy of agentic configurations was strategy-dependent: GPT-5 improved on the baseline in some multi-agent strategies, but not all, while the smaller models Gemma3-12B and Gemma3-27B did not exhibit robust gains. Orchestration refinements, including memory, planning, and directed review, enhanced traceability, but overall performance remained sensitive to task decomposition and context splitting. JAMEX also increases token usage and operational complexity, so Portuguese legal metadata extraction requires balancing accuracy, completion reliability, and cost.
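The completion figures above are counts out of the 1,225 instances in the main experiment, so it can help to see them as proportions. The following is a minimal sketch of that arithmetic; the function name `completion_rate` is an illustrative assumption, and only the numbers 779, 1,216, and 1,225 come from the summary.

```python
TOTAL_INSTANCES = 1225  # size of the stratified dataset reported in the summary


def completion_rate(completed: int, total: int = TOTAL_INSTANCES) -> float:
    """Return the fraction of instances that were completed."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= completed <= total:
        raise ValueError("completed must be between 0 and total")
    return completed / total


# Lowest and highest completion counts reported in the summary.
lowest = completion_rate(779)
highest = completion_rate(1216)
print(f"{lowest:.1%} to {highest:.1%}")
```

Read this way, the spread between the least and most reliable runs is large, which is why completion reliability sits alongside accuracy and cost in the trade-off.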
Key takeaway
Research scientists developing LLM-based legal information extraction systems should carefully weigh the complexity of multi-agent orchestration against the actual performance gains. While larger models like GPT-5 may offer accuracy improvements in specific agentic strategies, smaller models show no consistent benefits. Prioritize robust task decomposition and context splitting, since agent performance is sensitive to both, and conduct a thorough cost-benefit analysis before deploying agentic solutions for Portuguese legal metadata extraction.
Key insights
Multi-agent LLM pipelines for legal metadata extraction offer conditional accuracy gains but increase complexity and cost.
Principles
- Agent accuracy is strategy-dependent.
- Smaller LLMs show no robust gains in agentic setups.
- Performance is sensitive to task decomposition.
Method
JAMEX is a multi-agent pipeline for metadata extraction; the study compares it with a single-prompt baseline. The pipeline uses orchestration refinements such as memory, planning, and directed review.
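To make the contrast concrete, here is a minimal sketch of a single-prompt baseline next to a multi-agent flow with shared memory, a planning step, and a directed review. It is not the JAMEX implementation: the field names, prompts, sample decision text, and the `stub_llm` stand-in are all hypothetical, chosen only so the example runs offline.

```python
from typing import Callable, Dict, List

LLM = Callable[[str], str]  # hypothetical interface: prompt in, text out

FIELDS = ["rapporteur", "decision_date"]  # placeholder fields, not from the paper


class CountingLLM:
    """Wraps a model callable and counts how many calls were made."""

    def __init__(self, inner: LLM) -> None:
        self.inner = inner
        self.calls = 0

    def __call__(self, prompt: str) -> str:
        self.calls += 1
        return self.inner(prompt)


def baseline_extract(llm: LLM, decision: str, fields: List[str]) -> Dict[str, str]:
    """Single-prompt baseline: one call asks for every field at once."""
    prompt = f"Extract {', '.join(fields)} as 'field: value' lines.\n\n{decision}"
    result: Dict[str, str] = {}
    for line in llm(prompt).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in fields:
            result[key.strip()] = value.strip()
    return result


def multi_agent_extract(llm: LLM, decision: str, fields: List[str]) -> Dict[str, str]:
    """Multi-agent sketch: plan subtasks, extract per field, then review misses."""
    memory: Dict[str, str] = {}  # shared memory passed to later agents
    plan = list(fields)  # planning step (trivial here): one subtask per field
    for field in plan:
        known = "; ".join(f"{k}={v}" for k, v in memory.items()) or "nothing yet"
        prompt = f"Extract only {field}. Already known: {known}.\n\n{decision}"
        memory[field] = llm(prompt).strip()
    # Directed review: re-check only the fields that came back empty.
    for field in [f for f in plan if not memory[f]]:
        memory[field] = llm(f"Review: find {field} again.\n\n{decision}").strip()
    return memory


def stub_llm(prompt: str) -> str:
    """Deterministic stand-in for a real model so the sketch runs offline."""
    first_line = prompt.splitlines()[0]
    if first_line.startswith("Extract only rapporteur"):
        return "Maria Silva"
    if first_line.startswith("Extract only decision_date"):
        return ""  # simulate a miss that the review step must recover
    if first_line.startswith("Review"):
        return "2024-05-10"
    return "rapporteur: Maria Silva\ndecision_date: 2024-05-10"


SAMPLE = "Relatora: Maria Silva. Julgado em 2024-05-10."  # invented sample text

baseline_llm = CountingLLM(stub_llm)
agent_llm = CountingLLM(stub_llm)
baseline_result = baseline_extract(baseline_llm, SAMPLE, FIELDS)
agent_result = multi_agent_extract(agent_llm, SAMPLE, FIELDS)
print(baseline_llm.calls, agent_llm.calls)
```

Even in this toy setting the multi-agent path makes three model calls where the baseline makes one, which illustrates why the orchestrated pipeline uses more tokens and is harder to operate.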
In practice
- Consider GPT-5 for agentic legal extraction, but verify gains per strategy.
- Evaluate cost vs. accuracy for deployment.
- Optimize task decomposition for agent performance.
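One way to frame the cost versus accuracy check is to combine accuracy, completion rate, and token spend into a cost per correct extraction. The sketch below is an assumption about how such a comparison could be set up, not a metric from the paper, and every number in it is a made-up placeholder rather than a reported result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Summary of one configuration; all values are supplied by the user."""

    name: str
    accuracy: float           # fraction correct among completed instances
    completion_rate: float    # completed instances / attempted instances
    tokens_per_instance: int  # average tokens spent per attempted instance
    usd_per_1k_tokens: float  # hypothetical price, not a real quote

    @property
    def correct_per_attempt(self) -> float:
        """Expected correct extractions per attempted instance."""
        return self.accuracy * self.completion_rate

    @property
    def usd_per_correct(self) -> float:
        """Average spend needed to obtain one correct extraction."""
        if self.correct_per_attempt <= 0:
            raise ValueError("no correct extractions to amortize cost over")
        cost = self.tokens_per_instance / 1000 * self.usd_per_1k_tokens
        return cost / self.correct_per_attempt


# Placeholder numbers for illustration only; they are NOT results from the study.
baseline = RunStats("single-prompt", 0.80, 1.00, 2000, 0.01)
agentic = RunStats("multi-agent", 0.90, 0.90, 6000, 0.01)

for run in (baseline, agentic):
    print(f"{run.name}: {run.correct_per_attempt:.2f} correct per attempt, "
          f"${run.usd_per_correct:.4f} per correct extraction")
```

With these placeholder inputs the agentic run is more accurate on what it completes, yet roughly three times as expensive per correct extraction, so the decision depends on how much each additional correct record is worth.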
Topics
- JAMEX
- Legal Metadata Extraction
- Multi-Agent Systems
- LLM Orchestration
- Brazilian Court Decisions
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.