Retrieval-Based Multi-Label Legal Annotation: Extensible, Data-Efficient and Hallucination-Free

· Source: cs.CL updates on arXiv.org · Field: Legal & Regulatory — Legal Technology (LegalTech), Compliance & Risk Management · Depth: Expert, long

Summary

A new study introduces a retrieval-based framework for multi-label legal annotation. It targets three problems: large, evolving taxonomies; limited supervision; and the computational cost and hallucination risk of generative large language models (LLMs). The method embeds legal documents and label descriptions with a frozen retrieval model, then predicts labels via k-nearest neighbors (k-NN) in the embedding space. Because the model is frozen, the label set can be updated by re-embedding and re-indexing, with no gradient-based retraining. Across three legal datasets (ECtHR-A, ECtHR-B, Eurlex with 100+ labels), the retrieval method achieved competitive accuracy and significantly better data efficiency. It also eliminated label hallucination, which occurred in 0.12–0.9% of test samples with GPT-5.2. For instance, on Eurlex, Qwen-8B retrieval reached a Macro-F1 of 49.12, compared with 40.41 for zero-shot GPT-5.2, while using an estimated 20–30 times less compute than fine-tuning.

Key takeaway

Legal professionals and NLP engineers building multi-label annotation systems should consider retrieval-based models, especially for high-cardinality or rapidly evolving legal taxonomies. The approach is more data-efficient and significantly cheaper to run. Critically, it cannot hallucinate labels, because every prediction is drawn from the indexed label set. That matters for legal validity and reduces the human review burden. Teams can reach robust performance with fewer training samples, which can also help with data sovereignty requirements.

Key insights

Retrieval-based legal annotation offers a data-efficient, hallucination-free alternative to generative LLMs for high-cardinality label spaces.
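The summary reports hallucination as a share of test samples. A minimal sketch of how such a rate could be measured follows. It assumes a sample counts as hallucinated when at least one predicted label falls outside the taxonomy; the source does not spell out its exact definition, and the function name and label names here are hypothetical.

```python
from typing import Iterable, Sequence


def hallucination_rate(predictions: Sequence[Iterable[str]],
                       taxonomy: Iterable[str]) -> float:
    """Fraction of samples with at least one predicted label outside the taxonomy.

    Assumption: "hallucination" means emitting a label string that is not
    part of the allowed label set.
    """
    allowed = set(taxonomy)
    if not predictions:
        return 0.0
    # A sample is flagged if any of its predicted labels is not allowed.
    flagged = sum(1 for labels in predictions if set(labels) - allowed)
    return flagged / len(predictions)


# Hypothetical taxonomy and model outputs, for illustration only.
taxonomy = {"article_3", "article_6"}
generated = [["article_6"], ["article_6", "article_99"], [], ["article_3"]]
print(hallucination_rate(generated, taxonomy))
```

A retriever that only returns labels stored in its index scores zero on this measure by construction.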

Method

Embed documents and label descriptions with a frozen retrieval model, then use k-NN in the embedding space for multi-label prediction. Update label sets by re-embedding and re-indexing.
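The method above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: `embed` is a toy hashed bag-of-words stand-in for the frozen retrieval model, label descriptions are simply indexed alongside labeled documents, and the similarity-weighted vote with a threshold is one plausible way to turn neighbors into a multi-label prediction. All names, `k`, and `threshold` are hypothetical.

```python
import hashlib

import numpy as np

DIM = 256  # toy embedding size (assumption)


def embed(text: str) -> np.ndarray:
    """Toy stand-in for a frozen retrieval model: hashed bag-of-words, L2-normalised."""
    vec = np.zeros(DIM)
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


class KnnLabelIndex:
    """k-NN multi-label predictor over embedded, labeled texts."""

    def __init__(self, k: int = 3, threshold: float = 0.5):
        self.k = k
        self.threshold = threshold
        self.vectors = np.zeros((0, DIM))
        self.label_sets = []

    def add(self, text: str, labels) -> None:
        """Index a labeled document or a label description. No training step."""
        self.vectors = np.vstack([self.vectors, embed(text)])
        self.label_sets.append(frozenset(labels))

    def known_labels(self) -> set:
        return set().union(*self.label_sets) if self.label_sets else set()

    def predict(self, text: str) -> set:
        if not self.label_sets:
            return set()
        sims = self.vectors @ embed(text)        # cosine similarity (unit vectors)
        top = np.argsort(-sims)[: self.k]        # indices of the k nearest items
        weights = np.clip(sims[top], 0.0, None)
        total = weights.sum()
        if total == 0:
            return set()
        scores = {}
        for i, w in zip(top, weights):           # similarity-weighted label vote
            for label in self.label_sets[i]:
                scores[label] = scores.get(label, 0.0) + w / total
        return {label for label, s in scores.items() if s >= self.threshold}


index = KnnLabelIndex(k=2)
index.add("right to a fair trial and impartial tribunal", ["fair_trial"])
index.add("prohibition of torture and inhuman treatment", ["torture"])
print(index.predict("impartial tribunal fair trial"))

# Extending the label set: embed and index the new description, nothing else.
index.add("freedom of expression and press", ["expression"])
print(index.predict("freedom of expression press"))
```

Every predicted label comes from `label_sets`, so the output cannot contain a label that was never indexed.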

Best for: NLP Engineer, Research Scientist, AI Engineer, Machine Learning Engineer, AI Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.