Prompt Engineering for Named Entity Extraction from Portuguese Legal Documents

2026-04-12 · Source: Paper Index on ACL Anthology · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

A study investigated prompt engineering for Named Entity Recognition (NER) in Portuguese legal documents, motivated by the scarcity and cost of annotated legal data. It asked whether Large Language Models (LLMs) with In-Context Learning (ICL) can support legal NER in low-supervision, low-resource settings. Using the LeNER-Br corpus, the evaluation compared category-specific prompts, different chunk sizes, and different prompt engineering strategies. Under entity-level evaluation with Exact Match Micro F1, the choice of prompt engineering strategy influenced performance more than the other factors tested. The highest scores came from the larger models: the 4-bit quantized Qwen-2.5:32B reached 57.9% and GPT-5.2 reached 71.9%, which suggests the approach has potential as an alternative to conventional supervised NER.
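
To make the metric concrete, here is a minimal sketch of entity-level Exact Match Micro F1. It assumes each entity is a (start, end, category) triple and that a prediction counts only when all three fields match a gold entity; the summary does not spell out the study's exact matching rule, so the function name and the span convention are illustrative assumptions.

```python
from typing import List, Set, Tuple

# Assumed representation: (start_token, end_token, category).
Entity = Tuple[int, int, str]


def exact_match_micro_f1(
    gold_docs: List[Set[Entity]], pred_docs: List[Set[Entity]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) pooled over all documents.

    Micro averaging sums true positives, false positives, and false
    negatives across documents and categories before computing the scores.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        tp += len(gold & pred)  # same boundaries and same category
        fp += len(pred - gold)  # predicted but not in the gold set
        fn += len(gold - pred)  # gold entity that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data, not taken from LeNER-Br.
gold = [{(0, 2, "PESSOA"), (5, 7, "LEGISLACAO")}, {(1, 1, "LOCAL")}]
pred = [{(0, 2, "PESSOA"), (5, 6, "LEGISLACAO")}, {(1, 1, "LOCAL"), (3, 4, "TEMPO")}]
precision, recall, f1 = exact_match_micro_f1(gold, pred)
```

In the toy data, the prediction (5, 6, "LEGISLACAO") overlaps a gold entity but has a different end boundary, so under exact matching it counts as both a false positive and a false negative.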

Key takeaway

Research scientists building NER solutions for lower-resource settings such as Portuguese legal text should investigate prompt engineering with larger LLMs, including 4-bit quantized ones, as an alternative to traditional supervised pipelines. Refining the prompt strategy can yield meaningful performance gains and may reduce reliance on large, costly annotated datasets, which in turn can speed up development.

Key insights

Prompt engineering with LLMs offers a viable alternative for legal NER in low-resource settings.

Principles

Prompt engineering impacts NER performance more than chunking.
Larger LLMs generally yield better NER results.
ICL can mitigate data scarcity in legal text analysis.

Method

The study evaluated category-specific prompts, chunking sizes, and prompt engineering strategies using LLMs and In-Context Learning on the LeNER-Br corpus for legal NER.
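
The combination of chunking and category-specific prompts can be sketched as follows. This is not the study's pipeline: the prompt wording, the whitespace-token chunking, and the helper names are hypothetical. The category names are meant to mirror the LeNER-Br tag set, and the short descriptions are illustrative assumptions.

```python
from typing import Dict, List

# Illustrative category descriptions (assumed, not quoted from the paper).
CATEGORY_DESCRIPTIONS: Dict[str, str] = {
    "PESSOA": "names of people",
    "ORGANIZACAO": "names of organizations, courts, and agencies",
    "LOCAL": "names of places",
    "TEMPO": "dates and time expressions",
    "LEGISLACAO": "references to laws and statutes",
    "JURISPRUDENCIA": "references to court decisions",
}


def chunk_tokens(tokens: List[str], chunk_size: int) -> List[List[str]]:
    """Split a token list into consecutive, non-overlapping chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def build_category_prompt(category: str, description: str, chunk: List[str]) -> str:
    """Build one prompt asking for a single entity category in one chunk."""
    text = " ".join(chunk)
    return (
        f"You are annotating Brazilian legal text.\n"
        f"List every entity of type {category} ({description}) "
        f"exactly as it appears in the text, one per line.\n"
        f"Text: {text}\n"
        f"Entities:"
    )


def build_prompts(tokens: List[str], chunk_size: int) -> List[str]:
    """One prompt per (chunk, category) pair."""
    return [
        build_category_prompt(category, description, chunk)
        for chunk in chunk_tokens(tokens, chunk_size)
        for category, description in CATEGORY_DESCRIPTIONS.items()
    ]


tokens = "O Tribunal aplicou a Lei".split()
prompts = build_prompts(tokens, chunk_size=2)
```

Issuing one prompt per category multiplies the number of model calls by the number of categories, so chunk size and category granularity trade off against inference cost.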

In practice

Consider 4-bit quantized LLMs for legal NER.
Prioritize prompt engineering over chunking size.
Explore ICL for low-supervision NER tasks.
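
As a starting point for the ICL suggestion above, the sketch below assembles a few-shot prompt from a handful of labeled demonstrations and parses the model's reply. The JSON output format, the prompt wording, and the function names are assumptions chosen for illustration; the summary does not describe the format used in the study, and the call to an actual model is left out.

```python
import json
from typing import List, Tuple

# A demonstration pairs a text snippet with the entities it contains.
Demo = Tuple[str, List[str]]


def build_few_shot_prompt(category: str, demos: List[Demo], query: str) -> str:
    """Place labeled demonstrations before the query text."""
    lines = [
        f"Extract every {category} entity from the passage. "
        f"Answer with a JSON list of strings."
    ]
    for text, entities in demos:
        lines.append(f"Text: {text}")
        # ensure_ascii=False keeps Portuguese accents readable in the prompt.
        lines.append(f"Entities: {json.dumps(entities, ensure_ascii=False)}")
    lines.append(f"Text: {query}")
    lines.append("Entities:")
    return "\n".join(lines)


def parse_entities(response: str) -> List[str]:
    """Parse the reply; return an empty list if it is not a JSON list."""
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    return [item for item in parsed if isinstance(item, str)]


# Invented demonstrations, not drawn from LeNER-Br.
demos = [
    ("Nos termos da Lei 8.112, o servidor tem direito a férias.", ["Lei 8.112"]),
    ("O recurso foi julgado ontem.", []),
]
prompt = build_few_shot_prompt("LEGISLACAO", demos, "Aplica-se o Código Civil ao caso.")
```

Including a demonstration with an empty entity list is one way to show the model that returning nothing is a valid answer.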

Topics

Prompt Engineering
Named Entity Recognition
Large Language Models
Portuguese Legal Documents
In-Context Learning

Best for: Research Scientist, AI Scientist, NLP Engineer, Prompt Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.