Retrieval-Augmented Generation and Knowledge Graphs in Portuguese-Language Legal Documents
Summary
The paper presents a Graph Retrieval-Augmented Generation (GraphRAG) pipeline for Question Answering (QA) over Portuguese legal documents. The system was applied to a corpus of 203 normative resolutions from Companhia Energética de Minas Gerais (CEMIG) and targets the structural complexity of legal texts, including hierarchical dependencies and temporal modifications. Documents are modeled as knowledge graphs in which nodes represent structural units such as Articles, Paragraphs, and Items, and edges denote normative relationships, which preserves context and traceability. Retrieval reconstructs evidence paths from root to leaf and then applies semantic re-ranking before answer generation. Evaluated with the RAGAS framework, the system achieved a mean answer accuracy of 0.81 and a median of 1.00. Assuming scores are capped at 1.00, that median means at least half of the answers received the top score, and the lower mean reflects a smaller set of weak answers. Performance is strongest on short, focused queries; intermediate-length questions are harder because of semantic dispersion.
Key takeaway
For research scientists building legal AI systems, this GraphRAG pipeline offers a concrete approach to complex Portuguese legal documents. Consider representing documents as knowledge graphs that explicitly model structural dependencies and temporal modifications, which is intended to improve interpretability and precision in QA. Short queries perform well, but the semantic re-ranking step may need refinement for intermediate-length questions to mitigate semantic dispersion.
Key insights
GraphRAG enhances legal QA by modeling document structure as knowledge graphs for context-aware retrieval.
Principles
- Explicitly model legal text structure.
- Preserve context via knowledge graph edges.
- Reconstruct evidence paths for traceability.
Method
Documents are modeled as knowledge graphs with structural units as nodes and normative relationships as edges. Retrieval reconstructs evidence paths, followed by semantic re-ranking before generation.
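The method above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` and `LegalGraph` classes, the sample texts, and the `overlap_score` function are all hypothetical. In particular, the sketch models only parent-child (hierarchical) edges, and it uses plain token overlap as a stand-in for the semantic re-ranking step, which in a real system would likely rely on embeddings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: str
    kind: str                      # e.g. "Resolution", "Article", "Paragraph", "Item"
    text: str
    parent: Optional[str] = None   # hierarchical edge to the enclosing unit
    children: List[str] = field(default_factory=list)


class LegalGraph:
    """Hypothetical container for structural units of a legal document."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}

    def add(self, node_id: str, kind: str, text: str,
            parent: Optional[str] = None) -> None:
        self.nodes[node_id] = Node(node_id, kind, text, parent)
        if parent is not None:
            self.nodes[parent].children.append(node_id)

    def leaves(self) -> List[str]:
        return [n.node_id for n in self.nodes.values() if not n.children]

    def path_to_root(self, node_id: str) -> List[str]:
        """Return the evidence path ordered from root to the given node."""
        path: List[str] = []
        current: Optional[str] = node_id
        while current is not None:
            path.append(current)
            current = self.nodes[current].parent
        return list(reversed(path))


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query tokens found in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def retrieve(graph: LegalGraph, query: str, top_k: int = 2) -> List[List[str]]:
    """Rebuild a root-to-leaf path per leaf, then re-rank paths by score."""
    candidates = []
    for leaf in graph.leaves():
        path = graph.path_to_root(leaf)
        context = " ".join(graph.nodes[n].text for n in path)
        candidates.append((overlap_score(query, context), path))
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [path for _, path in candidates[:top_k]]


# Invented example content, not taken from the CEMIG corpus.
graph = LegalGraph()
graph.add("res1", "Resolution", "Resolution on tariff rules")
graph.add("art1", "Article", "Article 1 defines billing periods", parent="res1")
graph.add("par1", "Paragraph", "The billing period is monthly", parent="art1")
graph.add("art2", "Article", "Article 2 covers metering equipment", parent="res1")
graph.add("item1", "Item", "Meters are inspected yearly", parent="art2")

best = retrieve(graph, "billing period monthly", top_k=1)
print(best)
```

Scoring the whole path rather than the leaf alone is the design choice worth noting: the leaf carries the answer, while its ancestors supply the context that makes the passage traceable to a specific Article and Resolution.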
In practice
- Apply GraphRAG to complex, structured documents.
- Use RAGAS for QA system evaluation.
- Expect the highest accuracy on short, focused queries, and evaluate intermediate-length questions separately.
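One way to check how accuracy varies with query length is to group per-question scores into length buckets and compare the mean and median of each. The sketch below assumes you already have a score per question from your evaluation tool; it does not call RAGAS. The word-count thresholds (`short_max`, `medium_max`) and the sample scores are hypothetical placeholders, since the summary does not state where the paper draws the line between short and intermediate questions.

```python
from collections import defaultdict
from statistics import mean, median
from typing import Dict, List, Tuple


def length_bucket(question: str, short_max: int = 10, medium_max: int = 25) -> str:
    """Assign a question to a bucket by word count (thresholds are assumptions)."""
    n_words = len(question.split())
    if n_words <= short_max:
        return "short"
    if n_words <= medium_max:
        return "intermediate"
    return "long"


def summarize_by_length(
    records: List[Tuple[str, float]]
) -> Dict[str, Dict[str, float]]:
    """Group (question, score) pairs by bucket and report n, mean, and median."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for question, score in records:
        grouped[length_bucket(question)].append(score)
    return {
        bucket: {"n": len(scores), "mean": mean(scores), "median": median(scores)}
        for bucket, scores in grouped.items()
    }


# Placeholder scores for illustration only; these are not results from the paper.
records = [
    ("What is the billing period?", 1.0),
    ("Who inspects the meters?", 1.0),
    ("Which article defines the billing period and how does it interact "
     "with the metering rules in the same resolution?", 0.5),
]

summary = summarize_by_length(records)
for bucket, stats in summary.items():
    print(bucket, stats)
```

Reporting the median alongside the mean matters here because a score distribution with many perfect answers and a few failures looks very different under each statistic.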
Topics
- Graph Retrieval-Augmented Generation
- Portuguese Legal Documents
- Knowledge Graphs
- Question Answering
- RAGAS Framework
Best for: Research Scientist, AI Scientist, NLP Engineer, Legal Professional
Editorial summary, takeaway, and curation by AIssential. Original article published by Paper Index on ACL Anthology.