Train, Retrieve, or Both? A Four-Arm Head-to-Head for Correct Statutory Citation on the Ontario Residential Tenancies Act
Summary
The study compares methods for generating correct statutory citations from the Ontario Residential Tenancies Act (RTA), aimed at self-represented individuals and help-desk staff. The researchers ran a four-arm head-to-head comparison on Qwen2.5-7B-Instruct: base zero-shot, LoRA SFT-only, RAG-only, and an SFT+RAG hybrid. The SFT+RAG hybrid scored highest, at 0.481 exact-match, and produced no hallucinated citations. Using a "cheap bge-small" embedder, the hybrid matched or surpassed pipelines built with larger, specialized retrieval models, and adding training data did not improve performance. Retrieval proved essential: SFT-only models mis-recalled sections, and the base model failed to cite the RTA.
Key takeaway
Legal tech developers building statutory citation tools should combine Supervised Fine-Tuning (SFT) with Retrieval-Augmented Generation (RAG). Even with a small model like Qwen2.5-7B-Instruct and a "cheap bge-small" embedder, this hybrid reaches 0.481 exact-match and eliminates hallucinated citations. Prioritize robust retrieval over larger, specialized retrieval models or more training data to deliver reliable legal information.
Key insights
A Qwen2.5-7B-Instruct SFT+RAG hybrid achieves 0.481 exact-match statutory citation with zero hallucination on the RTA.
Principles
- Retrieval is essential for accurate statutory citation and eliminates hallucinated citations.
- SFT makes provision selection more robust within RAG, improving overall accuracy.
- Specialized retrieval models or more data are not always necessary for strong statutory citation.
Method
A four-arm head-to-head comparison of Qwen2.5-7B-Instruct (base zero-shot, LoRA SFT-only, RAG-only, SFT+RAG hybrid) on statutory citation exact-match.
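As a rough illustration of how each arm could be scored, the sketch below computes an exact-match rate and a hallucination rate over a list of predictions. The summary does not give the study's scoring rules, so everything here is an assumption: the `normalize_citation` and `score_arm` helpers are hypothetical, "exact match" is taken to mean equality of normalized section identifiers, a "hallucination" is taken to mean a cited section that is absent from a list of valid sections, and the section numbers are toy placeholders rather than verified statutory references.

```python
import re

# Assumption: citations are written with an "s." or "section" prefix, e.g. "s. 83(1)".
CITATION_RE = re.compile(r"(?:\bs\.|\bsection)\s*(\d+(?:\.\d+)?(?:\([0-9a-z.]+\))*)")


def normalize_citation(text: str) -> str:
    """Reduce free text such as 'RTA, s. 83(1)' to '83(1)'; return '' if no citation is found."""
    match = CITATION_RE.search(text.lower())
    return match.group(1) if match else ""


def score_arm(predictions, gold, valid_sections):
    """Return exact-match and hallucination rates for one arm (hypothetical definitions)."""
    exact = 0
    hallucinated = 0
    for pred, ref in zip(predictions, gold):
        cited = normalize_citation(pred)
        if cited and cited == normalize_citation(ref):
            exact += 1
        # Assumption: a hallucinated citation names a section that does not exist.
        if cited and cited not in valid_sections:
            hallucinated += 1
    n = len(gold)
    return {"exact_match": exact / n, "hallucination_rate": hallucinated / n}


# Toy data for illustration only; not taken from the study.
VALID = {"20(1)", "22", "48(1)", "83(1)"}
GOLD = ["s. 83(1)", "s. 48(1)", "s. 20(1)"]
ARMS = {
    "sft_only": ["s. 83(1)", "s. 999(4)", "no citation"],
    "sft_rag": ["RTA s. 83(1)", "Section 48(1)", "s. 22"],
}

if __name__ == "__main__":
    for name, preds in ARMS.items():
        print(name, score_arm(preds, GOLD, VALID))
```

Keeping the two rates separate matters in this setting: a wrong but real section lowers exact-match only, while a non-existent section also counts as a hallucination.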
In practice
- Combine SFT with RAG to improve statutory citation accuracy.
- Prioritize retrieval to eliminate LLM hallucination in legal contexts.
- Smaller embedders like bge-small can be effective in hybrid systems.
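The retrieve-then-cite pattern behind these points can be sketched in a few lines. This is not the study's pipeline: the bag-of-words `embed` function is a toy stand-in for a real embedder such as bge-small, the `retrieve` and `build_prompt` helpers are hypothetical, and the provision texts are paraphrased placeholders rather than statutory language. The idea it illustrates is that the model is shown candidate provisions with their section numbers, so it selects a citation from the retrieved text instead of recalling one from memory.

```python
import math
import re
from collections import Counter

# Placeholder provisions keyed by section number (illustrative, not statutory text).
PROVISIONS = {
    "20": "landlord must maintain and repair the rental unit",
    "48": "landlord may give notice to end tenancy for own use",
    "83": "board may refuse or postpone an eviction order",
}


def embed(text):
    """Toy bag-of-words vector; a real system would call an embedding model here."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, provisions, k=2):
    """Return the k section ids whose text is most similar to the question."""
    q = embed(question)
    ranked = sorted(provisions, key=lambda sec: cosine(q, embed(provisions[sec])), reverse=True)
    return ranked[:k]


def build_prompt(question, provisions, section_ids):
    """Assemble a prompt that exposes only the retrieved provisions to the model."""
    context = "\n".join(f"[s. {sec}] {provisions[sec]}" for sec in section_ids)
    return (
        "Answer using only the provisions below and cite exactly one section.\n\n"
        f"{context}\n\nQuestion: {question}\nCitation:"
    )


if __name__ == "__main__":
    question = "Can the board postpone my eviction?"
    top = retrieve(question, PROVISIONS, k=2)
    print(build_prompt(question, PROVISIONS, top))
```

In a hybrid setup, the fine-tuned model would receive this prompt; a simple post-check that the cited section is among the retrieved ids is one way to flag answers that stray from the supplied context.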
Topics
- Statutory Citation
- Retrieval-Augmented Generation
- Supervised Fine-Tuning
- Legal AI
- Qwen2.5-7B-Instruct
- Ontario Residential Tenancies Act
Best for: Research Scientist, AI Architect, AI Engineer, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning.