Exploring Lightweight Large Language Models for Court View Generation

Source: cs.CL updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Advanced, medium

Summary

A study systematically explores lightweight Large Language Models (LLMs), here meaning models with fewer than 2B parameters, for Criminal Court View Generation (CVG) and examines their impact on charge prediction. It addresses three questions: how LLM architecture and size affect CVG quality and charge prediction, how lightweight LLMs compare with Deep Neural Networks (DNNs) on both tasks, and how effective predicting charges via court view generation is compared with direct prediction. The authors developed CVGEvalKit, an evaluation framework that incorporates three public datasets (C3VG, LCVG, CCVG) for CVG and charge prediction. Models were trained on a mixed dataset and evaluated on each dataset's individual test set. The results give insight into the roles of model architecture, model size, and the interdependence of the two tasks, and they underscore the potential of lightweight LLMs in judicial AI.
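The train-on-mixed, test-per-dataset protocol described above can be sketched roughly as follows. This is a minimal illustration, not CVGEvalKit itself: the function names, the toy records, and the `token_overlap` metric are all assumptions made for the example, and only the dataset names come from the summary.

```python
import random
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (fact description, reference court view)
Splits = Dict[str, Dict[str, List[Example]]]


def build_mixed_train(splits: Splits, seed: int = 0) -> List[Example]:
    """Pool the training split of every dataset and shuffle the result."""
    mixed = [ex for name in sorted(splits) for ex in splits[name]["train"]]
    random.Random(seed).shuffle(mixed)
    return mixed


def evaluate_per_dataset(
    generate: Callable[[str], str],
    splits: Splits,
    metric: Callable[[str, str], float],
) -> Dict[str, float]:
    """Score one model separately on each dataset's own test split."""
    scores = {}
    for name, parts in splits.items():
        values = [metric(generate(fact), ref) for fact, ref in parts["test"]]
        scores[name] = sum(values) / len(values)
    return scores


def token_overlap(hypothesis: str, reference: str) -> float:
    """Toy metric (assumption): share of reference tokens found in the hypothesis."""
    ref_tokens = reference.split()
    hyp_tokens = set(hypothesis.split())
    return sum(tok in hyp_tokens for tok in ref_tokens) / len(ref_tokens)


# Toy placeholder records, not taken from the real datasets.
toy_splits: Splits = {
    name: {
        "train": [(f"{name} fact {i}", f"{name} view {i}") for i in range(3)],
        "test": [(f"{name} fact t", f"{name} view t")],
    }
    for name in ("C3VG", "LCVG", "CCVG")
}

mixed_train = build_mixed_train(toy_splits)
# Stand-in "model" that simply rewrites the word "fact" as "view".
scores = evaluate_per_dataset(
    lambda fact: fact.replace("fact", "view"), toy_splits, token_overlap
)
```

Reporting one score per dataset, rather than a single pooled number, keeps differences between the test sets visible even though training used the mixture.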

Key takeaway

For research scientists developing legal AI systems, the study suggests that lightweight LLMs are a promising route to automating Criminal Court View Generation and improving charge prediction accuracy. Consider fine-tuning models such as Qwen3-1.7B or InternLM2.5-1.8B-Chat, which perform strongly in the study, and explore generating the court view before predicting the charge to improve overall accuracy and interpretability.
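The two routes to a charge prediction mentioned above, direct prediction and prediction via a generated court view, might be contrasted as in the sketch below. Everything here is hypothetical: the function names, the substring-based charge extraction, and the toy stand-in models are assumptions for illustration, not the study's implementation.

```python
from typing import Callable, Optional, Sequence, Tuple


def extract_charge(court_view: str, charge_labels: Sequence[str]) -> Optional[str]:
    """Return the known charge label mentioned earliest in the view, if any."""
    hits = [(court_view.find(label), label) for label in charge_labels if label in court_view]
    return min(hits)[1] if hits else None


def predict_via_court_view(
    fact: str,
    generate_view: Callable[[str], str],
    charge_labels: Sequence[str],
) -> Tuple[str, Optional[str]]:
    """Generate the court view first, then read the charge out of it."""
    view = generate_view(fact)
    return view, extract_charge(view, charge_labels)


def predict_direct(fact: str, classify: Callable[[str], str]) -> str:
    """Predict the charge straight from the fact description."""
    return classify(fact)


# Toy stand-ins for fine-tuned models (assumptions, for illustration only).
CHARGES = ["theft", "fraud"]


def toy_generate_view(fact: str) -> str:
    return "The court holds that the defendant's conduct constitutes the crime of theft."


def toy_classify(fact: str) -> str:
    return "theft"


fact = "The defendant took a phone from the victim's bag."
view, charge_from_view = predict_via_court_view(fact, toy_generate_view, CHARGES)
charge_direct = predict_direct(fact, toy_classify)
```

The view-first route returns the generated rationale alongside the charge, which is where the interpretability benefit would come from. A real system would need a more robust way to map the generated text to a charge label than substring matching.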

Key insights

Lightweight LLMs show significant potential for Criminal Court View Generation and charge prediction in legal AI.

Method

Models are fine-tuned using Low-Rank Adaptation (LoRA) on a mixed CVG dataset, then evaluated on separate test sets for CVG and charge prediction.
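For readers unfamiliar with LoRA, the idea is to freeze a pretrained weight matrix W and learn only a low-rank update, so that a layer computes W x + (alpha / r) · B (A x). The pure-Python sketch below illustrates that formulation on a single linear layer. The class name, dimensions, and hyperparameter values are assumptions for the example and say nothing about the settings used in the study.

```python
import random
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """A frozen linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, weight: Matrix, rank: int, alpha: float, seed: int = 0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weights, never updated
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A starts as small random values, B starts at zero,
        # so the layer initially behaves exactly like the frozen one.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x: Vector) -> Vector:
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self) -> int:
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Illustrative 8x8 layer with rank 2 (values chosen only for the example).
W = [[float(i + j) for j in range(8)] for i in range(8)]
layer = LoRALinear(W, rank=2, alpha=4.0)
x = [1.0] * 8

frozen_output = matvec(W, x)
lora_output = layer.forward(x)
full_params = len(W) * len(W[0])
```

In this toy layer the adapter trains 32 values instead of 64. The saving grows with layer width, because the adapter size scales with r · (d_in + d_out) rather than d_in · d_out.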

Best for: Research Scientist, AI Scientist, NLP Engineer, Domain Expert

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.