Claude Legal Prompt Shock, LegalOn GPT 5.4 Review, Legal Innovators +

2026-03-20 · Source: Artificial Lawyer · Field: Legal & Regulatory — Legal Technology (LegalTech), Corporate Law & Business Legal Services, Compliance & Risk Management · Depth: Intermediate, medium

Summary

Nav Toor recently published 12 detailed legal prompts for Claude, incorporating famous law firm brands like Latham & Watkins and Kirkland & Ellis, which sparked debate within the legal community regarding the quality and impact of such AI-generated contracts. An experiment by Artificial Lawyer (AL) using OpenAI's GPT 5.4 with one of these prompts produced a comprehensive contract, raising questions about its legal soundness. Concurrently, LegalOn's analysis of GPT 5.4 versus GPT 5.2 in contract review showed significant improvements for 5.4, with overall accuracy increasing by 5.5 percentage points to 79.4% and total errors reduced by 21%. This improvement was consistent across all five contract types, with notable gains in NDAs (+10pp) and MSAs (+8pp). The Financial Times adopted Wordsmith as its enterprise legal AI platform, while Harvey partnered with the Dallas Mavericks, American Airlines Center, and Fulham FC. Herbert Smith Freehills Kramer is implementing Legora firmwide, and Centari expanded its deal intelligence platform with "Views" and "Intelligence" products. Additionally, AltaClaro and Verbit's DepoSim were chosen by Taft for lawyer training.

Key takeaway

For legal professionals evaluating AI tools for contract drafting or review, recognize that general LLMs like GPT 5.4 are demonstrating significant, consistent accuracy improvements in legal tasks. You should prioritize solutions that integrate these advanced models, but always maintain rigorous human oversight for critical legal outputs, especially for complex or high-stakes agreements. Be cautious about relying solely on prompt engineering with brand names to ensure quality.

Key insights

LLMs are steadily improving in legal task accuracy (GPT 5.4 reached 79.4% overall in LegalOn's contract review comparison), which may reduce, but does not remove, the need for human oversight.

Principles

Adding firm names to prompts does not guarantee quality.
Accuracy gains can accumulate across successive LLM versions (GPT 5.4 improved on GPT 5.2 by 5.5 percentage points).

Method

LegalOn compared GPT 5.4 and 5.2 "out of the box" performance on contract review across five contract types and 26 guidelines to measure accuracy, precision, and recall.
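As a rough illustration of the three metrics named above, and of how a 5.5 percentage point accuracy gain relates to a 21% drop in errors, here is a minimal Python sketch. LegalOn's actual scoring procedure is not described in this summary, so the class, function names, and the assumption that "errors" means items not scored correct on the same test set are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Hypothetical tally of contract-review outcomes (not LegalOn's data model)."""
    true_pos: int   # issue present, model flagged it
    false_pos: int  # no issue present, model flagged one anyway
    false_neg: int  # issue present, model missed it
    true_neg: int   # no issue present, model correctly stayed silent


def review_metrics(c: ReviewCounts) -> dict[str, float]:
    """Standard accuracy, precision, and recall from confusion counts."""
    total = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    flagged = c.true_pos + c.false_pos   # everything the model flagged
    actual = c.true_pos + c.false_neg    # everything that was really an issue
    return {
        "accuracy": (c.true_pos + c.true_neg) / total if total else 0.0,
        "precision": c.true_pos / flagged if flagged else 0.0,
        "recall": c.true_pos / actual if actual else 0.0,
    }


def relative_error_reduction(old_acc_pct: float, new_acc_pct: float) -> float:
    """Percent drop in error rate, assuming error rate = 100 - accuracy."""
    old_err = 100.0 - old_acc_pct
    new_err = 100.0 - new_acc_pct
    return (old_err - new_err) / old_err * 100.0


if __name__ == "__main__":
    # Made-up counts, purely to show the calculation.
    print(review_metrics(ReviewCounts(true_pos=8, false_pos=2, false_neg=2, true_neg=8)))
    # Figures from the summary: 79.4% accuracy after a 5.5 point gain.
    print(round(relative_error_reduction(79.4 - 5.5, 79.4), 1))
```

Under that assumption, moving from 73.9% to 79.4% accuracy cuts the error rate from 26.1% to 20.6%, a relative reduction of roughly 21%, which is consistent with the figures reported in the summary.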

In practice

Experiment with branded prompts in a "safe space."
Monitor LLM accuracy improvements for legal tech adoption.
Consider AI platforms for broad legal functions.

Topics

Legal AI
Large Language Models
Contract Review
Legal Technology Adoption
Prompt Engineering

Best for: NLP Engineer, Entrepreneur, Legal Professional, AI Product Manager, Executive

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Lawyer.