Claude Legal Prompt Shock, LegalOn GPT 5.4 Review, Legal Innovators +
Summary
Nav Toor recently published 12 detailed legal prompts for Claude, incorporating famous law firm brands like Latham & Watkins and Kirkland & Ellis, which sparked debate within the legal community regarding the quality and impact of such AI-generated contracts. An experiment by Artificial Lawyer (AL) using OpenAI's GPT 5.4 with one of these prompts produced a comprehensive contract, raising questions about its legal soundness. Concurrently, LegalOn's analysis of GPT 5.4 versus GPT 5.2 in contract review showed significant improvements for 5.4, with overall accuracy increasing by 5.5 percentage points to 79.4% and total errors reduced by 21%. This improvement was consistent across all five contract types, with notable gains in NDAs (+10pp) and MSAs (+8pp). The Financial Times adopted Wordsmith as its enterprise legal AI platform, while Harvey partnered with the Dallas Mavericks, American Airlines Center, and Fulham FC. Herbert Smith Freehills Kramer is implementing Legora firmwide, and Centari expanded its deal intelligence platform with "Views" and "Intelligence" products. Additionally, AltaClaro and Verbit's DepoSim were chosen by Taft for lawyer training.
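The two headline figures from LegalOn's analysis can be cross-checked with simple arithmetic. The sketch below assumes that "errors" means every reviewed item the model got wrong (that is, 100% minus accuracy) and that both models were scored on the same test set; the summary does not state either point, and the function name is illustrative.

```python
def relative_error_reduction(new_accuracy_pct: float, gain_pp: float) -> float:
    """Return the relative drop in error rate, as a percentage.

    Assumes error rate = 100 - accuracy, measured on the same items
    for both model versions (an assumption, not stated in the source).
    """
    old_accuracy_pct = new_accuracy_pct - gain_pp   # accuracy before the gain
    old_error_pct = 100.0 - old_accuracy_pct        # error rate before
    new_error_pct = 100.0 - new_accuracy_pct        # error rate after
    return (old_error_pct - new_error_pct) / old_error_pct * 100.0


# Figures reported in the summary: accuracy rose 5.5 points to 79.4%.
reduction = relative_error_reduction(new_accuracy_pct=79.4, gain_pp=5.5)
print(f"{reduction:.1f}% fewer errors")
```

Under those assumptions, an error rate falling from about 26.1% to 20.6% is a relative reduction of roughly 21%, which is consistent with the reported figure.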
Key takeaway
For legal professionals evaluating AI tools for contract drafting or review, recognize that general-purpose LLMs such as GPT 5.4 are showing significant, consistent accuracy improvements in legal tasks. Prioritize solutions that integrate these advanced models, but always maintain rigorous human oversight of critical legal outputs, especially for complex or high-stakes agreements. Be cautious about relying on brand names in prompts as a way to ensure quality.
Key insights
LLMs are steadily improving in legal task accuracy, which may reduce the review burden over time, though at 79.4% overall accuracy human oversight remains necessary.
Principles
- Adding firm names to prompts may not guarantee quality.
- Accuracy can improve measurably from one LLM version to the next.
Method
LegalOn compared GPT 5.4 and 5.2 "out of the box" performance on contract review across five contract types and 26 guidelines to measure accuracy, precision, and recall.
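The three metrics named above are conventionally derived from counts of correct and incorrect calls. The following is a minimal sketch using the standard definitions; the class name, field names, and example counts are hypothetical, and LegalOn's actual scoring procedure is not described in this summary.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Outcome counts for one model on a contract-review test set (hypothetical)."""
    true_positives: int    # guideline issue present and flagged by the model
    false_positives: int   # model flagged an issue that was not present
    true_negatives: int    # no issue present and none flagged
    false_negatives: int   # issue present but missed by the model

    def accuracy(self) -> float:
        total = (self.true_positives + self.false_positives
                 + self.true_negatives + self.false_negatives)
        correct = self.true_positives + self.true_negatives
        return correct / total if total else 0.0

    def precision(self) -> float:
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0

    def recall(self) -> float:
        present = self.true_positives + self.false_negatives
        return self.true_positives / present if present else 0.0


# Made-up counts for illustration only; these are not LegalOn's data.
counts = ReviewCounts(true_positives=60, false_positives=15,
                      true_negatives=20, false_negatives=5)
print(f"accuracy={counts.accuracy():.3f} "
      f"precision={counts.precision():.3f} recall={counts.recall():.3f}")
```

Reporting precision and recall alongside accuracy separates two kinds of failure: flagging issues that are not there, and missing issues that are.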
In practice
- Experiment with branded prompts in a "safe space."
- Monitor LLM accuracy improvements for legal tech adoption.
- Consider AI platforms for broad legal functions.
Topics
- Legal AI
- Large Language Models
- Contract Review
- Legal Technology Adoption
- Prompt Engineering
Best for: NLP Engineer, Entrepreneur, Legal Professional, AI Product Manager, Executive
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Lawyer.