Crosby Starts Contract Benchmark, Launches Agent Research Group

2026-06-17 · Source: Artificial Lawyer · Field: Legal & Regulatory — Legal Technology (LegalTech), Corporate Law & Business Legal Services, Artificial Intelligence & Machine Learning · Depth: Intermediate, short

Summary

NewMod law firm Crosby has launched "Multi-turn Negotiation Bench" (Redline), a contract negotiation benchmark for AI outputs, alongside Crosby Intelligence, a research arm focused on legal agents. The benchmark evaluates frontier models on senior commercial lawyers' workflows, measuring negotiation as a sequence of judgment calls. Initial findings show ChatGPT 5.5 performing best at 50.5%, followed by Claude Fable 5 at 47.3%, Gemini 3.5 Flash at 45.1%, and Claude Opus 4.8 at 44.4%. Human lawyers still excel at finding "new routes to resolution," where AI models tend to get stuck. Crosby Intelligence aims to build "agentic attorneys" for the firm, releasing benchmarks for legal judgment domains and focusing on simulating negotiations to reduce time-to-signature from weeks to hours. Crosby has raised over $85 million in funding.

Key takeaway

For Directors of AI/ML evaluating legal tech investments, Crosby's initiatives underscore that while AI models are improving, human judgment remains critical for complex contract negotiations. Your teams should focus on solutions that use AI for efficiency in high-volume, fixed-fee tasks, while keeping lawyers in the loop where a negotiation calls for "new routes to resolution." Consider benchmarking AI tools against multi-turn negotiation workflows to identify where human-in-the-loop processes are indispensable.

Key insights

Crosby's new benchmark and agent research highlight AI's current negotiation limits and future potential in legal workflows.

Principles

Contract negotiation requires sequential judgment.
AI excels at isolated edits, not "new routes."
Automation boosts fixed-fee legal models.

Method

The "Multi-turn Negotiation Bench" measures AI performance in contract negotiation by evaluating a sequence of judgment calls, including understanding deal context, commercial leverage, legal edits, and anticipating counterparty responses.
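
To make the idea of scoring "a sequence of judgment calls" concrete, here is a minimal Python sketch. The source does not describe Crosby's actual scoring formula, so everything here is an assumption for illustration: the `Turn` class, the 0.0 to 1.0 reviewer grades, and the plain averaging are hypothetical, and only the four dimension names come from the Method description above.

```python
from dataclasses import dataclass
from statistics import mean

# The four judgment areas named in the Method description.
# Treating them as equally weighted is an assumption, not Crosby's rubric.
DIMENSIONS = (
    "deal_context",
    "commercial_leverage",
    "legal_edits",
    "counterparty_anticipation",
)


@dataclass
class Turn:
    """One negotiation round, graded per dimension on a 0.0 to 1.0 scale (hypothetical)."""
    grades: dict[str, float]


def turn_score(turn: Turn) -> float:
    # Average the grades across dimensions; an ungraded dimension counts as 0.0.
    return mean(turn.grades.get(d, 0.0) for d in DIMENSIONS)


def negotiation_score(turns: list[Turn]) -> float:
    # Score the whole negotiation as the mean of its per-turn scores.
    if not turns:
        raise ValueError("a negotiation needs at least one turn")
    return mean(turn_score(t) for t in turns)


if __name__ == "__main__":
    example = [
        Turn({d: 1.0 for d in DIMENSIONS}),  # strong opening round
        Turn({"deal_context": 1.0, "legal_edits": 1.0}),  # misses leverage and anticipation
    ]
    print(negotiation_score(example))  # 0.75
```

Averaging per turn first keeps a long negotiation from being dominated by any single round. A real benchmark might instead weight later turns more heavily or penalize a stalled negotiation outright, which is where the "new routes to resolution" gap would show up.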

In practice

Integrate human judgment for complex negotiation.
Explore agentic AI for high-volume contract work.
Benchmark AI tools against specific legal workflows.

Topics

Legal AI
Contract Negotiation
AI Benchmarking
Legal Agents
Law Firm Automation
Generative AI

Best for: AI Engineer, NLP Engineer, Research Scientist, Legal Professional, AI Scientist, Director of AI/ML

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Lawyer.