Thomson Is Coming, TR’s Own Legally-Trained LLM
Summary
Thomson Reuters (TR) is preparing to launch "Thomson," its proprietary legally-trained Large Language Model (LLM), this summer. In development since 2024, following TR's acquisition of Safe Sign, Thomson is built on open-source models such as those from Meta or Mistral, and uses TR's extensive legal data for both pre-training and post-training. The LLM aims to outperform general models on legal tasks; internal benchmarks already show it ahead in four out of ten key legal areas. Thomson will integrate with existing TR products such as CoCounsel to enhance contract review and legal research. Its architecture is described as a "huge letter T," combining broad general language understanding with deep legal, tax, and news-related training data. The approach is intended to be portable across foundational open-source models and emphasizes security and privacy, including potential on-premises deployment.
Key takeaway
For legal technology leaders evaluating LLM solutions, Thomson Reuters' "Thomson" suggests that purpose-built, domain-specific LLMs can outperform general models on at least some legal tasks. Your teams should weigh the benefits of specialized models for accuracy, security, and privacy, especially for sensitive applications like contract review and legal research. Prioritize solutions that offer robust data governance and the flexibility to adapt as foundational models evolve.
Key insights
Specialized LLMs, trained on vast domain-specific data, can outperform general models on targeted tasks.
Principles
- Open-source models offer foundational flexibility.
- Domain-specific data is critical for specialized LLM performance.
- Expert input refines LLM accuracy and utility.
Method
Develop a specialized LLM by training open-source models on massive proprietary domain data and refining them with expert input, so the result stays portable across base models and can be improved continuously.
In practice
- Integrate specialized LLMs into existing workflows.
- Consider on-premises deployment for data privacy.
- Continuously update the LLM with new data and training.
Topics
- Thomson Reuters
- Legal LLMs
- Generative AI
- Open-Source Models
- Legal Tech
Best for: CTO, VP of Engineering/Data, Director of AI/ML, AI Engineer, Legal Professional, AI Product Manager
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Lawyer.