Perplexity upgrades Deep Research tool with Claude Opus 4.5 integration
Summary
Perplexity has upgraded its Deep Research tool, integrating Anthropic's Claude Opus 4.5 model with its proprietary search engine and sandbox infrastructure. The upgrade is available to Max subscribers immediately and will roll out to Pro users soon. Alongside it, Perplexity released DRACO, an open-source benchmark that evaluates deep research agents on real-world usage patterns across 100 tasks in 10 domains, including Finance, Law, and Medicine. The benchmark scores responses against 40 expert-defined criteria across four dimensions: factual accuracy, breadth/depth of analysis, presentation, and citation quality. Perplexity's Deep Research tool achieved a normalized score of 67.15% on DRACO, ahead of Google Gemini Deep Research (58.97%) and OpenAI Deep Research (52.06% with the o3 model). Perplexity's tool also recorded the lowest average latency, at 459.6 seconds.
Key takeaway
For AI Scientists and Research Scientists evaluating deep research agents: Perplexity's upgraded Deep Research tool, built on Claude Opus 4.5, scored highest on the new DRACO benchmark. Consider it for applications that require high factual accuracy and comprehensive analysis, especially in domains such as Law and Medicine, where it showed significant leads over competitors. Its low latency also makes it suitable for time-sensitive research tasks.
Key insights
Perplexity's Deep Research tool, powered by Claude Opus 4.5, leads DRACO, a new open-source benchmark for deep research agents.
Principles
- Benchmarks should reflect real-world usage.
- Accuracy and efficiency are critical for research agents.
Method
DRACO tasks are derived from anonymized user requests, expanded into complex, open-ended research tasks, and scored against 40 expert-defined criteria.
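To make the idea of a normalized rubric score concrete, here is a minimal Python sketch. The article does not describe DRACO's actual scoring formula, so the weighting scheme, the binary met/not-met verdicts, and the names `Criterion`, `normalized_score`, and `score_by_dimension` are all assumptions for illustration. Only the four dimension names come from the summary.

```python
from dataclasses import dataclass

# Dimension names are taken from the summary; everything else is hypothetical.
DIMENSIONS = (
    "factual accuracy",
    "breadth/depth of analysis",
    "presentation",
    "citation quality",
)


@dataclass(frozen=True)
class Criterion:
    dimension: str
    weight: float  # assumed importance weight (not specified by the article)
    met: bool      # assumed binary verdict from a grader


def normalized_score(criteria):
    """Return earned weight as a percentage of the total available weight."""
    total = sum(c.weight for c in criteria)
    if total <= 0:
        raise ValueError("criteria must carry a positive total weight")
    earned = sum(c.weight for c in criteria if c.met)
    return 100.0 * earned / total


def score_by_dimension(criteria):
    """Normalized score per dimension, skipping dimensions with no criteria."""
    result = {}
    for dim in DIMENSIONS:
        subset = [c for c in criteria if c.dimension == dim]
        if subset:
            result[dim] = normalized_score(subset)
    return result


# Made-up verdicts for a single task, purely for illustration.
sample = [
    Criterion("factual accuracy", 2.0, True),
    Criterion("factual accuracy", 1.0, False),
    Criterion("citation quality", 1.0, True),
]
print(normalized_score(sample))  # 75.0
print(score_by_dimension(sample))
```

Under these assumptions, a benchmark-level figure such as the 67.15% reported above would be an aggregate of per-task scores, though how DRACO aggregates them is not stated in the article.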
In practice
- Use DRACO to evaluate deep research agents.
- Prioritize tools with low latency and high accuracy.
Topics
- Perplexity Deep Research
- Claude Opus 4.5
- DRACO Benchmark
- Deep Research Agents
- AI Search Engines
Best for: AI Scientist, Research Scientist, AI Researcher, AI Product Manager, Tech Journalist
Editorial summary, takeaway, and curation by AIssential. Original article published by Dataconomy.