Why AI can’t be trusted to write scientific reviews
Summary
The Cochrane Collaboration, a London-based publisher specializing in high-quality health-related systematic reviews, is evaluating artificial-intelligence tools for increasing review efficiency and scale. Despite that potential, current AI tools are not considered ready for mainstream adoption in this high-stakes field, where errors could affect clinical practice and public-health policy. Although AI models can mimic steps of the human review process such as study identification and data extraction, they struggle to define meaningful questions, interpret results, and understand clinical implications, because they lack context and subjective nuance and tend to hallucinate. Furthermore, most available tools are proprietary "black box" systems from private companies, which raises independence concerns for reviews of drugs and medical devices. Practical experience shows that these tools require extensive training and currently make the review process longer than manual methods.
Key takeaway
If you are a research scientist or AI ethicist considering AI integration into systematic reviews, recognize that current tools are not a direct replacement for human expertise. Shift your focus from full automation to designing collaborative workflows in which AI supports, rather than dictates, critical tasks. Prioritize transparent tools built on open-source models to mitigate the risk of bias and preserve independence, especially in health-related evidence synthesis.
Key insights
Current AI tools are not ready for high-stakes scientific systematic reviews due to limitations in context, nuance, and transparency.
Principles
- Human specialists are essential for defining review questions, interpreting results, and understanding implications.
- Proprietary "black box" AI tools introduce independence and bias risks in sensitive scientific reviews.
- AI's optimal role in scientific review is collaborative, augmenting human expertise rather than replacing it.
Method
Developers should build systems that enable effective human-AI collaboration on study assessment, rather than having AI generate reviews on its own.
In practice
- Expect current AI tools for screening and data extraction to lengthen, rather than shorten, the overall review process.
- Scrutinize AI tool provenance and transparency, particularly for reviews evaluating drugs or medical devices.
Topics
- AI in Scientific Review
- Systematic Reviews
- Health Policy
- AI Hallucination
- Human-AI Collaboration
- Proprietary AI
Best for: CTO, VP of Engineering/Data, Director of AI/ML, Research Scientist, AI Scientist, AI Ethicist
Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.