Accelerating pharmaceutical research with a multi-agent AI system
Summary
A multi-agent AI system was developed for a pharmaceutical client to accelerate research by making decades of fragmented data discoverable and usable. The project evolved through three stages: "Search," built on semantic search and knowledge graphs; "Ask," which added sophisticated retrieval and a conversational UI; and "Do," which used multiple agents for research, evaluation, and content creation. Key technical innovations included a four-agent system with a "reflection agent" that checks outputs for accuracy and comprehensiveness, and AI evaluation using Langfuse and Ragas for metrics such as answer relevancy and faithfulness. Data quality was improved through a pipeline of PDF extraction, chunking, embeddings, named entity recognition with GPT-4, and LLM-as-judge evaluation. The system reduced manual searching by 90%, cut regulatory report drafting from weeks to minutes, and helped avoid redundant experiments.
Key takeaway
AI Engineers and Research Scientists tasked with integrating AI into data-rich, regulated environments like pharmaceuticals should consider a multi-agent architecture. Develop iteratively, from search through to agentic "Do" capabilities, and include a reflection agent to check accuracy. Evaluate LLM outputs with platforms like Langfuse and Ragas, and invest in structuring fragmented data, for example by using GPT-4 for named entity recognition, so that outcomes are reliable and high-impact.
Key insights
Multi-agent AI systems can transform fragmented data into actionable insights, significantly accelerating complex research processes.
Principles
- Iterative AI development (Search -> Ask -> Do) builds capability in stages, with each stage resting on the one before.
- Reflection agents enhance AI accuracy and comprehensiveness.
- Ground truth curation is vital for robust LLM evaluation.
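To illustrate the ground truth principle above, here is a minimal sketch of an evaluation loop over curated question/reference pairs. It is not the project's evaluation code and does not call Langfuse or Ragas; `GroundTruthItem`, `token_overlap_judge`, and `evaluate` are hypothetical names, and the token-overlap scorer is only a toy stand-in for an LLM judge or a metric such as faithfulness.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass
class GroundTruthItem:
    """One curated example (hypothetical structure)."""
    question: str
    reference: str  # expert-approved answer


def token_overlap_judge(answer: str, reference: str) -> float:
    """Toy scorer: fraction of reference tokens that appear in the answer.

    A real setup would replace this with an LLM judge or a library metric.
    """
    ref_tokens = set(reference.lower().split())
    ans_tokens = set(answer.lower().split())
    if not ref_tokens:
        return 0.0
    return len(ref_tokens & ans_tokens) / len(ref_tokens)


def evaluate(
    system: Callable[[str], str],
    items: Sequence[GroundTruthItem],
    judge: Callable[[str, str], float] = token_overlap_judge,
) -> float:
    """Run the system on every curated question and average the judge scores."""
    scores = [judge(system(item.question), item.reference) for item in items]
    return mean(scores) if scores else 0.0
```

Because the judge is passed in as a parameter, the same curated set can be reused when the scoring method changes, which is the main reason to invest in the ground truth itself.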
Method
Implement a multi-stage AI system: semantic search, conversational Q&A, and agentic task execution. Enhance data quality via extraction, chunking, embedding, NER, and LLM-as-judge evaluation.
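The chunking step in the method above can be sketched as follows. The summary does not say how the documents were chunked, so the fixed-size word windows with overlap used here are an assumption, and `chunk_text` with its parameters is a hypothetical helper rather than the project's implementation.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks that share `overlap` words.

    Overlap keeps a sentence that straddles a boundary retrievable
    from at least one chunk (assumed strategy, not from the source).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be embedded and passed to the NER and LLM-as-judge steps; those stages depend on external models and are left out of this sketch.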
In practice
- Use a reflection agent to self-correct AI outputs.
- Integrate Langfuse and Ragas for LLM evaluation metrics.
- Structure unstructured data with GPT-4 for NER.
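The first practice above, a reflection agent that self-corrects outputs, might look roughly like the loop below. This is a sketch under assumptions: the summary does not describe how the four agents communicate, so `reflect_and_revise`, `draft_fn`, and `critique_fn` are hypothetical names, and the two agents are plain callables where a real system would call an LLM.

```python
from typing import Callable, Optional

# Hypothetical signatures: a drafting agent that may receive feedback,
# and a reflection agent that returns feedback or None when satisfied.
DraftFn = Callable[[str, Optional[str]], str]
CritiqueFn = Callable[[str, str], Optional[str]]


def reflect_and_revise(
    question: str,
    draft_fn: DraftFn,
    critique_fn: CritiqueFn,
    max_rounds: int = 3,
) -> str:
    """Draft, critique, and redraft until the critic approves or rounds run out."""
    feedback: Optional[str] = None
    answer = ""
    for _ in range(max_rounds):
        answer = draft_fn(question, feedback)
        feedback = critique_fn(question, answer)
        if feedback is None:  # reflection agent found nothing to fix
            break
    return answer


# Stub agents for demonstration only; real agents would wrap LLM calls.
def stub_draft(question: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "Compound X was tested."
    return "Compound X was tested [source: report.pdf]."


def stub_critique(question: str, answer: str) -> Optional[str]:
    return None if "[source:" in answer else "Cite a source."
```

Capping the loop with `max_rounds` is a design choice made for this sketch: it bounds cost and latency when the critic never approves.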
Topics
- Multi-agent AI Systems
- Pharmaceutical Research
- LLM Evaluation
- Knowledge Graphs
- Named Entity Recognition
- Data Quality Improvement
Best for: AI Engineer, Research Scientist, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Thoughtworks Insights.