"Excuse me, may I say something..." CoLabScience, A Proactive AI Assistant for Biomedical Discovery and LLM-Expert Collaborations

· Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems, Health & Medical Research · Depth: Expert, extended

Summary

CoLabScience is a proactive AI assistant designed to improve human-AI collaboration in biomedical discovery by enabling Large Language Models (LLMs) to intervene in scientific discussions autonomously. Unlike traditional reactive LLMs, CoLabScience uses a novel framework called PULI (Positive-Unlabeled Learning-to-Intervene), which is trained with reinforcement learning to determine both when and how to intervene in streaming dialogues. The system leverages a team's project proposal and conversational memory (long-term and short-term) to provide context-aware suggestions. To support its development, the researchers introduced BSDD (Biomedical Streaming Dialogue Dataset), a new benchmark of simulated research discussions with intervention points derived from PubMed articles. Experimental results demonstrate that PULI significantly outperforms existing baselines in intervention precision and collaborative task utility across LLM backbones like LLaMA3 and Qwen3, achieving up to 67.4% accuracy in timing and 33.5% ROUGE-1 for content quality.

Key takeaway

For AI scientists and machine learning engineers developing collaborative LLM agents, CoLabScience's PULI framework offers a way to move beyond purely reactive systems. Consider a two-stage intervention model that separates the timing decision (when to speak) from content generation (what to say), which the summary above credits with better efficiency and relevance. This can help an LLM act as an active team member, rather than a passive tool, in complex scientific or professional dialogues.

Key insights

Proactive LLMs can enhance biomedical collaboration by autonomously intervening in discussions based on context and memory.

Principles

Method

PULI combines a coordinator, an Observer LLM, and a Presenter LLM in an end-to-end reinforcement learning loop. The Observer decides when to intervene and the Presenter generates the intervention content; both are informed by the project context and the dialogue memory.
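
The two-stage control flow described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `DialogueMemory`, `run_coordinator`, `keyword_observer`, and `template_presenter`, the 0.5 threshold, and the four-utterance short-term window are all assumptions made for the example. The Observer and Presenter are plain Python stubs standing in for LLM calls, and the reinforcement learning and positive-unlabeled training steps are omitted entirely.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List, Tuple


@dataclass
class DialogueMemory:
    """Hypothetical memory: a bounded short-term window plus a full long-term log."""
    short_term: Deque[str] = field(default_factory=lambda: deque(maxlen=4))
    long_term: List[str] = field(default_factory=list)

    def add(self, utterance: str) -> None:
        self.short_term.append(utterance)  # oldest entry drops out when full
        self.long_term.append(utterance)


# Assumed signatures: both stages see the project proposal and the memory.
Observer = Callable[[str, DialogueMemory], float]   # returns an intervention score
Presenter = Callable[[str, DialogueMemory], str]    # returns intervention text


def run_coordinator(
    utterances: List[str],
    proposal: str,
    observer: Observer,
    presenter: Presenter,
    threshold: float = 0.5,  # illustrative cutoff, not taken from the paper
) -> List[Tuple[int, str]]:
    """Stream the dialogue; call the Presenter only when the Observer fires."""
    memory = DialogueMemory()
    interventions: List[Tuple[int, str]] = []
    for turn, utterance in enumerate(utterances):
        memory.add(utterance)
        if observer(proposal, memory) >= threshold:           # stage 1: when
            interventions.append((turn, presenter(proposal, memory)))  # stage 2: how
    return interventions


def keyword_observer(proposal: str, memory: DialogueMemory) -> float:
    """Stub for the Observer LLM: fires when the latest speaker sounds uncertain."""
    latest = memory.short_term[-1].lower()
    return 1.0 if "not sure" in latest else 0.0


def template_presenter(proposal: str, memory: DialogueMemory) -> str:
    """Stub for the Presenter LLM: builds a templated suggestion from context."""
    return f"Given the proposal '{proposal}', consider revisiting: {memory.short_term[-1]}"
```

Calling `run_coordinator` with a list of utterances, a proposal string, and the two stubs returns `(turn_index, text)` pairs for the turns where the Observer fired. Because the Presenter runs only on those turns, the more expensive generation step is skipped for most of the stream, which is one plausible reading of the efficiency benefit noted in the takeaway.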

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.