Sci-CoE: Co-evolving Scientific Reasoning LLMs via Geometric Consensus with Sparse Supervision

2026-02-12 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Natural Language Processing · Depth: Expert, quick

Summary

Sci-CoE is a two-stage scientific co-evolving framework designed to enhance large language models' (LLMs) reasoning capabilities in scientific tasks. It addresses the fragility of current LLMs in this domain, which stems from unreliable solution evaluation and limited verification diversity. The first stage uses a small set of annotated data to establish fundamental correctness judgment anchors for the Verifier component. The second stage introduces a geometric reward mechanism that considers consensus, reliability, and diversity, enabling large-scale self-iteration on unlabeled data. This transition from sparse supervision to unsupervised learning allows models to self-evolve as both solver and verifier. Experiments on general scientific benchmarks indicate that Sci-CoE improves complex reasoning and demonstrates strong scalability, leading to more robust and diverse evaluation systems.

Key takeaway

For research scientists developing LLMs for scientific reasoning, Sci-CoE targets two specific weaknesses: unreliable solution evaluation and limited verification diversity. Consider its two-stage approach: use a small annotated set to anchor the Verifier's correctness judgments, then apply the geometric reward mechanism for large-scale self-iteration on unlabeled data.

Key insights

Sci-CoE improves LLM scientific reasoning by co-evolving solver and verifier through sparse-to-unsupervised learning.

Principles

Self-evolution enhances LLM reasoning.
Geometric rewards drive diverse verification.
Sparse supervision anchors initial correctness.

Method

Sci-CoE operates in two stages. The first establishes correctness judgment anchors for the Verifier using a small set of annotated data. The second self-iterates on unlabeled data using a geometric reward mechanism that considers consensus, reliability, and diversity.
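
This summary does not give the reward formula, so the following is only a toy illustration of how consensus, reliability, and diversity signals might be combined into one score per verifier. Everything in it is an assumption made for illustration, not Sci-CoE's actual definition: the pass/fail matrix, the function names, the use of Hamming distance for diversity, and the choice of a geometric mean to combine the three terms.

```python
from math import prod


def hamming_distance(a, b):
    """Fraction of positions where two equal-length pass/fail vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def geometric_mean(values, eps=1e-6):
    """Geometric mean with a small floor so a single zero does not erase the score."""
    return prod(max(v, eps) for v in values) ** (1.0 / len(values))


def verifier_scores(passes, anchor_labels):
    """Score each verifier from three hypothetical signals.

    passes[v][s] is True if verifier v accepts candidate solution s.
    anchor_labels maps a few solution indices to known correctness
    (the sparse supervision); all other solutions are unlabeled.
    """
    n_verifiers = len(passes)
    n_solutions = len(passes[0])
    # Majority verdict per solution across all verifiers (ties count as accept).
    majority = [
        2 * sum(passes[v][s] for v in range(n_verifiers)) >= n_verifiers
        for s in range(n_solutions)
    ]
    scores = []
    for v, row in enumerate(passes):
        # Consensus: agreement with the majority verdict.
        consensus = sum(row[s] == majority[s] for s in range(n_solutions)) / n_solutions
        # Reliability: agreement with the sparse labeled anchors.
        reliability = sum(row[s] == y for s, y in anchor_labels.items()) / len(anchor_labels)
        # Diversity: mean Hamming distance to every other verifier's verdicts.
        others = [passes[u] for u in range(n_verifiers) if u != v]
        diversity = sum(hamming_distance(row, o) for o in others) / len(others)
        scores.append(geometric_mean([consensus, reliability, diversity]))
    return scores
```

Because the terms are multiplied, a verifier that does badly on any one signal is penalized even if it does well on the others. For example, with three verifiers whose verdicts on four solutions are `[T, T, F, F]`, `[T, T, F, T]`, and `[F, F, T, T]`, and anchors `{0: True, 2: False}`, the third verifier contradicts both anchors and receives the lowest score despite being the most distinct.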

In practice

Apply sparse supervision for initial model anchoring.
Implement geometric rewards for self-iteration.
Integrate solver and verifier co-evolution.
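
The three steps above can be arranged as a control-flow skeleton. This is a sketch of the ordering only, under the assumption that a single shared model state plays both solver and verifier roles; `co_evolve` and its four callables (`fit_anchors`, `solve`, `verify`, `update`) are hypothetical placeholders, not names from the Sci-CoE codebase.

```python
from typing import Any, Callable, Iterable, List, Tuple


def co_evolve(
    labeled: Iterable[Tuple[str, bool]],
    unlabeled: Iterable[str],
    fit_anchors: Callable[[Iterable[Tuple[str, bool]]], Any],
    solve: Callable[[Any, str], List[str]],
    verify: Callable[[Any, str, List[str]], List[float]],
    update: Callable[[Any, str, List[str], List[float]], Any],
    rounds: int = 1,
) -> Any:
    """Run a two-stage loop: anchor on labeled data, then self-iterate."""
    # Stage 1: sparse supervision anchors the initial correctness judgments.
    model = fit_anchors(labeled)

    # Stage 2: self-iteration on unlabeled problems.
    problems = list(unlabeled)
    for _ in range(rounds):
        for problem in problems:
            # Solver role: propose candidate solutions.
            candidates = solve(model, problem)
            # Verifier role: assign one reward per candidate.
            rewards = verify(model, problem, candidates)
            # Co-evolution: one update consumes both roles' outputs.
            model = update(model, problem, candidates, rewards)
    return model
```

Passing the stages in as callables keeps the skeleton independent of any training library; a real implementation would supply its own training and scoring routines in their place.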

Topics

Sci-CoE
Scientific Reasoning
Large Language Models
Co-evolution
Geometric Reward Mechanism

Code references

InternScience/Sci-CoE

Best for: Research Scientist, AI Researcher, AI Scientist, Deep Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.