scLLM-DSC: LLM-Knowledge Enhanced Cross-Modal Deep Structural Clustering for Single-Cell RNA Sequencing

2026-06-12 · Source: cs.AI updates on arXiv.org · Field: Science & Research — Life Sciences & Biology, Mathematics & Computational Sciences, Research Methodology & Innovation · Depth: Expert, extended

Summary

scLLM-DSC is an LLM-knowledge-enhanced, cross-modal deep structural clustering framework for single-cell RNA sequencing (scRNA-seq) analysis. It targets two problems: the semantic agnosticism of traditional clustering methods, which treat expression profiles as purely numerical patterns, and the structural mismatch that arises when Large Language Models (LLMs) are adapted directly to discriminative tasks. The framework builds a semantically grounded representation from two views: a Knowledge-Driven Semantic View, derived from NCBI gene priors and contextualized Cell2Sentence embeddings, and a Structure-Aware Topological View, extracted by a graph-guided encoder. A cross-modal contrastive alignment mechanism enforces consistency between biological semantics and transcriptomic features in a unified latent space. In the reported benchmarks, scLLM-DSC outperforms eleven state-of-the-art baselines, reaching 88.80% accuracy (ACC), 85.35% normalized mutual information (NMI), and 83.04% adjusted Rand index (ARI), while also providing explicit biological attribution.

Key takeaway

If you are a research scientist or machine learning engineer analyzing single-cell RNA sequencing data and current clustering methods fall short on accuracy or semantic interpretability, consider a knowledge-enhanced approach. scLLM-DSC shows that integrating LLM-derived biological semantics with structural transcriptomic features through cross-modal alignment improves clustering fidelity and provides explicit biological attribution. Moving beyond purely numerical patterns can yield more robust and interpretable cell population identification.

Key insights

Integrating LLM-derived biological semantics with structural transcriptomic features enhances scRNA-seq clustering accuracy and interpretability.

Principles

Semantic grounding improves biological reasoning.
Cross-modal alignment unifies diverse data views.
Task-specific optimization overcomes general model limitations.

Method

Collect NCBI gene metadata and encode it with an LLM; build dual-path cell semantic embeddings; extract structural features with a graph-guided encoder; then align and fuse the two views with contrastive learning to produce the clustering.
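
The structural path of the method can be illustrated with a small NumPy sketch. This is not the authors' encoder: it assumes a k-nearest-neighbour cell graph and a single GCN-style propagation layer as a stand-in for the graph-guided encoder, and the names knn_adjacency and propagate, the toy Poisson counts, and the random weight matrix are all hypothetical.

```python
import numpy as np

def knn_adjacency(x, k):
    """Symmetric k-nearest-neighbour adjacency (no self-loops).

    x: (n_cells, n_features) cell feature matrix.
    """
    sq = np.sum(x ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * x @ x.T  # squared Euclidean distances
    np.fill_diagonal(dist, np.inf)                    # a cell is not its own neighbour
    idx = np.argsort(dist, axis=1)[:, :k]             # k closest cells per row
    n = x.shape[0]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)                     # symmetrize

def propagate(x, adj, weight):
    """One GCN-style layer: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])                # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ x @ weight, 0.0)

# Toy data (assumption): 6 cells x 10 genes of log-transformed counts.
rng = np.random.default_rng(0)
expr = np.log1p(rng.poisson(2.0, size=(6, 10)).astype(float))
adj = knn_adjacency(expr, k=2)
weight = rng.normal(size=(10, 4))                     # untrained, illustrative weights
structural_view = propagate(expr, adj, weight)        # (6, 4) structure-aware features
```

In a trained model the weight matrix would be learned jointly with the clustering objective; here it only shows how neighbourhood structure is mixed into each cell's features.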

In practice

Use LLMs as semantic mappers for biological priors.
Combine expression-weighted and sequence-based gene semantics.
Apply InfoNCE loss for cross-modal feature alignment.
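
The practices above can be sketched in a few NumPy functions. This is a minimal illustration under stated assumptions, not the paper's implementation: the gene embeddings are random placeholders for LLM-encoded NCBI descriptions, the rank-ordered gene sentence follows the general Cell2Sentence idea of listing genes by decreasing expression, and the function names, temperature value, and symmetric form of the loss are hypothetical choices.

```python
import numpy as np

def expression_weighted_embedding(expr, gene_emb):
    """Pool gene embeddings per cell, weighted by normalized expression.

    expr: (n_cells, n_genes) nonnegative; gene_emb: (n_genes, d).
    """
    totals = np.clip(expr.sum(axis=1, keepdims=True), 1e-12, None)
    return (expr / totals) @ gene_emb

def cell_sentence(expr_row, gene_names, top_k):
    """Sequence-based path: gene names ordered by decreasing expression."""
    order = np.argsort(-expr_row, kind="stable")[:top_k]
    return " ".join(gene_names[i] for i in order)

def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE: row i of z_a and row i of z_b form the positive pair;
    every other row in the batch acts as a negative."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                # cosine similarities

    def cross_entropy_diag(l):
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))

# Toy usage (all values are illustrative assumptions).
pooled = expression_weighted_embedding(
    np.array([[1.0, 3.0]]), np.array([[0.0, 0.0], [4.0, 8.0]])
)
sentence = cell_sentence(np.array([0.0, 5.0, 2.0]), ["CD3E", "MS4A1", "NKG7"], top_k=2)

rng = np.random.default_rng(0)
semantic = rng.normal(size=(8, 16))                   # stand-in semantic view
aligned = semantic + 0.01 * rng.normal(size=(8, 16))  # matching structural view
loss_aligned = info_nce(semantic, aligned)
loss_shuffled = info_nce(semantic, aligned[::-1])     # mismatched cell pairs
```

The loss is lower when the two views of the same cell agree than when the pairing is shuffled, which is the behaviour the alignment objective relies on.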

Topics

Single-Cell RNA Sequencing
Deep Structural Clustering
Large Language Models
Cross-Modal Alignment
Biological Semantics
Cell Population Identification

Code references

XPgogogo/scLLM-DSC

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.