OCOO-T: A Simple and Scalable Virtual Cell Model for Transcriptional Perturbation Response Prediction

· Source: cs.AI updates on arXiv.org · Field: Science & Research — Life Sciences & Biology, Mathematics & Computational Sciences, Artificial Intelligence & Machine Learning · Depth: Expert, extended

Summary

OCOO-T is a minimalist flow-matching-based AI Virtual Cell (AIVC) model designed for predicting single-cell transcriptional responses to genetic, chemical, and cytokine perturbations. Developed by INFevo, it employs a vanilla Transformer stack that directly processes continuous gene expression profiles, framing prediction as a continuous-time denoising process. Perturbation embeddings, dosage, and cell-line/cell-type specificity are integrated via adaptive layer normalization and in-context tokens. Evaluated on Tahoe100M, Replogle, and PBMC benchmarks, OCOO-T achieves strong performance across diverse perturbations and cell types. It scales to long transcriptional profiles (e.g., 18,080 genes) through a patching and depatching mechanism, demonstrating that architectural simplicity can yield competitive results without complex auxiliary modules or latent spaces.
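
The patching and depatching idea can be illustrated with a small sketch: split a long expression vector into fixed-size patches (one token each) and later concatenate them back. The patch size of 64, the zero padding, and the function names `patchify` and `depatchify` are assumptions made for illustration; the summary does not state how OCOO-T sizes or pads its patches, and a real model would typically project each patch to an embedding before the Transformer.

```python
from typing import List


def patchify(expr: List[float], patch_size: int, pad_value: float = 0.0) -> List[List[float]]:
    """Split a 1-D expression profile into equal-length patches.

    The tail is padded with pad_value so every patch has patch_size entries
    (zero padding is an assumption, not a detail from the source).
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    n_pad = (-len(expr)) % patch_size
    padded = list(expr) + [pad_value] * n_pad
    return [padded[i:i + patch_size] for i in range(0, len(padded), patch_size)]


def depatchify(patches: List[List[float]], n_genes: int) -> List[float]:
    """Concatenate patches and drop the padding to recover n_genes values."""
    flat = [value for patch in patches for value in patch]
    return flat[:n_genes]


# Toy profile with the 18,080-gene length mentioned in the summary.
profile = [float(i % 7) for i in range(18080)]
tokens = patchify(profile, patch_size=64)   # hypothetical patch size
restored = depatchify(tokens, n_genes=len(profile))
```

With these assumed settings the Transformer would attend over a few hundred patch tokens instead of 18,080 per-gene tokens, which is the practical reason patching helps with long profiles.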

Key takeaway

For AI Scientists and Machine Learning Engineers developing virtual cell models, OCOO-T demonstrates that focusing on a minimalist Transformer architecture with direct gene expression processing and flow-matching denoising can achieve strong results. You should consider simplifying your model designs, leveraging patching for full-transcriptome analysis, and exploring direct covariate conditioning to enhance scalability and generalizability in perturbation response prediction. This approach avoids complex latent spaces and auxiliary encoders, streamlining development.
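
As a rough illustration of direct covariate conditioning through adaptive layer normalization, the sketch below normalizes a token and then applies a scale and shift computed from a conditioning vector (for example, a combined perturbation, dosage, and cell-type embedding). The `(1 + scale)` parameterization, the single linear maps, and all names are assumptions borrowed from common practice in conditioned Transformers; the summary does not specify OCOO-T's exact formulation.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize x to zero mean and unit variance, with no learned affine terms."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def linear(weight: Matrix, bias: Vector, cond: Vector) -> Vector:
    """Compute weight @ cond + bias for small dense lists."""
    return [sum(w * c for w, c in zip(row, cond)) + b for row, b in zip(weight, bias)]


def ada_layer_norm(x: Vector, cond: Vector,
                   w_scale: Matrix, b_scale: Vector,
                   w_shift: Matrix, b_shift: Vector) -> Vector:
    """Layer norm whose scale and shift are predicted from the condition vector."""
    scale = linear(w_scale, b_scale, cond)
    shift = linear(w_shift, b_shift, cond)
    normed = layer_norm(x)
    return [n * (1.0 + s) + h for n, s, h in zip(normed, scale, shift)]


# Toy example: a 3-dimensional token and a 2-dimensional condition embedding.
token = [1.0, 2.0, 4.0]
cond = [0.5, -1.0]
zeros_w = [[0.0, 0.0] for _ in token]
zeros_b = [0.0 for _ in token]

# With zero weights and biases the layer reduces to plain layer normalization.
unconditioned = ada_layer_norm(token, cond, zeros_w, zeros_b, zeros_w, zeros_b)
# A shift bias of 1.0 moves every normalized feature up by exactly 1.0.
shifted = ada_layer_norm(token, cond, zeros_w, zeros_b, zeros_w, [1.0, 1.0, 1.0])
```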

Key insights

OCOO-T uses a simple Transformer and flow matching to directly denoise gene expression for perturbation prediction.

Method

OCOO-T formulates prediction as continuous-time denoising using Rectified Flow. During training, it linearly interpolates between a Gaussian noise sample and the target response, and a Transformer learns to predict the velocity along that path. At inference, an ODE solver integrates the learned velocity field from noise to the predicted response.
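
The Rectified Flow recipe can be sketched in a few lines of plain Python. The function names, the fixed-step Euler solver, and the oracle velocity field that stands in for the trained Transformer are illustrative assumptions, not the OCOO-T implementation; the summary does not say which ODE solver or loss weighting the authors use.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def rf_training_pair(x0: Vector, x1: Vector, t: float) -> Tuple[Vector, Vector]:
    """Return (x_t, target velocity) on the straight path x_t = (1 - t) * x0 + t * x1."""
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    velocity = [b - a for a, b in zip(x0, x1)]  # constant along the straight path
    return x_t, velocity


def velocity_mse(pred: Vector, target: Vector) -> float:
    """Mean squared error between predicted and target velocity (the training loss)."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def euler_integrate(velocity_fn: Callable[[Vector, float], Vector],
                    x0: Vector, n_steps: int = 10) -> Vector:
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with fixed Euler steps."""
    x = list(x0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy "perturbed expression" target and a Gaussian noise starting point.
target = [2.0, 0.0, 5.0, 1.0]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in target]

# One training example at t = 0.5: the model would regress `v` from (x_half, t, condition).
x_half, v = rf_training_pair(noise, target, 0.5)


def oracle_velocity(x: Vector, t: float) -> Vector:
    """Stand-in for the trained network: the exact velocity toward a single known target."""
    return [(b - xi) / (1.0 - t) for xi, b in zip(x, target)]


predicted = euler_integrate(oracle_velocity, noise, n_steps=10)
```

Because the interpolation path is a straight line, the oracle velocity is constant along it and even a coarse Euler solver lands on the target; a learned velocity field is only approximately straight, so the number of solver steps becomes a quality and cost trade-off.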

Best for: AI Scientist, Machine Learning Engineer, Research Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.