A diffusion model conditioned on compound bioactivity profiles for generating high-content images

Source: Machine learning : nature.com subject feeds · Field: Science & Research (Life Sciences & Biology, Health & Medical Research, Mathematics & Computational Sciences) · Depth: Expert, medium

Summary

Novartis researchers developed Profile-Diffusion (pDIFF), a generative method that uses a profile-to-image latent diffusion model to anticipate phenotypic outcomes of chemical perturbations. Conditioned on in silico bioactivity profiles, pDIFF generates high-content images displaying cellular responses to compound treatments. The model was trained and evaluated using Cell Painting assay images from 3750 molecules, comprising 3375 training compounds and 375 held-out compounds, each with corresponding in silico bioactivity profiles. Evaluation on the held-out set demonstrated that pDIFF produced improved visual depictions of phenotypic responses for compounds structurally dissimilar to training data, outperforming a baseline model. In a virtual hit expansion scenario, pDIFF significantly improved nearest-neighbor retrieval accuracy compared to expansions based on structural representations, bioactivity profiles, or generative models using only substructural molecular descriptors, indicating its potential to accelerate the discovery of novel phenotypically active molecules.
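To make the nearest-neighbor retrieval metric concrete, here is a minimal sketch of how a top-1 retrieval accuracy could be computed. The summary does not state which feature space, distance metric, or labels the authors used, so the cosine similarity, the label names, and the toy vectors below are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top1_retrieval_accuracy(queries, references):
    """Fraction of queries whose nearest reference carries the same label.

    queries, references: lists of (label, feature_vector) pairs.
    """
    hits = 0
    for q_label, q_vec in queries:
        # Nearest neighbor = reference with the highest cosine similarity.
        best_label, _ = max(
            references, key=lambda ref: cosine_similarity(q_vec, ref[1])
        )
        hits += int(best_label == q_label)
    return hits / len(queries)


# Toy data, invented for illustration only: queries stand in for features of
# generated images, references for features of real assay images.
references = [("active", [1.0, 0.1]), ("inactive", [0.1, 1.0])]
queries = [
    ("active", [0.9, 0.2]),    # nearest reference is "active": hit
    ("inactive", [0.2, 0.8]),  # nearest reference is "inactive": hit
    ("active", [0.1, 0.9]),    # nearest reference is "inactive": miss
]
accuracy = top1_retrieval_accuracy(queries, references)
```

In this toy setup two of the three queries retrieve a reference with the matching label. A comparison like the one described in the summary would repeat the same calculation with different query representations (structure-based, profile-based, or generated-image-based) and compare the resulting accuracies.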

Key takeaway

For AI Scientists and Machine Learning Engineers working in drug discovery, pDIFF offers a novel approach to accelerate the identification of new phenotypically active molecules. By leveraging in silico bioactivity profiles to generate high-content images, your team can achieve more accurate virtual hit expansion, particularly for compounds structurally distinct from known training data. Consider integrating pDIFF into your early-stage drug discovery workflows to enhance screening efficiency and broaden chemical space exploration.

Key insights

pDIFF uses bioactivity profiles to generate high-content images, improving virtual hit expansion for drug discovery.

Principles

Method

pDIFF is a profile-to-image latent diffusion model conditioned on in silico bioactivity profiles, trained on Cell Painting assay images to generate cellular phenotypic outcomes.
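As a rough illustration of what "conditioned on in silico bioactivity profiles" can mean in a latent diffusion setting, the sketch below follows the standard noise-prediction training objective: a latent is noised at a random timestep, and a denoiser that also receives the profile is scored on how well it predicts that noise. This is not the authors' implementation. The linear noise schedule, the latent and profile sizes, and `toy_denoiser` are hypothetical placeholders; a real model would use a trained neural network and an image autoencoder.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common default, assumed here)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(latent, alpha_bar, rng):
    """Forward process: z_t = sqrt(alpha_bar) * z_0 + sqrt(1 - alpha_bar) * eps."""
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [
        math.sqrt(alpha_bar) * z + math.sqrt(1.0 - alpha_bar) * e
        for z, e in zip(latent, noise)
    ]
    return noisy, noise


def toy_denoiser(noisy_latent, t, profile):
    """Hypothetical stand-in for a conditional network eps_theta(z_t, t, c).

    It is untrained and ignores t; it only shows that the profile enters
    the prediction alongside the noisy latent.
    """
    profile_mean = sum(profile) / len(profile)
    return [0.5 * z + 0.1 * profile_mean for z in noisy_latent]


def denoising_loss(latent, profile, alpha_bars, denoiser, rng):
    """Mean squared error between the true and the predicted noise."""
    t = rng.randrange(len(alpha_bars))  # random timestep
    noisy, noise = add_noise(latent, alpha_bars[t], rng)
    predicted = denoiser(noisy, t, profile)
    return sum((p - e) ** 2 for p, e in zip(predicted, noise)) / len(noise)


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
latent = [0.5, -0.2, 0.1, 0.3]  # toy image latent (assumed 4-dimensional)
profile = [0.9, 0.0, 0.4]       # toy bioactivity profile (assumed 3 targets)
loss = denoising_loss(latent, profile, alpha_bars, toy_denoiser, rng)
```

At generation time, a model trained this way would start from pure noise and denoise step by step while being fed the profile of the query compound, so that the decoded image reflects the predicted bioactivity rather than the chemical structure alone.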

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine learning : nature.com subject feeds.