I got frustrated teaching ML to scientists, so I started building domain-specific workshops – would love your thoughts

· Source: Machine Learning ML & Generative AI News · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Life Sciences & Biology, Engineering & Applied Sciences · Depth: Intermediate, quick

Summary

An organizer of AI workshops for biotech and nanotechnology researchers describes a significant gap between standard machine learning education and the practical needs of scientific research. Scientists grasp core ML concepts such as gradient descent and cross-validation, but they struggle with real-world problems: predicting nanoparticle formulations when each experiment costs $800, working with datasets of only 47 data points from mass spectrometers, and quantifying prediction certainty for reviewers. The core issue is that data collection in scientific research is expensive and slow and uncertainty is critical, whereas standard ML teaching assumes abundant, cheap data and focuses on accuracy. The organizer currently runs 2-3 day intensive workshops that cover standard ML techniques (CNNs, ensemble methods, PyTorch) framed around specific research scenarios, such as drug screening with 50 compounds or materials property prediction with limited synthesis data, and is questioning whether this approach is sufficient.

Key takeaway

If you are an AI scientist designing educational programs for domain experts, recognize that standard ML curricula often overlook the realities of scientific data scarcity and the critical need for uncertainty quantification. Workshops should prioritize specialized techniques such as Bayesian methods, physics-informed neural networks, and active learning, or at least integrate small-data strategies that go beyond basic transfer learning, so that researchers are equipped for the challenges they actually face.
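To make the Bayesian and active-learning ideas concrete, here is a minimal sketch, not taken from the original post. It assumes a synthetic one-dimensional dataset of 47 points, a hypothetical quadratic feature map, and a known noise variance. It fits Bayesian linear regression in closed form, reports a predictive standard deviation for each candidate experiment, and picks the most uncertain candidate as the next experiment to run.

```python
import numpy as np

def fit_bayesian_linear(X, y, alpha=1.0, noise_var=0.25):
    """Closed-form weight posterior for a N(0, I/alpha) prior and known Gaussian noise."""
    d = X.shape[1]
    precision = alpha * np.eye(d) + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def predict(X_new, mean, cov, noise_var=0.25):
    """Predictive mean and standard deviation (noise plus weight uncertainty)."""
    mu = X_new @ mean
    var = noise_var + np.einsum("ij,jk,ik->i", X_new, cov, X_new)
    return mu, np.sqrt(var)

def features(x):
    # Hypothetical featurisation for illustration: bias, x, x^2.
    return np.column_stack([np.ones_like(x), x, x**2])

# Synthetic stand-in for a small experimental dataset (assumed, not real data).
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=47)
y_train = 1.0 + 2.0 * x_train - 1.5 * x_train**2 + rng.normal(0.0, 0.5, size=47)

mean, cov = fit_bayesian_linear(features(x_train), y_train)

# Candidate experiments, including conditions outside the measured range.
x_cand = np.linspace(0.0, 2.0, 21)
mu, sd = predict(features(x_cand), mean, cov)

# Simple active-learning rule: run the experiment the model is least sure about.
next_x = x_cand[int(np.argmax(sd))]
print(f"next experiment at x = {next_x:.2f}")
```

When each measurement is expensive, the interval `mu ± 2 * sd` is arguably more useful to a reviewer than a single accuracy figure, and the acquisition rule spends the next experiment where the model knows least. Real projects would likely swap the linear model for a Gaussian process or similar, which is beyond this sketch.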

Key insights

Standard ML education often fails to address the unique data constraints and uncertainty requirements of scientific research.

Principles

Method

Workshops frame standard ML techniques (CNNs, ensemble methods, transfer learning) within specific scientific research scenarios like drug screening with limited compounds or materials property prediction.
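As one illustration of framing an ensemble method around a small-data scenario, the sketch below uses a bootstrap ensemble of ridge regressions on a synthetic 50-compound dataset. The descriptors, target values, and function names are assumptions made for the example, not material from the workshops. The spread of the ensemble's predictions gives a rough indication of how much the model's answer depends on which compounds happened to be measured.

```python
import numpy as np

def fit_ridge(X, y, lam=1.0):
    """Ridge regression weights via the regularised normal equations."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def bootstrap_ensemble(X, y, n_models=200, lam=1.0, seed=0):
    """Fit one ridge model per bootstrap resample of the rows."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    weights = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        weights.append(fit_ridge(X[idx], y[idx], lam))
    return np.array(weights)

# Synthetic stand-in for 50 screened compounds with 4 descriptors each (assumed).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
true_w = np.array([1.0, -0.5, 0.0, 2.0])
y = X @ true_w + rng.normal(0.0, 0.3, size=50)

W = bootstrap_ensemble(X, y)

# Predict activity for 5 untested compounds with every ensemble member.
X_new = rng.normal(size=(5, 4))
preds = X_new @ W.T  # shape: (5 compounds, 200 models)
pred_mean = preds.mean(axis=1)
lo, hi = np.percentile(preds, [2.5, 97.5], axis=1)

for m, a, b in zip(pred_mean, lo, hi):
    print(f"predicted {m:+.2f}  (95% bootstrap range {a:+.2f} to {b:+.2f})")
```

The bootstrap range reflects variability from resampling the training set only; it does not include measurement noise, so it should be read as a lower bound on the true uncertainty.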

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Machine Learning ML & Generative AI News.