AI Trained on Birdsong Can Recognize Whale Calls

2026-03-17 · Source: IEEE Spectrum · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Life Sciences & Biology · Depth: Advanced, quick

Summary

Google DeepMind's Perch 2.0, an AI audio model trained primarily on millions of recordings of land-based animals, including birds, amphibians, insects, and mammals, has shown unexpectedly strong performance in classifying whale vocalizations. The result is attributed to transfer learning: the model applies what it learned from avian calls to cetacean sounds, which reduces computation time and experimentation effort. Researchers evaluated Perch 2.0 on marine audio datasets by converting the sounds into spectrograms, generating embeddings, and training a logistic regression classifier on those embeddings. The results, presented at a NeurIPS workshop, showed good performance even with limited data. The model's effectiveness is thought to arise from evolutionary parallels in vocal production, the "laws of scale" for large foundation models, and its ability to recognize fine-grained acoustic characteristics across diverse soundscapes. The model offers a powerful tool for passive acoustic monitoring and whale conservation.
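The evaluation recipe described above (spectrogram, then embedding, then a logistic regression classifier fit on a handful of examples) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the clips are synthetic noisy tones rather than real recordings, `embed` is a hypothetical stand-in (a time-averaged log spectrum) and not the Perch 2.0 model or its API, and the classifier is a small hand-written gradient-descent logistic regression rather than whatever implementation the researchers used.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed sample rate for the synthetic clips


def synth_call(rng, freq_hz, seconds=0.5):
    """Noisy sine tone standing in for a recorded vocalization."""
    t = np.arange(int(SAMPLE_RATE * seconds)) / SAMPLE_RATE
    jitter = rng.uniform(-30.0, 30.0)  # small per-clip pitch variation
    tone = np.sin(2 * np.pi * (freq_hz + jitter) * t)
    return tone + 0.5 * rng.normal(size=t.size)


def log_spectrogram(audio, frame=512, hop=256):
    """Log-magnitude short-time Fourier transform, shape (frames, frame // 2 + 1)."""
    window = np.hanning(frame)
    starts = range(0, len(audio) - frame + 1, hop)
    frames = np.stack([audio[s:s + frame] * window for s in starts])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))


def embed(audio):
    """Hypothetical stand-in for a frozen embedding model: time-averaged spectrum."""
    return log_spectrogram(audio).mean(axis=0)


def make_dataset(rng, clips_per_class):
    """Embeddings for two synthetic call types (class 0: ~300 Hz, class 1: ~1200 Hz)."""
    freqs = (300.0, 1200.0)
    X = np.stack([embed(synth_call(rng, f)) for f in freqs for _ in range(clips_per_class)])
    y = np.repeat([0.0, 1.0], clips_per_class)
    return X, y


def fit_logistic_regression(X, y, lr=0.1, steps=500, l2=1e-3):
    """Binary logistic regression fit by plain gradient descent with L2 regularization."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of class 1
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b


def few_shot_accuracy(shots=4, test_clips=50, seed=0):
    """Train on `shots` embeddings per class, then score on held-out synthetic clips."""
    rng = np.random.default_rng(seed)
    X_train, y_train = make_dataset(rng, shots)
    X_test, y_test = make_dataset(rng, test_clips)
    mean = X_train.mean(axis=0)  # center using training statistics only
    w, b = fit_logistic_regression(X_train - mean, y_train)
    pred = ((X_test - mean) @ w + b) > 0
    return float(np.mean(pred == (y_test == 1.0)))


if __name__ == "__main__":
    for shots in (4, 8, 16, 32):  # illustrative training-set sizes
        print(f"{shots} examples per class: accuracy {few_shot_accuracy(shots=shots):.2f}")
```

The point of the design is that the embedding function stays fixed while only the small linear classifier is trained, which is why so few labeled examples can suffice when the embeddings already separate the classes. On this toy data the separation is deliberately easy, so the numbers say nothing about real marine recordings.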

Key takeaway

Google DeepMind's Perch 2.0, an AI audio model trained on land-animal bioacoustics, shows strong transfer learning when classifying whale vocalizations. On marine datasets, a logistic regression classifier trained on only 4 to 32 embeddings performs robustly, often matching or outperforming specialized models. This significantly reduces computation and experimentation effort, enabling scalable bioacoustic monitoring for marine conservation and the discovery of new underwater sounds.

Topics

Bioacoustics
Transfer Learning
Perch 2.0
Whale Vocalization
Foundation Models

Best for: Research Scientist, AI Researcher, AI Scientist, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by IEEE Spectrum.