Introducing Groundsource: Turning news reports into data with Gemini

Source: The latest research from Google · Field: Technology & Digital: Artificial Intelligence & Machine Learning, Data Science & Analytics, Emerging Technologies & Innovation · Depth: Advanced, medium

Summary

Google Research has introduced Groundsource, a scalable methodology that uses the Gemini large language model to convert unstructured global news reports into structured historical data. The initial open-access Groundsource dataset focuses on urban flash floods and comprises 2.6 million records across more than 150 countries, from 2000 to the present. It addresses the scarcity of high-quality historical data for hydro-meteorological hazards, which traditionally lack standardized global sensor networks. Groundsource processes news in 80 languages, standardizes it to English via the Cloud Translation API, and uses Gemini for classification, temporal reasoning, and spatial precision, achieving 82% practical accuracy in location and timing. The expanded dataset improves the ability to provide near-global urban flash flood forecasts up to 24 hours in advance; these forecasts are now being rolled out in Google's Flood Hub.

Key takeaway

For AI scientists developing predictive models for natural disasters, Groundsource demonstrates a way to overcome historical data scarcity. Consider integrating large language models such as Gemini into your data pipeline to extract structured event data from unstructured news and reports. This can substantially expand training datasets, which in turn can improve model accuracy and enable more timely, localized forecasts for flash floods and potentially for other hazards that lack traditional sensor networks.
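As a rough illustration of what "structured event data" could look like in such a pipeline, the sketch below validates a JSON string returned by a language model before it is admitted to a training set. The `FloodEvent` fields and the JSON keys are assumptions made for this example; the actual Groundsource schema is not described in this summary.

```python
import json
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FloodEvent:
    """Hypothetical record shape; the real Groundsource schema may differ."""
    is_flood: bool
    event_date: Optional[date]
    location_name: Optional[str]


def parse_event(raw_model_output: str) -> Optional[FloodEvent]:
    """Validate a JSON string produced by a language model.

    Returns None for malformed output so that bad rows are dropped
    rather than silently added to the dataset.
    """
    try:
        payload = json.loads(raw_model_output)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or not isinstance(payload.get("is_flood"), bool):
        return None

    raw_date = payload.get("event_date")
    try:
        # Only ISO dates (YYYY-MM-DD) are accepted; anything else is rejected.
        event_date = date.fromisoformat(raw_date) if raw_date else None
    except (TypeError, ValueError):
        return None

    location = payload.get("location_name")
    if location is not None and not isinstance(location, str):
        return None
    return FloodEvent(payload["is_flood"], event_date, location)
```

Rejecting anything that fails validation trades recall for cleanliness, which is usually the safer default when the output feeds a forecasting model.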

Key insights

Groundsource leverages Gemini to transform unstructured global news into structured historical data for disaster forecasting.

Method

Groundsource analyzes news reports in 80 languages, isolates the primary text of each, and translates it to English. Gemini then classifies events, anchors them in time, and maps them to Google Maps Platform polygons.
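The stages above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not Google's implementation: `translate`, `extract`, and `geocode` are hypothetical callables standing in for the Cloud Translation API, Gemini, and Google Maps Platform, and none of their real APIs are called here. The data shapes, including expressing the event time as an offset from the publication date, are also assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


@dataclass(frozen=True)
class Article:
    text: str        # primary text, assumed already isolated from page chrome
    language: str    # language code such as "es"
    published: date


@dataclass(frozen=True)
class Extraction:
    """What the model stage is assumed to return (hypothetical)."""
    is_flood: bool
    days_before_publication: Optional[int]  # 0 = same day, 1 = "yesterday"
    place_name: Optional[str]


@dataclass(frozen=True)
class EventRecord:
    event_date: date
    place_name: str
    polygon_id: str


def build_record(
    article: Article,
    translate: Callable[[str, str], str],       # (text, source_language) -> English
    extract: Callable[[str], Extraction],       # English text -> model output
    geocode: Callable[[str], Optional[str]],    # place name -> polygon id or None
) -> Optional[EventRecord]:
    """Run one article through translation, extraction, and spatial mapping."""
    english = article.text if article.language == "en" else translate(
        article.text, article.language
    )
    result = extract(english)

    # Classification: keep only articles that report an actual flood event
    # with enough detail to place it in time and space.
    if not result.is_flood or result.days_before_publication is None:
        return None
    if not result.place_name:
        return None

    # Temporal anchoring: resolve the relative reference against the
    # publication date to obtain an absolute event date.
    event_date = article.published - timedelta(days=result.days_before_publication)

    # Spatial mapping: resolve the place name to a polygon identifier.
    polygon_id = geocode(result.place_name)
    if polygon_id is None:
        return None
    return EventRecord(event_date, result.place_name, polygon_id)
```

Passing the three services in as callables keeps the sketch testable with stubs and makes it explicit which steps depend on external systems.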

Best for: AI Scientist, Research Scientist, Software Engineer, AI Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by The latest research from Google.