Google researchers are using artificial intelligence to mine decades of news reports about flooding, turning the stories into structured data that can help predict flash floods in areas with few historical records.
The project relies on Google’s large language models to scan millions of archived news stories and extract records that can be used to train forecasting systems.
AI analyzes millions of news articles about floods
According to TechCrunch, Google researchers used Gemini, the company’s large language model, to analyze 5 million news articles from around the world, identifying reports of 2.6 million different floods and converting them into a dataset called “Groundsource.”
The Groundsource dataset organizes the extracted information into fields such as location, timing, and event descriptions. Converting news stories into standardized records lets researchers build training data for machine-learning models that predict flood risks.
This method helps AI systems find patterns in past flood events that would otherwise stay hidden in scattered newspaper and online reports.
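To make the idea of a standardized record concrete, here is a minimal Python sketch. The field names, the `FloodRecord` class, and the `to_record` helper are illustrative assumptions and not the actual Groundsource schema, which the reports do not describe in detail. The input dictionary stands in for whatever a language model might extract from one article; the extraction step itself is not shown.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FloodRecord:
    """One flood report in standardized form (hypothetical schema)."""
    location: str
    event_date: date
    description: str


def to_record(raw: dict) -> Optional[FloodRecord]:
    """Convert one loosely structured extraction result into a FloodRecord.

    Returns None when a required field is missing, has the wrong type,
    or the date is not a valid ISO date (YYYY-MM-DD).
    """
    try:
        location = raw["location"].strip()
        event_date = date.fromisoformat(raw["date"])
        description = raw["description"].strip()
    except (KeyError, ValueError, TypeError, AttributeError):
        return None
    if not location:
        return None
    return FloodRecord(location, event_date, description)


# Example input: a made-up extraction result, not real data.
raw = {
    "location": " Example City ",
    "date": "1998-07-14",
    "description": "Streets flooded after heavy rain.",
}
print(to_record(raw))
```

Rejecting incomplete or malformed entries at this stage is one simple way to keep downstream training data consistent; a real pipeline would likely need more, such as geocoding place names and merging duplicate reports of the same flood.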
Filling gaps in global disaster data
Flood prediction models typically depend on environmental sensors, river gauges, rainfall monitoring stations, and satellite imagery. However, those datasets are often incomplete, particularly in developing countries where historical monitoring infrastructure was limited or nonexistent.
By using news stories as a data source, researchers can rebuild flood histories that were never officially recorded in scientific databases.
TechBuzz reported that the project uses LLMs to convert historical news reports into quantitative data for flash flood prediction systems, helping address critical data gaps in disaster forecasting.
Through natural language processing, the system interprets narrative descriptions of flooding in old newspapers and converts them into structured variables that machine-learning models can process.
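As a rough illustration of what a structured variable can look like, the Python sketch below counts reported floods per location and month. The function name, the input shape, and the sample values are assumptions made for the example; the reports do not say which variables Google's models actually use.

```python
from collections import Counter
from datetime import date


def monthly_flood_counts(records):
    """Count reported floods per (location, year, month).

    `records` is an iterable of (location, event_date) pairs.
    """
    counts = Counter()
    for location, event_date in records:
        counts[(location, event_date.year, event_date.month)] += 1
    return counts


# Made-up sample records, not real data.
sample = [
    ("Example City", date(1998, 7, 14)),
    ("Example City", date(1998, 7, 29)),
    ("Example Town", date(2003, 3, 2)),
]
counts = monthly_flood_counts(sample)
print(counts[("Example City", 1998, 7)])
```

A table of counts like this is something a forecasting model can take as numeric input, which a free-text newspaper account is not.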
Flood Hub platform expands early-warning capabilities
Google is already applying the dataset to improve its existing flood forecasting services. Its flash-flood prediction model now highlights risks for urban areas in 150 countries on the Flood Hub platform.
Flood Hub is a public platform that provides flood forecasts and early warnings to communities, emergency responders, and government agencies.
The system brings together weather forecasts, hydrological models, and AI analysis to estimate flood risks. This helps authorities get ready for evacuations or disaster response.
Early tests show the system can help emergency managers respond more quickly when floods happen.
Experts say data scarcity remains a major challenge
Environmental researchers say the approach offers a creative solution to one of the biggest obstacles in climate modeling: the lack of historical data.
Marshall Moutenot, CEO of the environmental analytics firm Upstream Tech, emphasized the scale of the problem, noting that “data scarcity is one of the most difficult challenges in geophysics.”
Machine-learning systems need large datasets to train reliable prediction models, but many environmental disasters, especially flash floods, are not well documented in scientific records.
By turning old news stories into structured data, AI systems can use decades of information that would otherwise be unusable.
Expanding AI tools for disaster forecasting
Researchers say the same approach could be applied to other environmental hazards that lack consistent datasets.
The technique could also help analyze old records of heat waves, mudslides, or landslides, letting AI models find patterns and improve risk forecasting.
As climate change makes extreme weather events more frequent, researchers are looking for new ways to rebuild environmental histories and improve early warning systems.
Using artificial intelligence to study decades of journalism shows that news archives, which have long helped document disasters, may also become a valuable resource for predicting them.