Planet-led RapidAI4EO Consortium Releases One of the Largest Earth Observation Training Datasets for Machine Learning Applications

First published at https://www.planet.com

In January 2021, Planet set out to lead the RapidAI4EO consortium to advance state-of-the-art, continuous land monitoring applications throughout Europe. The initiative was awarded a competitive grant under the Horizon 2020 program to develop improved AI processes and provide critical training data for higher-frequency updates of land use and land cover maps. As an output of this program, we are proud to share today the release of one of the largest training datasets of satellite imagery to date, both temporally and spatially, suited for diverse research applications in the machine learning domain. This dataset is accessible to the entire remote sensing community on Source Cooperative, Radiant Earth’s new cloud-based neutral data publishing utility (license terms apply).

Covering 500,000 patch locations across Europe at a five-day cadence over two years, this dataset accounts for country representation and spatial distribution. The EO data comes from two sources: Planet partner Vision Impulse, which created monthly cloud-free Sentinel-2 image mosaics at 10-meter resolution, and Planet Fusion Monitoring, which provides a 3-meter image every five days. Our Fusion Monitoring product combines multiple sensor data types, refined into a single uninterrupted data stream. Consisting of our high-frequency, daily satellite data fused with publicly sourced satellite data, Fusion offers gap-free insights, ideal for time-series analysis.
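To give a rough sense of the scale those figures imply, the short Python sketch below works out how many time steps each location would have and how many patch observations that adds up to. It is a back-of-the-envelope estimate only: the 500,000 locations and the five-day cadence come from the text above, while treating "two years" as 730 days is our assumption, and the published dataset may differ.

```python
# Back-of-the-envelope estimate of the time series size described above.
PATCH_LOCATIONS = 500_000  # from the text
CADENCE_DAYS = 5           # from the text: one Fusion image every five days
DAYS_COVERED = 2 * 365     # ASSUMPTION: "two years" treated as 730 days


def time_steps(days_covered: int, cadence_days: int) -> int:
    """Number of observations per location at a fixed cadence."""
    return days_covered // cadence_days


steps_per_patch = time_steps(DAYS_COVERED, CADENCE_DAYS)
total_patch_observations = PATCH_LOCATIONS * steps_per_patch

print(f"{steps_per_patch} time steps per location")
print(f"{total_patch_observations:,} patch observations in total")
```

Under these assumptions, each location carries on the order of 146 time steps, which is what makes the series dense enough for time-series analysis.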

“Thanks to programs like Copernicus and Horizon, Europe already has a world class downstream EO services industry. We believe that the launch of this powerful dataset can support the EU’s progress towards tackling climate change, advancing the UN SDGs, and driving further growth for the European EO ecosystem,” said Massimiliano Vitale, Planet’s Senior Vice President of Operations EMEA.

Training models to identify changes in landscape types such as crops, forests, and urban regions requires an abundance of time series data. Some land cover changes, such as seasonal crop patterns, can only be identified by observing how a location evolves over time. While European land cover datasets have existed for some time, this high-cadence time series at all locations is a key innovation, enabling the European region to classify and evaluate its changing landscape with more insight than ever before. Although this dataset is designed for analysis of land use and land cover change, the insights can be generalized to a number of research initiatives that would benefit from dense time series, such as agricultural monitoring.

“We are proud to host one of the largest open Earth observation training datasets to date, thanks to the RapidAI4EO consortium led by Planet,” said Jed Sundwall, Executive Director at Radiant Earth. “The ambitious scale of this project helped us accelerate the development of Source Cooperative, our new data publishing utility. Planet has set a new standard for open Earth observation training datasets and we expect this dataset to enable reproducible scientific research for years to come.”

The training data has already enabled the creation of AI-powered change detection models to derive heat maps of change, helping to prioritize areas for map updates. With this high-cadence time series, we believe the data can open the door to a new family of high-fidelity machine learning models that can disentangle phenology from structural change and learn the dynamism of land covers. The release of this novel training dataset is an exciting step forward for understanding European land use, for research purposes among others, and we are eager to see the many ways it will benefit the region.