Research News

New Foundation Model Improves Accuracy for Remote Sensing Image Interpretation

September 04, 2022

A new foundation model dubbed RingMo has been developed to improve accuracy for remote sensing image interpretation, according to the Aerospace Information Research Institute (AIR), Chinese Academy of Sciences (CAS).

The study, "RingMo: A Remote Sensing Foundation Model with Masked Image Modeling," was published in IEEE Transactions on Geoscience and Remote Sensing, 2022, doi: 10.1109/TGRS.2022.3194732.

Remote sensing (RS) images have been successfully applied to many tasks, such as classification and change detection, and deep learning approaches have contributed to the rapid development of RS image interpretation. The most widely used training paradigm is to use ImageNet pre-trained models to process RS data for specified tasks. However, this paradigm suffers from the domain gap between natural and RS scenes and from the poor generalization capacity of RS models. It therefore makes sense to develop a foundation model with a general RS feature representation. Because a large amount of unlabeled data is available, self-supervised methods hold more promise than fully supervised methods in remote sensing.

The study proposes a remote sensing foundation model framework that leverages the benefits of generative self-supervised learning for RS images. RingMo features a large-scale dataset constructed by collecting two million RS images from satellite and aerial platforms, covering multiple scenes and objects around the world. In addition, an RS foundation model training method is designed for dense and small objects in complicated RS scenes.
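
As a rough illustration of the masked image modeling named in the study's title, the sketch below hides a random subset of image patches and scores a reconstruction only on the hidden pixels. It is a generic, simplified example and not RingMo's actual masking strategy or training code; the function names (random_patch_mask, apply_mask, masked_l1_loss), the 50% mask ratio, the patch size, and the L1 loss are all assumptions chosen for illustration.

```python
import random


def random_patch_mask(grid_h, grid_w, mask_ratio, seed=None):
    """Return a grid_h x grid_w grid of booleans; True marks a hidden patch."""
    rng = random.Random(seed)
    n_patches = grid_h * grid_w
    n_masked = int(round(n_patches * mask_ratio))
    masked = set(rng.sample(range(n_patches), n_masked))
    return [[(r * grid_w + c) in masked for c in range(grid_w)]
            for r in range(grid_h)]


def apply_mask(image, mask, patch, fill=0.0):
    """Copy the image, replacing every pixel of a hidden patch with `fill`."""
    out = [row[:] for row in image]
    for y in range(len(image)):
        for x in range(len(image[0])):
            if mask[y // patch][x // patch]:
                out[y][x] = fill
    return out


def masked_l1_loss(pred, target, mask, patch):
    """Mean absolute error computed over hidden pixels only."""
    total, count = 0.0, 0
    for y in range(len(target)):
        for x in range(len(target[0])):
            if mask[y // patch][x // patch]:
                total += abs(pred[y][x] - target[y][x])
                count += 1
    return total / count if count else 0.0


# Hypothetical 8x8 single-band image split into 2x2-pixel patches (4x4 grid).
image = [[1.0] * 8 for _ in range(8)]
mask = random_patch_mask(4, 4, mask_ratio=0.5, seed=0)
corrupted = apply_mask(image, mask, patch=2)
# A real model would predict the hidden pixels from `corrupted`;
# here the corrupted image stands in for an untrained prediction.
loss = masked_l1_loss(corrupted, image, mask, patch=2)
```

In a setup like this, the loss is computed without labels: the image itself supplies the reconstruction target, which is why large unlabeled RS archives can be used for pre-training.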

RingMo is the first generative foundation model for cross-modal remote sensing data. It achieves state-of-the-art results on eight datasets across four downstream tasks, demonstrating the effectiveness of the proposed framework. In the future, the model can be applied to 3D reconstruction, residential construction, transportation, water conservancy, environmental protection and other fields.

Fig. 1. Flowchart for training the remote sensing base model RingMo and fine-tuning downstream interpretation tasks. (Image by AIR)