Unveiling the GaoFen-7 Building Dataset: A New Horizon in Satellite-Based Urban and Rural Building Extraction

In urban development and environmental studies, accurate and efficient extraction of building data from satellite imagery underpins a wide range of applications. The task remains difficult because built landscapes are intricate and highly variable, especially across China’s diverse urban and rural areas. Traditional methods such as pixel-based classification struggled in complex environments, leading researchers to turn to convolutional neural networks (CNNs) and other deep learning approaches. These methods, however, share a common Achilles’ heel: they need extensive, high-quality training data that reflects real-world diversity.

A study by researchers from Sun Yat-Sen University, Southern Marine Science and Engineering Guangdong Laboratory, International Research Center of Big Data for Sustainable Development Goals, Peng Cheng Laboratory, The Key Laboratory of Natural Resources Monitoring in Tropical and Subtropical Area of South China, Remote Sensing Application Center, Ministry of Housing and Urban-Rural Development of the People’s Republic of China, and China Academy of Urban Planning and Design introduces a dataset built from GaoFen-7 (GF-7) satellite images to support building extraction across China. The GF-7 Building dataset comprises 5,175 pairs of high-resolution image tiles covering 573.17 km². It contains 170,015 buildings, of which 84.8% are urban and 15.2% are rural, so both settings are represented, a coverage that has been notably absent from previous datasets.
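To put these headline figures in perspective, the short sketch below derives a few summary numbers from them. Only the four input values come from the article; the function name and the derived quantities are illustrative, and because the published percentages are rounded, the urban and rural counts are approximations rather than figures reported by the authors.

```python
# Figures quoted in the article.
TILE_PAIRS = 5_175      # pairs of image tiles
AREA_KM2 = 573.17       # total coverage in square kilometres
BUILDINGS = 170_015     # labelled buildings
URBAN_SHARE = 0.848     # share of buildings in urban areas (rounded in the source)


def derived_stats():
    """Return rough summary statistics derived from the quoted figures.

    The urban/rural counts are approximate because the shares are rounded.
    """
    urban = round(BUILDINGS * URBAN_SHARE)
    rural = BUILDINGS - urban
    return {
        "urban_buildings_approx": urban,
        "rural_buildings_approx": rural,
        "buildings_per_km2": BUILDINGS / AREA_KM2,
        "area_per_tile_pair_km2": AREA_KM2 / TILE_PAIRS,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:,.3f}" if isinstance(value, float) else f"{name}: {value:,}")
```

On these numbers the dataset averages roughly 300 buildings per square kilometre, a reminder of how densely built the sampled areas are.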

The creation of the GF-7 Building dataset was driven by a meticulous process involving manual digitization and the innovative use of cadastral building data to ensure unparalleled accuracy and detail in building labels. This process was crucial for capturing the intricate geometries of buildings, a task that simplified boundary representations from sources like OpenStreetMap could not accomplish. The dataset’s high resolution and the inclusion of both urban and rural environments make it a comprehensive tool for model training and evaluation, setting a new standard in the field.

To validate the dataset, the research team evaluated seven state-of-the-art CNN models designed for semantic segmentation. The models, ranging from Fully Convolutional Network (FCN) 8S to High-Resolution Network (HRNet), take different approaches to semantic segmentation, each bringing its own strengths to building extraction. All seven achieved an overall accuracy exceeding 93%, with HRNet and Attention UNet performing especially well, which underscores the dataset’s robustness and versatility.
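For readers unfamiliar with the metric, overall accuracy is the fraction of pixels whose predicted class (building or background) matches the label. The sketch below computes it for a toy binary mask, together with intersection over union (IoU) for the building class. IoU is included here as an assumption, since it is commonly reported alongside overall accuracy for segmentation; the article itself only quotes overall accuracy. The function names and the toy masks are illustrative and do not come from the study.

```python
def confusion_counts(pred, truth):
    """Count true/false positives/negatives for two binary masks (lists of rows)."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def overall_accuracy(tp, fp, fn, tn):
    """Fraction of all pixels classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def building_iou(tp, fp, fn):
    """Intersection over union for the building (positive) class."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


# Toy 4x4 masks: 1 = building pixel, 0 = background (hypothetical data).
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    tp, fp, fn, tn = confusion_counts(pred, truth)
    print("overall accuracy:", overall_accuracy(tp, fp, fn, tn))
    print("building IoU:", building_iou(tp, fp, fn))
```

In this toy case overall accuracy is noticeably higher than building IoU, because correctly labelled background pixels count toward accuracy. This is why accuracy figures are usually read alongside class-specific metrics when buildings occupy a small share of each tile.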

The introduction of the GF-7 Building dataset is timely, aligning with the increasing reliance on deep learning in building extraction tasks. Its high resolution, quality, and diverse representation of China’s built environment significantly enhance the development of algorithms for building detection. This dataset serves as a new benchmark for model performance evaluation and a foundational resource that will drive future innovations in urban planning and environmental studies.

In conclusion, the GF-7 Building dataset is a substantial contribution to remote sensing and urban planning. It addresses a significant gap by providing a high-quality, diverse dataset for building extraction in China, which is crucial for advancing deep learning models in this domain. The meticulous creation process, combined with the dataset’s comprehensive coverage and the successful validation experiments, highlights its potential to improve building extraction methodologies. As urban landscapes continue to evolve, the GF-7 Building dataset is well placed to play a pivotal role in urban analysis and development, offering researchers and practitioners a powerful tool to navigate the complexities of the built environment.

