Enhanced Feature Pyramid Network with Deep Semantic Embedding for Remote Sensing Scene Classification
Journal contribution posted on 07.01.2021, 16:36 by X Wang, S Wang, C Ning, Huiyu Zhou
Recent progress in remote sensing scene classification has been substantial, driven mostly by the rapid development of convolutional neural networks (CNNs). However, unlike natural images, in which objects occupy most of the frame, remote sensing images usually contain small, scattered objects. Vanilla CNNs, which extract global image-level features and ignore local object-level features, therefore leave considerable room for improvement in remote sensing scene classification. In this paper, we propose a novel remote sensing scene classification method based on an enhanced feature pyramid network with deep semantic embedding. First, the proposed framework extracts multi-scale, multi-level features using an enhanced feature pyramid network (EFPN). Second, to leverage the complementary advantages of the multi-level and multi-scale features, we design a deep semantic embedding (DSE) module to generate discriminative features. Third, a feature fusion module, called two-branch deep feature fusion (TDFF), is introduced to aggregate the features at different levels effectively. Our method produces state-of-the-art results on two widely used remote sensing scene classification benchmarks, achieving higher accuracy than existing algorithms. In addition, we conduct a thorough analysis of the role of each module in the proposed architecture, and the experimental results further verify the merits of the proposed method.
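To make the idea of multi-scale, multi-level features concrete, the following is a minimal sketch of a generic feature-pyramid top-down pathway followed by a simple two-branch concatenation. It is not the EFPN, DSE, or TDFF modules, whose details the abstract does not give. All names here (`bottom_up`, `top_down`, `describe`) are hypothetical, average pooling stands in for learned convolutional stages, and a single-channel map replaces real CNN feature tensors.

```python
from typing import List

Map = List[List[float]]  # single-channel feature map (illustrative stand-in)


def downsample2x(x: Map) -> Map:
    """2x2 average pooling; stands in for a strided CNN stage (assumption)."""
    h, w = len(x) // 2, len(x[0]) // 2
    return [[(x[2 * i][2 * j] + x[2 * i][2 * j + 1]
              + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]


def upsample2x(x: Map) -> Map:
    """Nearest-neighbour upsampling by a factor of 2."""
    return [[x[i // 2][j // 2] for j in range(2 * len(x[0]))]
            for i in range(2 * len(x))]


def add(a: Map, b: Map) -> Map:
    """Element-wise sum of two equally sized maps."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def bottom_up(image: Map, levels: int) -> List[Map]:
    """Build maps from fine to coarse; side lengths must be divisible by 2**(levels-1)."""
    feats = [image]
    for _ in range(levels - 1):
        feats.append(downsample2x(feats[-1]))
    return feats


def top_down(feats: List[Map]) -> List[Map]:
    """Generic FPN-style pathway: each level adds the upsampled coarser level."""
    merged = [feats[-1]]
    for lateral in reversed(feats[:-1]):
        merged.append(add(lateral, upsample2x(merged[-1])))
    return merged[::-1]  # returned fine -> coarse


def global_avg(x: Map) -> float:
    """Global average pooling of one map to a scalar."""
    return sum(map(sum, x)) / (len(x) * len(x[0]))


def describe(image: Map, levels: int = 3) -> List[float]:
    """Pool each pyramid level, then concatenate a fine and a coarse branch."""
    pyramid = top_down(bottom_up(image, levels))
    fine_branch = [global_avg(p) for p in pyramid[:-1]]   # object-level detail
    coarse_branch = [global_avg(pyramid[-1])]             # image-level context
    return fine_branch + coarse_branch  # concatenation as the simplest fusion
```

Calling `describe` on a 4x4 map with three levels yields one pooled value per pyramid level. In a real network each level would carry many channels, and the final vector would feed a classifier rather than being read directly.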