Multi-scale Feature Extraction and Fusion Net: Research on UAV Image Semantic Segmentation Technology
Keywords: semantic segmentation, drone image, deep learning, multi-scale feature extraction, contextual information
UAV aerial images are usually captured at high altitude and at oblique viewing angles, so the data volume is large and the spatial resolution varies greatly; as a result, information about small targets is easily lost during segmentation. To address these problems, this paper presents a semantic segmentation method for UAV images that introduces a multi-scale feature extraction and fusion module into an encoder-decoder framework. By combining multi-scale channel feature extraction with multi-scale spatial feature extraction, the network can focus more on certain feature layers and spatial regions when extracting features. Global context information is introduced to capture both global and detailed information, which eliminates some invalid, redundant features and improves the segmentation results. The proposed method is compared with the FCN-8s, MSDNet, and U-Net network models on UAVid, a large-scale multi-class UAV dataset. The experimental results indicate that the proposed method achieves higher MIoU and MPA, with overall improvements of 9.2% and 8.5%, respectively, and that its predictions are more balanced across large-scale and small-scale targets.
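The two reported metrics can be illustrated with a minimal sketch. It assumes the standard definitions, which this section does not spell out: MPA is the per-class pixel accuracy averaged over classes, and MIoU is the per-class intersection over union averaged over classes, both computed from a confusion matrix. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: cm[t][p] = pixels of true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_pixel_accuracy(cm):
    """MPA (assumed definition): mean over classes of TP_i / ground-truth pixels of class i."""
    accs = []
    for i, row in enumerate(cm):
        gt = sum(row)
        if gt > 0:  # skip classes absent from the ground truth
            accs.append(row[i] / gt)
    return sum(accs) / len(accs) if accs else 0.0


def mean_iou(cm):
    """MIoU (assumed definition): mean over classes of TP_i / (GT_i + Pred_i - TP_i)."""
    n = len(cm)
    ious = []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                          # pixels labelled class i
        pred = sum(cm[r][i] for r in range(n))   # pixels predicted as class i
        union = gt + pred - tp
        if union > 0:  # skip classes that never occur
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened label maps with three classes (illustrative only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print("MPA :", mean_pixel_accuracy(cm))
print("MIoU:", mean_iou(cm))
```

Because both metrics average over classes rather than over pixels, a small-target class that occupies few pixels weighs as much as a large one, which is why they are sensitive to the balance between large-scale and small-scale targets.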
Shelhamer E, Long J, Darrell T. Fully Convolutional Networks for Semantic Segmentation [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39(4): 640–651.
Duan Lijuan, Sun Qichao, Qiao Yuanhua, et al. Semantic Segmentation Algorithm of RGB-D Indoor Image Based on Attention Perception and Semantic Perception [J]. Chinese Journal of Computers, 2021.
Badrinarayanan V, Kendall A, Cipolla R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(12): 2481–2495.
Zhao H S, Shi J P, Qi X, et al. Pyramid Scene Parsing Network [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hawaii, USA, 2017: 6230–6239.
Chen L C, Papandreou G, Kokkinos I, et al. Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs [J]. Computer Science, 2014(4): 357–361.
Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834–848.
Chen L C, Papandreou G, Schroff F, et al. Rethinking Atrous Convolution for Semantic Image Segmentation [J]. arXiv preprint arXiv:1706.05587, 2017.
Chen L C, Zhu Y, Papandreou G, et al. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [C]. Proceedings of the European Conference on Computer Vision (ECCV), 2018: 801–818.
Ronneberger O, Fischer P, Brox T. U-Net: Convolutional Networks for Biomedical Image Segmentation [C]. International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich, Germany, 2015: 234–241.
Jegou S, Drozdzal M, Vazquez D, et al. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation [J]. arXiv preprint arXiv:1611.09326, 2016.
Lin G S, Milan A, Shen C H, Reid I. RefineNet: Multi-path Refinement Networks for High-resolution Semantic Segmentation [C]. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hawaii, USA, 2017: 5168–5177.
Wang Yanran, Chen Qingliang, Wu Junjun. A Review of Image Semantic Segmentation Methods for Complex Environments [J]. Computer Science, 2019, 46(9): 36–46.
Ye L, Vosselman G, Xia G, et al. UAVid: A Semantic Segmentation Dataset for UAV Imagery [J]. 2018.