GLVMamba: a global-local visual state space model for remote sensing image segmentation.

Li, Huihui; Pan, Huajian; Liu, Xiaoyong; Ren, Jinchang; Du, Zhiguo; Cao, Jingjing

doi:10.1109/tgrs.2025.3572127

Authors

Huihui Li

Huajian Pan

Xiaoyong Liu

Professor Jinchang Ren j.ren@rgu.ac.uk
Professor of Computing Science

Zhiguo Du

Jingjing Cao

Abstract

Semantic segmentation of remote sensing images has advanced significantly with the adoption of deep neural networks, combining the strength of Convolutional Neural Networks (CNNs) in local feature extraction with that of Transformers in global information modeling. However, because CNNs are limited in long-range modeling and Transformers are constrained by computational complexity, remote sensing semantic segmentation still suffers from serious holes, rough edge segmentation, and false or even missed detections caused by lighting, shadows and other factors. To address these issues, we propose a visual state space model called GLVMamba, which employs CNNs as the encoder and the proposed Global-Local Visual State Space (GLVSS) block as the core of the decoder. Specifically, the GLVSS block introduces locality forward feedback and a shift window mechanism to address Mamba's insufficient modeling of neighboring pixel dependencies, which enhances the integration of global and local context during feature reconstruction, boosts the object perception capabilities of the model, and effectively refines edge contours. Additionally, the scale-aware pyramid pooling (SCPP) module is proposed to merge features from various scales and to adaptively fuse and extract the distinguishing features, mitigating holes and false detections. GLVMamba captures global-local semantic information and multi-scale features through the GLVSS block and the SCPP module, achieving efficient and accurate remote sensing semantic segmentation. Extensive experiments on two widely used datasets demonstrate the superiority of our proposed method over other state-of-the-art methods. The code will be available at https://github.com/Tokisakiwlp/GLVMamba.
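
The abstract does not specify how the SCPP module is built, so the following is only a generic illustration of the idea of pooling a feature map at several scales and fusing the results with adaptive weights. It is not the authors' implementation. The function names, the bin sizes (1, 2, 4), the softmax-weighted fusion and the residual addition are all assumptions made for this sketch, and plain Python lists stand in for tensors so that the example is self-contained.

```python
import math


def avg_pool(grid, bins):
    """Average-pool an H x W grid (list of lists) down to bins x bins."""
    h, w = len(grid), len(grid[0])
    if h % bins or w % bins:
        raise ValueError("H and W must be divisible by bins")
    bh, bw = h // bins, w // bins
    return [
        [
            sum(grid[i * bh + a][j * bw + b] for a in range(bh) for b in range(bw))
            / (bh * bw)
            for j in range(bins)
        ]
        for i in range(bins)
    ]


def upsample_nearest(grid, h, w):
    """Nearest-neighbour upsample a bins x bins grid back to H x W."""
    bins = len(grid)
    bh, bw = h // bins, w // bins
    return [[grid[i // bh][j // bw] for j in range(w)] for i in range(h)]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pyramid_pool(grid, bin_sizes=(1, 2, 4), scale_logits=None):
    """Fuse multi-scale pooled context into the input grid.

    Assumption: one scalar logit per scale plays the role of a learned,
    "scale-aware" weight; with no logits given, all scales count equally.
    """
    h, w = len(grid), len(grid[0])
    logits = list(scale_logits) if scale_logits is not None else [0.0] * len(bin_sizes)
    weights = softmax(logits)
    # One branch per scale: pool to a coarse grid, then restore the resolution.
    branches = [upsample_nearest(avg_pool(grid, b), h, w) for b in bin_sizes]
    # Residual fusion: input plus the weighted sum of the context branches.
    return [
        [
            grid[i][j] + sum(wt * br[i][j] for wt, br in zip(weights, branches))
            for j in range(w)
        ]
        for i in range(h)
    ]
```

In this sketch, coarse bins (such as 1) contribute image-wide context that could help fill holes inside large objects, while fine bins preserve local detail. Raising one scale's logit shifts the fused context toward that scale.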

Citation

LI, H., PAN, H., LIU, X., REN, J., DU, Z. and CAO, J. 2025. GLVMamba: a global-local visual state space model for remote sensing image segmentation. IEEE transactions on geoscience and remote sensing [online], 63, article number 4412115. Available from: https://doi.org/10.1109/TGRS.2025.3572127

Journal Article Type	Article
Acceptance Date	May 23, 2025
Online Publication Date	May 23, 2025
Publication Date	Dec 31, 2025
Deposit Date	May 29, 2025
Publicly Available Date	May 29, 2025
Journal	IEEE transactions on geoscience and remote sensing
Print ISSN	0196-2892
Electronic ISSN	1558-0644
Publisher	Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed	Peer Reviewed
Article Number	4412115
DOI	https://doi.org/10.1109/tgrs.2025.3572127
Keywords	Remote sensing (RS); Semantic segmentation; Mamba; Scale-aware pyramid pooling (SCPP)
Public URL	https://rgu-repository.worktribe.com/output/2849237

Files

LI 2025 GLVMamba (AAM) (13.8 Mb)
PDF

Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/

Copyright Statement
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
