GLVMamba: a global-local visual state space model for remote sensing image segmentation.
Li, Huihui; Pan, Huajian; Liu, Xiaoyong; Ren, Jinchang; Du, Zhiguo; Cao, Jingjing
Authors
Huihui Li
Huajian Pan
Xiaoyong Liu
Professor Jinchang Ren j.ren@rgu.ac.uk
Professor of Computing Science
Zhiguo Du
Jingjing Cao
Abstract
Semantic segmentation of remote sensing images has advanced significantly with the adoption of deep neural networks, combining the advantages of Convolutional Neural Networks (CNNs) in local feature extraction with those of Transformers in global information modeling. However, due to the limited long-range modeling capability of CNNs and the computational complexity of Transformers, remote sensing semantic segmentation still suffers from serious holes, rough edge segmentation, and false or even missed detections caused by lighting, shadows and other factors. To address these issues, we propose a visual state space model called GLVMamba, which employs CNNs as the encoder and the proposed Global-Local Visual State Space (GLVSS) block as the core decoder. Specifically, the GLVSS block introduces locality forward feedback and a shift window mechanism to address Mamba's insufficient modeling of neighboring pixel dependencies, which enhances the integration of global and local context during feature reconstruction, boosts the object perception capability of the model, and effectively refines edge contours. Additionally, a scale-aware pyramid pooling (SCPP) module is proposed to merge features from various scales and to adaptively fuse and extract distinguishing features, mitigating holes and false detections. GLVMamba captures global-local semantic information and multi-scale features through the GLVSS block and the SCPP module, achieving efficient and accurate remote sensing semantic segmentation. Extensive experiments on two widely used datasets demonstrate the superiority of the proposed method over other state-of-the-art methods. The code will be available at https://github.com/Tokisakiwlp/GLVMamba.
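The general idea of pooling a feature map at several scales and fusing the results with adaptive weights can be illustrated with a minimal sketch. This is not the authors' SCPP module, whose design is not described in the abstract; it is a generic pyramid pooling example in plain Python on a single-channel 2D map. The function names (`adaptive_avg_pool`, `upsample_nearest`, `pyramid_pool_fuse`), the bin sizes, and the softmax-weighted fusion are all assumptions made for illustration.

```python
import math

def adaptive_avg_pool(feat, bins):
    """Average-pool a 2D list `feat` (H x W) into a bins x bins grid."""
    h, w = len(feat), len(feat[0])
    pooled = []
    for i in range(bins):
        # Row range covered by output cell i (floor start, ceil end).
        r0, r1 = (i * h) // bins, math.ceil((i + 1) * h / bins)
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, math.ceil((j + 1) * w / bins)
            cells = [feat[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled

def upsample_nearest(pooled, h, w):
    """Nearest-neighbour upsampling of a bins x bins grid back to H x W."""
    bins = len(pooled)
    return [[pooled[(r * bins) // h][(c * bins) // w] for c in range(w)]
            for r in range(h)]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def pyramid_pool_fuse(feat, bin_sizes=(1, 2, 4), scale_logits=None):
    """Pool at each scale, upsample, and fuse with softmax weights.

    `scale_logits` stands in for learnable per-scale weights (an assumption);
    when omitted, every scale contributes equally.
    """
    h, w = len(feat), len(feat[0])
    if scale_logits is None:
        scale_logits = [0.0] * len(bin_sizes)
    weights = softmax(list(scale_logits))
    branches = [upsample_nearest(adaptive_avg_pool(feat, b), h, w)
                for b in bin_sizes]
    return [[sum(wt * br[r][c] for wt, br in zip(weights, branches))
             for c in range(w)]
            for r in range(h)]
```

In a trained network the per-scale logits would typically be predicted from the input or learned as parameters, and the pooling would run per channel on tensors rather than on nested lists.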
Citation
LI, H., PAN, H., LIU, X., REN, J., DU, Z. and CAO, J. 2025. GLVMamba: a global-local visual state space model for remote sensing image segmentation. IEEE transactions on geoscience and remote sensing [online], Early Access. Available from: https://doi.org/10.1109/TGRS.2025.3572127
Journal Article Type | Article |
---|---|
Acceptance Date | May 23, 2025 |
Online Publication Date | May 23, 2025 |
Deposit Date | May 29, 2025 |
Publicly Available Date | May 29, 2025 |
Journal | IEEE transactions on geoscience and remote sensing |
Print ISSN | 0196-2892 |
Electronic ISSN | 1558-0644 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
DOI | https://doi.org/10.1109/tgrs.2025.3572127 |
Keywords | Remote sensing (RS); Semantic segmentation; Mamba; Scale-aware pyramid pooling (SCPP) |
Public URL | https://rgu-repository.worktribe.com/output/2849237 |
Files
LI 2025 GLVMamba (AAM)
(13.8 Mb)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.