Unsupervised domain adaptation for VHR urban scene segmentation via prompted foundation model-based hybrid training joint-optimized network.
Authors
Shuchang Lyu
Qi Zhao
Yaxuan Sun
Guangliang Cheng
Yiwei He
Guangbiao Wang
Professor Jinchang Ren (Professor of Computing Science) j.ren@rgu.ac.uk
Zhenwei Shi
Abstract
Unsupervised Domain Adaptation for Remote Sensing Semantic Segmentation (UDA-RSSeg) aims to adapt a model trained on source-domain data to target-domain samples, thereby minimizing the need for annotated data across diverse remote sensing scenes. In urban planning and monitoring, UDA-RSSeg on Very-High-Resolution (VHR) images has garnered significant research interest. While recent deep learning techniques have achieved considerable success on the UDA-RSSeg task for VHR urban scenes, the domain shift issue remains a persistent challenge. Specifically, there are two primary problems: (1) severe inconsistencies in feature representation across domains with notably different data distributions, and (2) a domain gap caused by the representation bias toward source-domain patterns when translating features into predictive logits. To solve these problems, we propose a prompted foundation model-based hybrid training joint-optimized network (PFM-JONet) for UDA-RSSeg on VHR urban scenes. Our approach integrates the "Segment Anything Model" (SAM) as a prompted foundation model to leverage its robust, generalized representation capabilities, thereby alleviating feature inconsistencies. Building on the features extracted by the SAM encoder, we introduce a mapping decoder that converts SAM encoder features into predictive logits. Additionally, a prompted segmentor is employed to generate class-agnostic maps, which guide the mapping decoder's feature representations. To optimize the entire network efficiently in an end-to-end manner, we design a hybrid training scheme that integrates feature-level and logits-level adversarial training strategies alongside a self-training mechanism, enhancing the model from diverse, compatible perspectives. To evaluate the performance of the proposed PFM-JONet, we conduct extensive experiments on urban scene benchmark datasets, including ISPRS (Potsdam/Vaihingen) and CITY-OSM (Paris/Chicago). On the ISPRS dataset, PFM-JONet surpasses previous state-of-the-art (SOTA) methods by 1.60% in mean IoU across four adaptation tasks; on the CITY-OSM adaptation task, it outperforms the SOTA by 4.84% in mean IoU. These results demonstrate the effectiveness of our method, and visualizations and analyses further reinforce its interpretability. The code is available at https://github.com/CV-ShuchangLyu/PFM-JONet.
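The sketch below illustrates the general training recipe the abstract describes: a SAM-style encoder feeding a mapping decoder, with feature-level and logits-level adversarial alignment plus self-training on the target domain. It is a minimal, hypothetical PyTorch sketch, not the authors' code: all module names, the loss weight, and the confidence threshold are illustrative assumptions, a tiny CNN stands in for the SAM image encoder, and the prompted-segmentor guidance branch is omitted. Refer to https://github.com/CV-ShuchangLyu/PFM-JONet for the official implementation.

```python
# Minimal sketch of the hybrid training idea described in the abstract.
# All names are illustrative stand-ins, NOT the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 6  # e.g. the ISPRS Potsdam/Vaihingen label set


class SAMEncoderStub(nn.Module):
    """Stand-in for the SAM image encoder (provides generalized features)."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


class MappingDecoder(nn.Module):
    """Maps encoder features to per-class logits at the input resolution."""
    def __init__(self, dim=256, num_classes=NUM_CLASSES):
        super().__init__()
        self.head = nn.Conv2d(dim, num_classes, 1)

    def forward(self, feat, size):
        return F.interpolate(self.head(feat), size=size,
                             mode="bilinear", align_corners=False)


class PatchDiscriminator(nn.Module):
    """Used for feature-level or logits-level adversarial alignment."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)


def hybrid_loss(encoder, decoder, d_feat, d_logit, x_src, y_src, x_tgt,
                adv_weight=0.01, conf_thresh=0.9):
    """Source segmentation loss + adversarial terms pushing target features
    and logits toward the source distribution + self-training on confident
    target pseudo-labels (segmentor-side losses only)."""
    f_src, f_tgt = encoder(x_src), encoder(x_tgt)
    logits_src = decoder(f_src, x_src.shape[-2:])
    logits_tgt = decoder(f_tgt, x_tgt.shape[-2:])

    # (1) Supervised cross-entropy on the labeled source domain.
    loss_seg = F.cross_entropy(logits_src, y_src)

    # (2) Adversarial terms: make target features/logits look "source-like"
    #     (label 1 = source, as judged by the discriminators).
    pred_f, pred_l = d_feat(f_tgt), d_logit(logits_tgt.softmax(dim=1))
    loss_adv = (F.binary_cross_entropy_with_logits(pred_f, torch.ones_like(pred_f))
                + F.binary_cross_entropy_with_logits(pred_l, torch.ones_like(pred_l)))

    # (3) Self-training: pseudo-labels from confident target predictions.
    conf, pseudo = logits_tgt.softmax(dim=1).detach().max(dim=1)
    pseudo = torch.where(conf >= conf_thresh, pseudo,
                         torch.full_like(pseudo, -100))  # ignore uncertain pixels
    if (pseudo != -100).any():
        loss_st = F.cross_entropy(logits_tgt, pseudo, ignore_index=-100)
    else:
        loss_st = logits_tgt.sum() * 0.0  # no confident pixels in this batch

    return loss_seg + adv_weight * loss_adv + loss_st


if __name__ == "__main__":
    enc, dec = SAMEncoderStub(), MappingDecoder()
    d_feat, d_logit = PatchDiscriminator(256), PatchDiscriminator(NUM_CLASSES)
    x_src = torch.randn(2, 3, 256, 256)
    y_src = torch.randint(0, NUM_CLASSES, (2, 256, 256))
    x_tgt = torch.randn(2, 3, 256, 256)
    print(hybrid_loss(enc, dec, d_feat, d_logit, x_src, y_src, x_tgt))
```

In a full adversarial setup the two discriminators would also be updated separately with the opposite source/target labels, and pseudo-labels are often produced by a separate teacher model; both are omitted here for brevity.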
Citation
LYU, S., ZHAO, Q., SUN, Y., CHENG, G., HE, Y., WANG, G., REN, J. and SHI, Z. 2025. Unsupervised domain adaptation for VHR urban scene segmentation via prompted foundation model based hybrid training joint-optimized network. IEEE transactions on geoscience and remote sensing [online], 63, article number 4409117. Available from: https://doi.org/10.1109/tgrs.2025.3564216
Journal Article Type | Article
Acceptance Date | Apr 22, 2025
Online Publication Date | Apr 24, 2025
Publication Date | Dec 31, 2025
Deposit Date | May 5, 2025
Publicly Available Date | May 5, 2025
Journal | IEEE Transactions on Geoscience and Remote Sensing
Print ISSN | 0196-2892
Electronic ISSN | 1558-0644
Publisher | Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed | Peer Reviewed
Volume | 63
Article Number | 4409117
DOI | https://doi.org/10.1109/TGRS.2025.3564216
Keywords | Unsupervised domain adaptation; Semantic segmentation; Hybrid training; Prompted foundation model; Very-high-resolution images; Urban scene
Public URL | https://rgu-repository.worktribe.com/output/2801786
Files
LYU 2025 Unsupervised domain adaptation (VOR)
(4.6 MB)
PDF
Publisher Licence URL
https://creativecommons.org/licenses/by-nc-nd/4.0/
Copyright Statement
© 2025 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.
For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/