Enrich, distill and fuse: generalized few-shot semantic segmentation in remote sensing leveraging foundation model's assistance.
Gao, Tianyi; Ao, Wei; Wang, Xing Ao; Zhao, Yuanhao; Ma, Ping; Xie, Mengjie; Fu, Hang; Ren, Jinchang; Gao, Zhi
Authors
Tianyi Gao
Wei Ao
Xing Ao Wang
Yuanhao Zhao
Ms Ping Ma p.ma2@rgu.ac.uk
Research Fellow
Mengjie Xie
Hang Fu
Professor Jinchang Ren j.ren@rgu.ac.uk
Professor of Computing Science
Zhi Gao
Abstract
Generalized few-shot semantic segmentation (GFSS) unifies semantic segmentation with few-shot learning, showing great potential for Earth observation tasks under data-scarce conditions such as disaster response, urban planning, and natural resource management. GFSS requires simultaneous prediction for both base and novel classes, and the challenge lies in balancing the segmentation performance of the two. This paper therefore introduces FoMA, a Foundation Model Assisted GFSS framework for remote sensing images, which aims to leverage the generic semantic knowledge inherent in foundation models. Specifically, we employ three strategies: Support Label Enrichment (SLE), Distillation of General Knowledge (DGK), and Voting Fusion of Experts (VFE). For the support images, SLE uncovers credible unlabeled novel categories, ensuring that each support label contains multiple novel classes. For the query images, DGK enables an effective transfer of the generalizable knowledge that foundation models hold about certain categories to the GFSS learner. Additionally, the VFE strategy integrates the zero-shot predictions of foundation models with the few-shot predictions of the GFSS learner, achieving improved segmentation performance. Extensive experiments and ablation studies conducted on the OpenEarthMap few-shot challenge dataset demonstrate that our proposed method achieves state-of-the-art performance.
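The abstract does not give implementation details for the Voting Fusion of Experts step; purely as an illustration, the sketch below shows one way a zero-shot foundation-model expert and a few-shot GFSS learner could be combined by per-pixel soft voting. The function name, tensor shapes, and the scalar mixing weight are assumptions made for this example, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def voting_fusion(fm_logits: torch.Tensor,
                  gfss_logits: torch.Tensor,
                  fm_weight: float = 0.5) -> torch.Tensor:
    """Fuse per-pixel class scores from two experts by weighted soft voting.

    Args:
        fm_logits:   (B, C, H, W) zero-shot logits from the foundation model.
        gfss_logits: (B, C, H, W) few-shot logits from the GFSS learner.
        fm_weight:   illustrative weight given to the foundation model's vote.

    Returns:
        (B, H, W) fused per-pixel class predictions.
    """
    fm_prob = F.softmax(fm_logits, dim=1)      # zero-shot expert probabilities
    gfss_prob = F.softmax(gfss_logits, dim=1)  # few-shot expert probabilities
    fused = fm_weight * fm_prob + (1.0 - fm_weight) * gfss_prob
    return fused.argmax(dim=1)
```

In the paper itself the fusion is likely class-aware (for instance, weighting the foundation model's vote differently for base and novel classes); the single scalar weight here is only meant to convey the voting idea.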
Citation
GAO, T., AO, W., WANG, X.-A., ZHAO, Y., MA, P., XIE, M., FU, H., REN, J. and GAO, Z. 2024. Enrich, distill and fuse: generalized few-shot semantic segmentation in remote sensing leveraging foundation model’s assistance. In Proceedings of the 2024 IEEE (Institute of Electrical and Electronics Engineers) Computer Society conference on Computer vision and pattern recognition workshops (CVPRW 2024), 16-22 June 2024, Seattle, WA, USA. Piscataway: IEEE [online], pages 2771-2780. Available from: https://doi.org/10.1109/CVPRW63382.2024.00283
| Field | Value |
|---|---|
| Presentation Conference Type | Conference Paper (published) |
| Conference Name | 2024 IEEE (Institute of Electrical and Electronics Engineers) Computer Society conference on Computer vision and pattern recognition workshops (CVPRW 2024) |
| Start Date | Jun 16, 2024 |
| End Date | Jun 22, 2024 |
| Acceptance Date | Feb 26, 2024 |
| Online Publication Date | Jun 22, 2024 |
| Publication Date | Dec 31, 2024 |
| Deposit Date | Jan 21, 2025 |
| Publicly Available Date | Jan 21, 2025 |
| Print ISSN | 2160-7508 |
| Electronic ISSN | 2160-7516 |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Peer Reviewed | Peer Reviewed |
| Pages | 2771-2780 |
| Series ISSN | 2160-7508; 2160-7516 |
| DOI | https://doi.org/10.1109/CVPRW63382.2024.00283 |
| Keywords | Natural resources; Image recognition; Annotations; Fuses; Semantic segmentation; Urban planning; Semantics |
| Public URL | https://rgu-repository.worktribe.com/output/2619966 |
Files
GAO 2024 Enrich distill and fuse (AAM) (PDF, 2.4 MB)
Publisher Licence URL: https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.