FusDreamer: label-efficient remote sensing world model for multimodal data classification.

Authors
Jinping Wang
Weiwei Song
Hao Chen
Professor Jinchang Ren j.ren@rgu.ac.uk (Professor of Computing Science)
Huimin Zhao
Abstract
World models significantly enhance hierarchical understanding, improving data integration and learning efficiency. To explore the potential of world models in the remote sensing (RS) field, this paper proposes a label-efficient remote sensing world model for multimodal data fusion (FusDreamer). FusDreamer uses the world model as a unified representation container to abstract common, high-level knowledge, promoting interactions across different types of data, i.e., hyperspectral imagery (HSI), light detection and ranging (LiDAR), and text data. Initially, a new latent diffusion fusion and multimodal generation paradigm (LaMG) is employed for its exceptional information-integration and detail-retention capabilities. Subsequently, an open-world knowledge-guided consistency projection (OK-CP) module incorporates prompt representations for visually described objects and aligns language and visual features through contrastive learning. In this way, the domain gap can be bridged by fine-tuning pre-trained world models with limited samples. Finally, an end-to-end multitask combinatorial optimization (MuCO) strategy captures slight feature bias and constrains the diffusion process in a collaboratively learnable direction. Experiments conducted on four typical datasets demonstrate the effectiveness and advantages of the proposed FusDreamer.
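The abstract states that the OK-CP module aligns language and visual features through contrastive learning. As a rough illustration only (not the paper's implementation; all function names, shapes, and the temperature value are assumptions), a symmetric CLIP-style InfoNCE objective over a batch of paired text/visual embeddings can be sketched as follows:

```python
import numpy as np

def contrastive_alignment_loss(text_emb, visual_emb, temperature=0.07):
    """Symmetric InfoNCE loss over L2-normalized text/visual embeddings.

    Matched pairs (row i of each matrix) are pulled together; every other
    pairing in the batch serves as a negative.
    """
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    v = visual_emb / np.linalg.norm(visual_emb, axis=1, keepdims=True)
    logits = t @ v.T / temperature            # (B, B) cosine-similarity logits
    labels = np.arange(len(logits))           # positives lie on the diagonal

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the text-to-visual and visual-to-text directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

rng = np.random.default_rng(0)
aligned = rng.normal(size=(4, 8))
loss_aligned = contrastive_alignment_loss(aligned, aligned)
loss_random = contrastive_alignment_loss(aligned, rng.normal(size=(4, 8)))
print(loss_aligned < loss_random)  # aligned pairs should yield the lower loss
```

Minimizing such an objective pushes each visual embedding toward the text prompt describing it, which is the general mechanism by which limited labeled samples can steer a pre-trained representation space.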
Citation
WANG, J., SONG, W., CHEN, H., REN, J. and ZHAO, H. 2025. FusDreamer: label-efficient remote sensing world model for multimodal data classification. IEEE transactions on geoscience and remote sensing [online], 63, article number 570314. Available from: https://doi.org/10.1109/TGRS.2025.3554862
| Field | Value |
|---|---|
| Journal Article Type | Article |
| Acceptance Date | Mar 22, 2025 |
| Online Publication Date | Mar 26, 2025 |
| Publication Date | Apr 15, 2025 |
| Deposit Date | Mar 27, 2025 |
| Publicly Available Date | Mar 27, 2025 |
| Journal | IEEE transactions on geoscience and remote sensing |
| Print ISSN | 0196-2892 |
| Electronic ISSN | 1558-0644 |
| Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
| Peer Reviewed | Peer Reviewed |
| Volume | 63 |
| Article Number | 570314 |
| DOI | https://doi.org/10.1109/TGRS.2025.3554862 |
| Keywords | Multimodal data fusion; World model; Contrastive learning; Diffusion process |
| Public URL | https://rgu-repository.worktribe.com/output/2762032 |
| Additional Information | The code used in this article is available from: https://github.com/Cimy-wang/FusDreamer |
Files
WANG 2025 FusDreamer (AAM), PDF (10 MB)
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.