Prompting-to-distill semantic knowledge for few-shot learning.

Ji, Hong; Gao, Zhi; Ren, Jinchang; Wang, Xing-ao; Gao, Tianyi; Sun, Wenbo; Ma, Ping

Abstract

Recognizing visual patterns in the low-data regime requires deep neural networks to glean generalized representations from limited training samples. In this paper, we propose a novel few-shot classification method, namely ProDFSL, which leverages multi-modal knowledge and an attention mechanism. We are inspired by recent advances in large language models and the great potential they have shown across a wide range of downstream tasks, and we tailor this capability to benefit the remote sensing community. We utilize ChatGPT to produce class-specific textual inputs that endow CLIP with rich semantic information. To promote the adaptation of CLIP to the remote sensing domain, we introduce a Cross-modal Knowledge Generation Module, which dynamically generates a group of soft prompts conditioned on the few-shot visual samples and further uses a shallow Transformer to model the dependencies between language sequences. By fusing the semantic information with the few-shot visual samples, we build representative class prototypes, which are conducive to both inductive and transductive inference. In extensive experiments on standard benchmarks, our ProDFSL consistently outperforms the state of the art in few-shot learning.
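
To make the idea of prompt-conditioned semantic features and prototype fusion concrete, below is a minimal sketch. It is an illustrative assumption of how such a pipeline could be wired, not the authors' released ProDFSL code: CLIP image/text encoders and ChatGPT-generated descriptions are replaced by random placeholder features, and all names (CrossModalKnowledgeGenerator, build_prototypes), dimensions, and the fusion weight alpha are hypothetical.

```python
# Hypothetical sketch of the prototype-building idea described in the abstract.
# CLIP and ChatGPT calls are replaced with random placeholder tensors.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossModalKnowledgeGenerator(nn.Module):
    """Generates soft prompt tokens conditioned on few-shot visual features and
    refines the resulting language sequence with a shallow Transformer."""

    def __init__(self, dim=512, n_prompts=4, n_heads=8):
        super().__init__()
        self.prompt_proj = nn.Linear(dim, n_prompts * dim)  # visual feature -> soft prompts
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)  # "shallow" Transformer
        self.n_prompts, self.dim = n_prompts, dim

    def forward(self, visual_feat, text_tokens):
        # visual_feat: (C, dim) per-class mean of support-set image features
        # text_tokens: (C, T, dim) embedded class descriptions (e.g. ChatGPT text via CLIP)
        prompts = self.prompt_proj(visual_feat).view(-1, self.n_prompts, self.dim)
        seq = torch.cat([prompts, text_tokens], dim=1)  # prepend visually conditioned prompts
        return self.encoder(seq).mean(dim=1)             # (C, dim) semantic class feature


def build_prototypes(support_feats, semantic_feats, alpha=0.5):
    """Fuse visual class means with semantic features into class prototypes."""
    visual_proto = F.normalize(support_feats.mean(dim=1), dim=-1)  # (C, dim)
    semantic_proto = F.normalize(semantic_feats, dim=-1)
    return F.normalize(alpha * visual_proto + (1 - alpha) * semantic_proto, dim=-1)


if __name__ == "__main__":
    C, K, T, D = 5, 1, 8, 512          # 5-way 1-shot episode, 8 text tokens, 512-d features
    support = torch.randn(C, K, D)     # stand-in for CLIP image features of the support set
    text = torch.randn(C, T, D)        # stand-in for embedded class descriptions
    gen = CrossModalKnowledgeGenerator(dim=D)
    protos = build_prototypes(support, gen(support.mean(dim=1), text))
    query = F.normalize(torch.randn(3, D), dim=-1)   # 3 query images
    logits = query @ protos.t()                      # cosine-similarity scores, (3, C)
    print(logits.argmax(dim=-1))
```

In this reading, inductive inference classifies queries directly against the fused prototypes, while a transductive variant could further refine the prototypes using the unlabeled query features; the sketch only shows the inductive path.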

Citation

JI, H., GAO, Z., REN, J., WANG, X.-A., GAO, T., SUN, W. and MA, P. 2024. Prompting-to-distill semantic knowledge for few-shot learning. IEEE geoscience and remote sensing letters [online], 21, article 6011605. Available from: https://doi.org/10.1109/lgrs.2024.3414505

Journal Article Type Article
Acceptance Date Jun 9, 2024
Online Publication Date Jun 14, 2024
Publication Date Dec 31, 2024
Deposit Date Jun 24, 2024
Publicly Available Date Jun 24, 2024
Journal IEEE geoscience and remote sensing letters
Print ISSN 1545-598X
Electronic ISSN 1558-0571
Publisher Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed Peer Reviewed
Volume 21
Article Number 6011605
DOI https://doi.org/10.1109/lgrs.2024.3414505
Keywords Few-shot learning; ChatGPT; CLIP; Multi-modal knowledge; Attention mechanism
Public URL https://rgu-repository.worktribe.com/output/2378315

Files

JI 2024 Prompting-to-distill (AAM) (1.1 MB)
PDF

Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/

Copyright Statement
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.




