Prompting-to-distill semantic knowledge for few-shot learning.

Ji, Hong; Gao, Zhi; Ren, Jinchang; Wang, Xing-ao; Gao, Tianyi; Sun, Wenbo; Ma, Ping

Abstract

Recognizing visual patterns in the low-data regime requires deep neural networks to learn generalized representations from limited training samples. In this paper, we propose a novel few-shot classification method, namely ProDFSL, that leverages multi-modal knowledge and an attention mechanism. We are inspired by recent advances in large language models and the great potential they have shown across a wide range of downstream tasks, and we tailor them to benefit the remote sensing community. We use ChatGPT to produce class-specific textual inputs that supply CLIP with rich semantic information. To promote the adaptation of CLIP to the remote sensing domain, we introduce a Cross-modal Knowledge Generation Module, which dynamically generates a group of soft prompts conditioned on the few-shot visual samples and then uses a shallow Transformer to model the dependencies between language sequences. By fusing the semantic information with the few-shot visual samples, we build representative class prototypes that are conducive to both inductive and transductive inference. In extensive experiments on standard benchmarks, ProDFSL consistently outperforms state-of-the-art few-shot learning methods.
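The idea of fusing semantic information with few-shot visual samples into class prototypes can be illustrated with a minimal sketch. This is not the ProDFSL implementation: the fixed fusion weight `alpha`, the function names, and the three-dimensional toy vectors (standing in for CLIP image and text embeddings) are all assumptions made for illustration. The Cross-modal Knowledge Generation Module described in the abstract learns its prompts and is not reproduced here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_prototypes(support, text_embeddings, alpha=0.5):
    """Fuse per-class text and visual embeddings into unit-length prototypes.

    support: {label: [visual feature vectors of the few-shot samples]}
    text_embeddings: {label: text feature vector for that class}
    alpha: hypothetical fixed weight on the text embedding (an assumption,
           not a value or mechanism taken from the paper).
    """
    prototypes = {}
    for label, feats in support.items():
        visual = l2_normalize(mean_vector([l2_normalize(f) for f in feats]))
        text = l2_normalize(text_embeddings[label])
        fused = [alpha * t + (1 - alpha) * v for t, v in zip(text, visual)]
        prototypes[label] = l2_normalize(fused)
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype has the highest cosine similarity."""
    q = l2_normalize(query)
    scores = {
        label: sum(a * b for a, b in zip(q, p))
        for label, p in prototypes.items()
    }
    return max(scores, key=scores.get)


# Toy 2-way 2-shot episode with made-up 3-D features.
support = {
    "airport": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0]],
    "forest": [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]],
}
text_embeddings = {"airport": [1.0, 0.0, 0.0], "forest": [0.0, 0.0, 1.0]}

prototypes = build_prototypes(support, text_embeddings, alpha=0.5)
prediction = classify([0.8, 0.1, 0.1], prototypes)
```

In this sketch a query is assigned to the nearest prototype by cosine similarity, which corresponds to the inductive setting. A transductive variant would additionally refine the prototypes using the unlabeled query set.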

Citation

JI, H., GAO, Z., REN, J., WANG, X.-A., GAO, T., SUN, W. and MA, P. 2024. Prompting-to-distill semantic knowledge for few-shot learning. IEEE geoscience and remote sensing letters [online], 21, article 6011605. Available from: https://doi.org/10.1109/lgrs.2024.3414505

Journal Article Type: Article
Acceptance Date: Jun 9, 2024
Online Publication Date: Jun 14, 2024
Publication Date: Dec 31, 2024
Deposit Date: Jun 24, 2024
Publicly Available Date: Jun 24, 2024
Journal: IEEE geoscience and remote sensing letters
Print ISSN: 1545-598X
Electronic ISSN: 1558-0571
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Peer Reviewed: Peer Reviewed
Volume: 21
Article Number: 6011605
DOI: https://doi.org/10.1109/lgrs.2024.3414505
Keywords: Few-shot learning; ChatGPT; CLIP; Multi-modal knowledge; Attention mechanism
Public URL: https://rgu-repository.worktribe.com/output/2378315

Files

JI 2024 Prompting-to-distill (AAM) (1.1 Mb)
PDF

Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/

Copyright Statement
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.



