Prompting-to-distill semantic knowledge for few-shot learning.
Ji, Hong; Gao, Zhi; Ren, Jinchang; Wang, Xing-ao; Gao, Tianyi; Sun, Wenbo; Ma, Ping
Authors
Hong Ji
Zhi Gao
Professor Jinchang Ren j.ren@rgu.ac.uk
Professor of Computing Science
Xing-ao Wang
Tianyi Gao
Wenbo Sun
Ms Ping Ma p.ma2@rgu.ac.uk
Research Fellow
Abstract
Recognizing visual patterns in a low-data regime requires deep neural networks to learn generalized representations from limited training samples. In this paper, we propose a novel few-shot classification method, namely ProDFSL, which leverages multi-modal knowledge and an attention mechanism. We are inspired by recent advances in large language models and the great potential they have shown across a wide range of downstream tasks, and we tailor them to benefit the remote sensing community. We use ChatGPT to produce class-specific textual inputs that supply CLIP with rich semantic information. To promote the adaptation of CLIP to the remote sensing domain, we introduce a Cross-modal Knowledge Generation Module, which dynamically generates a group of soft prompts conditioned on the few-shot visual samples and then uses a shallow Transformer to model the dependencies between language sequences. By fusing the semantic information with the few-shot visual samples, we build representative class prototypes, which are conducive to both inductive and transductive inference. In extensive experiments on standard benchmarks, our ProDFSL consistently outperforms state-of-the-art few-shot learning methods.
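The prototype idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the ProDFSL implementation: the fusion rule (a weighted sum controlled by a hypothetical `alpha`), the function names, and the toy three-dimensional embeddings are all assumptions made for illustration. In practice the visual and text embeddings would come from a model such as CLIP, and the sketch covers only the inductive case, where each query is classified independently.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_prototypes(support, text_embeddings, alpha=0.5):
    """Fuse the mean support embedding with a text embedding per class.

    alpha is a hypothetical mixing weight (an assumption, not from the paper):
    1.0 uses only the visual samples, 0.0 uses only the text embedding.
    """
    prototypes = {}
    for label, feats in support.items():
        visual = l2_normalize(mean_vector([l2_normalize(f) for f in feats]))
        text = l2_normalize(text_embeddings[label])
        fused = [alpha * v + (1 - alpha) * t for v, t in zip(visual, text)]
        prototypes[label] = l2_normalize(fused)
    return prototypes


def classify(query, prototypes):
    """Return the label whose prototype has the highest cosine similarity."""
    q = l2_normalize(query)
    return max(
        prototypes,
        key=lambda label: sum(a * b for a, b in zip(q, prototypes[label])),
    )


# Toy 2-way 2-shot episode with made-up 3-dimensional embeddings.
support = {
    "airport": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]],
    "forest": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.1]],
}
text_embeddings = {"airport": [1.0, 0.0, 0.0], "forest": [0.0, 1.0, 0.0]}

prototypes = build_prototypes(support, text_embeddings)
prediction = classify([0.8, 0.3, 0.0], prototypes)
```

A weighted sum is only one simple way to combine the two modalities; the abstract describes a learned module with soft prompts and a shallow Transformer, which this sketch does not attempt to reproduce.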
Citation
JI, H., GAO, Z., REN, J., WANG, X.-A., GAO, T., SUN, W. and MA, P. 2024. Prompting-to-distill semantic knowledge for few-shot learning. IEEE geoscience and remote sensing letters [online], 21, article 6011605. Available from: https://doi.org/10.1109/lgrs.2024.3414505
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 9, 2024 |
Online Publication Date | Jun 14, 2024 |
Publication Date | Dec 31, 2024 |
Deposit Date | Jun 24, 2024 |
Publicly Available Date | Jun 24, 2024 |
Journal | IEEE geoscience and remote sensing letters |
Print ISSN | 1545-598X |
Electronic ISSN | 1558-0571 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Peer Reviewed | Peer Reviewed |
Volume | 21 |
Article Number | 6011605 |
DOI | https://doi.org/10.1109/lgrs.2024.3414505 |
Keywords | Few-shot learning; ChatGPT; CLIP; Multi-modal knowledge; Attention mechanism |
Public URL | https://rgu-repository.worktribe.com/output/2378315 |
Files
JI 2024 Prompting-to-distill (AAM) (PDF, 1.1 MB)
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.