Decoding memes: a comprehensive analysis of late and early fusion models for explainable meme analysis.

Authors
Faseela Chakkalakkal Abdullakutty
Usman Naseem

Editors
Tat-Seng Chua
Chong-Wah Ngo
Ravi Kumar
Hady W. Lauw
Roy Ka-Wei Lee
Abstract
Memes serve as conduits for expressing emotions, opinions, and social commentary online, providing valuable insight into public sentiment, trends, and social interactions. By combining textual and visual elements, multi-modal fusion techniques enhance meme analysis, enabling effective classification of offensive and sentiment-laden memes. Early and late fusion methods both integrate multi-modal data effectively, but each faces limitations. Early fusion integrates features from the different modalities before classification, whereas late fusion classifies each modality individually and then reclassifies the combined outputs. This paper compares early and late fusion models for meme analysis, demonstrating their efficacy in extracting meme concepts and classifying meme reasoning. Pre-trained vision encoders, including ViT and VGG-16, and language encoders such as BERT, ALBERT, and DistilBERT, were employed to extract image and text features, which were then used in both early and late fusion pipelines. The paper further compares the explainability of the fusion models through SHAP analysis. In comprehensive experiments across multiple sentiment scenarios, classifiers such as XGBoost and Random Forest, paired with different combinations of vision and text features, demonstrated the superior effectiveness of late fusion over early fusion.
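For a concrete picture of the two pipelines contrasted in the abstract, the minimal sketch below may help. It is an illustration, not the authors' released code: random feature matrices stand in for the ViT/VGG-16 image embeddings and BERT/ALBERT/DistilBERT text embeddings, scikit-learn's RandomForestClassifier stands in for the paper's classifiers, and the commented-out SHAP lines assume the `shap` package is installed.

```python
# Minimal sketch of early vs. late fusion for meme classification.
# Not the authors' code: random matrices stand in for pre-extracted
# image embeddings (ViT/VGG-16) and text embeddings (BERT/ALBERT/DistilBERT).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
img_feats = rng.normal(size=(n, 768))    # stand-in for image embeddings
txt_feats = rng.normal(size=(n, 768))    # stand-in for text embeddings
labels = rng.integers(0, 2, size=n)      # binary label, e.g. offensive vs. not

train, test = train_test_split(np.arange(n), test_size=0.25, random_state=0)

# Early fusion: concatenate modality features, then classify once.
fused = np.concatenate([img_feats, txt_feats], axis=1)
early_clf = RandomForestClassifier(random_state=0)
early_clf.fit(fused[train], labels[train])
early_pred = early_clf.predict(fused[test])

# Late fusion: classify each modality separately, then reclassify
# the combined per-modality outputs with a meta-classifier.
img_clf = RandomForestClassifier(random_state=0).fit(img_feats[train], labels[train])
txt_clf = RandomForestClassifier(random_state=0).fit(txt_feats[train], labels[train])

def modality_outputs(idx):
    # Per-modality class probabilities become the meta-classifier's features.
    return np.column_stack([img_clf.predict_proba(img_feats[idx]),
                            txt_clf.predict_proba(txt_feats[idx])])

# Note: for brevity the meta-classifier is fit on in-sample base-model
# outputs; a real pipeline would use out-of-fold predictions instead.
meta_clf = RandomForestClassifier(random_state=0)
meta_clf.fit(modality_outputs(train), labels[train])
late_pred = meta_clf.predict(modality_outputs(test))

# Explainability via SHAP (assumes the `shap` package):
# import shap
# explainer = shap.TreeExplainer(meta_clf)
# shap_values = explainer.shap_values(modality_outputs(test))
```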
Citation
ABDULLAKUTTY, F. and NASEEM, U. 2024. Decoding memes: a comprehensive analysis of late and early fusion models for explainable meme analysis. In Chua, T.-S., Ngo, C.-W., Kumar, R., Lauw, H.W. and Lee, R.K.-W. (eds.). WWW'24 companion: companion proceedings of the ACM web conference 2024, 13-17 May 2024, Singapore. New York: ACM [online], pages 1681-1689. Available from: https://doi.org/10.1145/3589335.3652504
Field | Value
---|---
Presentation Conference Type | Conference Paper (published)
Conference Name | 2024 ACM Web Conference (WWW '24)
Start Date | May 13, 2024
End Date | May 17, 2024
Acceptance Date | Mar 4, 2024
Online Publication Date | May 13, 2024
Publication Date | May 31, 2024
Deposit Date | Jun 6, 2024
Publicly Available Date | Jun 6, 2024
Publisher | Association for Computing Machinery (ACM)
Peer Reviewed | Peer Reviewed
Pages | 1681-1689
Book Title | WWW'24 companion: companion proceedings of the ACM web conference 2024
DOI | https://doi.org/10.1145/3589335.3652504
Keywords | Explainability; Fusion; Multi-modal meme analysis
Public URL | https://rgu-repository.worktribe.com/output/2368218
Files
ABDULLAKUTTY 2024 Decoding memes
(19.9 MB)
Archive
Publisher Licence URL
https://creativecommons.org/licenses/by/4.0/
Copyright Statement
© 2024 Owner/Author. This work is licensed under a Creative Commons Attribution 4.0 International License.