
Unimodal Intermediate Training for Multimodal Meme Sentiment Classification: Architectural Details


Too Long; Didn't Read

This study introduces a novel approach that uses unimodal intermediate training to enhance multimodal meme sentiment classifiers, significantly improving their performance and efficiency in meme sentiment analysis.

Authors:

(1) Muzhaffar Hazman, University of Galway, Ireland;

(2) Susan McKeever, Technological University Dublin, Ireland;

(3) Josephine Griffith, University of Galway, Ireland.

Abstract and Introduction

Related Works

Methodology

Results

Limitations and Future Works

Conclusion, Acknowledgments, and References

A Hyperparameters and Settings

B Metric: Weighted F1-Score

C Architectural Details

D Performance Benchmarking

E Contingency Table: Baseline vs. Text-STILT

C Architectural Details

Our models are based on the Baseline model proposed by Hazman et al. (2023), and we similarly utilise the Image and Text Encoders from the pretrained ViT-B/16 CLIP model to generate representations of each modality:


F_I = ImageEncoder(Image)

F_T = TextEncoder(Text)


where F_I and F_T are 512-dimensional embeddings of the image and text modalities, respectively, taken from CLIP’s embedding space, which aligns images with their corresponding text captions (Radford et al., 2021).
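As a concrete illustration, the two 512-dimensional embeddings can be obtained as in the minimal sketch below. This assumes the Hugging Face transformers release of CLIP ViT-B/16, not the authors’ exact implementation, and uses a hypothetical meme image path and caption.


import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the pretrained ViT-B/16 CLIP encoders (Hugging Face checkpoint assumed).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

image = Image.open("meme.jpg").convert("RGB")  # hypothetical meme image file
text = "when the code finally compiles"        # hypothetical overlaid caption

inputs = processor(text=[text], images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    F_I = model.get_image_features(pixel_values=inputs["pixel_values"])     # shape (1, 512)
    F_T = model.get_text_features(input_ids=inputs["input_ids"],
                                  attention_mask=inputs["attention_mask"])  # shape (1, 512)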


For unimodal inputs, the encoder for the missing modality is fed a blank input. For example, when finetuning on unimodal images, the text input is defined as an empty string, i.e. “”:


F_I = ImageEncoder(Image)

F_T = TextEncoder(“”)
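Continuing the sketch above (same assumed Hugging Face setup), the unimodal-image case simply passes an empty string to the text branch. CLIP’s tokenizer still emits its start- and end-of-text tokens, so the text encoder returns a valid, content-free 512-dimensional vector.


# Unimodal-image finetuning: the text encoder receives an empty string.
empty_text_inputs = processor(text=[""], return_tensors="pt", padding=True)

with torch.no_grad():
    F_T = model.get_text_features(input_ids=empty_text_inputs["input_ids"],
                                  attention_mask=empty_text_inputs["attention_mask"])  # (1, 512)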


Conversely, when finetuning on unimodal texts, the image input is defined as a 3 × 224 × 224 matrix of zeros or, equivalently, a JPEG image with all pixels set to black.
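In other words, F_I = ImageEncoder(blank image) while F_T = TextEncoder(Text) as before. A sketch under the same assumptions follows; note that whether the zeros describe the raw pixels (as below) or the already-normalised tensor handed to the encoder is an implementation detail the text leaves open.


# Unimodal-text finetuning: the image encoder receives an all-black image
# (raw pixel values of zero; the processor then resizes and normalises as usual).
black_image = Image.new("RGB", (224, 224), (0, 0, 0))
blank_image_inputs = processor(images=black_image, return_tensors="pt")

with torch.no_grad():
    F_I = model.get_image_features(pixel_values=blank_image_inputs["pixel_values"])  # (1, 512)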





This paper is available on arXiv under a CC 4.0 license.