Multimodal computation or interpretation? Automatic vs. critical understanding of text-image relations in racist memes in English

Polli, C.;Sindoni, M. G.
2024-01-01

Abstract

This paper discusses the epistemological differences between computational and sociosemiotic understandings of the label 'multimodal' by addressing the challenges of automatic detection of hate speech in racist memes, considered as a germane family of multimodal artifacts. Assuming that text-image interplays, as in the case of memes, may be extremely complex for AI-driven models to disentangle, the paper adopts a sociosemiotic multimodal critical approach to discuss the challenges of automatic detection of hateful memes on the Internet. As a case study, we select two different English-language datasets: (1) the Hateful Memes Challenge (HMC) Dataset, built by the Facebook AI Research group in 2020, and (2) the Text-Image Cluster (TIC) Dataset, which includes manually collected user-generated (UG) hateful memes. By discussing different combinations of non-hateful/hateful texts and non-hateful/hateful images, we show how humour, intertextuality, and anomalous juxtapositions of texts and images, as well as contextual cultural knowledge, may render AI-based automatic interpretation incorrect, biased or misleading. In our conclusions, we argue the case for the development of computational models that incorporate insights from sociosemiotics and multimodal critical discourse analysis.

Use this identifier to cite or link to this document: https://hdl.handle.net/11570/3287808
