Publications

Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary Tasks. In EACL 2024 Findings.
Danae Sánchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras.
arXiv Code
  • We use two auxiliary losses, Image-Text Contrastive (ITC) and Image-Text Matching (ITM), jointly with the main task loss when fine-tuning any pre-trained multimodal model for social media post classification.
  • We combine these objectives with five multimodal models, demonstrating consistent improvements across four popular social media datasets.
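As a rough illustration of the idea in the bullets above, and not the paper's implementation, an ITC-style loss and its combination with a main task loss could be sketched as follows. The function names, the temperature value, the loss weights, and the symmetric InfoNCE formulation are all assumptions made for this sketch; the ITM term is taken as a precomputed number.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def itc_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired image/text embeddings.

    Assumption: pair i in image_embs matches pair i in text_embs, and all
    other pairs in the batch act as negatives.
    """
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    sim = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
           for i in imgs]
    sim_t = [list(col) for col in zip(*sim)]  # text-to-image direction
    return 0.5 * (_cross_entropy_diag(sim) + _cross_entropy_diag(sim_t))


def joint_loss(task_loss, itc, itm, w_itc=1.0, w_itm=1.0):
    """Hypothetical weighted sum of the main task loss and both auxiliary losses."""
    return task_loss + w_itc * itc + w_itm * itm
```

With this sketch, a batch whose image and text embeddings are correctly paired yields a much lower ITC value than the same batch with the texts shuffled, which is the signal the auxiliary objective adds during fine-tuning.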
A Multimodal Analysis of Influencer Content on Twitter 🏆 Best Area Chair Award – Society & NLP. In AACL 2023.
Danae Sánchez Villegas, Catalina Goanta, Nikolaos Aletras.
Paper Data Slides
  • Our research explores the challenges in automatically detecting regulatory compliance breaches in influencer advertising.
  • We introduce a new dataset and run experiments to improve the detection of commercial influencer content.
Sheffield’s Submission to the AmericasNLP Shared Task on Machine Translation into Indigenous Languages. 🥇 Best Submission. In Workshop on Natural Language Processing for Indigenous Languages of the Americas 2023.
Edward Gow-Smith, Danae Sánchez Villegas.
Paper Code
  • We describe our submission to the AmericasNLP 2023 Shared Task on Machine Translation into Indigenous Languages.
  • Our approach consists of extending, training, and ensembling different variations of NLLB-200.
  • We achieve the highest average chrF of all the submissions.
Combining Humor and Sarcasm for Improving Political Parody Detection. In NAACL 2022.
Xiao Ao, Danae Sánchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras.
Paper Code Tech At Bloomberg
  • We propose a method that combines parallel encoders to capture parody-, humor-, and sarcasm-specific representations from input sequences, and that outperforms previous state-of-the-art models for parody detection.
Point-of-Interest Type Prediction using Text and Images. In EMNLP 2021.
Danae Sánchez Villegas and Nikolaos Aletras.
Paper Data Poster
  • We propose a model for POI type prediction that combines text and images, using a modality gate to control how much information is drawn from each modality and a cross-attention mechanism to learn cross-modal interactions.
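One common way to build a modality gate, shown here only as a minimal sketch and not as the model from the paper, is a per-dimension sigmoid gate computed from the concatenated text and image vectors. The function name, the parameter shapes, and the convex-combination form are assumptions; the cross-attention component is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def modality_gate(text_vec, image_vec, gate_weights, gate_bias):
    """Fuse two equal-length vectors with a learned per-dimension gate.

    gate_weights is a hypothetical d x 2d matrix and gate_bias a length-d
    vector. For each dimension k, g_k = sigmoid(W_k . [text; image] + b_k)
    and the fused value is g_k * text_k + (1 - g_k) * image_k.
    """
    concat = text_vec + image_vec  # list concatenation, length 2d
    fused = []
    for k, (t, v) in enumerate(zip(text_vec, image_vec)):
        z = sum(w * x for w, x in zip(gate_weights[k], concat)) + gate_bias[k]
        g = sigmoid(z)
        fused.append(g * t + (1.0 - g) * v)
    return fused
```

With all-zero gate parameters every gate value is 0.5, so the output is the plain average of the two modalities; a strongly positive bias pushes the output towards the text vector and a strongly negative one towards the image vector.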
Analyzing Online Political Advertisements. In ACL Findings 2021.
Danae Sánchez Villegas, Saeid Mokaram, Nikolaos Aletras.
Paper Data
  • We present work on inferring ideology and sponsor type from political ads in the US.
  • We release two new datasets for political ad analysis, evaluate multimodal models, and provide an in-depth analysis of the limitations of our models.
Point-of-Interest Type Inference from Social Media Text. In AACL 2020.
Danae Sánchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras.
Paper Data Tech At Bloomberg
  • We introduce a dataset of tweets mapped to Foursquare POIs (locations), evaluate several text classification models, and provide a temporal analysis.
Analyzing Political Parody in Social Media. In ACL 2020.
Antonis Maronikolakis, Danae Sánchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras.
Paper Data Full Dataset Request Slides Blog Post
  • We present a first study of parody using methods from computational linguistics and machine learning.
  • We introduce a freely available large-scale dataset containing a total of 131,666 English tweets from 184 real and corresponding parody accounts, and evaluate a range of neural models that achieve high predictive accuracy.
Beyond Words: Analyzing Social Media with Text and Images. PhD thesis, University of Sheffield.
Danae Sánchez Villegas.
eThesis
  • My PhD thesis focuses on introducing challenging tasks and novel methods to gain a better understanding of multimodal content and its underlying dynamics in the context of social media.