Overview of DIPROMATS 2024: Detection, Characterization and Tracking of Propaganda in Messages from Diplomats and Authorities of World Powers

  1. Moral, Pablo
  2. Fraile, Jesús M.
  3. Marco, Guillermo
  4. Peñas, Anselmo
  5. Gonzalo, Julio
Journal: Procesamiento del Lenguaje Natural

ISSN: 1135-5948

Year of publication: 2024

Issue: 73

Pages: 347-358

Type: Article


Abstract

This paper summarizes the findings of DIPROMATS 2024, a shared task included in the Iberian Languages Evaluation Forum (IberLEF). This second edition introduces a refined typology of propaganda techniques and a more balanced dataset for propaganda detection, alongside a new task focused on identifying strategic narratives. The dataset for the first task comprises 12,012 annotated tweets in English and 9,501 in Spanish, posted by authorities from China, Russia, the United States, and the European Union. Participants tackled three subtasks in each language: binary classification to detect propagandistic tweets, coarse-grained classification into three propaganda categories, and fine-grained categorization into seven techniques. The second task poses a multi-class, multi-label classification challenge in which systems identify which of the predefined narratives associated with each international actor a tweet belongs to. The task is supported by narrative descriptions and example tweets in English and Spanish, enabling few-shot learning approaches. In total, 40 runs from nine teams were evaluated.
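The first task's three subtasks form a label hierarchy: seven fine-grained techniques roll up into three propaganda categories, which in turn determine the binary label. A minimal sketch of that roll-up is shown below; the technique and category names are illustrative placeholders, not the official DIPROMATS 2024 taxonomy.

```python
# Hypothetical sketch of the three-level label hierarchy for Task 1:
# technique -> category -> binary. Names below are placeholders, not the
# official DIPROMATS 2024 typology.
TECHNIQUE_TO_CATEGORY = {
    "flag_waving": "appeal_to_commonality",
    "ad_populum": "appeal_to_commonality",
    "name_calling": "discrediting_the_opponent",
    "doubt": "discrediting_the_opponent",
    "fear_appeal": "discrediting_the_opponent",
    "loaded_language": "loaded_language",
    "exaggeration": "loaded_language",
}

def derive_labels(techniques):
    """Derive the coarse category set and the binary label from the
    fine-grained techniques annotated on a tweet."""
    categories = {TECHNIQUE_TO_CATEGORY[t] for t in techniques}
    is_propaganda = bool(categories)  # any technique => propagandistic
    return is_propaganda, sorted(categories)
```

For example, a tweet annotated with `{"name_calling", "doubt"}` would yield the binary label `True` and the single coarse category `discrediting_the_opponent`, while an unannotated tweet yields `False` and no categories.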
