Overview of DIPROMATS 2024: Detection, Characterization and Tracking of Propaganda in Messages from Diplomats and Authorities of World Powers

  1. Moral, Pablo
  2. Fraile, Jesús M.
  3. Marco, Guillermo
  4. Peñas, Anselmo
  5. Gonzalo, Julio
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2024

Issue: 73

Pages: 347-358

Type: Article

Abstract

This paper presents DIPROMATS 2024, a shared task included in IberLEF. This second edition introduces a refined typology of techniques and a more balanced dataset for propaganda detection, as well as a new task on the detection of strategic narratives. The dataset for the first task comprises 12,012 tweets in English and 9,501 in Spanish from authorities of China, Russia, the United States and the European Union. Participants addressed three subtasks per language: binary classification of propagandistic tweets, their grouping into three propaganda categories, and their categorization into seven techniques. The second task is a multi-class, multi-label classification problem that identifies which of a set of predefined narratives each tweet belongs to, based on descriptions and examples in English and Spanish (few-shot learning). In total, 40 runs from nine different teams were evaluated.

References

  • Alisetti, S. V. 2024. Paraphrase Generator with T5, June.
  • Amigó, E. and A. Delgado. 2022. Evaluating extreme hierarchical multi-label classification. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5809–5819, Dublin, Ireland, May. Association for Computational Linguistics.
  • Bolsover, G. and P. Howard. 2017. Computational Propaganda and Political Big Data: Moving Toward a More Critical Research Agenda. Big Data, 5(4):273–276, December.
  • Bolt, N. 2012. The violent image: insurgent propaganda and the new revolutionaries. Hurst & Company, London. OCLC: 1233055622.
  • Cañete, J., G. Chaperon, R. Fuentes, J.-H. Ho, H. Kang, and J. Pérez. 2020. Spanish pre-trained BERT model and evaluation data. PML4DC at ICLR, 2020(2020):1–10.
  • Colley, T. 2020. Strategic narratives and war propaganda. In P. Baines, N. O’Shaughnessy, and N. Snow, editors, The SAGE handbook of propaganda. SAGE, London, pages 491–508.
  • Conneau, A., K. Khandelwal, N. Goyal, V. Chaudhary, G. Wenzek, F. Guzmán, E. Grave, M. Ott, L. Zettlemoyer, and V. Stoyanov. 2020. Unsupervised Cross-lingual Representation Learning at Scale, April. arXiv:1911.02116 [cs].
  • Da San Martino, G., S. Yu, A. Barrón-Cedeño, R. Petrov, and P. Nakov. 2019. Fine-Grained Analysis of Propaganda in News Article. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5636–5646, Hong Kong, China, November. Association for Computational Linguistics.
  • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805 [cs].
  • García-Díaz, J. A., P. J. Vivancos-Vicente, Á. Almela, and R. Valencia-García. 2022. UMUTextStats: A linguistic feature extraction tool for Spanish. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, and S. Piperidis, editors, Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 6035–6044, Marseille, France, June. European Language Resources Association.
  • Jacob, B., S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko. 2017. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference, December.
  • Jiang, A. Q., A. Sablayrolles, A. Roux, A. Mensch, B. Savary, C. Bamford, D. S. Chaplot, D. de las Casas, E. B. Hanna, F. Bressand, G. Lengyel, G. Bour, G. Lample, L. R. Lavaud, L. Saulnier, M.-A. Lachaux, P. Stock, S. Subramanian, S. Yang, S. Antoniak, T. L. Scao, T. Gervet, T. Lavril, T. Wang, T. Lacroix, and W. E. Sayed. 2024. Mixtral of Experts, January.
  • Jowett, G. and V. O’Donnell. 2015. Propaganda & persuasion. SAGE, Thousand Oaks, Calif, sixth edition.
  • Liu, Y., M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov. 2019. RoBERTa: A Robustly Optimized BERT Pretraining Approach, July. arXiv:1907.11692 [cs].
  • Miskimmon, A., B. O’Loughlin, and L. Roselle. 2013. Strategic narratives: communication power and the new world order. Number 3 in Routledge studies in global information, politics and society. Routledge, Taylor & Francis Group, New York; London.
  • Moral, P. 2023. Restoring reputation through digital diplomacy: the European Union’s strategic narratives on Twitter during the COVID-19 pandemic. Communication & Society, pages 241–269, April.
  • Moral, P. 2024. A tale of heroes and villains: Russia’s strategic narratives on twitter during the covid-19 pandemic. Journal of Information Technology & Politics, 21(2):146–165.
  • Moral, P. and G. Marco. 2023. Assembling stories tweet by tweet: strategic narratives from Chinese authorities on Twitter during the COVID-19 pandemic. Communication Research and Practice, 9(2):159–183, April.
  • Moral, P., G. Marco, J. Gonzalo, J. Carrillo-de Albornoz, and I. Gonzalo-Verdugo. 2023. Overview of DIPROMATS 2023: automatic detection and characterization of propaganda techniques in messages from diplomats and authorities of world powers. Procesamiento del Lenguaje Natural, 71, September.
  • Nguyen, D. Q., T. Vu, and A. Tuan Nguyen. 2020. BERTweet: A pre-trained language model for English Tweets. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pages 9–14, Online. Association for Computational Linguistics.
  • Pérez, J. M., D. A. Furman, L. Alonso Alemany, and F. M. Luque. 2022. RoBERTuito: a pre-trained language model for social media text in Spanish. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 7235–7243, Marseille, France, June. European Language Resources Association.
  • Richards, J. 2023. The Use of Discourse Analysis in Propaganda Detection and Understanding. In Routledge Handbook of Disinformation and National Security. Routledge, London, 1 edition, October, pages 385–400.
  • Riessman, C. K. 2008. Narrative methods for the human sciences. Sage Publications, Los Angeles.
  • Sparkes-Vian, C. 2019. Digital Propaganda: The Tyranny of Ignorance. Critical Sociology, 45(3):393–409, May.
  • Zhang, X., Y. Malkov, O. Florez, S. Park, B. McWilliams, J. Han, and A. El-Kishky. 2023. TwHIN-BERT: A Socially-Enriched Pre-trained Language Model for Multilingual Tweet Representations at Twitter, August. arXiv:2209.07562 [cs].