Rantanplan, Fast and Accurate Syllabification and Scansion of Spanish Poetry

  1. Javier de la Rosa
  2. Álvaro Pérez
  3. Laura Hernández
  4. Salvador Ros
  5. Elena González-Blanco García
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Publication date: 2020

Issue: 65

Pages: 83-90

Type: Article


Abstract

Automated analysis of Spanish poetry corpora lacks the rich tooling available for English. Existing options suffer from several issues: they are limited to fixed-metre hendecasyllabic verses, they are not publicly available, their underlying syllabification procedure is not thoroughly tested, and their speed is questionable. This paper introduces new methods to alleviate these concerns. For syllabification, we contribute our own method and a manually crafted corpus. For scansion, our approach is based on a heuristic for applying the rhetorical figures that alter metrical length. Experimental evaluation shows that both fixed-metre and mixed-metre poetry can be analyzed successfully, producing metrical patterns more accurately (accuracy increases of 2% and 15%, respectively) and in a fraction of the time other methods need (running at least 100 times faster).
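
To illustrate what "rhetorical figures that alter metrical length" means in practice, here is a minimal Python sketch. It is not the Rantanplan implementation or its heuristic: it assumes the verse is already split into syllables by hand, applies synalepha (merging a word-final vowel with a following word-initial vowel) at every possible boundary, and then applies the standard final-stress adjustment of Spanish metrics. The function names and the `final_stress` parameter are hypothetical, and cases such as the conjunction "y" or words starting with "hie-" are deliberately ignored.

```python
# Naive illustration only: hand-syllabified input, synalepha applied everywhere.
VOWELS = set("aeiouáéíóúü")


def starts_with_vowel(syllable):
    """True if the syllable begins with a vowel sound (a leading silent 'h' is skipped)."""
    s = syllable.lower()
    if s.startswith("h"):
        s = s[1:]
    return bool(s) and s[0] in VOWELS


def ends_with_vowel(syllable):
    """True if the syllable ends in a vowel letter."""
    s = syllable.lower()
    return bool(s) and s[-1] in VOWELS


def metrical_length(words, final_stress="paroxytone"):
    """Count metrical syllables of a verse.

    words: list of words, each a list of syllables (assumed to be correct).
    final_stress: stress type of the last word, one of
        "oxytone" (+1), "paroxytone" (0), "proparoxytone" (-1).
    """
    count = sum(len(word) for word in words)
    # Synalepha: one syllable fewer for each vowel-to-vowel word boundary.
    for prev, nxt in zip(words, words[1:]):
        if ends_with_vowel(prev[-1]) and starts_with_vowel(nxt[0]):
            count -= 1
    adjustment = {"oxytone": 1, "paroxytone": 0, "proparoxytone": -1}
    return count + adjustment[final_stress]


# Garcilaso: "Cuando me paro a contemplar mi estado"
verse = [["Cuan", "do"], ["me"], ["pa", "ro"], ["a"],
         ["con", "tem", "plar"], ["mi"], ["es", "ta", "do"]]
print(metrical_length(verse))  # → 11
```

The verse has 13 phonological syllables; the two synalephas ("paro a", "mi estado") bring it down to the 11 of a hendecasyllable. A mixed-metre scanner cannot assume the target length in advance, which is why deciding when to apply such figures needs a heuristic rather than the blanket rule used here.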

Funding information

This research was supported by the project Poetry Standardization and Linked Open Data (POSTDATA) (ERC-2015-STG-679528), obtained by Elena González-Blanco and funded by a European Research Council (https://erc.europa.eu) Starting Grant under the Horizon 2020 Program of the European Union.
