Procesamiento del lenguaje natural y fijación del texto: Experiencias en torno a la constitución de un corpus diacrónico de sonetos

  1. Helena Bermúdez Sabel 1
  2. Clara Isabel Martínez Cantón 2
  3. Pablo Ruiz Fabo 3
  1. 1 JinnTec
  2. 2 Universidad Nacional de Educación a Distancia
    Madrid, Spain

    ROR https://ror.org/02msb5n36

  3. 3 University of Strasbourg
    Strasbourg, France

    ROR https://ror.org/00pg6eq24

Book:
Editar el Siglo de Oro en la era digital
  1. Susanna Allés Torrent (coord.)
  2. Eugenia Fosalba Vela (coord.)

Publisher: Servicio de Publicaciones = Servei de Publicacions ; Universidad Autónoma de Barcelona = Universitat Autònoma de Barcelona

ISBN: 978-84-128138-3-8

Year of publication: 2024

Pages: 161-174

Type: Book chapter

Abstract

We present work carried out during the development of DISCO, the Diachronic Spanish Sonnet Corpus, which comprises 4,530 sonnets in Spanish from Europe, Latin America and the Philippines, with texts ranging from the 15th to the 20th century. The resource offers versification annotations obtained automatically with tools based on Natural Language Processing (NLP). In this article, we show how the results of automatic annotation can be exploited to detect errors in textual transmission. Drawing on our experience with DISCO, we offer observations towards the creation of workflows assisted by NLP-based tools. Such tools can help detect possible textual errors, allowing us to focus our manual correction effort on specific passages.