A framework of analysis for the evaluation of automatic term extractors

  1. Carlos Periñán-Pascual 1
  2. Ricardo Mairal Usón 2
  1. 1 Universidad Politécnica de Valencia
    Valencia, Spain

    ROR https://ror.org/01460j859

  2. 2 Universidad Nacional de Educación a Distancia
    Madrid, Spain

    ROR https://ror.org/02msb5n36

Journal:
VIAL, Vigo international journal of applied linguistics

ISSN: 1697-0381

Year of publication: 2018

Issue: 15

Pages: 105-126

Type: Article

Funding information

Financial support for this research has been provided by the Spanish Ministry of Economy, Competitiveness and Science, grant FFI2014-53788-C3-1-P.

Funders

    • FFI2014-53788-C3-1-P

References

  • Ahrenberg, L. (2009) Term extraction: A review. Retrieved from http://www.ida.liu.se/~lah/Publications/tereview_v2.pdf
  • Barcala, M., Domínguez-Noya, E., Gamallo, P., López, M., Moscoso, E., Rojo, G., Santalla, P., and Sotelo, S. (2007) A corpus and lexical resources for multi-word terminology extraction in the field of economy. In Proceedings of the 3rd Language and Technology Conference, Poznan: 355-359.
  • Bouamor, D., Semmar, N., and Zweigenbaum, P. (2012) Identifying bilingual multi-word expressions for statistical machine translation. In Proceedings of the 8th International Conference on Language Resources and Evaluation. Istanbul: European Language Resources Association, 674-679.
  • Carrión Delgado, M. G. (2012) Extracción y análisis de unidades léxico-conceptuales del dominio jurídico: un acercamiento metodológico desde FunGramKB. RaeL 11: 25-39.
  • Church, K.W., and Hanks, P. (1990) Word association norms, mutual information and lexicography. Computational Linguistics 16 (1): 22-29.
  • Cortés, F. J., and Mairal, R. (2016) Building an RRG computational grammar. Onomázein 34: 86-117.
  • Diedrichsen, E. (2014) A Role and Reference Grammar Parser for German. In B. Nolan and C. Periñán-Pascual (eds.) Language Processing and Grammars. Amsterdam/Philadelphia: John Benjamins, 105-142.
  • Drouin, P. (2003) Term extraction using non-technical corpora as a point of leverage. Terminology 9 (1): 99-117.
  • Dunning, T. (1993) Accurate methods for the statistics of surprise and coincidence. Computational Linguistics 19 (1): 61-74.
  • EAGLES Work Group. (1999) EAGLES evaluation of natural language processing systems. Technical Report. Center for Sprogteknologi, Copenhagen.
  • Everitt, B. (1992) The Analysis of Contingency Tables. London: Chapman & Hall/CRC.
  • Fan, X., Shimizu, N., and Nakagawa, H. (2009) Automatic extraction of bilingual terms from a Chinese-Japanese parallel corpus. In Proceedings of the 3rd International Universal Communication Symposium, 41-45.
  • Felices Lago, A., and Ureña Gómez-Moreno, P. (2012) Fundamentos metodológicos de la creación subontológica en FunGramKB. Onomázein 26: 49-67.
  • Gaizauskas, R., Paramita, M.L., Barker, E., Pinnis, M., Aker, A., and Pahisa Solé, M. (2015) Extracting bilingual terms from the Web. Terminology 21 (2): 205-236.
  • Guest, E. (2009) Parsing using the Role and Reference Grammar paradigm. [http://eprints.leedsbeckett.ac.uk/778/6/Parsing%20Using%20the%20Role%20and%20Reference%20Grammar%20Paradigm.pdf, accessed 19 February 2016].
  • ISO/IEC 25010. (2011) Systems and Software Engineering – Systems and Software Quality Requirements and Evaluation (SQuaRE) – System and Software Quality Models. Geneva: International Organization for Standardization/International Electrotechnical Commission.
  • ISO/IEC 9126-1. (2001) Information Technology – Software Product Quality. Part 1: Quality Model. Geneva: International Organization for Standardization/International Electrotechnical Commission.
  • Lee, L., Aw, A., Zhang, M., and Li, H. (2010) EM-based hybrid model for bilingual terminology extraction from comparable corpora. In Proceedings of the 23rd International Conference on Computational Linguistics, 639-646.
  • Lee, J., and Paek, I. (2014) In search of the optimal number of response categories in a rating scale. Journal of Psychoeducational Assessment 32: 663-673.
  • Lefever, E., Macken, L., and Hoste, V. (2009) Language-independent bilingual terminology extraction from a multilingual parallel corpus. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, Athens, 496-504.
  • Lossio-Ventura, J.A., Jonquet, C., Roche, M., and Teisseire, M. (2014a) BioTex: a system for biomedical terminology extraction, ranking and validation. In Proceedings of the 13th International Semantic Web Conference, 157-160.
  • Lossio-Ventura, J.A., Jonquet, C., Roche, M., and Teisseire, M. (2014b) Towards a mixed approach to extract biomedical terms from text corpus. International Journal of Knowledge Discovery in Bioinformatics 4 (1): 1-15.
  • Lossio-Ventura, J.A., Jonquet, C., Roche, M., and Teisseire, M. (2014c) Yet another ranking function to automatic multi-word term extraction. In Proceedings of the 9th International Conference on Natural Language Processing, Warsaw.
  • Mairal-Usón, R., Guerrero, L., and González, C. (eds.) (2012) El funcionalismo en la teoría lingüística. La Gramática del Papel y la Referencia. Introducción, avances y aplicaciones. Madrid: Akal.
  • Mairal-Usón, R., and Periñán-Pascual, C. (2009) The anatomy of the lexicon component within the framework of a conceptual knowledge base. Revista Española de Lingüística Aplicada 22: 217-244.
  • Nagao, M., Mizutani, M., and Ikeda, H. (1976) An automated method of the extraction of important words from Japanese scientific documents. Transactions of the Information Processing Society of Japan 17 (2): 110-117.
  • Nolan, B., and Periñán-Pascual, C. (eds.) (2014) Language Processing and Grammars. Amsterdam: John Benjamins.
  • Nolan, B., and Salem, Y. (2011) UniArab: RRG Arabic-to-English machine translation. In W. Nakamura (ed.) New Perspectives in Role and Reference Grammar. Newcastle upon Tyne: Cambridge Scholars, 312-346.
  • Park, Y., Byrd, R. J., and Boguraev, B. (2002) Automatic glossary extraction: beyond terminology identification. In Proceedings of the 19th International Conference on Computational Linguistics. Taipei: Howard International House and Academia Sinica, 1-7.
  • Periñán-Pascual, C. (2013) A knowledge-engineering approach to the cognitive categorization of lexical meaning. VIAL: Vigo International Journal of Applied Linguistics 10: 85-104.
  • Periñán-Pascual, C. (2015) The underpinnings of a composite measure for automatic term extraction: the case of SRC. Terminology 21 (2): 151-179.
  • Periñán-Pascual, C., and Arcas Túnez, F. (2004) Meaning postulates in a lexico-conceptual knowledge base. In Proceedings of the 15th International Workshop on Databases and Expert Systems Applications. Los Alamitos: Institute of Electrical and Electronics Engineers, 38-42.
  • Periñán-Pascual, C., and Arcas Túnez, F. (2005) Microconceptual-Knowledge Spreading in FunGramKB. In Proceedings of the 9th IASTED International Conference on Artificial Intelligence and Soft Computing. Anaheim-Calgary-Zurich: ACTA Press, 239-244.
  • Periñán-Pascual, C., and Arcas Túnez, F. (2007) Cognitive modules of an NLP knowledge base for language understanding. Procesamiento del Lenguaje Natural 39: 197-204.
  • Periñán-Pascual, C., and Arcas Túnez, F. (2010) Ontological commitments in FunGramKB. Procesamiento del Lenguaje Natural 44: 27-34.
  • Periñán-Pascual, C., and Arcas Túnez, F. (2014) La ingeniería del conocimiento en el dominio legal: La construcción de una Ontología Satélite en FunGramKB. Revista Signos: Estudios de Lingüística 47 (84): 113-139.
  • Preston, C.C., and Colman, A.M. (2000) Optimal number of response categories in rating scales: reliability, validity, discriminating power, and respondent preferences. Acta Psychologica 104: 1-15.
  • Saaty, T. L. (1977) A scaling method for priorities in a hierarchical structure. Journal of Mathematical Psychology 15: 234-281.
  • Saaty, T. L. (1980) The Analytic Hierarchy Process. New York: McGraw-Hill.
  • Salem, Y., Hensman, A., and Nolan, B. (2008) Towards Arabic to English machine translation. ITB Journal 17: 20-31.
  • Sauron, V. (2002) Tearing out the terms: evaluating term extractors. In Proceedings of the Aslib Conference Translating and the Computer 24. London: The Association for Information Management, 1-18.
  • Silva, J.F., and Lopes, G.P. (1999) A local maxima method and a fair dispersion normalization for extracting multiword units. In Proceedings of the 6th Meeting on the Mathematics of Language, Orlando, 369-381.
  • Tzeng, G. H., and Huang, J. J. (2011) Multiple Attribute Decision Making: Methods and Applications. Boca Raton: CRC Press.
  • Van Valin, R. D. Jr. (2005) Exploring the Syntax-Semantics Interface. Cambridge: Cambridge University Press.
  • Van Valin, R. D. Jr., and Mairal Usón, R. (2014) Interfacing the Lexicon and an Ontology in a Linking Algorithm. In M. Ángeles Gómez, F. Ruiz de Mendoza, and F. Gonzálvez-García (eds.) Theory and Practice in Functional-Cognitive Space. Amsterdam: John Benjamins, 205-228.
  • Zielinski, D., and Safar, Y. R. (2005) T-survey 2005: An online survey on terminology extraction and terminology management. In Proceedings of the Aslib Conference Translating and the Computer 27. London: The Association for Information Management, 1-27.