Cross-lingual Training for Multiple-Choice Question Answering

  1. Guillermo Echegoyen
  2. Alvaro Rodrigo
  3. Anselmo Peñas
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2020

Issue: 65

Pages: 37-44

Type: Article


In this work, we explore to what extent multilingual models trained for one language can be applied to a different one in the task of Multiple-Choice Question Answering. We use the RACE dataset to fine-tune both a monolingual and a multilingual model, and apply these models to other collections in different languages. The results show that both the monolingual and the multilingual model can be zero-shot transferred to a different dataset in the same language while maintaining their performance. Moreover, the multilingual model still performs well when applied to a different target language. Additionally, we find that exams that are harder for humans are harder for machines too. Finally, we advance the state of the art on the QA4MRE Entrance Exams dataset in several languages.
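The selection setup the abstract describes can be sketched as follows: each (passage, question, option) triple is scored independently, and the highest-scoring option is chosen. In the paper, a fine-tuned (multilingual) Transformer supplies the scores; here a toy lexical-overlap scorer stands in so the sketch is self-contained — `overlap_score` is a hypothetical placeholder, not the paper's model.

```python
def overlap_score(passage: str, question: str, option: str) -> float:
    """Toy stand-in for a model score: lexical overlap between option and passage.

    A real system would instead feed the (passage, question, option) triple
    through a fine-tuned multiple-choice Transformer head.
    """
    passage_words = set(passage.lower().split())
    option_words = set(option.lower().split())
    return len(passage_words & option_words) / max(len(option_words), 1)


def answer(passage: str, question: str, options: list[str]) -> int:
    """Return the index of the best-scoring option (argmax over the choices)."""
    scores = [overlap_score(passage, question, opt) for opt in options]
    return max(range(len(options)), key=scores.__getitem__)


passage = "The train to Madrid leaves at nine in the morning."
question = "When does the train leave?"
options = ["At nine in the morning", "At noon", "In the evening"]
print(answer(passage, question, options))  # → 0
```

Because only the scorer is language-specific, swapping in a multilingual model scored this way is what allows the zero-shot transfer across languages that the abstract reports.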

Funding information

This work has been funded by the Spanish Research Agency under the CHIST-ERA LIHLITH project (PCIN-2017-085/AEI) and deepReading (RTI2018-096846-B-C21 / MCIU/AEI/FEDER, UE).



Bibliographic References

  • Agerri, R., I. S. Vicente, J. A. Campos, A. Barrena, X. Saralegi, A. Soroa, and E. Agirre. 2020. Give your Text Representation Models some Love: the Case for Basque. arXiv preprint.
  • Artetxe, M., S. Ruder, and D. Yogatama. 2019. On the Cross-lingual Transferability of Monolingual Representations. arXiv preprint.
  • Asai, A., A. Eriguchi, K. Hashimoto, and Y. Tsuruoka. 2018. Multilingual extractive reading comprehension by runtime machine translation. CoRR, abs/1809.03275.
  • Cañete, J., G. Chaperon, R. Fuentes, and J. Pérez. 2020. Spanish Pre-Trained BERT Model and Evaluation Data. To appear in PML4DC at ICLR 2020.
  • Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June. Association for Computational Linguistics.
  • Fader, A., L. Zettlemoyer, and O. Etzioni. 2013. Paraphrase-driven learning for open question answering. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1608–1618, Sofia, Bulgaria, August. Association for Computational Linguistics.
  • Hermann, K. M., T. Kočiský, E. Grefenstette, L. Espeholt, W. Kay, M. Suleyman, and P. Blunsom. 2015. Teaching machines to read and comprehend. In Advances in Neural Information Processing Systems, pages 1693–1701.
  • Hsu, T.-Y., C.-L. Liu, and H.-y. Lee. 2019. Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5933–5940, Hong Kong, China. Association for Computational Linguistics.
  • Kočiský, T., J. Schwarz, P. Blunsom, C. Dyer, K. M. Hermann, G. Melis, and E. Grefenstette. 2018. The NarrativeQA reading comprehension challenge. Transactions of the Association for Computational Linguistics, 6:317–328.
  • Lai, G., Q. Xie, H. Liu, Y. Yang, and E. Hovy. 2017. RACE: Large-scale ReAding Comprehension Dataset From Examinations. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 785–794.
  • Laurent, D., B. Chardon, S. Nègre, C. Pradel, and P. Séguéla. 2015. Reading comprehension at entrance exams 2015. In L. Cappellato, N. Ferro, G. J. F. Jones, and E. SanJuan, editors, Working Notes of CLEF 2015 - Conference and Labs of the Evaluation forum, Toulouse, France, September 8-11, 2015, volume 1391 of CEUR Workshop Proceedings.
  • Laurent, D., B. Chardon, S. Nègre, and P. Séguéla. 2014. French Run of Synapse Développement at Entrance Exams 2014. In CLEF (Working Notes), pages 1415–1426.
  • Li, X., R. Tian, N. L. T. Nguyen, Y. Miyao, and A. Aizawa. 2013. Question Answering System for Entrance Exams in QA4MRE. In CLEF (Working Notes). Citeseer.
  • Martin, L., B. Muller, P. J. O. Suárez, Y. Dupont, L. Romary, E. V. de la Clergerie, D. Seddah, and B. Sagot. 2019. CamemBERT: a Tasty French Language Model. arXiv preprint.
  • Otegi, A., A. Agirre, J. A. Campos, A. Soroa, and E. Agirre. 2020. Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque. In 12th International Conference on Language Resources and Evaluation.
  • Rajpurkar, P., R. Jia, and P. Liang. 2018. Know What You Don't Know: Unanswerable Questions for SQuAD. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 784–789, Melbourne, Australia. Association for Computational Linguistics.
  • Richardson, M., C. J. C. Burges, and E. Renshaw. 2013. MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text. Technical report, November.
  • Rodrigo, A., A. Peñas, Y. Miyao, and N. Kando. 2018. Do systems pass university entrance exams? Information Processing & Management, 54(4):564–575, jul.
  • Rogers, A., O. Kovaleva, M. Downey, and A. Rumshisky. 2020. Getting Closer to AI Complete Question Answering: A Set of Prerequisite Real Tasks.
  • Rogers, A., O. Kovaleva, and A. Rumshisky. 2020. A Primer in BERTology: What we know about how BERT works. arXiv preprint.
  • Trischler, A., T. Wang, X. Yuan, J. Harris, A. Sordoni, P. Bachman, and K. Suleman. 2017. NewsQA: A machine comprehension dataset. In Proceedings of the 2nd Workshop on Representation Learning for NLP, pages 191–200, Vancouver, Canada, August. Association for Computational Linguistics.
  • Vaswani, A., N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin. 2017. Attention is All you Need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30. Curran Associates, Inc., pages 5998–6008.
  • Wang, A., A. Singh, J. Michael, F. Hill, O. Levy, and S. Bowman. 2018. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 353–355, Brussels, Belgium. Association for Computational Linguistics.
  • Yang, Z., Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le. 2019. XLNet: Generalized Autoregressive Pretraining for Language Understanding. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 5754–5764. Curran Associates, Inc.