UNED LSI en TASS 2013: Consideraciones acerca de la representación textual para la clasificación de tweets basada en recuperación de Información

  1. Ángel Castellanos González 1
  2. Juan Cigarrán Recuero 1
  3. Ana García Serrano 1
  1. 1 Universidad Nacional de Educación a Distancia, Madrid, Spain. ROR https://ror.org/02msb5n36

Book:
XXIX Congreso de la Sociedad Española de Procesamiento de Lenguaje Natural: SEPLN 2013
  1. Alberto Díaz Esteban (coord.)
  2. Iñaki Alegria Loinaz (coord.)
  3. Julio Villena Román (coord.)

Publisher: Sociedad Española para el Procesamiento del Lenguaje Natural

ISBN: 978-84-695-8349-4

Year of publication: 2013

Pages: 213-219

Conference: Sociedad Española para el Procesamiento del Lenguaje Natural. Congreso (29. 2013. Madrid)

Type: Conference contribution

Abstract

This article summarizes our participation in TASS 2013, which extends the work we carried out for TASS 2012. That earlier work addressed tweet classification through an Information Retrieval (IR) approach: each class is modeled from the textual content of the tweets that belong to it, and each tweet to be classified is used as a query against those class models. This year we applied the approach to the Sentiment Analysis and Topic Classification tasks, focusing on which types of tweet information to use for classification and how that information should be processed. To this end, we propose several ways of modeling the classes and several ways of running the retrieval process, depending on the type of information used. The results suggest that, although specific types of information (especially named entities) are valuable, they should always be used in conjunction with the overall content of the tweets.
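
The IR-based classification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not name a retrieval model, so the TF-IDF weighting, the cosine ranking, the whitespace tokenizer, and the function names `build_class_index` and `classify` are all assumptions made for illustration. Each class becomes one pseudo-document built from its training tweets, and the tweet to classify is issued as a query against those pseudo-documents.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain lowercase whitespace tokenization.
    return text.lower().split()


def build_class_index(labeled_tweets):
    """Model each class as one TF-IDF weighted pseudo-document.

    labeled_tweets is an iterable of (text, label) pairs.
    """
    class_tf = defaultdict(Counter)
    for text, label in labeled_tweets:
        class_tf[label].update(tokenize(text))

    # Document frequency is counted over class pseudo-documents.
    df = Counter()
    for tf in class_tf.values():
        df.update(tf.keys())

    n_classes = len(class_tf)
    # Smoothed IDF (an assumed weighting, not taken from the paper).
    idf = {t: math.log((1 + n_classes) / (1 + d)) + 1.0 for t, d in df.items()}
    index = {
        label: {t: count * idf[t] for t, count in tf.items()}
        for label, tf in class_tf.items()
    }
    return index, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(tweet, index, idf):
    """Use the tweet as a query and return the best-ranked class and all scores."""
    tf = Counter(tokenize(tweet))
    # Terms never seen in training carry no weight in the query.
    query = {t: count * idf[t] for t, count in tf.items() if t in idf}
    scores = {label: cosine(query, vec) for label, vec in index.items()}
    return max(scores, key=scores.get), scores


# Toy training data, invented for this example.
training = [
    ("great match for the team today", "sports"),
    ("the team won the league", "sports"),
    ("parliament votes on the new budget", "politics"),
    ("the president addressed parliament", "politics"),
]
index, idf = build_class_index(training)
label, scores = classify("the team played a great match", index, idf)
print(label)
```

Under the same assumptions, restricting the tokens to a single information type (for example, only named entities) would yield one of the alternative class models the abstract alludes to, and the reported findings suggest combining such a model with the full tweet content rather than using it alone.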