UNED LSI en TASS 2013: Consideraciones acerca de la representación textual para la clasificación de tweets basada en recuperación de Información
- Ángel Castellanos González 1
- Juan Cigarrán Recuero 1
- Ana García Serrano 1
1 Universidad Nacional de Educación a Distancia
- Alberto Díaz Esteban (coord.)
- Iñaki Alegria Loinaz (coord.)
- Julio Villena Román (coord.)
Publisher: Sociedad Española para el Procesamiento del Lenguaje Natural
ISBN: 978-84-695-8349-4
Year of publication: 2013
Pages: 213-219
Conference: Sociedad Española para el Procesamiento del Lenguaje Natural. Congreso (29. 2013. Madrid)
Type: Conference contribution
Abstract
This article summarizes our participation in TASS 2013, which extends the work done for TASS 2012. The previous year's work focused on tweet classification using an Information Retrieval (IR) approach: each class is modeled from the textual information of the tweets belonging to it, and each tweet to be classified is used as a query. This year we applied the approach to the Sentiment Analysis and Topic Classification tasks, focusing on which types of tweet information to use for classification and how that information should be taken into account. To this end, we propose different ways of modeling the classes and different ways of performing the retrieval process according to the type of information. The results suggest that although this type of information is valuable (especially named entities), it should always be used in conjunction with the overall content of the tweets.
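
The IR-based classification idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: this record does not state the retrieval engine, weighting scheme, or preprocessing they used, so the TF-IDF weighting, the cosine ranking, the whitespace tokenizer, the function names, and the toy training tweets below are all assumptions made for illustration. Each class is represented by the pooled text of its training tweets, and the tweet to classify acts as the query.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the real TASS corpus is not reproduced here.
TOY_TRAINING = [
    ("great match and a brilliant goal", "positive"),
    ("love this team great win", "positive"),
    ("terrible service awful delay", "negative"),
    ("awful match terrible defeat", "negative"),
]

def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()

def build_class_documents(labeled_tweets):
    # Pool the term counts of all tweets with the same label into one "class document".
    docs = defaultdict(Counter)
    for text, label in labeled_tweets:
        docs[label].update(tokenize(text))
    return dict(docs)

def idf_weights(class_docs):
    # Smoothed inverse document frequency over the class documents.
    n = len(class_docs)
    df = Counter()
    for counts in class_docs.values():
        df.update(counts.keys())
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}

def tfidf(counts, idf):
    # Terms never seen in training are dropped from the vector.
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def classify(tweet, class_docs, idf):
    # The tweet is the query; the best-ranked class document gives the label.
    query = tfidf(Counter(tokenize(tweet)), idf)
    scores = {
        label: cosine(query, tfidf(counts, idf))
        for label, counts in class_docs.items()
    }
    return max(scores, key=scores.get), scores

class_docs = build_class_documents(TOY_TRAINING)
idf = idf_weights(class_docs)
label, scores = classify("what a great win", class_docs, idf)
print(label, scores)
```

In this sketch the class documents are built from the whole tweet text only. The abstract's finding about named entities would correspond to adding a second, entity-only representation and combining its score with the full-text score, but how the authors combined them is not given in this record.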