Recommender systems and time context: Characterization of a robust evaluation protocol to increase reliability of measured improvements

Campos, Pedro G.

Supervised by:

Fernando Rubio Díez, Supervisor
Iván Cantador, Supervisor

Defending university: Universidad Autónoma de Madrid

Defense date: 14 October 2013

Examination committee:

Roberto Moriyón Salomón, Chair
Pablo Castells Azpilicueta, Secretary
Julio Gonzalo Arroyo, Member
Roi Blanco González, Member
Juan Francisco Huete Guadix, Member

Type: Thesis

Teseo: 351435

Abstract

Recommender Systems (RS) aim to help users with information access and retrieval tasks by suggesting items (products or services) according to past preferences (interests, tastes) in certain contexts. One of the most studied contexts is the temporal context, which has given rise to an already extensive research area known as Time-Aware Recommender Systems (TARS). Despite the large number of approaches and advances in TARS, the results and conclusions reported in the literature about how to exploit time information appear contradictory. Although several reasons could explain these contradictory findings, in this thesis we hypothesize that TARS evaluation plays a fundamental role. The existence of multiple evaluation methodologies and metrics makes it possible to find an evaluation protocol that suits one particular recommendation approach but is unsuitable or unfavorable for others. This situation hinders fair comparison of the results and conclusions reported in different studies, and complicates the identification of the best recommendation approach for a given task. Moreover, a review of published work shows that most existing TARS have been developed to reduce the error in predicting user preferences (ratings) for items. However, the focus of RS is now shifting towards finding (lists of) items relevant to the target user. In addition, the use of RS in diverse tasks enables new applications in which time context information can serve as a distinctive input. In this thesis we analyze how time context information has been exploited in the RS literature in order to a) characterize a robust protocol that allows fair evaluation of new TARS and facilitates comparisons between published performance results; and b) better exploit time context information in different recommendation tasks.
To accomplish these goals, we have identified key methodological issues in the offline evaluation of TARS, and we propose a methodological framework that allows the conditions used in a TARS evaluation to be described precisely. From the analysis of these conditions, we provide a number of guidelines for the robust evaluation of RS in general, and of TARS in particular. Moreover, we propose adaptations and new methods for different recommendation tasks, based on the proper exploitation of the available time context information. By using fair evaluation settings, we are able to reliably assess the performance of different methods and to identify the circumstances under which some of them outperform the others. In summary, by means of the proposed methodological characterization and the experiments conducted, we show the importance of using a robust evaluation method to measure the performance of TARS, an issue that had not been addressed in depth so far.