Recommender systems in social settings: proposal, development and testing in real scenarios

Castellanos González, Ángel

Supervised by:

Ana M. García Serrano, Supervisor
Juan Manuel Cigarrán Recuero, Supervisor

Defense university: UNED. Universidad Nacional de Educación a Distancia

Defense date: 30 November 2015

Examination committee:

Lourdes Araujo, Chair
Federico Alvarez García, Secretary
Andreas Nürnberger, Member

Type: Thesis

Teseo: 500633

Abstract

Since the earliest work on recommender systems, the main aim of this research area has been to help users find relevant content among the overwhelming amount of data available on the Web. Research interest in recommender systems started in the 1990s with the rise of the Internet and the growth in available data that it entailed. Nowadays, with the explosion of user-generated content in the context of Web 2.0, the need for recommender systems is the same as in the 1990s, but the problems they have to face are more challenging every day. This context of user-generated content and the social web hinders the implementation of recommender systems, one of the most acute problems being the accurate modelling of user preferences. Early works in the literature mainly addressed this issue from the perspective of Collaborative Filtering systems; however, the use of Content-based features is becoming more widespread. Among these Content-based systems, most works in the literature model the user and item dimensions separately: user profiles are analysed and modelled according to their Content-based features, and the items most closely related to this model are then retrieved. This methodology introduces the problem of the user-item gap, i.e., the gap between both representation spaces. To overcome this problem, this thesis proposes a common representation space for recommendation. Modelling both dimensions together in a common representation space appears to be, conceptually, the most sensible choice. In particular, we propose a concept-based user-item modelling generated through the application of Formal Concept Analysis (FCA). Our main hypothesis is that the concept-based abstraction of user and item profiles that FCA generates will facilitate a better identification of user-item relationships, which can be understood as user preferences.
Therefore, users and items will be represented in a common space by means of the unfolding user preferences (in the form of formal concepts), hierarchically organised in a natural way according to their specificity. In this way, the proposal is expected to overcome the user-item gap problem, thus improving the recommendation process. In order to test our claim, we have evaluated the components of our proposal in isolation. The rationale is to first evaluate the performance of FCA for data representation and then evaluate this representation when applied to the recommendation task. To that end, we have applied the proposed FCA-based modelling to two different scenarios independently of the recommendation task (Topic Detection @ RepLab 2013 and Image Diversification @ MediaEval 2014 and 2015). The evaluation of FCA in these scenarios shows its overall suitability, achieving state-of-the-art results in both. This evaluation also shows that, in contrast to other proposals in the literature, our system is barely affected by the different parameters related to its operation. Finally, we have carried out an extensive comparison with other well-known data representation methodologies (namely, Hierarchical Agglomerative Clustering and Latent Dirichlet Allocation) in relation to the quality of the generated representations. As shown by this comparison, the FCA-based representation is of higher quality and behaves more homogeneously than the other methodologies. In a later step, we have extended this modelling by integrating semantic features related to the item content. Not only does this enhanced model improve the modelling step, but it also enables a higher-level and more abstract representation, which results in a lighter and more compact model. This aspect helps to overcome the challenges related to the application of our proposal to social-based real scenarios (i.e., Topic Detection @ RepLab 2013).
Finally, we have applied our FCA-based model to the recommendation task. We first conducted a preliminary experimentation to test the suitability of our proposal in social-based recommendation scenarios (NEWSREEL 2014 and ESWC LOD-RecSys 2014). From the analysis of the outcome of this preliminary experimentation, we have refined our FCA-based recommendation approach to create a common representation space for recommendation. Through its evaluation, carried out in different social-based scenarios (UMAP 2011 Dataset and ESWC LOD-RecSys 2015), we have analysed the different aspects involved in the recommendation process, showing that, when available, higher-level semantic features entail more accurate recommendations than raw textual descriptions. We have also confirmed that, as stated by other experimental works in the literature, systems using Content-based features outperform Collaborative Filtering systems in these social-based environments. Finally, this extensive analysis confirms our initial hypothesis regarding our proposal. The high performance of our model for data representation is maintained when it is applied to the recommendation task. In particular, our FCA-based common representation space outperforms other recommender systems reported in the literature for the addressed tasks.
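
To make the idea of a concept-based common representation space more concrete, the following is a minimal Python sketch of Formal Concept Analysis over a toy binary context in which users and items share the same attribute space. It is an illustration only, not the thesis implementation: the context, the `user:`/`item:` naming convention, and the functions `formal_concepts` and `recommend` are all assumptions made for this example, and the recommendation rule (take the items in the most specific concept that contains the user) is a deliberately simple stand-in.

```python
def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a binary context.

    `context` maps each object to the set of attributes it has.
    Every intent is an intersection of object attribute sets, so the
    intents are built incrementally as a closure under intersection.
    """
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs}  # intent of the (possibly empty) bottom concept
    for attrs in context.values():
        attrs = frozenset(attrs)
        intents |= {attrs & intent for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every object that has all attributes of the intent.
        extent = frozenset(o for o, a in context.items() if intent <= a)
        concepts.append((extent, intent))
    return concepts


def recommend(context, user):
    """Hypothetical rule: items sharing the user's most specific concept."""
    containing = [c for c in formal_concepts(context) if user in c[0]]
    extent, _ = max(containing, key=lambda c: len(c[1]))
    return sorted(o for o in extent if o.startswith("item:"))


# Toy context (invented data): users and items described by the same features.
context = {
    "user:ana": {"jazz", "live"},
    "user:luis": {"jazz", "vinyl"},
    "item:concert": {"jazz", "live"},
    "item:record": {"jazz", "vinyl"},
    "item:podcast": {"news"},
}

concepts = formal_concepts(context)
ana_items = recommend(context, "user:ana")
luis_items = recommend(context, "user:luis")
```

In this toy context, the concept with intent `{"jazz"}` groups both users and both music items, while the more specific concepts below it separate the "live" and "vinyl" preferences. This mirrors, in a very reduced form, the hierarchical organisation by specificity that the abstract describes.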