Comparativa de aproximaciones a SVM semisupervisado multiclase para clasificación de páginas Web [A comparison of approaches to semi-supervised multiclass SVM for web page classification]

  1. Zubiaga, Arkaitz
  2. Fresno Fernández, Víctor
  3. Martínez Unanue, Raquel
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2009

Issue: 42

Pages: 63-70

Type: Article

Abstract

In this paper we present a study of semi-supervised multiclass web page classification using SVMs. Because classical SVM algorithms are binary and supervised, and in order to avoid complex optimization problems, we propose an approach based on combining classifiers: not only binary semi-supervised classifiers but also multiclass supervised ones. The results of our experiments on three benchmark datasets show noticeably higher performance for the combination of multiclass supervised classifiers. We also analyze the contribution of unlabeled documents to the learning process in this setting. In our case, and unlike in binary tasks, multiclass tasks reach higher effectiveness when no unlabeled documents are taken into account.