Comparativa de aproximaciones a SVM semisupervisado multiclase para clasificación de páginas Web [A comparison of approaches to multiclass semi-supervised SVM for web page classification]

  1. Zubiaga, Arkaitz
  2. Fresno Fernández, Víctor
  3. Martínez Unanue, Raquel
Journal: Procesamiento del lenguaje natural

ISSN: 1135-5948

Publication date: 2009

Issue: 42

Pages: 63-70

Type: Article

Abstract

In this paper we present a study of semi-supervised multiclass web page classification using SVMs. Because classical SVM algorithms are binary and supervised, and in order to avoid complex optimization problems, we propose an approach based on combining classifiers: not only binary semi-supervised classifiers but also multiclass supervised ones. Experiments on three benchmark datasets show noticeably higher performance for the combination of multiclass supervised classifiers. We also analyze the contribution of unlabeled documents to the learning process in this setting. In our case, and unlike in binary tasks, multiclass tasks reach higher effectiveness when no unlabeled documents are taken into account.
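
The abstract does not say how the binary classifiers are combined into a multiclass decision. One common scheme is one-vs-rest: train one binary scorer per class and predict the class whose scorer returns the largest margin. The sketch below illustrates only that combination step, under stated assumptions: the names `linear_scorer` and `one_vs_rest_predict`, the class labels, and the weights are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a binary scorer maps a feature vector to a signed margin
# (a larger value means stronger evidence for its class).
BinaryScorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> BinaryScorer:
    """Build a linear decision function f(x) = w . x + b (illustrative only)."""
    def score(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def one_vs_rest_predict(scorers: Dict[str, BinaryScorer], x: Sequence[float]) -> str:
    """Return the label whose binary scorer gives the largest margin for x."""
    return max(scorers, key=lambda label: scorers[label](x))


# Toy labels and weights, invented for illustration; not from the paper.
scorers = {
    "course": linear_scorer([1.0, -0.5], 0.0),
    "faculty": linear_scorer([-0.5, 1.0], 0.0),
    "project": linear_scorer([-1.0, -1.0], 0.5),
}

prediction = one_vs_rest_predict(scorers, [2.0, 0.5])
```

In a real setting each scorer would be the decision function of a trained binary SVM, supervised or semi-supervised, rather than hand-set weights.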