Clasificación de páginas web en dominio específico
- Rangel Pardo, Francisco Manuel
- Peñas Padilla, Anselmo
ISSN: 1135-5948
Year of publication: 2008
Issue: 41
Pages: 89-96
Type: Article
Other publications in: Procesamiento del lenguaje natural
Abstract
This paper presents a novel representation that achieves high performance in the automatic classification of web pages in specific domains. The study focuses on obtaining a formal representation of the author's intent in conveying information about the web pages they create, as reflected in the page's meta-information, in its link structure, and in its URL. A dataset was built for the specific domain of theater, and the proposed approach obtained a performance rating, measured both by the F-measure and by the interval of the committed error, higher than that of existing state-of-the-art methods.
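As a rough illustration of the three information sources named in the abstract (meta-information, link structure, and URL), the following is a minimal sketch that pulls simple token features from each. It is not the authors' representation, which the abstract does not specify; the names `PageSignals`, `tokenize`, and `page_features`, the choice of the title and the `description`/`keywords` meta tags, and the example URL are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageSignals(HTMLParser):
    """Hypothetical collector for title/meta text, link targets, and anchor text."""

    def __init__(self):
        super().__init__()
        self.meta = []     # text of <title> and <meta name="description|keywords">
        self.links = []    # href values of <a> tags
        self.anchors = []  # visible text inside <a> tags
        self._in_title = False
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() in ("description", "keywords"):
            self.meta.append(attrs.get("content") or "")
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
            self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_title:
            self.meta.append(data)
        elif self._in_anchor:
            self.anchors.append(data)


def tokenize(text):
    # Lowercase alphabetic tokens, including Spanish accented letters.
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


def page_features(url, html):
    """Return token lists for the meta, link, and URL signals of one page."""
    parser = PageSignals()
    parser.feed(html)
    parser.close()
    parsed = urlparse(url)
    return {
        "meta_tokens": tokenize(" ".join(parser.meta)),
        "anchor_tokens": tokenize(" ".join(parser.anchors)),
        "url_tokens": tokenize(parsed.netloc + " " + parsed.path),
        "n_links": len(parser.links),
    }
```

Calling `page_features(url, html)` on a page returns a dictionary of token lists that could then be fed to any standard text classifier; how the paper actually weights or combines these signals is not stated in the abstract.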