Web Page Classification using an ensemble of support vector machine classifiers
Title | Web Page Classification using an ensemble of support vector machine classifiers |
Authors | |
Abstract | Web Page Classification (WPC) is an important and challenging topic in data mining. WPC helps users obtain usable information from the huge volume of Internet data automatically and efficiently. Much effort has been devoted to WPC, but current approaches still leave room for improvement. One particular challenge in training classifiers is that the available dataset is usually imbalanced. Standard machine learning algorithms tend to be overwhelmed by the majority class and to ignore the minority class, which leads to a high false negative rate. This paper proposes a novel approach to Web page classification that addresses this problem with an ensemble of support vector machine classifiers. Principal Component Analysis (PCA) is used for feature reduction and Independent Component Analysis (ICA) for feature selection. The experimental results indicate that the proposed approach outperforms other existing classifiers widely used in WPC. |
Publisher | ACADEMY PUBLISHER |
Date | 2011-11-01 |
Source | Journal of Networks Vol 6, No 11 (2011) |
Rights | Copyright © ACADEMY PUBLISHER - All Rights Reserved. To request permission, please see http://www.academypublisher.com/copyrightpermission.html. |
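
The abstract names the core idea (an ensemble of support vector machine classifiers for imbalanced data) but does not say how the ensemble is built. The sketch below shows one common construction and is an assumption, not the authors' method: the majority class is split into chunks the size of the minority class, one linear SVM is trained on each balanced subset, and predictions are combined by majority vote. The function names (`train_linear_svm`, `train_ensemble`, `predict`), the hyperparameters, and the synthetic 2-D points are all hypothetical. The PCA and ICA steps are omitted, and the SVM is a plain subgradient-descent stand-in rather than a production solver.

```python
import random

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Full-batch subgradient descent on the L2-regularised hinge loss; labels in {-1, +1}."""
    n, dim = len(X), len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1.0:
                # Margin violation: the hinge loss contributes -y*x (and -y for the bias).
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def train_ensemble(majority, minority):
    """One SVM per balanced subset: a chunk of the majority class plus all minority samples."""
    k = len(minority)
    models = []
    for start in range(0, len(majority), k):
        chunk = majority[start:start + k]
        X = chunk + minority
        y = [-1] * len(chunk) + [1] * len(minority)
        models.append(train_linear_svm(X, y))
    return models

def predict(models, x):
    """Majority vote over ensemble members; returns +1 (minority class) or -1 (majority class)."""
    votes = sum(
        1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
        for w, b in models
    )
    return 1 if votes >= 0 else -1

# Synthetic, imbalanced toy data (assumption): 30 majority points vs. 10 minority points.
rng = random.Random(0)
majority = [[-2 + rng.uniform(-0.5, 0.5), -2 + rng.uniform(-0.5, 0.5)] for _ in range(30)]
minority = [[2 + rng.uniform(-0.5, 0.5), 2 + rng.uniform(-0.5, 0.5)] for _ in range(10)]
models = train_ensemble(majority, minority)
```

Because every member sees all minority samples alongside an equally sized slice of the majority class, no single classifier is dominated by the majority class, which is the failure mode the abstract describes. In a real pipeline the inputs would be page feature vectors after dimensionality reduction, not 2-D points.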