Web Page Classification Using Relational Learning Algorithm and Unlabeled Data
Title | Web Page Classification Using Relational Learning Algorithm and Unlabeled Data |
Authors | |
Abstract | This paper investigates the application of relational tri-training (R-tri-training for short) to web page classification. R-tri-training, a new relational semi-supervised learning algorithm, is well suited to this task. Its semi-supervised component allows it to exploit unlabeled web pages to improve learning performance effectively, and its relational component describes how neighboring web pages are related to each other by hyperlinks. Experiments on the Web-Kb dataset show that: 1) R-tri-training can use a large amount of unlabeled web pages (the unlabeled data) to improve the performance of the learned hypothesis; 2) R-tri-training performs better than the other algorithms it was compared with. |
Publisher | ACADEMY PUBLISHER |
Date | 2011-03-01 |
Source | Journal of Computers Vol 6, No 3 (2011): Special Issue: E-Service and Applications |
Rights | Copyright © ACADEMY PUBLISHER - All Rights Reserved. To request permission, see http://www.academypublisher.com/copyrightpermission.html. |