Web Clustering Based On Tag Set Similarity
Journal Title Journal of Computers
Journal Abbreviation jcp
Publisher Group Academy Publisher
Website http://ojs.academypublisher.com
Title Web Clustering Based On Tag Set Similarity
Authors Zhu, Jianfeng; Qin, Leihua; Nie, Xuejun; Zhou, Jingli
Abstract Tagging is a service that allows users to associate a set of freely chosen tags with web content. Clustering web documents by their tag sets can eliminate the time-consuming preprocessing step of word stemming. This paper proposes a novel method that computes the similarity between tag sets and uses it as the distance measure for clustering web documents into groups. The major steps of the method are computing a tag similarity matrix with a set-based vector space model, smoothing the similarity matrix to obtain a set of linearly independent vectors, and computing the tag set similarity from these vectors. The experimental results show that the proposed tag set similarity measure surpasses other common similarity measures not only in the reliability of the clustering results but also in clustering accuracy and efficiency.
Publisher ACADEMY PUBLISHER
Date 2011-01-01
Source Journal of Computers Vol 6, No 1 (2011): Special Issue: Research Findings in Computer Science-Technology and Application
Rights Copyright © ACADEMY PUBLISHER - All Rights Reserved. To request permission, please check out URL: http://www.academypublisher.com/copyrightpermission.html.
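
The abstract names three steps (a tag similarity matrix, a smoothing step, and a tag set similarity) but this record gives no formulas for them. The sketch below is therefore not the authors' method. It is one plausible reading under stated assumptions: tag-to-tag similarity is taken as the cosine between document-occurrence vectors, smoothing is done by blending that matrix with the identity (which makes it positive definite, so its rows are linearly independent), and the set similarity is a soft-cosine-style normalized sum. The function names, the `eps` parameter, and the sample data are all hypothetical.

```python
import math


def tag_similarity_matrix(docs, eps=0.1):
    """Build a smoothed tag-to-tag similarity matrix (illustrative assumption).

    docs: list of tag sets, one per web document.
    eps:  hypothetical smoothing weight in (0, 1].
    Returns (index, S), where index maps a tag to its row/column in S.
    """
    tags = sorted(set().union(*docs))
    index = {t: i for i, t in enumerate(tags)}
    # Represent each tag by the set of documents that carry it.
    occ = [set() for _ in tags]
    for d, doc in enumerate(docs):
        for t in doc:
            occ[index[t]].add(d)
    n = len(tags)
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Cosine similarity of the two binary occurrence vectors.
            raw = len(occ[i] & occ[j]) / math.sqrt(len(occ[i]) * len(occ[j]))
            # Smoothing: shrink toward the identity matrix.
            S[i][j] = (1 - eps) * raw + (eps if i == j else 0.0)
    return index, S


def tag_set_similarity(a, b, index, S):
    """Soft-cosine-style similarity between two tag sets (a stand-in measure)."""
    def form(x, y):
        return sum(S[index[p]][index[q]] for p in x for q in y)

    denom = math.sqrt(form(a, a) * form(b, b))
    return form(a, b) / denom if denom else 0.0


# Hypothetical sample data: four documents described only by their tags.
docs = [
    {"python", "code"},
    {"python", "programming"},
    {"recipe", "cooking"},
    {"cooking", "food"},
]
index, S = tag_similarity_matrix(docs)
related = tag_set_similarity(docs[0], docs[1], index, S)
unrelated = tag_set_similarity(docs[0], docs[2], index, S)
```

Here `1 - tag_set_similarity(...)` could serve as the distance passed to a standard clustering routine. Sets that share no tags can still score above zero when their tags co-occur elsewhere in the collection, which is the practical point of comparing through a tag similarity matrix instead of by raw set overlap.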

 
