
GENERATION OF A SET OF KEY TERMS CHARACTERISING TEXT DOCUMENTS
Journal Title Journal of Information and Organizational Sciences
Journal Abbreviation jios
Publisher Group University of Zagreb
Website http://jios.foi.hr/index.php/jios/index
Title GENERATION OF A SET OF KEY TERMS CHARACTERISING TEXT DOCUMENTS
Authors Machova, Kristina; Szaboova, Andrea; Bednar, Peter
Abstract The presented paper describes statistical methods (information gain, mutual information, χ² statistics, and the TF-IDF method) for generating key words from a collection of text documents. These key words should characterise the content of the text documents and can be used to retrieve relevant documents from the collection. Term relations were detected on the basis of the conditional probability of term occurrences. The focus is on detecting words that occur together very often. Thus, key words consisting of two terms were generated in addition. Several tests were carried out using the 20 News Groups collection of text documents.
Publisher University of Zagreb, Faculty of Organization and Informatics, Varaždin
Date 2007-06-17
Source Journal of Information and Organizational Sciences Vol 31, No 1 (2007)
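
The abstract mentions detecting term relations from the conditional probability of term occurrences. The sketch below is one possible reading of that idea, not the authors' implementation. It estimates P(b|a) as the share of documents containing term a that also contain term b, and keeps ordered pairs at or above a threshold. The function name `cooccurring_pairs`, the whitespace tokenisation, the threshold value, and the toy documents are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(docs, threshold=0.8):
    """Return sorted (a, b, P(b|a)) tuples with P(b|a) >= threshold.

    P(b|a) is estimated at document level:
    (# docs containing both a and b) / (# docs containing a).
    Tokenisation is a plain lowercase whitespace split (an assumption).
    """
    doc_freq = Counter()   # number of documents containing each term
    pair_freq = Counter()  # number of documents containing both terms
    for doc in docs:
        terms = sorted(set(doc.lower().split()))
        doc_freq.update(terms)
        pair_freq.update(combinations(terms, 2))

    result = []
    for (a, b), n_ab in pair_freq.items():
        # Check both directions, since P(b|a) and P(a|b) differ in general.
        for x, y in ((a, b), (b, a)):
            p = n_ab / doc_freq[x]
            if p >= threshold:
                result.append((x, y, p))
    return sorted(result)


# Hypothetical toy collection, not taken from the 20 News Groups data.
docs = [
    "space shuttle launch",
    "space shuttle orbit",
    "space station orbit",
    "hockey game score",
]
pairs = cooccurring_pairs(docs, threshold=1.0)
```

In this toy collection every document containing "shuttle" also contains "space", so that ordered pair passes a threshold of 1.0, while the reverse direction does not. Terms that occur in only one document trivially reach probability 1 with everything around them, so a practical version would probably also require a minimum document frequency.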

 
