The Internet makes it possible to share and manipulate a vast quantity of information efficiently and effectively, but the rapid and chaotic growth experienced by the Net has gener...
In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...
Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
Abstract. Current text classification systems typically use term stems to represent document content. Semantic Web technologies allow the use of features on a higher semantic...
The requirements imposed on information retrieval systems are increasing steadily. The vast number of documents in today's large databases and especially on the World Wide Web ca...
The Web contains a large number of documents and, increasingly, semantic data in the form of RDF triples. Many of these triples are annotations that are associated with docume...