Most Web-based methods for lexicon augmentation consist of capturing global semantic features of the target domain in order to collect relevant documents from the Web. We s...
Software publishers and information service providers publish information about their own products and about other products and people. Additional content might be incidental, suc...
Most prior work on information extraction has focused on extracting information from text in digital documents. Often, however, the most important information being reported in an...
In this paper, we argue that agglomerative clustering with the vector cosine similarity measure performs poorly for two reasons. First, the nearest neighbors of a document belo...
A major challenge in developing models for hypertext retrieval is to effectively combine content information with the link structure available in hypertext collections. Although s...