In many domains, specific attributes of a document carry more weight than its general words. This paper proposes the use of information extraction tec...
We present an approach for detecting link spam common in blog comments by comparing the language models used in the blog post, the comment, and pages linked by the comments. In co...
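The comparison described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the unigram models, the interpolation smoothing, the weight `lam`, and the helper names `unigram_counts` and `kl_divergence` are all assumptions chosen for illustration. The idea is that a comment whose language model diverges strongly from the post's is more likely to be off-topic, and possibly spam.

```python
import math
from collections import Counter


def unigram_counts(text):
    """Return (token counts, total tokens) for a whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tokens), len(tokens)


def kl_divergence(text_p, text_q, lam=0.5):
    """KL(P || Q) between two smoothed unigram language models.

    Each model is interpolated with a background model built from both
    texts (an assumed smoothing scheme), so no probability is ever zero.
    """
    p_counts, p_total = unigram_counts(text_p)
    q_counts, q_total = unigram_counts(text_q)
    if p_total == 0 or q_total == 0:
        raise ValueError("both texts must contain at least one token")

    background = p_counts + q_counts
    bg_total = p_total + q_total

    divergence = 0.0
    for word, bg_count in background.items():
        bg_prob = bg_count / bg_total
        p = lam * p_counts[word] / p_total + (1 - lam) * bg_prob
        q = lam * q_counts[word] / q_total + (1 - lam) * bg_prob
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical example texts, invented for this sketch.
post = "search engines rank pages using link analysis"
on_topic = "link analysis helps search engines rank pages"
spam = "buy cheap pills online casino poker"

on_topic_score = kl_divergence(post, on_topic)
spam_score = kl_divergence(post, spam)
```

In this toy setting the unrelated comment yields a larger divergence from the post than the on-topic one. A practical system would still need a threshold and a model of the linked pages, neither of which is shown here.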
In the logical approach to information retrieval (IR), retrieval is considered as uncertain inference. Whereas classical IR models are based on propositional logic, we combine Dat...
To ease the retrieval of documents published on the Web, the documents should be classified in a way that users find helpful and meaningful. This paper presents an approach to sema...
Retrieving named entities differs from retrieving documents in that the items to be retrieved – persons, locations, organizations – are only indirect...