Several problems in text categorization are too hard to be solved by standard bag-of-words representations. Work in kernel-based learning has approached this problem by (i) consid...
Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs, rather than document collections, as self-contained sources o...
In this paper, we describe a fully automatic video retrieval prototype system that uses an image or a video sequence of an identity of interest as the probe. The system is based on fac...
While the Internet community recognized early on the need to store and preserve past content of the Web for future use, the tools developed so far for retrieving information from ...
Adam Jatowt, Yukiko Kawai, Satoshi Nakamura, Yutak...
Templates in web sites hurt search engine retrieval performance, especially in content relevance and link analysis. Current template removal methods suffer from processing speed ...