This paper considers the problem of identifying on the Web compound documents (cDocs) ? groups of web pages that in aggregate constitute semantically coherent information entities...
Abstract. The requirements for effective search and management of the WWW are stronger than ever. Currently Web documents are classified based on their content not taking into acco...
Maria Halkidi, Benjamin Nguyen, Iraklis Varlamis, ...
Text streams are becoming more and more ubiquitous, in the forms of news feeds, weblog archives and so on, which result in a large volume of data. An effective way to explore the...
Xiang Wang 0002, Kai Zhang, Xiaoming Jin, Dou Shen
Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more di erent languages. This paper introduces...
Yiming Yang, Jaime G. Carbonell, Ralf D. Brown, Ro...
Image databases are nowadays widely exploited in a number of different contexts, ranging from history of art, through medicine, to education. Existing querying paradigms are based ...