Automated detection of the first document reporting each new event in temporally-sequenced streams of documents is an open challenge. In this paper we propose a new approach which...
Yiming Yang, Jian Zhang, Jaime G. Carbonell, Chun ...
Multimedia ranking algorithms are usually user-neutral and measure the importance and relevance of documents by only using the visual contents and meta-data. However, users’ int...
Liang Gou, Hung-Hsuan Chen, Jung-Hyun Kim, Xiaolon...
Large volume public comment campaigns and web portals that encourage the public to customize form letters produce many near-duplicate documents, which increases processing and sto...
Topics in prior-art patent search are typically full patent applications and relevant items are patents often taken from sources in different languages. Cross language patent retr...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...