More and more documents on the World Wide Web are based on templates. On a technical level, this causes those documents to have very similar source code and DOM tree structures. G...
We present a text analysis tool set that allows analysts in various fields to sieve through large collections of multilingual news items quickly and to find information that...
The Web is a hypertextual environment in permanent evolution. New technologies and Web publishing behaviors emerge every day. This study presents trends on the evolutio...
In this paper, we propose a Web image search result organizing method to facilitate user browsing. We formalize this problem as a salient image region pattern extraction problem. ...
Measuring the similarity between implicit semantic relations is an important task in information retrieval and natural language processing. For example, consider the situation whe...