We illustrate that Web searches can often be utilized to generate background text for use with text classification. This is the case because there are frequently many pages on the...
We demonstrate a system to automatically grab data from data intensive web sites. The system first infers a model that describes at the intensional level the web site as a collec...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF pilo...
Abstract. As a consequence of the success of the Web, methodologies for information system development need to consider systems that use the Web paradigm. These Web Information Sys...
Geert-Jan Houben, Peter Barna, Flavius Frasincar, ...
A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper we first describe a parallel text mining system that...