More and more documents on the World Wide Web are based on templates. On a technical level this causes those documents to have a quite similar source code and DOM tree structure. G...
Abstract. As any other classification task, Word Sense Disambiguation requires a large number of training examples. These examples, which are easily obtained for most of the tasks,...
The Web has been rapidly "deepened" by myriad searchable databases online, where data are hidden behind query interfaces. As an essential task toward integrating these m...
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of experimentally characterised eukaryotic POL II promoters. The underlying definition of a promote...
The PC Desktop is a very rich repository of personal information, efficiently capturing user's interests. In this paper we propose a new approach towards an automatic persona...
Paul-Alexandru Chirita, Claudiu S. Firan, Wolfgang...