We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database...
The explosion of user-generated content on the Web has led to new opportunities and significant challenges for companies, that are increasingly concerned about monitoring the disc...
Web users display their preferences implicitly by navigating through a sequence of pages or by providing numeric ratings to some items. Web usage mining techniques are used to ext...
We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...
Mining data streams is important in both science and commerce. Two major challenges are (1) the data may grow without limit so that it is difficult to retain a long history; and (...