The debate within the Web community over the optimal means by which to organize information often pits formalized classifications against distributed collaborative tagging systems...
We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database...
Digital content is not only stored by servers on the Internet, but also on various embedded devices belonging to ubiquitous networks. In this paper, we propose a content processin...
We address the problem of integrating objects from a source taxonomy into a master taxonomy. This problem is not only currently pervasive on the web, but also important to the eme...
Recent work on incremental crawling has enabled the indexed document collection of a search engine to be more synchronized with the changing World Wide Web. However, this synchron...
Lipyeow Lim, Min Wang, Sriram Padmanabhan, Jeffrey...