We consider the problem of automatically extracting general lists from the web. Existing approaches are mostly dependent upon either the underlying HTML markup or the visual struc...
Fabio Fumarola, Tim Weninger, Rick Barber, Donato ...
Dataspaces are collections of heterogeneous and partially unstructured data. Unlike data-integration systems that also offer uniform access to heterogeneous data sources, dataspac...
Versioned document collections are collections that contain multiple versions of each document. Important examples are Web archives, Wikipedia and other wikis, or source code and ...
Abstract. There is a significant commercial and research interest in locationbased web search engines. Given a number of search keywords and one or more locations that a user is in...
With the technology advance and the growth of Internet, the information that can be found in this net, as well as the number of users that access to look for specific data is big...