Finding information about people on the Web using a search engine is difficult because there is a many-to-many mapping between person names and specific persons (i.e., referents). ...
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
In this paper, we propose a new approach to automatically clustering e-commerce search engines (ESEs) on the Web such that ESEs in the same cluster sell similar products. This all...
The rise of social interactions on the Web requires developing new methods of information organization and discovery. To that end, we propose a generative community-based probabil...
We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...