In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized sys...
Ricardo A. Baeza-Yates, Carlos Castillo, Flavio Ju...
The high-quality structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
We are building a biomedical information resource consisting of digitized x-ray images and associated textual data from national health surveys. This resource, the Web-based Medic...
Recommendation algorithms aim to propose “next” pages to a user based on her current visit and past users’ navigational patterns. In the vast majority of related algor...
While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...