We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Social media are becoming increasingly popular and have attracted considerable attention from spammers. Using a sample of more than ninety thousand known spam Web sites, we found ...
We present a strategy for answering fact-based natural language questions that is guided by a characterization of real-world user queries. Our approach, implemented in a system cal...
Our work is motivated by the problem of managing data on storage devices, typically a set of disks. Such storage servers are used as web servers or multimedia servers, for handling...
Leana Golubchik, Samir Khuller, Yoo Ah Kim, Svetla...
Web search components such as ranking and query suggestions analyze the user data provided in query and click logs. While this data is easy to collect and provides information abo...
Jeff Huang, Ryen W. White, Georg Buscher, Kuansan ...