We introduce a new, powerful class of text proximity queries: find an instance of a given "answer type" (person, place, distance) near "selector" tokens matchi...
A collaborative crawler is a group of crawling nodes, in which each crawling node is responsible for a specific portion of the web. We study the problem of collecting geographical...
Through a variety of means, including a range of browser cache methods and inspecting the color of a visited hyperlink, client-side browser state can be exploited to track users a...
Collin Jackson, Andrew Bortz, Dan Boneh, John C. M...
This research examines the privacy comfort levels of participants if others can view traces of their web browsing activity. During a week-long field study, participants used an el...
Large-scale text categorization is an important research topic for Web data mining. One of the challenges in large-scale text categorization is how to reduce the amount of human e...