I show that the World Wide Web is a small world, in the sense that sites are highly clustered yet the path length between them is small. I also demonstrate the advantages of a sear...
Finding information about people on the Web using a search engine is difficult because there is a many-to-many mapping between person names and specific persons (i.e. referents). ...
Fully automatic methods that extract lists of objects from the Web have been studied extensively. Record extraction, the first step of this object extraction process, identifies...
Despite the many solutions proposed by industry and the research community to address phishing attacks, this problem continues to cause enormous damage. Because of our inability t...
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...