This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
Parallel browsing describes a behavior where users visit Web pages in multiple concurrent threads. Web browsers explicitly support this by providing tabs. Although parallel browsi...
Existing tools intended to build and deploy engaging complex Web sites (including functionality) have been shown to be inadequate to face the software production process in a unified a...
In this paper, we study the problem of Web forum crawling. Web forums have now become an important data source for many Web applications, while forum crawling is still a challenging ...
Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...
This is a system demo for a set of tools for translating texts between multiple languages in real time with high quality. The translation works on restricted languages, and is bas...