The Web has become an excellent source for gathering consumer opinions. There are now numerous Web sites containing such opinions, e.g., customer reviews of products, forums, disc...
: Everyone realizes how powerful the few big Web search engine companies have become, both in terms of financial resources due to soaring stock quotes and in terms of the still hid...
The recent trend in the Internet traffic is increasing in requests for dynamic and personalized content. To efficiently serve this trend, several serverside and cache-side fragme...
We consider the problem of template-independent news extraction. The state-of-the-art news extraction method is based on template-level wrapper induction, which has two serious li...
Junfeng Wang, Xiaofei He, Can Wang, Jian Pei, Jiaj...
The Online Database of Interlinear Text (ODIN)1 is a database of interlinear text "snippets", harvested mostly from scholarly documents posted to the Web. Although large...