Parallel corpus is a rich linguistic resource for various multilingual text management tasks, including crosslingual text retrieval, multilingual computational linguistics and mul...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Social annotations on a Web document are highly generalized description of topics contained in that page. Their tagged frequency indicates the user attentions with various degrees...
Junyan Zhu, Can Wang, Xiaofei He, Jiajun Bu, Chun ...
In this paper, we present our online summarization system of web topics. The user defines the topic by a set of keywords. Then the system searches the Web for the relevant documen...
In Internet marketing, Web audience analysis is essential to understanding the visitors’ needs. However, the existing analysis tools fail to deliver summarized and conceptual me...