In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Topic models have been studied extensively in the context of monolingual corpora. Though there are some attempts to mine topical structure from cross-lingual corpora, they require ...
Numerous researchers have proposed to use relational databases to store and query XML documents. In these systems, the elements selected by an XML query are returned to an applica...
Artem Chebotko, Mustafa Atay, Shiyong Lu, Farshad ...
Multi-document summarization is a challenge to information overload problem to provide a condensed text for a number of documents. Most multi-document summarization systems make u...
We study a novel shallow information extraction problem that involves extracting sentences of a given set of topic categories from medical forum data. Given a corpus of medical fo...