The past few years have seen explosive growth in scientific and regulatory documents related to the patent system. Relevant information is siloed into many heterogeneous...
Siddharth Taduri, Gloria T. Lau, Kincho H. Law, Ha...
The ability to find tables and extract information from them is a necessary component of many information retrieval tasks. Documents often contain tables in order to communicate d...
This paper presents the advantages of combining multiple document representation schemes for query processing of XML queries on content and structure. We show how extending the Te...
Social annotations on a Web document are highly generalized descriptions of the topics contained in that page. Their tagged frequency indicates user attention with various degrees...
Junyan Zhu, Can Wang, Xiaofei He, Jiajun Bu, Chun ...
The exponential growth of data demands scalable infrastructures capable of indexing and searching rich content such as text, music, and images. A promising direction is to combine...