In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of...
Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Q...
The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often c...
David Pinto, Andrew McCallum, Xing Wei, W. Bruce C...
Describing an application as a simple composition of services allows advanced features that exploit different platforms to be conceived and formalized at a high abstraction level. S...
This paper presents a Chinese word segmentation system that uses improved source-channel models of Chinese sentence generation. Chinese words are defined as one of the following fo...
The performance of automatic speech recognition (ASR) systems in the presence of noise is an area that has attracted a lot of research interest. Additive noise from interfering no...