In this paper, a supervised learning system of word sense disambiguation is presented. It is based on conditional maximum entropy models. This system acquires the linguistic knowl...
In this paper we will present a set of experiments using large digitalized collections of books to show that logical structures can be extracted with good quality when working at ...
The World Wide Web is a vast resource for information. At the same time it is extremely distributed. A particular type of data such as restaurant lists maybe scattered across thous...
Rapid growth of digital data collections is overwhelming the capabilities of humans to comprehend them without aid. The extraction of useful data from large raw data sets is someth...
Form reading technology based on form-type identification and form-data recognition is proposed. This technology can solve difficulties in variety for reading different items on f...