This paper describes our first participation in the Indian language sub-task of the main Adhoc monolingual and bilingual track in CLEF1 competition. In this track, the task is to...
In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than the HTML code. O...
Electronic publishing of material digitized using imaging and OCR calls for a special delivery format capable of reconstructing original documents in a well-usable electronic form...
The purpose of this work is to develop a pattern recognition system simulating the human vision. A transparent neural network, with context returns is used. The context returns co...
In recent years, statistical language models are being proposed as alternative to the vector space model. Viewing documents as language samples introduces the issue of defining a...