In this paper, we investigate the use of words and subwords (including both characters and syllables) in audio indexing for Mandarin Chinese spoken document retrieval. Two retrieva...
Reliable indexing of documents having seal instances can be achieved by recognizing seal information. This paper presents a novel approach for detecting and classifying such multi...
Abstract. Text interpretation can be considered as the process of extracting deep-level semantics from unstructured text documents. Deeplevel semantics represent abstract index str...
Irma Sofia Espinosa Peraldi, Atila Kaya, Sylvia Me...
: We describe our participation in the TREC 2003 Robust and Web tracks. For the Robust track, we experimented with the impact of stemming and feedback on the worst scoring topics. ...
Jaap Kamps, Christof Monz, Maarten de Rijke, B&oum...
In this paper, we propose a novel document clustering method based on the non-negative factorization of the termdocument matrix of the given document corpus. In the latent semanti...