Identifying highlights in multimedia content such as video and audio remains a difficult technical problem. We present and evaluate a novel algorithm that identifies hig...
This paper presents domain-independent methods of spoken document retrieval. Both a continuous-speech large-vocabulary recognition system and a phone-lattice word spotter are us...
Gareth J. F. Jones, J. T. Foote, Karen Sparck Jone...
This paper describes and evaluates various general stemming approaches for the French, Portuguese (Brazilian), German and Hungarian languages. Based on the CLEF test collections, ...
In this paper, we describe a capture-recapture experiment conducted on Google's and MSN's cached directories. The anticipated outcome of this work was to monitor evoluti...
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...