Document-centric XML is a mixture of text and structure. With the increased availability of document-centric XML content comes a need for query facilities in which both structural...
Jaap Kamps, Maarten Marx, Maarten de Rijke, Bö...
Splitting compound words has proved to be useful in areas such as Machine Translation, Speech Recognition or Information Retrieval (IR). Furthermore, real-time IR systems (such as...
As a principled approach to capturing semantic relations of words in information retrieval, statistical translation models have been shown to outperform simple document language m...
We propose new methods to exploit contemporaneous text, such as on-line news articles, to improve language models for automatic speech recognition and other natural language proce...
As XML documents contain both content and structure information, taking advantage of the document structure in the retrieval process can lead to better identify relevant informati...
Karen Sauvagnat, Mohand Boughanem, Claude Chrismen...