Frequency counts from very large corpora, such as the Web 1T dataset, have recently become available for language modeling. Omission of low frequency n-gram counts is a practical ...
: We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
We present a novel discriminative training algorithm for n-gram language models for use in large vocabulary continuous speech recognition. The algorithm uses large margin estimati...
We describe a preliminary implementation of the high-level modelling language Zinc. This language supports a modelling methodology in which the same Zinc model can be automatically...
Abstract. Test processes in the automotive industry are tool-intensive and affected by technologically heterogeneous test infrastructures. In the industrial practice a product has ...
Juergen Grossmann, Ines Fey, Alexander Krupp, Mirk...