In this work, we present a new semantic language modeling approach to model news stories in the Topic Detection and Tracking (TDT) task. In the new approach, we build a unigram la...
The Noise Sensitivity Signature (NSS), originally introduced by Grossman and Lapedes (1993), was proposed as an alternative to cross validation for selecting network complexity. I...
Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more different languages. This paper introduces...
Yiming Yang, Jaime G. Carbonell, Ralf D. Brown, Ro...
In this article, we present a test environment for a word analysis system that is used for reliable and sense-conveying hyphenation of German words. A crucial task is the hyphenati...
AdaBoost is a well-known, effective technique for increasing the accuracy of learning algorithms. However, it has the potential to overfit the training set because its objective i...