In languages that use diacritical characters, if these special signs are stripped-off from a word, the resulted string of characters may not exist in the language, and therefore i...
Time series data abounds in real world problems. Measuring the similarity of time series is a key to solving these problems. One state of the art measure is the longest common sub...
Formal languages for probabilistic modeling enable re-use, modularity, and descriptive clarity, and can foster generic inference techniques. We introduce Church, a universal langu...
Noah Goodman, Vikash K. Mansinghka, Daniel M. Roy,...
Corpus-based methods for natural language processing often use supervised training, requiring expensive manual annotation of training corpora. This paper investigates methods for ...
Parallel independent disks can enhance the performance of external memory (EM) algorithms, but the programming task is often di cult. In this paper we develop randomized variants ...