Automatic word alignment is a key step in training statistical machine translation systems. Despite much recent work on word alignment methods, alignment accuracy increases often ...
We propose using large-scale clustering of dependency relations between verbs and multiword nouns (MNs) to construct a gazetteer for named entity recognition (NER). Since dependen...
We present a probabilistic model of diachronic phonology in which individual word forms undergo stochastic edits along the branches of a phylogenetic tree. Our approach allows us ...
In this paper we present a procedure to learn a topological model of Situated Public Displays from data of people traveling between these displays. This model encompasses the dista...
In this paper, we apply weighted ridge regression to tackle the highly unbalanced data issue in automatic largescale ICD-9 coding of medical patient records. Since most of the ICD...