Pronunciation information is available in large quantities on the Web, in the form of IPA and ad-hoc transcriptions. We describe techniques for extracting candidate pronunciations...
Arnab Ghoshal, Martin Jansche, Sanjeev Khudanpur, ...
Extracting geographical information from various web sources is likely to be important for a variety of applications. One such use for this information is to enable the study of v...
HHrep is a web server for the de novo identification of repeats in protein sequences, which is based on the pairwise comparison of profile hidden Markov models (HMMs). Its main st...
The ability to find tables and extract information from them is a necessary component of many information retrieval tasks. Documents often contain tables in order to communicate d...
Fact collections are mostly built using semi-supervised relation extraction techniques and wisdom of the crowds methods, rendering them inherently noisy. In this paper, we propose...