We investigate the impact of input data scale in corpus-based learning using a study style of Zipf's law. In our research, Chinese word segmentation is chosen as the study ca...
This paper proposes a new approach for the automatic extraction of bilingual terms from a domain-specific bilingual parallel corpus. We combine existing monolingual term extractor...
Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification ...
This paper describes a study in which a corpus of spoken Danish annotated with focus and topic tags was used to investigate the relation between information structure and pauses. ...
How much can we infer about the pronunciation of a language – past or present – by observing which words its speakers rhyme? This paper explores the connection between pronunc...