Social network analysis became a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore th...
In this paper, we analyze whether dictionaries from the World Wide Web which contain phonetic notations, may support the rapid creation of pronunciation dictionaries within the sp...
We describe a syntactically annotated parallel corpus containing English, Swedish and Turkish. The corpus consists of approximately 300 000 tokens in Swedish, 160 000 in Turkish a...
The paper presents the fourth, "Mondilex" edition of the MULTEXT-East language resources, a multilingual dataset for language engineering research and development, focus...
The Enron Email Corpus provides "Real World" text in the business email domain, which is a target domain for many speech and language applications. We present a section ...