Natural language processing technology has developed remarkably, but it is still difficult for computers to understand contextual meanings as humans do. The purpose of our work ha...
The Linguistic Data Consortium (LDC) at the University of Pennsylvania has participated as a data provider in a variety of government-sponsored programs that support development of Hum...
Kazuaki Maeda, Haejoong Lee, Stephen Grimes, Jonat...
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike in most Western languages. Therefore, Chinese word segmentation is considered an ...
This article describes the preparation, recording and orthographic transcription of a new speech corpus, the Nijmegen Corpus of Casual Spanish (NCCSp). The corpus contains around ...
As huge quantities of documents have become available, services using natural language processing technologies trained on huge corpora have emerged, such as information retrieval ...