A new type of language resource 'BAStat' has been released by the Bavarian Archive for Speech Signals. In contrast to primary resources like speech and text corpora BASt...
Embedding algorithms are a method for revealing low dimensional structure in complex data. Most embedding algorithms are designed to handle objects of a single type for which pair...
Amir Globerson, Gal Chechik, Fernando Pereira, Naf...
The core of our question-answering mechanism is searching for predefined patterns of textual expressions that may be interpreted as answers to certain types of questions. The pres...
: We describe our participation in the TREC 2004 Web and Terabyte tracks. For the web track, we employ mixture language models based on document full-text, incoming anchortext, and...
Chinese NE (Named Entity) recognition is a difficult problem because of the uncertainty in word segmentation and flexibility in language structure. This paper proposes the use of ...