Current statistical machine translation systems usually extract rules from bilingual corpora annotated with 1-best alignments. They are prone to learning noisy rules due to alignment...
This paper describes a study in which a corpus of spoken Danish annotated with focus and topic tags was used to investigate the relation between information structure and pauses. ...
Kanazawa has shown that several non-trivial classes of categorial grammars are learnable in Gold’s model. We propose in this article to adapt this kind of symbolic lear...
An experimental adaptive filtering system, built on the Okapi search engine, is described. In addition to the regular text retrieval functions, the system requires a complex set o...
Many tasks in information extraction or natural language processing have the property that the data naturally consist of several views, i.e., disjoint subsets of features. Specifically, a ...