In this paper we are interested in describing Web pages by how users interact with their contents. Thus, an alternate but complementary way of labelling and classifying Web docu...
This paper describes a newly created text corpus of news articles that has been annotated for cross-document co-reference. Being able to robustly resolve references to entities ac...
David Day, Janet Hitzeman, Michael L. Wick, Keith ...
There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and info...
Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Wa...
In this paper we propose a new strategy for combining the outputs of several alignment systems. Based on the word boundaries retrieved from a number of individual alignment system...
This paper describes a method for hiding data inside printed text documents that is resilient to print/scan and photocopying operations. Using the principle of channel coding with...