Text segmentation is important for text analysis, while text alignment aims to determine shared sub-topics among similar documents. Multi-task text segmentation and alignment is the...
This paper describes a new approach to detecting plagiarism and scientific documents that have been read but not cited. In contrast to existing approaches, which analyze docu...
The paper considers the problem of defining concepts within formal ontologies. A number of distinct modes of definition are identified, which represent alternative viewpoints on t...
In this paper, we investigate how modeling content structure can benefit text analysis applications such as extractive summarization and sentiment analysis. This follows the lingu...
We present CAVaT, a tool that performs Corpus Analysis and Validation for TimeML. CAVaT is an open-source, modular checking utility for statistical analysis of features specific t...