We describe a system we developed for identifying trends in text documents collected over a period of time. Trends can be used, for example, to discover that a company is shifting...
We report on the evaluation of information structural annotation according to the Linguistic Information Structure Annotation Guidelines (LISA; Dipper et al., 2007). The annotat...
This paper proposes a method for extracting bilingual text pairs from a comparable corpus. The basic idea of the method is to apply bootstrapping to an existing corpus-based cross-...
Hiroshi Masuichi, Raymond Flournoy, Stefan Kaufman...
Background: Topic detection is a task that automatically identifies topics (e.g., "biochemistry" and "protein structure") in scientific articles based on infor...
Mixture models form one of the most widely used classes of generative models for describing structured and clustered data. In this paper we develop a new approach for the analysis...