This paper presents the design of a new interface for interactive Topic Detection and Tracking (TDT) called Ievent. It is composed of 3 main views; a Cluster View, a Document View,...
This document describes the CLIPS experiments in the CLEF 2005 campaign. We used a surface-syntactic parser in order to extract new indexing terms. These terms are considered synta...
Abstract. We present a hybrid machine learning approach for information extraction from unstructured documents by integrating a learned classifier based on the Maximum Entropy Mod...
Abstract. The singular value decomposition, or SVD , has been studied in the past as a tool for detecting and understanding patterns in a collection of documents. We show how the m...
Abstract. Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves in the trees that are used to represent discourse structures. A de...