
CORR
2006
Springer

A tool set for the quick and efficient exploration of large document collections

We are presenting a set of multilingual text analysis tools that can help analysts in any field to explore large document collections quickly in order to determine whether the documents contain information of interest, and to find the relevant text passages. The automatic tool, which currently exists as a fully functional prototype, is expected to be particularly useful when users repeatedly have to sieve through large collections of documents such as those downloaded automatically from the internet. The proposed system takes a whole document collection as input, carries out some automatic analysis tasks, annotates the texts with the generated meta-information, stores the meta-information in a database, and provides the users with an interface that allows them to search and view the most pertinent text passages. In the first step, named entities (names of people, organisations and places) are recognised and stored. Then, highly similar documents are grouped into clusters of documents...
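The grouping step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which document representation, similarity measure, or threshold the authors use, so everything below is an assumption made for illustration: word-count vectors, cosine similarity, a hypothetical threshold of 0.6, and single-link grouping. The function names `tf_vector`, `cosine`, and `cluster_similar` are likewise hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Lower-cased word counts (an assumed stand-in for the paper's representation)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_similar(docs, threshold=0.6):
    """Group document indices whose pairwise similarity reaches the threshold.

    Single-link grouping via union-find; the threshold is an illustrative guess.
    """
    vectors = [tf_vector(d) for d in docs]
    parent = list(range(len(docs)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(docs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


docs = [
    "The Commission met in Brussels today",
    "Today the Commission met in Brussels",
    "Football results from the weekend",
]
print(cluster_similar(docs))
```

With these three sample texts, the first two share all their words and end up in one group, while the third stays on its own. A real system working on large collections would likely need a more scalable comparison than this all-pairs loop.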
Added: 11 Dec 2010
Updated: 11 Dec 2010
Type: Journal
Year: 2006
Where: CORR
Authors: Camelia Ignat, Bruno Pouliquen, Ralf Steinberger, Tomaz Erjavec