In this paper, we show how a domain dependent know-how textual database of advices and warnings can be constructed from procedural texts. We show how arguments of type warnings an...
In this paper, we present a trainable approach to discriminate between machine-printed and handwritten text. An integrated system able to localize text areas and split them in tex...
Table is a very common presentation scheme, but few papers touch on table extraction in text data mining. This paper focuses on mining tables from large-scale HTML texts. Table fi...
Leximancer is a software system for performing conceptual analysis of text data in a largely language independent manner. The system is modelled on Content Analysis and provides u...
Abstract. In this paper we report on a series of experiments investigating the path from text summarisation to style-specific summarisation of spoken news stories. We show that the...