The amount of legal information is continuously growing. New legislative documents appear everyday in the Web. Legal documents are produced on a daily basis in briefingformat, cont...
Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)?(c). Howev...
This paper presents a near real-time multilingual news monitoring and analysis system that forms the backbone of our research work. The system integrates technologies to address t...
We apply statistical machine translation (SMT) tools to generate novel paraphrases of input sentences in the same language. The system is trained on large volumes of sentence pair...
Abstract. Extracting information automatically from texts for database representation requires previously well-grouped phrases so that entities can be separated adequately. This pr...