The need to manage similar documents in different languages increases with the growing amount of electronic information available in documents of the same type (e.g. news str...
Roberto Basili, Maria Teresa Pazienza, Fabio Massi...
This paper addresses a natural language text alignment problem arising in textual genetic criticism, a humanities discipline in which different versions of a text must be compared. The...
Named Entity Recognition (NER) is an important subtask of document processing such as Information Extraction. This paper describes a NER algorithm which uses a Multi-Layer Percept...
We present LAIR: a domain-specific language that enables users to specify actions to be taken upon meeting specific semantic frames in a text, in particular to rephrase and re...
Classification of documents by genre is typically done using either linguistic analysis or term-frequency-based techniques. The former provides better classification accuracy than...