EACL
2006
ACL Anthology

Multilingual Term Extraction from Domain-specific Corpora Using Morphological Structure

Morphologically complex terms composed of Greek or Latin elements are frequent in scientific and technical texts. Word-forming units are thus relevant cues for identifying terms in domain-specific texts. This article describes a method for the automatic extraction of terms that relies on the detection of classical prefixes and word-initial combining forms. Word-forming units are identified using a regular expression. The system then extracts terms by selecting words which either begin or coalesce with these elements. Next, terms are grouped into families, which are displayed as a weighted list in HTML format.
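The pipeline outlined in the abstract (detect word-forming units, select words built on them, group the words into weighted families) can be illustrated with a minimal Python sketch. The abstract gives neither the regular expression nor the list of units, so everything below is an assumption made for illustration: the hand-picked `COMBINING_FORMS` list, the `extract_families` and `weighted_list` helpers, the requirement of at least three trailing letters, and the use of raw occurrence counts as the family weight. It is not the author's implementation, and it omits the HTML rendering step.

```python
import re
from collections import defaultdict

# Hypothetical, hand-picked word-initial units (assumption: the paper's own
# inventory and the way it is obtained are not given in the abstract).
COMBINING_FORMS = ["cardio", "hydro", "micro", "neuro", "anti"]

# A word qualifies if it starts with one of the units and continues with at
# least three more letters. Longer units are tried first so that a short unit
# never shadows a longer one sharing the same beginning.
FORM_RE = re.compile(
    r"\b("
    + "|".join(sorted(COMBINING_FORMS, key=len, reverse=True))
    + r")([a-z]{3,})\b"
)


def extract_families(text):
    """Map each unit to the words that begin with it, with occurrence counts."""
    families = defaultdict(lambda: defaultdict(int))
    for match in FORM_RE.finditer(text.lower()):
        unit, word = match.group(1), match.group(0)
        families[unit][word] += 1
    return {unit: dict(words) for unit, words in families.items()}


def weighted_list(families):
    """Rank families by total occurrences (assumed weight), ties alphabetical."""
    totals = ((unit, sum(words.values())) for unit, words in families.items())
    return sorted(totals, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    sample = "Neurology and neuroscience differ from cardiology."
    families = extract_families(sample)
    for unit, weight in weighted_list(families):
        print(unit, weight, sorted(families[unit]))
```

A fixed list keeps the sketch short; a system working on real domain-specific corpora would need a much larger inventory of units and a more careful weighting scheme than raw counts.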
Delphine Bernhard
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where EACL
Authors Delphine Bernhard