Multiword Expressions in the wild? The mwetoolkit comes in handy

The mwetoolkit is a tool for automatic extraction of Multiword Expressions (MWEs) from monolingual corpora. It both generates and validates MWE candidates. Generation is based on surface forms, while validation provides a series of noise-removal criteria, such as (language-independent) association measures. In this paper, we present the use of the mwetoolkit in a standard configuration for extracting MWEs from a corpus of general-purpose English. The functionalities of the toolkit are discussed in terms of a set of selected examples, comparing it with related work on MWE extraction.

1 MWEs in a nutshell

One of the factors that makes Natural Language Processing (NLP) a challenging area is that some linguistic phenomena are not entirely compositional or predictable. For instance, why do we prefer to say full moon instead of total moon or entire moon if all these words can be considered synonyms that convey the idea of completeness? This is an exampl...
Added 14 May 2011
Updated 14 May 2011
Type Conference
Year 2010
Where COLING
Authors Carlos Ramisch, Aline Villavicencio, Christian Boitet