Sciweavers

AAAI
2015

A Sequence Labeling Approach to Deriving Word Variants

This paper describes a learning-based approach to automatically deriving word variant forms through suffixation. We use sequence labeling, which entails learning when to preserve, delete, substitute, or add a letter to form a new word from a given word. The learner’s features are based on the characters, phonetics, and hyphenation positions of the given word. To make the system robust to the variants that can arise from different forms of a root word, we generate multiple variant hypotheses for each word from the sequence labeler’s predictions. We then filter out ill-formed predictions and cluster word variants by merging a word and its predicted variants with other words and their predicted variants whenever the groups share a word. Our results show that this learning-based approach is feasible for the task and warrants further exploration.
Jennifer D'Souza
Added 27 Mar 2016
Updated 27 Mar 2016
Type Journal
Year 2015
Where AAAI
Authors Jennifer D'Souza