Sciweavers

AIC
2015

Information-Theoretic Segmentation of Natural Language

Abstract. We present computational experiments on language segmentation using a general information-theoretic cognitive model. Our method uses the statistical regularities of language to segment a continuous stream of symbols into “meaningful units” at a range of levels. Given a string of symbols (in the present approach, textual representations of phonemes), we attempt to find syllables such as grea and sy (in the word greasy); words such as in, greasy, wash, and water; and phrases such as in greasy wash water. The approach is entirely information-theoretic and requires no knowledge of the units themselves; it is thus assumed to require only general cognitive abilities, and has previously been applied to music. We tested our approach on two spoken language corpora, and we discuss our results in the context of learning as a statistical process.
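The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of segmenting a symbol stream from its statistics alone. It assumes a smoothed bigram model and a simple "rise in surprisal" boundary rule; both are illustrative choices made here, not the authors' method, and the names train_bigram, surprisal, and segment are hypothetical, as is the toy corpus.

```python
import math
from collections import Counter


def train_bigram(stream):
    """Count contexts and bigrams over a symbol sequence (assumed bigram model)."""
    contexts = Counter(stream[:-1])             # how often each symbol has a successor
    bigrams = Counter(zip(stream, stream[1:]))  # how often each adjacent pair occurs
    alphabet = set(stream)
    return contexts, bigrams, alphabet


def surprisal(prev, cur, contexts, bigrams, alphabet):
    """Information content -log2 P(cur | prev), with add-one smoothing."""
    p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + len(alphabet))
    return -math.log2(p)


def segment(stream, contexts, bigrams, alphabet):
    """Split the stream wherever surprisal rises relative to the previous transition.

    This boundary rule is an assumption of the sketch: an unexpected symbol
    after a predictable run is taken as the start of a new unit.
    """
    # ic[k] is the surprisal of stream[k + 1] given stream[k]
    ic = [surprisal(p, c, contexts, bigrams, alphabet)
          for p, c in zip(stream, stream[1:])]
    units, start = [], 0
    for k in range(1, len(ic)):
        if ic[k] > ic[k - 1]:
            units.append(stream[start:k + 1])
            start = k + 1
    units.append(stream[start:])
    return units


# Hypothetical toy corpus: three made-up "words" concatenated without spaces.
words = ["abc", "de", "fgh"]
order = [0, 1, 2, 0, 2, 1, 0, 1, 2, 1, 0, 2]
corpus = "".join(words[i] for i in order)

contexts, bigrams, alphabet = train_bigram(corpus)
print(segment("abcdefgh", contexts, bigrams, alphabet))  # ['abc', 'de', 'fgh']
```

In this toy corpus, transitions inside a word are fully predictable and transitions between words are not, so the boundaries fall where surprisal jumps. The rule cannot detect single-symbol units or two adjacent boundaries, and finding units at several levels (syllables, words, phrases), as the abstract describes, would need more than this one-pass sketch.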
Added 14 Apr 2016
Updated 14 Apr 2016
Type Journal
Year 2015
Where AIC
Authors Sascha S. Griffiths, Mariano Mora McGinity, Jamie Forth, Matthew Purver, Geraint A. Wiggins