NAACL
2007

A Random Text Model for the Generation of Statistical Language Invariants

Download wortschatz.uni-leipzig.de

A novel random text generation model is introduced. Unlike previous random text models, which mainly aim at producing a Zipfian distribution of word frequencies, our model also takes the properties of neighboring co-occurrence into account and introduces the notion of sentences in random text. After pointing out the deficiencies of related models, we provide a generation process that takes neither the Zipfian distribution of word frequencies nor the small-world structure of the neighboring co-occurrence graph as a constraint. Nevertheless, these distributions emerge in the process. The distributions obtained with the random generation model are compared to a sample of natural language data, showing high agreement on word length and sentence length as well. This work proposes a plausible model for the emergence of large-scale characteristics of language without assuming a grammar or semantics.
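For orientation only, the quantities named in the abstract (word frequencies by rank and neighboring co-occurrence) can be measured on any text with a few lines of Python. The sketch below is not the model proposed in the paper, whose generation process is not described on this page. It uses the classic "monkey typing" baseline, in which characters and spaces are drawn independently at random, as a stand-in for the kind of earlier random text model the abstract contrasts itself with. The function names `monkey_text`, `rank_frequency`, and `neighbor_cooccurrences`, as well as the alphabet, space probability, and seed, are assumptions chosen for illustration.

```python
import random
from collections import Counter


def monkey_text(n_chars, alphabet="abcde", p_space=0.2, seed=0):
    """Draw n_chars characters independently: a space with probability
    p_space, otherwise a uniformly chosen letter (illustrative baseline,
    NOT the paper's model)."""
    rng = random.Random(seed)
    chars = []
    for _ in range(n_chars):
        if rng.random() < p_space:
            chars.append(" ")
        else:
            chars.append(rng.choice(alphabet))
    return "".join(chars)


def rank_frequency(text):
    """Return word frequencies sorted from most to least frequent."""
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)


def neighbor_cooccurrences(text):
    """Count how often each ordered pair of adjacent words occurs."""
    words = text.split()
    return Counter(zip(words, words[1:]))


text = monkey_text(200_000)
freqs = rank_frequency(text)
pairs = neighbor_cooccurrences(text)
```

Plotting `freqs` against rank on log-log axes is the usual way to inspect whether a text looks Zipfian, and the keys of `pairs` can serve as the edges of a neighboring co-occurrence graph. Because this baseline draws every character independently, it has no notion of sentences and no particular reason to reproduce the co-occurrence structure of natural language.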

Chris Biemann

Computational Linguistics | Generation Model | NAACL 2007 | Neighboring Co-occurrence | Random Text

Related Content

» Improved Text Generation Using Ngram Statistics

» A Sequential Model for Discourse Segmentation

» Enhancing Domain Portability of Chinese Segmentation Model Using ChiSquare Statistics and ...

» Active Learning for Multilingual Statistical Machine Translation

» Text normalization based on statistical machine translation and internet user support

» Toward text message normalization Modeling abbreviation generation

» Abbreviated text input using language modeling

» Improving Grammaticality in Statistical Sentence Generation Introducing a Dependency Spann...

» Combining Statistical and KnowledgeBased Spoken Language Understanding in Conditional Mode...

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	NAACL
Authors	Chris Biemann