Sciweavers

ACL
2012
11 years 7 months ago
Syntactic Annotations for the Google Books Ngram Corpus
We present a new edition of the Google Books Ngram Corpus, which describes how often words and phrases were used over a period of five centuries, in eight languages; it reflects...
Yuri Lin, Jean-Baptiste Michel, Erez Aiden Lieberm...
ACL
2011
12 years 8 months ago
Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
We address the problem of part-of-speech tagging for English data from the popular microblogging service Twitter. We develop a tagset, annotate data, develop features, and report ...
Kevin Gimpel, Nathan Schneider, Brendan O'Connor, ...
IAT
2006
IEEE
13 years 10 months ago
Semantic Labeling of Data by Using the Web
The Web consists of a large amount of unstructured information that can hardly be processed by automatic agents. In recent years, a considerable number of techniques for informat...
Leonardo Rigutini, Ernesto Di Iorio, Marco Ernande...
ARTCOM
2009
IEEE
13 years 11 months ago
Chunker for Tamil
This paper presents a part-of-speech tagger and chunker for Tamil built using machine learning techniques. Part-of-speech tagging and chunking are the fundamental processing steps for...
V. Dhanalakshmi, P. Padmavathy, M. Anand Kumar, K....