Sciweavers

41 search results - page 2 / 9
» Text Genre Detection Using Common Word Frequencies
LREC
2010
13 years 6 months ago
Design, Compilation, and Preliminary Analyses of Balanced Corpus of Contemporary Written Japanese
Compilation of a 100-million-word balanced corpus called the Balanced Corpus of Contemporary Written Japanese (or BCCWJ) is underway at the National Institute for Japanese Langua...
Kikuo Maekawa, Makoto Yamazaki, Takehiko Maruyama,...
CIKM
2009
Springer
13 years 9 months ago
Improving binary classification on text problems using differential word features
We describe an efficient technique to weigh word-based features in binary classification tasks and show that it significantly improves classification accuracy on a range of proble...
Justin Martineau, Tim Finin, Anupam Joshi, Shamit ...
TKDE
2008
13 years 5 months ago
Detecting Word Substitutions in Text
Searching for words on a watchlist is one way in which large-scale surveillance of communication can be done, for example, in intelligence and counterterrorism settings. One obviou...
SzeWang Fong, Dmitri Roussinov, David B. Skillicor...
COLING
1996
13 years 6 months ago
The Automatic Extraction of Open Compounds from Text Corpora
This paper describes a new method for extracting open compounds (uninterrupted sequences of words) from text corpora of languages such as Thai, Japanese and Korean that exhibit un...
Virach Sornlertlamvanich, Hozumi Tanaka
ICVGIP
2004
13 years 6 months ago
Robust Segmentation of Unconstrained Online Handwritten Documents
A segmentation algorithm that can detect different regions of a handwritten document, such as text lines, tables and sketches, will be extremely useful in a variety of application...
Anoop M. Namboodiri, Anil K. Jain