Many real-world datasets can be clustered along multiple dimensions. For example, text documents can be clustered not only by topic, but also by the author's gender or sentim...
A collection of 3208 reported errors in Chinese words was analyzed. Of these, 7.2% involved rarely used characters, and 98.4% were assigned common classifications of their caus...
Psychologically preparing for upcoming events can be a difficult task, particularly when switching social contexts, e.g., from office work to a family event. To help with such tra...
Timothy Sohn, Leila Takayama, Dean Eckles, Rafael ...
We use the technique of SVM anchoring to demonstrate that lexical features extracted from a training corpus are not necessary to obtain state-of-the-art results on tasks such as N...
Distinguishing speculative statements from factual ones is important for most biomedical text mining applications. We introduce an approach which is based on solving two sub-probl...