This paper describes a new method for extracting open compounds (uninterrupted sequences of words) from text corpora of languages such as Thai, Japanese and Korean that exhibit un...
The problem addressed in this paper is to predict a user's numeric rating in a product review from the text of the review. Unigram and n-gram representations of text are comm...
We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a...
Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas L...
In this paper, we propose a new system, called EPCI, for extracting potentially copyright-infringing texts from the Web. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
We describe a new method for extracting Negative Polarity Item candidates (NPI candidates) from dependency-parsed German text corpora. Semi-automatic extraction of NPIs is a chall...