In this research, a systematic study is conducted of four dimension reduction techniques for the text clustering problem, using five benchmark data sets. Of the four methods -- Ind...
Bin Tang, Michael A. Shepherd, Malcolm I. Heywood,...
tra Statistical machine translation systems are usually trained on large amounts of bilingual text and monolingual text. In this paper, we propose a method to perform domain adapta...
Readability assessment is a method to measure the difficulty of a piece of text material, and it is widely used in educational field to assist instructors to prepare appropriate m...
Recent development of location technologies enables us to obtain the location history of users. This paper proposes a new method to infer users’ longterm properties from their r...
The number of patent documents is currently rising rapidly worldwide, creating the need for an automatic categorization system to replace time-consuming and labor-intensive manual...