The Grammatical Knowledge base of Contemporary Chinese contains detailed feature descriptions of the morphological and syntactic behavior of more than fifty thousand Chinese wor...
Chinese characters that are similar in their pronunciations or in their internal structures are useful for computer-assisted language learning and for psycholinguistic studies. Al...
The main problems in text classification are the lack of labeled data and the cost of labeling unlabeled data. We address these problems by exploring co-training - an algo...
Support Vector Machines (SVMs) have become an increasingly popular tool for machine learning tasks involving classification, regression or novelty detection. They exhibit good gen...
The knowledge discovery process encounters difficulties when analyzing large amounts of data. Indeed, some theoretical problems related to high-dimensional spaces then appear and de...