Text classification poses some specific challenges. One such challenge is its high dimensionality, where each document (data point) contains only a small subset of the dimensions. In this pap...
With more and more reviews on the web, browsing through the mass of related reviews becomes heavy work. How to effectively analyze and organize these reviews attracts more...
Shu Zhang, Wen-Jie Jia, Yingju Xia, Yao Meng, Hao ...
This report explains our plagiarism detection method using a fuzzy semantic-based string similarity approach. The algorithm was developed through four main stages. First is pre-proce...
As mobile computing becomes widespread, so will the need for digital document delivery by hypertextual means. A further trend will be the provision of the ability for devices to de...
Abstract. XML documents are increasingly being used to mark up various kinds of data from web content to scientific data. Often these documents need to be collaboratively created a...