This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
We propose a new graph-based semisupervised learning (SSL) algorithm and demonstrate its application to document categorization. Each document is represented by a vertex within a ...
Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content catego...
Guangyu Zhu, Xiaodong Yu, Yi Li, David S. Doermann
Dynamic Miss-Counting algorithms are proposed, which find all implication and similarity rules with confidence pruning but without support pruning. To handle data sets with a large...
Shinji Fujiwara, Jeffrey D. Ullman, Rajeev Motwani
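
The idea of confidence pruning without support pruning can be illustrated with a small in-memory sketch. It is not the authors' Dynamic Miss-Counting algorithms, whose details are not given in the excerpt above; it only assumes the standard definition of confidence for an implication rule a -> b (the fraction of rows containing a that also contain b) and counts "misses" (rows containing a but not b). The function name `implication_rules` and the toy data are hypothetical.

```python
from collections import Counter


def implication_rules(rows, min_conf):
    """Return {(a, b): confidence} for every rule a -> b with confidence >= min_conf.

    rows is a list of sets of column names. No minimum support is applied.
    """
    # Pass 1: number of rows in which each column appears.
    ones = Counter(col for row in rows for col in row)
    # misses[a][b] = rows seen so far that contain a but not b.
    misses = {a: {b: 0 for b in ones if b != a} for a in ones}
    # Pass 2: count misses and drop a candidate as soon as it cannot qualify.
    for row in rows:
        for a in row:
            candidates = misses[a]
            for b in list(candidates):
                if b in row:
                    continue
                candidates[b] += 1
                # Best achievable hit count is ones[a] minus the misses so far.
                if ones[a] - candidates[b] < min_conf * ones[a]:
                    del candidates[b]  # confidence pruning
    return {
        (a, b): 1 - m / ones[a]
        for a, cands in misses.items()
        for b, m in cands.items()
    }


# Hypothetical toy data: column "c" occurs in only one row.
rows = [{"a", "b"}, {"a", "b"}, {"a", "b"}, {"a"}, {"b", "c"}]
rules = implication_rules(rows, 0.75)
print(sorted(rules.items()))
# [(('a', 'b'), 0.75), (('b', 'a'), 0.75), (('c', 'b'), 1.0)]
```

The rule c -> b is reported even though "c" appears in a single row, which is the kind of low-support rule that a support threshold would discard. The sketch keeps a counter for every column pair, so it does not address memory use on large data sets.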