Sciweavers

241 search results - page 33 / 49
» Detecting Co-Derivative Documents in Large Text Collections
SIGIR
1998
ACM
15 years 1 month ago
Boosting and Rocchio Applied to Text Filtering
We discuss two learning algorithms for text filtering: modified Rocchio and a boosting algorithm called AdaBoost. We show how both algorithms can be adapted to maximize any gene...
Robert E. Schapire, Yoram Singer, Amit Singhal
ICDAR
2011
IEEE
13 years 9 months ago
Touching Character Separation in Chinese Handwriting Using Visibility-Based Foreground Analysis
Abstract—In offline handwritten text recognition, the separation of touching characters remains a challenge due to the variability of touching structures. This paper proposes a ...
Liang Xu, Fei Yin, Qiu-Feng Wang, Cheng-Lin Liu
SIGSOFT
2007
ACM
15 years 10 months ago
Training on errors experiment to detect fault-prone software modules by spam filter
Detecting fault-prone modules in source code is important for assuring software quality. Most previous fault-prone detection approaches are based on software metrics...
Osamu Mizuno, Tohru Kikuno
DEXA
2006
Springer
15 years 1 month ago
Understanding and Enhancing the Folding-In Method in Latent Semantic Indexing
Abstract. Latent Semantic Indexing (LSI) has been proved to be effective to capture the semantic structure of document collections. It is widely used in content-based text retrieval...
Xiang Wang 0002, Xiaoming Jin
WEBDB
2004
Springer
15 years 3 months ago
Content and Structure in Indexing and Ranking XML
Rooted in electronic publishing, XML is now widely used for modelling and storing structured text documents. Especially in the WWW, retrieval of XML documents is most useful in co...
Felix Weigel, Holger Meuss, Klaus U. Schulz, Fran&...