Sciweavers

» Detecting Co-Derivative Documents in Large Text Collections
FTDCS
2003
IEEE
pFilter: Global Information Filtering and Dissemination Using Structured Overlay Networks
The exponential data growth rate of the Internet makes it increasingly difficult for people to find desired information in a timely fashion. Information filtering and dissemina...
Chunqiang Tang, Zhichen Xu
ICDAR
2009
IEEE
HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation
Amharic is the official language of Ethiopia and uses Ethiopic script for writing. In this paper, we present writer-independent HMM-based Amharic word recognition for offline hand...
Yaregal Assabie, Josef Bigün
IJAIT
2002
Comparing Keyword Extraction Techniques for WEBSOM Text Archives
The WEBSOM methodology for building very large text archives has a very slow method for extracting meaningful unit labels. This is because the method computes for the relative fre...
Arnulfo P. Azcarraga, Teddy N. Yap Jr., Tat-Seng C...
AND
2010
Document: a useful level for facing noisy data
In this paper we will present a set of experiments using large digitalized collections of books to show that logical structures can be extracted with good quality when working at ...
Hervé Déjean, Jean-Luc Meunier
DIS
2007
Springer
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...