Sciweavers

» Text Retrieval from Document Images based on N-Gram Algorith...
SEMCO
2007
IEEE
15 years 6 months ago
Intelligent Parsing of Scanned Volumes for Web Based Archives
The proliferation of digital libraries and the large amount of existing documents raise important issues in efficient handling of documents. Printed texts in documents need to be...
Xiaonan Lu, James Ze Wang, C. Lee Giles
CIVR
2008
Springer
15 years 1 month ago
A probabilistic ranking framework using unobservable binary events for video search
Recent content-based video retrieval systems combine output of concept detectors (also known as high-level features) with text obtained through automatic speech recognition. This ...
Robin Aly, Djoerd Hiemstra, Arjen P. de Vries, Fra...
DIS
2007
Springer
15 years 5 months ago
Unsupervised Spam Detection Based on String Alienness Measures
We propose an unsupervised method for detecting spam documents from Web page data, based on equivalence relations on strings. We propose 3 measures for quantifying the alienness (...
Kazuyuki Narisawa, Hideo Bannai, Kohei Hatano, Mas...
MM
2006
ACM
15 years 5 months ago
Image annotation by large-scale content-based image retrieval
Image annotation has been an active research topic in recent years due to its potentially large impact on both image understanding and Web image search. In this paper, we target a...
Xirong Li, Le Chen, Lei Zhang, Fuzong Lin, Wei-Yin...
DOCENG
2007
ACM
15 years 3 months ago
Elimination of junk document surrogate candidates through pattern recognition
A surrogate is an object that stands for a document and enables navigation to that document. Hypermedia is often represented with textual surrogates, even though studies have show...
Eunyee Koh, Daniel Caruso, Andruid Kerne, Ricardo ...