Sciweavers

2929 search results - page 460 / 586
» Models of English Text
ICIP
2003
IEEE
15 years 11 months ago
An entropy based segmentation algorithm for computer-generated document images
This paper presents an efficient compression-oriented segmentation algorithm for computer-generated document images. In this algorithm, a document image is represented in a block-...
Lijie Liu, Yan Dong, Xiaomu Song, Guoliang Fan
ICIP
1997
IEEE
15 years 11 months ago
An off-line large vocabulary hand-written Chinese character recognizer
An off-line hand-written Chinese character recognizer based on Contextual Vector Quantization (CVQ) supporting a vocabulary of 4,616 Chinese characters, alphanumerics and punctua...
Pak-Kwong Wong, Chorkin Chan
WWW
2009
ACM
15 years 11 months ago
A densitometric analysis of web template content
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
Christian Kohlschütter
WWW
2009
ACM
15 years 11 months ago
SOFIE: a self-organizing framework for information extraction
This paper presents SOFIE, a system for automated ontology extension. SOFIE can parse natural language documents, extract ontological facts from them and link the facts into an on...
Fabian M. Suchanek, Mauro Sozio, Gerhard Weikum
WWW
2007
ACM
15 years 11 months ago
A no-frills architecture for lightweight answer retrieval
In a new model for answer retrieval, document collections are distilled offline into large repositories of facts. Each fact constitutes a potential direct answer to questions seek...
Marius Pasca