Sciweavers

» Document Processing with LinkIT
AAAI
1997
14 years 11 months ago
Template-Based Information Mining from HTML Documents
Tools for mining information from data can create added value for the Internet. As the majority of electronic documents available over the network are in unstructured textual form...
Jane Yung-jen Hsu, Wen-tau Yih
DAS
2010
Springer
15 years 2 months ago
Binarization of historical document images using the local maximum and minimum
This paper presents a new document image binarization technique that segments the text from badly degraded historical document images. The proposed technique makes use of the imag...
Bolan Su, Shijian Lu, Chew Lim Tan
MVA
1994
14 years 11 months ago
A High-Speed Document Image Classifier
In this paper, a high-speed document image classification algorithm is presented. The algorithm is based on the bottom-up strategy which can successfully segment and classify any ...
Lejun Shao
SIGIR
2004
ACM
15 years 3 months ago
The document as an ergodic Markov chain
In recent years, statistical language models have been proposed as an alternative to the vector space model. Viewing documents as language samples introduces the issue of defining a...
Eduard Hoenkamp, Dawei Song
DEXAW
1995
IEEE
15 years 1 months ago
Principles and Tools for Authoring Knowledge-Rich Documents
Digital libraries can take advantage of documents that have their content (semantics) explicitly represented as knowledge structures. These knowledge-rich documents can be created ...
Robert P. Futrelle, Natalya Fridman Noy