Sciweavers

1261 search results - page 173 / 253
» Extracting Text from PostScript
CIKM
2008
Springer
14 years 11 months ago
A densitometric approach to web page segmentation
Web page segmentation is a crucial step for many applications in Information Retrieval, such as text classification, de-duplication and full-text search. In this paper we describe...
Christian Kohlschütter, Wolfgang Nejdl
PKDD
2005
Springer
15 years 3 months ago
A Propositional Approach to Textual Case Indexing
Problem solving with experiences that are recorded in text form requires a mapping from text to structured cases, so that case comparison can provide informed feedback fo...
Nirmalie Wiratunga, Robert Lothian, Sutanu Chakrab...
MUC
1991
15 years 1 months ago
University of Massachusetts: description of the CIRCUS system as used for MUC-3
...behind this work was to extract a relatively abstract level of information from each sentence, using only a limited vocabulary that was hand-crafted to handle a restricted set of tar...
Wendy G. Lehnert, Claire Cardie, David Fisher, Ell...
SKG
2006
IEEE
15 years 3 months ago
Embedding the Semantic Knowledge in Convolution Kernels
Convolution kernels, such as the tree kernel and the subsequence kernel, are useful for natural language processing tasks. However, most of them ignore semantic knowledge. In order to ...
Kebin Liu, Fang Li, Ying Han, Lei Liu
KDD
1998
ACM
15 years 2 months ago
A Robust System Architecture for Mining Semi-Structured Data
The value of extracting knowledge from semi-structured data is readily apparent with the explosion of the WWW and the advent of digital libraries. This paper proposes a versatile ...
Lisa Singh, Bin Chen, Rebecca Haight, Peter Scheue...