Sciweavers

» Extracting Partial Structures from HTML Documents
RIAO
2000
15 years 19 days ago
An Intelligent Text Extraction and Navigation System
We present sppc, a high-performance system for intelligent text extraction and navigation from German free text documents. sppc consists of a set of domain-independent shallow core...
Jakub Piskorski, Günter Neumann
WEBI
2005
Springer
15 years 4 months ago
A Semi-Supervised Document Clustering Algorithm Based on EM
Document clustering is a very hard task in Automatic Text Processing since it requires to extract regular patterns from a document collection without a priori knowledge on the cat...
Leonardo Rigutini, Marco Maggini
IJCAI
2003
15 years 20 days ago
Domain Event Extraction and Representation with Domain Ontology
With domain ontology, a meaningful index of document indexing, such as the domain events structure in this paper, can be defined. Since the construction of domain ontology is cost...
Shih-Hung Wu, Tzong-Han Tsai, Wen-Lian Hsu
CIKM
2008
Springer
15 years 1 month ago
Identifying table boundaries in digital documents via sparse line detection
Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...
Ying Liu, Prasenjit Mitra, C. Lee Giles
DSVIS
2003
Springer
15 years 4 months ago
An Empirical Study of Personal Document Spaces
The way people use computers has changed in recent years, from desktop single-machine settings to many computers and personal assistants in widely different contexts. Personal Docu...
Daniel Gonçalves, Joaquim A. Jorge