Sciweavers

ML
2000
ACM
14 years 9 months ago
Text Classification from Labeled and Unlabeled Documents using EM
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
BTW
2007
Springer
15 years 1 months ago
TopX - Efficient and Versatile Top-k Query Processing for Text, Semistructured, and Structured Data
This paper presents a comprehensive overview of the TopX search engine, an extensive framework for unified indexing and querying large collections of unstructured, semistructured...
Martin Theobald, Ralf Schenkel, Gerhard Weikum
DDEP
2000
Springer
15 years 1 months ago
GODDAG: A Data Structure for Overlapping Hierarchies
Notations like SGML and XML represent document structures using tree structures; while this is in general a step forward from earlier systems, it creates certain difficul...
C. M. Sperberg-McQueen, Claus Huitfeldt
CIKM
2008
Springer
14 years 11 months ago
Mapping enterprise entities to text segments
Today, valuable business information is increasingly stored as unstructured data (documents, emails, etc.). For example, documents exchanged between business partners capture info...
Falk Brauer, Alexander Löser, Hong-Hai Do
MM
1999
ACM
15 years 1 months ago
Passive capture and structuring of lectures
Despite recent advances in authoring systems and tools, creating multimedia presentations remains a labor-intensive process. This paper describes a system for automatically constr...
Sugata Mukhopadhyay, Brian Christopher Smith