Sciweavers

CIKM
2011
Springer
13 years 9 months ago
Towards noise-resilient document modeling
We introduce a generative probabilistic document model based on latent Dirichlet allocation (LDA), to deal with textual errors in the document collection. Our model is inspired by...
Tao Yang, Dongwon Lee
ICDM
2003
IEEE
15 years 2 months ago
Ontologies Improve Text Document Clustering
Text document clustering plays an important role in providing intuitive navigation and browsing mechanisms by organizing large sets of documents into a small number of meaningful ...
Andreas Hotho, Steffen Staab, Gerd Stumme
CIT
2005
Springer
14 years 9 months ago
Simple Classification into Large Topic Ontology of Web Documents
The paper presents an approach to classifying Web documents into a large topic ontology. The main emphasis is on having a simple approach appropriate for handling a large ontology an...
Marko Grobelnik, Dunja Mladenic
KDD
2005
ACM
15 years 9 months ago
Mining comparable bilingual text corpora for cross-language information integration
Integrating information in multiple natural languages is a challenging task that often requires manually created linguistic resources such as a bilingual dictionary or examples of...
Tao Tao, ChengXiang Zhai
DOCENG
2007
ACM
15 years 1 months ago
A document object modeling method to retrieve data from a very large XML document
Document Object Modeling (DOM) is a widely used approach for retrieving data from an XML document. If the size of the XML document is very large, however, using the DOM approach for...
Seung Min Kim, Suk I. Yoo, Eunji Hong, Tae Gwon Ki...