Sciweavers

367 search results - page 5 / 74
» Indexing Text Documents Based on Topic Identification
ICDM
2007
IEEE
15 years 3 months ago
Bayesian Folding-In with Dirichlet Kernels for PLSI
Probabilistic latent semantic indexing (PLSI) represents documents of a collection as mixture proportions of latent topics, which are learned from the collection by an expectation...
Alexander Hinneburg, Hans-Henning Gabriel, Andr&eg...
CIKM
2005
Springer
15 years 3 months ago
Biasing web search results for topic familiarity
Depending on a web searcher’s familiarity with a query’s target topic, it may be more appropriate to show her introductory or advanced documents. The TREC HARD [1] track defi...
Giridhar Kumaran, Rosie Jones, Omid Madani
LREC
2010
14 years 11 months ago
Evaluating a Text Mining Based Educational Search Portal
In this paper, we present the main features of a text mining based search engine for the UK Educational Evidence Portal available at the UK National Centre for Text Mining (NaCTeM...
Sophia Ananiadou, John McNaught, James Thomas, Mar...
CIKM
2008
Springer
14 years 11 months ago
Modeling hidden topics on document manifold
Topic modeling has been a key problem for document analysis. One of the canonical approaches for topic modeling is Probabilistic Latent Semantic Indexing, which maximizes the join...
Deng Cai, Qiaozhu Mei, Jiawei Han, Chengxiang Zhai
ICPR
2008
IEEE
15 years 4 months ago
Unsupervised categorization of heterogeneous text images based on fractals
This paper deals with text extraction from heterogeneous documents for document categorization and indexing tasks. The purpose of this work is to find similar text regions basing ...
Badreddine Khelifi, Nizar Zaghden, Adel M. Alimi, ...