Sciweavers

DEXAW
2010
IEEE
15 years 26 days ago
Identifying Sentence-Level Semantic Content Units with Topic Models
Statistical approaches to document content modeling typically focus either on broad topics or on discourse-level subtopics of a text. We present an analysis of the perform...
Leonhard Hennig, Thomas Strecker, Sascha Narr, Ern...
ICDM
2009
IEEE
15 years 6 months ago
Knowledge Discovery from Citation Networks
Knowledge discovery from scientific articles has received increasing attention recently since huge repositories are made available by the development of the Internet and digit...
Zhen Guo, Zhongfei Zhang, Shenghuo Zhu, Yun Chi, Y...
ECIR
2009
Springer
15 years 9 months ago
A Topic-Based Measure of Resource Description Quality for Distributed Information Retrieval
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Mark Baillie, Mark James Carman, Fabio Crestani
CIKM
2005
ACM
15 years 5 months ago
Biasing web search results for topic familiarity
Depending on a web searcher’s familiarity with a query’s target topic, it may be more appropriate to show her introductory or advanced documents. The TREC HARD [1] track defi...
Giridhar Kumaran, Rosie Jones, Omid Madani
NIPS
2001
15 years 1 months ago
Latent Dirichlet Allocation
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian m...
David M. Blei, Andrew Y. Ng, Michael I. Jordan