Sciweavers

167 search results - page 17 / 34
» Statistical Relational Learning for Document Mining
KDD
2010
ACM
15 years 1 month ago
Online multiscale dynamic topic models
We propose an online topic model for sequentially analyzing the time evolution of topics in document collections. Topics naturally evolve with multiple timescales. For example, so...
Tomoharu Iwata, Takeshi Yamada, Yasushi Sakurai, N...
WWW
2009
ACM
15 years 4 months ago
Near real time information mining in multilingual news
This paper presents a near real-time multilingual news monitoring and analysis system that forms the backbone of our research work. The system integrates technologies to address t...
Martin Atkinson, Erik Van der Goot
ACL
2003
14 years 11 months ago
Generalized Algorithms for Constructing Statistical Language Models
Recent text and speech processing applications such as speech mining raise new and more general problems related to the construction of language models. We present and describe in...
Cyril Allauzen, Mehryar Mohri, Brian Roark
KDD
2004
ACM
15 years 10 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
CIKM
2004
Springer
15 years 3 months ago
Hierarchical document categorization with support vector machines
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Lijuan Cai, Thomas Hofmann