This paper presents a new algorithm for sequence prediction over long categorical event streams. The input to the algorithm is a set of target event types whose occurrences we wis...
The blogosphere has unique structural and temporal properties since blogs are typically used as communication media among human individuals. In this paper, we propose a novel tech...
Users attempt to express their search goals through web search queries. When a search goal has multiple components or aspects, documents that represent all the aspects are likely ...
KDD is a complex and demanding task. While a large number of methods has been established for numerous problems, many challenges remain to be solved. New tasks emerge requiring th...
Ingo Mierswa, Michael Wurst, Ralf Klinkenberg, Mar...
We propose a new unsupervised learning technique for extracting information from large text collections. We model documents as if they were generated by a two-stage stochastic pro...
Mark Steyvers, Padhraic Smyth, Michal Rosen-Zvi, T...