Topic models with power-law using Pitman-Yor process

One important approach in knowledge discovery and data mining is to estimate unobserved variables, because latent variables can reveal hidden, specific properties of the observed data. The latent factor model assumes that each item in a record has a latent factor; the co-occurrence of items can then be modeled through these latent factors. In document modeling, a record is a document represented as a “bag of words” (that is, word order is ignored), and an item is a word. Latent Dirichlet allocation (LDA) has stimulated the use of the Dirichlet distribution over the latent topic distribution of a document. LDA assumes that latent topics, i.e., discrete latent variables, are distributed according to a multinomial distribution whose parameters are generated from the Dirichlet distribution. In an experiment using real data, the model proposed in the paper outperformed LDA in document modeling.
Keywords: Topic Model, Latent Dirichlet Allocation, Nonparametric Bayes, Pitman-Yor Process, ...
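The LDA assumption stated in the abstract (per-document topic proportions drawn from a Dirichlet distribution, then a topic drawn for each word from the resulting multinomial) can be illustrated with a small generative sketch. This is only an illustration of standard LDA, not the Pitman-Yor model the paper proposes; the names `sample_dirichlet`, `sample_categorical`, and `generate_document`, the hyperparameter values, and the toy topic-word table are all assumptions made for the example.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw one index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(num_words, alpha, topic_word_probs, rng):
    """Generate one bag-of-words document under the LDA assumptions."""
    # Per-document topic proportions, generated from the Dirichlet prior.
    theta = sample_dirichlet(alpha, rng)
    topics, words = [], []
    for _ in range(num_words):
        z = sample_categorical(theta, rng)                 # latent topic of this word
        w = sample_categorical(topic_word_probs[z], rng)   # word drawn from that topic
        topics.append(z)
        words.append(w)
    return theta, topics, words


# Toy setup (hypothetical values): 3 topics over a 4-word vocabulary.
rng = random.Random(0)
alpha = [0.5, 0.5, 0.5]
topic_word_probs = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.4, 0.4],
]
theta, topics, words = generate_document(20, alpha, topic_word_probs, rng)
```

Because the document is a bag of words, only the counts in `words` matter, not their order. In inference the direction is reversed: `words` is observed, while `theta` and `topics` are the latent variables to be estimated.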
Issei Sato, Hiroshi Nakagawa
Added 15 Aug 2010
Updated 15 Aug 2010
Type Conference
Year 2010
Where KDD