Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model is still computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. In traditional approximation algorithms, every random variable is sampled or estimated over a single predefined burn-in period; our method is instead based on the observation that the random variable nodes in a topic model converge after different numbers of iterations. During the iterative approximation process, the proposed method allows each random variable node to be terminated, or deactivated, once it has converged. Therefore, compared to traditional approaches, in which all nodes are usually deactivated at the same time, the proposed method improves the time and memory efficiency of inference. We do not propose a new approximation algorithm, but rather a new process applicable to existing approximation algorithms. Through experiments, we show the time and memory efficiency of the method and discuss the tradeoff between the efficiency of the approximation process and parameter consistency.
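The idea of deactivating each node as soon as it converges, rather than stopping all nodes together, can be illustrated with a minimal sketch. This is not the paper's algorithm: a deterministic toy update stands in for a sampling or variational step, and the names `Node`, `step`, `run`, and `make_nodes`, the per-node `rate`, and the tolerance `tol` are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for one random variable node (all fields hypothetical)."""
    value: float
    target: float          # fixed point the toy update converges to
    rate: float            # per-node convergence speed, in (0, 1]
    active: bool = True
    stopped_at: int = -1   # iteration at which the node was deactivated


def step(node: Node) -> float:
    """One toy update; returns the size of the change it made."""
    delta = node.rate * (node.target - node.value)
    node.value += delta
    return abs(delta)


def run(nodes, tol=1e-6, max_iters=10_000, simultaneous=False) -> int:
    """Iterate until every node is deactivated; return the total update count.

    simultaneous=False: each node is deactivated as soon as its own change
                        falls below tol (per-node deactivation).
    simultaneous=True:  baseline in which all nodes keep updating until every
                        one of them has converged, then all stop together.
    """
    updates = 0
    for iteration in range(1, max_iters + 1):
        active = [n for n in nodes if n.active]
        if not active:
            break
        converged = []
        for n in active:
            change = step(n)
            updates += 1
            if change < tol:
                converged.append(n)
        if simultaneous and len(converged) < len(active):
            continue  # baseline: nobody stops until everybody has converged
        for n in converged:
            n.active = False
            n.stopped_at = iteration
    return updates


def make_nodes():
    """Four nodes that converge at very different speeds (assumed rates)."""
    return [Node(value=0.0, target=1.0, rate=r) for r in (0.9, 0.5, 0.1, 0.02)]


if __name__ == "__main__":
    per_node, baseline = make_nodes(), make_nodes()
    print("per-node deactivation updates:", run(per_node))
    print("simultaneous baseline updates:", run(baseline, simultaneous=True))
    print("stop iterations:", [n.stopped_at for n in per_node])
```

In this toy setting the per-node variant performs fewer updates because fast nodes stop early, while each stopped node keeps a small residual error governed by `tol`. That residual is a simplified analogue of the efficiency versus parameter consistency tradeoff mentioned in the abstract.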
Publisher
KSII-KOR SOC INTERNET INFORMATION
Issue Date
2013-01
Language
English
Article Type
Article
Citation

KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, v.7, no.1, pp.81 - 98

ISSN
1976-7277
DOI
10.3837/tiis.2013.01.006
URI
http://hdl.handle.net/10203/174602
Appears in Collection
CS-Journal Papers
