Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model

DC Field | Value | Language
dc.contributor.author | Jeong, Young-Seob | ko
dc.contributor.author | Jin, Sou-Young | ko
dc.contributor.author | Choi, Ho-Jin | ko
dc.date.accessioned | 2013-08-08T05:46:04Z | -
dc.date.available | 2013-08-08T05:46:04Z | -
dc.date.created | 2013-04-09 | -
dc.date.issued | 2013-01 | -
dc.identifier.citation | KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, v.7, no.1, pp.81 - 98 | -
dc.identifier.issn | 1976-7277 | -
dc.identifier.uri | http://hdl.handle.net/10203/174602 | -
dc.description.abstract | Since Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) were introduced, many revised or extended topic models have appeared. Because the likelihood of these models is intractable, training any topic model requires an approximation algorithm such as variational approximation, Laplace approximation, or Markov chain Monte Carlo (MCMC). Although these approximation algorithms perform well, training a topic model remains computationally expensive given the large amount of data it requires. In this paper, we propose a new method, called non-simultaneous sampling deactivation, for efficient approximation of the parameters of a topic model. Whereas traditional approximation algorithms sample or update every random variable for a single predefined burn-in period, our method is based on the observation that the random variable nodes in a topic model converge at different rates. During the iterative approximation process, the proposed method terminates, or deactivates, each random variable node as soon as it has converged. Therefore, compared to traditional approximation schemes, in which every node is usually deactivated concurrently, the proposed method improves inference efficiency in terms of both time and memory. We do not propose a new approximation algorithm, but a new process applicable to existing approximation algorithms. Through experiments, we demonstrate the time and memory efficiency of the method, and discuss the tradeoff between the efficiency of the approximation process and parameter consistency. | -
dc.language | English | -
dc.publisher | KSII-KOR SOC INTERNET INFORMATION | -
dc.title | Non-Simultaneous Sampling Deactivation during the Parameter Approximation of a Topic Model | -
dc.type | Article | -
dc.identifier.wosid | 000315022000006 | -
dc.identifier.scopusid | 2-s2.0-84873435594 | -
dc.type.rims | ART | -
dc.citation.volume | 7 | -
dc.citation.issue | 1 | -
dc.citation.beginningpage | 81 | -
dc.citation.endingpage | 98 | -
dc.citation.publicationname | KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS | -
dc.identifier.doi | 10.3837/tiis.2013.01.006 | -
dc.contributor.localauthor | Choi, Ho-Jin | -
dc.contributor.nonIdAuthor | Jeong, Young-Seob | -
dc.contributor.nonIdAuthor | Jin, Sou-Young | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Topic mining | -
dc.subject.keywordAuthor | unsupervised learning | -
dc.subject.keywordAuthor | efficient parameter approximation | -
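To make the idea in the abstract concrete, the sketch below shows per-node deactivation in a generic iterative sampling loop. This is not the paper's implementation: the toy Gaussian nodes, the running-mean convergence test, and the `tol` and `check_every` parameters are all illustrative assumptions; the paper applies the idea to the random variable nodes of a topic model under an existing approximation algorithm such as MCMC.

```python
import random

# Hedged toy sketch of non-simultaneous sampling deactivation.
# The paper's exact convergence criterion is not reproduced here; we use a
# hypothetical one: a node is "converged" when its running mean changes by
# less than `tol` between successive checks.

random.seed(0)

# Hypothetical "random variable nodes" with different noise levels,
# so they converge at visibly different rates.
nodes = [
    {"name": "z1", "true": 1.0, "noise": 0.1},
    {"name": "z2", "true": 2.0, "noise": 1.0},
    {"name": "z3", "true": 3.0, "noise": 5.0},
]
for n in nodes:
    n.update(total=0.0, count=0, prev_mean=None, active=True)

tol = 1e-3          # assumed convergence tolerance
check_every = 100   # assumed interval between convergence checks
max_iters = 200_000

for it in range(1, max_iters + 1):
    active = [n for n in nodes if n["active"]]
    if not active:
        break  # every node has been deactivated
    for n in active:
        # One "sampling" step: draw from the node's conditional (toy Gaussian).
        sample = random.gauss(n["true"], n["noise"])
        n["total"] += sample
        n["count"] += 1
    if it % check_every == 0:
        for n in active:
            mean = n["total"] / n["count"]
            if n["prev_mean"] is not None and abs(mean - n["prev_mean"]) < tol:
                n["active"] = False  # deactivate: stop sampling this node
                print(f"{n['name']} deactivated at iteration {it}, estimate {mean:.3f}")
            n["prev_mean"] = mean
```

Noisier nodes pass the (assumed) convergence test later, so they stay active longer; each deactivated node is skipped on all subsequent iterations and its sampling state could be freed, which is the source of the time and memory savings the abstract describes.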
Appears in Collection
CS-Journal Papers (저널논문)
Files in This Item
There are no files associated with this item.
