Guided HTM: Hierarchical Topic Model with Dirichlet Forest Priors

DC Field | Value | Language
dc.contributor.author | Shin, Su-Jin | ko
dc.contributor.author | Moon, Il-Chul | ko
dc.date.accessioned | 2017-04-14T08:14:29Z | -
dc.date.available | 2017-04-14T08:14:29Z | -
dc.date.created | 2016-11-22 | -
dc.date.issued | 2017-02 | -
dc.identifier.citation | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, v.29, no.2, pp.330 - 343 | -
dc.identifier.issn | 1041-4347 | -
dc.identifier.uri | http://hdl.handle.net/10203/223034 | -
dc.description.abstract | Despite the proliferation of topic models, the organization of topics from the probabilistic models needs improvement in two ways: a better-structured presentation of topics and the incorporation of domain knowledge on the corpus. The structured presentation, i.e., the hierarchical topic model, helps in categorizing similar topics; the incorporation of domain knowledge enables the concentrated sampling of predefined keywords in the mixture parameter learning. This paper presents a hierarchical topic model with incorporated domain knowledge, called the Guided Hierarchical Topic Model (GHTM). Specifically, we allocated the prior information from the knowledge to the Dirichlet Forest prior. From the prior adjustment, we obtained the topic tree guided by the domain knowledge. This paper also contributes by enumerating four different knowledge extraction methods and applying the extracted knowledge to GHTM. We evaluated the performance of GHTM in terms of hierarchical clustering accuracy, and we found a significant improvement in hierarchical clustering as measured by F-measures. This improvement is also verified by the perplexity analyses. Additionally, we measured topic quality with KL-divergence and visualization, and these confirm the ability to better separate topic distributions. Finally, we tested the hierarchical topic quality through human experiments, and this also revealed significant improvements originating from the guidance. | -
dc.language | English | -
dc.publisher | IEEE COMPUTER SOC | -
dc.title | Guided HTM: Hierarchical Topic Model with Dirichlet Forest Priors | -
dc.type | Article | -
dc.identifier.wosid | 000393987400007 | -
dc.identifier.scopusid | 2-s2.0-85009959808 | -
dc.type.rims | ART | -
dc.citation.volume | 29 | -
dc.citation.issue | 2 | -
dc.citation.beginningpage | 330 | -
dc.citation.endingpage | 343 | -
dc.citation.publicationname | IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING | -
dc.identifier.doi | 10.1109/TKDE.2016.2625790 | -
dc.contributor.localauthor | Moon, Il-Chul | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Hierarchical topic model | -
dc.subject.keywordAuthor | Dirichlet Forest priors | -
dc.subject.keywordAuthor | domain knowledge | -
dc.subject.keywordPlus | DISTRIBUTIONS | -
Appears in Collection
IE-Journal Papers (Journal Papers)