BIGMiner: a fast and scalable distributed frequent pattern miner for big data

Cited 22 times in Web of Science; cited 22 times in Scopus
DC Field | Value | Language
dc.contributor.author | Chon, Kang-Wook | ko
dc.contributor.author | Kim, Min-Soo | ko
dc.date.accessioned | 2020-03-19T02:25:30Z | -
dc.date.available | 2020-03-19T02:25:30Z | -
dc.date.created | 2020-03-10 | -
dc.date.issued | 2018-09 | -
dc.identifier.citation | CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, v.21, no.3, pp.1507 - 1520 | -
dc.identifier.issn | 1386-7857 | -
dc.identifier.uri | http://hdl.handle.net/10203/272667 | -
dc.description.abstract | Frequent itemset mining is widely used as a fundamental data mining technique. Recently, a number of MapReduce-based frequent itemset mining methods have been proposed to overcome the limits on data size and mining speed that sequential mining methods have. However, the existing MapReduce-based methods still do not scale well due to high workload skewness, large intermediate data, and large network communication overhead. In this paper, we propose BIGMiner, a fast and scalable MapReduce-based frequent itemset mining method. BIGMiner generates equal-sized sub-databases called transaction chunks and performs support counting based only on transaction chunks and bitwise operations, without generating and shuffling intermediate data. As a result, BIGMiner achieves very high scalability due to no workload skewness, no intermediate data, and small network communication overhead. Through extensive experiments using large-scale datasets of up to 6.5 billion transactions, we have shown that BIGMiner consistently and significantly outperforms the state-of-the-art methods without any memory problems. | -
dc.language | English | -
dc.publisher | SPRINGER | -
dc.title | BIGMiner: a fast and scalable distributed frequent pattern miner for big data | -
dc.type | Article | -
dc.identifier.wosid | 000457275200004 | -
dc.identifier.scopusid | 2-s2.0-85041818619 | -
dc.type.rims | ART | -
dc.citation.volume | 21 | -
dc.citation.issue | 3 | -
dc.citation.beginningpage | 1507 | -
dc.citation.endingpage | 1520 | -
dc.citation.publicationname | CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | -
dc.identifier.doi | 10.1007/s10586-018-1812-0 | -
dc.contributor.localauthor | Kim, Min-Soo | -
dc.contributor.nonIdAuthor | Chon, Kang-Wook | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Frequent pattern mining | -
dc.subject.keywordAuthor | Big data | -
dc.subject.keywordAuthor | Scalable algorithm | -
dc.subject.keywordAuthor | Distributed algorithm | -
dc.subject.keywordAuthor | MapReduce | -
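
The abstract describes support counting over equal-sized transaction chunks using bitwise operations. The record gives no algorithmic detail, so the following is only a minimal single-machine sketch of that general idea, not the authors' implementation: each chunk is assumed to store one bitmap per item (here a Python integer), and an itemset's support is the sum over chunks of the set bits in the AND of its items' bitmaps. The names `build_chunks`, `partial_support`, and `support` are hypothetical, and the MapReduce distribution of chunks is not modeled.

```python
from functools import reduce
from typing import Dict, Iterable, List, Sequence

# Assumed layout: one bitmap per item and chunk; bit i is set when the
# i-th transaction of that chunk contains the item.
Chunk = Dict[str, int]


def build_chunks(transactions: Sequence[Iterable[str]], chunk_size: int) -> List[Chunk]:
    """Split the database into chunks of at most chunk_size transactions."""
    chunks: List[Chunk] = []
    for start in range(0, len(transactions), chunk_size):
        chunk: Chunk = {}
        for offset, transaction in enumerate(transactions[start:start + chunk_size]):
            for item in transaction:
                chunk[item] = chunk.get(item, 0) | (1 << offset)
        chunks.append(chunk)
    return chunks


def partial_support(chunk: Chunk, itemset: Iterable[str]) -> int:
    """Count transactions in one chunk that contain every item of the itemset."""
    bitmaps = [chunk.get(item, 0) for item in itemset]
    if not bitmaps:
        return 0
    combined = reduce(lambda a, b: a & b, bitmaps)  # bitwise AND of item bitmaps
    return bin(combined).count("1")                 # population count


def support(chunks: List[Chunk], itemset: Iterable[str]) -> int:
    """Total support: sum of per-chunk counts, with no intermediate itemset data."""
    return sum(partial_support(chunk, itemset) for chunk in chunks)


transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b"],
    ["a", "b", "c"],
]
chunks = build_chunks(transactions, chunk_size=2)
print(support(chunks, ["a", "b"]))
```

Because every chunk holds the same number of transactions (except possibly the last), each per-chunk count costs roughly the same, which is one plausible reading of the abstract's claim of no workload skewness.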
Appears in Collection
CS-Journal Papers