SGMiner: A Fast and Scalable GPU-Based Frequent Pattern Miner on SSDs

Frequent itemset mining is widely used as an essential data mining technique. However, as data size grows, its applicability decreases owing to the relatively poor performance of existing methods. Although numerous efficient sequential frequent itemset mining methods have been developed, their achievable performance is clearly limited because they exploit only one thread. To overcome these limitations, a number of parallel methods using multi-core central processing units (CPUs), multiple machines, or many-core graphics processing units (GPUs) have been proposed. However, these methods are relatively slow and scale poorly, mainly owing to large memory requirements for intermediate data, significant disk I/O, and heavy computation. In this study, to resolve these problems, we propose SGMiner, a new, fast, and scalable GPU- and disk-based method for extracting frequent patterns on a single machine equipped with multiple GPUs and multiple solid-state drives (SSDs). It is based on an algorithm similar to the Apriori algorithm and, owing to its exploitation of SSDs, incurs neither intermediate data nor large disk I/O overheads. Moreover, we propose storing transaction databases in SSDs as bitmap transaction chunks, streaming the chunks to GPU device memory via the main memory with reduced I/O overhead, and performing fast support counting on the GPUs based on the chunks. In addition, when exploiting multiple GPUs and SSDs, we propose replicating the bitmap transaction chunks stored in SSDs to the GPUs in a streaming fashion. This allows a nearly equal workload to be distributed across multiple GPUs with reduced I/O overheads. The experiments we conducted demonstrate that SGMiner outperforms the existing methods in terms of scalability and performance, with enhanced robustness.
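The abstract does not spell out the bitmap layout or the GPU kernels, so the following is only a minimal CPU-side sketch of the general idea of chunked, bitmap-based support counting, not SGMiner's actual implementation. It assumes a vertical layout in which each chunk maps an item to an integer bitmap over the transactions in that chunk; the names build_chunks, count_support, and partition_candidates are hypothetical, and the round-robin split merely illustrates how candidates might be divided among workers that each scan every chunk.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence

# Assumed layout (not taken from the paper): within a chunk, each item maps to
# an integer bitmap whose bit t is set if transaction t of the chunk has the item.
Chunk = Dict[str, int]
Itemset = FrozenSet[str]


def build_chunks(transactions: Sequence[Iterable[str]], chunk_size: int) -> List[Chunk]:
    """Split the transaction database into fixed-size bitmap chunks."""
    chunks: List[Chunk] = []
    for start in range(0, len(transactions), chunk_size):
        chunk: Chunk = {}
        for offset, items in enumerate(transactions[start:start + chunk_size]):
            for item in items:
                chunk[item] = chunk.get(item, 0) | (1 << offset)
        chunks.append(chunk)
    return chunks


def count_support(chunks: Iterable[Chunk], candidates: Sequence[Itemset]) -> Dict[Itemset, int]:
    """Visit chunks one at a time and accumulate each candidate's support."""
    support = {cand: 0 for cand in candidates}
    for chunk in chunks:  # stands in for streaming a chunk to device memory
        for cand in candidates:
            if not cand:
                continue  # empty candidates are skipped in this sketch
            items = iter(cand)
            bits = chunk.get(next(items), 0)
            for item in items:
                bits &= chunk.get(item, 0)  # AND keeps transactions with all items
            support[cand] += bin(bits).count("1")  # popcount of the intersection
    return support


def partition_candidates(candidates: Sequence[Itemset], n_workers: int) -> List[List[Itemset]]:
    """Round-robin split of candidates; every worker would scan every chunk."""
    return [list(candidates[i::n_workers]) for i in range(n_workers)]


if __name__ == "__main__":
    db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    cands = [frozenset("ab"), frozenset("ac"), frozenset("abc"), frozenset("b")]
    chunks = build_chunks(db, chunk_size=2)
    for cand, count in count_support(chunks, cands).items():
        print(sorted(cand), count)
```

Because every worker in this sketch sees the same chunks and only the candidate list is split, merging the per-worker dictionaries yields the same counts as a single pass, which is the property the replication idea relies on.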
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2022-06
Language
English
Article Type
Article
Citation
IEEE Access, v.10, pp.62502-62519

ISSN
2169-3536
DOI
10.1109/ACCESS.2022.3179592
URI
http://hdl.handle.net/10203/297293
Appears in Collection
CS-Journal Papers