A sparse-dense GEMM accelerator based on row-wise product for memory-efficient graph convolutional neural networks
Graph Convolutional Neural Networks (GCNs) have emerged as a powerful Deep Neural Network (DNN)-based method for representing relationships among input data. A GCN consists of two phases, aggregation and combination. Because these phases have heterogeneous characteristics, general-purpose computing resources (CPUs, GPUs) do not achieve sufficient performance. Prior works proposed hardware architectures to accelerate GCN inference, but they still suffer from the memory-bound nature of GCNs because 2D tiling leads to inefficient dataflow and data reuse. This thesis proposes a tile-free sparse matrix for sparse-dense GEMM and employs the row-wise product to eliminate the disadvantages of 2D tiling. It also proposes a data reuse strategy that supports the row-wise dataflow, along with micro-architectures that maximize memory-level parallelism and hardware utilization. For evaluation, cycle-accurate simulators are implemented and tested on a wide range of real-world graph datasets. The proposed architecture achieves a 2× reduction in off-chip memory accesses, a 2.8× speedup, and 2.3× higher energy efficiency compared to the prior GCN accelerator, GCNAX.
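To make the row-wise product concrete, the following is a minimal software sketch of a sparse-dense matrix multiplication in which each nonzero A[i, k] scales row k of the dense matrix and accumulates it into row i of the output. It is not the proposed hardware architecture: the CSR layout, the function name `spmm_row_wise`, and the small example matrices are assumptions chosen only for illustration.

```python
def spmm_row_wise(indptr, indices, data, dense, n_cols):
    """Row-wise product: C[i, :] = sum over k of A[i, k] * B[k, :].

    A is given in CSR form (indptr, indices, data); this layout is an
    assumption for the sketch, not the thesis's tile-free format.
    """
    n_rows = len(indptr) - 1
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        row_c = out[i]
        # Visit only the nonzeros of sparse row i.
        for p in range(indptr[i], indptr[i + 1]):
            k = indices[p]       # column index of the nonzero
            a_ik = data[p]       # value A[i, k]
            row_b = dense[k]     # the dense row selected by k
            for j in range(n_cols):
                row_c[j] += a_ik * row_b[j]
    return out


# Hypothetical example: A = [[1, 0, 2], [0, 0, 3], [4, 5, 0]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0, 5.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

C = spmm_row_wise(indptr, indices, data, B, n_cols=2)
print(C)  # [[11.0, 14.0], [15.0, 18.0], [19.0, 28.0]]
```

Each output row is completed before the next one starts, so no partial sums for other rows need to be kept, and the sparse matrix is streamed once in row order without being cut into 2D tiles.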