This article proposes the TSUNAMI, which supports energy-efficient deep-neural-network training. The TSUNAMI supports multi-modal iterative pruning to generate zeros in activations and weights. A tile-based dynamic activation pruning unit and a weight-memory-shared pruning unit eliminate additional memory accesses. A coarse-zero-skipping controller skips multiple unnecessary multiply-and-accumulate (MAC) operations at once, while a fine-zero-skipping controller skips randomly located unnecessary MAC operations. A weight sparsity balancer resolves the utilization degradation caused by weight-sparsity imbalance, and the workload of each convolution core is allocated by a random channel allocator. The TSUNAMI achieves an energy efficiency of 3.42 TFLOPS/W at 0.78 V and 50 MHz with 8-bit floating-point activations and weights. It also achieves an energy efficiency of 405.96 TFLOPS/W under a 90% sparsity condition.
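As a software analogy only (not the hardware implementation described in the article), the interplay of coarse- and fine-zero skipping can be sketched as follows: coarse skipping drops an entire zero tile of activations in one step, while fine skipping drops individual MACs whose activation or weight operand is zero. The function name, tile size, and array shapes here are illustrative assumptions.

```python
import numpy as np

def sparse_mac(activations, weights, tile=4):
    """Illustrative sketch of coarse/fine zero skipping in a MAC loop.

    Coarse-zero skip: if a whole tile of activations is zero, all of its
    MAC operations are skipped at once. Fine-zero skip: within a nonzero
    tile, individual MACs with a zero operand are skipped.
    """
    acc = 0.0
    executed = 0  # MAC operations actually performed
    for start in range(0, len(activations), tile):
        a_tile = activations[start:start + tile]
        w_tile = weights[start:start + tile]
        # Coarse-zero skip: the entire activation tile is zero
        if not np.any(a_tile):
            continue
        for a, w in zip(a_tile, w_tile):
            # Fine-zero skip: randomly located zero operands
            if a == 0.0 or w == 0.0:
                continue
            acc += a * w
            executed += 1
    return acc, executed

a = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 2.0, 0.5])
w = np.array([1.0, 2.0, 3.0, 4.0, 0.5, 6.0, 0.0, 2.0])
result, macs = sparse_mac(a, w)
```

With these inputs, the first tile is skipped entirely (coarse) and only the two MACs with both operands nonzero execute (fine), yet the accumulated result equals the dense dot product.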