(A) fast distributed deep learning platform based on virtual shared memory framework for high performance computing system

Deep learning is one of the most promising machine learning methodologies. It is widely used in, for example, image recognition, voice recognition, and natural language processing. To improve learning accuracy, deep neural networks have evolved by (i) increasing the number of layers and (ii) increasing the number of parameters in massive models. Distributed deep learning platforms must therefore evolve to handle huge and complex deep learning models and to process massive training data with high performance computing resources. The problems that distributed deep learning platforms should address are to communicate deep learning parameters at high speed between distributed deep learning processes and to reduce the parameter traffic. To exchange deep learning parameters fast, the inherent inefficiency of existing communication libraries and protocols must be overcome.

First, this thesis proposes a novel virtual shared memory framework, called Soft Memory Box (SMB), which enables distributed processes on computing servers to share the memory of remote servers with low overhead, thereby improving communication performance. Second, this thesis proposes a new distributed deep learning platform, named ShmCaffe, which uses remote shared memory to reduce the communication overhead of sharing training parameters for massive deep neural networks. ShmCaffe is designed on top of the SMB virtual shared memory framework. In the ShmCaffe platform, the remote shared memory is used as a shared buffer for asynchronous massive parameter sharing among many distributed deep learning processes. Moreover, a hybrid method that combines asynchronous and synchronous parameter updates is also introduced to improve scalability.

According to the first set of performance evaluation results, SMB communication is 2.1 times faster than the message passing interface (MPI) in the scenario where computation and communication are sequential. In the parallel computation-communication scenario, the communication time of SMB-based asynchronous parameter updates is 2 to 7 times faster than that of MPI, depending on the deep learning model and the number of deep learning workers. In the second evaluation, this thesis verifies that Inception_v1 model training using ShmCaffe converges as the number of workers varies. The scalability of ShmCaffe is evaluated by comparing the Inception_v1 training time of asynchronous ShmCaffe and hybrid ShmCaffe. ShmCaffe is 10.1 times faster than Caffe, 2.8 times faster than Caffe-MPI, and 2.6 times faster than TensorFlow when training Inception_v1 with 16 GPUs. The main benefits come from reducing communication traffic and from scaling out the deep learning workers. As a result, ShmCaffe improves the productivity of deep learning developers, reduces cost by increasing the utilization of computation resources, and overcomes the heterogeneity of GPU servers.
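To make the asynchronous parameter sharing described above concrete, the sketch below illustrates the general idea using Python's local multiprocessing.shared_memory module as a stand-in for the remote SMB buffer. It is a minimal sketch under stated assumptions, not the ShmCaffe implementation; the names PARAM_COUNT, create_smb_region, and worker_step are hypothetical and do not come from the thesis.

    import numpy as np
    from multiprocessing import shared_memory

    # Illustrative sketch only: a local shared-memory buffer standing in for
    # the remote SMB region. PARAM_COUNT, create_smb_region, and worker_step
    # are hypothetical names, not interfaces from the thesis.
    PARAM_COUNT = 1_000_000  # number of float32 model parameters in the buffer

    def create_smb_region(name="smb_params"):
        """Allocate the shared buffer that plays the role of the SMB memory."""
        nbytes = PARAM_COUNT * 4  # float32 parameters
        shm = shared_memory.SharedMemory(name=name, create=True, size=nbytes)
        params = np.ndarray((PARAM_COUNT,), dtype=np.float32, buffer=shm.buf)
        params[:] = 0.0  # initial global parameters
        return shm, params

    def worker_step(shm_name, local_gradient, lr=0.01):
        """One asynchronous update: attach to the shared buffer, apply a local
        gradient in place, and detach without synchronizing with other workers
        (a rough, Hogwild-style analogue of the asynchronous update above)."""
        shm = shared_memory.SharedMemory(name=shm_name)
        params = np.ndarray((PARAM_COUNT,), dtype=np.float32, buffer=shm.buf)
        params -= lr * local_gradient  # lock-free, in-place update
        del params  # release the view before closing the mapping
        shm.close()

    if __name__ == "__main__":
        shm, params = create_smb_region()
        fake_gradient = np.random.randn(PARAM_COUNT).astype(np.float32)
        worker_step(shm.name, fake_gradient)
        print("first 5 shared parameters after one update:", params[:5])
        del params
        shm.close()
        shm.unlink()

In the actual platform, the shared buffer resides in the memory of a remote server accessed through the SMB framework rather than in local process memory, and the hybrid update method additionally interleaves synchronous aggregation rounds with these asynchronous writes to improve scalability.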
Advisors
Kang, Sungwon (강성원)
Description
KAIST : Department of Information and Communications Engineering
Publisher
KAIST (Korea Advanced Institute of Science and Technology)
Issue Date
2018
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - KAIST : Department of Information and Communications Engineering, 2018.8, [vii, 104 p.]

Keywords

High performance computing; distributed computing; soft memory box; shared memory; deep neural network; distributed deep learning

URI
http://hdl.handle.net/10203/265371
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=828239&flag=dissertation
Appears in Collection
ICE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
