Hardware and software systems for accelerating large-scale deep learning recommendation models

Deep learning-based recommendation models (DLRMs) are widely used for personalized recommendation. They employ learnable vector parameters, known as embeddings, that represent the individualized characteristics of users and of recommended items such as media content, products, and ads. A unique characteristic of DLRMs is that, due to the embedding layer, the model size scales in proportion to the size of the online service. Consequently, DLRMs reach the terabyte scale for massive online services like Facebook, far exceeding the capacity of bandwidth-optimized accelerator memory. The memory capacity and bandwidth demands of these enlarged embedding layers pose new system-level challenges in training and deploying large-scale recommendation models. This dissertation addresses the bottlenecks of large-scale deep learning recommendation models by proposing novel hardware and software systems.

The dissertation first identifies the enlarged embedding layers as the major performance challenge in DLRMs. It characterizes the computational behavior of such layers and proposes near-memory processing (NMP) based accelerator hardware that efficiently stores and processes these embeddings. The proposed vertically integrated hardware/software co-design encompasses the required microarchitecture, instruction set architecture (ISA), system architecture, software stack, and a workload parallelization algorithm. To extend NMP-based embedding acceleration to the training context, the dissertation further presents an algorithm-architecture co-design that establishes a theoretical foundation for hardware accelerator design for the embedding layer.

While such specialized hardware-based acceleration systems can fundamentally address the challenges posed by large embedding layers, developing and maintaining them incurs non-trivial costs. As a cost-effective alternative, this dissertation also presents software optimization techniques. Exploiting the highly sparse and skewed access patterns of the embedding layers, it presents a software-managed caching system that uses high-bandwidth GPU memory to cache frequently accessed embedding entries. The proposed software system leverages a unique characteristic of recommendation model training to perfectly prefetch soon-to-be-accessed embedding entries, boosting training speed. Lastly, the dissertation analyzes the challenges of building software systems that exploit the locality of the embedding layer during inference, and proposes a new caching technique for the embedding layer. The proposed caching mechanism leverages the massively parallel address translation hardware in the accelerator to eliminate the bottlenecks of a software-managed embedding cache, making it highly effective for recommendation inference acceleration.
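As a rough illustration of the embedding-layer behavior described above, the following minimal Python/PyTorch sketch shows (1) the sparse, memory-bound gather-reduce that embedding lookups perform, and (2) a toy software-managed cache that prefetches the rows an upcoming batch will touch. It assumes that future training inputs are known ahead of time (the premise behind "perfect" prefetching); the table sizes, the EmbeddingCache class, and its capacity are hypothetical choices for illustration, not the dissertation's actual implementation.

    # Minimal sketch (hypothetical, not the dissertation's implementation).
    import torch

    NUM_ROWS, DIM = 1_000_000, 64       # toy sizes; real tables reach TB scale
    table = torch.nn.EmbeddingBag(NUM_ROWS, DIM, mode="sum")

    # (1) One batch: 512 samples, each pooling 8 sparse lookups. Almost no
    # arithmetic per byte moved -- the memory-bandwidth bottleneck that
    # motivates near-memory processing of embedding layers.
    indices = torch.randint(0, NUM_ROWS, (512 * 8,))
    offsets = torch.arange(0, 512 * 8, 8)
    pooled = table(indices, offsets)     # shape: (512, 64)

    # (2) Toy software-managed cache: hot rows staged in fast (GPU-like)
    # memory, while the full table stays in large, slow memory.
    class EmbeddingCache:
        def __init__(self, weights, capacity):
            self.weights = weights       # full embedding table (slow memory)
            self.capacity = capacity
            self.cache = {}              # row id -> vector (fast memory)

        def prefetch(self, next_batch_ids):
            # If the next batch is known in advance, every row it needs can
            # be staged before it is accessed, hiding slow-memory latency.
            for rid in next_batch_ids.unique().tolist():
                if rid not in self.cache:
                    if len(self.cache) >= self.capacity:
                        self.cache.pop(next(iter(self.cache)))  # naive eviction
                    self.cache[rid] = self.weights[rid]

        def lookup(self, ids):
            # Serve from the cache on a hit; fall back to the full table.
            return torch.stack([self.cache[i] if i in self.cache
                                else self.weights[i]
                                for i in ids.tolist()])

    cache = EmbeddingCache(table.weight.data, capacity=4096)
    next_batch = torch.randint(0, NUM_ROWS, (64,))
    cache.prefetch(next_batch)           # stage rows before the batch runs
    vectors = cache.lookup(next_batch)   # all hits: served from fast memory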
Advisors
Minsoo Rhu (유민수)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2024.2, [vi, 79 p.]

Keywords

Deep learning; Recommendation system; Computer architecture; Memory-centric architecture; Accelerated computing; Embedding

URI
http://hdl.handle.net/10203/322136
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1100036&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
