SAL-PIM: a subarray-level processing-in-memory architecture for accelerating end-to-end generative transformer with LUT-based linear interpolation

DC Field: Value
dc.contributor.advisor: Kim, Joo-Young (김주영)
dc.contributor.author: Han, Wontak
dc.date.accessioned: 2023-06-26T19:33:48Z
dc.date.available: 2023-06-26T19:33:48Z
dc.date.issued: 2023
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032949&flag=dissertation
dc.identifier.uri: http://hdl.handle.net/10203/309864
dc.description: Thesis (Master's) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2023.2, [iv, 26 p.]
dc.description.abstract: Text generation is a representative application of machine learning. Various deep-learning models have been studied for text generation, and transformer-based models currently achieve state-of-the-art accuracy. Among them, transformer-decoder-based generative models, such as the generative pre-trained transformer (GPT), perform text generation in two stages: summarization and generation. Unlike the summarization stage, the generation stage is memory-bound because it operates sequentially, one token at a time. Processing-in-memory (PIM) accelerators have therefore been proposed repeatedly to address the von Neumann bottleneck. However, existing PIM accelerators either exploit only limited memory bandwidth or cannot accelerate the entire model. SAL-PIM is the first PIM architecture to accelerate the end-to-end transformer-decoder-based generative model. With an optimized mapping scheme, SAL-PIM exploits higher internal bandwidth through subarray-level arithmetic logic units (S-ALUs). To minimize area overhead, the S-ALU shares MAC units, exploiting the slow clock frequency of commands issued to the same bank. In addition, to support vector functions in PIM, DRAM cells are used as a look-up table (LUT), and the vector functions are computed by linear interpolation; a LUT-embedded subarray is proposed to optimize LUT operation in DRAM. Lastly, the channel-level arithmetic logic unit (C-ALU) performs accumulation and reduce-sum operations, enabling end-to-end inference on PIM. We implemented SAL-PIM in TSMC 28-nm CMOS technology and scaled it to DRAM technology to verify its feasibility. SAL-PIM incurs 23.43% additional area over the original DRAM, which is below the threshold reported in previous work. As a result, on the SAL-PIM simulator, the SAL-PIM architecture achieves up to 73.17x speedup, and 27.74x on average, for text generation on the GPT-2 medium model compared to a GPU.
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: Processing-in-memory; DRAM; Transformer; Text generation
dc.title: SAL-PIM: a subarray-level processing-in-memory architecture for accelerating end-to-end generative transformer with LUT-based linear interpolation
dc.title.alternative: 생성 트랜스포머의 종단간 가속을 위한 룩-업 테이블 기반 선형 보간을 이용하는 서브어레이-레벨 프로세싱-인-메모리 구조
dc.type: Thesis (Master)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology: School of Electrical Engineering
dc.contributor.alternativeauthor: 한원탁
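
The abstract above computes nonlinear vector functions by treating DRAM cells as a look-up table and linearly interpolating between adjacent entries. Below is a minimal Python sketch of that LUT-plus-interpolation idea; the 256-interval table, the [-8, 8) input range, and the choice of GELU as the sampled function are illustrative assumptions, not details taken from the thesis.

import numpy as np

# Illustrative parameters (not from the thesis): a 256-interval LUT
# covering [-8, 8). SAL-PIM stores the table in DRAM cells inside a
# LUT-embedded subarray; here a NumPy array stands in for those cells.
LO, HI, INTERVALS = -8.0, 8.0, 256
STEP = (HI - LO) / INTERVALS

def gelu(x):
    # Reference nonlinearity to sample (tanh approximation of GELU).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

# Sample INTERVALS + 1 points so every interval has both endpoints stored.
xs = LO + STEP * np.arange(INTERVALS + 1)
lut = gelu(xs)

def lut_interp(x):
    # Approximate gelu(x): pick the enclosing interval, then blend the
    # two stored endpoints by the fractional position of x within it.
    x = np.clip(x, LO, HI - 1e-9)          # keep the table index in range
    idx = ((x - LO) / STEP).astype(int)    # which interval x falls in
    frac = (x - LO) / STEP - idx           # position within the interval
    return lut[idx] + frac * (lut[idx + 1] - lut[idx])

x = np.linspace(-6.0, 6.0, 5)
print(lut_interp(x))  # LUT approximation
print(gelu(x))        # exact values for comparison

The point of the sketch: once the samples are stored, each function evaluation reduces to two table reads plus one multiply-add, which is why the operation maps naturally onto DRAM reads combined with the in-memory ALUs the thesis describes.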
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
