DSpace at KOASAS: (A) preemptible neural processing unit architecture and its applicability for QoS-aware scheduling

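(A) preemptible neural processing unit architecture and its applicability for QoS-aware scheduling
Alternative title (translated from Korean): Preemptible neural processor architecture and a scheduling algorithm for improving quality of service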

DC Field	Value	Language
dc.contributor.advisor	Rhu, Minsoo	-
dc.contributor.advisor	유민수	-
dc.contributor.author	Choi, Yujeong	-
dc.date.accessioned	2021-05-13T19:39:36Z	-
dc.date.available	2021-05-13T19:39:36Z	-
dc.date.issued	2020	-
dc.identifier.uri	http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=925235&flag=dissertation	en_US
dc.identifier.uri	http://hdl.handle.net/10203/285071	-
dc.description	Thesis (Master's) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2020.8, [iv, 28 p.]	-
dc.description.abstract	To meet the high demand for Deep Neural Network (DNN) computation, major cloud vendors offer Machine Learning (ML) acceleration as a service. Because DNN applications are compute-intensive, they generally run on GPUs or on Neural Processing Units (NPUs), chips designed specifically for DNN computation. Service providers use multi-tenancy, co-locating multiple DNN models on a single accelerator, to achieve high throughput and reduce the Total Cost of Ownership (TCO). However, current NPUs cannot preempt an ongoing task, so they cannot respond quickly to high-priority requests, which leads to Service Level Agreement (SLA) violations. To improve the Quality of Service (QoS) of ML services, this dissertation explores three possible preemption mechanisms for NPUs and proposes a scheduling algorithm (PREMA) that runs on top of the preemptible NPU. Overall, the proposed scheduler achieves average improvements of 7.8×, 1.4×, and 4.8× in latency, throughput, and SLA satisfaction, respectively.	-
dc.language	eng	-
dc.publisher	Korea Advanced Institute of Science and Technology (KAIST)	-
dc.subject	DNN (Deep Neural Network); NPU (Neural Processing Unit); preemption	-
dc.subject	QoS (Quality of Service); TCO (Total Cost of Ownership)	-
dc.subject	deep learning; neural processor; preemption; quality of service; total cost of ownership	-
dc.title	(A) preemptible neural processing unit architecture and its applicability for QoS-aware scheduling	-
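dc.title.alternative	Preemptible neural processor architecture and a scheduling algorithm for improving quality of service (선점 가능한 뉴럴 프로세서 구조 및 서비스 품질을 높이는 스케줄링 알고리즘)	-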
dc.type	Thesis(Master)	-
dc.identifier.CNRN	325007	-
dc.description.department	Korea Advanced Institute of Science and Technology: School of Electrical Engineering	-
dc.contributor.alternativeauthor	최유정	-
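
The abstract's central claim is that an NPU unable to preempt an ongoing task makes high-priority requests wait. The sketch below illustrates that effect with a toy single-accelerator simulation in Python. It is not PREMA and does not model any of the thesis's three preemption mechanisms: the `Task` fields, the `simulate` function, the unit time steps, and the zero-cost context switch are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    arrival: int   # time step at which the request arrives
    duration: int  # time steps of accelerator work needed
    priority: int  # larger value = more urgent (hypothetical convention)


def simulate(tasks, preemptive):
    """Return {task name: completion time} for one accelerator.

    Assumes unit time steps and a free context switch, which a real
    NPU would not have.
    """
    remaining = {t.name: t.duration for t in tasks}
    finish = {}
    running = None
    time = 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t.arrival <= time and t.name not in finish]
        if not ready:
            time += 1  # accelerator idles until the next arrival
            continue
        # Without preemption, a new task is chosen only when the
        # accelerator is free; with preemption, it is re-chosen every step.
        if preemptive or running is None or running.name in finish:
            running = max(ready, key=lambda t: (t.priority, -t.arrival))
        remaining[running.name] -= 1
        time += 1
        if remaining[running.name] == 0:
            finish[running.name] = time
    return finish


tasks = [
    Task("batch", arrival=0, duration=10, priority=0),
    Task("urgent", arrival=2, duration=3, priority=1),
]
print(simulate(tasks, preemptive=False))  # urgent finishes at t=13
print(simulate(tasks, preemptive=True))   # urgent finishes at t=5
```

In this toy case the urgent request's latency (completion minus arrival) falls from 11 to 3 time steps, while the batch task finishes 3 steps later. A realistic model would also charge a cost for saving and restoring the preempted task's state.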

Appears in Collection: EE-Theses_Master (Master's Theses)

Files in This Item: There are no files associated with this item.

KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.