(An) adaptive spatial-temporal GPU scheduling scheme for multi-domain DNNs services = 다중 도메인 DNN 서비스를 위한 적응형 시공간 GPU 스케줄링 기법

DC Field / Value
dc.contributor.advisor: Youn, Chan-Hyun
dc.contributor.advisor: 윤찬현
dc.contributor.author: Dinh, Khac Tuyen
dc.date.accessioned: 2023-06-26T19:34:31Z
dc.date.available: 2023-06-26T19:34:31Z
dc.date.issued: 2023
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032940&flag=dissertation (en_US)
dc.identifier.uri: http://hdl.handle.net/10203/309994
dc.description: Master's thesis - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2023.2, [iii, 37 p.]
dc.description.abstract: High-throughput Deep Neural Network (DNN) serving servers are essential for online service applications as deep learning techniques are adopted in an ever wider range of applications. A DNN serving server must satisfy a key requirement: it has to serve multiple heterogeneous DNN models while guaranteeing the service-level objective (SLO) of each model and, at the same time, improving system utilization and system-wide throughput. A GPU scheduler is therefore needed to orchestrate GPU resources across the DNN models. To meet these requirements for multi-domain DNN services, this thesis proposes an adaptive combined spatial-temporal GPU scheduling scheme. We first show the limitations of existing work, covering both conventional temporal scheduling and spatial scheduling on GPUs. Our experiments show that existing spatial scheduling approaches typically incur high resource-reconfiguration time and do not fully utilize the GPU computation resources. To tackle this problem, we propose a combined spatial-temporal GPU scheduling strategy that first deploys an adaptive GPU spatial-partitioning strategy, splitting the GPU computation into multiple parts (which we call GPU partitions) to schedule concurrently running models, and then defines a strategy to share each GPU partition temporally to further improve utilization of the GPU system. We investigate the factors that affect model performance under concurrent execution, formulate the latency function of spatial-temporally co-running models, and propose a heuristic approach to the resulting scheduling problem. Our evaluation shows that the proposed scheme significantly reduces resource-reconfiguration overhead and improves system throughput compared to other baseline works.
dc.language: eng
dc.publisher: 한국과학기술원 (Korea Advanced Institute of Science and Technology)
dc.subject: Deep learning inference; GPU scheduling; Spatial sharing; Temporal sharing; GPU resource partitioning
dc.subject: 딥 러닝 추론; GPU 스케줄링; 공간 공유; 시간 공유; GPU 리소스 파티셔닝
dc.title: (An) adaptive spatial-temporal GPU scheduling scheme for multi-domain DNNs services
dc.title.alternative: 다중 도메인 DNN 서비스를 위한 적응형 시공간 GPU 스케줄링 기법
dc.type: Thesis (Master)
dc.identifier.CNRN: 325007
dc.description.department: 한국과학기술원 : 전기및전자공학부 (KAIST, School of Electrical Engineering)
dc.contributor.alternativeauthor: 딩 칵 투옌
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
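The abstract above describes a scheme that first partitions GPU compute spatially among concurrently running models and then time-shares each partition while checking per-model SLOs. As a rough illustration of that idea only, and not the thesis's actual algorithm, the following minimal Python sketch greedily assigns models to hypothetical GPU partitions; the model names, partition fractions, and the simple latency formula are all illustrative assumptions.

```python
# Illustrative sketch: the latency model, partition fractions, and the greedy
# grouping heuristic are assumptions for exposition, not the thesis's method.
from dataclasses import dataclass
from typing import List

@dataclass
class Model:
    name: str
    slo_ms: float           # service-level objective (latency bound)
    base_latency_ms: float  # measured latency on a full, uncontended GPU

def est_latency(model: Model, partition_frac: float, n_sharing: int) -> float:
    """Assumed latency model: latency scales inversely with the compute share
    and linearly with the number of models time-sharing the partition."""
    return model.base_latency_ms / partition_frac * n_sharing

def schedule(models: List[Model], partition_fracs: List[float]) -> List[List[Model]]:
    """Greedy heuristic: take models in order of tightest SLO and place each on
    the smallest partition whose estimated latency still meets every SLO."""
    groups: List[List[Model]] = [[] for _ in partition_fracs]
    for m in sorted(models, key=lambda m: m.slo_ms):
        placed = False
        for i, frac in sorted(enumerate(partition_fracs), key=lambda p: p[1]):
            n = len(groups[i]) + 1  # temporal sharers on this partition if m is added
            if all(est_latency(x, frac, n) <= x.slo_ms for x in groups[i] + [m]):
                groups[i].append(m)
                placed = True
                break
        if not placed:
            raise RuntimeError(f"no feasible partition for {m.name}")
    return groups

if __name__ == "__main__":
    models = [Model("resnet50", 40, 8), Model("bert-base", 100, 20), Model("yolo", 60, 12)]
    fracs = [0.25, 0.25, 0.5]  # hypothetical spatial split of GPU compute
    for frac, grp in zip(fracs, schedule(models, fracs)):
        print(f"{int(frac * 100)}% partition -> {[m.name for m in grp]}")
```

Under these assumptions the sketch places each model on the smallest partition that keeps all co-located models within their SLOs, which mirrors the spatial-then-temporal sharing order the abstract describes; the thesis itself formulates the latency function and heuristic in full.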
