An adaptive spatial-temporal GPU scheduling scheme for multi-domain DNN services

High-throughput Deep Neural Network (DNN) serving servers are essential for online service applications as deep learning techniques are adopted in an ever wider range of applications. Serving DNN models on such servers imposes a key requirement: the server must serve multiple heterogeneous DNN models while guaranteeing the service-level objective (SLO) of each model and, at the same time, improving system utilization and system-wide throughput. A GPU scheduler is therefore required to orchestrate GPU resources across DNN models. To address these requirements for multi-domain DNN serving, this thesis proposes an adaptive combined spatial-temporal GPU scheduling scheme. We first show the limitations of existing work, covering both conventional temporal scheduling and spatial scheduling on GPUs. Our experiments show that existing spatial scheduling approaches often incur high resource-reconfiguration time and do not fully utilize the GPU's computation resources. To tackle this problem, we propose a combined spatial-temporal GPU scheduling strategy: it first deploys an adaptive spatial partitioning strategy that divides the GPU's computation into multiple parts, which we call GPU partitions, to schedule concurrently running models, and then defines a strategy to share each GPU partition temporally to further enhance utilization of the GPU system. We investigate the factors that affect model performance under concurrent execution, formulate the latency function of spatially and temporally co-running models, and propose a heuristic approach to the resulting scheduling problem. Our evaluation shows that the proposed scheme significantly reduces resource-reconfiguration overhead and enhances system throughput compared with baseline works.
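The thesis itself is not available for download here, but the scheduling idea in the abstract can be illustrated with a toy sketch: models are greedily placed onto spatial GPU partitions, and a partition may be time-shared by several models as long as every co-located model still meets its SLO. The latency model below (latency inversely proportional to the partition's compute fraction and linear in the number of time-sharing models) and all names (`Model`, `schedule`, `latency_ms`) are illustrative assumptions, not the thesis's actual formulation.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    base_latency_ms: float   # assumed latency when given 100% of the GPU
    slo_ms: float            # service-level objective

def latency_ms(model, partition_frac, shared_count):
    """Toy latency model: latency grows inversely with the partition's
    compute fraction and linearly with the number of models that
    time-share the partition (round-robin temporal sharing)."""
    return model.base_latency_ms / partition_frac * shared_count

def schedule(models, partition_fracs):
    """Greedy heuristic sketch: handle models with the tightest SLO
    first, and place each one on the smallest partition (possibly
    time-shared) where it and all co-located models keep their SLOs."""
    placement = {}  # partition index -> list of placed models
    for m in sorted(models, key=lambda m: m.slo_ms):
        placed = False
        # try partitions from smallest compute fraction to largest
        for i, frac in sorted(enumerate(partition_fracs), key=lambda t: t[1]):
            group = placement.get(i, []) + [m]
            shared = len(group)
            if all(latency_ms(g, frac, shared) <= g.slo_ms for g in group):
                placement.setdefault(i, []).append(m)
                placed = True
                break
        if not placed:
            return None  # infeasible under this partitioning
    return placement
```

Under this toy model, two light models can share one half-GPU partition: each model's latency doubles from time sharing but can still stay within its SLO, which is the utilization benefit the abstract attributes to combining spatial and temporal sharing.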
Advisors
Youn, Chan-Hyun
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2023.2, [iii, 37 p.]

Keywords

Deep Learning Inference; GPU Scheduling; Spatial Sharing; Temporal Sharing; GPU Resource Partitioning

URI
http://hdl.handle.net/10203/309994
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1032940&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
