Today's cloud vendors offer Machine Learning as a Service (MLaaS). Unlike training, inference requires far less computational power, so running inference on a GPU often leaves much of the device's compute capacity underutilized. Recently introduced reconfigurable GPUs allow providers to partition a single GPU into instances sized appropriately for each user's request, enabling them to lower their Total Cost of Ownership (TCO) through higher compute utilization. This dissertation proposes a method for improving compute utilization by configuring the multi-GPU inference server heterogeneously. The proposed partitioning algorithm (PARIS) derives a heterogeneous partition configuration based on the characteristics of the model and the deployment environment, and the proposed scheduling algorithm (ELSA) guarantees the Service Level Agreement (SLA). Together, the proposed partitioning and scheduling algorithms achieve up to 17.4x and 1.8x improvements in latency and throughput, respectively.
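Since the abstract does not spell out the internals of PARIS or ELSA, the following is only a minimal sketch of the underlying idea: dispatching inference requests across a heterogeneously partitioned GPU server while respecting a latency SLA. The partition sizes, latency estimates, and greedy selection policy below are illustrative assumptions for exposition, not the dissertation's actual algorithms.

```python
from dataclasses import dataclass

# Hypothetical sketch: route requests to heterogeneous GPU partitions so that
# each request's estimated latency stays within the SLA. All numbers and the
# greedy policy are assumptions, not the PARIS/ELSA algorithms themselves.

@dataclass
class Partition:
    name: str
    compute_slices: int      # relative compute share of this partition
    base_latency_ms: float   # estimated latency of one request on an idle partition
    queued: int = 0          # requests currently waiting on this partition

    def estimated_latency_ms(self) -> float:
        # Simple queueing estimate: each waiting request adds one service time.
        return self.base_latency_ms * (self.queued + 1)

def dispatch(partitions: list[Partition], sla_ms: float) -> Partition | None:
    """Greedy SLA-aware dispatch: among partitions expected to meet the SLA,
    prefer the smallest, keeping large partitions free for heavier load."""
    feasible = [p for p in partitions if p.estimated_latency_ms() <= sla_ms]
    if not feasible:
        return None  # SLA cannot be met; caller may repartition or reject
    choice = min(feasible, key=lambda p: p.compute_slices)
    choice.queued += 1
    return choice

if __name__ == "__main__":
    # An example heterogeneous configuration: one large, one medium, two small.
    server = [
        Partition("large", 4, base_latency_ms=5.0),
        Partition("medium", 2, base_latency_ms=9.0),
        Partition("small-0", 1, base_latency_ms=16.0),
        Partition("small-1", 1, base_latency_ms=16.0),
    ]
    for _ in range(6):
        p = dispatch(server, sla_ms=20.0)
        print(p.name if p else "rejected")
```

The sketch illustrates why heterogeneity helps: small requests land on small partitions, so the aggregate configuration serves more concurrent requests within the SLA than a server split into uniformly sized partitions.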