(An) efficient partitioning and scheduling algorithm for GPUs in designing machine learning inference server (GPU partitioning and scheduling method for artificial intelligence inference servers)

DC Field | Value | Language
dc.contributor.advisor | Rhu, Minsoo | -
dc.contributor.advisor | 유민수 | -
dc.contributor.author | Kim, Yunseong | -
dc.date.accessioned | 2023-06-26T19:34:20Z | -
dc.date.available | 2023-06-26T19:34:20Z | -
dc.date.issued | 2022 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=997183&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/309959 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2022.2, [iv, 28 p.] | -
dc.description.abstract | Today's cloud vendors offer Machine Learning as a Service (MLaaS). Unlike training, inference does not require high computational power, so inference on GPUs does not fully utilize the computational power of the device. Recently introduced GPUs allow providers to partition a single GPU into units sized to match the scale of users' requests, which makes it possible to lower Total Cost of Ownership (TCO) through increased compute utilization. This dissertation proposes a method for improving compute utilization by making a multi-GPU server heterogeneous. The proposed partitioning algorithm (PARIS) configures inference servers heterogeneously based on the model and the characteristics of the environment, and the accompanying scheduling method (ELSA) guarantees the Service Level Agreement (SLA). Together, the proposed partitioning and scheduling algorithms achieve improvements of up to 17.4x in latency and 1.8x in throughput. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.title | (An) efficient partitioning and scheduling algorithm for GPUs in designing machine learning inference server | -
dc.title.alternative | GPU partitioning and scheduling method for artificial intelligence inference servers | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering | -
dc.contributor.alternativeauthor | 김윤성 | -
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
