Cost-effective, SLO-aware machine learning inference system for heterogeneous instances in public cloud

DC Field: Value (Language)
dc.contributor.advisor: Huh, Jaehyuk
dc.contributor.advisor: 허재혁
dc.contributor.author: Kim, Jaehong
dc.date.accessioned: 2023-06-23T19:34:43Z
dc.date.available: 2023-06-23T19:34:43Z
dc.date.issued: 2023
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1030579&flag=dissertation (en_US)
dc.identifier.uri: http://hdl.handle.net/10203/309278
dc.description: Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2023.2, [iv, 51 p.]
dc.description.abstract: In recent years, cloud providers have released a growing range of hardware-accelerator products. For example, AWS has released GPU instances and inference-specific instances, and Google has released dedicated instances such as the TPU (Tensor Processing Unit). Cloud instances for inference become more diverse every year, so a cluster of instances in the cloud is increasingly heterogeneous. At the same time, Machine Learning (ML) models are spreading beyond image classification to text analysis, text generation, and sound classification. Running ML inference efficiently at large scale therefore requires analyzing how performance and cost relate across the available cloud instances and ML models. This study introduces the StageH system, which was implemented in a distributed, heterogeneous cloud environment. StageH serves a variety of ML models (e.g., ResNet, BERT, GPT, YAMNet, Inception) while meeting their SLOs as far as possible in the cloud environment where they run. In addition, its cost-effective autoscaling algorithm reduces cloud costs.
dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: Machine learning; Inference; Cloud cost; Heterogeneous cloud
dc.subject: 머신러닝; 추론; 클라우드 비용; 이기종 클라우드
dc.title: Cost-effective, SLO-aware machine learning inference system for heterogeneous instances in public cloud
dc.title.alternative: 퍼블릭 클라우드의 이기종 인스턴스를 위한 비용 효율적인 SLO 인식 머신 러닝 추론 시스템
dc.type: Thesis(Ph.D)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST): School of Computing
dc.contributor.alternativeauthor: 김재홍
Appears in Collection
CS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
