DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Kim, Joo-Young | - |
dc.contributor.advisor | 김주영 | - |
dc.contributor.author | Yoo, Sungyeob | - |
dc.date.accessioned | 2023-06-26T19:33:46Z | - |
dc.date.available | 2023-06-26T19:33:46Z | - |
dc.date.issued | 2022 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=997214&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/309856 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2022.2, [iii, 28 p.] | - |
dc.description.abstract | With the dominance of machine learning and artificial intelligence in today's technology, designing an accelerator platform for fast and efficient completion of inference workloads in datacenters is becoming essential. General-purpose processors such as CPUs and GPUs have mainly been used in datacenters, but they are not well suited to ML inference workloads because of their low performance and high power consumption. This paper proposes a vector-array architecture with heterogeneity-aware scheduling for multi-user/multi-DNN workloads in datacenters. It features a load balancer and multiple vector-array clusters, where each cluster consists of a scheduler, array processors, and vector processors. The main contributions are threefold. First, we devise the unified model format (UMF) to describe DNN models in a hardware-amenable packet form. Second, we propose a scheduling algorithm that efficiently allocates concurrent tasks to available resources at run time by estimating the computation and external memory access time. Third, we implement an analysis framework based on the implementation results of the proposed architecture. Using this framework, we conduct a design space exploration for this architecture and provide insights for advanced ML accelerator design. As a result, the proposed heterogeneity-aware scheduling algorithm improves throughput and energy efficiency by 82% and 21%, respectively, compared to a standard round-robin algorithm. This research was conducted in collaboration with Jung-Hoon Kim, a master's student at KAIST. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.title | Exploration of vector-array architecture with heterogeneity-aware scheduling for multi-user/multi-DNN workloads | - |
dc.title.alternative | 다중 사용자와 다중 심층 신경망 워크로드를 위한 이기종 인식 스케줄링을 갖는 벡터-어레이 구조 탐색 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | - |
dc.contributor.alternativeauthor | 유성엽 | - |
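
The abstract describes a scheduler that assigns concurrent tasks to heterogeneous processors by estimating computation and external memory access time, and compares it against round-robin. The record does not give the algorithm itself, so the following is only a minimal sketch of that general idea, not the thesis's method. It assumes a greedy earliest-estimated-finish policy and a cost model that simply adds compute time and memory time; the names `Processor`, `Task`, `estimate_time`, and all throughput and workload numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    ops_per_us: float    # hypothetical compute throughput
    bytes_per_us: float  # hypothetical external memory bandwidth


@dataclass
class Task:
    name: str
    ops: float           # hypothetical operation count
    bytes_moved: float   # hypothetical external memory traffic


def estimate_time(task: Task, proc: Processor) -> float:
    # Assumed cost model: compute time plus memory access time, no overlap.
    return task.ops / proc.ops_per_us + task.bytes_moved / proc.bytes_per_us


def makespan_round_robin(tasks, procs) -> float:
    # Baseline: hand tasks to processors in fixed rotation.
    finish = [0.0] * len(procs)
    for i, task in enumerate(tasks):
        p = i % len(procs)
        finish[p] += estimate_time(task, procs[p])
    return max(finish)


def makespan_earliest_finish(tasks, procs) -> float:
    # Heterogeneity-aware greedy: pick the processor on which the task
    # is estimated to finish soonest, given that processor's queue.
    finish = [0.0] * len(procs)
    for task in tasks:
        best = min(
            range(len(procs)),
            key=lambda p: finish[p] + estimate_time(task, procs[p]),
        )
        finish[best] += estimate_time(task, procs[best])
    return max(finish)


# Toy setup: one compute-heavy "array" unit and one slower "vector" unit.
procs = [Processor("array", 100.0, 10.0), Processor("vector", 10.0, 10.0)]
tasks = [
    Task("a", 1000.0, 100.0),
    Task("b", 1000.0, 100.0),
    Task("c", 100.0, 100.0),
    Task("d", 100.0, 100.0),
]

rr = makespan_round_robin(tasks, procs)
aware = makespan_earliest_finish(tasks, procs)
print(rr, aware)  # 130.0 40.0 for these toy numbers
```

The gap between the two makespans here comes entirely from the made-up inputs and says nothing about the 82% and 21% figures reported in the abstract.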