(A) high energy-efficiency adaptive fixed-point DNN training processor

Deep learning has become the mainstream of artificial intelligence applications, and demand for it is increasing day by day. Deep learning was previously considered only for cloud-server applications because of its huge computational cost, but thanks to the development of mobile DNN accelerators, many edge/mobile devices can now utilize deep neural networks (DNNs). Mobile DNN accelerators overcame the limits of scarce computing resources and battery capacity by realizing energy-efficient inference. However, an inference-only device behaves passively, which makes it hard for AI to interact actively with individual users or its service environment. On-chip training is becoming ever more important because of this limitation. Despite its advantages, DNN training has more constraints than inference, so it has been hard to realize on mobile/edge devices. This thesis proposes two energy-efficient mobile DNN training processors designed through algorithm-hardware co-design: DF-LNPU, which focuses on accelerating a specific application, and HNPU, a general-purpose DNN training processor.

The first processor, DF-LNPU, is built around direct feedback alignment (DFA). By pipelining DFA, it achieves 2.2× faster DNN training than previous processors. Its heterogeneous learning-core architecture is optimized with an 11-stage pipelined datapath, enhancing energy efficiency by 38.7%. Furthermore, the direct error-propagation core utilizes random number generators to remove the external memory accesses caused by error propagation, improving energy efficiency by another 19.9%. Evaluated on an object-tracking application, DF-LNPU shows 34.4 frames-per-second throughput with 1.32 TOPS/W energy efficiency.

The second processor, HNPU, realizes energy-efficient general-purpose DNN training. It supports a stochastic dynamic fixed-point representation and a layer-wise adaptive precision-searching unit for low-bit-precision training. It additionally exploits slice-level reconfigurability and sparsity to maximize efficiency in both DNN inference and training. An adaptive-bandwidth reconfigurable accumulation network enables reconfigurable DNN allocation and maintains high core utilization under various bit-precision conditions. Fabricated in a 28 nm process, HNPU achieves at least 5.9× higher energy efficiency and 2.5× higher area efficiency on general DNN training benchmarks such as ImageNet, compared with previous state-of-the-art on-chip learning processors.
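The DF-LNPU description hinges on a property of DFA worth making concrete: each hidden layer receives the output error through its own fixed random matrix rather than through the transposed forward weights, so layer updates do not wait on one another. This is what makes a pipelined datapath possible, and it is why the feedback matrices can be regenerated from RNG seeds instead of being fetched from external memory. Below is a minimal NumPy sketch of DFA training for a toy two-hidden-layer MLP; the layer sizes, tanh activations, and learning rate are illustrative assumptions, not the thesis's hardware design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy network: 8 -> 16 -> 16 -> 4, tanh hidden activations (illustrative sizes).
p = {
    "W1": rng.standard_normal((8, 16)) * 0.1,
    "W2": rng.standard_normal((16, 16)) * 0.1,
    "W3": rng.standard_normal((16, 4)) * 0.1,
}
# Fixed random feedback matrices (never trained): they carry the output
# error straight to each hidden layer, replacing the transposed weights.
B1 = rng.standard_normal((4, 16))
B2 = rng.standard_normal((4, 16))

def dfa_step(x, y, lr=0.01):
    # Forward pass
    h1 = np.tanh(x @ p["W1"])
    h2 = np.tanh(h1 @ p["W2"])
    out = h2 @ p["W3"]
    e = out - y  # gradient of 0.5 * ||out - y||^2 w.r.t. out

    # DFA: project the output error to each hidden layer through its own
    # fixed random matrix. d1 does not depend on d2, so the per-layer
    # updates can be computed in parallel or pipelined (no backward locking).
    d2 = (e @ B2) * (1.0 - h2 ** 2)  # tanh' = 1 - tanh^2
    d1 = (e @ B1) * (1.0 - h1 ** 2)

    p["W3"] -= lr * np.outer(h2, e)
    p["W2"] -= lr * np.outer(h1, d2)
    p["W1"] -= lr * np.outer(x, d1)
    return 0.5 * float(e @ e)

# Usage: regress a random input onto a random target.
x, y = rng.standard_normal(8), rng.standard_normal(4)
for step in range(200):
    loss = dfa_step(x, y)
print(f"final loss: {loss:.4f}")
```

Note that only B1 and B2 appear in the error path; since they are fixed and random, a hardware implementation can reproduce them on demand from a seed, which is the mechanism the abstract credits for removing error-propagation memory traffic.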
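The HNPU description mentions a stochastic dynamic fixed-point representation. The sketch below illustrates the general idea under assumed details: one shared exponent per tensor chosen from its maximum magnitude, signed n-bit mantissas, and stochastic rounding so the quantization error is zero-mean. The HNPU's exact number format and its layer-wise precision search are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_dyn_fxp(x, bits=8):
    """Quantize array x to `bits`-bit fixed point with a shared exponent,
    using stochastic rounding so the rounding error is zero-mean."""
    qmax = 2 ** (bits - 1) - 1
    # Dynamic fixed point: one exponent for the whole tensor, picked so the
    # largest magnitude still fits in the signed mantissa range.
    exp = int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-12)))
    scale = 2.0 ** (exp - (bits - 1))  # value of one LSB
    scaled = x / scale
    # Stochastic rounding: round up with probability equal to the fractional
    # part, so E[quantized] equals the unquantized value.
    floor = np.floor(scaled)
    q = floor + (rng.random(x.shape) < (scaled - floor))
    q = np.clip(q, -qmax - 1, qmax)
    return q * scale, exp

w = rng.standard_normal(1000) * 0.05
wq, e = to_dyn_fxp(w, bits=8)
print("shared exponent:", e, "mean abs error:", np.mean(np.abs(w - wq)))
```

The zero-mean property is what makes low-bit-precision training viable: unlike round-to-nearest, stochastic rounding lets many small weight updates accumulate in expectation instead of being truncated away.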
Advisors
Yoo, Hoi-Jun (유회준)
Description
Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2023
Identifier
325007
Language
eng
Description

Doctoral thesis - Korea Advanced Institute of Science and Technology (KAIST), School of Electrical Engineering, 2023.2, [xviii, 278 p.]

Keywords

Deep learning; On-device training; Fixed-point; ASIC; Bit-slice; Sparsity exploitation; TRNG; Weight pruning; Backward locking; Direct feedback alignment

URI
http://hdl.handle.net/10203/309158
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1030570&flag=dissertation
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
