VLSI architecture for deep convolutional neural networks

In recent decades, neural network-based inference algorithms have become an integral part of nearly all recognition systems due to their high recognition accuracy. As the algorithms implemented in recent recognition systems achieve better accuracy at the expense of large computational complexity, it is important to devise energy-efficient hardware architectures specialized for neural network operations. Among various algorithms, this dissertation is devoted to developing efficient VLSI architectures for the convolution operations of convolutional neural networks (CNNs), which are widely used in image recognition systems. First, by analyzing the dependencies of the data handled in convolution operations, we propose an energy-efficient convolution dataflow and its hardware architecture that minimize energy consumption by maximizing the data reuse rate. Previous convolution accelerators, which mainly focused on high data processing speed, exploited only limited information on data reuse, resulting in many redundant on-chip memory accesses. To minimize this memory access overhead, we reschedule the dataflow of the convolution operations so that data loaded from on-chip memories are reused as much as possible. As a result, the overall energy consumption is reduced by a factor of 5.9 compared to the previous convolution accelerator. For the inference process, a programmable CNN processor called the master-slave core is newly proposed to achieve high flexibility, processing speed, and energy efficiency. The heterogeneous instruction-set architecture of the master-slave core maximizes computation speed by overlapping off-chip data transmission with the CNN operations, and reduces power consumption by performing the convolution incrementally so that input and partial-sum data are reused as much as possible.
An inference system can be configured by connecting multiple DSIPs in either a 1D or a 2D chain structure to further enhance computation speed. Compared to the state-of-the-art accelerator, the proposed system improves energy efficiency by 2.17x.
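To make the data-reuse idea concrete, the sketch below (not the dissertation's exact dataflow, just a minimal illustration of loop rescheduling) contrasts a naive convolution loop nest with a weight-stationary ordering: by hoisting the loops over the filter coefficients outward, each coefficient is fetched once and reused across every output position, while partial sums accumulate in place.

```python
# Minimal illustration of how loop reordering changes data reuse in a
# 2D convolution. Both orderings compute identical results; they differ
# in how often each operand would be re-fetched from memory.

def conv2d_naive(inp, ker):
    """Output-by-output order: every output pixel re-reads the whole filter."""
    H, W, K = len(inp), len(inp[0]), len(ker)
    OH, OW = H - K + 1, W - K + 1
    out = [[0.0] * OW for _ in range(OH)]
    for oy in range(OH):
        for ox in range(OW):
            for ky in range(K):
                for kx in range(K):
                    out[oy][ox] += inp[oy + ky][ox + kx] * ker[ky][kx]
    return out

def conv2d_weight_stationary(inp, ker):
    """Rescheduled order: each filter coefficient is loaded once and
    applied to all output positions; partial sums accumulate in 'out'."""
    H, W, K = len(inp), len(inp[0]), len(ker)
    OH, OW = H - K + 1, W - K + 1
    out = [[0.0] * OW for _ in range(OH)]
    for ky in range(K):
        for kx in range(K):
            w = ker[ky][kx]            # fetched once, reused OH*OW times
            for oy in range(OH):
                for ox in range(OW):
                    out[oy][ox] += inp[oy + ky][ox + kx] * w
    return out
```

In a hardware accelerator the same rescheduling determines which operands stay resident in registers or on-chip buffers between multiply-accumulate operations, which is where the on-chip memory access savings described above come from.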
Advisors
Park, In-Cheol
Description
Korea Advanced Institute of Science and Technology: School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2018
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2018.2, [iv, 59 p.]

Keywords

Deep neural networks; convolutional neural networks; energy-efficient accelerators; VLSI architecture

URI
http://hdl.handle.net/10203/265194
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=734504&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
