Accelerating Deep Convolutional Neural Networks Using Number Theoretic Transform

Modern deep convolutional neural networks (CNNs) suffer from high computational complexity due to excessive convolution operations. Recently, fast convolution algorithms such as the fast Fourier transform (FFT) and the Winograd transform have gained attention as a way to address this problem. They reduce the number of multiplications required in the convolution operation by replacing it with element-wise multiplication in the transform domain. However, fast convolution-based CNN accelerators have three major concerns: expensive domain transform, large memory overhead, and limited flexibility in kernel size. In this paper, we present a novel CNN accelerator based on the number theoretic transform (NTT), which overcomes these limitations. We propose low-cost NTT and inverse-NTT converters that use only adders and shifters for on-chip domain transform, which solves the inflated bandwidth problem and enables more parallel computations in the accelerator. We also propose an accelerator architecture that includes multiple tile engines with optimized data flow and mapping. Finally, we implement the proposed NTT-based CNN accelerator on the Xilinx Alveo U50 FPGA and evaluate it on popular deep CNN models. The proposed accelerator achieves 2859.5, 990.3, and 805.6 GOPS throughput for VGG-16, GoogLeNet, and Darknet-19, respectively. It outperforms existing fast convolution-based CNN accelerators by up to $9.6\times$.
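The idea of replacing convolution with element-wise multiplication in the NTT domain, using only additions and shifts for the transform itself, can be illustrated with a minimal software sketch. The parameters here are assumptions chosen for illustration and are not taken from the paper: the modulus is the Fermat prime 257, the transform length is 16, and the root of unity is 2, so every twiddle factor is a power of two and each twiddle multiplication becomes a left shift. The function names (`ntt`, `intt`, `cyclic_conv_ntt`, `pad`) are hypothetical, and the naive O(N^2) loops stand in for whatever hardware structure the accelerator actually uses.

```python
# Illustrative parameters (assumptions, not the paper's configuration).
P = 257   # Fermat prime 2**8 + 1
N = 16    # transform length; 2 has multiplicative order 16 modulo 257
# The root of unity is 2, so omega**e == 1 << e: twiddles are pure shifts.


def pad(x):
    """Zero-pad a short sequence to the transform length N."""
    return list(x) + [0] * (N - len(x))


def ntt(x):
    """Naive forward NTT: only shifts, additions, and modular reduction."""
    out = []
    for k in range(N):
        acc = 0
        for n, v in enumerate(x):
            shift = (n * k) % N            # exponent of the twiddle 2**(n*k)
            acc = (acc + (v << shift)) % P
        out.append(acc)
    return out


def intt(X):
    """Naive inverse NTT; the inverse twiddles are also powers of two."""
    out = []
    for n in range(N):
        acc = 0
        for k, v in enumerate(X):
            shift = (-n * k) % N           # 2**(-n*k) == 2**((-n*k) mod 16)
            acc = (acc + (v << shift)) % P
        out.append((acc << 12) % P)        # 2**12 is the inverse of 16 mod 257
    return out


def cyclic_conv_ntt(a, b):
    """Cyclic convolution via element-wise products in the NTT domain."""
    A, B = ntt(a), ntt(b)
    return intt([(x * y) % P for x, y in zip(A, B)])


def cyclic_conv_direct(a, b):
    """Reference cyclic convolution computed directly, modulo P."""
    return [sum(a[j] * b[(i - j) % N] for j in range(N)) % P for i in range(N)]


if __name__ == "__main__":
    a = pad([1, 2, 3])
    b = pad([1, 1])
    print(cyclic_conv_ntt(a, b)[:4])
```

In this sketch the only general multiplications are the N element-wise products in the transform domain. Results are exact only modulo P, so the inputs and the true convolution outputs must stay within the range the modulus can represent; a practical design would pick the modulus and transform length to match its data widths and kernel sizes.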
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2023-01
Language
English
Article Type
Article
Citation

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, v.70, no.1, pp.315 - 326

ISSN
1549-8328
DOI
10.1109/TCSI.2022.3214528
URI
http://hdl.handle.net/10203/305951
Appears in Collection
EE-Journal Papers(저널논문)