CompAcc: Efficient Hardware Realization for Processing Compressed Neural Networks Using Accumulator Arrays

Abstract
Compressing neural networks is an effective way to meet the memory constraints of edge devices. We propose a novel array microarchitecture that exploits compressed neural networks with nonlinearly quantized weights and supports variable activation and compressed-weight bit widths. Computation is made more efficient by accumulating all activations that share the same weight before performing the multiplication. The design has been fabricated in TSMC 28nm technology. It achieves 3.4 TOPS/W with 16b activations and 16b weights (4b compressed) and 3.7 TOPS/W on the convolutional layers of AlexNet (8b activations, 4b compressed weights) with the ImageNet dataset, consuming 15.6mW at 44fps. This is comparable to state-of-the-art chip implementations while offering greater flexibility with a simple array structure.
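
The accumulate-before-multiply idea is stated in a single sentence, so a minimal sketch may help. With nonlinearly quantized weights there are only a few distinct weight values (e.g. 16 for 4b codes), so a dot product can be computed by first summing every activation whose weight shares the same code into a per-code accumulator, and only then multiplying each accumulator by its decoded weight value once. The Python below is an illustrative behavioral model under those assumptions; the function name, codebook, and bin layout are hypothetical, not the chip's actual datapath.

import numpy as np

def accumulate_then_multiply(activations, weight_codes, codebook):
    """Dot product via per-weight-value accumulators (illustrative model).

    activations  : 1-D array of activation values
    weight_codes : 1-D array of codebook indices (the compressed weights)
    codebook     : 1-D array mapping each code to its decoded weight value
    """
    # Phase 1: route each activation into the accumulator for its weight
    # code. This phase uses only adders, no multipliers.
    bins = np.zeros(len(codebook))
    for a, c in zip(activations, weight_codes):
        bins[c] += a
    # Phase 2: one multiply per distinct weight value, then a final sum.
    return float(np.dot(bins, codebook))

# Sanity check against an ordinary dot product with the decoded weights.
rng = np.random.default_rng(0)
acts = rng.standard_normal(1024)
codes = rng.integers(0, 16, size=1024)   # 4b compressed weights -> 16 codes
cb = np.linspace(-1.0, 1.0, 16)          # assumed 16-entry codebook
assert np.isclose(accumulate_then_multiply(acts, codes, cb),
                  np.dot(acts, cb[codes]))

The payoff is that N multiply-accumulates become N additions plus only K multiplications, where K is the codebook size (16 here versus 1024 inputs), which is presumably what lets the accumulator array trade expensive multipliers for cheap adders.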
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Issue Date
2020-11-11
Language
English
Citation
Asian Solid-State Circuits Conference 2020 (A-SSCC)

URI
http://hdl.handle.net/10203/279206
Appears in Collection
EE-Conference Papers (Conference Papers)