NID: Processing Binary Convolutional Neural Network in Commodity DRAM

Cited 9 times in Web of Science; cited 0 times in Scopus
Recent large-scale CNNs suffer from a severe memory wall problem because their weight counts range from tens to hundreds of millions. Processing in-memory (PIM) and binary CNNs have been proposed to reduce the number of memory accesses and the memory footprint, respectively. By combining these two concepts, we propose a novel processing in-DRAM framework for binary CNNs, called NID, in which the dominant convolution operations are processed using in-DRAM bulk bitwise operations. We first identify the problem that bitcount operations implemented with only bulk bitwise AND/OR/NOT incur significant delay overhead as the kernel size grows. We then optimize performance by efficiently allocating inputs and kernels to DRAM banks for both convolutional and fully-connected layers through design space exploration, and we mitigate the overhead of bitcount operations by splitting kernels into multiple parts. Partial-sum accumulation and the tasks of other layers, such as max-pooling and normalization, are processed in the peripheral area of DRAM with negligible overhead. As a result, our NID framework achieves 19x-36x performance and 9x-14x EDP improvements for convolutional layers, and 9x-17x performance and 1.4x-4.5x EDP improvements for fully-connected layers, over a previous PIM technique on four large-scale CNN models.
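To illustrate the arithmetic the abstract refers to, the following is a minimal software sketch of a binary dot product computed with AND/OR/NOT plus a bitcount, and of splitting a kernel into parts whose partial sums are accumulated. It is not the NID in-DRAM mapping: the function names, the bit packing (+1 as bit 1, -1 as bit 0), and the `part_bits` parameter are assumptions made for illustration, and Python's string-based bit counting stands in for the bitcount step.

```python
def pack(values):
    """Pack a list of +1/-1 values into an integer (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_bits(a, w, n):
    """XNOR of two n-bit words built only from AND, OR and NOT."""
    mask = (1 << n) - 1          # keep results within n bits
    not_a = ~a & mask
    not_w = ~w & mask
    return (a & w) | (not_a & not_w)


def binary_dot(a, w, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Each matching bit contributes +1 and each mismatch -1,
    so the result is 2 * matches - n.
    """
    matches = bin(xnor_bits(a, w, n)).count("1")  # software stand-in for bitcount
    return 2 * matches - n


def binary_dot_split(a, w, n, part_bits):
    """Same dot product, computed part by part with partial-sum accumulation.

    part_bits is a hypothetical tuning knob: smaller parts mean shorter
    bitcounts but more partial sums to accumulate.
    """
    total = 0
    for shift in range(0, n, part_bits):
        width = min(part_bits, n - shift)
        mask = (1 << width) - 1
        a_part = (a >> shift) & mask
        w_part = (w >> shift) & mask
        total += binary_dot(a_part, w_part, width)
    return total


if __name__ == "__main__":
    acts = [1, -1, 1, 1, -1, -1, 1, -1, 1]   # a flattened 3x3 binary input patch
    wts = [1, 1, -1, 1, -1, 1, 1, -1, -1]    # a flattened 3x3 binary kernel
    n = len(acts)
    print(binary_dot(pack(acts), pack(wts), n))
    print(binary_dot_split(pack(acts), pack(wts), n, part_bits=4))
```

Both functions return the same value as the ordinary signed dot product, which is why splitting a kernel changes only how the bitcount work is divided and not the result.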
Publisher
IEEE/ACM
Issue Date
2018-11-05
Language
English
Citation

2018 ACM/IEEE International Conference On Computer Aided Design, pp.10:1 - 10:8

DOI
10.1145/3240765.3240831
URI
http://hdl.handle.net/10203/246434
Appears in Collection
EE-Conference Papers