TSUNAMI: Triple Sparsity-Aware Ultra Energy-Efficient Neural Network Training Accelerator With Multi-Modal Iterative Pruning

Cited 4 times in Web of Science · Cited 0 times in Scopus
  • Hits: 707
  • Downloads: 0
DC Field | Value | Language
dc.contributor.author | Kim, Sangyeob | ko
dc.contributor.author | Lee, Juhyoung | ko
dc.contributor.author | Kang, Sanghoon | ko
dc.contributor.author | Han, Donghyeon | ko
dc.contributor.author | Jo, Wooyoung | ko
dc.contributor.author | Yoo, Hoi-Jun | ko
dc.date.accessioned | 2022-04-14T06:41:08Z | -
dc.date.available | 2022-04-14T06:41:08Z | -
dc.date.created | 2022-01-18 | -
dc.date.issued | 2022-04 | -
dc.identifier.citation | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, v.69, no.4, pp.1494 - 1506 | -
dc.identifier.issn | 1549-8328 | -
dc.identifier.uri | http://hdl.handle.net/10203/292743 | -
dc.description.abstract | This article proposes TSUNAMI, an accelerator for energy-efficient deep-neural-network training. TSUNAMI supports multi-modal iterative pruning to generate zeros in both activations and weights. A tile-based dynamic activation pruning unit and a weight-memory-shared pruning unit eliminate additional memory accesses. A coarse-zero skipping controller skips multiple unnecessary multiply-and-accumulate (MAC) operations at once, while a fine-zero skipping controller skips randomly located unnecessary MAC operations. A weight sparsity balancer resolves the utilization degradation caused by weight sparsity imbalance, and the workload of each convolution core is allocated by a random channel allocator. TSUNAMI achieves an energy efficiency of 3.42 TFLOPS/W at 0.78 V and 50 MHz with 8-bit floating-point activations and weights, and 405.96 TFLOPS/W under a 90% sparsity condition. | -
dc.language | English | -
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | -
dc.title | TSUNAMI: Triple Sparsity-Aware Ultra Energy-Efficient Neural Network Training Accelerator With Multi-Modal Iterative Pruning | -
dc.type | Article | -
dc.identifier.wosid | 000740068900001 | -
dc.identifier.scopusid | 2-s2.0-85122567552 | -
dc.type.rims | ART | -
dc.citation.volume | 69 | -
dc.citation.issue | 4 | -
dc.citation.beginningpage | 1494 | -
dc.citation.endingpage | 1506 | -
dc.citation.publicationname | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | -
dc.identifier.doi | 10.1109/TCSI.2021.3138092 | -
dc.contributor.localauthor | Yoo, Hoi-Jun | -
dc.contributor.nonIdAuthor | Jo, Wooyoung | -
dc.description.isOpenAccess | N | -
dc.type.journalArticle | Article | -
dc.subject.keywordAuthor | Training | -
dc.subject.keywordAuthor | IP networks | -
dc.subject.keywordAuthor | Tsunami | -
dc.subject.keywordAuthor | Iterative methods | -
dc.subject.keywordAuthor | Hardware | -
dc.subject.keywordAuthor | Memory management | -
dc.subject.keywordAuthor | Degradation | -
dc.subject.keywordAuthor | DNN training accelerator | -
dc.subject.keywordAuthor | stochastic coarse-fine level pruning | -
dc.subject.keywordAuthor | tile-based dynamic activation pruning | -
dc.subject.keywordAuthor | weight sparsity balancing | -
dc.subject.keywordAuthor | adaptive triple-zero skipping | -
dc.subject.keywordPlus | FACE RECOGNITION | -
dc.subject.keywordPlus | CNN ACCELERATOR | -
dc.subject.keywordPlus | PROCESSOR | -
dc.subject.keywordPlus | HARDWARE | -
dc.subject.keywordPlus | POWER | -
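
The central mechanism in the abstract above, zero skipping, works at two granularities: a coarse controller drops a whole block of MAC operations when an operand tile is entirely zero, and a fine controller drops individual MACs whose activation or weight is zero. The Python sketch below is only a software analogy of that control flow, not the accelerator's hardware; the function name, tile size, and data layout are all assumptions for illustration.

    import numpy as np

    def coarse_fine_dot(acts: np.ndarray, wgts: np.ndarray, tile: int = 4) -> float:
        """Dot product with coarse (per-tile) and fine (per-element) zero skipping."""
        acc = 0.0
        for start in range(0, len(acts), tile):
            a_tile = acts[start:start + tile]
            w_tile = wgts[start:start + tile]
            # Coarse-zero skipping: an all-zero operand tile drops all its MACs at once.
            if not a_tile.any() or not w_tile.any():
                continue
            # Fine-zero skipping: randomly located zero operands drop single MACs.
            for a, w in zip(a_tile, w_tile):
                if a != 0.0 and w != 0.0:
                    acc += a * w
        return acc

    acts = np.array([0.0, 0.0, 0.0, 0.0, 1.5, 0.0, 2.0, 0.0])
    wgts = np.array([0.3, 0.0, 1.0, 0.0, 0.5, 0.0, 0.0, 0.0])
    print(coarse_fine_dot(acts, wgts))  # 0.75; the first tile is skipped wholesale

The pruning stages described in the abstract exist to raise the hit rate of exactly these two checks: the more zeros the iterative pruning creates in activations and weights, the more tiles the coarse check removes and the more stragglers the fine check catches.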
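The abstract also credits a weight sparsity balancer, fed by a random channel allocator, with recovering utilization when pruned weights are unevenly distributed across cores. One toy software model of that idea: shuffle the output channels, then deal them round-robin to the convolution cores, so channels of very different density tend not to cluster on one core. Everything here (function name, workload metric, round-robin dealing) is an assumption for illustration, not the paper's scheduler.

    import numpy as np

    def allocate_channels(weights, n_cores, seed=0):
        """Toy random channel allocation for weight-sparsity balancing.

        weights: (out_channels, fan_in) array with pruned entries set to zero.
        Returns per-core channel index lists and per-core nonzero-MAC workloads.
        """
        rng = np.random.default_rng(seed)
        order = rng.permutation(weights.shape[0])            # random channel order
        cores = [order[i::n_cores] for i in range(n_cores)]  # round-robin deal
        loads = [int(np.count_nonzero(weights[idx])) for idx in cores]
        return cores, loads

    w = np.zeros((8, 16))
    w[:2] = 1.0  # two dense channels, six fully pruned ones
    _, loads = allocate_channels(w, n_cores=4)
    print(loads)  # the dense channels usually land on different cores

Without the shuffle, a naive contiguous split of these eight channels would hand both dense channels to the first core and leave the others idle, which is the kind of utilization degradation the balancer targets.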
Appears in Collection
EE-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.