MGen: A Framework for Energy-Efficient In-ReRAM Acceleration of Multi-Task BERT

Cited 1 time in Web of Science; cited 0 times in Scopus.
DC Field                      Value                                                           Language
dc.contributor.author         Kang, Myeonggu                                                  ko
dc.contributor.author         Shin, Hyein                                                     ko
dc.contributor.author         Kim, Joon Gyum                                                  ko
dc.contributor.author         Kim, Lee-Sup                                                    ko
dc.date.accessioned           2023-11-20T03:00:27Z                                            -
dc.date.available             2023-11-20T03:00:27Z                                            -
dc.date.created               2023-11-20                                                      -
dc.date.issued                2023-11                                                         -
dc.identifier.citation        IEEE TRANSACTIONS ON COMPUTERS, v.72, no.11, pp.3140 - 3152    -
dc.identifier.issn            0018-9340                                                       -
dc.identifier.uri             http://hdl.handle.net/10203/314832                              -
dc.description.abstract       Recently, multiple transformer models, such as BERT, have been deployed together to support multiple natural language processing (NLP) tasks in a single system, a configuration known as multi-task BERT. The very large number of weight parameters in multi-task BERT inflates the area requirement of a processing-in-resistive-memory (ReRAM) architecture, and several works have attempted to address this model-size issue. Despite the reduced parameters, however, the number of multi-task BERT computations remains the same, leading to massive energy consumption in ReRAM-based deep neural network (DNN) accelerators. We therefore propose a framework for more energy-efficient ReRAM acceleration of multi-task BERT. First, we analyze the inherent redundancies of multi-task BERT and the computational properties of the ReRAM-based DNN accelerator; based on this analysis, we propose what is termed the model generator, which produces optimal BERT models supporting multiple tasks. The model generator reduces multi-task BERT computations while maintaining algorithmic performance. Furthermore, we present a task scheduler, which adjusts the execution order of multiple tasks to run the produced models efficiently. As a result, the proposed framework achieves up to 4.4x higher energy efficiency over the baseline, and it can also be combined with previous multi-task BERT works to achieve both a smaller area and higher energy efficiency.   -
dc.language                   English                                                         -
dc.publisher                  IEEE COMPUTER SOC                                               -
dc.title                      MGen: A Framework for Energy-Efficient In-ReRAM Acceleration of Multi-Task BERT   -
dc.type                       Article                                                         -
dc.identifier.wosid           001089178700009                                                 -
dc.identifier.scopusid        2-s2.0-85163478279                                              -
dc.type.rims                  ART                                                             -
dc.citation.volume            72                                                              -
dc.citation.issue             11                                                              -
dc.citation.beginningpage     3140                                                            -
dc.citation.endingpage        3152                                                            -
dc.citation.publicationname   IEEE TRANSACTIONS ON COMPUTERS                                  -
dc.identifier.doi             10.1109/TC.2023.3288749                                         -
dc.contributor.localauthor    Kim, Lee-Sup                                                    -
dc.description.isOpenAccess   N                                                               -
dc.type.journalArticle        Article                                                         -
dc.subject.keywordAuthor      Multi-task BERT                                                 -
dc.subject.keywordAuthor      transformer-based model                                         -
dc.subject.keywordAuthor      ReRAM-based DNN accelerator                                     -
dc.subject.keywordAuthor      deep learning                                                   -
Appears in Collection
EE-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.
