DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kang, Myeonggu | ko |
dc.contributor.author | Shin, Hyein | ko |
dc.contributor.author | Kim, Joon Gyum | ko |
dc.contributor.author | Kim, Lee-Sup | ko |
dc.date.accessioned | 2023-11-20T03:00:27Z | - |
dc.date.available | 2023-11-20T03:00:27Z | - |
dc.date.created | 2023-11-20 | - |
dc.date.issued | 2023-11 | - |
dc.identifier.citation | IEEE TRANSACTIONS ON COMPUTERS, v.72, no.11, pp.3140 - 3152 | - |
dc.identifier.issn | 0018-9340 | - |
dc.identifier.uri | http://hdl.handle.net/10203/314832 | - |
dc.description.abstract | Recently, multiple transformer models, such as BERT, have been used together to support multiple natural language processing (NLP) tasks in a single system, an approach known as multi-task BERT. Multi-task BERT, with its very large number of weight parameters, increases the area requirement of a processing-in-resistive-memory (ReRAM) architecture, and several works have attempted to address this model-size issue. Despite the reduced parameter count, the number of multi-task BERT computations remains the same, leading to massive energy consumption in ReRAM-based deep neural network (DNN) accelerators. We therefore propose a framework for better energy efficiency during the ReRAM acceleration of multi-task BERT. First, we analyze the inherent redundancies of multi-task BERT and the computational properties of the ReRAM-based DNN accelerator, after which we propose a model generator that produces optimal BERT models supporting multiple tasks. The model generator reduces multi-task BERT computations while maintaining algorithmic performance. Furthermore, we present a task scheduler, which adjusts the execution order of the multiple tasks so that the produced models run efficiently. As a result, the proposed framework achieves up to 4.4x higher energy efficiency over the baseline, and it can also be combined with previous multi-task BERT works to achieve both a smaller area and higher energy efficiency. | - |
dc.language | English | - |
dc.publisher | IEEE COMPUTER SOC | - |
dc.title | MGen: A Framework for Energy-Efficient In-ReRAM Acceleration of Multi-Task BERT | - |
dc.type | Article | - |
dc.identifier.wosid | 001089178700009 | - |
dc.identifier.scopusid | 2-s2.0-85163478279 | - |
dc.type.rims | ART | - |
dc.citation.volume | 72 | - |
dc.citation.issue | 11 | - |
dc.citation.beginningpage | 3140 | - |
dc.citation.endingpage | 3152 | - |
dc.citation.publicationname | IEEE TRANSACTIONS ON COMPUTERS | - |
dc.identifier.doi | 10.1109/TC.2023.3288749 | - |
dc.contributor.localauthor | Kim, Lee-Sup | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article | - |
dc.subject.keywordAuthor | Multi-task BERT | - |
dc.subject.keywordAuthor | transformer-based model | - |
dc.subject.keywordAuthor | ReRAM-based DNN accelerator | - |
dc.subject.keywordAuthor | deep learning | - |
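The abstract describes a task scheduler that "adjusts the execution order of multiple tasks" so the generated models run efficiently on ReRAM. The paper's actual scheduling algorithm is not given in this record; the following is only a minimal illustrative sketch of the general idea, assuming a hypothetical representation in which each task's generated model lists the shared BERT weight blocks it uses, and a greedy heuristic that orders tasks so consecutive tasks reuse as many blocks as possible (fewer crossbar reprogramming steps).

```python
# Illustrative sketch only -- a hypothetical greedy task ordering, not the
# algorithm from the paper. Each task maps to the set of (made-up) shared
# BERT weight-block IDs its generated model uses; we order tasks so that
# consecutive tasks overlap as much as possible in the blocks they need.

def schedule_tasks(tasks: dict[str, set[str]]) -> list[str]:
    """Return an execution order over task names that greedily maximizes
    weight-block overlap between each task and its successor."""
    remaining = dict(tasks)
    # Arbitrary starting rule: begin with the task using the most blocks.
    current = max(remaining, key=lambda t: len(remaining[t]))
    order = []
    while remaining:
        order.append(current)
        loaded = tasks[current]  # blocks assumed programmed after this task
        del remaining[current]
        if not remaining:
            break
        # Pick the next task that reuses the most already-programmed blocks.
        current = max(remaining, key=lambda t: len(remaining[t] & loaded))
    return order

# Toy usage: three NLP tasks whose generated models share encoder blocks.
if __name__ == "__main__":
    tasks = {
        "sentiment": {"enc0", "enc1", "enc2"},
        "ner":       {"enc0", "enc1", "enc3"},
        "qa":        {"enc3", "enc4", "enc5"},
    }
    print(schedule_tasks(tasks))  # -> ['sentiment', 'ner', 'qa']
```

In this toy run, "ner" follows "sentiment" because they share two encoder blocks, while "qa" shares none with "sentiment"; a real scheduler would also account for ReRAM array capacity and reprogramming cost, which this sketch ignores.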