Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

Cited 101 times in Web of Science · Cited 0 times in Scopus
  • Hits: 163
  • Downloads: 0
DC Field | Value | Language
dc.contributor.author | Rhu, Minsoo | ko
dc.contributor.author | O'Connor, Mike | ko
dc.contributor.author | Chatterjee, Niladrish | ko
dc.contributor.author | Pool, Jeff | ko
dc.contributor.author | Kwon, Youngeun | ko
dc.contributor.author | Keckler, Steve | ko
dc.date.accessioned | 2018-12-20T02:18:32Z | -
dc.date.available | 2018-12-20T02:18:32Z | -
dc.date.created | 2018-11-29 | -
dc.date.issued | 2018-02-26 | -
dc.identifier.citation | 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018, pp.78 - 91 | -
dc.identifier.issn | 1530-0897 | -
dc.identifier.uri | http://hdl.handle.net/10203/247539 | -
dc.description.abstract | Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory. Prior work tries to address this restriction by virtualizing the memory usage of DNNs, enabling both CPU and GPU memory to be utilized for memory allocations. Despite its merits, virtualizing memory can incur significant performance overheads when the time needed to copy data back and forth from CPU memory is higher than the latency to perform DNN computations. We introduce a high-performance virtualization strategy based on a 'compressing DMA engine' (cDMA) that drastically reduces the size of the data structures that are targeted for CPU-side allocations. The cDMA engine offers an average 2.6x (maximum 13.8x) compression ratio by exploiting the sparsity inherent in offloaded data, improving the performance of virtualized DNNs by an average 53% (maximum 79%) when evaluated on an NVIDIA Titan Xp. | -
dc.language | English | -
dc.publisher | IEEE Computer Society | -
dc.title | Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks | -
dc.type | Conference | -
dc.identifier.wosid | 000440297700007 | -
dc.identifier.scopusid | 2-s2.0-85046798314 | -
dc.type.rims | CONF | -
dc.citation.beginningpage | 78 | -
dc.citation.endingpage | 91 | -
dc.citation.publicationname | 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018 | -
dc.identifier.conferencecountry | AU | -
dc.identifier.conferencelocation | Hotel Pyramide Congress Center, Vienna | -
dc.identifier.doi | 10.1109/HPCA.2018.00017 | -
dc.contributor.localauthor | Rhu, Minsoo | -
dc.contributor.nonIdAuthor | O'Connor, Mike | -
dc.contributor.nonIdAuthor | Chatterjee, Niladrish | -
dc.contributor.nonIdAuthor | Pool, Jeff | -
dc.contributor.nonIdAuthor | Keckler, Steve | -
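
The abstract above attributes cDMA's gains to the sparsity of the activation maps offloaded to CPU memory: ReLU outputs are largely zeros, so a compressed payload can cross the CPU-GPU link instead of the dense tensor. As a rough illustration of that idea only, the sketch below applies a simple zero-value (bitmask-plus-nonzeros) compression to a synthetic activation tensor in NumPy; the function names and the bitmask layout are illustrative assumptions, not the paper's actual hardware compression scheme.

```python
import numpy as np

def zvc_compress(activations: np.ndarray):
    """Zero-value compression sketch: keep a 1-bit-per-element mask plus
    the nonzero values. Offloaded ReLU activations are mostly zeros, so
    the (mask, nonzeros) pair is much smaller than the dense tensor."""
    flat = activations.ravel()
    mask = flat != 0                      # which elements are nonzero
    packed_mask = np.packbits(mask)       # 8 mask bits per byte
    nonzeros = flat[mask]                 # only nonzero values are kept
    return packed_mask, nonzeros, activations.shape

def zvc_decompress(packed_mask, nonzeros, shape, dtype=np.float32):
    """Rebuild the dense activation tensor from the mask and nonzeros."""
    n = int(np.prod(shape))
    mask = np.unpackbits(packed_mask)[:n].astype(bool)
    flat = np.zeros(n, dtype=dtype)
    flat[mask] = nonzeros
    return flat.reshape(shape)

if __name__ == "__main__":
    # Simulate a sparse ReLU activation map (roughly 70% zeros).
    acts = np.maximum(np.random.randn(64, 128).astype(np.float32) - 0.5, 0)
    mask, vals, shape = zvc_compress(acts)
    assert np.array_equal(zvc_decompress(mask, vals, shape), acts)
    ratio = acts.nbytes / (mask.nbytes + vals.nbytes)
    print(f"compression ratio: {ratio:.2f}x")
```

For a float32 tensor with about 70% zeros, the mask costs n/8 bytes and the values about 0.3·4n bytes, i.e. roughly a 3x reduction, which is in the same range as the 2.6x average the abstract reports; the real cDMA engine performs the (de)compression in hardware on the GPU's DMA path rather than in software.
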
Appears in Collection
EE-Conference Papers (학술회의논문)
Files in This Item
There are no files associated with this item.
