NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units

Cited 23 times in Web of Science; cited 12 times in Scopus
To satisfy the compute and memory demands of deep neural networks (DNNs), neural processing units (NPUs) are being widely adopted to accelerate DNNs. Just as GPUs evolved from slave devices into a mainstream processor architecture, NPUs are likely to become first-class citizens in this fast-evolving heterogeneous architecture space. This paper makes a case for enabling address translation in NPUs to decouple the virtual and physical memory address spaces. Through a careful, data-driven application characterization study, we root-cause several limitations of prior GPU-centric address translation schemes and propose a memory management unit (MMU) tailored for NPUs. Compared to an oracular MMU design point, our proposal incurs an average performance overhead of only 0.06%.
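To illustrate the address translation the abstract refers to, below is a minimal, generic sketch of virtual-to-physical translation through a TLB with a page-table fallback. This is not the paper's NPU-specific MMU design; the page size, table layout, and names here are illustrative assumptions only.

```python
PAGE_SIZE = 4096  # 4 KiB pages: a common choice, not necessarily the paper's

# Toy page table mapping virtual page numbers (VPNs) to physical frame numbers (PFNs).
page_table = {0x0: 0x7, 0x1: 0x3, 0x2A: 0x11}

tlb = {}  # VPN -> PFN cache; a real MMU uses a small associative hardware structure

def translate(vaddr):
    """Translate a virtual address: TLB lookup first, page-table walk on a miss."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                # TLB hit: fast path
        pfn = tlb[vpn]
    else:                         # TLB miss: walk the page table, then fill the TLB
        pfn = page_table[vpn]     # raises KeyError on an unmapped page (page fault)
        tlb[vpn] = pfn
    return pfn * PAGE_SIZE + offset

# The page offset is preserved; only the page number is remapped.
assert translate(0x2A * PAGE_SIZE + 0x10) == 0x11 * PAGE_SIZE + 0x10
```

The cost of the miss path (the page-table walk) is exactly what an MMU design tries to hide; the paper's characterization targets how NPU memory-access patterns stress this path differently than GPU workloads do.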
Publisher
ACM
Issue Date
2020-03-20
Language
English
Citation

The 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-25), pp. 1109-1124

DOI
10.1145/3373376.3378494
URI
http://hdl.handle.net/10203/276229
Appears in Collection
EE-Conference Papers (Conference Papers)
Files in This Item
There are no files associated with this item.