DC Field | Value | Language |
---|---|---|
dc.contributor.author | Hong, Byungchul | ko |
dc.contributor.author | Kim, Gwangsun | ko |
dc.contributor.author | Ahn, Jung Ho | ko |
dc.contributor.author | Kwon, Yongkee | ko |
dc.contributor.author | Kim, Hongsik | ko |
dc.contributor.author | Kim, John | ko |
dc.date.accessioned | 2020-02-12T01:20:23Z | - |
dc.date.available | 2020-02-12T01:20:23Z | - |
dc.date.created | 2020-02-12 | - |
dc.date.issued | 2016-09 | - |
dc.identifier.citation | 25th International Conference on Parallel Architectures and Compilation Techniques, PACT 2016, pp.113 - 124 | - |
dc.identifier.uri | http://hdl.handle.net/10203/272284 | - |
dc.description.abstract | Recent technology advances in memory system design, along with 3D stacking, have made near-data processing (NDP) more feasible to accelerate different workloads. In this work, we explore the near-data processing opportunity of a fundamental operation: linked-list traversal (LLT). We propose a new NDP architecture which does not change the existing sequential programming model and does not require any modification to the core microarchitecture. Instead, we exploit the packetized interface between the core and the memory modules to off-load LLT for NDP. We assume a system with multiple memory modules (e.g., hybrid memory cube (HMC) modules) interconnected with a memory network, and our initial evaluation shows that simply off-loading LLT computation to near-memory can actually reduce performance because of the additional off-chip memory network channel traversal. Thus, we first propose NDP-aware data localization to exploit packaging locality (including locality within a single memory module and memory vault) to minimize latency and improve energy efficiency. In order to improve overall throughput and maximize parallelism, we propose batching multiple LLT operations together to amortize the cost of NDP by utilizing the highly parallel execution of NDP processing units and the high bandwidth of 3D stacked DRAM. Our evaluation shows that the combination of NDP-aware data localization and batching can provide significant improvement in performance and energy efficiency. | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | Accelerating Linked-list Traversal Through Near-Data Processing | - |
dc.type | Conference | - |
dc.identifier.wosid | 000392249100010 | - |
dc.identifier.scopusid | 2-s2.0-84989284870 | - |
dc.type.rims | CONF | - |
dc.citation.beginningpage | 113 | - |
dc.citation.endingpage | 124 | - |
dc.citation.publicationname | 25th International Conference on Parallel Architectures and Compilation Techniques, PACT 2016 | - |
dc.identifier.conferencecountry | IL | - |
dc.identifier.conferencelocation | Dan Carmel Hotel, Haifa, Israel | - |
dc.identifier.doi | 10.1145/2967938.2967958 | - |
dc.contributor.localauthor | Kim, John | - |
dc.contributor.nonIdAuthor | Ahn, Jung Ho | - |
dc.contributor.nonIdAuthor | Kwon, Yongkee | - |
dc.contributor.nonIdAuthor | Kim, Hongsik | - |
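
The abstract describes linked-list traversal (LLT) and the idea of batching several LLT operations so they can be served together near the memory that holds the data. The following is a minimal software sketch of that idea only, not the architecture proposed in the paper: the `module` field, `build_list`, and `batched_traverse` are hypothetical names introduced for illustration, and grouping requests by module merely models the localization and batching concepts without any real off-loading.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    value: str
    next: Optional["Node"] = None
    module: int = 0  # hypothetical: id of the memory module holding this node


def build_list(items, module=0):
    """Build a singly linked list from (key, value) pairs.

    All nodes are tagged with the same module id, a stand-in for
    keeping a whole list localized within one memory module.
    """
    head = None
    for key, value in reversed(items):
        head = Node(key, value, head, module)
    return head


def traverse(head, key):
    """Plain sequential LLT: follow next pointers until the key is found."""
    node = head
    while node is not None:
        if node.key == key:
            return node.value
        node = node.next
    return None


def batched_traverse(requests):
    """Serve a batch of (head, key) lookups grouped by module id.

    Assumption: each list lives entirely in the module of its head node.
    Results are returned in the original request order.
    """
    by_module = defaultdict(list)
    for idx, (head, key) in enumerate(requests):
        module = head.module if head is not None else -1
        by_module[module].append((idx, head, key))

    results = [None] * len(requests)
    for group in by_module.values():
        # In this sketch each group is processed sequentially; the grouping
        # only shows which lookups could be handled together.
        for idx, head, key in group:
            results[idx] = traverse(head, key)
    return results
```

For example, two lists built with `build_list(..., module=0)` and `build_list(..., module=1)` can be queried in a single `batched_traverse` call, and lookups targeting the same module end up in the same group.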