Energy-Efficient DRAM System Design via Memory Traffic Reshaping and Partial Data Writes (Korean title: A Study on Energy-Efficient DRAM System Design through Memory Traffic Adjustment and Partial Data Writes)

The performance of modern computing systems has improved dramatically over the past four decades owing to the continued scaling of CMOS process technology. With the widespread adoption of multi-core processors, multi-threaded and multi-programmed workloads have become prevalent in today's computing systems. Following this trend, memory traffic has become heavier and more random, which calls for faster and larger memory to improve system performance and throughput. Owing to the continuously growing demand for high-speed, large-capacity memory, DRAM accounts for a major portion of the total power and energy consumption of computing systems. Recent studies show that the memory system accounts for 25% to 57% of the total power consumption of a system. Therefore, reducing DRAM power and energy consumption has great potential for improving the power and energy efficiency of computing systems.

This dissertation proposes effective techniques for reducing the power and energy consumption of DRAM in systems such as embedded systems and servers, where power and energy efficiency are critical and primary concerns. The power consumption of a DRAM system can be broken down into several components, such as row activation (with bank precharge), I/O (read and write), refresh, and background (idle). Among them, this dissertation focuses on reducing row activation, background, and I/O power consumption, which are major contributors to DRAM power consumption.

The first proposal, called CLAP (clustered look-ahead prefetching), targets row activation and background power and energy savings by exploiting the stride-based memory access patterns that are commonly observed in programs. Because stride memory accesses originate mainly from array references in the loop structures of program code, they exhibit stable patterns and access timing.
Thus, future memory accesses that have strides can be accurately predicted. Clustering these future memory accesses with on-demand memory accesses can significantly increase both the probability of row buffer hits and the length of DRAM idle periods. As a result, a large number of row activations and a significant amount of idle energy consumption can be eliminated.

The second proposal, called RAMS (rank-aware memory scheduling), makes both the cache block replacement policy of the last-level cache (LLC) and the memory request scheduling of the memory controller aware of DRAM rank states in order to reduce idle energy and power-state transition energy. At the LLC, RAMS reduces write requests to DRAM, and thus the number of transitions to and from low-power states, by selecting victim cache blocks on cache misses using clean-cache-block-first and DRAM rank-aware replacement policies. At the memory controller, RAMS increases rank idling and decreases the number of state transitions by deferring write requests and writing the postponed requests in batches, taking rank power states into account. Thus, DRAM write traffic and rank state transitions are noticeably reduced, and DRAM can reside in low-power states for longer periods.

The third proposal, called the skinflint DRAM system (SDS), minimizes write accesses at the DRAM chip level by exploiting properties of memory accesses such as silent stores and narrow-width values. A DRAM system is inherently partitioned into chips, which are all accessed simultaneously in a conventional DRAM system. However, a DRAM chip does not need to be accessed when the data destined for it is identical to the data it already stores. Preventing such chips from being accessed can reduce row activations and lengthen idle periods.
In this scheme, the conventional DRAM system is re-architected so that DRAM chips can be accessed selectively while the advantages of the conventional DRAM organization are retained.

The last proposal, called partial row activation (PRA), dynamically adjusts the row activation granularity, from one-eighth of a row to a full row, and transfers only the part of a cache line that must be written to DRAM rather than the full cache line. PRA exploits the inherent fine-grained structures of DRAM (i.e., MATs), activating or deactivating them on each row activation request. An asymmetric row activation mechanism for read and write accesses keeps this from sacrificing DRAM bandwidth. Thus, a large amount of row activation and I/O power consumption, both major contributors to DRAM power consumption, can be eliminated.
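The partial-write idea in the last proposal can be illustrated with a small software model. The following Python sketch is not the dissertation's mechanism; it only shows how one might count which eighths of a cache line actually changed. The 64-byte line size, the split into eight equal segments, the comparison of old and new contents, and the names `dirty_segments` and `write_cost` are all assumptions made for illustration. The abstract does not say how the hardware learns which parts of a line must be written.

```python
from __future__ import annotations

LINE_BYTES = 64  # assumed cache line size (not stated in the abstract)
SEGMENTS = 8     # assumed: one segment per one-eighth of the line


def dirty_segments(old: bytes, new: bytes, segments: int = SEGMENTS) -> list[bool]:
    """Return one flag per segment; True means that segment's bytes changed."""
    if len(old) != len(new) or len(old) % segments != 0:
        raise ValueError("lines must have equal length divisible by segments")
    size = len(old) // segments
    return [
        old[i * size:(i + 1) * size] != new[i * size:(i + 1) * size]
        for i in range(segments)
    ]


def write_cost(old: bytes, new: bytes, segments: int = SEGMENTS) -> tuple[int, int]:
    """Return (number of changed segments, bytes that would need transferring)."""
    flags = dirty_segments(old, new, segments)
    changed = sum(flags)
    return changed, changed * (len(old) // segments)


# Hypothetical example: a zero-filled line with two modified bytes.
old = bytes(LINE_BYTES)
line = bytearray(old)
line[3] = 0xFF   # falls in segment 0 (bytes 0-7)
line[60] = 0x01  # falls in segment 7 (bytes 56-63)
new = bytes(line)

print(dirty_segments(old, new))
print(write_cost(old, new))  # (2, 16): two of eight segments, 16 of 64 bytes
```

In this toy model, a write-back that changes two segments would need only a quarter of the line's data to be transferred. A result of zero changed segments corresponds to the silent-store case that the third proposal exploits.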
Advisors
Kim, Soontae (김순태)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2017
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2017.2, [vii, 95 p.]

Keywords

DRAM; low-power; memory traffic clustering; memory request scheduling; partial data write

URI
http://hdl.handle.net/10203/242086
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=675855&flag=dissertation
Appears in Collection
CS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
