Compression-based architecture design through cross-layer optimizations

Due to the exponential growth of data used and generated by key workloads, accessing and storing data have emerged as primary bottlenecks in modern computing systems. To mitigate the overheads of data handling, compression-based architectures have been widely used in various domains. Regardless of the hardware domain, optimizing a compression-based architecture requires two things: 1) minimizing the compression and decompression overheads and 2) maximizing the compression effect. To achieve these goals, this dissertation applies two kinds of cross-layer optimization: 1) harnessing software-layer characteristics and 2) redefining the boundary between hardware and software.
First, we leverage software-layer characteristics to optimize the code compression algorithm and the hardware components. To optimize the code compression algorithm, we analyze the entropy of instruction encodings and find that certain bits within the 32-bit instruction encoding of RISC ISAs have low entropy due to several characteristics of high-level languages, such as reusability and the calling convention. Based on this observation, we co-design the code compression algorithm and the hardware components of the code compression support architecture. As a result of this cross-layer optimization and co-design, we achieve a higher code compression effect and reduce both the total energy consumption and the area of the code compression support architecture compared to state-of-the-art architectures.
Second, we conduct a detailed bit-level analysis of data patterns in software to overcome the limitations of previous intra- and inter-block compression techniques. From this analysis, we identify two types of low entropy among blocks. The first type is low entropy that occurs naturally among memory blocks with the same word layout. The second type is low entropy that is generated artificially by our three proposed optimization techniques. Based on these two types, we propose an entropy-based inter-block pattern compression (EPC) technique. To manage inter-block patterns efficiently, we propose hardware-based and profiling-based pattern selection methods. Moreover, we present a hybrid approach that leverages both intra- and inter-block compression techniques. As a result, EPC achieves a higher speedup and a larger reduction in DRAM energy consumption than the state-of-the-art inter-block compression technique, while significantly reducing the hardware area needed to support inter-block compression.
Lastly, we redefine the boundary between hardware and software for tiling in sparse matrix multiplication. To minimize data movement in sparse matrix multiplication, the state-of-the-art accelerator tiles the input matrix through software preprocessing. However, this software-based tiling generates a compression format for each tile and does not provide any data-skipping information for the other input matrix. Consequently, software-based tiling incurs large memory overheads and ineffectual accesses. To overcome these limitations, we introduce Hardware-based Pseudo-Tiling (HARP), which performs the tiling process in hardware instead of software. HARP tiles the input matrix logically while preserving its original compression format. To realize pseudo-tiling, we propose the Runtime Operand Descriptor (ROD), which points to the effectual elements in a particular pseudo-tile. By using RODs, HARP not only accesses the effectual elements in a pseudo-tile but also naturally skips ineffectual accesses. As a result, HARP achieves higher speedup and energy efficiency than the state-of-the-art accelerator, a CPU, and a GPU.
The cross-layer optimizations introduced in this dissertation will enable the development of more efficient compression-based architectures and inspire advances in other domains of compression-based architecture. Given the anticipated increase in data use and generation, ongoing research into optimizing compression-based architectures remains essential.
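The bit-level entropy analysis mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's method or tooling: it assumes each bit position is treated independently, and the function name `per_bit_entropy` and the sample words are hypothetical. It computes the Shannon entropy of every bit position over a list of 32-bit words, so positions with entropy near zero are almost constant across the sample and positions near one bit are split close to evenly.

```python
import math
from typing import List, Sequence


def per_bit_entropy(words: Sequence[int], width: int = 32) -> List[float]:
    """Shannon entropy (in bits) of each bit position; index 0 is the LSB.

    Assumption: every bit position is analyzed on its own, ignoring
    correlations between positions.
    """
    if not words:
        raise ValueError("need at least one word")
    n = len(words)
    entropies = []
    for bit in range(width):
        # Fraction of words in which this bit is set.
        p = sum((w >> bit) & 1 for w in words) / n
        if p in (0.0, 1.0):
            # A constant bit carries no information.
            entropies.append(0.0)
        else:
            entropies.append(-(p * math.log2(p) + (1 - p) * math.log2(1 - p)))
    return entropies


# Hypothetical sample words, made up for illustration only.
sample = [0x00000013, 0x00000013, 0x00000093, 0x00000013]
entropy = per_bit_entropy(sample)

# Report the positions that vary at all in this sample.
for bit, h in enumerate(entropy):
    if h > 0.0:
        print(f"bit {bit}: {h:.4f} bits of entropy")
```

In this sample only bit 7 differs between the words, so it is the only position reported; a real study would run the same measurement over the instruction stream of compiled programs.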
Advisors
김순태
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2024.2, [vii, 106 p.]

Keywords

Data compression; Code compression; Memory compression; Memory systems; Sparse matrix multiplication; Sparse matrix tiling; Hardware accelerator; Application-specific hardware

URI
http://hdl.handle.net/10203/322202
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1100111&flag=dissertation
Appears in Collection
CS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
