NMIX: NoC-Aware Mixed Precision Quantization for Energy-Efficient ML Accelerator

Energy efficiency remains a critical challenge in network-on-chip (NoC) based machine learning (ML) accelerators due to high computation and communication costs. Existing NoC optimization techniques often overlook the opportunity to co-optimize NoC characteristics with the ML model design. We propose NoC-aware mixed precision quantization (NMIX), a two-phase algorithm that jointly minimizes computation and routing energy while preserving model accuracy. The first phase performs a coarse search to identify accuracy-sensitive quantization boundaries; the second phase then performs fine-grained precision assignment based on layer sensitivity and NoC energy profiles. Experiments on image classification tasks with VGG and ResNet demonstrate energy savings of up to 91.5% over FP32 and 57.5% over uniform 8-bit quantization, confirming the scalability and effectiveness of NMIX for energy-efficient ML accelerators.
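The two-phase structure described in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical: the layer names, sensitivity scores, traffic volumes, the toy energy model, and the threshold are placeholders invented for illustration, not the paper's actual method, profiles, or data.

```python
# Toy sketch of a two-phase mixed-precision assignment in the spirit of NMIX.
# Phase 1 draws a coarse boundary between accuracy-sensitive and insensitive
# layers; phase 2 assigns the lowest-energy precision to insensitive layers.
# All values and the energy model are hypothetical placeholders.

# Hypothetical per-layer accuracy sensitivity (higher = more sensitive)
sensitivity = {"conv1": 0.9, "conv2": 0.4, "conv3": 0.2, "fc": 0.7}

# Hypothetical per-layer NoC traffic volume (arbitrary units)
traffic = {"conv1": 5.0, "conv2": 8.0, "conv3": 8.0, "fc": 2.0}

CANDIDATE_BITS = (4, 8, 16)
SENSITIVITY_THRESHOLD = 0.5  # phase-1 coarse boundary (hypothetical)


def layer_energy(bits, volume):
    """Toy model: combined compute + routing energy scales with
    precision width times traffic volume."""
    return bits * volume


def assign_precision(sensitivity, traffic):
    """Phase 1: split layers at the coarse sensitivity boundary.
    Phase 2: give sensitive layers high precision; for insensitive
    layers, pick the candidate bit-width with the lowest energy."""
    assignment = {}
    for layer, s in sensitivity.items():
        if s >= SENSITIVITY_THRESHOLD:
            assignment[layer] = max(CANDIDATE_BITS)  # preserve accuracy
        else:
            assignment[layer] = min(
                CANDIDATE_BITS,
                key=lambda b: layer_energy(b, traffic[layer]),
            )
    return assignment


bits = assign_precision(sensitivity, traffic)
total = sum(layer_energy(bits[name], traffic[name]) for name in bits)
print(bits)   # sensitive layers at 16 bits, others at 4 bits
print(total)  # total toy energy under this assignment
```

Under this toy model the insensitive layers collapse to the lowest bit-width because energy is monotone in precision; the actual paper balances sensitivity against NoC energy profiles rather than using a single fixed threshold.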
Publisher
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Issue Date
2025-12
Language
English
Article Type
Article
Citation

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, v.72, no.12, pp.2002 - 2006

ISSN
1549-7747
DOI
10.1109/TCSII.2025.3617919
URI
http://hdl.handle.net/10203/337785
Appears in Collection
EE-Journal Papers (Journal Papers)
Files in This Item
There are no files associated with this item.
