Energy efficiency remains a critical challenge in network-on-chip (NoC) based machine learning (ML) accelerators due to high computation and communication costs. Existing NoC optimization techniques often overlook the potential for co-optimizing NoC characteristics with the design of the ML model itself. We propose NoC-aware mixed-precision quantization (NMIX), a two-phase algorithm that jointly minimizes computation and routing energy while preserving model accuracy. The first phase performs a coarse search to identify accuracy-sensitive quantization boundaries; the second performs fine-grained precision assignment based on these sensitivities and NoC energy profiles. Experiments on image classification tasks with VGG and ResNet demonstrate energy savings of up to 91.5% over FP32 and 57.5% over uniform 8-bit quantization, confirming the scalability and effectiveness of NMIX for energy-efficient ML accelerators.
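
To make the two-phase flow concrete, the following is a minimal sketch in Python, assuming per-layer precision choices, a simple accuracy-drop threshold for the coarse search, and illustrative NoC energy numbers; the function names, thresholds, and profiling values are hypothetical assumptions and not the paper's implementation.

```python
# Hypothetical sketch of a two-phase mixed-precision flow in the spirit of NMIX.
# Layer names, energy model, and the sensitivity metric are illustrative only.

from typing import Dict, List

CANDIDATE_BITS = [4, 6, 8]     # assumed precision choices
ACC_DROP_BUDGET = 0.5          # assumed tolerable accuracy drop (%) at low precision


def coarse_search(layers: List[str],
                  acc_drop_at_low_bits: Dict[str, float]) -> List[str]:
    """Phase 1: flag layers whose accuracy degrades sharply when quantized
    aggressively; these mark the accuracy-sensitive quantization boundary."""
    return [l for l in layers if acc_drop_at_low_bits[l] > ACC_DROP_BUDGET]


def fine_assignment(layers: List[str],
                    sensitive: List[str],
                    noc_energy_per_bit: Dict[str, float]) -> Dict[str, int]:
    """Phase 2: keep sensitive layers at high precision; push layers with
    costly NoC traffic toward the narrowest bit-width."""
    bits = {}
    for l in layers:
        if l in sensitive:
            bits[l] = max(CANDIDATE_BITS)        # protect accuracy
        elif noc_energy_per_bit[l] > 1.0:
            bits[l] = min(CANDIDATE_BITS)        # expensive routing: narrowest width
        else:
            bits[l] = CANDIDATE_BITS[1]          # cheap routing: mid-range width
    return bits


# Toy usage with made-up profiling numbers
layers = ["conv1", "conv2", "fc"]
sensitive = coarse_search(layers, {"conv1": 0.9, "conv2": 0.2, "fc": 0.1})
print(fine_assignment(layers, sensitive, {"conv1": 0.8, "conv2": 1.4, "fc": 0.6}))
```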