A DNN training processor achieving a peak energy efficiency of 332 TOPS/W is proposed for efficient and robust object detection. The proposed processor supports both quantization- and pruning-based personalization to produce a user-optimized lightweight network. Beyond personalization, it supports real-time adaptation to compensate for accuracy degradation caused by environmental changes or unpredictable situations. It retains the conventional input-slice-skipping architecture and stochastic-rounding-based computing for efficient acceleration of DNN training, and further improves efficiency by removing the pseudo-RNGs used for stochastic rounding and by adding dedicated blocks for pruning-aware training. Moreover, it introduces an LT-flag-based reconfigurable accumulation network and enables multi-learning-task allocation for low-latency DNN training through a backward-unlocking scheme. Fabricated in 28-nm technology, the proposed processor demonstrates 46.6-FPS object detection at 0.95 mJ/frame, which is state-of-the-art performance compared with previous processors.
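For readers unfamiliar with stochastic rounding, the sketch below illustrates the rounding rule itself: a value is quantized to a fixed-point grid by rounding up with probability equal to its fractional part, which keeps the quantization unbiased in expectation during training. This is a generic software illustration, not the processor's hardware implementation; in particular, it uses Python's software RNG, whereas the proposed processor removes the pseudo-RNGs from this step. The function name, bit width, and assumed [-1, 1] input range are illustrative choices, not details from the paper.

```python
import math
import random

def stochastic_round(x, num_bits=8, rng=random):
    """Quantize x (assumed in [-1, 1]) to a signed fixed-point grid
    using stochastic rounding: round up with probability equal to
    the fractional part, so E[quantized] == x."""
    scale = 2 ** (num_bits - 1) - 1   # e.g. 127 levels per side for 8 bits
    scaled = x * scale
    lo = math.floor(scaled)
    frac = scaled - lo                # fractional part in [0, 1)
    q = lo + (1 if rng.random() < frac else 0)
    return q / scale
```

Averaged over many draws, the quantized value converges to the input, which is why stochastic rounding preserves small gradient updates that round-to-nearest would discard at low precision.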