AR/VR systems require heavy processing for object recognition, voice recognition, hand gesture recognition (HGR), and camera pose estimation, all under limited battery capacity and computing capability. A CMOS image sensor (CIS) in an AR/VR system can integrate processing functions to reduce data transfer and power consumption. Three functional CISs and a UI processor are explained. An 'Eye-Mouse' with gaze tracking, event-driven ultra-low-power face detection, video-based human action recognition, and CNN-based HGR processors for AR/VR applications are also explained, along with their architectures and key functions. In future AR/VR user interaction, SoCs are expected to use more DNN functional blocks for 3D point neural networks and to fuse more sensors for better accuracy, lower power consumption, and easier implementation.