Today's CPUs are general-purpose processors built on the von Neumann architecture (including its Harvard variants) to maximize generality and programmability. Application-specific integrated circuits (ASICs), by contrast, adopt domain-specific architectures that optimize cost-effective performance but offer very little generality. Deep learning (DL) is expected to bridge generality and ASIC efficiency, two properties that have traditionally seemed incompatible. DL, realized with deep neural networks (DNNs), has changed the paradigm of machine learning (ML) and brought significant progress in vision, speech, language processing, and many other applications. DNNs have characteristic computational features that can be implemented efficiently in dedicated architectures, i.e., ASICs. While sharing these features, DNNs span a wide variety of network architectures, and even the same network architecture can serve different applications depending on its weight parameters. This paper presents the necessity, validity, and characteristics of ML-specific integrated circuits (MSICs), whose architecture departs from the von Neumann architecture. MSICs can avoid the overhead of the complex instruction sets, instruction decoders, multilevel caches, and branch prediction that recent von Neumann processors carry to achieve high generality and programmability. We also discuss the necessity and validity of a heterogeneous architecture within an MSIC, starting from the differences between visual-type and vector-type information processing, and present chip implementation results.