A pattern-matching based image recognition accelerator (PRA) is presented for embedded vision applications. It is a hardware accelerator that performs interest point detection and matching for image-based recognition applications in real time in both mobile devices and vehicles. The proposed system is implemented as a small IP, and it has eight times higher throughput than state-of-the-art object recognition processors, which are implemented based on a heterogeneous many-core system. PRA has three key features: 1) joint algorithm-architecture optimizations for exploiting bit-level parallelism; 2) a low-power unified hardware platform for interest point detection and matching; and 3) scalable hardware architecture. PRA achieves 9.5x performance improvement with only 30% of logic gates including static random-access memory (SRAM) compared to the state-of-the-art object recognition processors. It consists of 78.3 k logic gates and 128 kB SRAM, which are integrated in a test chip implemented for PRA verification. It achieves 94.3 frames per second (fps) in 1080p full HD resolution at 200-MHz operating frequency while consuming 182mW. Each complete operation for interest point detection and matching requires 2.09 cycles and 8 cycles on average, respectively, based on a unified bit-level matching accelerator, which is implemented only with 680 logic gates.