This brief proposes a new power-efficient Chien search (CS) architecture for parallel Bose-Chaudhuri-Hocquenghem (BCH) codes. For syndrome-based decoding, the CS plays a significant role in finding error locations, but exhaustive computation incurs a huge waste of power consumption. In the proposed architecture, the searching process is decomposed into two steps based on the binary matrix representation. Unlike the first step accessed every cycle, the second step is activated only when the first step is successful, resulting in remarkable power saving. Furthermore, an efficient architecture is presented to avoid the delay increase in critical paths caused by the two-step approach. Experimental results show that the proposed two-step architecture for the BCH (8752, 8192, 40) code saves power consumption by up to 50% compared with the conventional architecture.