Motion estimation, which traces the motion of moving objects in video sequences, is a major part of video coding. Among the various motion estimation algorithms, the Hierarchical Block-Matching Algorithm (HBMA), a multilayered motion estimation algorithm, is attractive for motion-compensated interpolation, where accurate motion estimation is required. However, the high computational complexity of HBMA prevents it from operating in real time, so parallel processing is necessary. Further, the repeated updates of motion vectors lead naturally to pipelined processing. In this paper, we present a pipelined architecture for HBMA. We investigate the data dependencies of HBMA and the requirements for the pipeline to operate synchronously. Each pipeline stage of the proposed architecture consists of a systolic array for the block-matching algorithm, a bilinear interpolator, and a latch mechanism. The latch mechanism mainly resolves the data dependencies and arranges the data flow synchronously. The proposed architecture achieves nearly linear speedup over a non-pipelined one without additional hardware cost. It requires a clock period of 2.70 ns to process large frames (e.g., HDTV) in real time, which is close to being achievable with current VLSI technology.
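
To make the layered vector updates concrete, the following is a minimal software sketch of hierarchical block matching. It is not the proposed hardware architecture. Several choices here are assumptions made only for illustration: the sum of absolute differences (SAD) as the matching criterion, a pyramid built by 2x2 averaging, a fixed block size at every layer, and nearest-neighbour propagation of vectors between layers in place of the bilinear interpolator used in each pipeline stage. All function names (`sad`, `block_match`, `downsample`, `hbma`) are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). Reference coordinates are clamped to the
    frame (an assumed border policy)."""
    h, w = len(ref), len(ref[0])
    total = 0
    for y in range(n):
        for x in range(n):
            ry = min(max(by + y + dy, 0), h - 1)
            rx = min(max(bx + x + dx, 0), w - 1)
            total += abs(cur[by + y][bx + x] - ref[ry][rx])
    return total


def block_match(cur, ref, bx, by, n, init, radius):
    """Full search in a (2*radius+1)^2 window centred on the vector `init`."""
    best_cost, best_vec = None, init
    for dy in range(init[1] - radius, init[1] + radius + 1):
        for dx in range(init[0] - radius, init[0] + radius + 1):
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec


def downsample(frame):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
              + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]


def hbma(cur, ref, levels=2, n=4, radius=2):
    """Return a 2-D list of (dx, dy) vectors, one per n x n block of `cur`.
    Frame dimensions are assumed to be multiples of n * 2**(levels - 1)."""
    pyramid = [(cur, ref)]
    for _ in range(levels - 1):
        c, r = pyramid[-1]
        pyramid.append((downsample(c), downsample(r)))

    field = None
    for c, r in reversed(pyramid):          # coarsest layer first
        rows, cols = len(c) // n, len(c[0]) // n
        new_field = []
        for j in range(rows):
            row = []
            for i in range(cols):
                if field is None:
                    init = (0, 0)           # no prior estimate at the top layer
                else:
                    # Nearest-neighbour propagation (a simplification):
                    # take the parent block's vector and double it, since
                    # the resolution doubles from one layer to the next.
                    pj = min(j // 2, len(field) - 1)
                    pi = min(i // 2, len(field[0]) - 1)
                    init = (2 * field[pj][pi][0], 2 * field[pj][pi][1])
                row.append(block_match(c, r, i * n, j * n, n, init, radius))
            new_field.append(row)
        field = new_field                   # this layer seeds the next one
    return field
```

Each layer's search is seeded by the layer above it, which is the dependency that a pipelined design has to respect: a stage cannot begin refining a block until the coarser estimate covering that block is available. In this sketch the dependency is satisfied trivially by processing the layers one after another.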