We present an approach for efficiently detecting 2D human upper-body pose in RGB images. Among systems for estimating joint positions, a method that uses only an RGB camera is very cost-effective compared with systems that rely on high-priced sensors, such as motion capture systems. In this work, we use semantic segmentation with a fully convolutional network to estimate the upper-body pose of each skeleton and select the joint location coordinates from joint heatmaps. The architecture is designed to learn joint locations and their associations through a sequential prediction process. We demonstrate the performance of the proposed method on various datasets.
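
As a rough illustration of the heatmap-to-coordinate step mentioned above, the following minimal sketch takes the highest-scoring pixel of each joint heatmap as that joint's location. It is an assumption about how such a step could look, not the implementation used in this work: the joint names, the `min_score` threshold, and the helper names `heatmap_to_coordinate` and `decode_pose` are all hypothetical, and plain Python lists stand in for network outputs so that the example stays self-contained.

```python
from typing import Dict, List, Optional, Tuple

# One H x W score map per joint (hypothetical stand-in for a network output).
Heatmap = List[List[float]]


def heatmap_to_coordinate(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, score) of the highest-scoring pixel in one heatmap."""
    best_x, best_y, best_score = 0, 0, float("-inf")
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_x, best_y, best_score = x, y, score
    return best_x, best_y, best_score


def decode_pose(
    heatmaps: Dict[str, Heatmap], min_score: float = 0.1
) -> Dict[str, Optional[Tuple[int, int]]]:
    """Pick one (x, y) per joint; joints whose peak is below min_score map to None.

    min_score is an assumed cutoff for treating a joint as not visible.
    """
    pose: Dict[str, Optional[Tuple[int, int]]] = {}
    for name, heatmap in heatmaps.items():
        x, y, score = heatmap_to_coordinate(heatmap)
        pose[name] = (x, y) if score >= min_score else None
    return pose


# Toy 3 x 3 heatmaps with hypothetical joint names.
example_heatmaps = {
    "head": [
        [0.0, 0.1, 0.0],
        [0.1, 0.9, 0.2],
        [0.0, 0.1, 0.0],
    ],
    "left_wrist": [
        [0.0, 0.0, 0.0],
        [0.0, 0.0, 0.05],
        [0.0, 0.0, 0.0],
    ],
}
example_pose = decode_pose(example_heatmaps)
print(example_pose)
```

In this toy input the head peak lies at the center pixel, while the wrist peak falls below the assumed threshold and is reported as missing. A real pipeline would typically refine the peak to sub-pixel precision and rescale it to the input image resolution.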