Motivated by interest in advanced marine propulsion systems, much research has been devoted to mimicking fish-like locomotion with flapping fins. This study aims to optimize the swimming pattern of fish-like locomotion using hierarchical reinforcement learning. A simplified carangiform fish model is employed, and a segmented tail motion is learned by Q-learning so that flapping the tail fin maximizes the average forward velocity. The performance of the self-learned swimming pattern is verified and analyzed in terms of flapping efficiency. The results show that a flapping angle limit of approximately 35 degrees is best for maximizing the forward velocity, and that the hierarchical reinforcement learning approach is effective in providing a reasonable solution to a large-scale problem.
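To make the learning step concrete, the following is a minimal sketch of flat tabular Q-learning on a toy tail-angle model. It is not the hierarchical scheme or the carangiform fish model of this study: the number of angle bins (`N_ANGLES`), the three actions, the learning parameters, and the reward (the change in angle bin, used here only as a stand-in for forward velocity) are all assumptions chosen for illustration.

```python
import random

N_ANGLES = 7          # hypothetical number of discretized tail angles
ACTIONS = (-1, 0, 1)  # step the tail angle down, hold it, or step it up


def step(state, action_index):
    """Toy transition: move the tail one bin, clamped to the angle limit."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_ANGLES - 1)
    # Placeholder reward (assumption): a real model would return the
    # forward velocity produced by the flapping motion.
    reward = float(abs(next_state - state))
    return next_state, reward


def q_update(q, state, action_index, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    target = reward + gamma * max(q[next_state])
    q[state][action_index] += alpha * (target - q[state][action_index])
    return q[state][action_index]


def train(steps=5000, epsilon=0.2, seed=0):
    """Run epsilon-greedy Q-learning on the toy model and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_ANGLES)]
    state = N_ANGLES // 2  # start with the tail centered
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a random action.
            action_index = rng.randrange(len(ACTIONS))
        else:
            # Exploit: pick the action with the highest current Q-value.
            action_index = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        next_state, reward = step(state, action_index)
        q_update(q, state, action_index, reward, next_state)
        state = next_state
    return q
```

Calling `train()` returns a table with one row per angle bin and one column per action. In a hierarchical setting, a learner of this kind would presumably handle one sub-problem (for example, one tail segment) rather than the whole motion at once, which is what keeps the state space manageable.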