This paper presents an optimization model for finding a preventive maintenance policy for a two-stage tandem queue using a Markov decision process (MDP). Preventive maintenance actions are assumed to be non-preemptive. Compared with the classic single-server queue model, a second machine and its corresponding buffer are added and investigated. An optimal policy is derived based on the amount of work-in-process (WIP) in each buffer and the health status of each machine. The structure of the optimal policy shows that a control-limit policy exists, and numerical studies reveal trends in the optimal thresholds. Finally, the proposed model is shown to outperform the single-server policy on all performance metrics investigated: cycle time, operating cost, and availability.
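To make the notion of a control-limit policy concrete, the following is a minimal sketch of value iteration on a deliberately simplified toy problem: a single machine with no queue, not the two-stage tandem model studied in this paper. All names and parameter values (`solve_toy_maintenance_mdp`, `p_degrade`, `c_op`, `c_pm`, `c_fail`, `gamma`) are assumptions chosen for illustration only. Under these assumptions, the computed policy is to keep running while the health level is below some threshold and to maintain at or above it.

```python
def solve_toy_maintenance_mdp(n_levels=6, p_degrade=0.3, c_op=1.0,
                              c_pm=5.0, c_fail=200.0, gamma=0.95,
                              tol=1e-9, max_iter=100_000):
    """Value iteration for a toy single-machine maintenance MDP.

    Illustrative assumptions (not taken from the paper):
      - health levels 0 (as new) .. n_levels-1 (worst),
      - action 0 = keep running, action 1 = preventive maintenance,
      - running costs c_op * h per period (c_fail at the worst level) and
        the machine degrades by one level with probability p_degrade,
      - maintenance costs c_pm and resets the machine to level 0,
      - discounted cost criterion with discount factor gamma.
    """
    worst = n_levels - 1

    def run_cost(h):
        return c_fail if h == worst else c_op * h

    def q_values(v, h):
        # Expected discounted cost of each action in health level h.
        nxt = min(h + 1, worst)
        q_run = run_cost(h) + gamma * ((1 - p_degrade) * v[h] + p_degrade * v[nxt])
        q_pm = c_pm + gamma * v[0]
        return q_run, q_pm

    v = [0.0] * n_levels
    for _ in range(max_iter):
        new_v = [min(q_values(v, h)) for h in range(n_levels)]
        diff = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if diff < tol:
            break

    # Greedy policy: maintain only when it is strictly cheaper than running.
    policy = []
    for h in range(n_levels):
        q_run, q_pm = q_values(v, h)
        policy.append(1 if q_run > q_pm else 0)
    return v, policy


def control_limit(policy):
    """Return the first health level at which maintenance is chosen, or None."""
    for h, action in enumerate(policy):
        if action == 1:
            return h
    return None


if __name__ == "__main__":
    values, policy = solve_toy_maintenance_mdp()
    print("policy by health level:", policy)
    print("control limit:", control_limit(policy))
```

In the tandem setting the state would also include the WIP level in each buffer and the health of both machines, so the threshold becomes a function of several state variables rather than a single number.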