This paper considers adaptive training beam sequence design for efficient channel estimation in large millimeter-wave (mmWave) multiple-input multiple-output (MIMO) channels. By exploiting the sparsity of large mmWave MIMO channels and imposing a Markovian random walk assumption on the movement of the receiver and reflection clusters, the adaptive training beam sequence design and channel estimation problem is formulated as a partially observable Markov decision process (POMDP) that finds the non-zero bins in a two-dimensional grid. Under the proposed POMDP framework, optimal and suboptimal adaptive training beam sequence design policies are derived. Furthermore, a very fast suboptimal greedy algorithm is developed based on a newly proposed reduced sufficient statistic, which lowers the computational complexity to a level suitable for practical implementation. Numerical results are provided to evaluate the proposed training beam design method and show that the proposed training beam sequence design algorithms yield good performance.
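To make the grid-search view of the problem concrete, the following is a minimal illustrative sketch, not the algorithm derived in the paper. It assumes a single propagation path occupying one bin of a two-dimensional angular grid, a random walk that wraps around the grid edges, a binary detection outcome per probed bin, and a greedy rule that probes the most likely bin. The full belief map stands in for the sufficient statistic; the reduced sufficient statistic proposed in the paper is not reproduced here. All function names and the parameters `stay_prob`, `p_detect`, and `p_false` are hypothetical.

```python
import numpy as np


def predict(belief, stay_prob=0.8):
    """Propagate the belief one step through an assumed 2D random walk.

    The path stays in its bin with probability stay_prob; otherwise it moves
    to one of the four neighboring bins with equal probability (wrapping
    around the grid edges, which is an assumption of this sketch).
    """
    move = (1.0 - stay_prob) / 4.0
    shifted = (np.roll(belief, 1, axis=0) + np.roll(belief, -1, axis=0)
               + np.roll(belief, 1, axis=1) + np.roll(belief, -1, axis=1))
    return stay_prob * belief + move * shifted


def update(belief, probe, detected, p_detect=0.9, p_false=0.1):
    """Bayes update of the belief after probing one bin (binary outcome)."""
    # Likelihood of the outcome given the path is in each candidate bin.
    likelihood = np.full(belief.shape, p_false if detected else 1.0 - p_false)
    likelihood[probe] = p_detect if detected else 1.0 - p_detect
    posterior = likelihood * belief
    return posterior / posterior.sum()


def greedy_probe(belief):
    """Greedy rule: probe the bin with the largest current belief."""
    idx = np.unravel_index(np.argmax(belief), belief.shape)
    return tuple(int(i) for i in idx)


def run(n_rx=8, n_tx=8, steps=50, stay_prob=0.8, seed=0):
    """Simulate greedy adaptive probing of a randomly walking non-zero bin."""
    rng = np.random.default_rng(seed)
    belief = np.full((n_rx, n_tx), 1.0 / (n_rx * n_tx))  # uniform prior
    true_bin = (int(rng.integers(n_rx)), int(rng.integers(n_tx)))
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    for _ in range(steps):
        probe = greedy_probe(belief)
        # Simulated noisy observation of the probed bin.
        p = 0.9 if probe == true_bin else 0.1
        detected = bool(rng.random() < p)
        belief = update(belief, probe, detected)
        # The true path and the belief both advance by one random-walk step.
        if rng.random() > stay_prob:
            dr, dc = moves[int(rng.integers(4))]
            true_bin = ((true_bin[0] + dr) % n_rx, (true_bin[1] + dc) % n_tx)
        belief = predict(belief, stay_prob)
    return belief, true_bin


if __name__ == "__main__":
    belief, true_bin = run()
    print("true bin:", true_bin, "most likely bin:", greedy_probe(belief))
```

The sketch alternates a measurement update with a random-walk prediction, which is the standard belief recursion for a POMDP with a hidden Markov state. Probing the bin with the largest belief is only one possible greedy criterion and is chosen here for simplicity.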