In millimeter-wave communication, beamforming with accurate angle information plays a key role in overcoming high path loss and mitigating interference. With multiple mobile stations (MSs) in particular, accurate multi-beam tracking without any knowledge of the dynamic model is challenging. To address this, we propose a model-free multi-beam tracking algorithm that combines Q-learning with auxiliary beam pair-based angle estimation in a multi-MS environment. The proposed scheme benefits from low pilot overhead and high-resolution angle estimation. Simulation results show that the proposed scheme outperforms conventional schemes in terms of effective sum-rate. © 2021 The Korean Institute of Communications and Information Sciences (KICS). Publishing services by Elsevier B.V.