Research on biped humanoid robots is currently one of the most exciting topics in robotics, and among the problems related to robot stability, recovering from an external push is a typical one. Unlike walking and other self-initiated motions, in push recovery the humanoid has difficulty taking action within the short period after an unknown push occurs. Because of this complexity, more robust learning methods are needed to provide a reference control that recovers the robot from the push. Moreover, in imitation of humans and animals, it is preferable for robots to learn the recovery strategy through repeated trials. Therefore, reinforcement learning (RL) is considered.
In this thesis, a model for push recovery is proposed as a reference trajectory. In addition, a reinforcement learning method suited to continuous state and action spaces is introduced and applied to push recovery under unknown pushes. Based on this RL method, a model-based RL and a modified RL are also proposed to realize the application of RL to robot control.
Simulation results are presented to indicate the performance of the proposed methods and are compared with one another to show the advantages of each.