Two mode Q-learning using failure experience of the agent and its application to biped robot

DC Field | Value | Language
dc.contributor.advisor | Kim, Jong-Hwan | -
dc.contributor.advisor | 김종환 | -
dc.contributor.author | Park, Kui-Hong | -
dc.contributor.author | 박귀홍 | -
dc.date.accessioned | 2011-12-14 | -
dc.date.available | 2011-12-14 | -
dc.date.issued | 2004 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=237638&flag=dissertation | -
dc.identifier.uri | http://hdl.handle.net/10203/35208 | -
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : Electrical and Electronic Engineering, 2004.2, [ ix, 154 p. ] | -
dc.description.abstract | In this thesis, a two mode Q-learning method is proposed for fast convergence, extending Q-learning, a well-known scheme in reinforcement learning. It employs a separate failure Q value that keeps track of failure experiences and uses this value to modify the exploratory behavior of a learning agent. The effectiveness of the proposed two mode Q-learning method is verified in a grid world environment. Its performance is also evaluated against conventional Q-learning in training a soccer agent to perform goalkeeping. The knowledge acquired by two mode Q-learning is implemented on the goalie robot of the NaroSot soccer system, and the effects of varying the parameters of two mode Q-learning are investigated. In addition, a biped robot called HSR-IV is used as a test bed to compare the performance of both algorithms. An external force generated in the sagittal plane is applied to the HSR-IV and its standing posture is investigated. Q-learning and two mode Q-learning are each employed to select an action for resisting the external force. An external force is also generated in the frontal plane and applied to the HSR-IV. In this situation, more than two actuators, among those considered in this thesis, can be used to resist the external force. To implement Q-learning and two mode Q-learning in this situation, the curse of dimensionality must be addressed. To solve this problem, a module-based scheme is adopted. The effectiveness of module-based two mode Q-learning is verified by real experiments using HSR-IV. | eng
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Two mode Q-learning | -
dc.subject | Failure experience | -
dc.subject | Q-learning | -
dc.subject | Biped robot | -
dc.title | Two mode Q-learning using failure experience of the agent and its application to biped robot | -
dc.title.alternative | Two mode Q-learning using failure experience of the agent and its application to biped robot (translated from the Korean title) | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 237638/325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Electrical and Electronic Engineering | -
dc.identifier.uid | 000995132 | -
dc.contributor.localauthor | Kim, Jong-Hwan | -
dc.contributor.localauthor | 김종환 | -
Appears in Collection
EE-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
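
The abstract describes the core idea only briefly: a separate failure Q value records failure experiences and is used to change how the agent explores. The thesis text is not attached to this record, so the following Python sketch is one possible reading of that idea, not the author's algorithm. The grid layout, the class names `GridWorld` and `TwoModeQAgent`, the failure update rule, and the `fail_threshold` parameter are all assumptions made for illustration.

```python
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class GridWorld:
    """Hypothetical 4x4 grid: start (0, 0), goal (3, 3), one pit at (1, 1)."""

    def __init__(self, size=4, start=(0, 0), goal=(3, 3), pits=((1, 1),)):
        self.size, self.start, self.goal, self.pits = size, start, goal, set(pits)

    def step(self, state, action):
        """Return (next_state, reward, done, failed); walls clamp the move."""
        dr, dc = MOVES[action]
        r = min(max(state[0] + dr, 0), self.size - 1)
        c = min(max(state[1] + dc, 0), self.size - 1)
        nxt = (r, c)
        if nxt in self.pits:
            return nxt, -1.0, True, True    # failure ends the episode
        if nxt == self.goal:
            return nxt, 1.0, True, False    # success ends the episode
        return nxt, 0.0, False, False


class TwoModeQAgent:
    """Tabular Q-learning plus a separate failure table (assumed design)."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.2,
                 fail_threshold=-0.3, seed=0):
        self.q = defaultdict(float)    # ordinary Q values, keyed by (state, action)
        self.qf = defaultdict(float)   # failure Q values, same keys
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.fail_threshold = fail_threshold  # assumed cut-off for "known to fail"
        self.rng = random.Random(seed)

    def safe_actions(self, state):
        """Actions whose failure value is still above the threshold."""
        safe = [a for a in ACTIONS if self.qf[(state, a)] > self.fail_threshold]
        return safe or list(ACTIONS)   # fall back if every action looks bad

    def select_action(self, state):
        """Epsilon-greedy choice restricted to actions not marked as failing."""
        candidates = self.safe_actions(state)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(candidates)
        best = max(self.q[(state, a)] for a in candidates)
        return self.rng.choice([a for a in candidates if self.q[(state, a)] == best])

    def update(self, state, action, reward, nxt, done, failed):
        """Standard Q update, plus the same rule driven by a failure signal."""
        key = (state, action)
        future = 0.0 if done else max(self.q[(nxt, a)] for a in ACTIONS)
        self.q[key] += self.alpha * (reward + self.gamma * future - self.q[key])
        signal = -1.0 if failed else 0.0
        f_future = 0.0 if done else max(self.qf[(nxt, a)] for a in ACTIONS)
        self.qf[key] += self.alpha * (signal + self.gamma * f_future - self.qf[key])


def train(agent, env, episodes=200, max_steps=50):
    """Run episodes from the start state, updating both tables at each step."""
    for _ in range(episodes):
        state = env.start
        for _ in range(max_steps):
            action = agent.select_action(state)
            nxt, reward, done, failed = env.step(state, action)
            agent.update(state, action, reward, nxt, done, failed)
            state = nxt
            if done:
                break


if __name__ == "__main__":
    agent, env = TwoModeQAgent(), GridWorld()
    train(agent, env)
    print({a: round(agent.qf[((0, 1), a)], 3) for a in ACTIONS})
```

In this sketch a single fall into the pit is enough to push the failure value of that state-action pair below the assumed threshold, so later exploration skips it. Whether the thesis uses a threshold, a weighted combination of the two values, or some other mechanism cannot be determined from the abstract alone.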
