DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Kim, Jong-Hwan | - |
dc.contributor.advisor | 김종환 | - |
dc.contributor.author | Park, Kui-Hong | - |
dc.contributor.author | 박귀홍 | - |
dc.date.accessioned | 2011-12-14 | - |
dc.date.available | 2011-12-14 | - |
dc.date.issued | 2004 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=237638&flag=dissertation | - |
dc.identifier.uri | http://hdl.handle.net/10203/35208 | - |
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Electrical and Electronic Engineering, 2004.2, [ ix, 154 p. ] | - |
dc.description.abstract | In this thesis, a two mode Q-learning method is proposed for fast convergence, extending Q-learning, a well-known reinforcement learning scheme. It employs a separate failure Q value that keeps track of failure experiences and uses this value to modify the exploratory behavior of a learning agent. The effectiveness of the proposed two mode Q-learning method is verified in a grid world environment. Its performance is also evaluated against conventional Q-learning in training a soccer agent to perform goalkeeping. The knowledge acquired by two mode Q-learning is implemented on the goalie robot of the NaroSot soccer system. The effects of varying the parameters of two mode Q-learning are investigated. In addition, a biped robot called HSR-IV is used as a test bed to compare the performance of both algorithms. An external force generated in the sagittal plane is applied to the HSR-IV, and its standing posture is investigated. Q-learning and two mode Q-learning are employed to select an action for resisting the external force. An external force is also generated in the frontal plane and applied to the HSR-IV. In this situation, more than two of the actuators considered in this thesis are available for resisting the external force. To implement Q-learning and two mode Q-learning in this situation, the curse of dimensionality must be considered. To solve this problem, a module-based scheme is adopted. The effectiveness of module-based two mode Q-learning is verified by real experiments using the HSR-IV. | eng |
dc.language | eng | - |
dc.publisher | 한국과학기술원 | - |
dc.subject | Two mode Q-learning | - |
dc.subject | Failure experience | - |
dc.subject | Q-learning | - |
dc.subject | Biped robot | - |
dc.subject | 이족보행 로봇 | - |
dc.subject | Two mode Q-학습 | - |
dc.subject | 실패 경험 | - |
dc.subject | Q-학습 | - |
dc.title | Two mode Q-learning using failure experience of the agent and its application to biped robot | - |
dc.title.alternative | 개체의 실패 경험을 활용한 Two mode Q-학습과 이족보행 로봇에의 적용 | - |
dc.type | Thesis(Ph.D) | - |
dc.identifier.CNRN | 237638/325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology : Electrical and Electronic Engineering | - |
dc.identifier.uid | 000995132 | - |
dc.contributor.localauthor | Kim, Jong-Hwan | - |
dc.contributor.localauthor | 김종환 | - |