DSpace at KOASAS: Online neural Q-Learning using heuristic weight assignment algorithm and optimization method


Online neural Q-Learning using heuristic weight assignment algorithm and optimization method

Kim, Yeon-Seob / 김연섭

Classic robots, and even most recent ones, still rely on fixed, behavior-based control. Recent robot models therefore focus on increasing the robot's ability to deal with uncertainty in the environment. One approach within this paradigm is to learn from experience and build an appropriate control system from it; this approach is called Reinforcement Learning (RL). RL is a class of intelligent control methods that develop or improve the actions of an agent in an uncertain environment. By interacting with the environment, the agent learns and finds an optimal solution. To find the optimal solution, RL uses the value function. The value function is calculated using the Bellman equation, which is a nonlinear Lyapunov equation. However, solving for the value function usually requires knowledge of the system dynamics. To avoid this requirement, Watkins introduced the Q-Learning method for discrete spaces. Another method is action-dependent heuristic dynamic programming (AD HDP), which is based on the actor-critic structure introduced by Werbos. However, the actor-critic structure involves training two or more function approximators, which makes both the training and the analysis of the results difficult: if training fails, it is unclear whether the cause is the settings of the training parameters, the choice of function approximators, or insufficient exploration in generating the data. In contrast, Neural Q-Learning, which involves training a single function approximator, was introduced by S. Hagen to apply Q-Learning to continuous spaces. This approach is based on Q-Learning for Linear Quadratic Regulation (LQR). However, Neural Q-Learning learns very slowly on complex systems such as Multi Input Multi Output (MIMO) systems. Furthermore, batch learning cannot adapt to other environments without a large data set for the training process. To solve these problems, we propose three contributions. First, we apply this learning to online learning t...
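For readers unfamiliar with the discrete-space Q-Learning that the abstract attributes to Watkins, the following is a minimal illustrative sketch of the tabular update, which learns action values without a model of the system dynamics. It is not the thesis's Neural Q-Learning method. The five-state chain environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions chosen only for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning on a hypothetical chain: states 0..n_states-1.

    Action 0 moves left (staying put at state 0), action 1 moves right.
    Reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])

            # Environment step (assumed deterministic dynamics).
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Q-learning target: r + gamma * max_a' Q(s', a'),
            # with no bootstrap term at the terminal state.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [max(range(2), key=lambda a: row[a]) for row in q[:-1]]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

The update uses only observed transitions (state, action, reward, next state), which is the sense in which Q-Learning avoids requiring the system dynamics. A table of this kind does not extend to continuous state spaces, which is the gap that function-approximation approaches such as the Neural Q-Learning described above are meant to address.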

Advisors: Lee, Ju-Jang / 이주장

Description: Korea Advanced Institute of Science and Technology : The Robotics Program

Publisher: Korea Advanced Institute of Science and Technology

Issue Date: 2013

Identifier: 514920/325007 / 020113119

Language: eng

Description: Thesis (Master's) - Korea Advanced Institute of Science and Technology : The Robotics Program, 2013.2, [ v, 39 p. ]

Keywords: reinforcement learning; intelligent control; neural Q-Learning; balancing control; online learning

URI: http://hdl.handle.net/10203/182397

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=514920&flag=dissertation

Appears in Collection: RE-Theses_Master (Master's Theses)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
