Exploratory networks using binary rewards for multigoal reinforcement learning (이진 보상을 사용하는 탐험 네트워크를 통한 다중 목표 강화학습)

In multigoal reinforcement learning, an agent learns a policy that achieves multiple goals by interacting with an environment. With a sparse binary reward and a large state space, successful episodes are rare, which slows learning and makes it more difficult. To address these problems, prior research has focused on reward design, efficient exploration, and experience sampling. However, traditional reward designs may encode the developer's bias, and efficient exploration remains challenging. This paper proposes a method that improves exploration efficiency while minimizing the developer's bias by using exploratory networks trained with binary rewards. The binary rewards used for the exploratory networks involve only minimal developer bias, since they merely encourage the agent to take actions that have a possibility of success. Because the main network is isolated and depends only on the goal reward, which is a sparse binary reward, the final behavior of the main network achieves the goal without involving the developer's bias. The proposed method can increase exploration efficiency through the same effect as a combined reward with multiple terms in the reward function, while retaining the advantage of sparse binary rewards, namely that the developer's bias is not involved. The method is evaluated by comparing the performance of agents with and without it on the Push, PickAndPlace, and Slide tasks, in which a robot moves an object to a target point. In all three experiments, agents using the proposed method show a higher or marginally improved success rate compared with agents that do not use it.
Advisors
Har, Dongsoo (하동수)
Description
Korea Advanced Institute of Science and Technology (KAIST): Cho Chun Shik Graduate School of Green Transportation
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2021
Identifier
325007
Language
eng
Description

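Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Cho Chun Shik Graduate School of Green Transportation, 2021.2, [v, 36 p.]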

Keywords

Multigoal Reinforcement Learning; Exploratory Network; Binary Reward; Reward Design

URI
http://hdl.handle.net/10203/296216
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=948359&flag=dissertation
Appears in Collection
GT-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
