Model ensemble-based intrinsic reward for sparse reward reinforcement learning (Korean title: Intrinsic reward design using multiple probabilistic models in sparse-reward reinforcement learning environments)

This paper proposes a new intrinsic reward generation method, based on an ensemble of dynamics models, for sparse-reward reinforcement learning. The method uses a mixture of multiple dynamics models to approximate the true, unknown transition probability, and defines the intrinsic reward as the minimum, over the models, of the surprise measured from each dynamics model to the mixture. A working algorithm is then constructed by combining this intrinsic reward with PPO (Proximal Policy Optimization). Numerical results show that the proposed method outperforms the previous intrinsic reward generation method based on a single dynamics model.
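The abstract does not give the exact formula, so the following is only a minimal sketch of one possible reading of the minimum-surprise reward. It assumes that each dynamics model is an isotropic Gaussian over the next state, that the mixture uses fixed weights, and that surprise is measured as the log-ratio of the mixture density to each model's density at the observed next state. The names `min_surprise_reward`, `predicted_means`, and `weights` are hypothetical and do not come from the thesis.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of an isotropic Gaussian evaluated at the vector x."""
    return sum(
        -0.5 * ((xi - mi) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)
        for xi, mi in zip(x, mean)
    )


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def min_surprise_reward(next_state, predicted_means, weights, std=1.0):
    """Return (intrinsic reward, per-model surprises).

    Assumption: surprise of model i is log p_mix(s') - log p_i(s'),
    and the intrinsic reward is the minimum of these over the ensemble.
    """
    # Log-likelihood of the observed next state under each model.
    log_p = [gaussian_log_pdf(next_state, mu, std) for mu in predicted_means]
    # Log-likelihood under the weighted mixture of all models.
    log_mix = log_sum_exp([math.log(w) + lp for w, lp in zip(weights, log_p)])
    surprises = [log_mix - lp for lp in log_p]
    return min(surprises), surprises


# Three hypothetical models predicting a 2-D next state.
next_state = [1.0, 0.0]
predicted_means = [[1.0, 0.0], [0.0, 0.0], [2.0, 1.0]]
weights = [1 / 3, 1 / 3, 1 / 3]
reward, surprises = min_surprise_reward(next_state, predicted_means, weights)
```

Under this pointwise reading the reward is never positive, because at least one component density is at least as large as the mixture density; a formulation that averages surprise over transitions would behave differently. In a PPO setup, a value like `reward` would typically be added to the extrinsic reward with a scaling coefficient, which is likewise an assumption here.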
Advisors
Sung, Young Chul (성영철)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2018
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2018.8, [iii, 21 p.]

Keywords

Sparse reward reinforcement learning; intrinsic reward; ensemble of dynamics models; surprise

URI
http://hdl.handle.net/10203/266816
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=828574&flag=dissertation
Appears in Collection
EE-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
