Addressing double sampling issue by learning dynamics model

With recent advances in deep neural networks, reinforcement learning has demonstrated remarkable performance in fields such as games, language models, and robotics. However, currently prevalent reinforcement learning algorithms employ a target network to sidestep the double sampling issue, which requires an additional Q-network and delays updates. In this thesis, we resolve the double sampling issue by training a dynamics model instead of using a target network. Specifically, our approach modifies the deep Q-network (DQN) by sampling a second, independent next state from the learned dynamics model and introducing a new loss function that accounts for the double sampling issue. The proposed method optimizes the Q-network with a gradient closer to the true gradient of the mean squared Bellman error. In experiments, the proposed algorithm robustly achieved higher undiscounted returns and predicted action-values more stably than DQN.
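
The double sampling issue arises because the mean squared Bellman error, E_{s,a}[(Q(s,a) - E_{s'}[r + γ max_{a'} Q(s',a')])²], squares an inner expectation over the next state, so an unbiased estimate of it (and of its gradient) needs two independent next-state samples from the same (s, a) pair. The sketch below illustrates that idea only; it is not the thesis's exact algorithm, and the names (q_net, model.sample_next_state) are hypothetical. It forms the product of two TD errors computed from two next states drawn independently from a learned dynamics model, which in expectation equals the squared expected TD error.

    # A minimal sketch of a double-sampled Bellman loss, assuming a PyTorch
    # Q-network and a learned dynamics model exposing sample_next_state
    # (both hypothetical names, not taken from the thesis).
    import torch

    def double_sample_bellman_loss(q_net, model, s, a, r, gamma=0.99):
        # Two independent next-state samples for the same (s, a) pair.
        s1 = model.sample_next_state(s, a)
        s2 = model.sample_next_state(s, a)
        # Q(s, a) for the actions actually taken.
        q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        # TD errors from the two independent samples; gradients flow through
        # both terms, as in residual-gradient methods.
        delta1 = q_sa - (r + gamma * q_net(s1).max(dim=1).values)
        delta2 = q_sa - (r + gamma * q_net(s2).max(dim=1).values)
        # E[delta1 * delta2 | s, a] = (E[delta | s, a])^2, so this product is
        # an unbiased estimate of the mean squared Bellman error.
        return (delta1 * delta2).mean()

With a single sampled next state, the naive squared TD error instead estimates E[δ²] = (E[δ])² + Var(δ); the extra variance term biases the gradient, and the second, independent model sample is what removes it.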
Advisors
이동환
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2024.2, [iii, 20 p.]

Keywords

Reinforcement learning; Model-based reinforcement learning; Deep Q-network; Double sampling issue

URI
http://hdl.handle.net/10203/321593
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1097165&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
