Bayesian reinforcement learning with behavioral feedback

In the standard reinforcement learning setting, the agent learns an optimal policy solely from the state transitions and rewards provided by the environment. We consider an extended setting in which a trainer additionally provides feedback on the actions executed by the agent. This feedback must be incorporated appropriately, even when it is not necessarily accurate. In this paper, we present a Bayesian approach to this extended reinforcement learning setting. Specifically, we extend Kalman Temporal Difference (KTD) learning to compute the posterior distribution over Q-values given the state transitions and rewards from the environment as well as the feedback from the trainer. Through experiments on standard reinforcement learning tasks, we show that learning performance can be significantly improved even with inaccurate feedback.
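The idea sketched in the abstract can be illustrated with a heavily simplified, tabular stand-in for KTD: maintain a Gaussian posterior over each Q-value and treat both the TD target from the environment and the trainer's feedback as noisy observations, with a larger noise term on feedback to reflect its possible inaccuracy. This is a hypothetical sketch, not the thesis's actual algorithm (which works over parametric Q-functions); all class, method, and parameter names here (`KalmanQ`, `td_noise`, `feedback_noise`, `scale`) are assumptions.

```python
import numpy as np

class KalmanQ:
    """Gaussian posterior over tabular Q-values, updated by both TD
    observations and (possibly inaccurate) trainer feedback.
    Illustrative sketch only; names and noise model are assumptions."""

    def __init__(self, n_states, n_actions, gamma=0.9,
                 td_noise=1.0, feedback_noise=4.0, prior_var=10.0):
        self.mu = np.zeros((n_states, n_actions))             # posterior means
        self.var = np.full((n_states, n_actions), prior_var)  # posterior variances
        self.gamma = gamma
        self.td_noise = td_noise              # observation noise for TD targets
        self.feedback_noise = feedback_noise  # larger noise: feedback may be wrong

    def _kalman_update(self, s, a, target, obs_noise):
        # Scalar Kalman update: blend prior mean with the observed target,
        # weighting by the Kalman gain; variance shrinks after each observation.
        k = self.var[s, a] / (self.var[s, a] + obs_noise)
        self.mu[s, a] += k * (target - self.mu[s, a])
        self.var[s, a] *= (1.0 - k)

    def td_update(self, s, a, r, s_next):
        # Environment evidence: noisy observation of r + gamma * max_a' Q(s',a').
        target = r + self.gamma * self.mu[s_next].max()
        self._kalman_update(s, a, target, self.td_noise)

    def feedback_update(self, s, a, signal, scale=1.0):
        # Trainer evidence: signal is +1 ("good action") or -1 ("bad action"),
        # modeled as a noisy nudge of Q(s,a) by +/- scale.
        target = self.mu[s, a] + signal * scale
        self._kalman_update(s, a, target, self.feedback_noise)
```

Because both kinds of evidence enter through the same Kalman update, inaccurate feedback is automatically down-weighted by its larger observation noise rather than overriding the environment's reward signal.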
Advisors
Kim, Kee-Eung (김기응)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
KAIST (Korea Advanced Institute of Science and Technology)
Issue Date
2016
Identifier
325007
Language
eng
Description

Master's thesis - KAIST: School of Computing, 2016.8, [iii, 22 p.]

Keywords

Bayesian reinforcement learning; Kalman filter; Behavioral feedback

URI
http://hdl.handle.net/10203/221845
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=663496&flag=dissertation
Appears in Collection
CS-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
