DSpace at KOASAS: Constrained Bayesian Reinforcement Learning via Approximate Linear Programming

Constrained Bayesian Reinforcement Learning via Approximate Linear Programming

Authors: Lee, Jongmin / Jang, Youngsoo / Poupart, Pascal / Kim, Kee-Eung

Abstract: In this paper, we highlight our recent work [Lee2017] on safe learning, where the exploratory behavior of a reinforcement learning agent must be restricted. Specifically, we treat the problem as a form of Bayesian reinforcement learning (BRL) in an environment modeled as a constrained MDP (CMDP), in which the cost function penalizes undesirable situations. We propose a model-based BRL algorithm for such an environment that elicits risk-sensitive exploration in a principled way. Our algorithm solves the constrained BRL problem efficiently by approximate linear programming and generates a finite state controller offline. We provide theoretical guarantees and demonstrate empirically that our approach outperforms the state of the art.
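
For orientation, the constrained MDP objective mentioned in the abstract is usually stated as below. This is the textbook CMDP formulation and its occupancy-measure linear program, not the approximate LP developed in the paper; the symbols $R$, $C$, $T$, $b_0$, $\gamma$, $\hat{c}$ and $x$ are assumed notation and do not appear in this record.

```latex
% Standard CMDP objective (assumed notation; not taken from this record):
% maximize expected discounted reward subject to a bound on expected discounted cost.
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t)\right]
\quad \text{s.t.} \quad
\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} C(s_t, a_t)\right] \le \hat{c}

% Equivalent linear program over discounted occupancy measures x(s,a),
% with initial state distribution b_0 and transition function T.
\max_{x \ge 0} \; \sum_{s,a} x(s,a)\, R(s,a)
\quad \text{s.t.} \quad
\sum_{a'} x(s',a') - \gamma \sum_{s,a} T(s' \mid s,a)\, x(s,a) = b_0(s') \;\; \forall s',
\qquad
\sum_{s,a} x(s,a)\, C(s,a) \le \hat{c}
```

In the Bayesian setting the abstract describes, the state would additionally carry the agent's belief over the unknown model, so the exact program grows too large to solve directly, which is presumably why an approximation is used.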

Publisher: Scaling-Up Reinforcement Learning Workshop at ECML PKDD 2017

Issue Date: 2017-09-18

Language: English

Citation: Scaling-Up Reinforcement Learning Workshop at ECML PKDD 2017

URI: http://hdl.handle.net/10203/227104

Appears in Collection: AI-Conference Papers

Files in This Item: There are no files associated with this item.
