Algorithms for efficient offline reinforcement learning

DC Field | Value | Language
dc.contributor.advisor | Kim, Kee-Eung | -
dc.contributor.advisor | 김기응 | -
dc.contributor.author | Lee, Byung-Jun | -
dc.date.accessioned | 2022-04-21T19:34:23Z | -
dc.date.available | 2022-04-21T19:34:23Z | -
dc.date.issued | 2021 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=956453&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/295721 | -
dc.description | Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : School of Computing, 2021.2, [iv, 61 p.] | -
dc.description.abstract | Offline reinforcement learning (RL) aims to learn from a pre-collected dataset, without additional interaction with the environment, and has recently gathered attention due to its promise for real-world applications. Unlike online RL, where the agent's predictions can be corrected through further interaction, offline RL requires robust policy improvement under potentially incorrect predictions. This calls for accurately measuring the uncertainty of the implicitly or explicitly constructed environment model, and for designing algorithms that trade off potential policy performance against the uncertainty in policy evaluation. In this thesis, we study offline RL algorithms for (1) finding a good trade-off using a validation split and (2) learning a model that is more robust, especially for offline RL. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Machine Learning; Reinforcement Learning; Offline Reinforcement Learning; Hypergradient; Balanced Representation | -
dc.subject | 기계학습; 강화학습; 오프라인 강화학습; 하이퍼그래디언트; 표현 밸런싱 | -
dc.title | Algorithms for efficient offline reinforcement learning | -
dc.title.alternative | 효율적인 오프라인 강화학습을 위한 알고리즘 연구 | -
dc.type | Thesis (Ph.D.) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology : School of Computing | -
dc.contributor.alternativeauthor | 이병준 | -
Appears in Collection
CS-Theses_Ph.D. (박사논문)
Files in This Item
There are no files associated with this item.
