Algorithms for offline imitation learning with supplementary demonstrations
추가적인 시연을 활용한 오프라인 모방학습 알고리즘 연구

DC Field | Value | Language
dc.contributor.advisor | Kim, Kee-Eung | -
dc.contributor.advisor | 김기응 | -
dc.contributor.advisor | Yang, Hongseok | -
dc.contributor.advisor | 양홍석 | -
dc.contributor.author | Kim, Geon-Hyeong | -
dc.date.accessioned | 2023-06-23T19:34:25Z | -
dc.date.available | 2023-06-23T19:34:25Z | -
dc.date.issued | 2022 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1007876&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/309225 | -
dc.description | Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2022.8, [v, 56 p.] | -
dc.description.abstract | We consider offline imitation learning (IL), which aims to mimic the expert’s behavior from demonstrations without further interaction with the environment. One of the main challenges in offline IL is dealing with the narrow support of the data distribution exhibited by the expert demonstrations, which cover only a small fraction of the state and action spaces. In this thesis, we address two offline IL problems: imitation learning from (1) state-action demonstrations by experts, and (2) state-only demonstrations by experts. First, we consider the problem of learning from demonstrations (LfD), in which the agent aims to mimic the expert’s behavior from the state-action demonstrations by experts. Compared with recent LfD algorithms that adopt adversarial minimax training objectives, we substantially stabilize the overall learning process by reducing the minimax optimization to a direct convex optimization in a principled manner. Next, we consider the problem of learning from observation (LfO), in which the agent aims to mimic the expert’s behavior from the state-only demonstrations by experts. We introduce an algorithm that solves a single convex minimization problem, which minimizes the divergence between the two state-transition distributions induced by the expert and the agent policy. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Imitation learning; Learning from demonstrations; Learning from observations; Offline imitation learning; Imperfect demonstrations | -
dc.subject | 모방학습; 시연으로부터 학습; 관찰로부터 학습; 오프라인 모방학습; 불완전한 시연 | -
dc.title | Algorithms for offline imitation learning with supplementary demonstrations | -
dc.title.alternative | 추가적인 시연을 활용한 오프라인 모방학습 알고리즘 연구 | -
dc.type | Thesis(Ph.D) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): School of Computing | -
dc.contributor.alternativeauthor | 김건형 | -
Appears in Collection
CS-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
