Algorithms for offline imitation learning with supplementary demonstrations (Korean title: 추가적인 시연을 활용한 오프라인 모방학습 알고리즘 연구)

We consider offline imitation learning (IL), which aims to mimic the expert’s behavior from demonstrations without further interaction with the environment. One of the main challenges in offline IL is the narrow support of the data distribution exhibited by expert demonstrations, which cover only a small fraction of the state and action spaces. In this thesis, we address two offline IL problems: imitation learning from (1) state-action demonstrations by experts, and (2) state-only demonstrations by experts. First, we consider the problem of learning from demonstrations (LfD), in which the agent aims to mimic the expert’s behavior from state-action demonstrations. Compared with recent LfD algorithms that adopt adversarial minimax training objectives, we substantially stabilize the overall learning process by reducing the minimax optimization to a direct convex optimization in a principled manner. Next, we consider the problem of learning from observation (LfO), in which the agent aims to mimic the expert’s behavior from state-only demonstrations. We introduce an algorithm that solves a single convex minimization problem, which minimizes the divergence between the two state-transition distributions induced by the expert and the agent policy.
Advisors
Kim, Kee-Eung (김기응); Yang, Hongseok (양홍석)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2022.8, [v, 56 p.]

Keywords

Imitation learning; Learning from demonstrations; Learning from observations; Offline imitation learning; Imperfect demonstrations

URI
http://hdl.handle.net/10203/309225
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1007876&flag=dissertation
Appears in Collection
CS-Theses_Ph.D. (Doctoral theses)
Files in This Item
There are no files associated with this item.
