Meta-reinforcement learning with imaginary tasks

Abstract

Deep reinforcement learning (RL) faces critical challenges, most notably overfitting and limited generalization. Traditional RL agents, though proficient on their training tasks, degrade when presented with unseen test tasks. Meta-reinforcement learning (meta-RL) addresses this by training agents on a range of tasks to develop an inductive bias that ideally allows them to infer the underlying structure of a new task and rapidly adapt their strategies. However, a fundamental constraint of current meta-RL paradigms is their restricted training task distribution, which limits adaptability to new environments, especially those with out-of-distribution (OOD) dynamics.

To address these limitations, we introduce two meta-RL algorithms that train policies on imaginary tasks generated by a learned dynamics model. We first present Latent Dynamics Mixture (LDM), a context-based meta-RL framework that improves generalization to unseen tasks. LDM derives imaginary tasks from latent beliefs for more effective meta-training, eliminating the need for policy updates during the test phase. Because LDM operates only within parametric task variations, we then address non-parametric task variability with Subtask Decomposition and Virtual Training (SDVT). SDVT decomposes tasks into elementary subtasks and leverages a Gaussian mixture variational autoencoder to learn effective subtask representations, yielding a parameterized understanding of complex tasks.

We rigorously evaluate LDM and SDVT on diverse meta-RL benchmarks, maintaining strict separation between training and test distributions, and show that both outperform prior methods on unfamiliar tasks without requiring test-time network updates. Together, these methods demonstrate the value of imaginary tasks generated from learned latent task dynamics, tracing a shift from mitigating overfitting within standard task distributions to mastering non-parametric tasks, and laying a foundation for more adaptable and generalizable reinforcement learning.
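To make the shared mechanism concrete, below is a minimal toy sketch (not the dissertation's implementation) of training on imaginary tasks sampled from a learned latent task space: a latent-conditioned dynamics model imagines transitions for a task latent mixed from training-task latents, roughly in the spirit of LDM's mixing of latent beliefs (SDVT instead recombines subtask codes from a Gaussian mixture VAE). All names and the simple linear forms (LatentDynamicsModel, sample_imaginary_task, the placeholder policy) are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

class LatentDynamicsModel:
    """Toy stand-in for a learned dynamics model conditioned on a latent
    task variable z (in the thesis, z is inferred from context transitions
    by a (Gaussian mixture) variational autoencoder)."""

    def __init__(self, state_dim=2, latent_dim=2):
        # Pretend these weights were fit on the real training tasks.
        self.W_s = rng.normal(scale=0.1, size=(state_dim, state_dim))
        self.W_z = rng.normal(scale=0.1, size=(state_dim, latent_dim))

    def step(self, state, action, z):
        # Imagined next state depends on the latent task code z, so a new
        # z induces a new (imaginary) transition function.
        return state + self.W_s @ state + self.W_z @ z + 0.1 * action

def sample_imaginary_task(train_latents, alpha=2.0):
    """Create an unseen task latent by mixing training-task latents with
    random Dirichlet weights, pushing beyond any single training task."""
    weights = rng.dirichlet(alpha * np.ones(len(train_latents)))
    return weights @ np.stack(train_latents)

def imaginary_rollout(model, policy, z, state_dim=2, horizon=10):
    """Roll the policy out in the imagined dynamics; the resulting
    trajectories serve as extra meta-training data, no environment needed."""
    s = np.zeros(state_dim)
    traj = []
    for _ in range(horizon):
        a = policy(s, z)
        s_next = model.step(s, a, z)
        traj.append((s.copy(), a, s_next.copy()))
        s = s_next
    return traj

# Latents inferred from the (parametric) training tasks.
train_latents = [rng.normal(size=2) for _ in range(4)]
model = LatentDynamicsModel()
policy = lambda s, z: -0.5 * s + 0.1 * z  # placeholder linear policy

z_img = sample_imaginary_task(train_latents)
rollout = imaginary_rollout(model, policy, z_img)
print(f"imagined {len(rollout)} transitions for latent task {z_img.round(2)}")

In the actual algorithms the dynamics model and latent inference are learned neural networks and the imagined trajectories feed a standard meta-RL policy update; the sketch only shows where the imaginary tasks enter the training loop.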
Advisors
성영철 (Youngchul Sung)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : School of Electrical Engineering, 2024.2, [viii, 74 p.]

Keywords

Reinforcement learning; Meta-reinforcement learning; Generalization; Imaginary tasks; Subtask decomposition

URI
http://hdl.handle.net/10203/322131
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1100031&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
