Self-supervised exploration for cooperative multi-agent reinforcement learning (다중 에이전트 강화학습에서의 협력을 위한 자기지도 탐색기법)

DC Field | Value | Language
dc.contributor.advisor | Yi, Yung | -
dc.contributor.author | Delos Reyes, Roben | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering, 2020.8, [iii, 21 p.] | -
dc.description.abstract | Learning in sparse reward environments remains challenging for standard cooperative multi-agent reinforcement learning (MARL) algorithms. Because extrinsic rewards are sparse, agents lack both the motivation to explore the environment and direction on how to explore it. An effective approach for encouraging exploration in the single-agent setting is to give the agent the prediction error of a novelty module as intrinsic reward. This novelty module is trained to predict the agent's next state given its current state and action. Its prediction error is therefore high in states the agent has rarely visited, so using that error as intrinsic reward motivates the agent to explore parts of the environment that are novel to it. In this work, we extend this self-supervised exploration method to cooperative MARL. Unlike in single-agent environments, exploration in cooperative multi-agent environments is more efficient if agents coordinate how they explore the environment. Here, we propose a new novelty module architecture and intrinsic reward formulation that encourage coordinated exploration. In particular, we design a two-headed novelty module that learns to predict both the agent's next state and the joint next state of all agents. We then give each agent, as intrinsic reward, the sum of the individual prediction error and the joint prediction error of this two-headed novelty module. We demonstrate in two sparse reward cooperative navigation scenarios that the combination of our novelty module architecture and intrinsic reward formulation yields the largest improvement in the performance of standard cooperative MARL algorithms. | -
dc.subject | Deep learning; multi-agent reinforcement learning; sparse reward; exploration; self-supervised learning | -
dc.title | Self-supervised exploration for cooperative multi-agent reinforcement learning | -
dc.title.alternative | 다중 에이전트 강화학습에서의 협력을 위한 자기지도 탐색기법 [Self-supervised exploration technique for cooperation in multi-agent reinforcement learning] | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Electrical Engineering | -
dc.contributor.alternativeauthor | Roben Delos Reyes | -
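
The intrinsic reward described in the abstract (individual prediction error plus joint prediction error) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The abstract does not state how prediction error is measured, so squared error is an assumption here, and the function names and example vectors are hypothetical. The predictions would come from the two heads of the novelty module, which is not modeled in this sketch.

```python
def squared_error(predicted, target):
    """Sum of squared differences between two equal-length state vectors.

    Assumption: the abstract does not specify the error measure.
    """
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def intrinsic_reward(pred_own_next, own_next, pred_joint_next, joint_next):
    """Intrinsic reward for one agent: individual error plus joint error.

    pred_own_next / own_next: predicted and observed next state of this agent.
    pred_joint_next / joint_next: predicted and observed joint next state
    of all agents (hypothetically, the agents' states concatenated).
    """
    individual_error = squared_error(pred_own_next, own_next)
    joint_error = squared_error(pred_joint_next, joint_next)
    return individual_error + joint_error


# Hypothetical example: 2-dimensional agent state, two agents.
reward = intrinsic_reward(
    pred_own_next=[0.0, 1.0],
    own_next=[0.0, 0.0],
    pred_joint_next=[1.0, 1.0, 1.0, 1.0],
    joint_next=[1.0, 1.0, 1.0, 3.0],
)
print(reward)  # 5.0 (individual error 1.0 + joint error 4.0)
```

In this example the joint head is the larger source of reward, so the agent is rewarded for transitions that are novel for the group as a whole even when its own next state was predicted fairly well.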
