Hindsight Goal Ranking on Replay Buffer for Sparse Reward Environment

Authors: Luu, Tung M. / Yoo, Chang-Dong

This paper proposes Hindsight Goal Ranking (HGR), a method for prioritizing replay experience that overcomes a limitation of Hindsight Experience Replay (HER): HER generates hindsight goals by uniform sampling. HGR samples, with higher probability, the states visited in an episode that yield a larger temporal difference (TD) error, which serves as a proxy for how much the RL agent can learn from an experience. Sampling proceeds in two steps: first, an episode is sampled from the replay buffer according to the average TD error of its experiences; then, within the sampled episode, a hindsight goal is drawn from the future visited states, with goals that lead to larger TD error sampled with higher probability. Combined with Deep Deterministic Policy Gradient (DDPG), an off-policy model-free actor-critic algorithm, the proposed method learns significantly faster than the same algorithm without prioritization on four challenging simulated robotic manipulation tasks. The empirical results show that HGR uses samples more efficiently than previous methods across all tasks.
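The two-step sampling described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `td_errors` layout (one dict per episode, mapping a pair `(t, k)` with `t < k` to the absolute TD error of transition `t` relabeled with the goal achieved at future step `k`), the purely proportional sampling rule, and the small `EPS` constant are all assumptions made for illustration. The abstract does not state the exact priority formula or any bias correction, so none is modeled here.

```python
import random

EPS = 1e-6  # assumed constant: keeps every weight positive so sampling never fails


def sample_episode(td_errors, rng):
    """Step 1: pick an episode with probability proportional to its mean |TD error|."""
    priorities = [sum(ep.values()) / len(ep) + EPS for ep in td_errors]
    return rng.choices(range(len(td_errors)), weights=priorities, k=1)[0]


def sample_transition_and_goal(episode, rng):
    """Step 2: pick (t, k), k > t, with probability proportional to its |TD error|.

    k indexes the future visited state whose achieved goal becomes the
    hindsight goal for transition t.
    """
    pairs = list(episode.keys())
    weights = [episode[p] + EPS for p in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


def hgr_sample(td_errors, rng=None):
    """Return (episode index, transition index t, hindsight goal index k)."""
    rng = rng or random.Random()
    e = sample_episode(td_errors, rng)
    t, k = sample_transition_and_goal(td_errors[e], rng)
    return e, t, k


if __name__ == "__main__":
    # Hypothetical |TD error| table for two three-step episodes.
    td_errors = [
        {(0, 1): 0.01, (0, 2): 0.01, (1, 2): 0.01},  # little left to learn
        {(0, 1): 0.10, (0, 2): 2.00, (1, 2): 0.90},  # large TD errors
    ]
    print(hgr_sample(td_errors, random.Random(0)))
```

In a full agent, the sampled `(e, t, k)` would be used to relabel transition `t` of episode `e` with the goal achieved at step `k` before the DDPG update, and the stored TD errors would be refreshed after each update. Those steps are omitted from this sketch.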

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Issue Date: 2021-04

Language: English

Article Type: Article

Citation: IEEE ACCESS, v.9, pp.51996 - 52007

ISSN: 2169-3536

DOI: 10.1109/ACCESS.2021.3069975

URI: http://hdl.handle.net/10203/282559

Appears in Collection: EE-Journal Papers
