SAFE REINFORCEMENT LEARNING BASED MULTI-ROTOR COLLISION AVOIDANCE WITH UNEXPECTED OBSTACLES

From warfare to the civil domain, the roles and missions of UAVs keep expanding and growing more complex. To complete such sophisticated missions, autonomous flight under uncertain conditions becomes increasingly important. For the problem of avoiding unexpected dynamic obstacles with a quad-rotor, methods based on deep reinforcement learning have recently come into the spotlight, even though they raise safety issues for real-world implementation. In this research, a collision avoidance algorithm is proposed that combines reinforcement learning with a safety filter, offering the advantages of (1) reshaping unsafe commands from the DDPG actor network to avoid collisions, and (2) using a simplified depth map for DDPG agent training to account for real-world implementation and to reduce computation time. Using a surrounding obstacle detection sensor such as a 360° 3D LiDAR, the UAV flight control unit can estimate the distance between the vehicle and obstacles through the depth map and perform an evasive maneuver if required. For path following in the presence of obstacles, we adopt a reinforcement learning (RL) agent. Although the RL agent shows excellent accuracy and fast learning compared to optimization-based avoidance, it cannot guarantee successful evasion on its own. To ensure safety while preserving fast learning, we propose an RL agent combined with a safety filter. As a result, the RL agent achieved about an 80% success rate over 1,000 randomly generated trials of a random-dynamic obstacle mission. Several successful scenarios are also included to show that the agent can avoid collisions with randomly moving dynamic obstacles using the proposed safe reinforcement learning method. The paper is organized as follows: first, we describe the problem to be solved and define the random-dynamic obstacles and the environment, including the quad-rotor dynamic equations. Then, we describe a safety filter that reshapes the velocity command from the reinforcement learning agent whenever a collision with obstacles or walls is predicted. Next, a Deep Deterministic Policy Gradient (DDPG) reinforcement learning method with an actor-critic framework, taking the quad-rotor state and the depth map as inputs, is defined for quad-rotor obstacle avoidance. Finally, simulation results based on a MATLAB environment and concluding remarks are given.
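
A minimal sketch of the safety-filter idea described above, assuming a depth map given as an array of ray distances with matching unit bearing vectors; the thresholds, the braking rule, and the actor interface are illustrative assumptions, not the paper's actual implementation.

import numpy as np

# Minimal sketch (not the paper's implementation): a safety filter that
# reshapes the DDPG actor's velocity command when the depth map predicts
# a collision within a safety margin. All thresholds, the depth-map
# format, and the actor interface are illustrative assumptions.

D_SAFE = 1.5   # assumed minimum allowed clearance [m]
V_MAX = 2.0    # assumed velocity command limit [m/s]
DT = 0.1       # assumed control period [s]

def safety_filter(v_cmd, depth_map, bearings):
    """Reshape the commanded velocity if it drives the vehicle toward
    the closest obstacle faster than the remaining clearance allows."""
    v_safe = np.clip(v_cmd, -V_MAX, V_MAX)
    i_min = int(np.argmin(depth_map))        # closest obstacle ray
    d_min = depth_map[i_min]
    obstacle_dir = bearings[i_min]           # unit vector toward obstacle
    closing_speed = float(np.dot(v_safe, obstacle_dir))
    # If the predicted clearance after one control step falls below the
    # safety margin, remove the unsafe closing component (projection rule).
    if closing_speed > 0.0 and d_min - closing_speed * DT < D_SAFE:
        v_safe = v_safe - closing_speed * obstacle_dir
    return v_safe

# Hypothetical usage with a trained actor network:
#   v_cmd = actor(np.concatenate([uav_state, depth_map]))
#   v_exec = safety_filter(v_cmd, depth_map, bearings)
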
Publisher
International Council of the Aeronautical Sciences
Issue Date
2022-09
Language
English
Citation

33rd Congress of the International Council of the Aeronautical Sciences, ICAS 2022, pp.5663 - 5673

URI
http://hdl.handle.net/10203/312600
Appears in Collection
AE-Conference Papers (Conference Papers)
Files in This Item
There are no files associated with this item.
