The optimization of flapping-wing kinematics using water-tank experiments and hovering control of a flapping-wing micro aerial vehicle using reinforcement learning
In this study, the energy-efficient wing motion of a flapping-wing micro aerial vehicle (FWMAV) was optimized using a water-tank experiment, and a hovering controller based on the energy-minimized wing motion was investigated using reinforcement learning. A scaled-up wing model driven by a three-axis robotic manipulator was used to examine a wide range of flapping-wing kinematics. The aerodynamic properties of the deviating motion were analyzed in depth using force/moment measurements and flow visualization. The results revealed that the quasi-steady (QS) model, frequently used in optimization and control studies, cannot predict forces and moments correctly when the wing motion includes deviation. Digital particle image velocimetry (DPIV) showed that the deviating motion delays trailing-edge vortex (TEV) shedding and produces a vortex loop in the middle of each stroke; these effects explain the mid-stroke lift reduction relative to the QS prediction. The deviating-motion analysis therefore implies that the QS model is not sufficient for optimizing the wing motion, while more accurate numerical models, such as computational fluid dynamics (CFD), are too computationally expensive for direct optimization of the flapping-wing kinematics over a large design space. A water-tank experimental approach was therefore introduced to find the energy-minimized flapping-wing kinematics. The optimization showed that the deviating motion was required for efficient hovering flight in the selected case. However, this direct experimental approach was too slow to optimize several cases for a deeper understanding of the optimal wing motion of the FWMAV, so a data-driven modeling technique was applied: a surrogate model was trained on approximately 100,000 sets of experimental data from the previous optimization experiments, and validation experiments were carefully conducted over a wide range of the kinematic variables.
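As a rough illustration of the kind of model the QS approach represents, the sketch below computes only the translational lift term of a quasi-steady model. It is not the specific QS formulation used in this work: the lift-coefficient fit follows the well-known Dickinson-style empirical form, and the density, area, and velocity values are illustrative assumptions.

```python
import numpy as np

# Water density [kg/m^3]; an assumed value for a water-tank-scale experiment.
RHO = 1000.0

def cl_qs(alpha_deg):
    """Empirical quasi-steady lift coefficient vs. angle of attack [deg].

    Dickinson-style fit: C_L = 0.225 + 1.58 * sin(2.13*alpha - 7.2 deg).
    """
    return 0.225 + 1.58 * np.sin(np.deg2rad(2.13 * alpha_deg - 7.2))

def qs_lift(u_tip, area, alpha_deg):
    """Translational QS lift: 0.5 * rho * U^2 * S * C_L(alpha).

    u_tip: wing-tip speed [m/s], area: wing area [m^2].
    """
    return 0.5 * RHO * u_tip**2 * area * cl_qs(alpha_deg)
```

A model of this form assumes the instantaneous force depends only on the instantaneous velocity and angle of attack, which is exactly why it misses unsteady effects such as the delayed TEV shedding observed in the DPIV measurements.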
This model accurately estimated the lift, drag, and power consumption, both as mean values and as time series. Surrogate-based optimization showed that the deviating motion is necessary only when high lift is required. To achieve stable hovering flight, however, the FWMAV needs not only proper wing kinematics but also a controller. The surrogate model, developed by a data-driven modeling technique, is a black-box model, and classical control methods are difficult to apply because its internal workings are unknown. Deep reinforcement learning was therefore applied as a general way to develop a controller for the black-box model, and the optimal policy was obtained using the proximal policy optimization (PPO) algorithm. This method successfully achieved the hovering state with the energy-minimizing flapping motion, demonstrating that this general algorithm is feasible for finding an energy-minimized hovering controller for the FWMAV.
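The clipped surrogate objective at the core of PPO can be sketched as follows. This is a generic illustration of the algorithm's loss function, not the controller trained in this work; the function name and numerical values are illustrative.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate loss (to be minimized).

    ratio:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage: estimated advantage of each sampled action
    eps:       clip range; limits how far the policy can move per update
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the minimum makes the objective pessimistic: large policy
    # updates receive no extra credit, which stabilizes training.
    return -np.mean(np.minimum(unclipped, clipped))
```

Because the policy update only needs sampled transitions (state, action, reward), this objective can be optimized against any environment that can be stepped and reset, which is what makes PPO suitable for a black-box surrogate model.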