This dissertation proposes an online planning algorithm for multi-goal missions in multiple domains. Real systems require online planning because the information available to them is uncertain. However, limited computational power has made it difficult to apply existing planning methods to real systems. To overcome this limitation, methods that learn to plan using deep learning techniques have recently been proposed. Although deep learning has been applied successfully to many planning problems in domain-specific settings, developing a learning method for multi-goal, multi-domain planning problems remains challenging. The presence of multiple targets and domains enlarges the state space, and the enlarged state space reduces both planning and learning efficiency.
This dissertation aims to develop a dimensionality reduction framework for multi-goal mission planning problems in multiple domains. The state space can be divided into a domain state and a system state. The domain state describes the domain in which the mission is performed, such as obstacles, threats, and terrain. In many cases, the domain state is high-dimensional but sparse. Motivated by this observation, this dissertation adopts abstraction to reduce the domain state to a compact form. The system state consists of information about the current condition of the system, such as position and health, and information indicating which goals have been completed. In multi-goal problems, the size of the system state grows exponentially with the number of goals. Some types of robotics tasks can be treated as episodic sparse-reward tasks, which makes it possible to handle complex multi-goal problems efficiently. An approximation method is proposed that obtains the value of a multi-goal problem by combining the values of single-goal problems. Based on these dimensionality reductions, a network structure that can efficiently learn the value function for multiple goals and domains is proposed. Numerical studies and simulations are conducted to demonstrate the efficiency and effectiveness of the proposed framework.
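
The exponential growth of the system state with the number of goals can be illustrated with a short counting sketch. It assumes, purely for illustration, that the system state is a discrete position paired with one binary completion flag per goal; the function names and the figure of 100 positions are hypothetical and are not taken from the dissertation.

```python
from itertools import product


def enumerate_goal_flags(num_goals: int) -> list[tuple[bool, ...]]:
    """List every combination of per-goal completion flags (assumed binary)."""
    return list(product((False, True), repeat=num_goals))


def count_system_states(num_positions: int, num_goals: int) -> int:
    """Count (position, completion-flags) pairs under the assumptions above."""
    return num_positions * 2 ** num_goals


if __name__ == "__main__":
    # Hypothetical grid with 100 reachable positions.
    for n in (1, 2, 4, 8, 16):
        print(f"goals={n:2d}  system states={count_system_states(100, n)}")
```

Under these assumptions each additional goal doubles the number of system states, which is the growth that a decomposition into single-goal values is meant to avoid.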