We present a finite-horizon Markov Decision Process (MDP) model for a patient prioritization and hospital selection problem, a critical decision-making problem in emergency medical service operations. Solving this model requires reinforcement learning (RL) because of its large state space. We propose a novel approach aimed at significantly enhancing the scalability of RL algorithms. Our approach, which we call State Partitioning and Action Network (SPartAN for short), is a meta-algorithm that provides a framework into which an RL algorithm can be incorporated. In this approach, we partition the state space into smaller subspaces and construct a reliable action network in the downstream subspace. This action network is then used in simulation to approximate the values of the upstream subspace. Using temporal difference (TD) learning as an example RL algorithm, we show that SPartAN reliably derives a high-quality policy, thereby opening opportunities to solve many practical MDP models arising in healthcare systems.
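To make the core mechanism concrete, the following is a minimal sketch (not the paper's model) of the idea of fixing a policy on a downstream portion of the state space and using simulated rollouts with TD(0) updates to approximate the values of upstream states. The MDP, its states, rewards, and the `downstream_policy` function are all hypothetical choices made for illustration.

```python
import random

random.seed(0)

# Hypothetical tiny finite-horizon MDP: states 0 and 1 are "upstream",
# states 2 and 3 are "downstream", and state 4 is terminal.
REWARD = {2: 1.0, 3: 0.5}  # reward received on leaving a downstream state


def downstream_policy(state):
    """Stand-in for a fixed downstream action network (always picks action 0)."""
    return 0


def step(state, action):
    """Illustrative transition dynamics: upstream -> downstream -> terminal."""
    if state in (0, 1):
        next_state = random.choice([2, 3])  # stochastic handoff to downstream
        return next_state, 0.0
    return 4, REWARD[state]  # downstream -> terminal, collect the reward


def td0_values(episodes=5000, alpha=0.1):
    """TD(0) estimates of state values under the fixed downstream policy."""
    V = {s: 0.0 for s in range(5)}
    for _ in range(episodes):
        s = random.choice([0, 1])  # each rollout starts from an upstream state
        while s != 4:
            a = downstream_policy(s)
            s2, r = step(s, a)
            # TD(0) update: move V[s] toward the bootstrapped target r + V[s2]
            V[s] += alpha * (r + V[s2] - V[s])
            s = s2
    return V


V = td0_values()
```

Here the upstream value estimates converge toward the expected downstream reward (0.75 in this toy setup), illustrating how a reliable downstream policy lets simulation propagate value information back to upstream states.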