Disclosed is a reinforcement learning-based resource allocation method for a wireless backhaul network, the method being performed by a resource allocation apparatus. The method includes: estimating locations of a plurality of base stations on the basis of channel state information (CSI) measured by the plurality of base stations; and allocating resources of the wireless backhaul network to the plurality of base stations using a reinforcement learning neural network that receives the estimated locations as an input.
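
The two steps of the method can be illustrated with a minimal sketch. The disclosure does not specify the location estimator, the network architecture, or the reward, so everything concrete below is an assumption made for illustration only: the CSI is reduced to one channel gain in dB per reference anchor, locations are estimated with a log-distance path-loss model and a weighted centroid, the "neural network" is a single linear layer with a softmax output, and learning uses a plain REINFORCE update. All names (`estimate_location`, `LinearPolicy`, `anchors`, `csi_db`) are hypothetical.

```python
import math

def estimate_location(gains_db, anchors, tx_power_db=0.0, path_loss_exp=2.0):
    """Assumed estimator: invert a log-distance path-loss model to get a range
    to each anchor, then take a centroid weighted by inverse range."""
    ranges = [10 ** ((tx_power_db - g) / (10.0 * path_loss_exp)) for g in gains_db]
    weights = [1.0 / r for r in ranges]
    total = sum(weights)
    x = sum(w * a[0] for w, a in zip(weights, anchors)) / total
    y = sum(w * a[1] for w, a in zip(weights, anchors)) / total
    return (x, y)

def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

class LinearPolicy:
    """Assumed stand-in for the reinforcement learning neural network:
    one linear layer mapping the location vector to per-station shares."""

    def __init__(self, n_inputs, n_stations):
        # Zero initialization gives equal shares before any learning.
        self.w = [[0.0] * n_inputs for _ in range(n_stations)]

    def shares(self, x):
        logits = [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]
        return softmax(logits)

    def reinforce_step(self, x, action, reward, lr=0.1):
        # REINFORCE: w += lr * reward * grad(log pi(action | x)).
        # For a softmax policy, d log pi / d logit_k = 1[k == action] - p_k.
        p = self.shares(x)
        for k, row in enumerate(self.w):
            g = (1.0 if k == action else 0.0) - p[k]
            for j in range(len(row)):
                row[j] += lr * reward * g * x[j]

# Hypothetical inputs: three anchors at known positions (meters) and, for each
# of two base stations, one measured channel gain (dB) per anchor.
anchors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]
csi_db = [[-40.0, -46.0, -46.0], [-46.0, -40.0, -49.0]]

# Step 1: estimate each base station's location from its CSI.
locations = [estimate_location(g, anchors) for g in csi_db]

# Step 2: feed the (normalized) locations to the policy to get resource shares.
x = [c / 100.0 for loc in locations for c in loc]
policy = LinearPolicy(n_inputs=len(x), n_stations=len(locations))
shares_before = policy.shares(x)

# One learning step: assume allocating to station 0 earned a positive reward.
policy.reinforce_step(x, action=0, reward=1.0)
shares_after = policy.shares(x)
```

In this sketch the policy output is read as the fraction of backhaul resources given to each base station, and a positive reward for an allocation shifts future shares toward that station. A practical system would replace the toy estimator, the linear layer, and the hand-set reward with whatever the full specification defines.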