Automatic face recognition has been an important computer vision task for the past few decades. In this paper, we exploit the spatial relationships among local facial regions using a novel deep network. In the proposed method, the face is scanned spatially with a spatial long short-term memory (LSTM) network to encode the spatial correlation of facial regions. Moreover, by using facial regions of various scales, the complementary information in multi-scale facial features is encoded. Experimental results on a public database show that the proposed method outperforms conventional methods, improving face recognition accuracy under illumination variation.
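
To make the idea of scanning facial regions with an LSTM at several scales concrete, the following is a minimal sketch and not the network proposed in this paper. It rests on several assumptions made only for illustration: the face is a small grayscale array with values in [0, 1], regions are non-overlapping grid cells visited in raster order, each region is described by a placeholder mean and standard-deviation feature, and the LSTM weights are random rather than learned. All names (`split_regions`, `region_feature`, `LSTMCell`, `encode_face`) are hypothetical.

```python
import math
import random


def split_regions(image, grid):
    """Split a 2-D list into grid x grid non-overlapping regions, in raster order."""
    h, w = len(image), len(image[0])
    rh, rw = h // grid, w // grid
    regions = []
    for gy in range(grid):
        for gx in range(grid):
            patch = [image[y][x]
                     for y in range(gy * rh, (gy + 1) * rh)
                     for x in range(gx * rw, (gx + 1) * rw)]
            regions.append(patch)
    return regions


def region_feature(patch):
    """Placeholder descriptor (assumption): mean and standard deviation of the pixels."""
    mean = sum(patch) / len(patch)
    var = sum((p - mean) ** 2 for p in patch) / len(patch)
    return [mean, math.sqrt(var)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class LSTMCell:
    """Standard LSTM cell with random (untrained) weights, for illustration only."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input (i), forget (f),
        # output (o), and candidate (g).
        self.W = {k: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # concatenate the region feature and the previous hidden state

        def gate(name, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.W[name], self.b[name])]

        i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        g = gate("g", math.tanh)
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def encode_face(image, grids=(2, 4), hidden_size=3):
    """Scan the regions at each scale with an LSTM and concatenate the final states."""
    encoding = []
    for grid in grids:
        cell = LSTMCell(2, hidden_size, seed=grid)
        h, c = [0.0] * hidden_size, [0.0] * hidden_size
        for patch in split_regions(image, grid):
            h, c = cell.step(region_feature(patch), h, c)
        encoding.extend(h)
    return encoding


# Usage: an 8 x 8 synthetic "face" encoded at two scales (2 x 2 and 4 x 4 grids).
image = [[((x + y) % 8) / 7.0 for x in range(8)] for y in range(8)]
print(len(encode_face(image)))
```

In this sketch the hidden state carried from one region to the next is what lets the encoding depend on the spatial arrangement of regions, and concatenating the final states across grid sizes stands in for combining multi-scale features. A one-dimensional raster scan is only one possible reading of "spatial" scanning.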