This paper proposes a novel visual comfort assessment (VCA) method for stereoscopic images using deep learning. To predict the visual discomfort experienced by the human visual system (HVS) in stereoscopic viewing, we devise VCA deep networks that latently encode perceptual cues, namely the visual differences between the stereoscopic views and human attention-based disparity magnitude and gradient information. To extract visual difference features from the left and right views, a Siamese network is employed. In addition, human attention region-based disparity magnitude and gradient maps are fed to two individual deep convolutional neural networks (DCNNs) to extract disparity-related features grounded in HVS characteristics. Finally, by aggregating these perceptual features, the proposed method directly predicts the final visual comfort score. Extensive comparative experiments have been conducted on the IEEE-SA dataset. Experimental results show that the proposed method yields higher correlation with subjective comfort scores than existing methods.
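The three-branch design described above can be sketched in pure NumPy. This is only an illustrative toy, not the paper's implementation: the `encode` function, all weight shapes, the 8×8 input size, and the linear regression head are hypothetical stand-ins for the actual DCNN branches and training procedure, which are not specified here. What the sketch does preserve is the key structural idea: the two views pass through the *same* (Siamese) weights so their embedding difference captures inter-view discrepancies, while the attention-weighted disparity magnitude and gradient maps each get their own branch before all features are aggregated for score regression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
FEAT_DIM = 16            # per-branch feature length
IN_ELEMS = 64            # toy 8x8 input maps, flattened

W_shared = rng.standard_normal((FEAT_DIM, IN_ELEMS))  # Siamese branch (shared weights)
W_mag    = rng.standard_normal((FEAT_DIM, IN_ELEMS))  # disparity-magnitude branch
W_grad   = rng.standard_normal((FEAT_DIM, IN_ELEMS))  # disparity-gradient branch
w_head   = rng.standard_normal(3 * FEAT_DIM)          # regression head -> comfort score

def encode(x, W):
    """Stand-in for a DCNN branch: flatten, linear projection, ReLU."""
    return np.maximum(W @ x.reshape(IN_ELEMS), 0.0)

def predict_comfort(left, right, disp_mag, disp_grad):
    # Siamese branch: the SAME weights encode both views; the absolute
    # difference of the two embeddings models inter-view visual differences.
    diff_feat = np.abs(encode(left, W_shared) - encode(right, W_shared))
    # Separate branches for attention-based disparity magnitude/gradient maps.
    mag_feat  = encode(disp_mag, W_mag)
    grad_feat = encode(disp_grad, W_grad)
    # Aggregate the perceptual features and regress a single comfort score.
    feats = np.concatenate([diff_feat, mag_feat, grad_feat])
    return float(w_head @ feats)
```

One consequence of the weight sharing plus absolute difference is that the predicted score is invariant to swapping the left and right views, which is a sensible symmetry for a comfort model.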