The opacity of the decision-making process has been identified as the main obstacle to the practical application of deep learning-based methods, despite their outstanding performance. Interpretability can guarantee confidence in a deep learning system and is therefore particularly important in the medical field. In this study, a novel deep network is proposed that explains its diagnostic decision with a visual pointing map and a justifying diagnostic sentence simultaneously. To increase the accuracy of sentence generation, a visual word constraint model is devised for training the justification generator. To verify the proposed method, comparative experiments were conducted on the diagnosis of breast masses. Experimental results demonstrate that the proposed deep network can explain its diagnoses more accurately, with varied textual justifications.