In recent years, researchers have investigated whether robustness and blindness can be secured simultaneously in watermarking based on machine learning. However, achieving robustness against a variety of attacks at once remains difficult for watermarking techniques. To address this problem, we propose a learning framework for robust and blind watermarking based on reinforcement learning. The framework repeats three stages: watermark embedding, attack simulation, and weight updating. Specifically, we present image watermarking networks, called WMNet, that use convolutional neural networks (CNNs). We propose two methods for embedding a watermark, based on backpropagation and on an autoencoder, respectively. This lets us optimize robustness while carefully considering the invisibility of the watermarking system. We measure the trade-off between robustness and invisibility for each technique, and we adopt visual masking to achieve an appropriate balance between the two. The experimental results show that the trained WMNet captures more robust features than the current watermarking schemes, which use the frequency domain. Our reinforcement-learning-based technique is more robust than existing techniques against both attacks seen during training and unseen attacks. Moreover, owing to the generalization ability of WMNet, it shows high robustness against multiple attacks and against attack levels that are not considered in the training stage.
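The trade-off between robustness and invisibility mentioned above has to be quantified somehow. The abstract does not state which metrics are used, so the following is only a minimal sketch under the assumption of two measures that are common in the watermarking literature: peak signal-to-noise ratio (PSNR) between the host and the watermarked image for invisibility, and bit error rate (BER) of the extracted watermark after an attack for robustness. The function names and the toy pixel and bit values are hypothetical and are not taken from the paper.

```python
import math


def psnr(original, marked, peak=255.0):
    """PSNR in dB between two equal-length pixel sequences (assumed invisibility metric)."""
    if len(original) != len(marked):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error between host and watermarked pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    if mse == 0:
        return math.inf  # identical images: distortion is zero
    return 10.0 * math.log10(peak ** 2 / mse)


def bit_error_rate(embedded_bits, extracted_bits):
    """Fraction of watermark bits that differ after extraction (assumed robustness metric)."""
    if len(embedded_bits) != len(extracted_bits):
        raise ValueError("bit sequences must have the same length")
    errors = sum(a != b for a, b in zip(embedded_bits, extracted_bits))
    return errors / len(embedded_bits)


# Hypothetical toy data: four 8-bit pixels, each changed by 1 by the embedding.
host = [100, 120, 140, 160]
marked = [101, 119, 141, 159]

# Hypothetical 8-bit watermark; two bits are flipped after a simulated attack.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
received = [1, 0, 0, 1, 0, 0, 1, 1]

print(f"PSNR: {psnr(host, marked):.2f} dB")
print(f"BER:  {bit_error_rate(sent, received):.2f}")
```

Under these assumptions, a stronger embedding would typically lower the BER after an attack at the cost of a lower PSNR, which is the balance that visual masking is meant to adjust.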