In this paper, we present a novel hardware-friendly super-resolution (SR) method based on a convolutional neural network (CNN), together with its dedicated hardware (HW) implementation on a field-programmable gate array (FPGA). Although CNN-based SR methods have shown very promising results, their computational complexity is prohibitive for hardware implementation. To the best of our knowledge, ours is the first real-time CNN-based SR HW that upscales 2K full high-definition video to 4K ultra high-definition (UHD) video at 60 frames per second (fps). In our dedicated CNN-based SR HW, low-resolution input frames are processed line by line, and the number of convolutional filter parameters is reduced significantly by incorporating depth-wise separable convolutions with a residual connection. The HW uses a cascade of 1D convolutions with large receptive fields along horizontal lines while keeping the vertical receptive field minimal, which reduces the required line memory while achieving SR performance comparable to that of full 2D convolutions. For efficient HW implementation, we use a simple and effective quantization method that incurs little peak signal-to-noise ratio (PSNR) degradation. We also propose a compression method that stores intermediate feature map data efficiently, reducing the number of line memories used in HW. Our FPGA implementation generates 4K UHD frames at 60 fps with higher PSNR values and better visual quality than conventional CNN-based SR methods that are trained and tested in software.
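
To make the parameter and line-memory argument concrete, the following minimal sketch compares a standard 2D convolution with depth-wise separable variants. The channel counts, kernel sizes, and function names are illustrative assumptions rather than values taken from our design. The line-buffer estimate assumes a simple streaming model in which a kernel of height k_h needs k_h - 1 previously received rows.

```python
def standard_conv_params(c_in, c_out, k_h, k_w):
    """Weight count of a standard convolution (biases omitted)."""
    return c_in * c_out * k_h * k_w


def depthwise_separable_params(c_in, c_out, k_h, k_w):
    """Weight count of a depth-wise k_h x k_w filter per input channel
    followed by a 1x1 point-wise convolution (biases omitted)."""
    depthwise = c_in * k_h * k_w
    pointwise = c_in * c_out
    return depthwise + pointwise


def line_buffers_needed(k_h):
    """Previously received rows that must be buffered for a kernel of
    height k_h under the simple streaming model assumed here."""
    return k_h - 1


# Hypothetical layer sizes, chosen only for illustration.
C_IN, C_OUT = 32, 32

full_2d = standard_conv_params(C_IN, C_OUT, 3, 3)        # standard 3x3
sep_2d = depthwise_separable_params(C_IN, C_OUT, 3, 3)   # separable 3x3
sep_1d = depthwise_separable_params(C_IN, C_OUT, 1, 5)   # separable 1x5 (horizontal only)

print("standard 3x3 :", full_2d, "weights,", line_buffers_needed(3), "line buffers")
print("separable 3x3:", sep_2d, "weights,", line_buffers_needed(3), "line buffers")
print("separable 1x5:", sep_1d, "weights,", line_buffers_needed(1), "line buffers")
```

Under these assumed sizes, the horizontal 1x5 separable layer covers a wider horizontal extent than the 3x3 layers while using fewer weights and requiring no buffered rows, which is the trade-off the 1D cascade exploits.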