3D object detection is a key technology for autonomous driving. Detectors operate on camera images or LiDAR point clouds, and LiDAR-based detectors usually outperform image-based ones because point clouds directly capture 3D geometry. However, point clouds pose their own difficulties for detection, such as sparsity and density variation with distance. It is therefore believed that fusing multiple sensors, i.e., cameras and LiDARs, can improve detector performance and enable robust object detection. Nevertheless, fusion between the camera and the LiDAR is tricky because the two sensors generally differ in resolution and viewpoint, and performance often deteriorates after fusion. Consequently, many studies transform image data so that it is compatible only with a specific point cloud network, which makes it difficult to transfer an image fusion method from one network to another. In this study, a fusion module was designed that can be used universally with networks built on PointNet, a point cloud encoder known for its excellent performance. In addition, the KITTI dataset was used to evaluate the performance of networks with the module attached.
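The abstract does not specify the internals of the fusion module, so the following is only an illustrative sketch of one common point-wise camera-LiDAR fusion scheme compatible with a PointNet-style encoder: project each LiDAR point into the image with a calibration matrix, sample a per-point image feature, concatenate it with the point coordinates, and pass the result through a shared per-point MLP followed by max pooling. All function names, shapes, and the nearest-neighbor sampling choice here are assumptions for illustration, not the module described in the paper.

```python
import numpy as np

def project_to_image(points, P):
    # Project N x 3 LiDAR points into the image plane using a
    # 3 x 4 projection matrix P (assumed already LiDAR-to-camera).
    homo = np.hstack([points, np.ones((points.shape[0], 1))])
    uvw = homo @ P.T
    return uvw[:, :2] / uvw[:, 2:3]  # pixel coordinates (u, v)

def gather_image_features(feat_map, uv):
    # Sample a per-point image feature by nearest-neighbor lookup
    # (bilinear interpolation would be the usual refinement).
    H, W, _ = feat_map.shape
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    return feat_map[v, u]  # N x C

def pointnet_fusion(points, feat_map, P, W1, W2):
    # Concatenate raw point coordinates with sampled image features,
    # apply a shared two-layer MLP to every point (PointNet style),
    # then max-pool over points for a permutation-invariant feature.
    uv = project_to_image(points, P)
    img_feat = gather_image_features(feat_map, uv)
    fused = np.hstack([points, img_feat])  # N x (3 + C)
    h = np.maximum(fused @ W1, 0)          # shared MLP layer 1 + ReLU
    h = np.maximum(h @ W2, 0)              # shared MLP layer 2 + ReLU
    return h.max(axis=0)                   # global max pooling

# Toy usage with random data and an identity-like pinhole projection.
rng = np.random.default_rng(0)
points = rng.standard_normal((128, 3)) + np.array([0.0, 0.0, 10.0])
feat_map = rng.standard_normal((32, 64, 8))         # H x W x C image features
P = np.hstack([np.eye(3), np.zeros((3, 1))])        # assumes points have z > 0
W1 = rng.standard_normal((11, 16))                  # 3 coords + 8 image channels
W2 = rng.standard_normal((16, 32))
global_feat = pointnet_fusion(points, feat_map, P, W1, W2)  # shape (32,)
```

Because the fusion happens per point before the PointNet MLP, the same module can in principle be prepended to any network that consumes per-point feature vectors, which is the portability property the abstract emphasizes.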