Hand pose estimation is an actively studied problem in areas such as Augmented Reality (AR), Virtual Reality (VR), Computer Vision (CV), and Human-Computer Interaction (HCI). Existing model-based hand tracking studies have tracked hand pose fairly accurately and robustly for slow, smooth hand motion by assuming temporal continuity of the hand pose. In addition, learning-based hand pose estimation studies have accurately estimated the hand pose in a single camera frame without assuming temporal continuity. However, when the hand moves quickly, motion blur occurs, which significantly degrades the accuracy of these approaches. This dissertation studies methods that estimate the 3D hand pose from a depth camera, a gyroscope, and an infrared camera in situations where temporal continuity of the hand pose cannot be assumed and under the challenging condition of motion blur. The proposed methods are validated through quantitative and qualitative comparisons with state-of-the-art methods. This study demonstrates their applicability to augmented reality and virtual reality by showing that they can be used for 3D interaction and sign language recognition.