In mixed reality (MR), augmenting virtual objects consistently with real-world illumination is one of the key factors that provide a realistic and immersive user experience. For this purpose, we propose a novel deep learning-based method to estimate high dynamic range (HDR) illumination from a single RGB image of a reference object. To obtain illumination of a current scene, previous approaches inserted a special camera in that scene, which may interfere with user's immersion, or they analyzed reflected radiances from a passive light probe with a specific type of materials or a known shape. The proposed method does not require any additional gadgets or strong prior cues, and aims to predict illumination from a single image of an observed object with a wide range of homogeneous materials and shapes. To effectively solve this ill-posed inverse rendering problem, three sequential deep neural networks are employed based on a physically-inspired design. These networks perform end-to-end regression to gradually decrease dependency on the material and shape. To cover various conditions, the proposed networks are trained on a large synthetic dataset generated by physically-based rendering. Finally, the reconstructed HDR illumination enables realistic image-based lighting of virtual objects in MR. Experimental results demonstrate the effectiveness of this approach compared against state-of-the-art methods. The paper also suggests some interesting MR applications in indoor and outdoor scenes.