According to Paul Milgram's definition of the reality-virtuality continuum, the environments humans experience can be classified into four categories: the real world (RW), augmented reality (AR), augmented virtuality (AV), and virtual reality (VR).
Recent advances in the computing capabilities of devices such as Google Glass, Kinect2, and Oculus Rift have shifted people's attention from RW environments to AR, AV, and VR environments. Among these, AV services have gained tremendous popularity and have been widely deployed in a variety of settings, such as homes and shopping malls, thanks to the widespread adoption of RGB and RGB-D sensors like Kinect2.
AV services allow a user to experience environments beyond the limitations of time and space, because the user can be placed into any environment in a virtual world. However, the user is easily distracted when the seamless integration between RW objects and the virtual world breaks down. To provide full immersion, the user's silhouette must be extracted with high accuracy, and all processing must be performed under real-time constraints.
In this dissertation, we propose fast and accurate foreground segmentation methods for RGB and RGB-D sensors. First, an adaptive mixture of Gaussians (MoG) with observation mask control is proposed for RGB sensors to handle the stopped-object problem. Second, a multi-weighted moving average (MWMA) and a saliency- and depth-aided segmentation (SDAS) method are proposed for RGB-D sensors to address flickering on the silhouette boundary and data loss on the head and feet of a human body, respectively. Qualitative and quantitative experimental results show that the proposed methods outperform previous methods in accuracy at comparable speed.
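To make the Gaussian background modeling that underlies the first contribution concrete, the following is a minimal single-Gaussian-per-pixel sketch in Python/NumPy. It is an illustrative simplification of standard MoG-style background subtraction, not the proposed adaptive MoG with observation mask control; the function name and the parameters `alpha` (learning rate) and `k` (deviation threshold) are assumptions chosen for illustration.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05, k=2.5):
    """One update step of a per-pixel Gaussian background model.

    A pixel is labeled foreground when it deviates more than k standard
    deviations from the running mean. Background statistics are updated
    only at background pixels, so a foreground object does not corrupt
    the model (a simplified, single-Gaussian stand-in for an adaptive
    mixture of Gaussians).
    """
    diff = np.abs(frame - mean)
    foreground = diff > k * np.sqrt(var)
    background = ~foreground
    # Exponential running update of mean and variance at background pixels.
    mean = np.where(background, (1 - alpha) * mean + alpha * frame, mean)
    var = np.where(background, (1 - alpha) * var + alpha * diff ** 2, var)
    return mean, var, foreground

# Toy example: a flat background of intensity 100 with one bright pixel.
mean = np.full((4, 4), 100.0)
var = np.full((4, 4), 16.0)
frame = mean.copy()
frame[1, 1] = 200.0  # a moving object appears at pixel (1, 1)
mean, var, fg = update_background(mean, var, frame)
```

Note that because the model is frozen at foreground pixels, an object that stops moving keeps being detected indefinitely; this is exactly the stopped-object problem that the proposed observation mask control is designed to handle.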