Motion estimation approaches enable robust prediction of successive
camera poses even when the camera undergoes erratic motion; under such
conditions, a constant-velocity model is particularly prone to failure.
However, motion estimation itself inevitably introduces pose errors that result
in an inconsistent map. To solve this problem, we propose a novel
3D visual SLAM approach in which both motion estimation and stochastic filtering
are performed by combining visual odometry with Rao-Blackwellized
particle filtering. First, to ensure that the process and the measurement
noise are independent (they are actually dependent in the case of a single sensor),
we simply divide observations (i.e., image features) into two categories: common
features observed in consecutive key-frame images and new features detected
only in the current key-frame image. In addition, we propose a key-frame SLAM
scheme with a data-driven proposal distribution to reduce error accumulation. We demonstrate
the accuracy of the proposed method in terms of the consistency of the global map.
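As a minimal illustrative sketch (not the authors' implementation), the feature-partitioning step described above can be expressed as splitting the current key-frame's features into those also seen in the previous key-frame and those newly detected; the function name `split_features` and the integer feature IDs are hypothetical:

```python
def split_features(prev_ids, curr_ids):
    """Partition current key-frame features into 'common' features
    (also observed in the previous key-frame, usable for visual
    odometry) and 'new' features (first detected in the current
    key-frame), so that process and measurement noise are derived
    from disjoint observation sets.
    """
    prev = set(prev_ids)
    common = [i for i in curr_ids if i in prev]
    new = [i for i in curr_ids if i not in prev]
    return common, new

# Hypothetical feature IDs from two consecutive key-frames:
# features 3 and 4 are common; 5 and 6 are newly detected.
common, new = split_features([1, 2, 3, 4], [3, 4, 5, 6])
```

In practice the IDs would come from descriptor matching between key-frames rather than from pre-assigned labels.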