The present invention includes: a targeting unit configured to acquire, when an event is generated by a user's action in a three-dimensional (3D) image displayed on a display device, a first subspace of a first 3D shape corresponding to the user's action; and a refinement unit configured to acquire a second subspace of a second 3D shape, whose position and scale are adjusted according to a user's gesture within the range of the first subspace acquired by the targeting unit.
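
The two-stage acquisition described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not specified by the abstract: both 3D shapes are taken to be axis-aligned boxes, the user's action is reduced to a single 3D point, and the gesture is reduced to an offset plus a scale factor. The names `Box`, `target`, and `refine` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box described by its center and full size per axis (assumed shape)."""
    center: tuple
    size: tuple

    def bounds(self):
        """Return the (low, high) corner coordinates of the box."""
        lo = tuple(c - s / 2 for c, s in zip(self.center, self.size))
        hi = tuple(c + s / 2 for c, s in zip(self.center, self.size))
        return lo, hi


def target(event_point, extent=2.0):
    """Hypothetical targeting step: a cube of side `extent` around the event position."""
    center = tuple(float(v) for v in event_point)
    return Box(center, (float(extent),) * 3)


def refine(first, offset, scale):
    """Hypothetical refinement step: scale and shift a box inside `first`.

    `scale` is clamped to (0, 1] and the center is clamped per axis, so the
    second subspace never extends beyond the first one.
    """
    scale = min(max(scale, 1e-6), 1.0)
    size = tuple(s * scale for s in first.size)
    lo, hi = first.bounds()
    center = []
    for c, o, l, h, s in zip(first.center, offset, lo, hi, size):
        half = s / 2
        # Keep the shifted center far enough from each face of the first box.
        center.append(min(max(c + o, l + half), h - half))
    return Box(tuple(center), size)


# Usage: an event at the origin, then a gesture that halves the scale and
# tries to drag the box far along +x; the result is clamped to the first box.
first = target((0, 0, 0), extent=2.0)
second = refine(first, offset=(5.0, 0.0, -0.25), scale=0.5)
```

Clamping both the scale and the center is one simple way to honor the constraint that the adjustment happens within the range of the first subspace; other shapes or gesture mappings would need a different containment check.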