Accurate depth estimation is a challenging yet essential step in converting a 2D image sequence to a 3D stereo sequence. We present a novel approach for constructing a temporally coherent depth map for each image in a sequence. The quality of the estimated depth is high enough for 2D-to-3D stereo conversion. Our approach first combines the video sequence into a single panoramic image. A user then scribbles on this panoramic image to specify depth information, and the depth is propagated to the remainder of the panorama. The resulting depth map is remapped to the original sequence and used as the initial guess for each individual depth map in the sequence. Our approach greatly simplifies the user interaction required to assign depth and allows for relatively free camera movement during the generation of the panoramic image. We demonstrate the effectiveness of our method by showing stereo-converted sequences with various camera motions.
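
To make the remapping step concrete, the following is a minimal sketch of how a panoramic depth map might be sampled back into one frame of the sequence. It assumes, purely for illustration, that each frame is related to the panorama by a known 3x3 homography and that nearest-neighbour sampling is adequate for an initial guess; the abstract does not specify the registration model, and the names `remap_panorama_depth` and `H_frame_to_pano` are hypothetical.

```python
import numpy as np

def remap_panorama_depth(pano_depth, H_frame_to_pano, frame_shape):
    """Sample a panoramic depth map at the location each frame pixel maps to.

    pano_depth:       2D array of depth values defined on the panorama.
    H_frame_to_pano:  3x3 homography taking frame (x, y, 1) to panorama
                      coordinates (an assumed registration model).
    frame_shape:      (height, width) of the frame to fill.
    Pixels that fall outside the panorama are returned as NaN.
    """
    h, w = frame_shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Homogeneous coordinates (x, y, 1) for every frame pixel, one per column.
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)], axis=0)
    mapped = H_frame_to_pano @ pts
    # Perspective divide, then round to the nearest panorama pixel.
    px = np.rint(mapped[0] / mapped[2]).astype(int)
    py = np.rint(mapped[1] / mapped[2]).astype(int)
    ph, pw = pano_depth.shape
    valid = (px >= 0) & (px < pw) & (py >= 0) & (py < ph)
    depth = np.full(h * w, np.nan)
    depth[valid] = pano_depth[py[valid], px[valid]]
    return depth.reshape(h, w)

# Toy usage: a 4x5 panorama and a frame offset by (2, 1) pixels within it.
pano = np.arange(20, dtype=float).reshape(4, 5)
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
frame_depth = remap_panorama_depth(pano, H, (2, 4))
```

In this sketch the remapped values only serve as a starting point for each frame; any per-frame refinement is outside what the example covers.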