Haptic feedback plays an important role in providing spatial cues that are difficult to convey by sight alone, as well as in increasing the immersiveness of content. However, although a number of techniques and applications for haptic media have been proposed, live streaming of touchable video has yet to be actively deployed due to computational complexity and equipment limitations. To mitigate these issues, we introduce an approach that renders haptic feedback directly from RGB-D video streams without surface reconstruction, and we describe how to superimpose virtual objects or haptic effects onto real-world scenes. Furthermore, we discuss possible software improvements and appropriate device setups that could extend the proposed system into a practical solution for multi-sensory, multi-point interaction with streaming touchable media.
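The core idea of rendering haptic feedback directly from a depth stream, without building a mesh, can be illustrated with a simple penalty-based sketch. This is not the paper's actual algorithm; the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the stiffness constant, and the gradient-based normal estimate below are all illustrative assumptions. The probe position is projected into the depth image, penetration is measured against the sampled depth value, and a spring force is applied along a normal estimated from local depth gradients:

```python
import numpy as np

def haptic_force(depth, probe, stiffness=300.0,
                 fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Illustrative penalty force for a haptic probe against an RGB-D
    depth map, with no surface reconstruction. `depth` is in metres,
    `probe` is (x, y, z) in camera coordinates; intrinsics are assumed."""
    # Project the probe into the depth image (pinhole model).
    u = int(round(probe[0] * fx / probe[2] + cx))
    v = int(round(probe[1] * fy / probe[2] + cy))
    h, w = depth.shape
    if not (0 <= u < w and 0 <= v < h):
        return np.zeros(3)
    surface_z = depth[v, u]
    # Penetration: probe lies behind the sampled surface point.
    penetration = probe[2] - surface_z
    if penetration <= 0:
        return np.zeros(3)
    # Estimate the surface normal from central depth differences.
    dz_du = (depth[v, min(u + 1, w - 1)] - depth[v, max(u - 1, 0)]) / 2.0
    dz_dv = (depth[min(v + 1, h - 1), u] - depth[max(v - 1, 0), u]) / 2.0
    normal = np.array([-dz_du * fx / surface_z,
                       -dz_dv * fy / surface_z,
                       -1.0])                      # points toward the camera
    normal /= np.linalg.norm(normal)
    # Spring (penalty) force pushing the probe back out of the surface.
    return stiffness * penetration * normal
```

Because the force is computed from a single depth lookup and two finite differences, each haptic update is O(1) per contact point, which is what makes a mesh-free scheme attractive for live streams.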