We propose a metadata generation technique that describes the content of broadcast video. It generates XML documents summarizing the video content. In the system, the video is analyzed by shot-boundary detection and multi-modal feature extraction, and the resulting features are combined to construct high-level metadata such as segments covering important events.
Nowadays, there is an increasing demand for interactive broadcast systems. The TV-Anytime and MPEG-7 standards provide efficient and effective content-based metadata for summarizing, retrieving, and indexing content; for example, they can supply information such as color histograms, homogeneous texture, program information, and program locators.
In this paper, we analyze video content with multiple content features, i.e., multiple MPEG-7 metadata. First, we carry out shot-boundary detection, and then MPEG-7 metadata such as camera motion, motion trajectory, GOP (Group of Pictures) statistics, edge detection, and homogeneous texture are extracted. However, a single type of metadata may not suffice to find similar high-level patterns; for example, neither camera motion nor GOP metadata alone can identify putting-shots. Therefore, multiple metadata must be combined. In this paper, we combine the camera-motion feature with GOP information to find putting-shots more accurately. Furthermore, a summary structured according to the SegmentGroupInformation DS of TV-Anytime is constructed and serialized as an XML document. One can then search the video content by parsing the XML document, with direct access to the key shots.
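As a minimal sketch of the combination step, the following assumes two per-shot scalar features, a camera-motion magnitude and a mean GOP bit count; the thresholds, feature values, and simplified XML element names are illustrative assumptions, not the actual implementation or the exact TV-Anytime schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical per-shot features; real values would come from MPEG-7
# descriptor extraction after shot-boundary detection.
shots = [
    {"id": "shot1", "start": 0,   "end": 120, "camera_motion": 0.8, "gop_bits": 9500},
    {"id": "shot2", "start": 121, "end": 300, "camera_motion": 0.1, "gop_bits": 2100},
    {"id": "shot3", "start": 301, "end": 450, "camera_motion": 0.2, "gop_bits": 1900},
]

# Assumed heuristic: a putting-shot is nearly static (low camera motion) and
# low-complexity (few bits per GOP). Neither cue alone separates it reliably,
# so the two are combined with a logical AND.
CAM_THRESH = 0.3
GOP_THRESH = 2500

def is_putting_shot(shot):
    return shot["camera_motion"] < CAM_THRESH and shot["gop_bits"] < GOP_THRESH

def build_summary(shots):
    """Serialize the matching shots as a TV-Anytime-style segment group
    (element and attribute names simplified for illustration)."""
    root = ET.Element("SegmentGroupInformation", groupType="highlights")
    for s in shots:
        if is_putting_shot(s):
            seg = ET.SubElement(root, "SegmentInformation", segmentId=s["id"])
            ET.SubElement(seg, "MediaTime", start=str(s["start"]), end=str(s["end"]))
    return ET.tostring(root, encoding="unicode")

print(build_summary(shots))
```

Parsing the resulting XML document then gives a client direct access to the key shots by their segment identifiers and media times.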
To demonstrate the usefulness of the proposed method, we implemented a video metadata generation tool. The necessary metadata are extracted, and the key events of the video sequences are described with multiple metadata and serialized as an XML summary document. In the experiments, we used the MPEG-7 video data set. Given a query pattern, we measured the ratio of similar patterns among all retrieved patterns. One can get a more similar pat...
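The evaluation measure described above, the ratio of truly similar patterns among all retrieved ones, is the standard retrieval precision. A small sketch with hypothetical shot identifiers:

```python
def precision(retrieved, relevant):
    """Fraction of retrieved shots that are truly similar to the query pattern."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)

# Hypothetical result: 4 shots retrieved, 3 of them genuinely similar.
print(precision(["s1", "s2", "s3", "s7"], ["s1", "s2", "s3", "s5"]))  # → 0.75
```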