DSpace at KOASAS: Bimodal fusion of low-level visual features and high-level semantic features for near-duplicate video clip detection



Min, Hyun-seok / Choi, Jae Young / De Neve, Wesley / Ro, YongMan

The detection of near-duplicate video clips (NDVCs) is an area of current research interest and intense development. Most NDVC detection methods represent video clips with a unique set of low-level visual features, typically describing color or texture information. However, low-level visual features are sensitive to transformations of the video content. Given the observation that transformations tend to preserve the semantic information conveyed by the video content, we propose a novel approach for identifying NDVCs, making use of both low-level visual features (that is, MPEG-7 visual features) and high-level semantic features (that is, 32 semantic concepts detected using trained classifiers). Experimental results obtained for the publicly available MUSCLE-VCD-2007 and TRECVID 2008 video sets show that bimodal fusion of visual and semantic features facilitates robust NDVC detection. In particular, the proposed method is able to identify NDVCs with a low missed detection rate (3% on average) and a low false alarm rate (2% on average). In addition, the combined use of visual and semantic features outperforms the separate use of either of them in terms of NDVC detection effectiveness. Further, we demonstrate that the effectiveness of the proposed method is on par with or better than the effectiveness of three state-of-the-art NDVC detection methods, which make use of temporal ordinal measurement, features computed using the Scale-Invariant Feature Transform (SIFT), and bag-of-visual-words (BoVW), respectively. We also show that the influence of the effectiveness of semantic concept detection on the effectiveness of NDVC detection is limited, as long as the mean average precision (MAP) of the semantic concept detectors used is higher than 0.3. Finally, we illustrate that the computational complexity of our NDVC detection method is competitive with the computational complexity of the three aforementioned NDVC detection methods. © 2011 Elsevier B.V. All rights reserved.
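As a rough illustration of the ideas in the abstract, the following Python sketch combines a visual similarity and a semantic similarity into one score and then measures the missed detection rate and false alarm rate at a decision threshold. The abstract does not state the fusion rule, the similarity measure, or the threshold, so the weighted sum, the cosine similarity, the `weight` parameter, and all function names here are assumptions made for illustration only, not the authors' method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fused_similarity(visual_a, visual_b, semantic_a, semantic_b, weight=0.5):
    """Weighted sum of visual and semantic similarity (hypothetical fusion rule).

    visual_*   : low-level visual descriptors of two clips
    semantic_* : per-concept detector scores of two clips
    weight     : share given to the visual modality, between 0 and 1
    """
    s_vis = cosine_similarity(visual_a, visual_b)
    s_sem = cosine_similarity(semantic_a, semantic_b)
    return weight * s_vis + (1.0 - weight) * s_sem


def detection_rates(scores, labels, threshold):
    """Return (missed detection rate, false alarm rate) at a threshold.

    A pair is declared a near-duplicate when its score is >= threshold.
    labels[i] is True when pair i really is a near-duplicate.
    """
    positives = sum(1 for is_dup in labels if is_dup)
    negatives = len(labels) - positives
    missed = sum(1 for s, is_dup in zip(scores, labels)
                 if is_dup and s < threshold)
    false_alarms = sum(1 for s, is_dup in zip(scores, labels)
                       if not is_dup and s >= threshold)
    missed_rate = missed / positives if positives else 0.0
    false_alarm_rate = false_alarms / negatives if negatives else 0.0
    return missed_rate, false_alarm_rate
```

With this sketch, a pair whose visual descriptors differ after a transformation can still receive a high fused score if its semantic concept scores agree, which is the intuition the abstract gives for combining the two modalities.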

Publisher: ELSEVIER SCIENCE BV

Issue Date: 2011-11

Language: English

Article Type: Article

Citation: Signal Processing: Image Communication, v.26, no.10, pp.612-627

ISSN: 0923-5965

URI: http://hdl.handle.net/10203/100872

Appears in Collection: EE-Journal Papers

Files in This Item: There are no files associated with this item.

This item is cited by 5 other documents in WoS.


KOASAS

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
