Audio feature extraction methods for multimedia content analysis

With the rapid spread of user-created content (UCC), a wide variety of multimedia content, from general content such as music and movies to malicious content such as adult videos, is now easily produced and shared over the Internet. This trend can negatively affect children and teenagers and may contribute to sexual crimes. Consequently, analyzing multimedia content to decide whether it is malicious has recently received great attention from researchers and social groups. This thesis addresses the problem of analyzing multimedia content based on its audio signal in order to detect and block objectionable content. Malicious sounds such as sexual screams or moans have distinctive characteristics: large temporal variations and fast spectral transitions. Extracting features that properly represent these characteristics is therefore important for achieving better performance. In this thesis, we employ segment-based two-dimensional Mel-frequency cepstral coefficients and histograms of gradient directions as a feature set that characterizes both the temporal variations and the spectral transitions within a long-range segment of the target signal. A Gaussian mixture model (GMM) is adopted to statistically represent the malicious and non-malicious sounds, and test sounds are classified by the maximum a posteriori (MAP) criterion. Evaluated on a database of several hundred malicious and non-malicious sound clips, the proposed extraction method yielded a classification accuracy of 96.06%, which suggests that it could serve as an alternative to image-based methods.
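The classification step described in the abstract (one GMM per class, decision by MAP) can be illustrated with a minimal sketch. This is not the thesis implementation: the diagonal covariances, the frame-independence assumption, the equal priors, the 2-D toy feature vectors, and every name below (`log_gmm`, `classify_map`, `MODELS`, `PRIORS`) are assumptions made for illustration only. Feature extraction and GMM training are omitted, and the model parameters are made up.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, gmm):
    """Log-likelihood of one feature vector under a GMM.

    gmm is a list of (weight, mean, var) tuples. Log-sum-exp is used
    so that very small component likelihoods do not underflow.
    """
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify_map(frames, models, priors):
    """Pick the class with the highest log posterior (up to a shared constant).

    Assumes the feature vectors in `frames` are independent given the class,
    so their log-likelihoods simply add.
    """
    scores = {
        label: math.log(priors[label]) + sum(log_gmm(x, gmm) for x in frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical two-component models over 2-D features (values are made up).
MODELS = {
    "malicious": [(0.5, [2.0, 2.0], [1.0, 1.0]), (0.5, [3.0, 1.0], [1.0, 1.0])],
    "non_malicious": [(0.6, [-2.0, -1.0], [1.0, 1.0]), (0.4, [-1.0, -3.0], [1.0, 1.0])],
}
PRIORS = {"malicious": 0.5, "non_malicious": 0.5}

label, scores = classify_map([[2.1, 1.8], [2.9, 1.2]], MODELS, PRIORS)
print(label)
```

In practice the feature vectors would come from the segment-based features named in the abstract, and the mixture parameters would be estimated from labeled training clips rather than written by hand.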
Advisors
Kim, Hoi-Rin (김회린)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Information and Communications Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2010
Identifier
455270/325007  / 020084297
Language
eng
Description

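Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Department of Information and Communications Engineering, 2010.08, [ viii, 48 p. ]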

Keywords

Histograms of Oriented Gradients; Segmental Two-Dimensional Mel-Frequency Cepstral Coefficients; Multimedia Content Analysis; Audio Feature Extraction; Gaussian Mixture Model

URI
http://hdl.handle.net/10203/40139
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=455270&flag=dissertation
Appears in Collection
ICE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
