DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Nam, Juhan | - |
dc.contributor.advisor | 남주한 | - |
dc.contributor.author | Choi, Minsuk | - |
dc.date.accessioned | 2021-05-12T19:36:44Z | - |
dc.date.available | 2021-05-12T19:36:44Z | - |
dc.date.issued | 2020 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=910801&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/284009 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Graduate School of Culture Technology, 2020.2, [iv, 28 p.] | - |
dc.description.abstract | Music has a structure consisting of functional sections, such as the verse or chorus in pop music. Detecting the boundaries between two functionally homogeneous sections is a front-end task toward complete music structure analysis. Boundary detection is typically conducted on a self-similarity matrix (SSM) computed from audio features that capture the harmonic or timbral characteristics of a music track. Traditionally, hand-crafted features that explicitly extract these characteristics, such as Mel-Frequency Cepstral Coefficients (MFCC) and chroma, have been common choices. In this paper, we propose a method to learn feature representations via deep neural networks to obtain a more effective SSM. Specifically, we train the model with a Siamese-style neural network and a triplet loss over anchor, positive, and negative examples. The anchor and positive samples are selected from the same or a temporally close section of the music structure, whereas the negative samples are drawn from outside the section from which the anchor and positive samples are selected. We show that this approach tends to render the audio features more homogeneous within a section. Once we compute the SSM from the learned features, we apply a Gaussian checkerboard kernel to detect the structure boundaries. We evaluate the performance of the proposed method on the SALAMI dataset. The results show that the proposed method outperforms the traditional hand-crafted features when the same setup is used except for the audio features. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Music Structure Analysis; Music Structure Boundary Detection; Representation Learning; Metric Learning | - |
dc.subject | 음악 구조 분석; 음악 구조 경계 탐지; 표현 학습; 거리 학습 | - |
dc.title | Representation learning for boundary detection in music structure analysis | - |
dc.title.alternative | 음악 구조 분석에서 경계 탐지를 위한 표현 학습 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Graduate School of Culture Technology | - |
dc.contributor.alternativeauthor | 최민석 | - |
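The pipeline described in the abstract (a triplet loss over anchor/positive/negative examples, followed by Gaussian checkerboard-kernel novelty detection on the SSM) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the kernel size, margin, and synthetic two-class features are assumptions, and the actual method learns the features with a Siamese-style neural network rather than using toy vectors.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss: pull anchor toward positive,
    push it away from negative by at least `margin` (illustrative values)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

def gaussian_checkerboard_kernel(size):
    """Gaussian-tapered checkerboard kernel of shape (2*size, 2*size):
    positive in the two within-segment quadrants, negative across them."""
    half = np.arange(-size, size)
    gauss = np.exp(-0.5 * (half / (0.5 * size)) ** 2)
    taper = np.outer(gauss, gauss)
    sign = np.outer(np.sign(half + 0.5), np.sign(half + 0.5))
    return taper * sign

def novelty_curve(ssm, size=16):
    """Slide the checkerboard kernel along the SSM's main diagonal;
    peaks in the resulting curve indicate candidate section boundaries."""
    n = ssm.shape[0]
    kernel = gaussian_checkerboard_kernel(size)
    padded = np.pad(ssm, size, mode="constant")
    novelty = np.empty(n)
    for i in range(n):
        window = padded[i:i + 2 * size, i:i + 2 * size]
        novelty[i] = np.sum(window * kernel)
    return novelty

# Toy example: unit-norm features with an abrupt change at frame 50.
rng = np.random.default_rng(0)
feats = np.vstack([np.tile([1.0, 0.0], (50, 1)),
                   np.tile([0.0, 1.0], (50, 1))])
feats += 0.01 * rng.standard_normal(feats.shape)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

ssm = feats @ feats.T            # cosine self-similarity matrix
nov = novelty_curve(ssm, size=8)
print(int(np.argmax(nov)))       # a frame index near 50, the true boundary
```

With features that are homogeneous within a section (as the triplet training encourages), the SSM becomes block-diagonal and the novelty curve peaks sharply at the block corners, which is why the learned representations make the subsequent kernel-based boundary picking more reliable.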