DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Nam, Juhan | - |
dc.contributor.advisor | 남주한 | - |
dc.contributor.author | Choi, Minsuk | - |
dc.date.accessioned | 2021-05-12T19:36:44Z | - |
dc.date.available | 2021-05-12T19:36:44Z | - |
dc.date.issued | 2020 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=910801&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/284009 | - |
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Graduate School of Culture Technology, 2020.2, [iv, 28 p.] | - |
dc.description.abstract | Music has a structure consisting of functional sections, such as the verse or chorus in pop music. Detecting the boundaries between two functionally homogeneous sections is a front-end task toward complete music structure analysis. Boundary detection is typically conducted on a self-similarity matrix (SSM) computed from audio features that capture the harmonic or timbral characteristics of a music track. Traditionally, hand-crafted features that explicitly extract these characteristics, such as Mel-Frequency Cepstral Coefficients (MFCC) and chroma, have been common choices. In this paper, we propose a method to learn feature representations via deep neural networks to obtain a more effective SSM. Specifically, we train the model with a Siamese-style neural network and a triplet loss over anchor, positive, and negative examples. The anchor and positive samples are selected from the same or a temporally close section of the music structure, whereas the negative samples are drawn from outside the section from which the anchor and positive samples are selected. We show that this approach tends to render the audio features more homogeneous within a section. Once we compute the SSM from the learned features, we apply a Gaussian checkerboard kernel to detect the structure boundaries. We evaluate the performance of the proposed method on the SALAMI dataset. The results show that the proposed method outperforms the traditional hand-crafted features when the same setup is used except for the audio features. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.subject | Music Structure Analysis; Music Structure Boundary Detection; Representation Learning; Metric Learning | - |
dc.subject | 음악 구조 분석; 음악 구조 경계 탐지; 표현 학습; 거리 학습 | - |
dc.title | Representation learning for boundary detection in music structure analysis | - |
dc.title.alternative | 음악 구조 분석에서 경계 탐지를 위한 표현 학습 | - |
dc.type | Thesis(Master) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : Graduate School of Culture Technology | - |
dc.contributor.alternativeauthor | 최민석 | - |
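The pipeline described in the abstract (a triplet loss over anchor/positive/negative examples, followed by Gaussian checkerboard-kernel novelty detection on the SSM) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the kernel size, margin, and synthetic two-class features are assumptions, and the actual method learns the features with a Siamese-style neural network rather than using toy vectors.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss: pull anchor toward positive,
    push it away from negative by at least `margin` (illustrative values)."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(0.0, d_ap - d_an + margin)

def gaussian_checkerboard_kernel(size):
    """Gaussian-tapered checkerboard kernel of shape (2*size, 2*size):
    positive in the two within-segment quadrants, negative across them."""
    half = np.arange(-size, size)
    gauss = np.exp(-0.5 * (half / (0.5 * size)) ** 2)
    taper = np.outer(gauss, gauss)
    sign = np.outer(np.sign(half + 0.5), np.sign(half + 0.5))
    return taper * sign

def novelty_curve(ssm, size=16):
    """Slide the checkerboard kernel along the SSM's main diagonal;
    peaks in the resulting curve indicate candidate section boundaries."""
    n = ssm.shape[0]
    kernel = gaussian_checkerboard_kernel(size)
    padded = np.pad(ssm, size, mode="constant")
    novelty = np.empty(n)
    for i in range(n):
        window = padded[i:i + 2 * size, i:i + 2 * size]
        novelty[i] = np.sum(window * kernel)
    return novelty

# Toy example: unit-norm features with an abrupt change at frame 50.
rng = np.random.default_rng(0)
feats = np.vstack([np.tile([1.0, 0.0], (50, 1)),
                   np.tile([0.0, 1.0], (50, 1))])
feats += 0.01 * rng.standard_normal(feats.shape)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

ssm = feats @ feats.T            # cosine self-similarity matrix
nov = novelty_curve(ssm, size=8)
print(int(np.argmax(nov)))       # a frame index near 50, the true boundary
```

With features that are homogeneous within a section (as the triplet training encourages), the SSM becomes block-diagonal and the novelty curve peaks sharply at the block corners, which is why the learned representations make the subsequent kernel-based boundary picking more reliable.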