Representation learning for boundary detection in music structure analysis

DC Field | Value | Language
dc.contributor.advisor | Nam, Juhan | -
dc.contributor.advisor | 남주한 | -
dc.contributor.author | Choi, Minsuk | -
dc.date.accessioned | 2021-05-12T19:36:44Z | -
dc.date.available | 2021-05-12T19:36:44Z | -
dc.date.issued | 2020 | -
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=910801&flag=dissertation | en_US
dc.identifier.uri | http://hdl.handle.net/10203/284009 | -
dc.description | Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Culture Technology, 2020.2, [iv, 28 p. :] | -
dc.description.abstract | Music has a structure that consists of functional sections, such as the verse or chorus in pop music. Detecting the boundaries between two functionally homogeneous sections is a front-end task toward complete music structure analysis. Boundary detection is typically conducted on a self-similarity matrix (SSM) computed from audio features that capture harmonic or timbral characteristics of a music track. Traditionally, hand-crafted features that explicitly extract these characteristics, such as Mel-frequency cepstral coefficients (MFCCs) and chroma, have been common choices. In this paper, we propose a method to learn feature representations via deep neural networks to obtain a more effective SSM. Specifically, we use a Siamese-style neural network trained with a triplet loss over anchor, positive, and negative examples. The anchor and positive samples are selected from the same or a temporally close section in the music structure, whereas the negative samples are selected from outside the section that contains the anchor and positive samples. We show that this approach tends to make the audio features more homogeneous within a section. Once we compute the SSM from the learned features, we apply a Gaussian checkerboard kernel to detect the structural boundaries. We evaluate the performance of the proposed method on the SALAMI dataset. The results show that the proposed method outperforms traditional hand-crafted features when the same setup is used apart from the audio features. | -
dc.language | eng | -
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | -
dc.subject | Music Structure Analysis; Music Structure Boundary Detection; Representation Learning; Metric Learning | -
dc.subject | 음악 구조 분석 (music structure analysis); 음악 구조 경계 탐지 (music structure boundary detection); 표현 학습 (representation learning); 거리 학습 (metric learning) | -
dc.title | Representation learning for boundary detection in music structure analysis | -
dc.title.alternative | 음악 구조 분석에서 경계 탐지를 위한 표현 학습 | -
dc.type | Thesis(Master) | -
dc.identifier.CNRN | 325007 | -
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST): Graduate School of Culture Technology | -
dc.contributor.alternativeauthor | 최민석 | -
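
The triplet-based training described in the abstract can be illustrated with a small NumPy sketch. This is not the thesis's implementation: the squared Euclidean distance, the margin value, and the hypothetical helper `sample_triplet` (which only draws positives from the anchor's own section and ignores the "temporally close" case) are all assumptions made for illustration.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss on embedding batches of shape (batch, dim).

    Assumption: squared Euclidean distance and margin=0.1 are illustrative
    choices, not values taken from the thesis.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # anchor-negative distance
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


def sample_triplet(section_ids, rng):
    """Hypothetical sampler: return (anchor, positive, negative) frame indices.

    section_ids holds one section identifier per frame. The positive comes
    from the anchor's section and the negative from any other section.
    """
    section_ids = np.asarray(section_ids)
    anchor = int(rng.integers(len(section_ids)))
    same = np.flatnonzero(section_ids == section_ids[anchor])
    same = same[same != anchor]  # the positive must differ from the anchor
    other = np.flatnonzero(section_ids != section_ids[anchor])
    if same.size == 0 or other.size == 0:
        raise ValueError("need at least two sections and two frames per section")
    return anchor, int(rng.choice(same)), int(rng.choice(other))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ids = [0, 0, 0, 1, 1, 1]
    a, p, n = sample_triplet(ids, rng)
    print("anchor/positive/negative frames:", a, p, n)
```

The loss is zero once each negative is farther from its anchor than the positive by at least the margin, which is what pushes frames within a section toward similar embeddings.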
Appears in Collection
GCT-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
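
The boundary-detection step named in the abstract (an SSM followed by a Gaussian checkerboard kernel) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's code: cosine similarity, the kernel half-width, the `sigma` parameterization, and zero padding at the edges are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np


def self_similarity(features):
    """Cosine self-similarity matrix for features of shape (frames, dim)."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def checkerboard_kernel(half_width, sigma=0.5):
    """Gaussian-tapered checkerboard kernel, shape (2*half_width, 2*half_width).

    Assumption: sigma is expressed as a fraction of half_width.
    """
    offsets = np.arange(-half_width, half_width) + 0.5
    taper = np.exp(-0.5 * (offsets / (sigma * half_width)) ** 2)
    sign = np.sign(offsets)
    # +1 in the two within-segment quadrants, -1 in the two cross quadrants.
    return np.outer(sign, sign) * np.outer(taper, taper)


def novelty_curve(ssm, half_width, sigma=0.5):
    """Slide the kernel along the SSM diagonal; peaks suggest boundaries."""
    kernel = checkerboard_kernel(half_width, sigma)
    padded = np.pad(ssm, half_width, mode="constant")
    n = ssm.shape[0]
    novelty = np.empty(n)
    for i in range(n):
        # Window covers original frames i - half_width .. i + half_width - 1.
        window = padded[i:i + 2 * half_width, i:i + 2 * half_width]
        novelty[i] = np.sum(window * kernel)
    return novelty


if __name__ == "__main__":
    # Toy track: 10 frames of one "section" followed by 10 frames of another.
    feats = np.vstack([np.tile([1.0, 0.0], (10, 1)), np.tile([0.0, 1.0], (10, 1))])
    nov = novelty_curve(self_similarity(feats), half_width=4)
    print("strongest boundary candidate at frame", int(np.argmax(nov)))
```

In this toy case the novelty curve peaks at the frame where the second section starts. A real pipeline would add peak picking over the curve and map frame indices to time.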
