When we hear unfamiliar music on TV or on a computer, we often want to know more about it. However, it is usually difficult to obtain this information directly from service providers. Content-based music information retrieval (MIR) offers a solution to this problem, and content-based audio retrieval techniques based on query by example (QBE) are therefore needed to identify an unknown music signal efficiently. In this thesis, we propose two methods for music retrieval. The first is an MFCC-temporal method that uses the temporal characteristics of melody. The second is a hybrid method based on pitch histograms and MFCC-VQ dynamic patterns, which uses both static and temporal patterns of melody for MIR.
Our features are pitch and MFCC, which represent the characteristics of notes, and we describe melody patterns using a pitch histogram and a temporal sequence of codeword indices. We then compute the similarity between the test pattern and the reference patterns. When comparing patterns, an appropriate pattern matching method is especially important for good performance, so we also present pattern matching methods suited to our retrieval methods. In the MFCC-VQ temporal method, a time alignment method compensates for the temporal offset between two patterns by shifting the reference sequence. In addition, a modified ED technique is employed, which divides the distance between two patterns by a weight equal to the number of frames with the same MFCC-VQ index. In the hybrid method, we use a TSO method that takes as the retrieved result the candidate with the minimum sum of order indices in the pitch histogram method and the MFCC-VQ temporal method.
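The matching steps described above can be illustrated with a minimal Python sketch. It is not the thesis implementation and rests on several assumptions the abstract does not confirm: that ED denotes a Euclidean distance taken directly over the codeword index sequences, that time alignment is an exhaustive search over shifts of the reference, that a window with no matching frames is treated as infinitely distant, and that the "order index" is a zero-based rank within each method's sorted result list. All function names are hypothetical.

```python
import math


def modified_distance(query, window):
    """Distance between two equal-length codeword-index sequences.

    Assumption: the base distance is Euclidean over the raw indices.
    """
    base = math.sqrt(sum((q - r) ** 2 for q, r in zip(query, window)))
    # Weight: number of frames whose codeword index is identical.
    matches = sum(1 for q, r in zip(query, window) if q == r)
    # Assumption: with no matching frame, treat the pair as maximally distant.
    return base / matches if matches else math.inf


def best_shift_distance(query, reference):
    """Shift a query-length window along the reference; keep the smallest distance."""
    n = len(query)
    if len(reference) < n:
        return math.inf
    return min(
        modified_distance(query, reference[s:s + n])
        for s in range(len(reference) - n + 1)
    )


def rank_sum_fusion(dist_a, dist_b):
    """Return the song whose summed rank across two methods is smallest.

    dist_a and dist_b map the same song identifiers to distances
    (smaller means more similar).
    """
    rank_a = {song: i for i, song in enumerate(sorted(dist_a, key=dist_a.get))}
    rank_b = {song: i for i, song in enumerate(sorted(dist_b, key=dist_b.get))}
    return min(dist_a, key=lambda song: rank_a[song] + rank_b[song])
```

In this sketch, dividing by the number of matching frames rewards windows that agree with the query on many frames, and the rank-sum step combines two rankings without requiring their distances to share a scale.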
We tested the proposed methods on a small and a larger search space, namely two different TV drama OSTs and 1,005 popular songs, respectively. The experimental results show that the proposed methods outperform the baseline methods in both s...