DC Field | Value | Language |
---|---|---|
dc.contributor.author | Cho, Janghoon | ko |
dc.contributor.author | Yoo, Chang-Dong | ko |
dc.date.accessioned | 2015-06-25T06:40:34Z | - |
dc.date.available | 2015-06-25T06:40:34Z | - |
dc.date.created | 2015-05-06 | - |
dc.date.issued | 2015-05 | - |
dc.identifier.citation | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, v.23, no.5, pp.828 - 839 | - |
dc.identifier.issn | 2329-9290 | - |
dc.identifier.uri | http://hdl.handle.net/10203/199129 | - |
dc.description.abstract | This paper considers the underdetermined blind source separation (BSS) of convolutively mixed super-Gaussian signals that include speech, audio, and various other sparse signals. Here, the separation is performed in three steps. In the first and second steps, the mixing matrix and the sources at each time-frequency location are estimated by minimizing the Bayes risk (or the posterior risk) with squared loss. In the final third step, the permutation alignment is conducted by considering the correlation between adjacent spectral bins as in many conventional algorithms. To overcome any computationally intractable integrations involving a complex-valued super-Gaussian source prior, the posterior distribution of the sources is approximated as a mixture of super-Gaussians. The posterior means of the mixing matrix and the sources are obtained with Metropolis-Hastings within Gibbs sampling and the weighted sum of individual super-Gaussians, respectively. Overall, this approximation leads to a separation that is computationally lighter than and as accurate as the algorithm without the approximation. The simulation results of the synthetically generated data in a virtual room with reverberation show that the estimates of the mixing matrix in the first step and the sources in the second step are more accurate than the estimates from the state-of-the-art algorithms in terms of the mixing error ratio (MER) and the signal-to-distortion ratio (SDR). The experiment was also conducted with recorded data in a real room environment using a public benchmark dataset. Results show that the proposed algorithm gives a better performance compared to the state-of-the-art algorithms in terms of the SDR. | - |
dc.language | English | - |
dc.publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC | - |
dc.title | Underdetermined Convolutive BSS: Bayes Risk Minimization Based on a Mixture of Super-Gaussian Posterior Approximation | - |
dc.type | Article | - |
dc.identifier.wosid | 000352281500002 | - |
dc.identifier.scopusid | 2-s2.0-84954448989 | - |
dc.type.rims | ART | - |
dc.citation.volume | 23 | - |
dc.citation.issue | 5 | - |
dc.citation.beginningpage | 828 | - |
dc.citation.endingpage | 839 | - |
dc.citation.publicationname | IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | - |
dc.identifier.doi | 10.1109/TASLP.2015.2409778 | - |
dc.contributor.localauthor | Yoo, Chang-Dong | - |
dc.description.isOpenAccess | N | - |
dc.type.journalArticle | Article | - |
dc.subject.keywordAuthor | Bayesian estimation | - |
dc.subject.keywordAuthor | blind source separation (BSS) | - |
dc.subject.keywordAuthor | cocktail party problem | - |
dc.subject.keywordAuthor | underdetermined convolutive mixture | - |
dc.subject.keywordPlus | BLIND SOURCE SEPARATION | - |
dc.subject.keywordPlus | AUDIO SOURCE SEPARATION | - |
dc.subject.keywordPlus | TIME-FREQUENCY MASKING | - |
dc.subject.keywordPlus | OVERCOMPLETE REPRESENTATIONS | - |
dc.subject.keywordPlus | DOMAIN | - |