Human Auditory System inspired Neural Network for Text-Independent Speaker Recognition

The human auditory system achieves high speaker recognition rates in a wide range of environments. In the inner ear, a traveling wave is decomposed by the basilar membrane into multiple bandpass-filtered waves. Because of their structure, the inner hair cells convert these waves into neural firings only where the wave exceeds a certain value. This paper proposes a neural network architecture that uses a 1-D convolution layer to mimic the inner ear of the human auditory system. The first convolutional neural network (CNN) layer uses kernels similar to the sinc function to implement the bandpass filters of the basilar membrane. To model the inner hair cells, only outputs greater than zero are passed as inputs to the next CNN layer. Because discarding half of the input is inefficient, a second architecture is also proposed, in which the input to the second CNN layer is the concatenation of the outputs greater than zero and the outputs less than zero. Both architectures were trained, and their speaker recognition performance verified, using the TIMIT dataset. Both showed better recognition performance than a conventional speaker recognition system, and the second architecture outperformed the first.
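The front end described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the Hamming window, the kernel length, and the fixed band edges are all assumptions made for the example (in a trained network the band edges would normally be learned). The 16 kHz default matches the TIMIT sampling rate. Keeping the negative part with its original sign in `split_rectify` is also an assumption, since the abstract does not say whether it is flipped.

```python
import numpy as np

def sinc_bandpass_kernel(f_low, f_high, kernel_size, sample_rate):
    """Windowed-sinc band-pass FIR kernel (difference of two low-pass sincs)."""
    # Symmetric time axis in seconds, centred on zero.
    t = (np.arange(kernel_size) - (kernel_size - 1) / 2) / sample_rate
    # np.sinc(x) = sin(pi x) / (pi x); 2 f sinc(2 f t) is an ideal low-pass at cutoff f.
    high = 2 * f_high * np.sinc(2 * f_high * t)
    low = 2 * f_low * np.sinc(2 * f_low * t)
    # Hamming window (assumed) tapers the truncated sinc; divide by the
    # sample rate so the pass-band gain is close to one.
    return (high - low) * np.hamming(kernel_size) / sample_rate

def filterbank(signal, bands, kernel_size=201, sample_rate=16000):
    """Apply one band-pass kernel per (f_low, f_high) pair; returns (channels, time)."""
    return np.stack([
        np.convolve(signal,
                    sinc_bandpass_kernel(lo, hi, kernel_size, sample_rate),
                    mode="same")
        for lo, hi in bands
    ])

def half_wave(x):
    """First architecture: keep only outputs greater than zero."""
    return np.maximum(x, 0.0)

def split_rectify(x):
    """Second architecture: concatenate positive and negative parts along
    the channel axis, so no information is discarded."""
    return np.concatenate([np.maximum(x, 0.0), np.minimum(x, 0.0)], axis=0)

# Usage: two hypothetical bands applied to a 1 kHz test tone.
fs = 16000
tone = np.sin(2 * np.pi * 1000 * np.arange(1600) / fs)
bands = [(500, 1500), (3000, 4000)]
y = filterbank(tone, bands, sample_rate=fs)
first_arch_input = half_wave(y)       # shape (2, 1600)
second_arch_input = split_rectify(y)  # shape (4, 1600)
```

The second scheme doubles the channel count fed to the next layer, which is the price of keeping the half of the signal that plain rectification throws away.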
Publisher
International Institute of Noise Control Engineering
Issue Date
2020-08-24
Language
English
Citation

Inter-Noise 2020

URI
http://hdl.handle.net/10203/279256
Appears in Collection
ME-Conference Papers(학술회의논문)
