Recognizing continuous speech is necessary if computers are to offer users the most convenient means of communication. The major problem addressed in this study is the difficulty of segmenting continuous speech into primitive units for recognition. Exact segmentation into phonemes is almost impossible, especially when the speech is spoken without restrictions. Most recognizers search for the optimal positions of the units during recognition, using dynamic programming or shifting windows. Such processes usually take much more time than segmenting the speech before classification.
As a solution, we define a non-uniform unit and propose a segmentation method for it. A unit is defined as a segment that is cut out at stationary points of the speech and has a transition part in its middle. It is segmented using a spectral transition measure, without iterations or exhaustive search. A unit can contain an arbitrary number of phonemes, so it can absorb co-articulation effects that span several phonemes.
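The segmentation idea can be illustrated with a minimal sketch. It is not the method of this study, whose exact measure and cut rule are not given here. The sketch assumes one common formulation of a spectral transition measure (the mean squared regression slope of each feature over a short window) and assumes that "stationary" means the measure falls below a threshold. The names `transition_measure`, `stationary_cuts`, `segment_units`, `half_window` and `threshold` are hypothetical.

```python
from typing import List, Sequence, Tuple


def transition_measure(frames: Sequence[Sequence[float]],
                       half_window: int = 2) -> List[float]:
    """Mean squared regression slope of each feature around every frame.

    Assumed formulation: the slope of feature d at frame t is a
    least-squares fit over frames t-half_window .. t+half_window.
    """
    n = len(frames)
    dim = len(frames[0])
    denom = sum(k * k for k in range(-half_window, half_window + 1))
    measure = []
    for t in range(n):
        total = 0.0
        for d in range(dim):
            slope = 0.0
            for k in range(-half_window, half_window + 1):
                idx = min(max(t + k, 0), n - 1)  # clamp at the signal edges
                slope += k * frames[idx][d]
            slope /= denom
            total += slope * slope
        measure.append(total / dim)
    return measure


def stationary_cuts(measure: Sequence[float], threshold: float) -> List[int]:
    """Midpoint of every run of frames whose measure is below threshold."""
    cuts = []
    start = None
    for t, value in enumerate(measure):
        if value < threshold:
            if start is None:
                start = t
        elif start is not None:
            cuts.append((start + t - 1) // 2)  # run covers start .. t-1
            start = None
    if start is not None:
        cuts.append((start + len(measure) - 1) // 2)
    return cuts


def segment_units(measure: Sequence[float],
                  threshold: float) -> List[Tuple[int, int]]:
    """Pair consecutive cut points, so each unit spans one transition."""
    cuts = stationary_cuts(measure, threshold)
    return list(zip(cuts, cuts[1:]))
```

Each returned pair starts and ends in a stationary region and has the transition between them in its interior. The sketch makes a single pass over the frames, with no iteration or search over candidate boundaries.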
To show the effectiveness of the unit, we implement two recognition systems, one based on a knowledge-based approach and one on a connectionist approach. In the knowledge-based system, the rules for recognizing units are represented by frames that describe the dynamic structures of the units. Fuzzy concepts are used for speech recognition in two ways. First, fuzzy reasoning is applied to the recognition of the basic unit. Second, a fuzzy phoneme similarity relation is estimated for word spotting. We propose a method to evaluate the similarity of pairs of Korean phonemes based on the similarity of their articulatory features. The similarities of the places and manners of articulation of each phoneme pair are estimated, and the results are then combined using fuzzy operations to calculate the similarity of the phonemes.
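As a rough illustration of how place and manner similarities could be combined, the sketch below uses the fuzzy intersection (minimum) as the combining operation. This is an assumption, since the specific fuzzy operations are not stated here. The similarity grades in `PLACE_SIM` and `MANNER_SIM` are hypothetical values chosen for illustration only, and the phoneme labels are simplified romanizations rather than the inventory used in the study.

```python
from typing import Dict, FrozenSet, Tuple

# (place, manner) of articulation for a few consonants.
FEATURES: Dict[str, Tuple[str, str]] = {
    "p": ("bilabial", "stop"),
    "t": ("alveolar", "stop"),
    "k": ("velar", "stop"),
    "m": ("bilabial", "nasal"),
    "n": ("alveolar", "nasal"),
}

# Hypothetical similarity grades in [0, 1] for unordered feature pairs.
PLACE_SIM: Dict[FrozenSet[str], float] = {
    frozenset(("bilabial", "alveolar")): 0.5,
    frozenset(("alveolar", "velar")): 0.5,
    frozenset(("bilabial", "velar")): 0.2,
}
MANNER_SIM: Dict[FrozenSet[str], float] = {
    frozenset(("stop", "nasal")): 0.4,
}


def grade(table: Dict[FrozenSet[str], float], a: str, b: str) -> float:
    """Similarity grade of two feature values; identical values score 1.0."""
    if a == b:
        return 1.0
    return table.get(frozenset((a, b)), 0.0)


def phoneme_similarity(x: str, y: str) -> float:
    """Combine place and manner similarity with fuzzy AND (minimum)."""
    place_x, manner_x = FEATURES[x]
    place_y, manner_y = FEATURES[y]
    return min(grade(PLACE_SIM, place_x, place_y),
               grade(MANNER_SIM, manner_x, manner_y))
```

With the minimum as the combining operation, two phonemes are only as similar as their least similar articulatory feature. Another fuzzy operation, such as the algebraic product, could be substituted in `phoneme_similarity` without changing the rest of the sketch.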
In the neural network system, the segmentation and classification of the ...