Publication Date

2007

Document Type

Thesis

Committee Members

Brian Rigling (Advisor)

Degree Name

Master of Science in Engineering (MSEgr)

Abstract

Articulatory features describe the way in which the speech organs are used when producing speech sounds. Research has shown that incorporating this information into speech recognizers can lead to an improvement in system performance. The majority of previous work, however, has been limited to detecting articulatory features in a single language. In this thesis, Gaussian Mixture Models (GMMs) and Multi-Layer Perceptrons (MLPs) were used to detect articulatory features in English, German, Spanish, and Japanese. The outputs of the detectors were used to form the feature set for a Hidden Markov Model (HMM)-based phoneme recognizer. The best overall detection and recognition performance was obtained using MLPs with context. Compared to Mel-Frequency Cepstral Coefficient (MFCC)-based systems, the proposed feature sets yielded an increase of up to 4.39% correct and 5.37% accuracy when using monophone models, and an increase of up to 3.22% correct and 2.60% accuracy with triphone models. On a word recognition task, however, the MFCC systems performed better. Multilingual articulatory feature detectors were also created for all four languages using MLPs. An additional feature set was created using the multilingual detectors and evaluated on the same phoneme recognition task. Compared to the feature sets created with the language-dependent MLP detectors, the maximum decrease in system performance with monophone models was 1.44% correct and 1.72% accuracy on Japanese, and the maximum improvement in system performance with triphone models was 0.75% correct and 0.40% accuracy on Spanish. On a word recognition task, the feature sets created with the multilingual MLP detectors yielded a decrease of up to 3.75% correct and 6.01% accuracy. As a final experiment, two different procedures were investigated for combining the scores from the English GMM and MLP articulatory feature detectors. It was found that the detection performance for each articulatory feature can be improved by combining the scores from all GMM and MLP detectors.

Page Count

101

Department or Program

Department of Electrical Engineering

Year Degree Awarded

2007

