Recursive Estimation of Time-Varying Environments for Robust Speech Recognition
An EM-type recursive estimation algorithm is formulated in the DFT domain for joint estimation of the time-varying parameters of the distortion channel and additive noise from online degraded speech. Speech features are estimated on the fly from the posterior estimates of the short-time speech power spectra. Experiments were performed on speaker-independent continuous speech recognition using perceptually based linear prediction cepstral coefficients, log energy, and temporal regression coefficients as features. Speech data were taken from the TIMIT database and were degraded by a simulated time-varying channel and noise. The proposed recursive estimation significantly improved word recognition accuracy compared with both direct recognition using a baseline system and speech feature estimation using a batch EM algorithm.
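The contrast between recursive and batch estimation in a time-varying environment can be illustrated with a toy example. The sketch below is not the paper's EM algorithm: it uses plain exponentially weighted recursive averaging to track a drifting noise power in a single frequency bin and compares it with a single batch average. The function names, the forgetting factor of 0.95, the linear drift from 1.0 to 5.0, and the exponential model for the per-frame power observations are all assumptions made for illustration.

```python
import random


def recursive_mean(observations, forgetting=0.9):
    """Exponentially weighted running estimate.

    Update rule: est_t = forgetting * est_{t-1} + (1 - forgetting) * obs_t,
    initialised with the first observation.
    """
    estimates = []
    est = observations[0]
    for obs in observations:
        est = forgetting * est + (1.0 - forgetting) * obs
        estimates.append(est)
    return estimates


def batch_mean(observations):
    """Single average over the whole utterance (ignores any time variation)."""
    return sum(observations) / len(observations)


def simulate_noise_power(num_frames=400, seed=0):
    """Hypothetical drifting noise power in one frequency bin.

    The true power rises linearly from 1.0 to 5.0; each observed frame power
    is drawn from an exponential distribution whose mean is the true power.
    """
    rng = random.Random(seed)
    true_power = [1.0 + 4.0 * t / (num_frames - 1) for t in range(num_frames)]
    observed = [rng.expovariate(1.0 / p) for p in true_power]
    return true_power, observed


true_power, observed = simulate_noise_power()
recursive = recursive_mean(observed, forgetting=0.95)
batch = batch_mean(observed)

# Mean absolute tracking error of each estimator against the true power.
err_recursive = sum(abs(r - p) for r, p in zip(recursive, true_power)) / len(true_power)
err_batch = sum(abs(batch - p) for p in true_power) / len(true_power)

print(f"recursive tracking error: {err_recursive:.3f}")
print(f"batch estimate error:     {err_batch:.3f}")
```

In this toy setting the batch average settles near the middle of the drift and cannot follow it, whereas the recursive estimate trades a small lag for the ability to track the change. A forgetting factor closer to 1 smooths more but reacts more slowly.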
& Yen, K.
(2001). Recursive Estimation of Time-Varying Environments for Robust Speech Recognition. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 1, 225-228.