Abstract
This work presents a robust normalization technique that cascades a speech enhancement method with a feature vector normalization algorithm. Speech enhancement is provided by the Spectral Subtraction (SS) algorithm, which reduces the effect of additive noise by subtracting an estimate of the noise spectrum from the complete speech spectrum. In addition, an empirical feature vector normalization technique known as PD-MEMLIN (Phoneme-Dependent Multi-Environment Models based LInear Normalization) has also been shown to be effective. PD-MEMLIN models the clean and noisy spaces with Gaussian Mixture Models (GMMs) and estimates a set of linear compensation transformations that are used to clean the signal. The proper integration of both approaches is studied, and the final design, PD-MEEMLIN (Phoneme-Dependent Multi-Environment Enhanced Models based LInear Normalization), confirms and improves the effectiveness of both approaches. The results show that for highly degraded speech PD-MEEMLIN outperforms SS by between 11.4% and 34.5%, and PD-MEMLIN by between 11.7% and 24.84%. Furthermore, at moderate SNR, i.e. 15 or 20 dB, PD-MEEMLIN is as good as the PD-MEMLIN and SS techniques.
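The spectral subtraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `spectral_subtraction`, the frame sizes, the over-subtraction factor `alpha`, the spectral floor `beta`, and the use of the first few frames as a speech-free noise estimate are all assumptions made for illustration only.

```python
import numpy as np

def spectral_subtraction(noisy, noise_frames=5, frame_len=256, hop=128,
                         alpha=1.0, beta=0.01):
    """Illustrative magnitude spectral subtraction (hypothetical parameters).

    Assumes the first `noise_frames` frames contain noise only.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(noisy) - frame_len) // hop

    # Cut the signal into overlapping, windowed frames.
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])

    # Short-time spectrum: keep magnitude and phase separately.
    spec = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spec), np.angle(spec)

    # Noise spectrum estimate: mean magnitude over the leading frames.
    noise_mag = mag[:noise_frames].mean(axis=0)

    # Subtract the noise estimate from every frame; the floor beta * mag
    # prevents negative magnitudes.
    clean_mag = np.maximum(mag - alpha * noise_mag, beta * mag)

    # Resynthesize with the noisy phase and weighted overlap-add.
    clean_frames = np.fft.irfft(clean_mag * np.exp(1j * phase),
                                n=frame_len, axis=1)
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i in range(n_frames):
        start = i * hop
        out[start:start + frame_len] += clean_frames[i] * window
        norm[start:start + frame_len] += window ** 2

    # Normalize by the accumulated window energy where it is non-zero.
    nonzero = norm > 1e-8
    out[nonzero] /= norm[nonzero]
    return out
```

In this sketch the noise estimate is fixed after the leading frames; a recognizer front end would more plausibly track the noise over time, and the enhanced signal would then be passed to feature extraction before any feature vector normalization.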
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Hernández, I., García, P., Nolazco, J., Buera, L., Lleida, E. (2007). Robust Automatic Speech Recognition Using PD-MEEMLIN. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72849-8_1
DOI: https://doi.org/10.1007/978-3-540-72849-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72848-1
Online ISBN: 978-3-540-72849-8
eBook Packages: Computer Science, Computer Science (R0), Springer Nature Proceedings Computer Science
Keywords
- Speech Recognition
- Gaussian Mixture Model
- Automatic Speech Recognition
- Acoustic Model
- Speech Enhancement

