Speech enhancement by linear transformation and constrained optimization
Date of Award
Doctor of Philosophy (Ph.D.)
Electrical and Computer Engineering
First Committee Member
Michael Scordilis, Committee Chair
The effective enhancement of noise-degraded speech is one of the most challenging problems in speech processing. The difficulty arises because current methods tend to introduce audible distortion and artifacts into the processed signal. In this dissertation, three speech enhancement algorithms are proposed and evaluated. The objective is to enhance single-channel speech degraded by additive noise. All three methods apply a linear transformation to the noisy speech and perform enhancement in the transformed domain by constrained optimization. Method one uses linear prediction (LP) analysis to transform the noisy speech to the LP residual domain. The distortion of the clean speech residual is minimized subject to constraints on the power of the noise residual. Methods two and three both apply the short-time Fourier transform to map the noisy speech from the time domain to the frequency domain, where a spectral weighting function is derived by constrained optimization. Method two is perceptually motivated: the constraints require that both the speech distortion and the residual noise be suppressed below the masking thresholds, which incorporate both temporal and simultaneous masking effects. Method three enhances the harmonics of voiced speech by incorporating harmonic structure into the constraints. A noise-flooring parameter is included in the spectral gain to enhance the harmonics and to control the residual noise level. The performance of the proposed algorithms is evaluated in terms of modified Bark spectral distortion (MBSD) measures and ITU PESQ scores. Experimental results indicate that the proposed algorithms effectively improve speech quality for both white Gaussian and real-world colored noise.
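As a rough illustration of what a spectral weighting function obtained by constrained optimization with a noise floor can look like, the sketch below uses the generic textbook form, not the dissertation's own derivation. Minimizing the speech distortion (1 - G)^2 Px in one frequency bin, subject to a bound on the residual noise power G^2 Pn, gives the gain G = xi / (xi + mu), where xi = Px / Pn is the signal-to-noise ratio in that bin and mu is the Lagrange multiplier. The function name, the power-subtraction estimate of Px, and the default values of `mu` and `floor` are all assumptions made for this example.

```python
def constrained_gain(noisy_power, noise_power, mu=1.0, floor=0.1):
    """Per-bin spectral gain G = xi / (xi + mu), limited below by a noise floor.

    noisy_power: power of the noisy speech in each frequency bin
    noise_power: estimated noise power in each frequency bin
    mu:          Lagrange multiplier trading speech distortion against
                 residual noise (assumed value, illustration only)
    floor:       lower bound on the gain (assumed value, illustration only)
    """
    gains = []
    for py, pn in zip(noisy_power, noise_power):
        if pn <= 0.0:
            # No noise estimated in this bin: pass it through unchanged.
            gains.append(1.0)
            continue
        # Estimate the clean speech power by power subtraction (an assumption),
        # clamped at zero so the SNR estimate is never negative.
        xi = max(py - pn, 0.0) / pn
        g = xi / (xi + mu)
        # The floor keeps a small amount of noise instead of zeroing the bin.
        gains.append(max(g, floor))
    return gains


if __name__ == "__main__":
    # Three bins: high SNR, noise only, and a bin with no estimated noise.
    print(constrained_gain([4.0, 1.0, 2.0], [1.0, 1.0, 0.0]))
```

In this sketch a larger `mu` suppresses more noise at the cost of more speech distortion, and `floor` limits how far any bin can be attenuated. The perceptual and harmonic constraints described in the abstract would replace the fixed `mu` and `floor` with quantities that vary by frequency and by frame; that part is not shown here.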
Engineering, Electronics and Electrical
Jin, Wen, "Speech enhancement by linear transformation and constrained optimization" (2006). Dissertations from ProQuest. 2343.