Phase coherence in speech reconstruction for enhancement and coding applications
May 26, 1989
Conference Paper
Author:
Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Speech Processing 1, 23-26 May 1989, pp. 207-209.
R&D Area:
Summary
It has been shown that an analysis-synthesis system based on a sinusoidal representation leads to synthetic speech that is essentially perceptually indistinguishable from the original. A change in speech quality has been observed, however, when the phase relation of the sine waves is altered. This occurs in practice when sine waves are processed for speech enhancement (e.g., time-scale modification and reducing peak-to-RMS ratio) and for speech coding. This paper describes a zero-phase sinusoidal analysis-synthesis system which generates natural-sounding speech without the requirement of vocal tract phase. The method provides a basis for improving sound quality by providing different levels of phase coherence in speech reconstruction for time-scale modification, for a baseline system for coding, and for reducing the peak-to-RMS ration by dispersion.