Phase coherence in speech reconstruction for enhancement and coding applications

May 26, 1989

Conference Paper

Author:

…

Robert J. McAulay

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Speech Processing 1, 23-26 May 1989, pp. 207-209.

R&D Area:

Cyber Security and Information Sciences

R&D Group:

Artificial Intelligence Technology and Systems


Summary

It has been shown that an analysis-synthesis system based on a sinusoidal representation produces synthetic speech that is essentially perceptually indistinguishable from the original. A change in speech quality has been observed, however, when the phase relation among the sine waves is altered. This occurs in practice when the sine waves are processed for speech enhancement (e.g., time-scale modification and reduction of the peak-to-RMS ratio) and for speech coding. This paper describes a zero-phase sinusoidal analysis-synthesis system that generates natural-sounding speech without requiring the vocal tract phase. The method offers a basis for improving sound quality by providing different levels of phase coherence in speech reconstruction: for time-scale modification, for a baseline coding system, and for reducing the peak-to-RMS ratio by dispersion.
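To illustrate why the phase relation among sine waves affects the peak-to-RMS ratio, the following is a minimal sketch and not the system described in the paper. It assumes a toy signal of 30 equal-amplitude harmonics of 100 Hz sampled at 8 kHz, and it uses a quadratic phase law (in the style of Schroeder's low-peak-factor phases) as a stand-in for "dispersion". All names and parameter values (`synthesize`, `peak_to_rms`, `K`, `f0`) are hypothetical choices made for this example.

```python
import numpy as np

def synthesize(amps, freqs_hz, phases, fs, n):
    """Sum of cosines: x[m] = sum_k a_k * cos(2*pi*f_k*m/fs + phi_k)."""
    t = np.arange(n) / fs
    arg = 2.0 * np.pi * np.outer(freqs_hz, t) + phases[:, None]
    return (amps[:, None] * np.cos(arg)).sum(axis=0)

def peak_to_rms(x):
    """Crest factor: maximum absolute sample divided by RMS value."""
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))

# Assumed toy parameters (not taken from the paper).
fs = 8000          # sampling rate in Hz
f0 = 100.0         # fundamental frequency in Hz
K = 30             # number of harmonics (highest is 3 kHz, below Nyquist)
n = 800            # exactly 10 pitch periods

k = np.arange(1, K + 1)
amps = np.ones(K)
freqs = k * f0

# Zero phase: every sine wave peaks at the same instant.
zero_phase = np.zeros(K)
# Assumed dispersion: quadratic phase across harmonics.
dispersed_phase = -np.pi * k.astype(float) ** 2 / K

x_zero = synthesize(amps, freqs, zero_phase, fs, n)
x_disp = synthesize(amps, freqs, dispersed_phase, fs, n)

crest_zero = peak_to_rms(x_zero)
crest_disp = peak_to_rms(x_disp)

print(f"zero-phase peak-to-RMS: {crest_zero:.3f}")
print(f"dispersed  peak-to-RMS: {crest_disp:.3f}")
```

In this toy setup both signals have the same magnitude spectrum and the same RMS value, so any difference in the printed ratios comes from the phase relation alone. With zero phase all harmonics add coherently at the start of each pitch period, giving a ratio of sqrt(2K); spreading the phases lowers the peak without changing the spectrum's magnitudes.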