Publications

Phase coherence in speech reconstruction for enhancement and coding applications

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Speech Processing 1, 23-26 May 1989, pp. 207-209.

Summary

It has been shown that an analysis-synthesis system based on a sinusoidal representation leads to synthetic speech that is essentially perceptually indistinguishable from the original. A change in speech quality has been observed, however, when the phase relation of the sine waves is altered. This occurs in practice when sine waves are processed for speech enhancement (e.g., time-scale modification and reducing peak-to-RMS ratio) and for speech coding. This paper describes a zero-phase sinusoidal analysis-synthesis system which generates natural-sounding speech without the requirement of vocal tract phase. The method provides a basis for improving sound quality by providing different levels of phase coherence in speech reconstruction for time-scale modification, for a baseline system for coding, and for reducing the peak-to-RMS ratio by dispersion.

Mixed-phase deconvolution of speech based on a sine-wave model

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 2, 6-9 April 1987, pp. 649-652.

Summary

This paper describes a new method of deconvolving the vocal cord excitation and vocal tract system response. The technique relies on a sine-wave representation of the speech waveform and forms the basis of an analysis-synthesis method which yields synthetic speech essentially indistinguishable from the original. Unlike an earlier sinusoidal analysis-synthesis technique that used a minimum-phase system estimate, the approach in this paper generates a "mixed-phase" system estimate and thus an improved decomposition of excitation and system components. Since a mixed-phase system estimate is removed from the speech waveform, the resulting excitation residual is less dispersed than the previous sinusoidal-based excitation estimate or the more commonly used linear prediction residual. A method of time-varying linear filtering is given as an alternative to sinusoidal reconstruction, similar to conventional time-domain synthesis used in certain vocoders, but without the requirement of pitch and voicing decisions. Finally, speech modification with a mixed-phase system estimate is shown to be capable of more closely preserving waveform shape in time-scale and pitch transformations than the earlier approach.

Speech transformations based on a sinusoidal representation

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-34, No. 6, December 1986, pp. 1449-1464.

Summary

In this paper a new speech analysis/synthesis technique is presented which provides the basis for a general class of speech transformations including time-scale modification, frequency scaling, and pitch modification. These modifications can be performed with a time-varying change, permitting continuous adjustment of a speaker's fundamental frequency and rate of articulation. The method is based on a sinusoidal representation of the speech production mechanism which has been shown to produce synthetic speech that preserves the waveform shape and is perceptually indistinguishable from the original. Although the analysis/synthesis system was originally designed for single-speaker signals, it is also capable of recovering and modifying non-speech signals such as music, multiple speakers, marine biologic sounds, and speakers in the presence of interferences such as noise and musical backgrounds.

Speech analysis/synthesis based on a sinusoidal representation

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-34, No. 4, August 1986, pp. 744-754.

Summary

A sinusoidal model for the speech waveform is used to develop a new analysis/synthesis technique that is characterized by the amplitudes, frequencies, and phases of the component sine waves. These parameters are estimated from the short-time Fourier transform using a simple peak-picking algorithm. Rapid changes in the highly resolved spectral components are tracked using the concept of "birth" and "death" of the underlying sine waves. For a given frequency track a cubic function is used to unwrap and interpolate the phase such that the phase track is maximally smooth. This phase function is applied to a sine-wave generator, which is amplitude modulated and added to the other sine waves to give the final speech output. The resulting synthetic waveform preserves the general waveform shape and is essentially perceptually indistinguishable from the original speech. Furthermore, in the presence of noise the perceptual characteristics of the speech as well as the noise are maintained. In addition, it was found that the representation was sufficiently general that high-quality reproduction was obtained for a larger class of inputs including: two overlapping, superposed speech waveforms; music waveforms; speech in musical backgrounds; and certain marine biologic sounds. Finally, the analysis/synthesis system forms the basis for new approaches to the problems of speech transformations including time-scale and pitch-scale modification, and midrate speech coding.
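
The peak-picking step in this summary can be illustrated with a minimal sketch for a single analysis frame. It is not the authors' implementation: the function name `pick_sine_peaks`, the Hann window, the zero-padded FFT length, and the `max_peaks` limit are all assumptions made for the example, and the birth/death tracking and cubic phase interpolation are not shown.

```python
import numpy as np

def pick_sine_peaks(frame, sample_rate, n_fft=1024, max_peaks=10):
    """Return (frequency_hz, amplitude, phase_rad) for the strongest
    local maxima of one frame's windowed DFT magnitude.
    Hypothetical helper; window and FFT size are illustrative choices."""
    frame = np.asarray(frame, dtype=float)
    window = np.hanning(len(frame))
    spectrum = np.fft.rfft(frame * window, n_fft)
    magnitude = np.abs(spectrum)

    # Interior bins larger than both neighbours are candidate peaks.
    interior = magnitude[1:-1]
    is_peak = (interior > magnitude[:-2]) & (interior >= magnitude[2:])
    peak_bins = np.flatnonzero(is_peak) + 1

    # Keep the max_peaks largest candidates, reported in order of frequency.
    order = np.argsort(magnitude[peak_bins])[::-1][:max_peaks]
    kept = np.sort(peak_bins[order])

    # A real sinusoid of amplitude A contributes about A * sum(window) / 2
    # at its peak bin, so rescale the magnitude accordingly.
    scale = 2.0 / window.sum()
    return [(k * sample_rate / n_fft, magnitude[k] * scale, np.angle(spectrum[k]))
            for k in kept]
```

The frequency estimate here is quantized to the FFT bin spacing; a practical system would likely refine it (for example by interpolating around each peak), which this sketch leaves out.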

Frequency sampling of the short-time Fourier-transform magnitude for signal reconstruction

Published in:
J. Opt. Soc. Amer., Vol. 73, November 1983, pp. 1523-1526.

Summary

Unique recovery of a signal from the magnitude (modulus) of the Fourier transform has been of long-standing interest in image and optical processing in which Fourier-transform phase is lost or difficult to measure. We investigate an alternative problem of recovering a signal from the Fourier-transform magnitude of overlapping regions of the signal, i.e., from the short-time (or -space) Fourier-transform magnitude. Recently it was established that a discrete-time signal x(n) can be uniquely obtained under mild restrictions from its short-time Fourier-transform magnitude. In this paper we extend this result to the case when the short-time Fourier-transform magnitude is known at only one or two frequencies for each n. We also present a recursive algorithm for recovering a sequence from such samples and demonstrate the algorithm with an example.

Object detection by two-dimensional linear prediction

Published in:
MIT Lincoln Laboratory Report TR-632

Summary

An important component of any automated image analysis system is the detection and classification of objects. In this report, we consider the first of these problems where the specific goal is to detect anomalous areas (e.g., man-made objects) in textured backgrounds such as trees, grass, and fields of aerial photographs. Our detection algorithm relies on a significance test which adapts itself to the changing background in such a way that a constant false alarm rate is maintained. Furthermore, this test has a potentially practical implementation since it can be expressed in terms of the residuals of an adaptive two-dimensional linear predictor. The algorithm is demonstrated with both synthetic and real-world images.

Implementation of 2-D digital filters by iterative methods

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-30, No. 3, June 1982, pp. 473-87.

Summary

A two-dimensional (2-D) rational filter can be implemented by an iterative computation involving only finite-extent impulse response (FIR) filtering operations, provided a certain convergence criterion is met. In this paper, we generalize this procedure so that the convergence criterion is satisfied for any stable 2-D rational transfer function. One formulation which guarantees convergence invokes a relaxed form of the iterative computation along with prefiltering the numerator and denominator polynomials of the rational transfer function. This implementation may be applied with a frequency-varying relaxation parameter for increasing the rate of convergence. An alternative generalization uses several previously computed iterates, unlike our first modification which utilizes only the most recently computed iterate. This formulation can potentially guarantee convergence and also increase the convergence rate without the requirement of prefiltering. Another extension of the iterative computation incorporates constraints (e.g., positivity or finite extent) on the output of each iteration. Proof of convergence of such constrained iterations relies on the concept of a nonexpansive operator. In particular, the error introduced within the converging solution resulting from a finite-extent constraint is shown to satisfy a homogeneous partial difference equation. Finally, this error computation leads to an important link between our iterative implementation with constraints and an iterative solution to partial difference equations (e.g., Laplace's equation) with known boundary conditions.
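
The basic iteration that this summary generalizes can be sketched as follows, assuming the denominator of the rational transfer function is written as B = 1 - C so that the recursion is y_{k+1} = a * x + c * y_k, which converges when |C| < 1 at every frequency. The function names, the periodic boundary handling, and the iteration count are illustrative assumptions, not taken from the paper, and none of the generalizations described above (relaxation, prefiltering, multiple past iterates, constraints) are shown.

```python
import numpy as np

def fir2d_circular(kernel, image):
    """Apply a small 2-D FIR kernel with periodic boundaries.
    Periodicity is an assumption made here so the result can be
    checked against a DFT-domain solution."""
    out = np.zeros_like(image, dtype=float)
    for i in range(kernel.shape[0]):
        for j in range(kernel.shape[1]):
            # np.roll by (i, j) delays the image, i.e. out[m, n] += k[i, j] * image[m - i, n - j].
            out += kernel[i, j] * np.roll(image, shift=(i, j), axis=(0, 1))
    return out

def iterate_rational_filter(a, c, x, n_iter=200):
    """Approximate y = (A / (1 - C)) x using only FIR operations:
    y_{k+1} = a * x + c * y_k, starting from y_0 = 0."""
    forced = fir2d_circular(a, x)      # numerator term, computed once
    y = np.zeros_like(forced)
    for _ in range(n_iter):
        y = forced + fir2d_circular(c, y)
    return y
```

With periodic boundaries the error in each DFT bin shrinks by a factor of |C| per pass, so the slowest-converging frequency sets the number of iterations needed.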

Signal reconstruction from the short-time Fourier transform magnitude

Published in:
IEEE-ASSP Int. Conf., 2 May 1982.

Summary

In this paper, a signal is shown to be uniquely represented by the magnitude of its short-time Fourier transform (STFT) under mild restrictions on the signal and the analysis window of the STFT. Furthermore, various algorithms are developed which reconstruct a signal from appropriate samples of the STFT magnitude. Several of the algorithms can also be used to obtain signal estimates from the processed STFT magnitude, which generally does not have a valid short-time structure. These algorithms are successfully applied to the time-scale modification and noise reduction problems in speech processing. Finally, the results presented here have similar potential for other application areas, including those with multidimensional signals.

Iterative techniques for minimum phase signal reconstruction from phase or magnitude

Published in:
IEEE Trans. on Acoustics, Speech & Signal Processing, Vol. ASSP-29, No. 6, Dec. 1981, pp. 1187-1193.

Summary

In this paper, we develop iterative algorithms for reconstructing a minimum phase sequence from the phase or magnitude of its Fourier transform. These iterative solutions involve repeatedly imposing a causality constraint in the time domain and incorporating the known phase or magnitude function in the frequency domain. This approach is the basis of a new means of computing the Hilbert transform of the log-magnitude or phase of the Fourier transform of a minimum phase sequence which does not require phase unwrapping. Finally, we discuss the potential use of this iterative computation in determining samples of the unwrapped phase of a mixed phase sequence.
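
For context, the Hilbert-transform relation between log-magnitude and phase that this summary refers to is conventionally computed without iteration by folding the real cepstrum. The sketch below shows that conventional homomorphic computation only; it is not the iterative algorithm of the paper. The function name is hypothetical, and the sketch assumes a full-length, even-sized DFT magnitude with no zeros on the unit circle.

```python
import numpy as np

def minimum_phase_from_magnitude(magnitude, n_out):
    """Conventional cepstral (non-iterative) reference, not the paper's
    iterative method: recover the first n_out samples of the minimum-phase
    sequence whose DFT magnitude is `magnitude` (even length, strictly positive)."""
    magnitude = np.asarray(magnitude, dtype=float)
    n_fft = len(magnitude)

    # Real cepstrum: inverse DFT of the log-magnitude.
    cepstrum = np.fft.ifft(np.log(magnitude)).real

    # Fold the cepstrum onto non-negative time. In the frequency domain this
    # adds the Hilbert transform of the log-magnitude as the phase.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0

    min_phase_spectrum = np.exp(np.fft.fft(cepstrum * fold))
    return np.fft.ifft(min_phase_spectrum).real[:n_out]
```

The DFT length should be much larger than the sequence so that time aliasing of the cepstrum is negligible; the result is only as accurate as that approximation.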

Recursive two-dimensional signal reconstruction from linear system input and output magnitudes

Published in:
Proc. IEEE, Vol. 69, No. 5, May 1981, pp. 667-668.

Summary

A recursive algorithm is presented for reconstructing a two-dimensional complex signal from its magnitude and the magnitude of the output of a known linear shift-invariant system whose input is the desired signal. The recursion has a simple geometric interpretation, and is easily extended to causal, shift-varying systems.