Publications

Time-scale modification of complex acoustic signals

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 1, Plenary, Special, Audio, Underwater Acoustics, VLSI, Neural Networks, 27-30 April 1993, pp. 213-216.

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The technique constrains the modified signal to take on a specified spectral characteristic while imposing a time-scaled version of the original temporal envelope. Both full-band and sub-band representations of the temporal envelope are considered. In the full-band case, the modified signal is obtained by appropriate selection of its Fourier transform phase. In the sub-band case, using locations of maxima in the sub-band temporal envelopes, the phase of each bandpass signal is formed to preserve "events" in the envelope of the composite signal. The approach is applied to synthetic and actual short-duration acoustic signals consisting of closely-spaced and overlapping sequential time components.

Time-scale modification with temporal envelope invariance

Published in:
Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 17-20 October 1993, pp. 127-130.

Summary

A new approach is introduced for time-scale modification of short-duration complex acoustic signals to improve their audibility. The method preserves the time-scaled temporal envelope of a signal and for enhancement capitalizes on the perceptual importance of a signal's temporal structure. The basis for the approach is a sub-band representation whose channel phases are controlled to shape the temporal envelope of the time-scaled signal. The phase control is derived from locations of events which occur within filterbank outputs. A frame-based generalization of the method imposes phase consistency across consecutive synthesis frames. The approach is applied to synthetic and actual short-duration acoustic signals consisting of closely-spaced and overlapping sequential time components.

Shape invariant time-scale and pitch modification of speech

Published in:
IEEE Trans. Signal Process., Vol. 40, No. 3, March 1992, pp. 497-510.

Summary

The simplified linear model of speech production predicts that when the rate of articulation is changed, the resulting waveform takes on the appearance of the original, except for a change in the time scale. The goal of this paper is to develop a time-scale modification system that preserves this shape-invariance property during voicing. This is done using a version of the sinusoidal analysis-synthesis system that models and independently modifies the phase contributions of the vocal tract and vocal cord excitation. An important property of the system is its capability of performing time-varying rates of change. Extensions of the method are applied to fixed and time-varying pitch modification of speech. The sine-wave analysis-synthesis system also allows for shape-invariant joint time-scale and pitch modification, and allows for the adjustment of the time scale and pitch according to speech characteristics such as the degree of voicing.

Low-rate speech coding based on the sinusoidal model

Published in:
Chapter 6 in Advances in Speech Signal Processing, Marcel Dekker, Inc., 1992, pp. 165-208.

Summary

One approach to the problem of representation of speech signals is to use the speech production model in which speech is viewed as the result of passing a glottal excitation waveform through a time-varying linear filter that models the resonant characteristics of the vocal tract. In many applications it suffices to assume that the glottal excitation can be in one of two possible states corresponding to voiced or unvoiced speech. In attempts to design high-quality speech coders at the midband rates, generalizations of the binary excitation model have been developed. One such approach is multipulse (Atal and Remde, 1982) which uses more than one pitch pulse to model voiced speech and a possibly random set of pulses to model unvoiced speech. Code excited linear prediction (CELP) (Schroeder and Atal, 1985) is another representation which models the excitation as one of a number of random sequences or "codewords" superimposed on periodic pitch pulses. In this chapter the goal is also to generalize the model for the glottal excitation; but instead of using impulses as in multipulse or random sequences as in CELP, the excitation is assumed to be composed of sinusoidal components of arbitrary amplitudes, frequencies, and phases (McAulay and Quatieri, 1986).

Speech nonlinearities, modulations, and energy operators

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 14-17 May 1991, pp. 421-424.

Summary

In this paper, we investigate an AM-FM model for representing modulations in speech resonances. Specifically, we propose a frequency modulation (FM) model for the time-varying formants whose amplitude varies as the envelope of an amplitude-modulated (AM) signal. To detect the modulations we apply the energy operator $\psi(x) = (\dot{x})^2 - x\ddot{x}$ and its discrete counterpart. We found that $\psi$ can approximately track the envelope of AM signals, the instantaneous frequency of FM signals, and the product of these two functions in the general case of AM-FM signals. Several experiments are reported on the applications of this AM-FM modeling to speech signals, bandpass filtered via Gabor filtering.
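
The summary does not spell out the discrete counterpart of the energy operator. A minimal sketch, assuming the form commonly given in the literature, x[n]^2 - x[n-1]x[n+1], is shown below. The function name `energy_operator` and the test-tone values are illustrative choices, not taken from the paper.

```python
import numpy as np

def energy_operator(x):
    """Discrete energy operator (assumed form): x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # Defined for interior samples only, so the output is two samples shorter.
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# Hypothetical test tone: amplitude A, digital frequency omega (rad/sample).
A, omega = 0.8, 0.3
n = np.arange(200)
psi = energy_operator(A * np.cos(omega * n))

# For a pure cosine the output is constant and equals (A * sin(omega))^2,
# i.e. roughly the squared product of amplitude and frequency for small omega.
expected = (A * np.sin(omega)) ** 2
```

For a constant-amplitude, constant-frequency tone the output is flat, which is why the operator can follow slow changes in either quantity.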

Peak-to-rms reduction of speech based on a sinusoidal model

Published in:
IEEE Trans. Signal Process., Vol. 39, No. 2, February 1991, pp. 273-288.

Summary

In a number of applications, a speech waveform is processed using phase dispersion and amplitude compression to reduce its peak-to-rms ratio so as to increase loudness and intelligibility while minimizing perceived distortion. In this paper, a sinusoidal-based analysis/synthesis system is used to apply a radar design solution to the problem of dispersing the phase of a speech waveform. Unlike conventional methods of phase dispersion, this solution technique adapts dynamically to the pitch and spectral characteristics of the speech, while maintaining the original spectral envelope. The solution can also be used to drive the sine-wave amplitude modification for amplitude compression, and is coupled to the desired shaping of the speech spectrum. The new dispersion solution, when integrated with amplitude compression, results in a significant reduction in the peak-to-rms ratio of the speech waveform with acceptable loss in quality. Application of a real-time prototype sine-wave preprocessor to AM radio broadcasting is described.
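
To illustrate why phase dispersion lowers the peak-to-rms ratio, the sketch below compares a zero-phase harmonic sum with one whose phases are spread out. The quadratic phase used here is only a familiar textbook example and is not claimed to be the paper's radar-based solution; the harmonic count and function names are likewise assumptions.

```python
import numpy as np

def peak_to_rms(x):
    """Peak-to-rms ratio (crest factor) of a real waveform."""
    x = np.asarray(x, dtype=float)
    return np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))

def harmonic_sum(phases, samples_per_period=1024):
    """One period of a sum of equal-amplitude harmonics with the given phases."""
    k = np.arange(1, len(phases) + 1)[:, None]          # harmonic numbers
    t = np.arange(samples_per_period)[None, :] / samples_per_period
    return np.cos(2 * np.pi * k * t + np.asarray(phases)[:, None]).sum(axis=0)

num_harmonics = 32                                       # hypothetical value
k = np.arange(1, num_harmonics + 1)
zero_phase = np.zeros(num_harmonics)
# Quadratic phase, chosen here only as a familiar dispersion example.
quadratic_phase = -np.pi * k * (k - 1) / num_harmonics

ratio_zero = peak_to_rms(harmonic_sum(zero_phase))
ratio_dispersed = peak_to_rms(harmonic_sum(quadratic_phase))
```

Both waveforms have the same magnitude spectrum, so any difference between `ratio_zero` and `ratio_dispersed` comes from phase alone.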

Short-time signal representation by nonlinear difference equations

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 3, Digital Signal Processing, 3-6 April 1990, pp. 1551-1554.

Summary

The solution of a nonlinear difference equation can take on complicated deterministic behavior which appears to be random for certain values of the equation's coefficients. Due to the sensitivities to initial conditions of the output of such "chaotic" systems, it is difficult to duplicate the waveform structure by parameter analysis and waveform synthesis techniques. In this paper, methods are investigated for short-time analysis and synthesis of signals from a class of second-order difference equations with a cubic nonlinearity. In analysis, two methods are explored for estimating equation coefficients: (1) prediction error minimization (a linear estimation problem) and (2) waveform error minimization (a nonlinear estimation problem). In the latter case, which improves on the prediction error solution, an iterative analysis-by-synthesis method is derived which allows as free variables initial conditions, as well as equation coefficients. Parameter estimates from these techniques are used in sequential short-time synthesis procedures. Possible application to modeling "quasi-periodic" behavior in speech waveforms is discussed.
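
Prediction error minimization is a linear problem because the model is linear in its coefficients even though it is nonlinear in the signal. The sketch below assumes one possible second-order equation with a cubic term, x[n] = a1 x[n-1] + a2 x[n-2] + c x[n-1]^3; the summary does not give the exact equation, so this form, the coefficient values, and the function names are all hypothetical.

```python
import numpy as np

def simulate(a1, a2, c, x0, x1, num_samples):
    """Run x[n] = a1*x[n-1] + a2*x[n-2] + c*x[n-1]**3 (assumed form)."""
    x = np.empty(num_samples)
    x[0], x[1] = x0, x1
    for n in range(2, num_samples):
        x[n] = a1 * x[n - 1] + a2 * x[n - 2] + c * x[n - 1] ** 3
    return x

def fit_prediction_error(x):
    """Least-squares coefficients minimizing the one-step prediction error."""
    x = np.asarray(x, dtype=float)
    # Each row holds the regressors for predicting x[n]: x[n-1], x[n-2], x[n-1]^3.
    regressors = np.column_stack([x[1:-1], x[:-2], x[1:-1] ** 3])
    coeffs, *_ = np.linalg.lstsq(regressors, x[2:], rcond=None)
    return coeffs

true_coeffs = (1.8, -1.0, -0.1)   # hypothetical values giving a bounded orbit
frame = simulate(*true_coeffs, x0=0.5, x1=0.5, num_samples=200)
estimated = fit_prediction_error(frame)
```

The waveform error approach described in the summary would instead compare a resynthesized frame with the original, which makes the problem nonlinear in the coefficients and initial conditions.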

Noise reduction using a soft-decision sine-wave vector quantizer

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 2, Speech Processing 2; VLSI, Audio and Electroacoustics, 3-6 April 1990, pp. 821-824.

Summary

The need for noise reduction arises in speech communication channels, such as ground-to-air transmission and ground-based cellular radio, to improve vocoder quality and speech recognition accuracy. In this paper, noise reduction is performed in the context of a high-quality harmonic zero-phase sine-wave analysis/synthesis system which is characterized by sine-wave amplitudes, a voicing probability, and a fundamental frequency. Least-squared error estimation of a harmonic sine-wave representation leads to a "soft decision" template estimate consisting of sine-wave amplitudes and a voicing probability. The least-squares solution is modified to use template-matching with "nearest neighbors." The reconstruction is improved by using the modified least-squares solution only in spectral regions with low signal-to-noise ratio. The results, although preliminary, provide evidence that harmonic zero-phase sine-wave analysis/synthesis, combined with effective estimation of sine-wave amplitudes and probability of voicing, offers a promising approach to noise reduction.

An approach to co-channel talker interference suppression using a sinusoidal model for speech

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. 38, No. 1, January 1990, pp. 56-59.

Summary

This paper describes a new approach to co-channel talker interference suppression based on a sinusoidal representation of speech. The technique fits a sinusoidal model to additive vocalic speech segments such that the least mean-squared error between the model and the summed waveforms is obtained. Enhancement is achieved by synthesizing a waveform from the sine waves attributed to the desired speaker. Least-squares estimation is applied to obtain sine-wave amplitudes and phases of both talkers, based on either a priori sine-wave frequencies or a priori fundamental frequency contours. When the frequencies of the two waveforms are closely spaced, the performance is significantly improved by exploiting the time evolution of the sinusoidal parameters across multiple analysis frames. The least-squared error approach is also extended, under restricted conditions, to estimate fundamental frequency contours of both speakers from the summed waveforms. The results obtained, although limited in their scope, provide evidence that the sinusoidal analysis/synthesis model with effective parameter estimation techniques offers a promising approach to the problem of co-channel talker interference suppression over a range of conditions.
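
When the sine-wave frequencies are known in advance, estimating amplitudes and phases reduces to an ordinary linear least-squares fit. The sketch below shows that single-frame case on a synthetic two-talker mixture. All signal values, frequencies, and function names are hypothetical, and the multi-frame and pitch-estimation extensions mentioned in the summary are not attempted.

```python
import numpy as np

def fit_sinusoids(x, freqs):
    """Least-squares amplitudes and phases for known frequencies (rad/sample).

    Model: x[n] = sum_k amp_k * cos(freqs[k] * n + phase_k).
    """
    n = np.arange(len(x))[:, None]
    freqs = np.asarray(freqs, dtype=float)[None, :]
    # Cosine and sine columns make the problem linear in the unknowns.
    basis = np.hstack([np.cos(freqs * n), np.sin(freqs * n)])
    coeffs, *_ = np.linalg.lstsq(basis, np.asarray(x, dtype=float), rcond=None)
    c, s = np.split(coeffs, 2)
    # amp*cos(wn + phase) = amp*cos(phase)*cos(wn) - amp*sin(phase)*sin(wn)
    return np.hypot(c, s), np.arctan2(-s, c)

# Hypothetical two-talker frame: each talker is a short harmonic series.
n = np.arange(400)
freqs_a = 0.10 * np.arange(1, 4)          # desired talker
freqs_b = 0.13 * np.arange(1, 4)          # interfering talker
amps_a, phases_a = np.array([1.0, 0.5, 0.25]), np.array([0.3, -1.0, 2.0])
amps_b, phases_b = np.array([0.8, 0.6, 0.2]), np.array([1.5, 0.2, -0.7])

def synth(amps, phases, freqs):
    """Sum of sinusoids evaluated on the shared sample index n."""
    return (amps[:, None] * np.cos(freqs[:, None] * n + phases[:, None])).sum(axis=0)

mixture = synth(amps_a, phases_a, freqs_a) + synth(amps_b, phases_b, freqs_b)
amps, phases = fit_sinusoids(mixture, np.concatenate([freqs_a, freqs_b]))
# Resynthesize only the sine waves attributed to the desired talker.
recovered_a = synth(amps[:3], phases[:3], freqs_a)
```

The fit becomes ill-conditioned as frequencies from the two talkers approach each other within a frame, which is the situation the summary addresses by using multiple analysis frames.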

Far-echo cancellation in the presence of frequency offset (full duplex modem)

Published in:
IEEE Trans. Commun., Vol. 37, No. 6, June 1989, pp. 635-644.

Summary

In this paper, we present a design for a full-duplex echo-cancelling data modem based on a combined adaptive reference algorithm and adaptive channel equalizer. The adaptive reference algorithm has the advantage that interference to the echo canceller caused by the far-end signal can be eliminated by subtracting an estimate of the far-end signal based on receiver decisions. This technique provides a new approach for full-duplex far-echo cancellation in which the far echo can be cancelled in spite of carrier frequency offset. To estimate the frequency offset, the system uses a separate receiver structure for the far echo which provides equalization of the far-echo channel and tracks the frequency offset in the far echo. The feasibility of the echo-cancelling algorithms is demonstrated by computer simulation with realistic channel distortions and with 4800 bits/s data transmission at which rate frequency offset in the far echo becomes important.