Publications

A phrase recognizer using syllable-based acoustic measurements

Published in:
IEEE Trans. Acoust. Speech Signal Process., Vol. ASSP-26, No. 5, October 1978, pp. 409-418.

Summary

A system for the recognition of spoken phrases is described. The recognizer assumes that the input utterance contains one of a known set of allowable phrases, which may be spoken within a longer carrier sentence. Analysis is performed on a syllable-by-syllable basis with only the strong syllables considered in the recognition process. Each strong syllable is represented in terms of a set of distinguishing acoustic measurements taken at time points in and around the syllable nucleus. Phrases are represented as sequences of strong syllables. All parameters used in recognition are derived from LPC coefficients. Input speech is limited to 3.3 kHz upper frequency. Recognition is completed within 1-3 s after the utterance is spoken. An interactive training facility allows flexible composition of key phrase sets. Testing was performed for a number of phrase sets each containing ten or fewer phrases, and included equal numbers of talkers used in training and talkers not used in training. Average phrase recognition accuracy was 95 percent when parameters were derived from unquantized (i.e., 16 bit) LPC coefficients and 90 percent when the LPC coefficients were transmitted to the recognizer across the ARPA network at 3500 bits/s. The recognizer has been incorporated into a user interface system where the parameters required to set up a point-to-point ARPANET voice connection can be established remotely by voice.

A linear prediction vocoder with voice excitation

Published in:
Proc. EASCON, 29 September - 1 October 1975, pp. 30-a-30-g.

Summary

A speech bandwidth compression system, which employs voice excitation in conjunction with a Linear Predictive Coding (LPC) parameterization of the vocal tract filter, is described. To generate the excitation signal, the transmitted speech baseband is broadened at the receiver with a nonlinear distorter, and spectrally flattened by means of an adaptive inverse filter whose parameters are obtained through LPC analysis of the distorted baseband. The voice-excited linear prediction (VELP) system has been implemented in real time on the Fast Digital Processor at Lincoln Laboratory. A detailed description of an 8 kbps version of VELP is given. VELP offers promise as a good quality, medium rate speech compression system which, by avoiding the pitch problem, performs relatively well for telephone quality input speech.

A system for acoustic-phonetic analysis of continuous speech

Published in:
Proc. IEEE Symp. on Speech Recognition, 15-19 April 1974, pp. 54-67.

Summary

A system for acoustic-phonetic analysis of continuous speech is being developed to serve as part of an automatic speech understanding system. The acoustic system accepts the speech waveform as an input and produces as output a string of phoneme-like units referred to as acoustic phonetic elements (APEL's). This paper should be considered as a progress report, since the system is still under active development. The initial phase of the acoustic analysis consists of signal processing and parameter extraction, and includes spectrum analysis via linear prediction, computation of a number of parameters of the spectrum, and fundamental frequency extraction. This is followed by a preliminary segmentation of the speech into a few broad acoustic categories and formant tracking during vowel-like segments. The next phase consists of more detailed segmentation and classification intended to meet the needs of subsequent linguistic analysis. The preliminary segmentation and segment classification yield the following categories: vowel-like sound; volume dip within vowel-like sound; fricative-like sound; stop consonants, including silence or voice bar, and associated burst. These categories are produced by a decision tree based upon energy measurements in selected frequency bands, derivatives and ratios of these measurements, a voicing detector, and a few editing rules.
The more detailed classification algorithms include: 1) detection and identification of some diphthongs, semivowels, and nasals, through analysis of formant motions, positions, and amplitudes; 2) a vowel identifier, which determines three ranked choices for each vowel based on a comparison of the formant positions in the detected vowel segment to stored formant positions in a speaker-normalized vowel table; 3) a fricative identifier, which employs measurement of relative spectral energies in several bands to group the fricative segments into phoneme-like categories; 4) stop consonant classification based on the properties of the plosive burst. The above algorithms have been tested on a substantial corpus of continuous speech data. Performance results, as well as detailed descriptions of the algorithms are given.

Effects of finite register length in digital filtering and the fast Fourier transform

Published in:
Proc. IEEE, Vol. 60, No. 8, August 1972, pp. 957-976.

Summary

When digital signal processing operations are implemented on a computer or with special-purpose hardware, errors and constraints due to finite word length are unavoidable. The main categories of finite register length effects are errors due to A/D conversion, errors due to roundoffs in the arithmetic, constraints on signal levels imposed by the need to prevent overflow, and quantization of system coefficients. The effects of finite register length on implementations of linear recursive difference equation digital filters, and the fast Fourier transform (FFT), are discussed in some detail. For these algorithms, the differing quantization effects of fixed point, floating point, and block floating point arithmetic are examined and compared. The paper is intended primarily as a tutorial review of a subject which has received considerable attention over the past few years. The groundwork is set through a discussion of the relationship between the binary representation of numbers and truncation or rounding, and a formulation of a statistical model for arithmetic roundoff. The analyses presented here are intended to illustrate techniques of working with particular models. Results of previous work are discussed and summarized when appropriate. Some examples are presented to indicate how the results developed for simple digital filters and the FFT can be applied to the analysis of more complicated systems which use these algorithms as building blocks.
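The statistical roundoff model mentioned in this summary can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the standard textbook model in which each fixed-point rounding adds independent noise of variance q²/12 (q being the quantization step), so that a first-order recursive filter y[n] = a·y[n-1] + x[n] has a predicted output noise variance of (q²/12)/(1 - a²). The function names, the coefficient value, and the register length are all illustrative choices, and overflow is ignored.

```python
import random


def quantize(v: float, bits: int) -> float:
    """Round v to the nearest multiple of q = 2**-bits (fixed-point rounding)."""
    q = 2.0 ** -bits
    return round(v / q) * q


def predicted_noise_variance(a: float, bits: int) -> float:
    """Assumed model: white rounding noise of variance q^2/12 filtered by 1/(1 - a z^-1)."""
    q = 2.0 ** -bits
    return (q * q / 12.0) / (1.0 - a * a)


def measured_noise_variance(a: float, bits: int, n_samples: int = 20000, seed: int = 0) -> float:
    """Run an ideal and a rounded first-order filter side by side and average the squared error."""
    rng = random.Random(seed)
    y_ideal = 0.0
    y_fixed = 0.0
    acc = 0.0
    for _ in range(n_samples):
        # The same quantized input feeds both filters, so only the product rounding differs.
        x = quantize(rng.uniform(-1.0, 1.0), bits)
        y_ideal = a * y_ideal + x
        y_fixed = quantize(a * y_fixed, bits) + x  # round the product back to the register length
        err = y_fixed - y_ideal
        acc += err * err
    return acc / n_samples


if __name__ == "__main__":
    a, bits = 0.9, 12
    print("predicted:", predicted_noise_variance(a, bits))
    print("measured: ", measured_noise_variance(a, bits))
```

Comparing the two printed values gives a rough check of how well the simple model describes the simulated filter; moving the pole closer to the unit circle (a closer to 1) should raise both.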

A theory of multiple antenna AMTI radar

Published in:
MIT Lincoln Laboratory Report TN-1971-21

Summary

This note presents a detailed mathematical analysis of a multiple-antenna AMTI radar system capable of detecting moving targets over a significantly wider velocity range than is achievable with a single-antenna system. The general system configuration and signaling strategy is defined, and relationships among system and signaling parameters are investigated. A deterministic model for the target return and a statistical model for the clutter and noise returns are obtained, and an optimum processor for target detection is derived. A performance measure applicable to a large class of processors, including the optimum processor, is defined and some of its analytical properties investigated. It is shown that an easily implementable sub-optimum processor, based on two-dimensional spectral analysis, performs nearly as well as the optimum processor. The resolution and ambiguity properties of this sub-optimum processor are studied and a detailed numerical investigation of system performance is presented, including a study of how performance varies with basic system parameters such as the number of antennas.

Predictive coding in a homomorphic vocoder

Published in:
IEEE Trans. Audio Electroacoust., Vol. AU-19, No. 3, September 1971, pp. 243-248.

Summary

Application of a type of predictive coding to the channel signals of a homomorphic vocoder has produced sizable bit rate reductions. With only slight degradation in speech quality, reduction (for the spectral envelope information) from 7800 to 4000 bits/s was achieved. A technique for obtaining the formant frequencies from the predictive coding parameters is described; this approach promises further bit rate reductions. As a byproduct of this study of predictive coding, direct and cascade form speech synthesizers are compared on the basis of differing quantization effects.

Quantization effects in digital filters

Published in:
MIT Lincoln Laboratory Report TR-468

Summary

When a digital filter is implemented on a computer or with special-purpose hardware, errors and constraints due to finite word length are unavoidable. These quantization effects must be considered, both in deciding what register length is needed for a given filter implementation and in choosing between several possible implementations of the same filter design, which will be affected differently by quantization. Quantization effects in digital filters can be divided into four main categories: quantization of system coefficients, errors due to analog-digital (A-D) conversion, errors due to roundoffs in the arithmetic, and a constraint on signal level due to the requirement that overflow be prevented in the computation. The effects of these errors and constraints will vary, depending on the type of arithmetic used. Fixed point, floating point, and block floating point are three alternate types of arithmetic often employed in digital filtering. A very large portion of the computation performed in digital filtering is composed of two basic algorithms: the first- or second-order, linear, constant coefficient, recursive difference equation; and computation of the discrete Fourier transform (DFT) by means of the fast Fourier transform (FFT). These algorithms serve as building blocks from which the most complicated digital filtering systems can be constructed. The effects of quantization on implementations of these basic algorithms are studied in some detail. Sensitivity formulas are presented for the effects of coefficient quantization on the poles of simple recursive filters. The mean-squared error in a computed DFT, due to coefficient quantization in the FFT, is estimated. For both recursions and the FFT, the differing effects of fixed and floating point coefficients are investigated. Statistical models for roundoff errors and A-D conversion errors, and linear system noise theory, are employed to estimate output noise variance in simple recursive filters and in the FFT.
By considering the overflow constraint in conjunction with these noise analyses, output noise-to-signal ratios are derived. Noise-to-signal ratio analyses are carried out for fixed, floating, and block floating point arithmetic, and the results are compared. All the noise analyses are based on simple statistical models for roundoff errors (and A-D conversion errors). Of course, somewhat different models are applied for the different types of arithmetic. These models cannot in general be verified theoretically, and thus one must resort to experimental noise measurements to support the predictions obtained via the models. A good deal of experimental data on noise measurements is presented here, and the empirical results are generally in good agreement with the predictions based on the statistical models. The ideas developed in the study of simple recursive filters and the FFT are applied to analyze quantization effects in two more complicated types of digital filters: frequency sampling and FFT filters. The frequency sampling filter is realized by means of a comb filter and a bank of second-order recursive filters; while an FFT filter implements a convolution via an FFT, a multiplication in the frequency domain, and an inverse FFT. Any finite duration impulse response filter can be realized by either of these methods. The effects of coefficient quantization, roundoff noise, and the overflow constraint are investigated for these two filter types. Through use of a specific example, realizations of the same filter design, by means of the frequency sampling and FFT methods, are compared on the basis of differing quantization effects.
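The FFT filter structure described in this summary (forward FFT, multiplication in the frequency domain, inverse FFT) can be sketched in a few lines. The code below is an illustration in double-precision arithmetic only, not the report's implementation: the function names, the zero-padding to a power of two, and the recursive radix-2 FFT are assumptions made for the example, and no quantization effects are modeled.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1.0 if inverse else -1.0
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_filter(x, h):
    """Convolve signal x with a finite impulse response h via the FFT."""
    n_out = len(x) + len(h) - 1
    size = 1
    while size < n_out:  # zero-pad so circular convolution equals linear convolution
        size *= 2
    X = fft(list(x) + [0.0] * (size - len(x)))
    H = fft(list(h) + [0.0] * (size - len(h)))
    Y = [a * b for a, b in zip(X, H)]  # multiplication in the frequency domain
    y = fft(Y, inverse=True)
    return [v.real / size for v in y[:n_out]]  # scale the unnormalized inverse


def direct_convolution(x, h):
    """Reference: time-domain convolution sum."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    h = [1.0, -1.0, 0.5]
    print(fft_filter(x, h))
    print(direct_convolution(x, h))
```

The two printed lists should agree to within double-precision roundoff, which is the sense in which any finite duration impulse response filter can be realized by the FFT method.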

Roundoff noise in floating point fast Fourier transform computation

Published in:
IEEE Trans. Audio Electroacoust., Vol. AU-17, No. 3, September 1969, pp. 209-215.

Summary

A statistical model for roundoff errors is used to predict output noise-to-signal ratio when a fast Fourier transform is computed using floating point arithmetic. The result, derived for the case of white input signal, is that the ratio of mean-squared output noise to mean-squared output signal varies essentially as v = log₂ N, where N is the number of points transformed. This predicted result is significantly lower than bounds previously derived on mean-squared output noise-to-signal ratio, which are proportional to v². The predictions are verified experimentally, with excellent agreement. The model applies to rounded arithmetic, and it is found experimentally that if one truncates, rather than rounds, the results of floating point additions and multiplications, the output noise increases significantly (for a given v). Also, for truncation, a greater than linear increase with v of the output noise-to-signal ratio is observed; the empirical results seem to be proportional to v² rather than to v.
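The kind of experiment this summary describes can be approximated with a short simulation. The sketch below is a hedged illustration, not the paper's procedure: it emulates rounded single-precision arithmetic by rounding every real addition and multiplication with a hypothetical helper `f32`, uses a recursive radix-2 FFT chosen for brevity, and takes a double-precision run of the same algorithm as the noise-free reference. Twiddle factors are rounded identically in both runs so that the measured difference reflects arithmetic roundoff rather than coefficient quantization.

```python
import cmath
import math
import random
import struct


def f32(v: float) -> float:
    """Round a double to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


def exact(v: float) -> float:
    """Identity 'rounding' used for the double-precision reference run."""
    return v


def fft(x, rnd):
    """Recursive radix-2 FFT in which every real add and multiply result is passed through rnd."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], rnd)
    odd = fft(x[1::2], rnd)
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * math.pi * k / n)
        w = complex(f32(w.real), f32(w.imag))  # same stored coefficients in both runs
        o, e = odd[k], even[k]
        tr = rnd(rnd(w.real * o.real) - rnd(w.imag * o.imag))
        ti = rnd(rnd(w.real * o.imag) + rnd(w.imag * o.real))
        out[k] = complex(rnd(e.real + tr), rnd(e.imag + ti))
        out[k + n // 2] = complex(rnd(e.real - tr), rnd(e.imag - ti))
    return out


def noise_to_signal(n: int, seed: int = 0) -> float:
    """Mean-squared output noise over mean-squared output signal for a white input."""
    rng = random.Random(seed)
    x = [complex(f32(rng.uniform(-1, 1)), f32(rng.uniform(-1, 1))) for _ in range(n)]
    ref = fft(x, exact)
    got = fft(x, f32)
    noise = sum(abs(g - r) ** 2 for g, r in zip(got, ref))
    signal = sum(abs(r) ** 2 for r in ref)
    return noise / signal


if __name__ == "__main__":
    for nu in range(3, 11):
        print(nu, noise_to_signal(2 ** nu))
```

Plotting the printed ratios against v = log₂ N is one way to see whether the growth looks roughly linear, as the rounded-arithmetic model predicts; a single random input per size gives only a noisy estimate, so averaging over several seeds would be a reasonable refinement.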

A comparison of roundoff noise in floating point and fixed point digital filter realizations

Published in:
Proc. IEEE, Vol. 57, No. 6, June 1969, pp. 1181-1183.

Summary

A statistical model for roundoff noise in floating point digital filters, proposed by Kaneko and Liu, is tested experimentally for first- and second-order digital filters. Good agreement between theory and experiment is obtained. The model is used to specify a comparison between floating point and fixed point digital filter realizations on the basis of their output noise-to-signal ratio, and curves representing this comparison are presented. One can find values of the filter parameters at which the fixed and the floating point curves will cross, for equal total register lengths.