Publications

Refine Results

(Filters Applied) Clear All

Missing feature theory with soft spectral subtraction for speaker verification

Published in:
Interspeech 2006, ICSLP, 17-21 September 2006.

Summary

This paper considers the problem of training/testing mismatch in the context of speaker verification and, in particular, explores the application of missing feature theory in the case of additive white Gaussian noise corruption in testing. Missing feature theory allows for corrupted features to be removed from scoring, the initial step of which is the detection of these features. One method of detection, employing spectral subtraction, is studied in a controlled manner and it is shown that with missing feature compensation the resulting verification performance is improved as long as a minimum number of features remain. Finally, a blending of "soft" spectral subtraction for noise mitigation and missing feature compensation is presented. The resulting performance improves on the constituent techniques alone, reducing the equal error rate by about 15% over an SNR range of 5 - 25 dB.
READ LESS

Summary

This paper considers the problem of training/testing mismatch in the context of speaker verification and, in particular, explores the application of missing feature theory in the case of additive white Gaussian noise corruption in testing. Missing feature theory allows for corrupted features to be removed from scoring, the initial step...

READ MORE

A comparison of soft and hard spectral subtraction for speaker verification

Published in:
8th Int. Conf. on Spoken Language Processing, ICSLP 2004, 4-8 October 2004.

Summary

An important concern in speaker recognition is the performance degradation that occurs when speaker models trained with speech from one type of channel are subsequently used to score speech from another type of channel, known as channel mismatch. This paper investigates the relative performance of two different spectral subtraction methods for additive noise compensation in the context of speaker verification. The first method, termed "soft" spectral subtraction, is performed in the spectral domain on the |DFT|^2 values of the speech frames while the second method, termed "hard" spectral subtraction, is performed on the Mel-filter energy features. It is shown through both an analytical argument as well as a simulation that soft spectral subtraction results in a higher signal-to-noise ratio in the resulting Mel-filter energy features. In the context of Gaussian mixture model-based speaker verification with additive noise in testing utterances, this is shown to result in an equal error rate improvement over a system without spectral subtraction of approximately 7% in absolute terms, 21% in relative terms, over an additive white Gaussian noise range of 5-25 dB.
READ LESS

Summary

An important concern in speaker recognition is the performance degradation that occurs when speaker models trained with speech from one type of channel are subsequently used to score speech from another type of channel, known as channel mismatch. This paper investigates the relative performance of two different spectral subtraction methods...

READ MORE

Showing Results

1-2 of 2