Publications


Nuisance attribute projection

Published in:
Chapter in Speech Communication, May 2007.

Summary

Cross-channel degradation is one of the significant challenges facing speaker recognition systems. We study this problem in the support vector machine (SVM) context and, more generally, as nuisance variable compensation in high-dimensional spaces. We present an approach to nuisance variable compensation that removes nuisance attribute-related dimensions in the SVM expansion space via projections. Training to remove these dimensions is accomplished via an eigenvalue problem. The eigenvalue problem attempts to reduce multisession variation for the same speaker, reduce different channel effects, and increase "distance" between different speakers. Experiments show significant improvement in performance for the cross-channel case.
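To make the projection idea concrete, here is a minimal sketch (not code from the paper) of training and applying a NAP-style projection: the nuisance subspace is estimated from within-speaker session scatter via an eigendecomposition, and the top eigenvectors are projected away. The paper's eigenvalue problem additionally balances channel effects and between-speaker distance, which this sketch omits; all names (expansions, speaker_ids, n_nuisance_dims) are illustrative.

```python
# Minimal NAP-style sketch, assuming feature vectors are already mapped into
# the SVM expansion space. Illustrative only; the paper's eigenvalue problem
# also accounts for channel effects and between-speaker distance.
import numpy as np

def train_nap_projection(expansions, speaker_ids, n_nuisance_dims):
    """Estimate a projection that removes n_nuisance_dims nuisance directions.

    expansions      : (n_sessions, dim) expansion-space vectors
    speaker_ids     : length-n_sessions speaker labels
    n_nuisance_dims : number of directions to remove (the projection corank)
    """
    expansions = np.asarray(expansions, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    dim = expansions.shape[1]

    # Within-speaker scatter: variation across sessions of the same speaker
    # is treated as nuisance (session/channel) variability.
    scatter = np.zeros((dim, dim))
    for spk in np.unique(speaker_ids):
        sessions = expansions[speaker_ids == spk]
        centered = sessions - sessions.mean(axis=0)
        scatter += centered.T @ centered

    # Top eigenvectors of the scatter span the nuisance subspace.
    _, eigvecs = np.linalg.eigh(scatter)            # eigenvalues ascending
    nuisance_basis = eigvecs[:, -n_nuisance_dims:]  # (dim, k)

    # Projection that removes the nuisance subspace: P = I - U U^T.
    return np.eye(dim) - nuisance_basis @ nuisance_basis.T

def apply_nap(projection, expansion_vector):
    """Project one expansion-space vector before SVM training or scoring."""
    return projection @ expansion_vector
```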

SVM based speaker verification using a GMM supervector kernel and NAP variability compensation

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, Vol. 1, 14-19 May 2006, pp. 97-100.

Summary

Gaussian mixture models with universal background models (UBMs) have become the standard method for speaker recognition. Typically, a speaker model is constructed by MAP adaptation of the means of the UBM. A GMM supervector is constructed by stacking the means of the adapted mixture components. A recent discovery is that latent factor analysis of this GMM supervector is an effective method for variability compensation. We consider this GMM supervector in the context of support vector machines. We construct a support vector machine kernel using the GMM supervector and show similarities, based on this kernel, between the method of SVM nuisance attribute projection (NAP) and recent results in latent factor analysis. Experiments on a NIST SRE 2005 corpus demonstrate the effectiveness of the new technique.
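As a hedged illustration of the supervector construction described above (not the paper's implementation), the sketch below stacks the MAP-adapted component means into a single vector, scaled by the square root of the mixture weight and the inverse square root of the UBM's diagonal covariance, as in the commonly used linear GMM-supervector kernel; parameter names are assumptions.

```python
# Illustrative GMM-supervector sketch; names and the specific normalization
# are assumptions following the commonly used linear supervector kernel.
import numpy as np

def gmm_supervector(adapted_means, ubm_weights, ubm_diag_covs):
    """Stack MAP-adapted component means into one normalized supervector.

    adapted_means : (n_components, feat_dim) means after MAP adaptation
    ubm_weights   : (n_components,) UBM mixture weights
    ubm_diag_covs : (n_components, feat_dim) UBM diagonal covariances
    """
    scale = np.sqrt(ubm_weights)[:, None] / np.sqrt(ubm_diag_covs)
    return (scale * adapted_means).reshape(-1)   # (n_components * feat_dim,)

def supervector_kernel(sv_a, sv_b):
    """Linear kernel between two supervectors, usable directly in an SVM."""
    return float(sv_a @ sv_b)
```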

Advances in channel compensation for SVM speaker recognition

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, Vol. 1, 19-23 March 2005, pp. I-629 - I-631.

Summary

Cross-channel degradation is one of the significant challenges facing speaker recognition systems. We study the problem for speaker recognition using support vector machines (SVMs). We perform channel compensation in SVM modeling by removing non-speaker nuisance dimensions in the SVM expansion space via projections. Training to remove these dimensions is accomplished via an eigenvalue problem. The eigenvalue problem attempts to reduce multisession variation for the same speaker, reduce different channel effects, and increase "distance" between different speakers. We apply our methods to a subset of the Switchboard 2 corpus. Experiments show dramatic improvement in performance for the cross-channel case.
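As a hypothetical end-to-end sketch of how such a projection might be used (scikit-learn calls and all names are assumptions, not details from the paper), the compensated expansion vectors are fed to a per-speaker linear SVM:

```python
# Hypothetical pipeline sketch: apply a previously trained NAP-style projection
# (e.g., the earlier train_nap_projection sketch) and fit a per-speaker SVM.
import numpy as np
from sklearn.svm import SVC

def train_speaker_svm(projection, target_vectors, impostor_vectors):
    """Train one target speaker's SVM on channel-compensated vectors."""
    X = np.vstack([target_vectors, impostor_vectors]) @ projection.T
    y = np.concatenate([np.ones(len(target_vectors)),
                        -np.ones(len(impostor_vectors))])
    return SVC(kernel="linear").fit(X, y)

def score_trial(svm, projection, test_vector):
    """Score a test utterance after applying the same projection."""
    return svm.decision_function((projection @ test_vector)[None, :])[0]
```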

Channel compensation for SVM speaker recognition

Published in:
Odyssey, The Speaker and Language Recognition Workshop, 31 May - 3 June 2004.

Summary

One of the major remaining challenges to improving accuracy in state-of-the-art speaker recognition algorithms is reducing the impact of channel and handset variations on system performance. For Gaussian mixture model (GMM) based speaker recognition systems, a variety of channel-adaptation techniques are known and available for adapting models between different channel conditions, but for the more recent support vector machine (SVM) based approaches, much less is known about the best way to handle this issue. In this paper we explore techniques specific to the SVM framework in order to derive fully non-linear channel compensations. The result is a system that is less sensitive to specific kinds of labeled channel variations observed in training.
