Publications

Assessing the speaker recognition performance of naive listeners using Mechanical Turk

May 22, 2011

Conference Paper

Author:

Wade Shen

…

Published in:

Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 22-27 May 2011, pp. 5916-5919.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

In this paper we attempt to quantify the ability of naive listeners to perform speaker recognition in the context of the NIST evaluation task. We describe our protocol: a series of listening experiments using large numbers of naive listeners (432) on Amazon's Mechanical Turk that attempts to measure the ability of the average human listener to perform speaker recognition. Our goal was to compare the performance of the average human listener to both forensic experts and state-of-the-art automatic systems. We show that naive listeners vary substantially in their performance, but that an aggregation of listener responses can achieve performance similar to that of expert forensic examiners.

USSS-MITLL 2010 human assisted speaker recognition

January 2, 2011

Conference Paper

Author:

Reva Schwartz

…

Published in:

Proc. IEEE ICASSP, 26 May 2011, pp. 5904-7.

Topic:

biometrics

R&D area:

Cyber Security and Information Sciences

R&D group:

Summary

The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes and our fusion of these processes, which are adapted from USSS casework. The USSS-MIT/LL 2010 HASR results are presented. We also present post-evaluation results. The results are encouraging within the resolving power of the evaluation, which was limited to enable reasonable levels of human effort. Future ideas and efforts are discussed, including new features and capitalizing on naive listeners.

Large-scale analysis of formant frequency estimation variability in conversational telephone speech

September 6, 2009

Conference Paper

Author:

Nancy Chen

…

Published in:

INTERSPEECH 2009, 6-10 September 2009.

Topic:

signal processing

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

We quantify how the telephone channel and regional dialect influence formant estimates extracted with Wavesurfer in spontaneous conversational speech from over 3,600 native American English speakers. To the best of our knowledge, this is the largest-scale study on this topic. We found that F1 estimates are higher in cellular channels than in landline channels, while F2 in general shows the opposite trend. We also characterized vowel shift trends in northern states in the U.S.A. and compared them with the Northern city chain shift (NCCS). Our analysis is useful in forensic applications where it is important to distinguish between speaker, dialect, and channel characteristics.

Forensic speaker recognition: a need for caution

March 1, 2009

Journal Article

Author:

Joseph P. Campbell Jr

…

Published in:

IEEE Signal Process. Mag., Vol. 26, No. 2, March 2009, pp. 95-103.

Topic:

speaker recognition

R&D area:

Cyber Security and Information Sciences

R&D group:

Summary

There has long been a desire to be able to identify a person on the basis of his or her voice. For many years, judges, lawyers, detectives, and law enforcement agencies have wanted to use forensic voice authentication to investigate a suspect or to confirm a judgment of guilt or innocence. Challenges, realities, and cautions regarding the use of speaker recognition applied to forensic-quality samples are presented.

Proficiency testing for imaging and audio enhancement: guidelines for evaluation

July 21, 2008

Conference Paper

Author:

Reva Schwartz

…

Published in:

Int. Assoc. of Forensic Sciences, IAFS, 21-26 July 2008.

Topic:

cyber security

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

Proficiency tests in the forensic sciences are vital in the accreditation and quality assurance process. Most commercially available proficiency testing is available for examiners in the traditional forensic disciplines, such as latent prints, drug analysis, DNA, questioned documents, etc. Each of these disciplines is identification based. There are other forensic disciplines, however, where the output of the examination is not an identification of a person or substance. Two such disciplines are audio enhancement and video/image enhancement.

Bridging the gap between linguists and technology developers: large-scale, sociolinguistic annotation for dialect and speaker recognition

May 28, 2008

Conference Paper

Author:

Christopher Cieri

…

Published in:

Proc. 6th Int. Conf. on Language Resources and Evaluation, LREC, 28 May 2008.

Topic:

cyber security

R&D area:

Cyber Security and Information Sciences

R&D group:

Cyber Operations and Analysis Technology

Summary

Recent years have seen increased interest within the speaker recognition community in high-level features including, for example, lexical choice, idiomatic expressions, or syntactic structures. The promise of speaker recognition in forensic applications drives development toward systems robust to channel differences by selecting features inherently robust to channel difference. Within the language recognition community, there is growing interest in differentiating not only languages but also mutually intelligible dialects of a single language. Decades of research in dialectology suggest that high-level features can enable systems to cluster speakers according to the dialects they speak. The Phanotics (Phonetic Annotation of Typicality in Conversational Speech) project seeks to identify high-level features characteristic of American dialects, annotate a corpus for these features, use the data to train dialect recognition systems, and use the categorization to create better models for speaker recognition. The data, once published, should be useful to other developers of speaker and dialect recognition systems and to dialectologists and sociolinguists. We expect the methods will generalize well beyond the speakers, dialects, and languages discussed here and should, if successful, provide a model for how linguists and technology developers can collaborate in the future for the benefit of both groups and toward a deeper understanding of how languages vary and change.

Construction of a phonotactic dialect corpus using semiautomatic annotation

August 27, 2007

Conference Paper

Author:

Reva Schwartz

…

Published in:

INTERSPEECH 2007, 27-31 August 2007, pp. 942-945.

Topic:

human language technology

R&D area:

Cyber Security and Information Sciences

R&D group:

Artificial Intelligence Technology and Systems

Summary

In this paper, we discuss rapid, semiautomatic annotation techniques of detailed phonological phenomena for large corpora. We describe the use of these techniques for the development of a corpus of American English dialects. The resulting annotations and corpora will support both large-scale linguistic dialect analysis and automatic dialect identification. We delineate the semiautomatic annotation process that we are currently employing and a set of experiments we ran to validate this process. From these experiments, we learned that the use of ASR techniques could significantly increase the throughput and consistency of human annotators.
