Publications


Link prediction methods for generating speaker content graphs

Published in:
ICASSP 2013, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 25-31 May 2013.

Summary

In a speaker content graph, vertices represent speech signals and edges represent speaker similarity. Link prediction methods calculate which potential edges are most likely to connect vertices from the same speaker; those edges are included in the generated speaker content graph. Since a variety of speaker recognition tasks can be performed on a content graph, we provide a set of metrics for evaluating the graph's quality independently of any recognition task. We then describe novel global and incremental algorithms for constructing accurate speaker content graphs that outperform the existing k nearest neighbors link prediction method. We evaluate those algorithms on a NIST speaker recognition corpus.
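The k nearest neighbors baseline mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a precomputed symmetric matrix of pairwise speaker-similarity scores and simply links each vertex to its k highest-scoring neighbors.

```python
import numpy as np

def knn_link_prediction(similarity, k):
    """Build a speaker content graph by connecting each vertex (speech
    signal) to its k most similar vertices. `similarity` is an n x n
    symmetric matrix of speaker-similarity scores; returns a set of
    undirected edges (i, j) with i < j."""
    n = similarity.shape[0]
    edges = set()
    for i in range(n):
        scores = similarity[i].astype(float).copy()
        scores[i] = -np.inf                   # exclude self-loops
        neighbors = np.argsort(scores)[-k:]   # k highest-scoring vertices
        for j in neighbors:
            edges.add((min(i, int(j)), max(i, int(j))))
    return edges

# Toy example: signals 0/1 and 2/3 are most similar to each other.
sim = np.array([[1.0, 0.9, 0.1, 0.2],
                [0.9, 1.0, 0.2, 0.1],
                [0.1, 0.2, 1.0, 0.8],
                [0.2, 0.1, 0.8, 1.0]])
print(knn_link_prediction(sim, k=1))  # {(0, 1), (2, 3)}
```

Edges predicted this way may still connect different speakers when similarity scores are noisy, which is the weakness the paper's global and incremental algorithms aim to address.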

Using United States government language proficiency standards for MT evaluation

Published in:
Chapter 5.3.3 in Handbook of Natural Language Processing and Machine Translation, 2011, pp. 775-82.

Summary

The purpose of this section is to discuss a method of measuring the degree to which the essential meaning of the original text is communicated in the MT output. We view this test to be a measurement of the fundamental goal of MT; that is, to convey information accurately from one language to another. We conducted a series of experiments in which educated native readers of English responded to test questions about translated versions of texts originally written in Arabic and Chinese. We compared the results for those subjects using machine translations of the texts with those using professional reference translations. These comparisons serve as a baseline for determining the level of foreign language reading comprehension that can be achieved by a native English reader relying on machine translation technology. This also allows us to explore the relationship between the current, broadly accepted automatic measures of performance for machine translation and a test derived from the Defense Language Proficiency Test, which is used throughout the Defense Department for measuring foreign language proficiency. Our goal is to put MT system performance evaluation into terms that are meaningful to US government consumers of MT output.

Construction of a phonotactic dialect corpus using semiautomatic annotation

Summary

In this paper, we discuss rapid, semiautomatic annotation techniques of detailed phonological phenomena for large corpora. We describe the use of these techniques for the development of a corpus of American English dialects. The resulting annotations and corpora will support both large-scale linguistic dialect analysis and automatic dialect identification. We delineate the semiautomatic annotation process that we are currently employing and a set of experiments we ran to validate this process. From these experiments, we learned that the use of ASR techniques could significantly increase the throughput and consistency of human annotators.

The mixer and transcript reading corpora: resources for multilingual, crosschannel speaker recognition research

Summary

This paper describes the planning and creation of the Mixer and Transcript Reading corpora, their properties and yields, and reports on the lessons learned during their development.

The MMSR bilingual and crosschannel corpora for speaker recognition research and evaluation

Summary

We describe efforts to create corpora to support and evaluate systems that meet the challenge of speaker recognition in the face of both channel and language variation. In addition to addressing ongoing evaluation of speaker recognition systems, these corpora are aimed at the bilingual and crosschannel dimensions. We report on specific data collection efforts at the Linguistic Data Consortium, the 2004 speaker recognition evaluation program organized by the National Institute of Standards and Technology (NIST), and the research ongoing at the US Federal Bureau of Investigation and MIT Lincoln Laboratory. We cover the design and requirements as well as the collections and evaluation, integrating discussions of data preparation, research, technology development, and evaluation on a grand scale.

The effect of text difficulty on machine translation performance -- a pilot study with ILR-related texts in Spanish, Farsi, Arabic, Russian and Korean

Published in:
4th Int. Conf. on Language Resources and Evaluation, LREC, 26-28 May 2004.

Summary

We report on initial experiments that examine the relationship between automated measures of machine translation performance (Doddington 2003; Papineni et al. 2001) and the Interagency Language Roundtable (ILR) scale of language proficiency/difficulty that has been in standard use for U.S. government language training and assessment for the past several decades (Child, Clifford and Lowe 1993). The main question we ask is how technology-oriented measures of MT performance relate to the ILR difficulty levels, where we understand that a linguist with ILR proficiency level N is expected to be able to understand a document rated at level N, but to have increasing difficulty with documents at higher levels. In this paper, we find that some key aspects of MT performance track with ILR difficulty levels, primarily for MT output whose quality is good enough to be readable by human readers.

Conversational telephone speech corpus collection for the NIST speaker recognition evaluation 2004

Published in:
Proc. Language Resource Evaluation Conf., LREC, 24-30 May 2004, pp. 587-590.

Summary

This paper discusses some of the factors that should be considered when designing a speech corpus collection to be used for text-independent speaker recognition evaluation. The factors include telephone handset type, telephone transmission type, language, and (non-telephone) microphone type. The paper describes the design of the new corpus collection being undertaken by the Linguistic Data Consortium (LDC) to support the 2004 and subsequent NIST speaker recognition evaluations. Some preliminary information on the resulting 2004 evaluation test set is offered.

The mixer corpus of multilingual, multichannel speaker recognition data

Published in:
Proc. Language Resource Evaluation Conf., LREC, 24-30 May 2004, pp. 627-630.

Summary

This paper describes efforts to create corpora to support and evaluate systems that perform speaker recognition where channel and language may vary. Beyond the ongoing evaluation of speaker recognition systems, these corpora are aimed at the bilingual and cross channel dimensions. We report on specific data collection efforts at the Linguistic Data Consortium and the research ongoing at the US Federal Bureau of Investigation and MIT Lincoln Laboratory. We cover the design and requirements as well as the collections and final properties of the corpus, integrating discussions of data preparation, research, technology development, and evaluation on a grand scale.

Corpora for the evaluation of speaker recognition systems

Published in:
ICASSP 1999, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 15-19 March 1999.

Summary

Using standard speech corpora for development and evaluation has proven to be very valuable in promoting progress in speech and speaker recognition research. In this paper, we present an overview of current publicly available corpora intended for speaker recognition research and evaluation. We outline the corpora's salient features with respect to their suitability for conducting speaker recognition experiments and evaluations. Links to these corpora, and to new corpora, will appear on the web at http://www.apl.jhu.edu/Classes/Notes/Campbell/SpkrRec/. We hope to increase the awareness and use of these standard corpora and corresponding evaluation procedures throughout the speaker recognition community.

HTIMIT and LLHDB: speech corpora for the study of handset transducer effects

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, Vol. 2, 21-24 April 1997, pp. 1535-1538.

Summary

This paper describes two corpora collected at Lincoln Laboratory for the study of handset transducer effects on the speech signal: the handset TIMIT (HTIMIT) corpus and the Lincoln Laboratory Handset Database (LLHDB). The goal of these corpora is to minimize all confounding factors and to produce speech predominantly differing only in handset transducer effects. The speech is recorded directly from a telephone unit in a sound-booth using prompted text and extemporaneous photograph descriptions. The two corpora allow comparison of speech collected from a person speaking into a handset (LLHDB) versus speech played through a loudspeaker into a handset (HTIMIT). A comparison of analysis and results between the two corpora will address the realism of artificially creating handset-degraded speech by playing recorded speech through handsets. The corpora are designed primarily for speaker recognition experimentation (in terms of amount of speech and level of transcription), but since both speaker and speech recognition systems operate on the same acoustic features affected by the handset, knowledge gleaned is directly transferable to speech recognizers.