Publications
Corpora design and score calibration for text dependent pronunciation proficiency recognition
Summary
This work investigates methods for improving a pronunciation proficiency recognition system, both in terms of phonetic level posterior probability calibration, and in ordinal utterance level classification, for Modern Standard Arabic (MSA), Spanish and Russian. To support this work, utterance level labels were obtained by crowd-sourcing the annotation of language learners'...
NetProf iOS pronunciation feedback demonstration
Summary
One of the greatest challenges for an adult learning a new language is gaining the ability to distinguish and produce foreign sounds. The US Government trains 3,600 enlisted soldiers a year at the Defense Language Institute Foreign Language Center (DLIFLC) in languages critical to national security, most of which are...
Discrimination between singing and speech in real-world audio
Summary
The performance of a spoken language system suffers when non-speech is incorrectly classified as speech. Singing is particularly difficult to discriminate from speech, since both are natural language. However, singing conveys a melody, whereas speech does not; in particular, a singer's fundamental frequency should not deviate significantly from an underlying...
Comparing a high and low-level deep neural network implementation for automatic speech recognition
Summary
The use of deep neural networks (DNNs) has improved performance in several fields including computer vision, natural language processing, and automatic speech recognition (ASR). The increased use of DNNs in recent years has been largely due to performance afforded by GPUs, as the computational cost of training large networks on...
The MIT-LL/AFRL IWSLT-2010 MT system
Summary
This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2010 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Arabic and Turkish to English translation tasks. We...
Query-by-example spoken term detection using phonetic posteriorgram templates
Summary
This paper examines a query-by-example approach to spoken term detection in audio files. The approach is designed for low-resource situations in which limited or no in-domain training material is available and accurate word-based speech recognition capability is unavailable. Instead of using word or phone strings as search terms, the user...
A comparison of query-by-example methods for spoken term detection
Summary
In this paper we examine an alternative interface for phonetic search, namely query-by-example, that avoids OOV issues associated with both standard word-based and phonetic search methods. We develop three methods that compare query lattices derived from example audio against a standard n-gram-based phonetic index and we analyze factors affecting the...
Cognitive services for the user
Summary
Software-defined cognitive radios (CRs) use voice as a primary input/output (I/O) modality and are expected to have substantial computational resources capable of supporting advanced speech- and audio-processing applications. This chapter extends previous work on speech applications (e.g., [1]) to cognitive services that enhance military mission capability by capitalizing on automatic...
Efficient speech translation through confusion network decoding
Summary
This paper describes advances in the use of confusion networks as an interface between automatic speech recognition and machine translation. In particular, it presents a decoding algorithm for confusion networks that extends a state-of-the-art phrase-based text translation decoder. The confusion network decoder significantly improves both in efficiency...
Two protocols comparing human and machine phonetic discrimination performance in conversational speech
Summary
This paper describes two experimental protocols for direct comparison of human and machine phonetic discrimination performance in continuous speech. These protocols attempt to isolate phonetic discrimination while controlling for language and segmentation biases. Results of two human experiments are described, including comparisons with automatic phonetic recognition baselines. Our experiments suggest...