Publications

Combating Misinformation: HLT Highlights from MIT Lincoln Laboratory

Published in:
Human Language Technology Conference (HLTCon), 16-18 March 2021.

Summary

Dr. Joseph Campbell shares several human language technologies highlights from MIT Lincoln Laboratory. These include key enabling technologies in combating misinformation to link personas, analyze content, and understand human networks. Developing operationally relevant technologies requires access to corresponding data with meaningful evaluations, as Dr. Douglas Reynolds presented in his keynote. As Dr. Danelle Shah discussed in her keynote, it’s crucial to develop these technologies to operate at deeper levels than the surface. Producing reliable information from the fusion of missing and inherently unreliable information channels is paramount. Furthermore, the dynamic misinformation environment and the coevolution of allied methods with adversarial methods represent additional challenges.

Combating Misinformation: What HLT Can (and Can't) Do When Words Don't Say What They Mean

Published in:
Human Language Technology Conference (HLTCon), 16-18 March 2021.

Summary

Misinformation, disinformation, and “fake news” have been used as a means of influence for millennia, but the proliferation of the internet and social media in the 21st century has enabled nefarious campaigns to achieve unprecedented scale, speed, precision, and effectiveness. In the past few years, there has been significant recognition of the threats posed by malign influence operations to geopolitical relations, democratic institutions and processes, public health and safety, and more. At the same time, the digitization of communication offers tremendous opportunities for human language technologies (HLT) to observe, interpret, and understand this publicly available content. The ability to infer intent and impact, however, remains much more elusive.

Speaker separation in realistic noise environments with applications to a cognitively-controlled hearing aid

Summary

Future wearable technology may provide for enhanced communication in noisy environments and for the ability to pick out a single talker of interest in a crowded room simply by the listener shifting their attentional focus. Such a system relies on two components, speaker separation and decoding the listener's attention to acoustic streams in the environment. To address the former, we present a system for joint speaker separation and noise suppression, referred to as the Binaural Enhancement via Attention Masking Network (BEAMNET). The BEAMNET system is an end-to-end neural network architecture based on self-attention. Binaural input waveforms are mapped to a joint embedding space via a learned encoder, and separate multiplicative masking mechanisms are included for noise suppression and speaker separation. Pairs of output binaural waveforms are then synthesized using learned decoders, each capturing a separated speaker while maintaining spatial cues. A key contribution of BEAMNET is that the architecture contains a separation path, an enhancement path, and an autoencoder path. This paper proposes a novel loss function which simultaneously trains these paths, so that disabling the masking mechanisms during inference causes BEAMNET to reconstruct the input speech signals. This allows dynamic control of the level of suppression applied by BEAMNET via a minimum gain level, which is not possible in other state-of-the-art approaches to end-to-end speaker separation. This paper also proposes a perceptually-motivated waveform distance measure. Using objective speech quality metrics, the proposed system is demonstrated to perform well at separating two equal-energy talkers, even in high levels of background noise. Subjective testing shows an improvement in speech intelligibility across a range of noise levels, for signals with artificially added head-related transfer functions and background noise. 
Finally, when used as part of an auditory attention decoder (AAD) system using existing electroencephalogram (EEG) data, BEAMNET is found to maintain the decoding accuracy achieved with ideal speaker separation, even in severe acoustic conditions. These results suggest that this enhancement system is highly effective at decoding auditory attention in realistic noise environments, and could possibly lead to improved speech perception in a cognitively controlled hearing aid.

More than a fair share: Network Data Remanence attacks against secret sharing-based schemes

Published in:
Network and Distributed Systems Security Symp., NDSS, 23-26 February 2021.

Summary

With progress toward a practical quantum computer has come an increasingly rapid search for quantum-safe, secure communication schemes that do not rely on discrete logarithm or factorization problems. One such encryption scheme, Multi-path Switching with Secret Sharing (MSSS), combines secret sharing with multi-path switching to achieve security as long as the adversary does not have global observability of all paths and thus cannot capture enough shares to reconstruct messages. MSSS assumes that sending a share on a path is an atomic operation and all paths have the same delay. In this paper, we identify a side-channel vulnerability for MSSS, created by the fact that in real networks, sending a share is not an atomic operation as paths have multiple hops and different delays. This channel, referred to as Network Data Remanence (NDR), is present in all schemes like MSSS whose security relies on transfer atomicity and all paths having the same delay. We demonstrate the presence of NDR in a physical testbed. We then identify two new attacks that aim to exploit the side channel, referred to as NDR Blind and NDR Planned, propose an analytical model to analyze the attacks, and demonstrate them using an implementation of MSSS based on the ONOS SDN controller. Finally, we present a countermeasure for the attacks and show its effectiveness in simulations and Mininet experiments.
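
The secret-sharing idea behind this summary can be illustrated with a minimal sketch. The abstract does not say which sharing scheme MSSS uses, so the code below assumes the simplest n-of-n XOR scheme, with one share sent per path; the function names `split_secret` and `combine` are hypothetical and not taken from the paper.

```python
import secrets


def split_secret(message: bytes, n_paths: int) -> list[bytes]:
    """Split a message into n_paths XOR shares (assumed n-of-n scheme)."""
    if n_paths < 2:
        raise ValueError("need at least two paths")
    # The first n-1 shares are uniformly random bytes.
    shares = [secrets.token_bytes(len(message)) for _ in range(n_paths - 1)]
    # The last share is the message XORed with every random share.
    last = message
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    return shares + [last]


def combine(shares: list[bytes]) -> bytes:
    """XOR all shares together; only the full set recovers the message."""
    out = bytes(len(shares[0]))
    for share in shares:
        out = bytes(a ^ b for a, b in zip(out, share))
    return out


shares = split_secret(b"hello", 3)  # one share per network path
recovered = combine(shares)
```

Under this assumption, an observer who misses even one share learns nothing about the message, which is why the scheme depends on the adversary not seeing every path. The attacks named in the summary target the further assumption that each share crosses its path atomically and with equal delay.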

Beyond expertise and roles: a framework to characterize the stakeholders of interpretable machine learning and their needs

Published in:
Proc. Conf. on Human Factors in Computing Systems, 8-13 May 2021, article no. 74.

Summary

To ensure accountability and mitigate harm, it is critical that diverse stakeholders can interrogate black-box automated systems and find information that is understandable, relevant, and useful to them. In this paper, we eschew prior expertise- and role-based categorizations of interpretability stakeholders in favor of a more granular framework that decouples stakeholders' knowledge from their interpretability needs. We characterize stakeholders by their formal, instrumental, and personal knowledge and how it manifests in the contexts of machine learning, the data domain, and the general milieu. We additionally distill a hierarchical typology of stakeholder needs that distinguishes higher-level domain goals from lower-level interpretability tasks. In assessing the descriptive, evaluative, and generative powers of our framework, we find our more nuanced treatment of stakeholders reveals gaps and opportunities in the interpretability literature, adds precision to the design and comparison of user studies, and facilitates a more reflexive approach to conducting this research.

Seasonal Inhomogeneous Nonconsecutive Arrival Process Search and Evaluation

Published in:
25th International Conference on Pattern Recognition [submitted]

Summary

Time series often exhibit seasonal patterns, and identification of these patterns is essential to understanding the data and predicting future behavior. Most methods train on large datasets and can fail to predict far past the training data. This limitation becomes more pronounced when data is sparse. This paper presents a method to fit a model to seasonal time series data that maintains predictive power when data is limited. This method, called SINAPSE, combines statistical model fitting with an information criterion to search for disjoint, and possibly nonconsecutive, regimes underlying the data, allowing for a sparse representation resistant to overfitting.
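
As a rough illustration of scoring candidate regimes with an information criterion, the sketch below compares a single-regime model against a two-regime model whose regimes are nonconsecutive. The summary does not name the statistical model or the criterion, so a Poisson rate per regime and the Bayesian information criterion (BIC) are assumptions here, and the helper `bic` and the sample counts are hypothetical.

```python
import math


def bic(counts: list[int], labels: list[int]) -> float:
    """BIC of a model with one Poisson rate per regime label (lower is better)."""
    n = len(counts)
    regimes = set(labels)
    loglik = 0.0
    for regime in regimes:
        group = [c for c, lab in zip(counts, labels) if lab == regime]
        rate = sum(group) / len(group)  # maximum-likelihood Poisson rate
        if rate == 0.0:
            continue  # an all-zero group contributes log-likelihood 0
        loglik += sum(
            c * math.log(rate) - rate - math.lgamma(c + 1) for c in group
        )
    # One free parameter (the rate) per regime.
    return len(regimes) * math.log(n) - 2.0 * loglik


# Hypothetical arrival counts that alternate between a quiet and a busy regime.
counts = [1, 0, 1, 9, 10, 11, 1, 0, 1, 10, 9, 11]
one_regime = [0] * len(counts)
two_regimes = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]

prefer_two = bic(counts, two_regimes) < bic(counts, one_regime)
```

The penalty term grows with the number of regimes, so extra regimes are accepted only when they improve the fit enough. That is one way a search of this kind can stay sparse when data is limited.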

Automatic detection of influential actors in disinformation networks

Published in:
Proc. Natl. Acad. Sci., Vol. 118, No. 4, January 2021, e2011216118.

Summary

The weaponization of digital communications and social media to conduct disinformation campaigns at immense scale, speed, and reach presents new challenges to identify and counter hostile influence operations (IO). This paper presents an end-to-end framework to automate detection of disinformation narratives, networks, and influential actors. The framework integrates natural language processing, machine learning, graph analytics, and a novel network causal inference approach to quantify the impact of individual actors in spreading IO narratives. We demonstrate its capability on real-world hostile IO campaigns with Twitter datasets collected during the 2017 French presidential elections, and known IO accounts disclosed by Twitter. Our system detects IO accounts with 96% precision, 79% recall, and 96% area-under-the-PR-curve, maps out salient network communities, and discovers high-impact accounts that escape the lens of traditional impact statistics based on activity counts and network centrality. Results are corroborated with independent sources of known IO accounts from U.S. Congressional reports, investigative journalism, and IO datasets provided by Twitter.

The Speech Enhancement via Attention Masking Network (SEAMNET): an end-to-end system for joint suppression of noise and reverberation [early access]

Published in:
IEEE/ACM Trans. on Audio, Speech, and Language Processing, Vol. 29, 2021, pp. 515-26.

Summary

This paper proposes the Speech Enhancement via Attention Masking Network (SEAMNET), a neural network-based end-to-end single-channel speech enhancement system designed for joint suppression of noise and reverberation. It formalizes an end-to-end network architecture, referred to as b-Net, which accomplishes noise suppression through attention masking in a learned embedding space. A key contribution of SEAMNET is that the b-Net architecture contains both an enhancement and an autoencoder path. This paper proposes a novel loss function which simultaneously trains both the enhancement and the autoencoder paths, so that disabling the masking mechanism during inference causes SEAMNET to reconstruct the input speech signal. This allows dynamic control of the level of suppression applied by SEAMNET via a minimum gain level, which is not possible in other state-of-the-art approaches to end-to-end speech enhancement. This paper also proposes a perceptually-motivated waveform distance measure. In addition to the b-Net architecture, this paper proposes a novel method for designing target waveforms for network training, so that joint suppression of additive noise and reverberation can be performed by an end-to-end enhancement system, which has not been previously possible. Experimental results show the SEAMNET system to outperform a variety of state-of-the-art baseline systems, both in terms of objective speech quality measures and subjective listening tests. Finally, this paper draws parallels between SEAMNET and conventional statistical model-based enhancement approaches, offering interpretability of many network components.
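
The minimum gain control described above can be pictured with a small sketch. The summary gives no formula, so this assumes the simplest reading: a multiplicative mask with values in [0, 1] is floored at a minimum gain before it scales the embedding. The function `apply_mask` and its arguments are hypothetical, and plain Python lists stand in for the learned embedding.

```python
def apply_mask(
    embedding: list[float], mask: list[float], min_gain: float = 0.0
) -> list[float]:
    """Scale each embedding value by its mask value, floored at min_gain."""
    if not 0.0 <= min_gain <= 1.0:
        raise ValueError("min_gain must lie in [0, 1]")
    return [x * max(m, min_gain) for x, m in zip(embedding, mask)]


embedding = [2.0, 4.0]
mask = [0.0, 0.5]  # assumed mask values in [0, 1]

full_suppression = apply_mask(embedding, mask, min_gain=0.0)
partial = apply_mask(embedding, mask, min_gain=0.25)
passthrough = apply_mask(embedding, mask, min_gain=1.0)
```

With a minimum gain of 1 the mask has no effect and the embedding passes through unchanged, which corresponds to the autoencoder path in the summary. Values between 0 and 1 limit how strongly any component can be attenuated.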

Information Aware max-norm Dirichlet networks for predictive uncertainty estimation

Published in:
Neural Netw., Vol. 135, 2021, pp. 105–114.

Summary

Precise estimation of uncertainty in predictions for AI systems is a critical factor in ensuring trust and safety. Deep neural networks trained with a conventional method are prone to over-confident predictions. In contrast to Bayesian neural networks that learn approximate distributions on weights to infer prediction confidence, we propose a novel method, Information Aware Dirichlet networks, that learn an explicit Dirichlet prior distribution on predictive distributions by minimizing a bound on the expected max norm of the prediction error and penalizing information associated with incorrect outcomes. Properties of the new cost function are derived to indicate how improved uncertainty estimation is achieved. Experiments using real datasets show that our technique outperforms, by a large margin, state-of-the-art neural networks for estimating within-distribution and out-of-distribution uncertainty, and detecting adversarial examples.

The 2019 NIST Speaker Recognition Evaluation CTS Challenge

Published in:
The Speaker and Language Recognition Workshop: Odyssey 2020, 1-5 November 2020.

Summary

In 2019, the U.S. National Institute of Standards and Technology (NIST) conducted a leaderboard-style speaker recognition challenge using conversational telephone speech (CTS) data extracted from the unexposed portion of the Call My Net 2 (CMN2) corpus previously used in the 2018 Speaker Recognition Evaluation (SRE). The SRE19 CTS Challenge was organized in a similar manner to SRE18, except it offered only the open training condition. In addition, similar to the NIST i-vector challenge, the evaluation set consisted of two subsets: a progress subset, and a test subset. The progress subset comprised 30% of the trials and was used to monitor progress on the leaderboard, while the remaining 70% of the trials formed the test subset, which was used to generate the official final results determined at the end of the challenge. Which subset (i.e., progress or test) a trial belonged to was unknown to challenge participants, and each system submission had to contain outputs for all of the trials. The CTS Challenge also served as a prerequisite for entrance to the main SRE19 whose primary task was audio-visual person recognition. A total of 67 organizations (forming 51 teams) from academia and industry participated in the CTS Challenge and submitted 1347 valid system outputs. This paper presents an overview of the evaluation and several analyses of system performance for all primary conditions in the CTS Challenge. Compared to the CTS track of the SRE18, the SRE19 CTS Challenge results indicate remarkable improvements in performance which are mainly attributed to 1) the availability of large amounts of in-domain development data from a large number of labeled speakers, 2) speaker representations (aka embeddings) extracted using extended and more complex end-to-end neural network frameworks, and 3) effective use of the provided large development set.