Publications


Percolation model of insider threats to assess the optimum number of rules

Published in:
Environ. Syst. Decis., Vol. 35, 2015, pp. 504-510.

Summary

Rules, regulations, and policies are the basis of civilized society and are used to coordinate the activities of individuals who have a variety of goals and purposes. History has taught that over-regulation (too many rules) makes it difficult to compete and under-regulation (too few rules) can lead to crisis. This implies an optimal number of rules that avoids these two extremes. Rules create boundaries that define the latitude within which an individual has to perform their activities. This paper creates a Toy Model of a work environment and examines it with respect to the latitude provided to a normal individual and the latitude provided to an insider threat. Simulations with the Toy Model illustrate four regimes with respect to an insider threat: under-regulated, possibly optimal, tipping point, and over-regulated. These regimes depend upon the number of rules (N) and the minimum latitude (Lmin) required by a normal individual to carry out their activities. The Toy Model is then mapped onto the standard 1D Percolation Model from theoretical physics, and the same behavior is observed. This allows the Toy Model to be generalized to a wide array of more complex models that have been well studied by the theoretical physics community and also show the same behavior. Finally, by estimating N and Lmin, it should be possible to determine the regime of any particular environment.
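
The abstract does not spell out the model's mechanics, but the idea lends itself to a small simulation. The sketch below is a loose, illustrative reading of a 1D toy model: each of N rules blocks a short segment of a unit interval of "latitude," and the environment remains workable as long as the largest unblocked gap is at least Lmin. The segment width, placement, and threshold are assumptions made here for illustration, and the script distinguishes only workable from over-regulated rather than the paper's four regimes.

```python
import random

def largest_free_gap(n_rules, rule_width=0.01, trials=200):
    """Average size of the largest unblocked interval of 'latitude' after
    n_rules randomly placed rules each block a short segment of [0, 1].
    (Toy illustration only, not the paper's model.)"""
    total = 0.0
    for _ in range(trials):
        # Sorted left edges of the blocked segments.
        edges = sorted(random.random() * (1 - rule_width) for _ in range(n_rules))
        gaps, prev_end = [], 0.0
        for left in edges:
            gaps.append(max(0.0, left - prev_end))
            prev_end = max(prev_end, left + rule_width)
        gaps.append(max(0.0, 1.0 - prev_end))
        total += max(gaps)
    return total / trials

L_min = 0.05  # latitude a normal individual needs (assumed value)
for n in (5, 50, 200, 800):
    gap = largest_free_gap(n)
    regime = "workable" if gap >= L_min else "over-regulated"
    print(f"N={n:4d}  largest free gap ~ {gap:.3f}  -> {regime}")
```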

Domain mismatch compensation for speaker recognition using a library of whiteners

Published in:
IEEE Signal Process. Lett., Vol. 22, No. 11, November 2015, pp. 2000-2003.

Summary

The development of the i-vector framework for generating low dimensional representations of speech utterances has led to considerable improvements in speaker recognition performance. Although these gains have been achieved in periodic National Institute of Standards and Technology (NIST) evaluations, the problem of domain mismatch, where the system development data and the application data are collected from different sources, remains a challenging one. The impact of domain mismatch was a focus of the Johns Hopkins University (JHU) 2013 speaker recognition workshop, where a domain adaptation challenge (DAC13) corpus was created to address this problem. This paper proposes an approach to domain mismatch compensation for applications where in-domain development data is assumed to be unavailable. The method is based on a generalization of data whitening used in association with i-vector length normalization and utilizes a library of whitening transforms trained at system development time using strictly out-of-domain data. The approach is evaluated on the 2013 domain adaptation challenge task and is shown to compare favorably to in-domain conventional whitening and to nuisance attribute projection (NAP) inter-dataset variability compensation.
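
As a rough sketch of the general ingredients named above (data whitening followed by i-vector length normalization, with a library of whitening transforms trained offline), the NumPy snippet below trains several whiteners and picks one for an application batch. The selection rule used here (closest-to-identity covariance after whitening) and the random stand-in data are assumptions for illustration, not the paper's method.

```python
import numpy as np

def train_whitener(ivectors):
    """Return (mean, W) such that W @ (x - mean) has roughly identity covariance."""
    mu = ivectors.mean(axis=0)
    cov = np.cov(ivectors - mu, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    W = vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, 1e-10))) @ vecs.T
    return mu, W

def whiten_and_length_normalize(x, mu, W):
    y = W @ (x - mu)
    return y / np.linalg.norm(y)

def pick_whitener(library, batch):
    """Hypothetical selection rule: choose the whitener that leaves the
    batch's covariance closest to the identity."""
    def score(entry):
        mu, W = entry
        z = (batch - mu) @ W.T
        return np.linalg.norm(np.cov(z, rowvar=False) - np.eye(z.shape[1]))
    return min(library, key=score)

# Toy usage with random vectors standing in for real i-vectors.
rng = np.random.default_rng(0)
library = [train_whitener(rng.normal(scale=s, size=(500, 20))) for s in (0.5, 1.0, 2.0)]
app_batch = rng.normal(scale=1.1, size=(100, 20))
mu, W = pick_whitener(library, app_batch)
print(whiten_and_length_normalize(app_batch[0], mu, W)[:5])
```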

Unlocking user-centered design methods for building cyber security visualizations

Published in:
Proceedings of 2015 IEEE Symposium on Visualization for Cyber Security (VizSec)

Summary

User-centered design can aid visualization designers to build better, more practical tools that meet the needs of cyber security users. In this paper, we discuss three design methods and illustrate how each method informed two real-world cyber security visualization projects which resulted in successful deployments to users.

VAST Challenge 2015: Mayhem at Dinofun World

Published in:
Proceedings of 2015 IEEE Conference on Visual Analytics Science and Technology (VAST)

Summary

A fictitious amusement park and a larger-than-life hometown football hero provided participants in the VAST Challenge 2015 with an engaging yet complex storyline and setting in which to analyze movement and communication patterns.

Mission assurance as a function of scale

Published in:
36th NATO Information Systems Technology Panel, 14-16 October 2015.

Summary

Since all Department of Defense (DoD) missions depend on cyber assets and capabilities, a dynamic and accurate cyber dependency analysis is a critical component of mission assurance. Mission analysis aims to identify hosts and applications that are "mission critical" so they can be monitored, and resources preferentially allocated to mitigate risks. For missions limited in duration and scale (tactical missions), dependency analysis is possible to conceptualize in principle, although currently difficult to realize in practice. However, for missions of long duration and large scale (strategic missions), the situation is murkier. In particular, cyber researchers struggle to find technologies that will scale up to large numbers of hosts and applications, since a typical strategic DoD mission might expect to leverage a large enterprise network. In this position paper, we argue that the difficulty is fundamental: as the mission timescale becomes longer and longer, and the number of hosts associated with the mission becomes larger and larger, the mission encompasses the entire network, and mission defense becomes indistinguishable from classic network defense. Concepts generally associated with mission assurance, such as fight-through, are not well suited to these long timescales and large networks. This train of thought leads us to reconsider the concept of "scalability" as it applies to mission assurance, and suggest that a hierarchical abstraction approach be applied. Large-scale, long-duration mission assurance may be treated as the interaction of many small-scale, short-duration tactical missions.
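
As a back-of-the-envelope illustration of the hierarchical-abstraction idea suggested above (not the paper's method), a long, large mission can be represented as a tree of short tactical sub-missions whose assurance status is rolled up from the cyber assets at the leaves. The mission names, assets, and all-children-must-hold rollup rule below are hypothetical.

```python
# Illustration only: roll mission status up a tree of tactical sub-missions,
# each of which depends on a handful of cyber assets (all names hypothetical).
asset_status = {"web01": True, "db01": False, "radar03": True, "link16": True}

mission_tree = {
    "strategic_mission": ["logistics", "surveillance"],
    "logistics": ["web01", "db01"],        # tactical sub-mission -> assets
    "surveillance": ["radar03", "link16"],
}

def assured(node):
    """A node is assured only if every child (sub-mission or asset) is assured."""
    if node in asset_status:
        return asset_status[node]
    return all(assured(child) for child in mission_tree[node])

for mission in ("logistics", "surveillance", "strategic_mission"):
    print(mission, "assured" if assured(mission) else "at risk")
```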

Control jujutsu: on the weaknesses of fine-grained control flow integrity

Published in:
22nd ACM Conf. on Computer and Communications Security, 12-16 October 2015.

Summary

Control flow integrity (CFI) has been proposed as an approach to defend against control-hijacking memory corruption attacks. CFI works by assigning tags to indirect branch targets statically and checking them at runtime. Coarse-grained enforcements of CFI that use a small number of tags to reduce the performance overhead have been shown to be ineffective. As a result, a number of recent efforts have focused on fine-grained enforcement of CFI as it was originally proposed. In this work, we show that even a fine-grained form of CFI with an unlimited number of tags and a shadow stack (to check calls and returns) is ineffective in protecting against malicious attacks. We show that many popular code bases such as Apache and Nginx use coding practices that create flexibility in their intended control flow graph (CFG) even when a strong static analyzer is used to construct the CFG. These flexibilities allow an attacker to gain control of the execution while strictly adhering to a fine-grained CFI. We then construct two proof-of-concept exploits that attack an unlimited-tag CFI system with a shadow stack. We also evaluate the difficulties of generating a precise CFG using scalable static analysis for real-world applications. Finally, we perform an analysis on a number of popular applications that highlights the availability of such attacks.
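
The sketch below is a conceptual toy model, in Python rather than at the binary level, of tag-based CFI with a shadow stack. The point it illustrates is the one made above: when several legitimate targets share a tag class, the policy permits transfers the programmer never intended. The function names, tags, and call sites are invented for illustration and have no connection to the paper's exploits.

```python
# Conceptual toy model of tag-based CFI with a shadow stack (illustration only).
# Indirect transfers are allowed only to targets carrying the tag assigned to
# the call site; returns must match the shadow stack.
TARGET_TAGS = {"handler_a": "T1", "handler_b": "T1", "logger": "T2"}
ALLOWED_TAG_AT_SITE = {"dispatch_site": "T1"}

shadow_stack = []

def indirect_call(site, target):
    if TARGET_TAGS.get(target) != ALLOWED_TAG_AT_SITE[site]:
        raise RuntimeError(f"CFI violation: {site} -> {target}")
    shadow_stack.append(site)
    print(f"{site} calls {target}")

def function_return(site):
    if not shadow_stack or shadow_stack.pop() != site:
        raise RuntimeError("shadow-stack violation on return")

indirect_call("dispatch_site", "handler_a")  # intended edge: tag matches
function_return("dispatch_site")
# handler_b carries the same tag, so this transfer also passes the check even
# if the programmer never intended it: the kind of flexibility described above.
indirect_call("dispatch_site", "handler_b")
function_return("dispatch_site")
try:
    indirect_call("dispatch_site", "logger")  # blocked: wrong tag class
except RuntimeError as err:
    print(err)
```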

Timely rerandomization for mitigating memory disclosures

Published in:
22nd ACM Conf. on Computer and Communications Security, 12-16 October 2015.

Summary

Address Space Layout Randomization (ASLR) can increase the cost of exploiting memory corruption vulnerabilities. One major weakness of ASLR is that it assumes the secrecy of memory addresses and is thus ineffective in the face of memory disclosure vulnerabilities. Even fine-grained variants of ASLR are shown to be ineffective against memory disclosures. In this paper we present an approach that synchronizes randomization with potential runtime disclosure. By applying rerandomization to the memory layout of a process every time it generates an output, our approach renders disclosures stale by the time they can be used by attackers to hijack control flow. We have developed a fully functioning prototype for x86_64 C programs by extending the Linux kernel, GCC, and the libc dynamic linker. The prototype operates on C source code and recompiles programs with a set of augmented information required to track pointer locations and support runtime rerandomization. Using this augmented information we dynamically relocate code segments and update code pointer values during runtime. Our evaluation on the SPEC CPU2006 benchmark, along with other applications, shows that our technique incurs a very low performance overhead (2.1% on average).
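
The toy simulation below illustrates only the timing idea described above, not the paper's kernel, compiler, or linker machinery: the "process" rebases its code on every output it emits, so an address disclosed in one response is stale by the time an attacker tries to reuse it. The addresses and offsets are made up.

```python
import random

class ToyProcess:
    """Illustrative model only: the code segment is rebased every time the
    process emits output, so an address disclosed in that output is stale."""
    def __init__(self):
        self.base = random.getrandbits(32)
        self.target_offset = 0x1234  # offset of some function an attacker wants

    def produce_output(self):
        disclosed = self.base + self.target_offset  # a memory disclosure leaks out
        self.base = random.getrandbits(32)          # rerandomize as output is emitted
        return disclosed

    def hijack(self, address):
        return address == self.base + self.target_offset

p = ToyProcess()
leaked = p.produce_output()   # attacker reads a code pointer from the output
print("hijack with the leaked address succeeds:", p.hijack(leaked))
```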

Characterizing phishing threats with natural language processing

Published in:
2015 IEEE Conf. on Communications and Network Security (CNS), 28-30 September 2015.

Summary

Spear phishing is a widespread concern in the modern network security landscape, but there are few metrics that measure the extent to which reconnaissance is performed on phishing targets. Spear phishing emails closely match the expectations of the recipient, based on details of their experiences and interests, making them a popular propagation vector for harmful malware. In this work we use Natural Language Processing techniques to investigate a specific real-world phishing campaign and quantify attributes that indicate a targeted spear phishing attack. Our phishing campaign data sample comprises 596 emails - all containing a web bug and a Curriculum Vitae (CV) PDF attachment - sent to our institution from a foreign IP space. The campaign was found to exclusively target specific demographics within our institution. Performing a semantic similarity analysis between the senders' CV attachments and the recipients' LinkedIn profiles, we conclude with high statistical certainty (p < 10^-4) that the attachments contain targeted rather than randomly selected material. Latent Semantic Analysis further demonstrates that individuals who were a primary focus of the campaign received CVs that are highly topically clustered. These findings differentiate this campaign from one that leverages random spam.
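
A sketch of the kind of analysis described above (TF-IDF features, Latent Semantic Analysis via truncated SVD, cosine similarity, and a permutation test for significance) is shown below using scikit-learn. The tiny placeholder documents and the specific test statistic are assumptions for illustration; they are not the campaign data or necessarily the paper's exact pipeline.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder documents, not the campaign data.
cv_texts = ["radar signal processing antenna arrays phd",
            "machine learning natural language processing research",
            "cyber security malware reverse engineering"]
profile_texts = ["staff member working on radar and antenna design",
                 "researcher in natural language processing and machine learning",
                 "network defense and malware analysis engineer"]

tfidf = TfidfVectorizer().fit(cv_texts + profile_texts)
lsa = TruncatedSVD(n_components=2, random_state=0)
vectors = lsa.fit_transform(tfidf.transform(cv_texts + profile_texts))
cvs, profiles = vectors[:3], vectors[3:]

# Observed statistic: mean similarity between each CV and its actual recipient.
observed = np.mean([cosine_similarity([c], [p])[0, 0] for c, p in zip(cvs, profiles)])

# Permutation test: how often does a random CV-to-recipient pairing do as well?
rng = np.random.default_rng(0)
null = [np.mean([cosine_similarity([c], [p])[0, 0]
                 for c, p in zip(cvs[rng.permutation(3)], profiles)])
        for _ in range(1000)]
print("observed similarity:", round(observed, 3),
      " p ~", np.mean(np.array(null) >= observed))
```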

Very large graphs for information extraction (VLG) - detection and inference in the presence of uncertainty

Summary

In numerous application domains relevant to the Department of Defense and the Intelligence Community, data of interest take the form of entities and the relationships between them, and these data are commonly represented as graphs. Under the Very Large Graphs for Information Extraction effort--a one-year proof-of-concept study--MIT LL developed novel techniques for anomalous subgraph detection, building on tools in the signal processing research literature. This report documents the technical results of this effort. Two datasets--a snapshot of Thomson Reuters' Web of Science database and a stream of web proxy logs--were parsed, and graphs were constructed from the raw data. From the phenomena in these datasets, several algorithms were developed to model the dynamic graph behavior, including a preferential attachment mechanism with memory, a streaming filter to model a graph as a weighted average of its past connections, and a generalized linear model for graphs where connection probabilities are determined by additional side information or metadata. A set of metrics was also constructed to facilitate comparison of techniques. The study culminated in a demonstration of the algorithms on the datasets of interest, in addition to simulated data. Performance in terms of detection, estimation, and computational burden was measured according to the metrics. Among the highlights of this demonstration were the detection of emerging coauthor clusters in the Web of Science data, detection of botnet activity in the web proxy data after 15 minutes (which took 10 days to detect using state-of-the-practice techniques), and demonstration of the core algorithm on a simulated 1-billion-vertex graph using a commodity computing cluster.
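
Of the modeling ingredients listed above, the "weighted average of past connections" idea is easy to sketch: the snippet below keeps an exponentially weighted running average of adjacency snapshots and flags edges whose current activity sits far above that average. The smoothing factor, threshold, and injected burst are illustrative choices, not values from the report.

```python
import numpy as np

def stream_filter(snapshots, lam=0.2, threshold=5.0):
    """Keep an exponentially weighted average of past adjacency snapshots and
    flag edges whose current weight is far above it (illustrative sketch)."""
    expected = np.zeros_like(snapshots[0])
    for t, A in enumerate(snapshots):
        if t > 2:  # let the running average settle first
            for i, j in np.argwhere(A - expected > threshold):
                print(f"t={t}: edge ({i},{j}) far above its running average")
        expected = (1 - lam) * expected + lam * A  # weighted average of the past

rng = np.random.default_rng(1)
snapshots = [rng.poisson(1.0, size=(5, 5)).astype(float) for _ in range(10)]
snapshots[7][2, 4] += 8.0  # inject a burst of activity on one edge
stream_filter(snapshots)
```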

Cyber network mission dependencies

Published in:
MIT Lincoln Laboratory Report TR-1189

Summary

Cyber assets are critical to mission success in every arena of the Department of Defense. Because all DoD missions depend on cyber infrastructure, failure to secure network assets and assure the capabilities they enable will pose a fundamental risk to any defense mission. The impact of a cyber attack is not well understood by warfighters or leadership. It is critical that the DoD develop better cognizance of Cyber Network Mission Dependencies (CNMD). This report identifies the major drivers for mapping missions to network assets, introduces existing technologies in the mission-mapping landscape, and proposes directions for future development.