Publications

Refine Results

(Filters Applied) Clear All

A taxonomy of buffer overflows for evaluating static and dynamic software testing tools

Published in:
NIST Workshop on Software Security, Assurance Tools, Techniques, and Metrics, 7-8 November 2005.

Summary

A taxonomy that uses twenty-two attributes to characterize C-program overflows was used to construct 291 small C-program test cases that can be used to diagnostically determine the basic capabilities of static and dynamic analysis buffer overflow detection tools. Attributes in the taxonomy include the buffer location (e.g. stack, heap, data region, BSS, shared memory); scope difference between buffer allocation and access; index, pointer, and alias complexity when addressing buffer elements; complexity of the control flow and loop structure surrounding the overflow; type of container the buffer is within (e.g. structure, union, array); whether the overflow is caused by a signed/unsigned type error; the overflow magnitude and direction; and whether the overflow is discrete or continuous. As an example, the 291 test cases were used to measure the detection, false alarm, and confusion rates of five static analysis tools. They reveal specific strengths and limitations of tools and suggest directions for improvements.
READ LESS

Summary

A taxonomy that uses twenty-two attributes to characterize C-program overflows was used to construct 291 small C-program test cases that can be used to diagnostically determine the basic capabilities of static and dynamic analysis buffer overflow detection tools. Attributes in the taxonomy include the buffer location (e.g. stack, heap, data...

READ MORE

Using a diagnostic corpus of C programs to evaluate buffer overflow detection by static analysis tools

Published in:
10th European Software Engineering Conf., 5-9 September 2005.

Summary

A corpus of 291 small C-program test cases was developed to evaluate static and dynamic analysis tools designed to detect buffer overflows. The corpus was designed and labeled using a new, comprehensive buffer overflow taxonomy. It provides a benchmark to measure detection, false alarm, and confusion rates of tools, and also suggests areas for tool enhancement. Experiments with five tools demonstrate that some modern static analysis tools can accurately detect overflows in simple test cases but that others have serious limitations. For example, PolySpace demonstrated a superior detection rate, missing only one detection. Its performance could be enhanced if extremely long run times were reduced, and false alarms were eliminated for some C library functions. ARCHER performed well with no false alarms whatsoever. It could be enhanced by improving inter-procedural analysis and handling of C library functions. Splint detected significantly fewer overflows and exhibited the highest false alarm rate. Improvements in loop handling and reductions in false alarm rate would make it a much more useful tool. UNO had no false alarms, but missed overflows in roughly half of all test cases. It would need improvement in many areas to become a useful tool. BOON provided the worst performance. It did not detect overflows well in string functions, even though this was a design goal.
READ LESS

Summary

A corpus of 291 small C-program test cases was developed to evaluate static and dynamic analysis tools designed to detect buffer overflows. The corpus was designed and labeled using a new, comprehensive buffer overflow taxonomy. It provides a benchmark to measure detection, false alarm, and confusion rates of tools, and...

READ MORE

Dynamic buffer overflow detection

Published in:
Workshop on Defining the State of the Art in Security Software Tools, 10-11 August 2005.

Summary

The capabilities of seven dynamic buffer overflow detection tools (Chaperon, Valgrind, CCured, CRED, Insure++, ProPolice and TinyCC) are evaluated in this paper. These tools employ different approaches to runtime buffer overflow detection and range from commercial products to open-source gcc-enhancements. A comprehensive testsuite was developed consisting of specifically-designed test cases and model programs containing real-world vulnerabilities. Insure++, CCured and CRED provide the highest buffer overflow detection rates, but only CRED provides an open-source, extensible and scalable solution to detecting buffer overflows. Other tools did not detect off-by-one errors, did not scale to large programs, or performed poorly on complex programs.
READ LESS

Summary

The capabilities of seven dynamic buffer overflow detection tools (Chaperon, Valgrind, CCured, CRED, Insure++, ProPolice and TinyCC) are evaluated in this paper. These tools employ different approaches to runtime buffer overflow detection and range from commercial products to open-source gcc-enhancements. A comprehensive testsuite was developed consisting of specifically-designed test cases...

READ MORE

Dynamic buffer overflow detection

Published in:
Workshop on the Evaluation of Software Defect Detection Tools, 10 June 2005.

Summary

The capabilities of seven dynamic buffer overflow detection tools (Chaperon, Valgrind, CCured, CRED, Insure++, ProPolice and TinyCC) are evaluated in this paper. These tools employ different approaches to runtime buffer overflow detection and range from commercial products to open source gcc-enhancements. A comprehensive test suite was developed consisting of specifically-designed test cases and model programs containing real-world vulnerabilities. Insure++, CCured and CRED provide the highest buffer overflow detection rates, but only CRED provides an open-source, extensible and scalable solution to detecting buffer overflows. Other tools did not detect off-by-one errors, did not scale to large programs, or performed poorly on complex programs.
READ LESS

Summary

The capabilities of seven dynamic buffer overflow detection tools (Chaperon, Valgrind, CCured, CRED, Insure++, ProPolice and TinyCC) are evaluated in this paper. These tools employ different approaches to runtime buffer overflow detection and range from commercial products to open source gcc-enhancements. A comprehensive test suite was developed consisting of specifically-designed...

READ MORE

An annotated review of past papers on attack graphs

Published in:
MIT Lincoln Laboratory Report IA-1

Summary

This report reviews past research papers that describe how to construct attack graphs, how to use them to improve security of computer networks, and how to use them to analyze alerts from intrusion detection systems. Two commercial systems are described [I, 2], and a summary table compares important characteristics of past research studies. For each study, information is provided on the number of attacker goals, how graphs are constructed, sizes of networks analyzed, how well the approach scales to larger networks, and the general approach. Although research has made significant progress in the past few years, no system has analyzed networks with more than 20 hosts, and computation for most approaches scales poorly and would be impractical for networks with more than even a few hundred hosts. Current approaches also are limited because many require extensive and difficult-to-obtain details on attacks, many assume that host-to-host reachability information between all hosts is already available, and many produce an attack graph but do not automatically generate recommendations from that graph. Researchers have suggested promising approaches to alleviate some of these limitations, including grouping hosts to improve scaling, using worst-case default values for unknown attack details, and symbolically analyzing attack graphs to generate recommendations that improve security for critical hosts. Future research should explore these and other approaches to develop attack graph construction and analysis algorithms that can be applied to large enterprise networks.
READ LESS

Summary

This report reviews past research papers that describe how to construct attack graphs, how to use them to improve security of computer networks, and how to use them to analyze alerts from intrusion detection systems. Two commercial systems are described [I, 2], and a summary table compares important characteristics of...

READ MORE

Evaluating static analysis tools for detecting buffer overflows in C code

Published in:
Thesis (MLA)--Harvard University, 2005.

Summary

This project evaluated five static analysis tools using a diagnostic test suite to determine their strengths and weaknesses in detecting a variety of buffer overflow flaws in C code. Detection, false alarm, and confusion rates were measured, along with execution time. PolySpace demonstrated a superior detection rate on the basic test suite, missing only one out of a possible 291 detections. It may benefit from improving its treatment of signal handlers, and reducing both its false alarm rate (particularly for C library functions) and execution time. ARCHER performed quite well with no false alarms whatsoever; a few key enhancements, such as in its inter-procedural analysis and handling of C library functions, would boost its detection rate and should improve its performance on real-world code. Splint detected significantly fewer overflows and exhibited the highest false alarm rate. Improvements in its loop handling, and reductions in its false alarm rate would make it a much more useful tool. UNO had no false alarms, but missed a broad variety of overflows amounting to nearly half of the possible detections in the test suite. It would need improvement in many areas to become a very useful tool. BOON was clearly at the back of the pack, not even performing well on the subset of test cases where it could have been expected to function. The project also provides a buffer overflow taxonomy, along with a test suite generator and other tools, that can be used by others to evaluate code analysis tools with respect to buffer overflow detection.
READ LESS

Summary

This project evaluated five static analysis tools using a diagnostic test suite to determine their strengths and weaknesses in detecting a variety of buffer overflow flaws in C code. Detection, false alarm, and confusion rates were measured, along with execution time. PolySpace demonstrated a superior detection rate on the basic...

READ MORE

Passive operating system identification from TCP/IP packet headers

Published in:
ICDM Workshop on Data Mining for Computer Security, DMSEC, 19 November 2003.

Summary

Accurate operating system (OS) identification by passive network traffic analysis can continuously update less-frequent active network scans and help interpret alerts from intrusion detection systems. The most recent open-source passive OS identification tool (ettercap) rejects 70% of all packets and has a high 75-class error rate of 30% for non-rejected packets on unseen test data. New classifiers were developed using machine-learning approaches including cross-validation testing, grouping OS names into fewer classes, and evaluating alternate classifier types. Nearest neighbor and binary tree classifiers provide a low 9-class OS identification error rate of roughly 10% on unseen data without rejecting packets. This error rate drops to nearly zero when 10% of the packets are rejected.
READ LESS

Summary

Accurate operating system (OS) identification by passive network traffic analysis can continuously update less-frequent active network scans and help interpret alerts from intrusion detection systems. The most recent open-source passive OS identification tool (ettercap) rejects 70% of all packets and has a high 75-class error rate of 30% for non-rejected...

READ MORE

The effect of identifying vulnerabilities and patching software on the utility of network intrusion detection

Published in:
Proc. 5th Int. Symp. on Recent Advances in Intrusion Detection, RAID 2002, 16-18 October 2002, pp. 307-326.

Summary

Vulnerability scanning and installing software patches for known vulnerabilities greatly affects the utility of network-based intrusion detection systems that use signatures to detect system compromises. A detailed timeline analysis of important remote-to-local vulnerabilities demonstrates (1) vulnerabilities in widely-used server software are discovered infrequently (at most 6 times a year) and (2) Software patches to prevent vulnerabilities from being exploited are available before or simultaneously with signatures. Signature-based intrusion detection systems will thus never detect successful system compromises on small secure sites when patches are installed as soon as they are available. Network intrusion detection systems may detect successful system compromises on large sites where it is impractical to eliminate all known vulnerabilities. On such sites, information from vulnerability scanning can be used to prioritize the large numbers of extraneous alerts caused by failed attacks and normal background MIC. On one class B network with roughly 10 web servers, this approach successfully filtered out 95% of all remote-to-local alerts.
READ LESS

Summary

Vulnerability scanning and installing software patches for known vulnerabilities greatly affects the utility of network-based intrusion detection systems that use signatures to detect system compromises. A detailed timeline analysis of important remote-to-local vulnerabilities demonstrates (1) vulnerabilities in widely-used server software are discovered infrequently (at most 6 times a year) and...

READ MORE

Automated generation and analysis of attack graphs

Published in:
Proc. of the 2002 IEEE Symp. on Security and Privacy, 12-15 May 2002, pp. 254-265.

Summary

An integral part of modeling the global view of network security is constructing attack graphs. In practice, attack graphs are produced manually by Red Teams. Construction by hand, however, is tedious, error-prone, and impractical for attack graphs have larger than a hundred nodes. In this paper we present an automated technique for generating and analyzing attack graphs. We base our technique on symbolic model checking algorithms, letting us construct attack graphs automatically and efficiently. We also describe two analyses to help decide which attacks would be most cost-effective to guard against. We implemented our techniques in a tool suite and tested it on a small network example, which includes models of a firewall and an intrusion detection system.
READ LESS

Summary

An integral part of modeling the global view of network security is constructing attack graphs. In practice, attack graphs are produced manually by Red Teams. Construction by hand, however, is tedious, error-prone, and impractical for attack graphs have larger than a hundred nodes. In this paper we present an automated...

READ MORE

Extending the DARPA off-line intrusion detection evaluations

Published in:
DARPA Information Survivability Conf. and Exposition II, 12-14 June 2001, pp. 35-45.

Summary

The 1998 and 1999 DARPA off-line intrusion detection evaluations assessed the performance of intrusion detection systems using realistic background traffic and many examples of realistic attacks. This paper discusses three extensions to these evaluations. First, the Lincoln Adaptable Real-time Information Assurance Testbed (LARIAT) has been developed to simplify intrusion detection development and evaluation. LARIAT allows researchers and operational users to rapidly configure and run real-time intrusion detection and correlation tests with robust background traffic and attacks in their laboratories. Second, "Scenario Datasets" have been crafted to provide examples of multiple component attack scenarios instead of the atomic , attacks as found in past evaluations. Third, extensive analysis of the 1999 evaluation data and results has provided understanding of many attacks, their manifestations, and the features used to detect them. This analysis will be used to develop models of attacks, intrusion detection systems, and intrusion detection system alerts. Successful models could reduce the need for expensive experimentation, allow proof-of-concept analysis and simulations, and form the foundation of a theory of intrusion detection.
READ LESS

Summary

The 1998 and 1999 DARPA off-line intrusion detection evaluations assessed the performance of intrusion detection systems using realistic background traffic and many examples of realistic attacks. This paper discusses three extensions to these evaluations. First, the Lincoln Adaptable Real-time Information Assurance Testbed (LARIAT) has been developed to simplify intrusion detection...

READ MORE