Publications
A data-stream classification system for investigating terrorist threats
Summary
The role of cyber forensics in criminal investigations has greatly increased in recent years due to the wealth of data that is collected and available to investigators. Physical forensics has also experienced a data volume and fidelity revolution due to advances in methods for DNA and trace evidence analysis. Key...
D4M and large array databases for management and analysis of large biomedical imaging data
Summary
Advances in medical imaging technologies have enabled the acquisition of increasingly large datasets. Current state-of-the-art confocal or multi-photon imaging technology can produce biomedical datasets in excess of 1 TB each. Typical approaches for analyzing large datasets rely on downsampling the original datasets or leveraging distributed computing resources where small...
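The subvolume-query pattern such systems enable is easy to sketch: store the volume in a chunked, disk-backed array and read only the region a given analysis needs. The sketch below uses HDF5 chunked storage as a stand-in for the D4M/array-database combination the paper describes; the file name, volume shape, and chunk size are illustrative assumptions.

```python
# Stand-in for the array-database access pattern: a chunked, disk-backed
# 3-D volume from which analyses read only the subvolume they need.
# File name, shape, and chunk size are illustrative assumptions.
import numpy as np
import h5py

with h5py.File("volume.h5", "w") as f:
    # With chunked storage, only chunks actually written (or read) touch
    # the disk, so the logically 32 GiB volume costs little to create.
    dset = f.create_dataset("stack", shape=(4096, 4096, 1024),
                            dtype="uint16", chunks=(128, 128, 64))
    dset[0:128, 0:128, 0:64] = np.random.randint(
        0, 65535, size=(128, 128, 64), dtype="uint16")

with h5py.File("volume.h5", "r") as f:
    # Query a region of interest directly instead of downsampling the
    # original volume or loading the entire dataset into memory.
    roi = f["stack"][0:128, 0:128, 0:64]
    print(roi.shape, roi.mean())
```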
Rapid sequence identification of potential pathogens using techniques from sparse linear algebra
Summary
The decreasing costs and increasing speed and accuracy of DNA sample collection, preparation, and sequencing have rapidly produced an enormous volume of genetic data. However, fast and accurate analysis of the samples remains a bottleneck. Here we present D4RAGenS, a genetic sequence identification algorithm that exhibits the Big Data handling...
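The sparse linear algebra idea named in the title can be illustrated compactly: encode each sequence as a sparse row of hashed k-mer counts, so a sparse matrix product between samples and references counts shared k-mers. This is a hedged sketch of that general technique, not D4RAGenS itself; the k-mer length, hash bucketing, and toy sequences are assumptions.

```python
# Hedged sketch of sparse-linear-algebra sequence scoring: rows are
# sequences, columns are hashed k-mers, and a sparse matrix product counts
# k-mers shared between samples and references. k, the bucket count, and
# the toy sequences are assumptions, not D4RAGenS's actual parameters.
import numpy as np
from scipy.sparse import csr_matrix

def kmer_matrix(seqs, k=8, buckets=2**20):
    rows, cols = [], []
    for i, s in enumerate(seqs):
        for j in range(len(s) - k + 1):
            rows.append(i)
            cols.append(hash(s[j:j + k]) % buckets)  # hashed k-mer column
    data = np.ones(len(rows), dtype=np.int32)
    # Duplicate (row, col) entries are summed, giving k-mer counts.
    return csr_matrix((data, (rows, cols)), shape=(len(seqs), buckets))

samples = ["ACGTACGTGGTTAACC", "TTGGCCAACGGTACGT"]
refs = ["ACGTACGTGGTTAACCAGT", "GGGGCCCCAAAATTTT"]

# scores[i, j] = number of k-mers sample i shares with reference j.
scores = kmer_matrix(samples) @ kmer_matrix(refs).T
print(scores.toarray())
```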
Using a big data database to identify pathogens in protein data space [e-print]
Summary
Current metagenomic analysis algorithms require significant computing resources, can report excessive false positives (type I errors), may miss organisms (type II errors/false negatives), or scale poorly on large datasets. This paper explores using big data database technologies to characterize very large metagenomic DNA sequences in protein space, with the ultimate...
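Characterizing DNA sequences "in protein space" implies translating reads before protein-level comparison, and six-frame translation is the standard way to do that. Below is a minimal sketch using Biopython's translation utilities; the example read is made up, and the database lookup stage the paper explores is not shown.

```python
# Minimal sketch of moving from DNA to protein space via six-frame
# translation, using Biopython's standard codon table. The read is made
# up; the protein-level database matching stage is not shown.
from Bio.Seq import Seq

def six_frame_translate(dna):
    fwd = Seq(dna)
    frames = []
    for strand in (fwd, fwd.reverse_complement()):
        for offset in range(3):
            sub = strand[offset:]
            sub = sub[:len(sub) - len(sub) % 3]  # whole codons only
            frames.append(str(sub.translate()))
    return frames

for frame in six_frame_translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"):
    print(frame)
```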
Genetic sequence matching using D4M big data approaches
Summary
Recent technological advances in Next Generation Sequencing tools have led to increasing speeds of DNA sample collection, preparation, and sequencing. One instrument can produce over 600 Gb of genetic sequence data in a single run. This creates a need for new methods that can efficiently handle the increasing workload. We propose a new method...
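The D4M approach represents data as associative arrays of (row, column) pairs, so sequence matching reduces to intersecting the k-mer sets attached to each sequence ID. The sketch below emulates that pattern in plain Python rather than calling the actual D4M API; the sequence IDs, sequences, and k-mer length are hypothetical.

```python
# Plain-Python emulation of the associative-array pattern (not the actual
# D4M API): (sequence ID, k-mer) pairs form a sparse boolean array, and
# matching is set intersection between rows. IDs, sequences, and k are
# hypothetical.
from collections import defaultdict

def kmer_index(seqs, k=10):
    index = defaultdict(set)  # sequence ID -> set of k-mers present
    for sid, s in seqs.items():
        for j in range(len(s) - k + 1):
            index[sid].add(s[j:j + k])
    return index

unknown = kmer_index({"read1": "ACGTACGTGGTTAACCGGAT"})
reference = kmer_index({"refA": "ACGTACGTGGTTAACCGGATTTC",
                        "refB": "GGGGAAAACCCCTTTTGGGG"})

for rid, rkmers in unknown.items():
    for ref_id, refk in reference.items():
        # Shared k-mer count plays the role of an associative-array AND.
        print(rid, ref_id, len(rkmers & refk))
```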
Development and use of a comprehensive humanitarian assessment tool in post-earthquake Haiti
Summary
This paper describes a comprehensive humanitarian assessment tool designed and used following the January 2010 Haiti earthquake. The tool was developed under Joint Task Force–Haiti coordination using indicators of humanitarian needs to support decision making by the United States Government, agencies of the United Nations, and various non-governmental...
Measurement of aerosol-particle trajectories using a structured laser beam
Summary
We introduce what we believe to be a new concept for measuring micrometer-sized particle trajectories in an inlet air stream. The technique uses a light source and a mask to generate a spatial pattern of light within a volume in space. Particles traverse the illumination volume and elastically...
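Under the simplest reading of the concept, a particle crossing the patterned illumination produces a pulse train whose timing samples the mask along the particle's path, so the known stripe geometry lets one recover the transit speed. The toy simulation below makes strong simplifying assumptions (constant velocity, an idealized three-stripe mask, Gaussian pulses) and is not the paper's actual reconstruction method.

```python
# Toy simulation of the measurement idea: a particle crossing a structured
# illumination volume yields a pulse train that samples the mask pattern
# along its path. Mask layout, speed, and sampling are assumed values.
import numpy as np

fs = 1e6                    # detector sampling rate, Hz (assumed)
v_true = 25.0               # particle speed across the pattern, m/s (assumed)
stripes_m = np.array([0.5e-3, 1.5e-3, 3.5e-3])  # bright-stripe positions, m

t = np.arange(0, 2e-4, 1 / fs)
signal = np.zeros_like(t)
for x in stripes_m:
    # Each stripe crossing yields a short scattering pulse at time x / v.
    signal += np.exp(-0.5 * ((t - x / v_true) / 2e-6) ** 2)

# Rising edges of a simple threshold mark each pulse arrival; known stripe
# spacing divided by measured pulse spacing recovers the transit speed.
above = signal > 0.5
arrivals = t[np.flatnonzero(above[1:] & ~above[:-1]) + 1]
v_est = (stripes_m[-1] - stripes_m[0]) / (arrivals[-1] - arrivals[0])
print(f"estimated speed: {v_est:.1f} m/s")  # ~25.0
```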