Publications

Colorization of H&E stained tissue using deep learning

Published in:
40th Int. Conf. of the IEEE Engineering in Medicine and Biology Society, EMBC, 17-21 July 2018.

Summary

Histopathology is a critical tool in the diagnosis and stratification of cancer. Digital pathology involves the scanning of stained and fixed tissue samples to produce high-resolution images that can be used for computer-aided diagnosis and research. A common challenge in digital pathology relates to the quality and characteristics of staining, which can vary widely from center to center, and within the same institution, depending on the age of the stain and other human factors. In this paper, we examine the use of deep learning models for colorizing H&E stained tissue images and compare the results with traditional image processing/statistical approaches that have been developed for standardizing or normalizing histopathology images. We adapt existing deep learning models that have been developed for colorizing natural images and compare the results with models developed specifically for digital pathology. Our results show that deep learning approaches can standardize the colorization of H&E images. The performance, as measured by the chi-square statistic, shows that the deep learning approach can be nearly as good as current state-of-the-art normalization methods.
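
The abstract does not spell out how the chi-square comparison is computed; a common formulation, sketched below under the assumption that tiles are compared via per-channel color histograms (the function name, bin count, and synthetic inputs are illustrative, not the paper's), is:

```python
import numpy as np

def chi_square_distance(img_a, img_b, bins=64):
    """Chi-square distance between per-channel color histograms.

    img_a, img_b: uint8 RGB arrays of shape (H, W, 3).
    Smaller values indicate more similar color distributions.
    """
    total = 0.0
    for c in range(3):
        h_a, _ = np.histogram(img_a[..., c], bins=bins, range=(0, 255), density=True)
        h_b, _ = np.histogram(img_b[..., c], bins=bins, range=(0, 255), density=True)
        denom = h_a + h_b
        mask = denom > 0  # skip empty bins to avoid division by zero
        total += 0.5 * np.sum((h_a[mask] - h_b[mask]) ** 2 / denom[mask])
    return total

# Example: compare a recolorized tile against a reference-stain tile.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
out = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
print(chi_square_distance(ref, out))
```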

Mission assurance: beyond secure processing

Published in:
18th IEEE Int. Conf. on Software Quality, Reliability, and Security, QRS 2018, 16-20 July 2018, pp. 593-8.

Summary

The processor of a drone runs essential functions of sensing, communications, coordination, and control. This is the conventional view. But in today's cyber environment, the processor must also provide security to assure mission completion. We have been developing a secure processing architecture for mission assurance. A study on state-of-the-art secure processing technologies has revealed that no one-size-fits-all solution can fully meet our requirements. In fact, we have concluded that the provision of a secure processor as a mission assurance foundation must be holistic and should be approached from a systems perspective. We have thus applied a systems analysis approach to create a secure base for the system. This paper describes our journey of adapting and synergizing various secure processing technologies into a baseline asymmetric multicore processing architecture. We will also describe a functional and security co-design environment, created to customize and optimize the architecture in a design space consisting of hardware, software, performance, and assurance.

Curator: provenance management for modern distributed systems

Published in:
10th Int. Workshop on Theory and Practice of Provenance, TaPP, 11-12 July 2018.

Summary

Data provenance is a valuable tool for protecting and troubleshooting distributed systems. Careful design of the provenance components reduces the impact on the design, implementation, and operation of the distributed system. In this paper, we present Curator, a provenance management toolkit that can be easily integrated with microservice-based systems and other modern distributed systems. This paper describes the design of Curator and discusses how we have used Curator to add provenance to distributed systems. We find that our approach results in no changes to the design of these distributed systems and minimal additional code and dependencies to manage. In addition, Curator uses the same scalable infrastructure as the distributed system and can therefore scale with the distributed system.
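
Curator's actual API is not given in this abstract; purely as an illustration of the integration style it describes — wrapping existing service code so provenance records flow through infrastructure the system already has, with minimal extra code — a hypothetical decorator might look like this (all names here are assumptions, and `emit` stands in for publishing to the system's existing log or message bus):

```python
import functools
import json
import time
import uuid

def emit(record):
    # Stand-in for publishing to the system's existing logging/messaging
    # infrastructure; printing keeps the sketch self-contained.
    print(json.dumps(record))

def provenance(fn):
    """Wrap a service handler so each invocation emits a provenance record."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        record = {
            "id": str(uuid.uuid4()),
            "activity": fn.__name__,
            "started": time.time(),
            "inputs": [repr(a) for a in args],
        }
        result = fn(*args, **kwargs)
        record["ended"] = time.time()
        record["output"] = repr(result)
        emit(record)
        return result
    return wrapper

@provenance
def resize_image(name, scale):
    # Unmodified service logic: the decorator adds provenance around it.
    return f"{name}@{scale}x"

resize_image("slide-001.png", 2)
```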

A secure cloud with minimal provider trust

Summary

Bolted is a new architecture for a bare metal cloud with the goal of providing security-sensitive customers of a cloud the same level of security and control that they can obtain in their own private data centers. It allows tenants to elastically allocate secure resources within a cloud while being protected from other previous, current, and future tenants of the cloud. Provisioning a new server for a tenant isolates that bare metal server, allowing it to communicate with the tenant's other servers only once its critical firmware and software have been attested to the tenant. Tenants, rather than the provider, control the tradeoffs between security, price, and performance. A prototype demonstrates scalable end-to-end security with small overhead compared to a less secure alternative.
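
The attestation-gated provisioning sequence the abstract describes (isolate, attest, then admit) can be rendered as a short sketch; this is a hypothetical illustration of that flow, not Bolted's implementation, and every name and measurement value below is invented:

```python
import hashlib

# Hypothetical measurements the tenant expects for each boot-chain layer.
EXPECTED = {
    "firmware": hashlib.sha256(b"vendor-fw-1.2").hexdigest(),
    "bootloader": hashlib.sha256(b"tenant-loader-0.9").hexdigest(),
}

def attest(reported):
    """Tenant-side check: every measured layer must match expectations."""
    return all(reported.get(layer) == digest for layer, digest in EXPECTED.items())

def provision(server):
    # 1. A freshly allocated bare metal server starts with no tenant access.
    server["network"] = "isolated"
    # 2. The tenant (not the provider) verifies the measured boot chain.
    if attest(server["measurements"]):
        # 3. Only attested servers may talk to the tenant's other servers.
        server["network"] = "tenant"
    return server

node = {"measurements": {
    "firmware": hashlib.sha256(b"vendor-fw-1.2").hexdigest(),
    "bootloader": hashlib.sha256(b"tenant-loader-0.9").hexdigest(),
}}
print(provision(node)["network"])  # -> "tenant"
```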

Lessons learned from a decade of providing interactive, on-demand high performance computing to scientists and engineers

Summary

For decades, the use of HPC systems was limited to those in the physical sciences who had mastered their domain in conjunction with a deep understanding of HPC architectures and algorithms. During these same decades, consumer computing device advances produced tablets and smartphones that allow millions of children to interactively develop and share code projects across the globe. As the HPC community faces the challenges associated with guiding researchers from disciplines using high productivity interactive tools to effective use of HPC systems, it seems appropriate to revisit the assumptions surrounding the necessary skills required for access to large computational systems. For over a decade, MIT Lincoln Laboratory has been supporting interactive, on-demand high performance computing by seamlessly integrating familiar high productivity tools to provide users with an increased number of design turns, rapid prototyping capability, and faster time to insight. In this paper, we discuss the lessons learned while supporting interactive, on-demand high performance computing from the perspectives of both the users and the team supporting the users and the system. Building on these lessons, we present an overview of current needs and the technical solutions we are building to lower the barrier to entry for new users from the humanities, social, and biological sciences.

On large-scale graph generation with validation of diverse triangle statistics at edges and vertices

Published in:
2018 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW, 21 May 2018.

Summary

Researchers developing implementations of distributed graph analytic algorithms require graph generators that yield graphs sharing the challenging characteristics of real-world graphs (small-world, scale-free, heavy-tailed degree distribution) with efficiently calculable ground-truth solutions to the desired output. Reproducibility for current generators used in benchmarking is somewhat lacking in this respect due to their randomness: the output of a desired graph analytic can only be compared to expected values and not exact ground truth. Nonstochastic Kronecker product graphs meet these design criteria for several graph analytics. Here we show that many flavors of triangle participation can be cheaply calculated while generating a Kronecker product graph.
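
The paper's contribution concerns per-edge and per-vertex triangle participation, but the key property — that triangle statistics of a nonstochastic Kronecker product graph can be computed cheaply from the seed, without materializing the full graph — is easy to see for the global count, since trace((B ⊗ ... ⊗ B)^3) = trace(B^3)^k. A minimal NumPy sketch (the triangle seed is an arbitrary choice for illustration):

```python
import numpy as np

# Seed adjacency matrix (a triangle) for a nonstochastic Kronecker graph.
B = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=np.int64)

k = 4  # number of Kronecker factors; the generated graph has 3**k vertices
A = B
for _ in range(k - 1):
    A = np.kron(A, B)

# Direct global triangle count on the generated graph: trace(A^3) / 6.
direct = np.trace(np.linalg.matrix_power(A, 3)) // 6

# Ground truth from the seed alone: trace(A^3) = trace(B^3)**k.
from_seed = np.trace(np.linalg.matrix_power(B, 3)) ** k // 6

print(A.shape[0], "vertices:", direct, from_seed)  # the two counts agree
```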

Next-generation embedded processors: an update

Published in:
GOMACTech Conf., 12-15 March 2018.

Summary

For mission assurance, Department of Defense (DoD) embedded systems should be designed to mitigate various aspects of cyber risks, while maintaining performance (size, weight, power, cost, and schedule). This paper reports our latest research effort in the development of a next-generation System-on-Chip (SoC) for DoD applications, which we first presented in GOMACTech 2014. This paper focuses on our ongoing work to enhance the mission assurance of its programmable processor. We will explain our updated processor architecture, justify the use of resources, and assess the processor's suitability for mission assurance.

Classifier performance estimation with unbalanced, partially labeled data

Published in:
Proc. Machine Learning Research, Vol. 88, 2018, pp. 4-16.

Summary

Class imbalance and lack of ground truth are two significant problems in modern machine learning research. These problems are especially pressing in operational contexts where the total number of data points is extremely large and the cost of obtaining labels is very high. In the face of these issues, accurate estimation of the performance of a detection or classification system is crucial to inform decisions based on the observations. This paper presents a framework for estimating performance of a binary classifier in such a context. We focus on the scenario where each set of measurements has been reduced to a score, and the operator only investigates data when the score exceeds a threshold. The operator is blind to the number of missed detections, so performance estimation targets two quantities: recall and the derivative of precision with respect to recall. Measuring with respect to error in these two metrics, simulations in this context demonstrate that labeling outliers not only outperforms random labeling, but often matches performance of an adaptive method that attempts to choose the optimal data for labeling. Application to real anomaly detection data confirms the utility of the approach, and suggests direction for future work.
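
The paper's estimators are not reproduced here, but the operational setting it addresses — labels exist only for data the operator chose to investigate, so precision at the threshold is directly estimable while recall is not — can be set up in a few lines (all distributions, rates, and thresholds below are synthetic assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated operational stream: rare positive class, score correlates with class.
n = 100_000
y = rng.random(n) < 0.01                      # ~1% true positives (unbalanced)
scores = rng.normal(0.0, 1.0, n) + 2.0 * y    # positives score higher on average

tau = 2.5                                     # operator's investigation threshold
investigated = scores >= tau                  # only these items ever get labels

# Precision at the threshold is directly estimable from the labeled subset.
precision_hat = y[investigated].mean()

# Recall is NOT directly observable: its denominator includes missed
# detections below tau, which the operator never labels.
true_recall = (y & investigated).sum() / y.sum()  # known here only by simulation

print(f"investigated: {investigated.sum()}, precision ~ {precision_hat:.3f}, "
      f"true recall (hidden from the operator): {true_recall:.3f}")
```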

Improving security at the system-call boundary in a type-safe operating system

Published in:
Thesis (M.E.)--Massachusetts Institute of Technology, 2018.

Summary

Historically, most approaches to operating systems security aim to either protect the kernel (e.g., the MMU) or protect user applications (e.g., W^X). However, little study has been done into protecting the boundary between these layers. We describe a vulnerability in Tock, a type-safe operating system, at the system-call boundary. We then introduce a technique for providing memory safety at the boundary between userland and the kernel in Tock. We demonstrate that this technique protects against the aforementioned vulnerability and a class of similar vulnerabilities, and we propose how it might be used to protect against similar vulnerabilities in other operating systems.
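
Tock itself is written in Rust, and the thesis's technique is specific to it; as a language-neutral sketch of the kind of bounds check a kernel must enforce before trusting a user-supplied buffer descriptor at the system-call boundary (all addresses and names below are hypothetical):

```python
def validate_user_buffer(ptr, length, region_start, region_len):
    """Reject any user-supplied (ptr, length) that escapes the app's region.

    Mirrors the check a kernel must make before constructing a slice over
    userland memory handed across the system-call boundary.
    """
    if length < 0 or ptr < region_start:
        return False
    end = ptr + length  # a real kernel must also guard this add against overflow
    return end <= region_start + region_len

# An app granted [0x2000, 0x3000) asks the kernel to operate on buffers.
print(validate_user_buffer(0x2000, 0x0800, 0x2000, 0x1000))  # True: in bounds
print(validate_user_buffer(0x2c00, 0x0800, 0x2000, 0x1000))  # False: runs past end
print(validate_user_buffer(0x1f00, 0x0100, 0x2000, 0x1000))  # False: before start
```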

Detecting pathogen exposure during the non-symptomatic incubation period using physiological data

Summary

Early pathogen exposure detection allows better patient care and faster implementation of public health measures (patient isolation, contact tracing). Existing exposure detection most frequently relies on overt clinical symptoms, namely fever, during the infectious prodromal period. We have developed a robust machine-learning-based method to better detect asymptomatic states during the incubation period using subtle, sub-clinical physiological markers. Starting with high-resolution physiological waveform data from non-human primate studies of viral (Ebola, Marburg, Lassa, and Nipah viruses) and bacterial (Y. pestis) exposure, we processed the data to reduce short-term variability and normalize diurnal variations, then provided these to a supervised random forest classification algorithm and a post-classifier declaration logic step to reduce false alarms. In most subjects, detection is achieved well before the onset of fever; subject cross-validation across exposure studies (varying viruses, exposure routes, animal species, and target dose) leads to a 51 h mean early detection time (at 0.93 area under the receiver-operating characteristic curve [AUCROC]). Evaluating the algorithm against entirely independent datasets for Lassa, Nipah, and Y. pestis exposures, unused in algorithm training and development, yields a mean 51 h early warning time (at AUCROC = 0.95). We discuss which physiological indicators are most informative for early detection and options for extending this capability to limited datasets such as those available from wearable, non-invasive, ECG-based sensors.
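
The pipeline shape the abstract describes (windowed features → random forest → declaration logic) can be outlined with standard tools. The sketch below uses synthetic features and a hypothetical k-of-n rule standing in for the paper's post-classifier declaration logic; the real system's features, thresholds, and logic are not given in this summary:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)

# Toy stand-in for windowed physiological features (the real pipeline uses
# smoothed, diurnally normalized waveform features; these are synthetic).
X_train = rng.normal(size=(2000, 8))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1] > 1.0).astype(int)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Per-window exposure scores over a held-out stream for one subject.
X_stream = rng.normal(size=(200, 8))
p = clf.predict_proba(X_stream)[:, 1]

def declare(scores, thresh=0.6, k=3, n=5):
    """Post-classifier logic: alarm only when k of the last n windows exceed
    the threshold, suppressing isolated false positives."""
    hits = scores >= thresh
    for t in range(n - 1, len(hits)):
        if hits[t - n + 1 : t + 1].sum() >= k:
            return t  # index of the first declared detection
    return None

print("first declaration at window:", declare(p))
```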