Current Projects

CHISEL: General-purpose software debloating

The size and complexity of modern software have been rapidly increasing, thereby causing security vulnerabilities as well as performance degradation. The goal of this project is to build an automatic program debloating system to reverse this trend. Unlike traditional approaches such as code optimization, our system aggressively removes undesired functionalities and customizes the original program according to a given high-level specification. The project is addressing a multitude of research problems such as how to specify the desired behavior, how to efficiently debloat programs in language-specific and language-independent ways, and what correctness guarantees to provide.
Chisel System Diagram
Papers: [CCS 2018]
Links: [Project Page] [Tool]

PETABLOX: AI-based program analysis framework

This project seeks to achieve a quantum leap in accuracy of static program analyses that target discovering deep semantic bugs and security vulnearibilities in large C/C++/JAVA programs. Even though a variety of techniques have significantly enhanced the performance of program analyzers, traditional systems has been limited in their accuracy due to the fundamental reason, i.e., undecidability. My research has tackled this limitation by incorporating external information such as user feedback. The key insight is to rank analysis alarms by performing inference on a conventional graphical model (e.g., a Bayesian network or a Markov logic network) that quantifies the incompleteness in individual steps of reasoning of the analysis. Petablox System Diagram
Papers: [PLDI 2019] [PLDI 2018]
Links: [Project Page] [Tool]

EUPHONY: A data-driven program synthesizer

A key challenge in program synthesis concerns how to efficiently search for the desired program in the space of possible programs. We propose a general approach to accelerate search-based program synthesis by biasing the search towards likely programs. Our approach targets a standard formulation, syntax-guided synthesis (SyGuS), by extending the grammar of possible programs with a probabilistic model dictating the likelihood of each program. We develop a weighted search algorithm to efficiently enumerate programs in order of their likelihood. We also propose a method based on transfer learning that enables to effectively learn a powerful model, called probabilistic higher order grammar, from known solutions in a domain. We have implemented our approach in a tool called Euphony and evaluate it on SyGuS benchmark problems from a variety of domains. Euphony System Diagram
Papers: [PLDI 2018]
Links: [Tool]

Former Projects