Proceedings of the Conference on Programming Language Design and Implementation Year 2009 Peer-reviewed


Computer Science · Research

Merlin: Specification Inference for Explicit Information Flow Problems

Benjamin Livshits Aditya V. Nori Sriram K. Rajamani Anindya Banerjee

Type: Peer-reviewed

Venue: PLDI

Publication year: 2009

§ Abstract

Problem

The last several years have seen a proliferation of static and runtime analysis tools for finding security violations that are caused by explicit information flow in programs. Much of this interest has been caused by the increase in the number of vulnerabilities such as cross-site scripting and SQL injection. In fact, these explicit information flow vulnerabilities commonly found in Web applications now outnumber vulnerabilities such as buffer overruns common in type-unsafe languages such as C and C++. Tools checking for these vulnerabilities require a specification to operate. In most cases the task of providing such a specification is delegated to the user. Moreover, the efficacy of these tools is only as good as the specification. Unfortunately, writing a comprehensive specification presents a major challenge: parts of the specification are easy to miss, leading to missed vulnerabilities; similarly, incorrect specifications may lead to false positives. This paper proposes MERLIN, a new approach for automatically inferring explicit information flow specifications from program code.

Approach

Such specifications greatly reduce manual labor, and enhance the quality of results, while using tools that check for security violations caused by explicit information flow. Beginning with a data propagation graph, which represents interprocedural flow of information in the program, MERLIN aims to automatically infer an information flow specification. MERLIN models information flow paths in the propagation graph using probabilistic constraints. A naïve modeling requires an exponential number of constraints, one per path in the propagation graph. For scalability, we approximate these path constraints using constraints on chosen triples of nodes, resulting in a cubic number of constraints. We characterize this approximation as a probabilistic abstraction, using the theory of probabilistic refinement developed by McIver and Morgan. We solve the resulting system of probabilistic constraints using factor graphs, which are a well-known structure for performing probabilistic inference.

Results

We experimentally validate the MERLIN approach by applying it to 10 large business-critical Web applications that have been analyzed with CAT.NET, a state-of-the-art static analysis tool for .NET. We find a total of 167 new confirmed specifications, which result in a total of 322 additional vulnerabilities across the 10 benchmarks. More accurate specifications also reduce the false positive rate: in our experiments, MERLIN-inferred specifications result in 13 false positives being removed; this constitutes a 15% reduction in the CAT.NET false positive rate on these 10 programs. The final false positive rate for CAT.NET after applying MERLIN in our experiments drops to under 1%.

Categories and Subject Descriptors: D.3.4 [Processors]: Compilers; D.4.6 [Operating Systems]: Security and Protection (information flow controls); D.2.4 [Software/Program Verification]: Statistical methods

General Terms: Languages, Security, Verification

PLDI’09, June 15–20, 2009, Dublin, Ireland. Copyright © 2009 ACM 978-1-60558-392-1/09/06...$5.00

Partially supported at Kansas State University by NSF grants ITR-0326577 and CNS-0627748 and by Microsoft Research, Redmond, by way of a sabbatical visit.
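The scalability argument in the abstract (one constraint per path is exponential, constraints on triples of nodes are cubic) can be illustrated with a small counting sketch. This is not MERLIN's implementation: the toy graph shape, the function names `diamond_chain`, `count_paths` and `count_triples`, and the rule for which triples are counted (every ordered triple in which each node is reachable from the previous one) are assumptions made for illustration. The abstract does not say how MERLIN chooses its triples.

```python
def diamond_chain(k):
    """Build a toy DAG of k chained diamonds: j0 -> {a0, b0} -> j1 -> ... -> jk.

    Hypothetical stand-in for a propagation graph; maps node -> successors.
    """
    graph = {}
    for i in range(k):
        graph[f"j{i}"] = [f"a{i}", f"b{i}"]
        graph[f"a{i}"] = [f"j{i + 1}"]
        graph[f"b{i}"] = [f"j{i + 1}"]
    graph[f"j{k}"] = []
    return graph


def count_paths(graph, src, dst):
    """Count distinct paths from src to dst in a DAG (memoized DFS)."""
    memo = {}

    def go(node):
        if node == dst:
            return 1
        if node not in memo:
            memo[node] = sum(go(nxt) for nxt in graph[node])
        return memo[node]

    return go(src)


def reachable(graph, start):
    """Return the set of nodes reachable from start by one or more edges."""
    seen, stack = set(), list(graph[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


def count_triples(graph):
    """Count ordered triples (u, v, w) with v reachable from u and w from v.

    Assumption: this is one possible reading of "triples of nodes"; the
    count can never exceed n**3 for a graph with n nodes.
    """
    reach = {node: reachable(graph, node) for node in graph}
    return sum(len(reach[v]) for u in graph for v in reach[u])


if __name__ == "__main__":
    for k in (5, 10, 20):
        g = diamond_chain(k)
        n = len(g)
        paths = count_paths(g, "j0", f"j{k}")
        triples = count_triples(g)
        print(f"k={k}: nodes={n}, paths={paths}, triples={triples}, n^3={n ** 3}")
```

In this toy graph each added diamond doubles the number of end-to-end paths, while the number of triples stays bounded by the cube of the node count, which is the trade-off the abstract describes.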

Cite this paper — BibTeX

@InProceedings{livshits09pldi,
  author = {Benjamin Livshits and Aditya V. Nori and Sriram K. Rajamani and Anindya Banerjee},
  title = {Merlin: Specification Inference for Explicit Information Flow Problems},
  booktitle = {Proceedings of the Conference on Programming Language Design and Implementation},
  month = jun,
  year = 2009,
  location = "Dublin, Ireland",
}