ABSTRACT
Program analysis has been increasingly used in software engineering tasks such as auditing programs for security vulnerabilities and finding errors in general. Such tools often require analyses much more sophisticated than those traditionally used in compiler optimizations. In particular, context-sensitive pointer alias information is a prerequisite for any sound and precise analysis that reasons about uses of heap objects in a program. Context-sensitive analysis is challenging because there are over 1014 contexts in a typical large program, even after recursive cycles are collapsed. Moreover, pointers cannot be resolved in general without analyzing the entire program.
This paper presents a new framework, based on the concept of deductive databases, for context-sensitive program analysis. In this framework, all program information is stored as relations; data access and analyses are written as Datalog queries. To handle the large number of contexts in a program, the database represents relations with binary decision diagrams (BDDs). The system we have developed, called bddbddb, automatically translates database queries into highly optimized BDD programs.
Our preliminary experiences suggest that a large class of analyses involving heap objects can be described succinctly in Datalog and implemented efficiently with BDDs. To make developing application-specific analyses easy for programmers, we have also created a language called PQL that makes a subset of Datalog queries more intuitive to define. We have used the language to find many security holes in Web applications.
- B. Arkin, S. Stender, and G. McGraw. Software penetration testing. Proceedings of the IEEE Symposium on Security and Privacy, 3(1):84--87, 2005.]] Google ScholarDigital Library
- D. Avots, M. Dalton, V. B. Livshits, and M. S. Lam. Using C pointer analysis to improve software security. In Proceedings of the 27th International Conference on Software Engineering, May 2005.]] Google ScholarDigital Library
- I. Balbin and K. Ramamohanarao. A generalization of the differential approach to recursive query optimization. Journal of Logic Programming, 4(3):259--262, Sept. 1987.]] Google ScholarDigital Library
- T. Ball and S. Rajamani. SLIC: A specification language for interface checking (of C). Technical Report MSR-TR-2001-21, Microsoft Research, Jan. 2002.]]Google Scholar
- T. Ball and S. K. Rajamani. A symbolic model checker for boolean programs. In Proceedings of the SPIN 2000 Workshop on Model Checking of Software, pages 113--130, Aug. 2000.]] Google ScholarDigital Library
- F. Bancilhon, D. Maier, Y. Sagiv, and J. D. Ullman. Magic sets and other strange ways to implement logic programs (extended abstract). In PODS '86: Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, pages 1--15, 1986.]] Google ScholarDigital Library
- M. Berndl, O. Lhoták, F. Qian, L. Hendren, and N. Umanee. Points-to analysis using BDDs. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 103--114, June 2003.]] Google ScholarDigital Library
- F. Besson and T. Jensen. Modular class analysis with Datalog. In Proceedings of the 10th International Static Analysis Symposium, pages 19--36, June 2003.]] Google ScholarDigital Library
- D. Beyer, A. Noack, and C. Lewerentz. Simple and efficient relational querying of software structures. In Proceedings of the 10th IEEE Working Conference on Reverse Engineering, pages 216--225, Nov. 2003.]] Google ScholarDigital Library
- R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Trans. Comput., 35(8):677--691, 1986.]] Google ScholarDigital Library
- B. Buege, R. Layman, and A. Taylor. Hacking Exposed: J2EE and Java: Developing Secure Applications with Java Technology. McGraw-Hill/Osborne, 2002.]]Google Scholar
- W. R. Bush, J. D. Pincus, and D. J. Sielaff. A static analyzer for finding dynamic programming errors. Software - Practice and Experience (SPE), 30:775--802, 2000.]] Google ScholarDigital Library
- S. Ceri, G. Gottlob, and L. Tanca. Logic programming and databases. Springer-Verlag New York, Inc., 1990.]] Google ScholarDigital Library
- A. Chandra and D. Harel. Horn clauses and generalizations. Journal of Logic Programming, 2(1):1--15, 1985.]]Google ScholarCross Ref
- W. Chen and D. S. Warren. Tabled evaluation with delaying for general logic programs. J. ACM, 43(1):20--74, 1996.]] Google ScholarDigital Library
- B. Chess and G. McGraw. Static analysis for security. Proceedings of the IEEE Symposium on Security and Privacy, 2(6):76--79, 2004.]] Google ScholarDigital Library
- H. Cohen. BuDDy - a Binary Decision Diagram package. http://buddy. sourceforge.net/.]]Google Scholar
- S. Cook. A Web developer's guide to cross-site scripting. http://www.giac.org/practical/GSEC/Steve_Cook_GSEC. pdf, 2003.]]Google Scholar
- M.-M. Corsini, K. Musumbu, A. Rauzy, and B. L. Charlier. Efficient bottom-up abstract interpretation of prolog by means of constraint solving over symbolic finite domains. In PLILP '93: Proceedings of the 5th International Symposium on Programming Language Implementation and Logic Programming, pages 75--91, 1993.]] Google ScholarDigital Library
- M. Das, B. Liblit, M. Fähndrich, and J. Rehof. Estimating the impact of scalable pointer analysis on optimization. In Proceedings of the 8th International Static Analysis Symposium, pages 260--278, July 2001.]] Google ScholarDigital Library
- M. Emami, R. Ghiya, and L. J. Hendren. Context-sensitive interprocedural points-to analysis in the presence of function pointers. In Proceedings of the ACM SIGPLAN'94 Conference on Programming Language Design and Implementation, pages 242--256, June 1994.]] Google ScholarDigital Library
- M. Fähndrich, J. Rehof, and M. Das. Scalable context-sensitive flow analysis using instantiation constraints. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 253--263, June 2000.]] Google ScholarDigital Library
- R. Greiner, A. J. Grove, and D. Roth. Learning cost-sensitive active classifiers. Artif. Intell., 139(2):137--174, 2002.]] Google ScholarDigital Library
- J. Grossman. WASC activities and U.S. Web application security trends. http://www.whitehatsec.com/presentations/WASC_WASF_1.02.pdf, 2004.]]Google Scholar
- O. Grumberg, S. Livne, and S. Markovitch. Learning to order BDD variables in verification. Journal of Artificial Intelligence Research, 18:83--116, 2003.]] Google ScholarDigital Library
- S. Hallem, B. Chelf, Y. Xie, and D. Engler. A system and language for building system-specific, static analyses. In Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation, pages 69--82, 2002.]] Google ScholarDigital Library
- N. Heintze and O. Tardieu. Ultra-fast aliasing analysis using CLA: A million lines of C code in a second. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation, pages 254--263, June 2001.]] Google ScholarDigital Library
- G. Hulme. New software may improve application security. http://www.informationweek.com/story/IWK20010209S0003, 2001.]]Google Scholar
- M. Iwaihara and Y. Inoue. Bottom-up evaluation of logic programs using binary decision diagrams. In ICDE '95: Proceedings of the Eleventh International Conference on Data Engineering, pages 467--474, 1995.]] Google ScholarDigital Library
- R. Johnson and D. Wagner. Finding user/kernel pointer bugs with type inference. In Proceedings of the 2004 Usenix Security Conference, pages 119--134, 2004.]] Google ScholarDigital Library
- A. Klein. Divide and conquer: HTTP response splitting, Web cache poisoning attacks, and related topics. http://www.sanctuminc.com/pdf/Whitepaper_HTTPResponse.pdf, 2004.]]Google Scholar
- W. Landi, B. G. Ryder, and S. Zhang. Interprocedural modification side effect analysis with pointer aliasing. In Proceedings of the ACM SIGPLAN'93 Conference on Programming Language Design and Implementation, pages 56--67, June 1993.]] Google ScholarDigital Library
- O. Lhoták and L. Hendren. Scaling Java points-to analysis using Spark. In Proceedings of the 12th International Conference on Compiler Construction, pages 153--169, April 2003.]] Google ScholarDigital Library
- O. Lhoták and L. Hendren. Jedd: a BDD-based relational extension of Java. In PLDI '04: Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation, pages 158--169, 2004.]] Google ScholarDigital Library
- Y. A. Liu and S. D. Stoller. From Datalog rules to efficient programs with time and space guarantees. In PPDP '03: Proceedings of the 5th ACM SIGPLAN International Conference on Principles and Practice of Declarative Programming, pages 172--183, 2003.]] Google ScholarDigital Library
- R. Manevich, G. Ramalingam, J. Field, D. Goyal, and M. Sagiv. Compactly representing first-order structures for static analysis. In Proceedings of the 9th International Static Analysis Symposium, pages 196--212, Sept. 2002.]] Google ScholarDigital Library
- J. F. Naughton and R. Ramakrishnan. Bottom-up evaluation of logic programs. In Computational Logic - Essays in Honor of Alan Robinson, pages 640--700, 1991.]]Google Scholar
- Open Web Application Security Project. The ten most critical Web application security vulnerabilities. http://umn.dl.sourceforge.net/sourceforge/owasp/OWASPTopTen2004.pdf, 2004.]]Google Scholar
- R. Ramakrishnan, D. Srivastava, and S. Sudarshan. Rule ordering in bottom-up fixpoint evaluation of logic programs. In Proceedings of the 16th International Conference on Very Large Data Bases, pages 359--371, 1990.]] Google ScholarDigital Library
- R. Ramakrishnan, D. Srivastava, S. Sudarshan, and P. Seshadri. Implementation of the CORAL deductive database system. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, pages 167--176, May 1993.]] Google ScholarDigital Library
- T. Reps. Demand interprocedural program analysis using logic databases. Applications of Logic Databases, pages 163--196, 1994.]]Google Scholar
- T. Reps. Solving demand versions of interprocedural analysis problems. In Proceedings of the Fifth International Conference on Compiler Construction, pages 389--403, Apr. 1994.]] Google ScholarDigital Library
- O. Ruwase and M. S. Lam. A practical dynamic buffer overflow detector. In Proceedings of the 11th Annual Network and Distributed System Security Symposium, pages 159--169, 2004.]]Google Scholar
- J. Scambray and M. Shema. Web Applications (Hacking Exposed). Addison-Wesley Professional, 2002.]] Google ScholarDigital Library
- U. Shankar, K. Talwar, J. S. Foster, and D. Wagner. Detecting format string vulnerabilities with type qualifiers. In Proceedings of the 2001 Usenix Security Conference, pages 201--220, 2001.]] Google ScholarDigital Library
- M. Sharir and A. Pnueli. Two approaches to interprocedural data-flow analyis. In S. Muchnick and N. D. Jones, editors, Program Flow Analysis: Theory and Applications, chapter 7, pages 189--234. Prentice-Hall, 1981.]]Google Scholar
- N. J. A. Sloane. The on-line encyclopedia of integer sequences: A000670. http://www.research.att.com/as/sequences, 2003.]]Google Scholar
- B. Steensgaard. Points-to analysis in almost linear time. In Proceedings of the 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 32--41, Jan. 1996.]] Google ScholarDigital Library
- H. Tamaki and T. Sato. Old resolution with tabulation. In Proceedings on Third international conference on logic programming, pages 84--98, 1986.]] Google ScholarDigital Library
- J. D. Ullman. Bottom-up beats top-down for Datalog. In PODS '89: Proceedings of the 8th ACM SIGACT-SIGMOD-SIGART symposium on Principles of Database Systems, pages 140--149, 1989.]] Google ScholarDigital Library
- J. D. Ullman. Principles of Database and Knowledge-Base Systems. Computer Science Press, Rockville, Md., volume II edition, 1989.]] Google ScholarDigital Library
- M. Vernon. Top five threats. ComputerWeekly.com (http://www.computerweekly.com/Article129980.htm), Apr. 2004.]]Google Scholar
- D. Wagner, J. Foster, E. Brewer, and A. Aiken. A first step towards automated detection of buffer overrun vulnerabilities. In Proceedings of Network and Distributed Systems Security Symposium, pages 3--17, 2000.]]Google Scholar
- WebCohort, Inc. Only 10% of Web applications are secured against common hacking techniques. http://www.imperva.com/company/news/2004-feb-02.html, 2004.]]Google Scholar
- J. Whaley and M. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation, pages 131--144, 2004.]] Google ScholarDigital Library
- J. Whaley and M. S. Lam. An efficient inclusion-based points-to analysis for strictly-typed languages. In Proceedings of the 9th International Static Analysis Symposium, pages 180--195, Sept. 2002.]] Google ScholarDigital Library
- J. Whaley and M. Rinard. Compositional pointer and escape analysis for Java programs. In Conference of Object Oriented Programming: Systems, Languages, and Applications, pages 187--206, Nov. 1999.]] Google ScholarDigital Library
- R. P. Wilson and M. S. Lam. Efficient context-sensitive pointer analysis for C programs. In Proceedings of the ACM SIGPLAN'95 Conference on Programming Language Design and Implementation, pages 1--12, June 1995.]] Google ScholarDigital Library
- T. Yavuz-Kahveci and T. Bultan. Automated verification of concurrent linked lists with counters. In Proceedings of the 9th International Static Analysis Symposium, pages 69--84, Sept. 2002.]] Google ScholarDigital Library
- J. Zhu. Symbolic pointer analysis. In Proceedings of the International Conference in Computer-Aided Design, pages 150--157, Nov. 2002.]] Google ScholarDigital Library
- J. Zhu and S. Calman. Symbolic pointer analysis revisited. In Proceedings of the ACM SIGPLAN 2004 Conference on Programming Language Design and Implementation, pages 145--157, 2004. Research Session 1: Querying XML & Semistructured Data / Query Languages]] Google ScholarDigital Library
Recommendations
Demand-driven context-sensitive alias analysis for Java
ISSTA '11: Proceedings of the 2011 International Symposium on Software Testing and AnalysisSoftware tools for program understanding, transformation, verification, and testing often require an efficient yet highly-precise alias analysis. Typically this is done by computing points-to information, from which alias queries can be answered. This ...
Refinement-based context-sensitive points-to analysis for Java
Proceedings of the 2006 PLDI ConferenceWe present a scalable and precise context-sensitive points-to analysis with three key properties: (1) filtering out of unrealizable paths, (2) a context-sensitive heap abstraction, and (3) a context-sensitive call graph. Previous work [21] has shown ...
Precise and scalable context-sensitive pointer analysis via value flow graph
ISMM '13: Proceedings of the 2013 international symposium on memory managementIn this paper, we propose a novel method for context-sensitive pointer analysis using the value flow graph (VFG) formulation. We achieve context-sensitivity by simultaneously applying function cloning and computing context-free language reachability (...
Comments