ABSTRACT
A software system interacts with third-party libraries through various APIs. Using these library APIs often requires following certain usage patterns. Furthermore, ordering rules (specifications) exist between APIs, and these rules govern the secure and robust operation of the system using these APIs. However, these patterns and rules may not be well documented by the API developers. Previous approaches mine frequent association rules, itemsets, or subsequences that capture API call patterns shared by API client code. These frequent API patterns cannot completely capture some useful orderings shared by APIs, especially when multiple APIs are involved across different procedures. In this paper, we present a framework to automatically extract usage scenarios among user-specified APIs as partial orders, directly from the source code (API client code). We adapt a model checker to generate interprocedural control-flow-sensitive static traces related to the APIs of interest. Different API usage scenarios are extracted from the static traces by our scenario extraction algorithm and fed to a miner. The miner summarizes different usage scenarios as compact partial orders. Specifications are extracted from the frequent partial orders using our specification extraction algorithm. Our experience of applying the framework on 72 X11 clients with 200K LOC in total has shown that the extracted API partial orders are useful in assisting effective API reuse and checking.
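To make the idea of summarizing usage scenarios as a compact partial order concrete, the following is a minimal Python sketch, not the miner used in the paper. It assumes every scenario contains the same APIs exactly once, keeps only the orderings that hold in all scenarios, and then removes edges implied by transitivity. The function names `common_partial_order` and `transitive_reduction` and the two example scenarios are hypothetical and chosen only for illustration; the Xlib function names are real, but the traces are not taken from the paper's data.

```python
def common_partial_order(scenarios):
    """Return the (a, b) pairs such that a precedes b in every scenario.

    Simplifying assumption: all scenarios contain the same APIs,
    each exactly once.
    """
    if not scenarios:
        return set()
    apis = set(scenarios[0])
    order = None
    for scenario in scenarios:
        if set(scenario) != apis or len(scenario) != len(apis):
            raise ValueError("each scenario must list the same APIs exactly once")
        position = {api: i for i, api in enumerate(scenario)}
        pairs = {(a, b) for a in apis for b in apis if position[a] < position[b]}
        # Keep only the orderings that every scenario seen so far agrees on.
        order = pairs if order is None else order & pairs
    return order


def transitive_reduction(order):
    """Drop edges implied by transitivity, leaving a compact partial order."""
    nodes = {api for edge in order for api in edge}
    return {
        (a, b)
        for (a, b) in order
        if not any((a, c) in order and (c, b) in order for c in nodes)
    }


# Hypothetical usage scenarios (illustrative only, not from the paper).
scenarios = [
    ["XOpenDisplay", "XCreateWindow", "XCreateGC", "XCloseDisplay"],
    ["XOpenDisplay", "XCreateGC", "XCreateWindow", "XCloseDisplay"],
]

edges = transitive_reduction(common_partial_order(scenarios))
for before, after in sorted(edges):
    print(f"{before} -> {after}")
```

In this sketch the two scenarios disagree on the relative order of `XCreateWindow` and `XCreateGC`, so no edge is kept between them, while both remain ordered after `XOpenDisplay` and before `XCloseDisplay`. A frequency threshold, as used by frequent partial order miners, is deliberately left out to keep the example short.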
Index Terms
- Mining API patterns as partial orders from source code: from usage scenarios to specifications