Abstract
The problem of confidential information leaks can be addressed with automatic tools that take a set of annotated inputs (the sources) and track their flow to public sinks. Unfortunately, manually annotating the code with labels that specify the secret sources is one of the main obstacles to the adoption of such trackers.
In this work, we present an approach for the automatic generation of labels for confidential data in Java programs. Our solution is based on a graph-based representation of Java methods: starting from a minimal set of known API calls, it propagates the labels both intra- and inter-procedurally until a fix-point is reached.
In our evaluation, we encode our synthesis and propagation algorithm in Datalog and assess the accuracy of our technique on seven previously annotated internal code bases, where we can reconstruct 75% of the pre-existing manual annotations. In addition to this single data point, we also perform an assessment using samples from the SecuriBench-micro benchmark, and we provide additional sample programs that demonstrate the capabilities and the limitations of our approach.
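As a rough illustration of the propagation step described above, the following Java sketch runs a worklist algorithm over a small data-flow graph until no new labels can be added. It is not the Datalog encoding used in the evaluation, only a minimal sketch under stated assumptions: the class `LabelPropagation`, the node names (`SecretStore.getPassword#ret`, `Login.run:pw`, `Logger.log#param0`, and so on), and the edge set are all hypothetical and chosen purely for illustration. Edges inside one method stand for intra-procedural flows, and the edge from a call argument to a callee parameter stands for an inter-procedural flow.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: label propagation to a fix-point over a data-flow graph.
final class LabelPropagation {

    /** Adds a directed edge meaning "data flows from `from` to `to`". */
    static void addEdge(Map<String, List<String>> flowsTo, String from, String to) {
        flowsTo.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    /**
     * Returns every node reachable from the seed nodes. Each node enters the
     * worklist at most once, so the loop stops when the fix-point is reached.
     */
    static Set<String> propagate(Map<String, List<String>> flowsTo, Set<String> seeds) {
        Set<String> confidential = new HashSet<>(seeds);
        Deque<String> worklist = new ArrayDeque<>(seeds);
        while (!worklist.isEmpty()) {
            String node = worklist.pop();
            for (String succ : flowsTo.getOrDefault(node, List.of())) {
                // add() returns true only for newly labeled nodes.
                if (confidential.add(succ)) {
                    worklist.push(succ);
                }
            }
        }
        return confidential;
    }

    /** Builds a tiny, made-up graph with two methods: Login.run and Logger.log. */
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        // Intra-procedural flows inside the hypothetical method Login.run.
        addEdge(g, "SecretStore.getPassword#ret", "Login.run:pw");
        addEdge(g, "Login.run:pw", "Login.run:msg");
        addEdge(g, "Login.run:user", "Login.run:greeting");
        // Inter-procedural flow: call argument to callee parameter.
        addEdge(g, "Login.run:msg", "Logger.log#param0");
        // Intra-procedural flow inside the hypothetical method Logger.log.
        addEdge(g, "Logger.log#param0", "Logger.log:line");
        return g;
    }

    public static void main(String[] args) {
        // Seed: the return value of one API call assumed to yield a secret.
        Set<String> seeds = Set.of("SecretStore.getPassword#ret");
        Set<String> labeled = propagate(exampleGraph(), seeds);
        // Sorted only to make the printed order stable.
        new TreeSet<>(labeled).forEach(System.out::println);
    }
}
```

In this made-up graph, the label on the seed reaches the parameter of `Logger.log` through the call edge, while `Login.run:user` and `Login.run:greeting` stay unlabeled because nothing confidential flows into them. A real analysis would also need a program representation to extract these edges from, which this sketch leaves out.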
Acknowledgments
This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Bastys, I., Bolignano, P., Raimondi, F., Schoepe, D. (2022). Automatic Annotation of Confidential Data in Java Code. In: Aïmeur, E., Laurent, M., Yaich, R., Dupont, B., Garcia-Alfaro, J. (eds) Foundations and Practice of Security. FPS 2021. Lecture Notes in Computer Science, vol 13291. Springer, Cham. https://doi.org/10.1007/978-3-031-08147-7_10
DOI: https://doi.org/10.1007/978-3-031-08147-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08146-0
Online ISBN: 978-3-031-08147-7
eBook Packages: Computer Science (R0)