ABSTRACT
Cloud storage providers such as Dropbox and Google Drive rely heavily on data deduplication to save storage costs by storing only one copy of each uploaded file. Although recent studies report that whole-file deduplication can achieve up to 50% storage reduction, users do not directly benefit from these savings, as there is no transparent relation between effective storage costs and the prices offered to the users. In this paper, we propose a novel storage solution, ClearBox, which allows a storage service provider to transparently attest to its customers the deduplication patterns of the (encrypted) data that it is storing. By doing so, ClearBox enables cloud users to verify the effective storage space that their data occupies in the cloud, and consequently to check whether they qualify for benefits such as price reductions. ClearBox is secure against malicious users and a rational storage provider, and ensures that files can only be accessed by their legitimate owners. We evaluate a prototype implementation of ClearBox using both Amazon S3 and Dropbox as back-end cloud storage. Our findings show that our solution works with the APIs provided by existing service providers without any modifications and achieves performance comparable to existing solutions.
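To make the notion of "effective storage space" concrete, the following is a minimal sketch of whole-file deduplication over plaintext files. It is not the ClearBox protocol: it omits encryption and any attestation by the provider. The class name `DedupStore`, its methods, and the rule that splits a file's size equally among its owners are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict


class DedupStore:
    """Toy whole-file deduplicating store (illustrative only, not ClearBox)."""

    def __init__(self):
        self.blobs = {}                  # file ID (SHA-256 hex digest) -> file bytes
        self.owners = defaultdict(set)   # file ID -> set of users who uploaded it

    def upload(self, user, data):
        # Identical contents hash to the same ID, so only one copy is kept.
        file_id = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(file_id, data)
        self.owners[file_id].add(user)
        return file_id

    def physical_bytes(self):
        # Space the provider actually consumes after deduplication.
        return sum(len(blob) for blob in self.blobs.values())

    def effective_bytes(self, user):
        # Assumed sharing rule: each file's size is split equally among owners.
        return sum(
            len(self.blobs[file_id]) / len(users)
            for file_id, users in self.owners.items()
            if user in users
        )


store = DedupStore()
store.upload("alice", b"x" * 1000)
store.upload("bob", b"x" * 1000)   # duplicate of Alice's file, stored once
store.upload("bob", b"y" * 500)

print(store.physical_bytes())          # 1500
print(store.effective_bytes("alice"))  # 500.0
print(store.effective_bytes("bob"))    # 1000.0
```

In this toy setting the two users uploaded 2500 bytes in total, but the provider stores only 1500. A user who could verify how many owners share each file could check the per-user figures above, which is the kind of transparency the abstract argues is missing today.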