Elsevier

Computers & Security

Volume 91, April 2020, 101730

Game theoretical study on client-controlled cloud data deduplication

https://doi.org/10.1016/j.cose.2020.101730

Abstract

Data deduplication eliminates redundant data and is receiving increasing attention in cloud storage services due to the proliferation of big data and the demand for efficient storage. Data deduplication not only requires sound technical design, but also involves multiple parties with conflicting interests. Thus, how to design incentive mechanisms and study their acceptance by all relevant stakeholders remain important open issues. In this paper, we detail the payoff structure of a client-controlled deduplication scheme and analyze the feasibility of unified discount and individualized discount under this structure. Through a game-theoretical study, we further propose a privacy-preserving individualized discount-based incentive mechanism, with detailed implementation algorithms for choosing strategies, setting parameters and granting discounts. After a theoretical analysis of the requirements of individual rationality, incentive compatibility, and profitability, we conduct extensive experiments based on a real-world dataset to demonstrate the effectiveness of the proposed incentive mechanism.

Introduction

Storing data in the cloud saves local storage space and reduces data management and operation costs. A data user can easily access its data in the cloud at any time and from anywhere. Significant efforts have been made in recent years to securely and efficiently outsource data to the cloud, ranging from protecting data security and privacy (Chu, Chow, Tzeng, Zhou, Deng, 2014, Wan, Liu, Deng, 2012, Wang, Wang, Ren, Lou, 2010a, Wang, Wang, Ren, Lou, Li, 2011, Wei, Zhu, Cao, Dong, Jia, Chen, Vasilakos, 2014), reducing copyright risks (Hwang, Kulkareni, Hu, 2009, Hwang, Li, 2010), and controlling data access (Ruj, Stojmenovicn, Nayak, 2014, Wang, Liu, Wu, 2010b, Yan, Li, Wang, Vasilakos, 2017a, Yang, Jia, 2014, Zhou, Varadharajan, Hitchens, 2013), to encrypted data deduplication (Harnik, Pinkas, Shulman-Peleg, 2010, Li, Chen, Li, Li, Lee, Lou, 2014, Li, Li, Chen, Lee, Lou, 2015, Liu, Asokan, Pinkas, 2015, Xu, Chang, Zhou, 2009, Yan, Ding, Yu, Zhu, Deng, 2016a, Yan, Wang, Li, Vasilakos, 2016b, Yan, Zhang, Ding, Zheng, 2017b).

Cloud data deduplication greatly benefits Cloud Service Providers (CSPs). One data file might be uploaded by many users, or by a single user multiple times, either intentionally or unintentionally. A CSP with deduplication stores only one copy of every data file, in either plaintext or encrypted form, and provides all of its users a way to access the data based on certain access control policies. Hence, data deduplication can greatly reduce the storage overhead of CSPs and allow them to pass the cost savings on to their users. A CSP with lower service fees can attract more data users and thus possibly gain more profits.

There are many schemes in the literature on deduplication over encrypted data in the cloud that aim to achieve both data security and economical data storage. One prominent method is the client-side deduplication scheme, in which a data user only needs to upload the real data if the data has never been stored before. It saves uplink bandwidth and CSP operation costs, and has been widely studied. Based on the eligibility verification and access control policies, we can classify client-side deduplication schemes into client-controlled client-side deduplication (C-DEDU) and server-controlled client-side deduplication (S-DEDU).
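The client-side flow described above (check first, upload only when the data is new) can be sketched as follows. This is a minimal illustration, not the scheme of any cited paper: the class `DedupStore`, its methods, and the use of a SHA-256 digest as the duplicate check are assumptions made for the example, and it omits the encryption and eligibility verification that the schemes discussed here rely on.

```python
import hashlib


class DedupStore:
    """Toy CSP store that keeps one copy per distinct content hash."""

    def __init__(self):
        self.blobs = {}   # content hash -> stored bytes (single copy)
        self.users = {}   # content hash -> set of user ids with access

    def has(self, digest):
        return digest in self.blobs

    def upload(self, user, data):
        """Send the hash first; send the payload only if the data is new.

        Returns the number of payload bytes that crossed the uplink.
        """
        digest = hashlib.sha256(data).hexdigest()
        sent = 0
        if not self.has(digest):
            self.blobs[digest] = data
            sent = len(data)
        # Every uploader is recorded as a user of the single stored copy.
        self.users.setdefault(digest, set()).add(user)
        return sent


store = DedupStore()
first = store.upload("alice", b"quarterly report")
second = store.upload("bob", b"quarterly report")
print(first, second, len(store.blobs))
```

In this sketch the second upload transfers no payload and the store still holds a single copy, which is the bandwidth and storage saving the paragraph refers to.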

However, almost all existing deduplication schemes (Harnik, Pinkas, Shulman-Peleg, 2010, Li, Chen, Li, Li, Lee, Lou, 2014, Li, Li, Chen, Lee, Lou, 2015, Liu, Asokan, Pinkas, 2015, Xu, Chang, Zhou, 2009, Yan, Ding, Yu, Zhu, Deng, 2016a, Yan, Wang, Li, Vasilakos, 2016b, Yan, Zhang, Ding, Zheng, 2017b) are designed and analyzed only from technological perspectives. Few efforts in the literature were made to investigate the acceptability of deduplication schemes by all stakeholders (e.g., data users and CSPs). Only when a scheme brings tangible profits to its stakeholders can it be adopted in practice. In this paper, we focus on how to promote the acceptance of C-DEDU while the practical deployment problems in S-DEDU are studied in another line of our work.

Three kinds of stakeholders are involved in C-DEDU: data owners, data holders and CSPs. A data owner is the first user to upload a given data item, and users who upload the same data later are called data holders. When a data holder requests to store this data, the data owner checks the holder’s eligibility and grants the access right only to eligible holders. The CSP is the entity that provides the cloud storage service.

Data owners enjoy privileges because they hold the access control rights; however, they need to stay online to exercise this control. Various techniques have been proposed to mitigate the online requirement of data owners. In Yan et al. (2016a), a scheme was proposed to allow data owners to hand over the right of controlling data deduplication to a server. Harnik et al. (2010) introduced a simple mechanism that turns off deduplication artificially to ensure privacy preservation. A flexible deduplication scheme, which adaptively selects the stakeholders that control data deduplication according to the data protection policies of data owners, was introduced in Yan et al. (2017b). However, these methods either change C-DEDU into S-DEDU to avoid the online requirement or are not intelligent. Because the combination of client-side access control and deduplication introduces a service delay for data holders when the owner is temporarily offline, some time-sensitive data holders may be reluctant to adopt C-DEDU.
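The roles and the service delay described above can be made concrete with a small sketch. Everything here is hypothetical and chosen for illustration: the class `CDeduRecord`, its method names, and the boolean `eligible` flag, which stands in for whatever eligibility verification a real C-DEDU scheme performs.

```python
class CDeduRecord:
    """Toy access record for one data item under C-DEDU."""

    def __init__(self, owner):
        self.owner = owner         # first uploader of the data
        self.owner_online = True
        self.holders = set()       # later uploaders granted access
        self.pending = []          # requests waiting for the owner

    def request_access(self, user, eligible):
        """A later uploader asks to be registered as a data holder."""
        if not self.owner_online:
            # Only the owner can decide, so the request has to wait.
            self.pending.append((user, eligible))
            return "pending"
        return self._decide(user, eligible)

    def _decide(self, user, eligible):
        if eligible:
            self.holders.add(user)
            return "granted"
        return "denied"

    def owner_comes_online(self):
        """The owner returns and handles queued requests in arrival order."""
        self.owner_online = True
        results = [self._decide(user, ok) for user, ok in self.pending]
        self.pending.clear()
        return results


record = CDeduRecord("alice")
r1 = record.request_access("bob", eligible=True)
record.owner_online = False
r2 = record.request_access("carol", eligible=True)
r3 = record.owner_comes_online()
print(r1, r2, r3)
```

The `pending` queue is where the delay for time-sensitive data holders shows up: a request issued while the owner is offline cannot be resolved until the owner returns.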

The CSP is the direct beneficiary of deduplication schemes, since they are primarily designed to save the CSP’s storage cost. Therefore, it is essential for a CSP to motivate data owners and data holders to participate. Even though researchers have mentioned the necessity of incentives, they either failed to propose a concrete mechanism (Liu et al., 2015), or proposed mechanisms that have privacy defects (Armknecht et al., 2015) or are proved not to be incentive compatible (Liang, Yan, Chen, Yang, Lou, Hou, 2019, Miao, Jiang, You, 2015). Moreover, the complex interdependence among the various stakeholders in deduplication schemes increases the difficulty of weighing the profits from each stakeholder’s perspective. Game theory, as a mathematical model of conflict and cooperation between rational players, is naturally suited to this problem. It helps to analyze how data owners and data holders choose strategies based on their utility functions. Unfortunately, to our knowledge, no systematic economic model for C-DEDU has been proposed until now.

In this paper, we first specify the employed economic model and introduce the detailed utility functions of data owners, data holders and CSPs. Then we apply game theory to analyze how data owners and data holders react to different discount-charging models of CSPs and discuss the existence of Nash Equilibrium. To overcome free-riding behaviors and privacy issues, we propose a privacy-preserving incentive mechanism that can motivate rational players (i.e., data owners and data holders) to be honest. Furthermore, we conduct experiments to verify our theoretical analysis and illustrate the effectiveness of the incentive mechanism with a real-world dataset. Specifically, the contributions of this paper can be summarized as follows:

  1. We systematically propose an economic model for a cloud storage system with C-DEDU. The utility function of each stakeholder is discussed in detail as well.

  2. We analyze the advantages and disadvantages of two incentive mechanisms (i.e., unified discount and individualized discount) with a game model between a data owner and a data holder. We find that the individualized discount is more desirable due to the existence of Nash Equilibrium, although it may intrude on privacy.

  3. We further present a new privacy-preserving incentive mechanism that is incentive compatible and motivates rational players (i.e., data owners and data holders) to be honest, thus eliminating the disadvantage of individualized discount.

  4. We provide a Parameter-Setting Algorithm, a Discount-Granting Algorithm, and Strategy-Choosing Algorithms to show how our proposed incentive mechanism can be implemented in practice.

  5. We discuss how the proposed incentive mechanism is compatible with existing encrypted data deduplication schemes, as well as its scalability and its robustness with regard to modification attacks.
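To make the notion of Nash Equilibrium used in the contributions above concrete, here is a minimal sketch that enumerates the pure-strategy equilibria of a two-player game. The strategy names and the payoff numbers are hypothetical and are not taken from the paper; only the definition (no player gains by deviating unilaterally) is standard.

```python
from itertools import product


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy Nash Equilibria of a two-player game.

    payoffs maps (row_strategy, col_strategy) -> (row_utility, col_utility).
    """
    rows = sorted({r for r, _ in payoffs})
    cols = sorted({c for _, c in payoffs})
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = payoffs[(r, c)]
        # Neither player can do strictly better by deviating alone.
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in rows)
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Hypothetical utilities, chosen only for illustration.
# Row player: data owner; column player: data holder.
game = {
    ("cooperate", "join"): (3, 2),
    ("cooperate", "abstain"): (2, 1),
    ("refuse", "join"): (2, 0),
    ("refuse", "abstain"): (1, 1),
}
print(pure_nash_equilibria(game))
```

With these made-up payoffs the only equilibrium is `("cooperate", "join")`. A game can also have no pure-strategy equilibrium at all, which is why the existence of an equilibrium under each discount model is worth establishing.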

The rest of the paper is organized as follows. Background and related works are briefly reviewed in Section 2. Section 3 overviews the cloud storage system with C-DEDU, and details its deployment problems, along with clearly specified game-model assumptions. An economic model for the cloud storage system with C-DEDU is proposed in Section 4 based on the assumptions. In Section 5, we perform game-theoretical analysis on two discount methods and propose a privacy-preserving individualized discount-based incentive mechanism, which is able to achieve individual rationality, incentive compatibility and profitability. In Section 6, we evaluate the effectiveness of our proposed incentive mechanism in promoting the acceptance of C-DEDU through a set of experiments based on a real-world dataset and further discuss its compatibility, scalability and robustness. Finally, concluding remarks are drawn in the last section.

Section snippets

Game theory

Game theory is a branch of applied mathematics that has developed considerably in the field of economics. It has been widely applied in many fields, such as economics, psychology, and even biology. It can flexibly capture the interactions between different participants. It studies how a rational entity chooses its strategy at each step based on its preferences and its known information about the others. Researchers in the field of security and privacy (Do, Tran, Hong, Kamhoua, Kwiat,

System model

There is a CSP $c_k$ with $M_k$ unique data to be stored and $N_k$ data users. Let $D_k$ and $U_k$ represent the data set and user set. For each data $d_k^m \in D_k$, the number of its data users is denoted as $n_k^m$, then $\sum_{m=1}^{M_k} n_k^m = N_k$. Then $D_k = \{d_k^m \mid m = 1, 2, \ldots, M_k\}$ and $U_k = \{u_k^n \mid n = 1, 2, \ldots, N_k\} = \bigcup_m U_{k,m}$, where $U_{k,m} = \{u_{k,m}^s \mid s = 1, 2, \ldots, n_k^m\}$.
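A toy instance may help in reading this notation. The sketch below builds one hypothetical CSP with three unique data items and checks that the per-item user counts sum to the total number of data users. The identifiers (`users_per_data`, `"d1"`, `"u1"`, and so on) are made up for illustration, and the final line reflects an assumption that, without deduplication, each data user would cause one stored copy.

```python
# Hypothetical toy instance of one CSP; all names are illustrative only.
# Each key is a unique data item; each value lists the users of that
# item (the per-item user set U_{k,m}).
users_per_data = {
    "d1": ["u1", "u2", "u3"],
    "d2": ["u4"],
    "d3": ["u5", "u6"],
}

M_k = len(users_per_data)                                    # unique data items
n_k = {m: len(users) for m, users in users_per_data.items()}  # users per item
N_k = sum(n_k.values())                                      # total data users

# Assumption: without deduplication the CSP stores N_k copies, with it M_k.
copies_saved = N_k - M_k
print(M_k, N_k, copies_saved)
```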

When $c_k$ adopts C-DEDU, $U_k$ can be divided into a data owner set $O_k$ and a data holder set $H_k$. $O_k = \{o_k^i \mid i = 1, 2, \ldots, M_k\} = \{u_{k,m}^1 \mid m = 1, 2, \ldots, M_k\}$ is composed of the first data uploader of

Economic model

Before starting to solve the above problems from the view of economics, we summarize the notations used in this paper in Table 1 for clear presentation and easy reference.

In this section, we build up an economic model to calculate the utilities of data owners, data holders and CSPs, respectively. Since CSPs hold absolute control over designing the charging model, we mainly consider the game between the data owner and data holders. We present the utilities of all stakeholders under different

Discount-based incentive mechanism

There are two kinds of discount-based incentive mechanisms for CSPs to choose from. In the first, a CSP grants the same discount to all of its subscribers based on its saved storage space. In the second, the CSP sets the discount value for each individual data item based on how many times this data has been deduplicated. For simplicity, we call these two kinds of mechanisms unified discount and individualized discount, respectively.
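The difference between the two mechanisms can be sketched in a few lines. The linear discount rules, the parameter names (`rate_per_saved_copy`, `rate_per_dedup`, `cap`) and their values are assumptions made purely for illustration; the paper's actual discount functions are not reproduced here.

```python
def unified_discount(dedup_counts, rate_per_saved_copy=0.02, cap=0.5):
    """Same discount for every subscriber, based on total saved copies.

    dedup_counts maps data id -> number of users storing that data.
    The linear rule and its parameters are hypothetical.
    """
    saved = sum(n - 1 for n in dedup_counts.values())
    rate = min(cap, rate_per_saved_copy * saved)
    return {data: rate for data in dedup_counts}


def individualized_discount(dedup_counts, rate_per_dedup=0.05, cap=0.5):
    """Per-data discount, based on how often that data was deduplicated."""
    return {
        data: min(cap, rate_per_dedup * (n - 1))
        for data, n in dedup_counts.items()
    }


counts = {"popular_file": 5, "private_file": 1}
unified = unified_discount(counts)
individual = individualized_discount(counts)
print(unified)
print(individual)
```

Under the unified rule in this sketch, the user of `private_file` receives the same discount as everyone else even though that data saved no storage, which is one way to picture free-riding. Under the individualized rule that user's discount is zero, but the discount value now reveals how widely each data item is shared.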

In this section, we first present the requirements for an

Evaluation

We conducted a set of experiments to analyze the effectiveness of our incentive mechanism in promoting the acceptance of the C-DEDU scheme by all system players. In this section, we also discuss how to make the incentive mechanism compatible with existing deduplication schemes and its scalability and robustness with regard to modification attacks.

Conclusion

In this paper, we detailed an economic model for cloud storage systems with C-DEDU. A game-theoretical approach is employed to analyze the feasibility of two discount-based incentive mechanisms: unified discount and individualized discount. The unified discount ensures data privacy but introduces free-riding behaviors, which are difficult to eliminate without changing the design of C-DEDU. The individualized discount can suppress free-riding behaviors in some cases; however, data holders can

Declaration of Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The work is supported in part by the National Natural Science Foundation of China under Grants 61672410 and 61802293, the National Postdoctoral Program for Innovative Talents under grant BX20180238, the Project funded by China Postdoctoral Science Foundation under grant 2018M633461, the Academy of Finland under Grants 308087 and 314203, the Key Lab of Information Network Security, Ministry of Public Security under grant No. C18614, the open grant of the Tactical Data Link Lab of the 20th

Xueqin Liang received the B.Sc. degree in Applied Mathematics from Anhui University, Anhui, China, in 2015. She is currently working toward her Ph.D. degree at Xidian University, Xi’an, China, and Aalto University, Finland. Her research interests are in game theory based security solutions, cloud computing security and trust, and Blockchain.

References (44)

  • C.K. Chu et al., “Key-aggregate cryptosystem for scalable data sharing in cloud storage,” IEEE Trans. Parallel Distrib. Syst. (2014)
  • C.Y. Do et al., “Game theory for cyber security and privacy,” ACM Computing Surveys (CSUR) (2017)
  • Gao L., Yan Z., Yang L.Y. “Game theoretical analysis on acceptance of a cloud data access control system based on...
  • D. Harnik et al., “Side channels in cloud services: deduplication in cloud storage,” IEEE S&P (2010)
  • O. Heen et al., “Improving the resistance to side-channel attacks on cloud storage services,” NTMS’12 (2012)
  • K. Hwang et al., “Cloud security with virtualized defense and reputation-based trust management,” IEEE DASC’09 (2009)
  • K. Hwang et al., “Trusted cloud computing with secure resources and data coloring,” IEEE Internet Comput. (2010)
  • J. Li et al., “Secure deduplication with efficient and reliable convergent key management,” IEEE Trans. Parallel Distrib. Syst. (2014)
  • J. Li et al., “A hybrid cloud approach for secure authorized deduplication,” IEEE Trans. Parallel Distrib. Syst. (2015)
  • X. Liang et al., “A survey on game theoretical methods in human-machine networks,” Future Gener. Comput. Syst. (2019)
  • X. Liang et al., “Game theoretical analysis on encrypted cloud data deduplication,” IEEE Trans. Ind. Inform. (2019)
  • J. Liu et al., “Secure deduplication of encrypted data without additional independent servers,” CCS’15 (2015)

Zheng Yan received the B.Eng. degree in electrical engineering and the M.Eng. degree in computer science and engineering from the Xi’an Jiaotong University, Xi’an, China in 1994 and 1997, respectively, the second M.Eng. degree in information security from the National University of Singapore, Singapore in 2000, and the licentiate of science and the doctor of science in technology in electrical engineering from Helsinki University of Technology, Helsinki, Finland. She is currently a professor at the Xidian University, China and a visiting professor and Finnish academy research fellow at the Aalto University, Finland. Before joining academia in 2011, she was a senior researcher at the Nokia Research Center, Helsinki, Finland, since 2000. Her research interests are in trust, security, privacy, and security-related data analytics. She is an inventor of 24 patents, all of them having been adopted in industry. She is an associate editor of IEEE Internet of Things Journal, Information Fusion, Information Sciences, IEEE Access, and JNCA. She served as a general chair or program chair for a number of international conferences including IEEE TrustCom 2015. She is a founder steering committee co-chair of IEEE Blockchain conference. She received several awards, including the 2017 Best Journal Paper Award issued by IEEE Communication Society Technical Committee on Big Data and the Outstanding Associate Editor of 2017/2018 for IEEE Access.

Robert H. Deng is AXA Chair Professor of Cybersecurity and Professor of Information Systems in the School of Information Systems, Singapore Management University since 2004. His research interests include data security and privacy, multimedia security, network and system security. He served/is serving on the editorial boards of many international journals, including TFIS, TDSC. He has received the Distinguished Paper Award (NDSS 2012), Best Paper Award (CMS 2012), Best Journal Paper Award (IEEE Communications Society 2017). He is a fellow of the IEEE.
