Blockchain and GDPR – A Study on Compatibility Issues of the Distributed Ledger Technology with GDPR Data Processing

This research investigates blockchain technology and GDPR compliance, and analyzes distributed ledger technology (DLT) from a data privacy perspective. Blockchain has become one of the most frequently discussed technologies for its ability to allow peer-to-peer transactions without a centralized intermediary. The GDPR came into effect in May 2018 to maintain data privacy across EU member states. DLT, the underlying technology of blockchain, is a decentralized system without any central authority. This research conducted a thorough review of prior literature to investigate the problems and determine the gaps in GDPR compliance of blockchain technologies, and to discuss the technical solutions and use-case designs that make blockchain more compliant with GDPR in terms of privacy. This systematic literature review addresses the gaps, feasibility, efficiency, and data privacy issues of the compatibility problem. The primary concern is that a distributed ledger system in which recorded data or transactions cannot be changed or erased challenges the GDPR data subject access rights (DSAR), under which every data subject has the right to rectify, delete, or limit the processing of their personal data at any time if necessary.


Introduction
Recently, major improvements have been made in the way businesses collect and manage personal data. The dependence on data to drive routine business and to fuel innovation has raised potential threats and risks to individuals' privacy. Privacy is an individual's right to control how personal data is collected, with whom it is shared, and how it is processed, retained, or deleted. GDPR is a one-of-a-kind regulation for protecting user data. Blockchain is considered a shared and immutable ledger for recording or registering transactions in a decentralized, shared storage system in a free and transparent manner. Such properties allow a blockchain to be fully distributed without a central authority, yet they raise concerns in terms of user privacy. This raises the question behind the gap studied here: can blockchain comply sufficiently with GDPR data protection requirements, given that the ledger is open and distributed and that data stored in a blockchain cannot be modified or removed? This paper gives detailed attention to how distributed ledgers can comply with and be adapted to GDPR regulations and laws, and how this can benefit data subjects.

GDPR
The EU GDPR is considered one of the modern regulatory frameworks that provides guidelines for processing the personal data of European Union (EU) residents and ensures data subjects' privacy [6]. Under the GDPR, organizations must process personal data in a compliant manner, limit collection to specified purposes, and protect the data from misuse [5]. The GDPR defines personal data as any information concerning an identified or identifiable individual, the data subject [7].
Data processing in GDPR is understood as an operation or set of operations performed on personal data, such as gathering, storing, modifying, retrieving, publishing, making available, erasing, or destroying such data. As per Article 32 of the GDPR, data controllers are the principal owners and are accountable for the fair and reasonable processing of the information by means of measures and procedures [17]. Data processors, in turn, are answerable to data controllers and must notify the controller of any data breaches.

Blockchains
In simple terms, a blockchain is a chain of blocks; it can be defined as a database that ensures the security, transparency, and decentralization of transactions [8]. Blockchain belongs to a larger group of technologies known as "Distributed Ledger Technology," which is protected by public/private key signature technology [4], as shown in Figure 1. Blockchain comes into play when businesses need a database with shared access among parties that may be unknown, untrusted, or have competing interests, and when it is not practical to trust a third party to manage that database. Its distributed architecture is more resilient and reduces the opportunity for hacks. Blockchain transactions are verifiable, traceable, and auditable, which creates transparency [26]. After Bitcoin, many other cryptocurrencies were introduced to the market, among them Ethereum, XRP, Tether, Litecoin, and EOS. They leverage blockchain technology to gain transparency, decentralization, and immutability (Politou et al., 2019). Blockchain has the following properties: i) All transactions are open, and any participant in the blockchain can see any user's information. ii) Transactions are shared and decentralized, so many copies of the blockchain coexist. iii) Transactions are considered permanent, which implies that any transaction information stored or documented cannot easily be changed or erased [1].
Because transactions are decentralized and the data is encrypted and stored on multiple storage devices, a public blockchain is considered a transparent ledger that is almost impossible to hack [9]. Permissioned blockchains, by contrast, may be open and transparent to everyone or restricted, depending on the use case. A private blockchain is considered less secure than a public blockchain because it relies mostly on access controls that restrict who can participate in the network [24].

Blockchain and GDPR Compliance
However, while blockchain has been touted as a fail-safe technology for securing personal data and privacy, there are concerns that it could impinge on them. Most notably, blockchain poses legal enforcement issues when storing information. Any data written to the blockchain is considered permanent, which is why a blockchain is referred to as immutable [19].
Each block of the blockchain contains the hash of its predecessor block, which forms a cryptographically secure chain. This property makes altering the chain practically impossible, as any change would invalidate all subsequent blocks [20]. Considering the requirements of the EU GDPR, therefore, the very property that protects a blockchain is in tension with the privacy safeguards needed to protect personal data. Blockchain also conflicts with the data minimization principle of GDPR, which means collecting only the data required to fulfill a specific purpose [28]. Notably, conflicts between GDPR and blockchain persist around data subjects' rights to rectify, alter, and remove data, and around the identification of data controllers and data processors and their obligations on the blockchain [10]. Figure 2 shows the phase-by-phase approach followed for this systematic literature review.
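The chaining described above can be illustrated with a minimal sketch. It is not the structure of any particular blockchain: the block fields (`index`, `data`, `prev_hash`), the use of SHA-256 over a JSON encoding, and the function names are all assumptions made for illustration. It shows why an in-place rectification or erasure of recorded data breaks the link to every later block.

```python
import hashlib
import json
from typing import Optional


def block_hash(block: dict) -> str:
    """Hash a block's full contents, including the predecessor's hash."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(transactions: list) -> list:
    """Create one block per transaction; each block stores the previous block's hash."""
    chain = []
    prev = "0" * 64  # placeholder predecessor hash for the first block
    for index, tx in enumerate(transactions):
        block = {"index": index, "data": tx, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def first_broken_link(chain: list) -> Optional[int]:
    """Return the index of the first block whose prev_hash no longer matches, else None."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != block_hash(chain[i - 1]):
            return i
    return None


chain = build_chain(["alice pays bob", "bob pays carol", "carol pays dave"])
print(first_broken_link(chain))  # intact chain: no broken link

chain[0]["data"] = "REDACTED"    # attempt to erase personal data in place
print(first_broken_link(chain))  # the successor block no longer links back
```

Repairing the chain after such an edit would require recomputing the hash stored in every subsequent block, which is the practical obstacle to rectification and erasure that the text refers to.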

Research Objective and Prior Research
The idea behind this research is to map out previous research papers and their results, reviewing the efforts made toward GDPR-compatible blockchain research. The focus is explicitly on data subjects' rights to rectify transaction data and to delete data whenever it is processed excessively. For this purpose, we formulated 2 research questions to structure the research work, as shown in Table 1.
RQ 1: How is blockchain technology non-aligned with the privileges of the data subject in GDPR?
In blockchain technology, the transactions between participants are permanent, immutable, open, and visible to every participant. In addition, the basic principle of blockchain technology is to distribute data. These properties make it difficult for GDPR data subjects to exercise their personal data privacy rights, and the wide distribution of data in a blockchain contradicts the principle of data minimization.
RQ 2: Which methods or techniques are available for blockchain to exercise the GDPR data subject rights capacity on the right to erase, right to rectify the processed personal data?
A few articles addressed approaches such as self-sovereignty, hashing techniques, encryption, decentralized identities, and zero-knowledge proofs that help data subjects exercise their rights. This question revolves around the papers that discussed the use-cases, applications, and types of techniques used to address the data privacy issues of blockchain with respect to GDPR.

Literature Review of Primary Studies
As far as we could reasonably determine, Systematic Literature Reviews (SLRs) on compliance design between blockchain technology and GDPR are particularly limited. One very recent article covered GDPR-compliant blockchain design [25]. The study in [15] examines existing solutions for self-sovereign identity on blockchain and investigates the associated GDPR problems. However, case-by-case analysis still seems to be required to understand the legal uncertainties and the privacy-enhancing technologies. Regarding technologies, it cites a comprehensive review of technical and advanced cryptographic techniques for resolving conflicts when applied in permissioned and permissionless blockchains. Given the slow adoption of blockchain in real-life applications, researchers' efforts in investigating methods are notable [29]. Obtaining consent from the user is one of the major responsibilities of the data controller when collecting personal data; in GDPR terms, the user is the data subject [30]. A decentralized model guarantees that user data is accessed only by approved parties on the basis of user consent. The study in [3] discusses correctness and completeness mechanisms for user consent. It also provides a comprehensive analysis of current cutting-edge privacy technologies, covering research approaches and processes for blockchain privacy issues. The main problems that arise when applying privacy-protecting techniques, which stem from the cost of cryptographic operations, still need to be discussed. Further, there is a need to explore current threats to next-generation DFS technologies, which are yet to be fully exploited.
The study in [13] discusses the right to be forgotten in connection with the right to erasure, under which data subjects may request erasure if they believe that their personal data no longer needs to be stored by the data controller. That paper addresses the lawful presentation of digital information, which obliges parties to obscure personal information about others at the data subject's request. Several questions and problems have arisen in relation to the effect of GDPR on information security operations [11]. The work in [12] gives researchers a clear discussion in support of exchanging data in cybersecurity [27].
The studies explained the role of technologies such as Jolocom, decentralized identities, hashing techniques, and encryption techniques, and how they help data subjects perform secure transactions. Further research into design-based solutions, use-cases, and reduced latency is needed in the near future. This would help address the problem with greater accuracy while maintaining data privacy. Because this is a newly developed concept, more research is required, and there remain many areas in which future researchers can explore more feasible concepts and architectural solutions [26].

Research Methodology
We followed the systematic literature review methodology, according to the guidelines in [14], to answer the research questions.

Primary studies
Primary studies were identified by searching for particular keywords in the respective search engine databases. The databases used for searching research papers are shown in Figure 3.

Quality Assessment Criteria
Inclusion and exclusion criteria set the boundaries of the literature review. They are usually determined after the research questions are set and before the search is conducted. However, scoping searches may need to be undertaken to determine appropriate criteria. The requirements for inclusion and exclusion are generally reported as a paragraph.
This systematic review applies inclusion and exclusion criteria to select papers that report observational findings on new areas of blockchain. In addition, the selected papers discuss strengthening data protection through effective methodologies for use-cases.
The initial keyword searches across the respective platforms returned a total of 352 studies. Removing duplicates reduced this number to 210 articles. Applying the inclusion and exclusion criteria reduced the set to 60 papers. Those 60 articles were then read in full text, after which 34 papers remained. Forward and backward snowballing identified 7 additional articles, bringing the final number to 41 research studies.
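The selection counts reported above can be checked with a few lines of arithmetic. The sketch below only restates the numbers from the text; the variable names and stage labels are chosen for illustration.

```python
# Counts as reported in the selection process above.
initial_hits = 352
after_deduplication = 210
after_criteria = 60
after_full_text = 34
from_snowballing = 7

# Number of records removed at each filtering stage.
removed = {
    "duplicate removal": initial_hits - after_deduplication,
    "inclusion/exclusion criteria": after_deduplication - after_criteria,
    "full-text reading": after_criteria - after_full_text,
}
final_studies = after_full_text + from_snowballing

for stage, count in removed.items():
    print(f"removed by {stage}: {count}")
print(f"final primary studies: {final_studies}")
```

The stages remove 142, 150, and 26 records respectively, and the 34 remaining papers plus the 7 found by snowballing give the 41 primary studies.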

Inclusion criteria
- The article should explain the types of blockchain transactions, their attributes, and the associated privacy concerns.
- The article should focus on the GDPR access rights of data subjects and on whether distributed ledger technology is compatible with data subjects exercising those rights, in line with this paper's research questions:
  a) The data subject rights for processing of data.
  b) Compliance concerns of distributed ledger and GDPR.
- The article should examine use-cases that offer effective and promising advances combining blockchain and data protection, such as strategies for protection by design, self-sovereignty, and encryption.

Exclusion Criteria
- The article focuses solely on blockchain cryptocurrencies such as Bitcoin, Ethereum, Libra, Litecoin, etc.
- The paper is non-peer-reviewed literature, including technical reports and editorials.
- The article is not in English or focuses on other technologies such as IoT, programming, etc.
Table 2 lists the studies excluded during full-text analysis. We found a total of 7 primary studies to be excluded after performing the quality assessment process suggested in [23]. Figure 4 shows the primary studies published over time.

Research Findings
All the primary studies were read in full and evaluated both qualitatively and quantitatively. The research focused on the irreversible nature of blockchain, the permanence of data written to a blockchain, and transactions that cannot be corrected. It also covered studies based on self-sovereignty, collective identification, hashing techniques, etc. The trends in the primary studies show that, because of the free and transparent nature of blockchain transactions, nearly half of all studies concern blockchain and its data privacy issues. Privacy techniques are the second most common theme, at 20%. The studies provide potential technological approaches for self-sovereign identity on blockchains and examine the problems that arise with respect to the EU GDPR. They also discuss how blockchain can get around these issues. ZKP applications hold great promise in terms of data protection by design and self-sovereign control. Each block contains the hash of its predecessor, which means the blocks are linked linearly back to the original genesis block. The difficulty of modifying one block and then finding correct hashes for all the following blocks is what makes the blockchain almost incorruptible, or immutable. Some research papers discussed the data security strategies that need to be put in place and how to rectify data, although few papers discussed how data controllers can act on behalf of a data subject and manage the transactions in the blockchain.
Bitcoin is the most famous application of blockchain technology so far. Other cryptocurrencies such as Ethereum, Libra, Litecoin, and Zcash are also well known [21]. In fact, Ethereum is considered the second-largest cryptocurrency. To show the details of transaction nodes in the blockchain, we take Bitcoin transactions as an example. Bitcoin blocks generally contain around 1500-2000 transactions, and blocks are limited to 1 MB in size. Each block carries, among other fields, a timestamp, a nonce, and the hash of the predecessor block in the chain, as shown in Figure 5. Transaction records are historical, verifiable, and incorruptible in nature. Each record adds to the chain of blocks.
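To make the block structure more concrete, the following sketch serializes a Bitcoin-style block header and hashes it. The standard Bitcoin header is 80 bytes: version (4), previous block hash (32), Merkle root of the block's transactions (32), time (4), difficulty bits (4), and nonce (4), and its identifier is a double SHA-256 of those bytes. The field values below are illustrative placeholders, not a real block, and the function names are assumptions for this example.

```python
import hashlib
import struct


def serialize_header(version: int, prev_hash: bytes, merkle_root: bytes,
                     timestamp: int, bits: int, nonce: int) -> bytes:
    """Pack the six header fields little-endian: 4 + 32 + 32 + 4 + 4 + 4 bytes."""
    if len(prev_hash) != 32 or len(merkle_root) != 32:
        raise ValueError("hash fields must be exactly 32 bytes")
    return struct.pack("<L32s32sLLL", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)


def header_hash(header: bytes) -> str:
    """Double SHA-256 of the header, displayed in the conventional reversed byte order."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


# Illustrative placeholder values only; this is not a real Bitcoin block.
header = serialize_header(
    version=1,
    prev_hash=bytes(32),  # all zeros, as for a block with no predecessor
    merkle_root=hashlib.sha256(b"hypothetical transactions").digest(),
    timestamp=1_600_000_000,
    bits=0x1D00FFFF,
    nonce=42,
)
print(len(header))          # size of the serialized header in bytes
print(header_hash(header))  # identifier that the next block would reference
```

Because the Merkle root summarizes every transaction in the block, changing any recorded transaction changes the header hash, and with it the predecessor reference stored in the next block.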

Research Discussion
The initial keyword searches indicate that a good number of blockchain-related papers exist. However, blockchain alone is not the subject of this paper; it also deals with GDPR. The keyword searches were therefore mostly focused on blockchain and GDPR combined, rather than on blockchain technology and GDPR separately. Though there is a considerable number of papers on blockchain and on GDPR [31] when considered separately, fewer papers discuss qualitatively the combination of GDPR data privacy and blockchain technology. The scope for research has been slowly

Research Question 1: How is blockchain technology non-aligned with the privileges of the data subject in GDPR?
Blockchain has compliance issues in handling GDPR personal data processing, arguably because the permanent nature of blockchain conflicts with the GDPR principle of storage limitation. At the same time, blockchains can be effective in providing solutions that meet the requirements imposed by GDPR. For instance, the permanence of actions recorded on blockchains can enable arrangements that reliably track the consent of the data subject [1]. As a data privacy law, GDPR represents an evolution rather than a revolution. The decentralized model used by blockchain involves a large number of actors in the processing. This adds a layer of complexity to compliance with a legal framework that was not designed with blockchain in mind [33].
Participants should carefully select the type of blockchain whose design aligns with the data protection principles under GDPR, and should always try to minimize the personal data stored on a chain. In a blockchain: a) Participants with the right to make entries can act as data controllers; b) Miners who validate transactions containing personal data on a blockchain can act as processors; and c) Accessors may act either as processors or as controllers.
Among the data subject rights, the right to access personal data and the right to data portability are not, at first sight, especially problematic on the blockchain. Implementing the rights to erase, object, and rectify can be challenging; however, a few technical solutions discussed in prior research papers help to exercise those rights and move closer to compliance with GDPR [16], [2]. As a data controller and data processor, an enterprise must be able to demonstrate compliance with the GDPR requirements, or at least record how the implementation is progressing, by performing risk assessments and data protection impact assessments company-wide.

Research Question 2: Which methods or techniques are available for blockchain to exercise the GDPR data subject rights capacity on the right to erase, right to rectify the processed personal data?
In GDPR, controllers and processors are the organizations handling personal data. The test for deciding who is acting as a controller is a factual one. The controller's job is to define the means and ends of the data processing. The role is also specific to the processing carried out: an entity may act as a controller in respect of a specific process related to a specific set of personal data and simultaneously as a processor in respect of a different process related to the same set of personal data. Here, the entity may be the transaction initiator, i.e., the data subject, or someone else who is processing the data. Article 17 of GDPR states that data subjects have the right to have personal data erased when it is no longer required for the purpose of lawful processing.
[Figure 5: block header fields with sizes in bytes: Version (4), hashPrevBlock (32), Time (4), Bits (4), Nonce (4)]
As discussed, a permissioned blockchain is one of the answers for the right to restrict processing, and researchers have also studied the self-sovereignty method. The Sovrin network is the first public permissioned blockchain built as a global public utility exclusively to support self-sovereign identity and verifiable claims. Recent advances in blockchain technology now allow each public key to have its own address, known as a decentralized identifier (DID). A DID is stored on the public ledger along with a DID document, which includes the verification key for the DID, any other credentials that the identity owner chooses to reveal, and the network addresses for communication. A large number of studies indicate that the identity owner manages the DID record on the Sovrin network by using the corresponding private key [34].
The Jolocom framework supports the storage of DIDs on public permissionless blockchains. Like Sovrin, Jolocom does not store credentials on a blockchain. The credentials are shared with an agent, such as a cloud provider. The controller could be the issuing organization or some agent. Frequently, credentials can only be provided at the data subject's request, so that revoking a credential is under the data subject's control, while the issuing organization may only add notice of the revocation. Where a data subject is itself deemed responsible for the processing of its personal data, the GDPR may not extend to it. However, the GDPR might apply to systems that act on behalf of the data subject [15].
To understand how to categorize the controller and the processor, we need to understand the transaction procedure in the blockchain. DIDs are stored on a blockchain. On blockchains, we have to differentiate between the controller at the blockchain level, at the transaction level, and, if applicable, at the smart contract level. A practical solution would be to simply store the personal data somewhere else, where we have read and write access, for example on a secure server or cloud server. We can then store a reference to that data on the blockchain [22], almost like a shortcut or pointer. To create this link, we make a digital fingerprint of the data using a hash function and store that hash on the blockchain, as shown in Figure 6 a. A hash has two interesting properties:
- Hashes work one way, meaning participants can create a hash of some data but cannot take the hash and turn it back into the data.
- The hash function allows us to verify that the files on the central server have not been tampered with.
The hash stored inside the blockchain is just a seemingly random string of letters and numbers, but it qualifies as personal data [35] because it can be linked to the data on the server, as shown in Figure 6 b. For a data subject to exercise the right to erasure, the actual data is simply removed from the central server. In that case, the hash in the blockchain becomes useless and is no longer considered personal data, because it points to nothing.
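The off-chain storage approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation from the reviewed studies: a dictionary stands in for the secure or cloud server, a Python list stands in for the append-only blockchain, SHA-256 is assumed as the hash function, and all function names are hypothetical.

```python
import hashlib

off_chain_store = {}  # stands in for the secure server or cloud server
ledger = []           # append-only list standing in for the blockchain


def fingerprint(data: bytes) -> str:
    """One-way digital fingerprint of the personal data."""
    return hashlib.sha256(data).hexdigest()


def store(data: bytes) -> str:
    """Keep the data off-chain and record only its hash on the ledger."""
    digest = fingerprint(data)
    off_chain_store[digest] = data
    ledger.append(digest)
    return digest


def verify(digest: str) -> bool:
    """True if the off-chain data still exists and matches the on-chain hash."""
    data = off_chain_store.get(digest)
    return data is not None and fingerprint(data) == digest


def erase(digest: str) -> None:
    """Exercise the right to erasure by deleting only the off-chain copy."""
    off_chain_store.pop(digest, None)


ref = store(b"name=Jane Doe;email=jane@example.org")
print(verify(ref))    # data present and untampered
erase(ref)
print(verify(ref))    # the hash now points to nothing
print(ref in ledger)  # the ledger itself was never modified
```

The ledger is only ever appended to, so its immutability is preserved, while rectification and erasure take place on the server where read and write access exists.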

Conclusion
Although blockchain technology provides the advantages of transparency and immutability, these same properties cause significant conflicts with GDPR data protection regulation. Blockchain developers should therefore carefully examine the information intended for storage on a blockchain and weigh the advantages and disadvantages of the type of blockchain to be used.
The encouraging aspect of this topic is that blockchain is at a stage where its foundations are still being built, and some of these foundations will be able to incorporate the spirit and the letter of the GDPR over time.

Future Scope
Blockchain is a new technology and the EU GDPR is a new data privacy law, so there is considerable scope for further research to understand both. This gives researchers the opportunity to find different approaches that would be feasible for maintaining compliance with respect to personal data protection. We need to remember that data protection is a journey, not a destination. The further the technologies develop, the more scope there will be for understanding and resolving the issues with respect to data privacy laws. Thorough research is needed on the following:
- How can blockchain customers rely on its transparency and still be assured of the confidentiality and integrity of data using newly developed technologies?
- Which of blockchain's cybersecurity issues need to be discussed to prevent attacks?