FHIRChain: Applying Blockchain to Securely and Scalably Share Clinical Data

Secure and scalable data sharing is essential for collaborative clinical decision making. Conventional clinical data efforts are often siloed, however, which creates barriers to efficient information exchange and impedes effective treatment decisions for patients. This paper provides four contributions to the study of applying blockchain technology to clinical data sharing in the context of technical requirements defined in the “Shared Nationwide Interoperability Roadmap” from the Office of the National Coordinator for Health Information Technology (ONC). First, we analyze the ONC requirements and their implications for blockchain-based systems. Second, we present FHIRChain, which is a blockchain-based architecture designed to meet ONC requirements by encapsulating the HL7 Fast Healthcare Interoperability Resources (FHIR) standard for shared clinical data. Third, we demonstrate a FHIRChain-based decentralized app using digital health identities to authenticate participants in a case study of collaborative decision making for remote cancer care. Fourth, we highlight key lessons learned from our case study.


The Importance of Data Sharing in Collaborative Decision Making
Secure and scalable data sharing is essential to provide effective collaborative treatment and care decisions for patients. Patients visit many different care providers' offices during their lifetime. These providers should be able to exchange health information about their patients in a timely and privacy-sensitive manner to ensure they have the most up-to-date knowledge about patient health conditions.
As another example, in telemedicine practice Berman and Fenaughty [1], where patients are remotely diagnosed and treated, the ability to exchange data securely and scalably is particularly important for enabling clinical communications regarding remote patient cases. Data sharing helps improve diagnostic accuracy Castaneda et al. [2] by gathering confirmations or recommendations from a group of medical experts, and helps prevent inadequacies Singh et al. [3] and errors in treatment plans and medication Kaushal et al. [4]; Schiff et al. [5]. Likewise, aggregated intelligence and insights Taichman et al. [6]; Warren [7]; Geifman et al. [8] help clinicians understand patient needs and in turn apply more effective in-person and remote treatments.
Data sharing is also essential in cancer care, where groups of physicians with different specialties form tumor boards. These boards meet on a regular basis to analyze cancer cases, exchange knowledge, and collaboratively create effective treatment and care plans for each patient Gross [9]. Regional virtual tumor boards are also being implemented via telemedicine Ricke and Bartelink [10]; Marshall et al. [11] for institutions that lack inter-specialty cancer care due to limited oncology expertise and resources Levit et al. [12].

Administrative Support for Coordinating Health IT Efforts
The Office of the National Coordinator for Health Information Technology (ONC) is a division of the Office of the Secretary within the United States Department of Health and Human Services. ONC is the principal federal entity to oversee and coordinate health IT efforts, including the development of interoperable, privacy-preserving, and secure nationwide health information systems and the promotion of widespread, meaningful use of health IT to improve healthcare.
• Security and privacy concerns. Despite the need for data sharing, concerns remain regarding protection of patient identity and confidentiality Terry [13]. For instance, virtual medical interactions may increase the risk of clinical data breaches due to electronic transmission of data without highly secure infrastructures in place, which can result in severe financial and legal consequences Downey et al. [14]. Likewise, medical identity theft may occur more frequently, especially in telemedicine Terry [13], where virtual (i.e., networked) interactions are replacing face-to-face interactions between providers and patients.
• Lack of trust relationships between healthcare entities. Trust relationships between healthcare entities Hripcsak et al. [15] (e.g., care providers and/or healthcare institutions) are an important precondition to digital communications Hartvigsen et al. [16] and data sharing in the absence of custody over shared data. Larger healthcare facilities (such as enterprise hospital systems) may be networked Maheu et al. [17], but communications between private or smaller practices may not be established.
• Scalability concerns. Large-scale datasets may be hard to transmit electronically due to restrictive firewall settings or limitations in bandwidth (which are still common in rural areas LaRose et al. [18]). Lack of scalability can also impact overall system response time and data transaction speed Bondi [19].
• Lack of interoperable data standards enforcement. Without the enforcement of existing interoperable data standards for shared data (such as HL7's Fast Healthcare Interoperability Resources (FHIR) Bender and Sartipi [20]), health data can vary in formats and structures that are hard to interpret and integrate into other systems Richesson and Krischer [21].
What is needed, therefore, is a standards-based architecture that can integrate with existing health IT systems (and related mobile apps) to enable secure and scalable clinical data sharing for improving continuous, collaborative decision support.
Research focus and contributions → Architectural considerations for secure and scalable blockchain-based clinical data sharing systems. Blockchain technologies have recently been touted Das [22]; Mettler [23]; Azaria et al. [24] as a technical infrastructure to support clinical data sharing that promotes care coordination. A key property of blockchains is their support for "trustless disintermediation." This property enables multiple parties who do not fully trust each other to exchange digital assets (such as the Bitcoin cryptocurrency Nakamoto [25]), while still protecting their sensitive, personal data from each other.
Our prior work Zhang et al. [26] provided high-level evaluation recommendations for blockchain-based health IT solutions, focusing on common software patterns Zhang et al. [27] that can be applied to improve the design of blockchain-based health apps. This paper examines previously unexplored research topics related to alleviating the data sharing barriers described above, namely: what are the architectural considerations associated with properly leveraging blockchain technologies to securely and scalably share healthcare data for improving collaborative clinical decision support?
This paper provides the following contributions to using blockchain technologies in clinical data sharing to improve collaborative decision support:
• We summarize key technical requirements defined in the "Shared Nationwide Interoperability Roadmap" DeSalvo and Galvez [28] drafted by the Office of the National Coordinator for Health Information Technology (ONC) for creating an interoperable health IT system and analyze the implications for blockchain-based system design.
• We present the structure and functionality of a blockchain-based architecture called FHIRChain that meets the ONC technical requirements for sharing clinical data between distributed providers. FHIRChain uses HL7's FHIR data elements (which have uniquely identifying tags) in conjunction with a token-based design to exchange data resources in a decentralized and verifiable manner without requiring duplicated efforts of uploading data to a centralized repository.
• We demonstrate a FHIRChain-based decentralized app (DApp) that uses digital health identities to more readily authenticate participants and manage data access authorizations in a case study of clinical data sharing in remote cancer care. This DApp enables users to share specific and structured pieces of information (rather than an entire document), thereby increasing the readability of data and flexibility of sharing options.
• We highlight key lessons learned from our case study and discuss how our FHIRChain-based DApp can be further extended to support other technical requirements for advanced healthcare interoperability, such as coordinating other stakeholders (e.g., insurance companies and pharmacies) across the industry and providing patients with direct and secure access to their own medical records.
We also explore the data exchange issues that blockchains cannot yet address effectively, including semantic interoperability, healthcare malpractice, and unethical use of the data, which remain as future research problems in this space.

Paper Organization
The remainder of this paper is organized as follows: Section 2 provides an overview of blockchain technologies and the Ethereum platform, which is an open-source blockchain implementation that supports the development of DApps via "smart contracts;" Section 3 surveys different blockchain-based research approaches in the healthcare domain and compares our research on FHIRChain with related work; Section 4 summarizes ONC's key technical requirements for sharing clinical data and analyzes their implications for blockchain-based designs; Section 5 describes how the blockchain-based architecture of FHIRChain is designed to meet ONC requirements and motivates why we made certain architectural decisions; Section 6 analyzes the benefits and limitations of a case study that applied a FHIRChain-based DApp to provide collaborative clinical decision support; and Section 7 presents concluding remarks and outlines our key lessons learned and future work on extending the FHIRChain architecture described in this paper.

Overview of Blockchain
The most popular application of blockchain is the Bitcoin blockchain Nakamoto [25], which is a public distributed ledger designed to support financial transactions via the Bitcoin cryptocurrency. This blockchain operates in a peer-to-peer fashion with all transactions distributed to each network maintainer node (called a "miner") for verification and admittance onto the blockchain. These miners validate available transactions and group them into blocks, as shown in Fig. 1.
Miners then compete in solving a computationally expensive cryptographic puzzle, known as "proof-of-work," where a targeted hash value associated with the last valid block in the chain is calculated. The first miner to solve this puzzle receives a reward (i.e., an amount of Bitcoin) and appends their block of validated transactions to the blockchain sequence.
The Bitcoin blockchain uses the "proof-of-work" process outlined above to achieve consensus (agreement on the shared state and order of transactions) by (1) incentivizing miners to contribute powerful hardware and electricity to the network with small amounts of cryptocurrency as rewards and (2) discouraging rogue actors from attempting to manipulate or maliciously control the system.
After a block is added to the blockchain, its transaction history is secured from tampering via cryptography.
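The proof-of-work puzzle described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions rather than Bitcoin's actual procedure: the block layout, the function names (`block_hash`, `mine`, `verify`), and the "leading zero hex digits" difficulty rule are all illustrative choices, whereas Bitcoin hashes a binary block header and compares the result against a numeric target.

```python
import hashlib


def block_hash(prev_hash, transactions, nonce):
    """Hash the previous block's hash, the transactions, and a nonce together."""
    payload = prev_hash + "|" + "|".join(transactions) + "|" + str(nonce)
    return hashlib.sha256(payload.encode()).hexdigest()


def mine(prev_hash, transactions, difficulty=4):
    """Try nonces until the block hash starts with `difficulty` zero hex digits."""
    nonce = 0
    while True:
        digest = block_hash(prev_hash, transactions, nonce)
        if digest.startswith("0" * difficulty):
            return nonce, digest
        nonce += 1


def verify(prev_hash, transactions, nonce, difficulty=4):
    """Checking a solution takes one hash, while finding it takes many."""
    return block_hash(prev_hash, transactions, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    genesis = "0" * 64  # assumed placeholder for the last valid block's hash
    nonce, digest = mine(genesis, ["alice->bob:1"], difficulty=3)
    print(nonce, digest, verify(genesis, ["alice->bob:1"], nonce, difficulty=3))
```

Because each block's hash covers the previous block's hash, changing an earlier transaction changes every later hash, which is what makes tampering with an accepted block detectable.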
The Bitcoin blockchain is the most widely deployed example of this distributed ledger technology. In recent years, however, other types of blockchain technologies have emerged. For example, the Ethereum blockchain Buterin et al. [29] provides a more generalized framework via "smart contracts" Johnston et al. [30] that allow programs to run on the blockchain and store/retrieve information.
Smart contracts enable code to execute autonomously when certain conditions are met. They can also store information as internal state variables and define custom functions to manipulate or update this state. Operations in smart contracts are published as transactions and thus occur in a globally sequential order, in a similar fashion as shown in Fig. 1. These operations are deterministic and verifiable by miners in the Ethereum blockchain to ensure their validity.
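As a rough illustration of these properties, the sketch below models a contract as an object with an internal state variable and one function that updates it, and shows that replaying the same ordered transactions yields the same state. It is plain Python rather than an actual smart contract language, and the class name `KeyValueContract`, its ownership rule, and the `replay` helper are hypothetical.

```python
class KeyValueContract:
    """Toy stand-in for a smart contract: internal state plus a function that updates it."""

    def __init__(self):
        self.entries = {}  # state variable: key -> (owner, value)

    def put(self, sender, key, value):
        # Condition: only the account that first wrote a key may overwrite it.
        owner = self.entries.get(key, (sender, None))[0]
        if owner != sender:
            return False
        self.entries[key] = (sender, value)
        return True


def replay(transactions):
    """Apply transactions in their global order, as every validating node would."""
    contract = KeyValueContract()
    results = [contract.put(*tx) for tx in transactions]
    return contract.entries, results


if __name__ == "__main__":
    txs = [("alice", "k", "1"), ("bob", "k", "2"), ("alice", "k", "3")]
    # Two independent replays of the same sequence agree, which is what lets
    # miners verify each other's results.
    print(replay(txs) == replay(txs))
```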
The mechanisms described above make a blockchain decentralized and immutable, thereby removing the need for a trusted central authority. These properties make blockchain technologies attractive to certain communities of health IT researchers and practitioners as means to improve clinical communications while protecting the privacy of healthcare participants. The remainder of this paper examines how to effectively leverage blockchains for securely and scalably sharing clinical data that enables collaborative decision support.

Related Work Summary and Comparison
Due to the growing interest in using distributed ledger technologies for health IT systems, related work has explored various blockchain-based design considerations and prototypes. This section summarizes this related work and compares it with our research on FHIRChain and DApps that provide collaborative clinical decision support for remote patients.

Conceptual Blockchain-Based Design Considerations
Krawiec et al. [31] presented several existing pain points in current health information exchange systems and the corresponding opportunities provided by blockchain technologies. They also discussed how blockchain can be leveraged in health IT systems so that patients, health providers, and/or health organizations can collaborate. Nichol et al. [32] presented an analysis that assembles concepts in blockchain-related technologies and speculates on how blockchain can be used to solve common interoperability problems facing healthcare.
A team at IBM [33] took a broader approach by highlighting the challenges in the healthcare industry and providing concrete use cases to showcase potential applications of blockchain technologies. Our prior work also provided software design recommendations for creating general blockchain-based health IT systems Zhang et al. [27] and proposed assessment metrics for blockchain-based health systems Zhang et al. [26], which include a subset of the technical requirements defined in the ONC roadmap. This prior work of ours focused on providing more general or high-level recommendations for developers creating blockchain-based health IT systems.
The review paper by Kuo et al. [34] presented several blockchain applications in healthcare, such as improved medical record management and an advanced healthcare data ledger, along with the benefits of each application. They then analyzed key challenges associated with using blockchain technology for healthcare, including confidentiality, scalability, and the threat of a 51% attack on the blockchain network. According to the authors, some example implementation techniques that may mitigate these challenges are (1) encryption of sensitive data or dissemination of only metadata and storing sensitive data off-chain to protect confidentiality, (2) keeping only partial, ongoing verified transactions on-chain rather than the entire transaction history to increase scalability of the blockchain network, and (3) the adoption of a virtual private network or HIPAA-compliant components to prevent the 51% attack.

Blockchain Prototype Designs
Ekblaw et al. [35] created a decentralized record management platform that enables patients to access their medical history across multiple providers. This platform used a so-called "permissioned" blockchain (which is only accessible by authorized users, rather than one that is open to the public) to manage authentication, data sharing, and other security properties in the medical domain. Their blockchain design integrated with existing provider data storage to enable interoperability by curating a representation of patient medical records. Medical researchers were incentivized to contribute to mining of the blockchain by collecting aggregated metadata as mining rewards.
Peterson et al. [36] presented a healthcare blockchain that also considers integration with FHIR standards. They proposed a Merkle-tree-based blockchain system that introduces "Proof of Interoperability" as the consensus mechanism during block mining. Proof of Interoperability is based on conformance to the FHIR protocol, meaning that miners must verify the clinical messages sent to their blockchain to ensure they are interoperable with known structural and semantic standards.
Dubovitskaya et al. [37] also proposed a permissioned blockchain framework for managing and sharing medical records for cancer patient care. Their design employed a membership service to authenticate registered users using a username/password scheme. Patient identity was created via a combination of personally identifying information (including social security number, date of birth, names, and zip code) and encrypted for security. Medical data files were uploaded to a secure cloud server, with their access managed by the blockchain logic.
Unlike other blockchain designs, Gropper's "HIE of One" system Gropper [38] focused on the creation and use of blockchain-based identities to credential physicians and address the patient matching challenge facing health IT systems. Patients are expected to install a digital wallet on their personal devices to create their blockchain-based IDs, which can then be used to communicate with the rest of the network. Instead of storing patient information, Gropper's system would consume only the blockchain-based ID and use it to secure and manage access to patient data located in EHR systems.

Differentiating our Research Focus of FHIRChain from Related Work
This paper presents our blockchain-based framework, called FHIRChain, whose architectural choices were explicitly designed to meet key technical requirements defined by the ONC interoperability roadmap. Our design differs from related work on blockchain infrastructures and associated consensus mechanisms since it is decoupled from any particular blockchain framework and instead focuses on design decisions for smart contracts and other blockchain-interfacing components. FHIRChain is thus compatible with any existing blockchain that supports the execution of smart contracts.
In the remainder of this paper we describe how our FHIRChain-based DApp demonstrates the use of digital health identities that do not directly encode private information and can thus be replaced if lost or stolen, even in a blockchain system. While our approach is similar to the use of digital IDs in the HIE of One system Gropper [38], FHIRChain provides a more streamlined solution. In addition, we incorporate a token-based access exchange mechanism in FHIRChain that conforms with the FHIR clinical data standards. Finally, we leverage public key cryptography to simplify secure authentication and permission authorizations, while simultaneously preventing attackers from obtaining unauthorized data access.

Technical Requirements for Blockchain-Based Clinical Data Sharing
The "Shared Nationwide Interoperability Roadmap" defines technical requirements and guiding principles for creating interoperable health IT systems DeSalvo and Galvez [28]. Based on our experiences to date, we contend that crafting a blockchain architecture to meet these requirements necessitates overcoming significant challenges to utilize blockchain technology in healthcare most effectively.
This section first analyzes five key technical requirements fundamental to clinical data sharing systems and then discusses the implications of these requirements on blockchain-based architectures. Sections 5 and 6 subsequently describe how we developed and applied our FHIRChain blockchain-based architecture to create a decentralized app (DApp) that meets the ONC requirements in the context of collaborative clinical decision making.

ONC Requirement Summary
The ONC requirements state that an identity ecosystem should be employed to minimize identity theft and provide redress in case of medical identity fraud, while complying with individual privacy regulations. Providers, hospitals, and their health IT systems should be easily identity-proofed and authenticated when exchanging electronic health information. Healthcare systems today, however, lack "consistently applied methods and criteria" for identity proofing and authentication across organizations DeSalvo and Galvez [28]. For example, different network service providers have different policies or requirements and may not acknowledge the methods applied by other network service providers.
One of the most popular (and least complex) approaches to exchanging data is direct secure messaging DeSalvo and Galvez [28]. For example, the Direct project (HealthIT.gov [39]) was launched to create a standard way for participants to send authenticated, encrypted health information directly to known, trusted recipients over the Internet. Providers or care centers using EHR systems without Direct integration, however, cannot benefit from the direct exchange capability.

Implications for Blockchain-Based System Design
For a blockchain-based system, storing identification information (such as personal email) directly on-chain is problematic (Greenspan, [59,40]). In particular, a property of blockchains is information "openness," i.e., all data and associated modification records are immutably recorded and publicly available to all network participants. In the case of Bitcoin, data is open to everyone with Internet access Nakamoto [25], whereas in a non-public blockchain (such as a consortium blockchain Buterin et al. [29]) data access is limited only to authenticated blockchain participants.
To meet the requirement of openness while complying with health privacy regulations Centers for Disease Control et al. [41], a blockchain-based system should thus support user identity-proofing and authentication while encapsulating sensitive personal information. Section 5.2.1 shows how FHIRChain addresses this identifiability and authorization requirement via digital health identities based on public key cryptography Menezes et al. [42].

ONC Requirement Summary
The ONC requirements state that data should be shared securely and privately without unauthorized or unintended alteration, while making the information available to authorized parties. Data encryption is recommended both when data is sent over networks (data-in-motion) and when it is stored (data-at-rest). Management and distribution of encryption keys must be "secure and tightly controlled" DeSalvo and Galvez [28].

Implications for Blockchain-Based System Design
There has been recent interest Al Omar et al. [43]; Yue et al. [44] in using blockchain technologies as decentralized storage for encrypted health data. As discussed in Section 2, however, the open and transparent nature of blockchain raises privacy concerns when attempting to integrate blockchain into the health IT domain. Although sensitive data can be encrypted, flaws in encryption algorithms or software implementations may expose the data contents in the future. To ensure long-term data security, therefore, a data storage design should be "simple" to minimize software bugs (Shea, [45]), e.g., by not storing sensitive data (encrypted or not) on-chain, yet still enable data flow from one user to another Zhang et al. [26].
Another implication of storing data on a blockchain is scalability. All blockchain transactions (such as storing data in a smart contract and modifying the data) and data records are replicated in full on every blockchain node. In a public blockchain, moreover, transaction fees are paid to miners to reward their validation efforts, as described in Section 2. As new data is added or modified, each change must be propagated to all nodes, raising scalability challenges and potentially incurring significant long-term operational costs. Section 5.2.2 shows how FHIRChain addresses this requirement via a hybrid on-chain/off-chain storage model.

ONC Requirement Summary
The ONC advocates "computable privacy" that represents and communicates the permission to share and use identifiable health information DeSalvo and Galvez [28]. Individuals should be able to document their permissions electronically, which are then honored as needed. Permission authorizations to receive or access an individual's clinical data should be accurate and trustworthy, requiring both the data requestor and holder to have a common understanding of what is authorized.

Implications for Blockchain-Based System Design
Unfortunately, smart contract operations only occur in the blockchain space to ensure deterministic outcomes. Services that exist off the blockchain (such as OAuth Hardt [46]) therefore cannot be used. Given this constraint, an alternative mechanism for data access permissioning should be a key component of a blockchain-based design. Section 5.2.3 shows how FHIRChain addresses this requirement via a token-based permission model.

ONC Requirement Summary
To satisfy interoperability needs, the ONC requirements state that health IT systems should be implemented with an "intentional movement and bias" DeSalvo and Galvez [28] toward a clinical data standard identified by ONC's recently finalized Interoperability Standards Advisory (Introduction to the isa, [47]). The data exchanged should be structured, standardized, and contain discrete (granular Kim Futrell [48]) information. Likewise, standards should use metadata to communicate their context along with pieces of structured data.
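To make "structured, discrete, and accompanied by metadata" concrete, the sketch below shows a single clinical observation represented as a Python dictionary. The element names follow the FHIR Observation resource as we understand it, but the identifiers and values are invented for illustration and the helpers `resource_tag` and `summarize` are hypothetical, not part of any FHIR library.

```python
import json

# One discrete, structured piece of clinical data with its own metadata.
# All values below are made up for illustration.
observation = {
    "resourceType": "Observation",
    "id": "example-weight",
    "meta": {"lastUpdated": "2018-01-01T00:00:00Z"},
    "status": "final",
    "code": {
        "coding": [
            {"system": "http://loinc.org", "code": "29463-7", "display": "Body weight"}
        ]
    },
    "subject": {"reference": "Patient/example"},
    "valueQuantity": {
        "value": 72.5,
        "unit": "kg",
        "system": "http://unitsofmeasure.org",
        "code": "kg",
    },
}


def resource_tag(resource):
    """Build a 'Type/id' tag that identifies one resource."""
    return f"{resource['resourceType']}/{resource['id']}"


def summarize(resource):
    """Render the coded measurement in a form a clinician can read."""
    coding = resource["code"]["coding"][0]
    quantity = resource["valueQuantity"]
    return f"{coding['display']}: {quantity['value']} {quantity['unit']}"


if __name__ == "__main__":
    print(resource_tag(observation), "->", summarize(observation))
    print(json.dumps(observation, indent=2))
```

Because every element is named and coded, a receiving system can pick out one measurement without parsing a free-text document.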

Implications for Blockchain-Based System Design
To provide collaborative clinical decision support, health IT systems must present shared data to clinicians in a structured and readable format Kawamoto et al. [49]. This requirement implies the enforcement of existing, commonly accepted clinical data standard(s), rather than introducing new data exchange formats. Section 5.2.4 shows how FHIRChain addresses this requirement by enforcing the FHIR standard.

ONC Requirement Summary
The ONC requirements state that since technology inevitably changes over time, health IT system designs should be capable of evolving by maintaining modularity. When divided into connected, modular components, health IT systems become more resilient to change with increased flexibility. In turn, these properties enable the adoption of newer, more efficient technologies over time without rebuilding the entire system.

Implications for Blockchain-Based System Design
Modularity requires a carefully crafted design to avoid "information lock-in" due to the immutability of smart contracts. Every change to smart contract code creates a new contract instance on the blockchain, nullifying previous versions and their data. To minimize dependencies and the need to upgrade, therefore, smart contracts should be loosely coupled with other components in the system. Section 5.2.5 shows how FHIRChain addresses this requirement by applying the model-view-controller (MVC) pattern Leff and Rayfield [50].

FHIRChain: A Blockchain-Based Architecture for Clinical Data Sharing
This section first presents an overview of FHIRChain, which is a blockchain-based architecture we designed to meet the ONC requirements for secure and scalable sharing of clinical data described in Section 4. We then explain why we made specific architectural decisions in FHIRChain to address each requirement and how they solve the five challenges facing blockchain technology described in Section 4. This architecture provides a general data sharing solution applicable to a wide range of health IT systems. It also serves as the basis for our decentralized app (DApp) prototype described in Section 6, which customizes FHIRChain to support collaborative clinical decision making using a case study of cancer care in telemedicine.

FHIRChain Overview
The dashed ellipse in Fig. 2 represents a blockchain component that mediates data sharing between collaborating medical professionals (represented by providers with green check marks). Clinical data silos are represented by heterogeneous database symbols, which we normalized with the FHIR standards to enforce a common structure of shared data. Secure database connectors (represented as small circles) connect siloed data sources to the blockchain by exposing secure access tokens to data references that can be obtained only by authorized entities. The secure tokens are recorded in a smart contract (represented by linked documents) for decentralized access and also traceability.
In addition to storing secure access tokens, the smart contract also maintains an immutable timestamped transaction log (represented as a keyed file symbol) of all events related to exchanging and actually consuming these tokens. These logs include specific information regarding what access has been granted to which user by whom, who has consumed which token to access what resource, etc. To ensure the validity of shared data, FHIRChain can be configured to only approve participation from certified clinicians and healthcare organizations with a membership registry.
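The tamper-evident, timestamped log described above can be approximated off-chain with a hash-chained list, as in the following minimal sketch. It illustrates only the tamper-evidence idea. On an actual blockchain, immutability comes from consensus among nodes rather than from this data structure, and the class `AccessLog` and its field names are hypothetical.

```python
import hashlib
import json


def _entry_hash(body):
    """Hash a log entry's fields in a stable (sorted-key) JSON encoding."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class AccessLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    def __init__(self):
        self.entries = []

    def append(self, timestamp, actor, action, token_id):
        prev = self.entries[-1]["hash"] if self.entries else "0" * 64
        body = {
            "timestamp": timestamp,
            "actor": actor,
            "action": action,  # e.g. "granted" or "consumed"
            "token_id": token_id,
            "prev": prev,
        }
        self.entries.append({**body, "hash": _entry_hash(body)})

    def is_intact(self):
        """Recompute every hash and link, so any edited entry breaks the chain."""
        prev = "0" * 64
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _entry_hash(body):
                return False
            prev = entry["hash"]
        return True


if __name__ == "__main__":
    log = AccessLog()
    log.append(1000, "dr_a", "granted", "token-1")
    log.append(1010, "dr_b", "consumed", "token-1")
    print(log.is_intact())
```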

FHIRChain Architectural Decisions that Address Key ONC Technical Requirements
Below we explain why specific architectural decisions were made to address each ONC requirement presented in Section 4.

5.2.1.2. Problem. By design, public blockchains are globally accessible to anyone with Internet access and allow users to hold any number of blockchain accounts to minimize the identifiability of account holders. This ONC requirement, however, specifies that all U.S. healthcare participants should be identifiable, implying the need for an entirely separate, traceable user base from blockchains' native identities. A key problem is thus how to properly define identities for healthcare users participating in clinical data sharing, while protecting sensitive personal information on the blockchain.

address generation mechanism, FHIRChain employs public key cryptography Menezes et al. [42] to create and manage health identities. In public key cryptography, a pair of mathematically related public and private keys is used to create digital signatures and encrypt data. Since it is computationally infeasible to obtain the private key given its paired public key, these public keys can be shared freely, thereby allowing users to encrypt content and verify digital signatures. In contrast, private keys are kept secret to ensure only their owners can decrypt content and create digital signatures.

FHIRChain generates a cryptographic public/private key pair (also used for encryption, as described in Section 5.2.3) for each participating provider, e.g., in-house providers and remote physicians in telemedicine clinics. The public keys represent users' digital health identities. These identities are recorded in the blockchain for both identity- and tamper-proofing, thereby ensuring that users holding the corresponding private keys can be authenticated to use FHIRChain's data sharing service.
FHIRChain's design applies a smart contract to maintain health users' identifiability without exposing personal information on the blockchain. It also replaces the need for a traditional username/password authentication scheme with the use of a public/private cryptographic key pair for authentication. In a general clinical setting, these digital health identities (i.e., their private keys) would be hard to manage for patients. FHIRChain, however, only creates these identities for clinicians to facilitate data sharing, which enables more effective collaborative decision making for patients.
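The idea of authenticating with a key pair instead of a username and password can be sketched as a challenge-response check. The example below is a toy under loud assumptions. It uses textbook RSA with tiny primes (61 and 53) and no padding, which is insecure and serves only to keep the sketch self-contained. The `registry` set stands in for identities recorded on the blockchain, and none of the names reflect FHIRChain's actual implementation.

```python
import hashlib

# Toy key pair: n = 61 * 53 = 3233, e = 17, d = 2753 (since 17 * 2753 = 1 mod 3120).
# INSECURE by construction; a real system would use a vetted cryptography library.
N, E, D = 3233, 17, 2753
PUBLIC_KEY = (N, E)   # shared freely; acts as the digital health identity
PRIVATE_KEY = (N, D)  # kept secret by the provider


def _digest(message, n):
    """Reduce a SHA-256 digest into the toy modulus range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, private_key):
    n, d = private_key
    return pow(_digest(message, n), d, n)


def verify(message, signature, public_key):
    n, e = public_key
    return pow(signature, e, n) == _digest(message, n)


# Hypothetical stand-in for the identities recorded in a smart contract.
registry = {PUBLIC_KEY}


def authenticate(public_key, challenge, signature):
    """Accept only registered identities that can sign a fresh challenge."""
    return public_key in registry and verify(challenge, signature, public_key)


if __name__ == "__main__":
    challenge = b"nonce-123"  # the verifier would pick a fresh random value
    signature = sign(challenge, PRIVATE_KEY)
    print(authenticate(PUBLIC_KEY, challenge, signature))
```

Only the public key is ever registered or transmitted, so no personal information has to appear on-chain for the check to work.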

Context.
A key capability offered by blockchains is their support for "trustless" transactions between parties who lack established trust relationships. Bitcoin is the most common example of this "trustless" exchange via its native cryptocurrency. Blockchains are peer-to-peer by nature and thus contribute to the ubiquity of digital assets being transacted.

Problem.
Health data represented via digital assets are more complex and harder to share en masse. There are also privacy and security concerns associated with their storage in an "open" peer-to-peer system (i.e., public blockchains), such as the possibility that the encryption algorithms applied to protect the data become breakable in the future Zhang et al. [26]. A key problem is thus how to design a blockchain-based health IT system that balances the need for ubiquitous storage and exchange against concerns regarding privacy of the data and scalability of the system.

Design Choice → Keeping Sensitive Data off-Chain and Exchanging Reference Pointers on-Chain.
Rather than storing encrypted health data in the blockchain, a more scalable and secure alternative is to store and exchange encrypted metadata referencing protected data (i.e., a reference pointer to a data set), which can be combined with an expiration configuration for short-term data sharing. Exchanging encrypted reference pointers allows providers to maintain their data ownership and choose to share data at will. This technique also prevents an attacker who intercepts the encrypted pointers from obtaining unauthorized data access.
FHIRChain attaches a secure connector to each database, as shown in Fig. 2. Each connector generates appropriate reference pointers that grant access to the data. These reference pointers are digital health assets that can be transacted ubiquitously with reduced risks of exposing the data.
An added benefit of exchanging metadata en masse is better scalability than exchanging the original data source. As discussed in Section 4.2, each transaction or operation on the blockchain (e.g., querying a smart contract state variable value or updating it) is associated with a small fee paid to the miner who verifies it and includes it in the blockchain. Transacting these lightweight reference pointers is more efficient in terms of time and cost in production because small changes to data generally require no modifications to reference pointers.
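As a minimal illustration of the reference-pointer idea, the following Python sketch models a pointer with an expiration setting. The `ReferencePointer` class, its field names, and the example URL are hypothetical and are not taken from the FHIRChain implementation; encryption of the pointer is omitted here.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ReferencePointer:
    """Hypothetical metadata record exchanged on-chain in place of the data."""
    resource_url: str   # assumed location of the off-chain data
    expires_at: float   # Unix timestamp after which the pointer is rejected

    def is_valid(self, now: float) -> bool:
        # Short-term sharing: the pointer is usable only before it expires.
        return now < self.expires_at

    def on_chain_size(self) -> int:
        # Size of the serialized pointer; independent of the referenced data.
        return len(json.dumps(asdict(self)).encode("utf-8"))


pointer = ReferencePointer("https://fhir.example.org/Patient/123", expires_at=1_000.0)
print(pointer.is_valid(now=999.0), pointer.is_valid(now=1_000.0))
print(pointer.on_chain_size(), "bytes on-chain, however large the record is")
```

Because only the serialized pointer is transacted, edits to the underlying record leave the on-chain entry untouched unless the location of the data itself changes.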

5.2.3.1.
Context. Data references can be stored on the blockchain for ubiquitous access via a smart contract. Access rights, however, must be granted only to authorized providers for viewing the data. As discussed in Section 4.3, OAuth is a popular platform for communicating permissions in web-based apps that are not based on blockchain.

5.2.3.2.
Problem. Smart contracts cannot directly use external services like OAuth, since such services do not produce deterministic outcomes that can be verified by blockchain miners. A key problem is thus how to design a mechanism that balances the need for permission authorization of clinical data against the blockchain requirement for deterministic outcomes.

Design Choice → Token-Based Permission Model.
To overcome the limitation with public blockchains, FHIRChain protects the shared content via a secure cryptographic mechanism called "sign then encrypt" Krawczyk [52]. This design employs the users' digital health identities to encrypt content so that only users holding the correct digital identity private keys can decrypt the content. FHIRChain also generates a new pair of signing keys for each participant and registers the public portion of signing keys alongside users' digital identities.
To concretely demonstrate this workflow, Fig. 3 provides an example of using FHIRChain to create and retrieve an access token.
Suppose provider Alice would like to initiate sharing of her patient's data, denoted as D_Alice (with a reference pointer, denoted as RP_Alice), with another provider Bob. As a first step, FHIRChain creates a digital signature on the shared content RP_Alice with Alice's private signing key SKS_Alice for tamper-proofing. With Bob's public encryption key, PK_Bob, FHIRChain encrypts the signed RPS_Alice to obtain an encrypted token EncRPS_Alice, and then stores EncRPS_Alice in a smart contract for ubiquitous access.
When Bob wants to obtain the content Alice sent, he must use his corresponding private encryption key SK_Bob to decipher the real content of EncRPS_Alice. Bob also verifies that this content was indeed provided by Alice with her public signing key PKS_Alice. This authentication process is automated by the DApp server component interfacing the smart contract, as discussed in Section 5.2.5.
Digital signing ensures that a resource is indeed shared by the sender and is not tampered with. Likewise, encryption protects the information against unauthorized access and spoofing. The data requestor's access to a resource can be approved or revoked at any time via a state update in the smart contract by the data holder where all permissions are logged.
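The sign-then-encrypt exchange between Alice and Bob can be sketched in a few lines of Python. This is a toy under loudly stated assumptions: it uses unpadded "textbook" RSA built from fixed Mersenne primes so that it runs with the standard library alone, which is not secure and is not the scheme used by the FHIRChain DApp. All function names, the JSON payload layout, and the example URL are hypothetical.

```python
import hashlib
import json

E = 65537  # public exponent shared by all toy keys


def make_keypair(p: int, q: int):
    """Return ((n, e), (n, d)) for textbook RSA built from two known primes."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, E), (n, d)


def _digest(data: bytes, n: int) -> int:
    # Hash the message and reduce it into the range of the modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(data: bytes, private) -> int:
    n, d = private
    return pow(_digest(data, n), d, n)


def verify(data: bytes, signature: int, public) -> bool:
    n, e = public
    return pow(signature, e, n) == _digest(data, n)


def sign_then_encrypt(rp: str, sender_sign_private, recipient_enc_public) -> int:
    # Step 1: sign the reference pointer; step 2: encrypt pointer plus signature.
    payload = json.dumps({"rp": rp, "sig": sign(rp.encode(), sender_sign_private)})
    m = int.from_bytes(payload.encode(), "big")
    n, e = recipient_enc_public
    if m >= n:
        raise ValueError("payload too long for this toy key")
    return pow(m, e, n)


def decrypt_then_verify(token: int, recipient_enc_private, sender_sign_public) -> str:
    # Step 1: decrypt with the recipient's key; step 2: check the sender's signature.
    n, d = recipient_enc_private
    m = pow(token, d, n)
    payload = json.loads(m.to_bytes((m.bit_length() + 7) // 8, "big"))
    if not verify(payload["rp"].encode(), payload["sig"], sender_sign_public):
        raise ValueError("signature check failed")
    return payload["rp"]


# Fixed Mersenne primes keep the toy deterministic; real keys are random.
alice_sign_pub, alice_sign_priv = make_keypair(2**61 - 1, 2**89 - 1)
bob_enc_pub, bob_enc_priv = make_keypair(2**521 - 1, 2**607 - 1)

token = sign_then_encrypt("https://fhir.example.org/Patient/123", alice_sign_priv, bob_enc_pub)
print(decrypt_then_verify(token, bob_enc_priv, alice_sign_pub))
```

In this sketch the integer `token` plays the role of the encrypted token stored in the smart contract; a production system would instead rely on a vetted cryptographic library with padded, randomized encryption and signatures.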
Role-based or attribute-based permissions can also be implemented off-chain in the same manner as in a traditional centralized system (e.g., via Active Directory). In this case, a meta-cryptographic key pair would be created for each role or type of attribute and securely stored within the system's database. The system can then be configured so that it allows only users meeting certain permission criteria to use the key for data access, while shielding users from unessential details.

Problem.
Blockchain-based health IT systems should facilitate data sharing, while adhering to some existing standard(s) for representing the clinical data. A key problem is thus how to design a blockchain-based architecture to enforce the application of existing clinical data standard(s).

Design Choice → Enforcing FHIR Standards.
FHIR, a proposed interoperability standard developed by HL7, is based on modern web services (i.e., HTTP-based RESTful protocol) and supports the use of JSON Crockford [53], which is a popular format for exchanging information on the web. JSON is more compact and readable than the XML format used by other data formatting standards, thereby enabling more efficient transmission of JSON-encoded data. It is also compatible with many software libraries and packages. As more health IT systems upgrade their data exchange protocols to comply with FHIR standards, FHIRChain enforces the use of FHIR for shared clinical data by validating whether the generated reference pointers follow the FHIR API standards Bender and Sartipi [20].

5.2.5. Addressing Requirement 5: Maintaining Modularity

5.2.5.1.
Context. Health IT system updates and/or upgrades are necessary to adopt more efficient, secure, or prevalent technology as it advances.

Problem.
If functions in a smart contract have too many dependencies on the rest of a health IT system, then each upgrade to the system must deploy a new contract, which requires restoring data from previous versions to prevent loss. A key problem is thus how to design a modular data sharing system that minimizes the need to create new versions of existing contracts when the system is upgraded. For example, when more user-friendly features are needed, a good design should separate those updates from the underlying back-end services so that a change in the user interface does not require modifications to the server or blockchain component.

Design Choice → Applying the Model-View-Controller (MVC) Pattern.
The MVC pattern Leff and Rayfield (2001) separates a system into three components: (1) the model, which manages the behavior and data of a system and responds to requests for information about its state and instructions to change state, (2) the view, which manages the display of information, and (3) the controller, which interprets user inputs into appropriate messages to pass onto the view or model.
The FHIRChain architecture applies the MVC pattern to separate concerns into individually testable modules as follows: (1) a model in the form of an immutable blockchain component is used to store necessary metadata via smart contracts; (2) a view provides a front-end user interface that accepts user inputs and presents data; (3) a controller is a server component with control logic that facilitates interactions with data between the user interface and blockchain component, such as queries, updates, and encrypting and decrypting contents; and (4) a controller-invoked data connector service is used to validate the implementation of FHIR standards and create reference pointers for the data sources upon requests from the server.
The workflow for updating data access is shown in Fig. 4 by the following steps 1-4:
1. A user first authenticates through the user interface (UI), and when successfully authenticated, a data access permission request can be input to the system;
2. The UI forwards the user's request to the server;
3. The server logs permissioned or revoked access in the blockchain component (BC); and
4. The server updates the UI with a proper response to notify the user.
Likewise, the workflow for accessing a data source is outlined in the following steps a-e:
a. The user first authenticates via the UI, and when successfully authenticated, a data access request can be input to the system;
b. The UI forwards the user's request to the server;
c. The server queries the BC for the current user's access token(s);
d. When permission is valid, the server decodes the access token(s) with the correct keys supplied by the user and uses the decrypted reference pointer to obtain the actual data through the DB connector to the proper database;
e. When data has been retrieved from the data source via the DB connector, the server updates the UI to display the data in a readable format.
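The two workflows above can be sketched with in-memory stand-ins for the model, controller, and data connector. The class and method names below are hypothetical, the dictionary replaces the smart contract, and string reversal is only a placeholder for real encryption and decryption.

```python
class BlockchainModel:
    """Model: stands in for the smart contract that stores access tokens."""

    def __init__(self):
        self._tokens = {}  # (owner, recipient) -> encrypted token or None

    def grant(self, owner, recipient, token):
        self._tokens[(owner, recipient)] = token

    def revoke(self, owner, recipient):
        self._tokens[(owner, recipient)] = None

    def token_for(self, owner, recipient):
        return self._tokens.get((owner, recipient))


class DataConnector:
    """Resolves a decrypted reference pointer against an off-chain database."""

    def __init__(self, database):
        self._database = database

    def fetch(self, pointer):
        return self._database[pointer]


class Server:
    """Controller: mediates between the UI (view) and the other components."""

    def __init__(self, model, connector):
        self.model, self.connector = model, connector

    def update_access(self, owner, recipient, token=None):  # steps 2-4
        if token is None:
            self.model.revoke(owner, recipient)
            return "access revoked"
        self.model.grant(owner, recipient, token)
        return "access granted"

    def access_data(self, owner, recipient, decrypt):  # steps b-e
        token = self.model.token_for(owner, recipient)
        if token is None:
            return "no valid permission"
        return self.connector.fetch(decrypt(token))


def reverse(text):
    # Placeholder only: reversing a string is NOT encryption.
    return text[::-1]


server = Server(BlockchainModel(), DataConnector({"Patient/123": {"diagnosis": "example"}}))
print(server.update_access("alice", "bob", token=reverse("Patient/123")))
print(server.access_data("alice", "bob", decrypt=reverse))
print(server.update_access("alice", "bob"))
print(server.access_data("alice", "bob", decrypt=reverse))
```

Because the controller touches the model only through `grant`, `revoke`, and `token_for`, the view or the connector could be replaced without redeploying the component that holds the permission state.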
FHIRChain stores all relevant information in smart contracts, decoupling the data store from the rest of the system. This decoupling enables future upgrades to all other components without losing access to, or locking out, existing users or their permission information.

Case Study: Applying FHIRChain to Create a Prototype DApp
This section first describes the structure and functionality of a decentralized app (DApp) that customizes the FHIRChain architecture described in Section 5 to support collaborative clinical decision making via a remote tumor board case study. We then analyze the benefits and limitations of our DApp case study.

Overview of the FHIRChain DApp Case Study
The FHIRChain DApp is written in JavaScript. It consists of ∼1000 lines of core app code that interacts with a private testnet of the Ethereum blockchain and three Solidity smart contracts, each containing ∼50 lines of code. Our DApp customizes the FHIRChain architecture in a private Ethereum testnet to address the various ONC requirements described in Section 4. This DApp has an intuitive user-facing portal that facilitates the sharing and viewing of patient cancer data for a remote tumor board to collaboratively create treatment plans for cancer patients. In addition, the DApp implements a notification service Zhang et al. [27] that broadcasts events to appropriate event subscribers. The FHIRChain DApp notification service is used to alert collaborative tumor board members when new data access is available for review.
Verifying identity and authenticating participants with digital identities, as discussed in Section 5.2.1. Our DApp contains a Registry smart contract that maintains the digital health identities of providers who registered with our app. The registry maps provider email addresses (or phone numbers) from a public provider directory to both their public encryption (used as digital identity) and signing keys, which are generated automatically at user registration time. Fig. 5 demonstrates the user registration and authentication workflow.
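The state kept by such a registry can be pictured with a small in-memory Python stand-in. The class, its methods, and the placeholder key strings are hypothetical; the actual Registry is a Solidity contract whose interface is not reproduced here.

```python
class Registry:
    """In-memory stand-in for a registry contract keyed by provider contact."""

    def __init__(self, directory):
        self._directory = set(directory)  # contacts from a public provider directory
        self._keys = {}                   # contact -> (encryption_key, signing_key)

    def register(self, contact, encryption_key, signing_key):
        # Only providers listed in the directory may hold a digital identity.
        if contact not in self._directory:
            raise KeyError(f"{contact} is not in the provider directory")
        # Registering again replaces the keys, e.g., after a lost or stolen key.
        self._keys[contact] = (encryption_key, signing_key)

    def identity_of(self, contact):
        """Return the public encryption key that serves as the digital identity."""
        return self._keys[contact][0]

    def signing_key_of(self, contact):
        return self._keys[contact][1]


registry = Registry({"alice@clinic.example", "bob@clinic.example"})
registry.register("alice@clinic.example", "alice-enc-pub", "alice-sign-pub")
print(registry.identity_of("alice@clinic.example"))
```

Only public keys appear in the mapping, so no personal information beyond the directory contact is exposed by this structure.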
Storing and exchanging data securely with FHIR-based reference pointers, as discussed in Sections 5.2.2 and 5.2.4. Our DApp defines two cancer patient databases and referencing paths to patient data entries using the open-source HapiFHIR (HapiFHIR, [54]) public test server. Validation of the FHIR implementation is performed via regular expression parsing of the paths against the FHIR APIs Bender and Sartipi [20].
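A regular-expression check of this kind might look like the following Python sketch. The pattern is an assumption for illustration and is not the expression used in the DApp: it accepts only FHIR "read"-style paths of the form [base]/[type]/[id], recognizes a small hand-picked subset of resource types, and uses the FHIR rule that a logical id consists of 1 to 64 letters, digits, hyphens, or periods.

```python
import re

# Small illustrative subset of FHIR resource types; the full list is much longer.
RESOURCE_TYPES = ("Patient", "Observation", "Condition",
                  "DiagnosticReport", "MedicationRequest")

# [base]/[type]/[id]: an http(s) base URL, a known resource type, and a FHIR id.
FHIR_READ_PATTERN = re.compile(
    r"^https?://[^\s/]+(?:/[^\s/]+)*/(?:%s)/[A-Za-z0-9\-.]{1,64}$"
    % "|".join(RESOURCE_TYPES)
)


def is_fhir_read_path(path: str) -> bool:
    """Return True when the path looks like a FHIR read of a single resource."""
    return FHIR_READ_PATTERN.match(path) is not None


for candidate in (
    "https://fhir.example.org/baseR4/Patient/123",
    "https://fhir.example.org/patients?id=123",
):
    print(candidate, is_fhir_read_path(candidate))
```

A syntactic check like this confirms only that a pointer has the shape of a FHIR API path; it does not confirm that the server behind it actually conforms to the standard.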
Permissioning data access with token-based exchange, as discussed in Section 5.2.3. Our DApp also contains an Access smart contract that logs all user interactions and requests on the portal, e.g., what resource is shared or no longer shared with which provider by whom and when. These access logs are structured as a mapping between user digital health identities (public encryption keys) and authorizations to custom-named access tokens (represented as a nested object associated with a true/false boolean value indicating if an access token access is granted for a provider). If an access revocation occurs, authorization is set to false and the associated token is set to an empty value. The workflow of this process is shown in Fig. 6.
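The nested mapping and revocation behavior described above can be mirrored in a short Python sketch. The `AccessLog` class and its field names are hypothetical stand-ins for the contract state, and the token values are placeholders rather than real encrypted tokens.

```python
import time


class AccessLog:
    """In-memory stand-in for an access contract: identity -> token name -> entry."""

    def __init__(self):
        self.permissions = {}  # recipient identity -> {name: {"authorized", "token"}}
        self.events = []       # append-only audit trail of sharing events

    def _record(self, action, owner, recipient, name):
        self.events.append({"action": action, "by": owner, "to": recipient,
                            "token": name, "at": time.time()})

    def approve(self, owner, recipient, name, encrypted_token):
        entry = {"authorized": True, "token": encrypted_token}
        self.permissions.setdefault(recipient, {})[name] = entry
        self._record("approve", owner, recipient, name)

    def revoke(self, owner, recipient, name):
        # Revocation clears the token and flips the authorization flag.
        self.permissions[recipient][name] = {"authorized": False, "token": ""}
        self._record("revoke", owner, recipient, name)

    def token(self, recipient, name):
        entry = self.permissions.get(recipient, {}).get(name)
        return entry["token"] if entry and entry["authorized"] else None


log = AccessLog()
log.approve("alice-pub", "bob-pub", "tumor-board-case-1", "<encrypted token>")
print(log.token("bob-pub", "tumor-board-case-1"))
log.revoke("alice-pub", "bob-pub", "tumor-board-case-1")
print(log.token("bob-pub", "tumor-board-case-1"))
```

Keeping the event list separate from the current permission state means a revocation changes what a recipient can read while leaving the record of who shared what, with whom, and when intact.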
Maintaining modularity with the MVC pattern, as discussed in Section 5.2.5. The view component is a user-facing portal that accepts provider user input, including registration and authentication credentials (corresponding keys) and data access information (e.g., tumor board member email to query, a reference pointer to securely access data, and approval/revocation of access). Fig. 7 is a screenshot of our DApp, presenting the following features: (1) display recent sharing events related to the user, (2) display reference pointer APIs created by the logged-in user and available actions, and (3) display all references shared with the logged-in user and the option to view data.
The portal then forwards the user requests along with data input to the server component, where all the complex logic is encapsulated.
Our FHIRChain DApp server performs all functions and control logic, including verifying provider email accounts, generating cryptographic keys, creating tokens via signing and encryption, retrieving tokens via decryption and signature verification, and forwarding requests and delegating tasks between the portal and blockchain. The blockchain component is an independent model component containing two smart contracts that provide ubiquitous storage and persist event logs of data access.

Benefits of our FHIRChain DApp Case Study
Our FHIRChain DApp case study achieved the following benefits:
• Increased modularity. To increase modularity, we applied the "separation of concerns" principle Ossher and Tarr [55] to decompose our DApp into independent components. FHIRChain employs a peer-to-peer API exchange protocol that references data pointers stored in a smart contract on the blockchain. In this design, exchanged information becomes lightweight, which increases scalability since system performance remains the same regardless of the original size of the data. Likewise, data is not transmitted electronically across institutional boundaries, thereby reducing the risk of data being compromised.
• Scalable data integrity. To ensure scalable data integrity, our design maintains a hash of the original data to exchange in addition to the reference pointer of the data. Suppose that the original data being exchanged is of size N and that the size of its reference pointer is ϵ. The total amount of data stored on-chain in terms of space complexity is then O(hash(N) + ϵ). Since the hashed output of a variable-length input has a fixed length, it consumes a constant amount of space. The reference pointer is likewise far smaller than the actual data. This design therefore enhances scalability by using constant-sized representations of the data, rather than using the actual data.
• Fine-grained access control. To enable fine-grained access control, permissions to access a data source can be given or revoked at will by providers across various institutions regardless of their trust relationships. By implementing the FHIR standards, more granular access can be granted to selected pieces of data rather than an entire document, which also increases data readability. Moreover, all events related to data sharing and data access are logged in a transparent history for auditability.
• Enhanced trust. The DApp applies public key cryptography, which enhances trust for participants in the following ways:

Identifiability and Authentication
Given today's computing power, it is infeasible to impersonate a user without knowing their private key, and the only way a user can be authenticated to use our service is to provide the correct private key paired with their public key registered on the blockchain. On the other hand, it is trivial to create a new public/private key pair if a user's private key is lost or stolen. This "digital identity" approach has been successfully adopted in Estonia's government and healthcare infrastructure Alvarez et al. [56].

Permission Authorization
With public key encryption securing their data reference pointers, users can trust that none other than the intended data recipient can view what they have shared. FHIRChain never shares the reference pointer with any user. Instead, RP is used to display the data content when it is decrypted with an authorized user's private key. In addition, users can approve or revoke data access at any time, and the request takes effect immediately.
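The constant on-chain footprint claimed under scalable data integrity, O(hash(N) + ϵ), can be checked with a short Python sketch. SHA-256 is assumed here purely for illustration, since the hash function is not named above; the function names and example URL are likewise hypothetical.

```python
import hashlib


def on_chain_footprint(data: bytes, pointer: str) -> int:
    """Bytes stored on-chain: a fixed-size digest plus the reference pointer."""
    digest = hashlib.sha256(data).digest()  # always 32 bytes for SHA-256
    return len(digest) + len(pointer.encode("utf-8"))


def matches(data: bytes, stored_digest: bytes) -> bool:
    """Integrity check a recipient could run on data fetched off-chain."""
    return hashlib.sha256(data).digest() == stored_digest


pointer = "https://fhir.example.org/Patient/123"
small = on_chain_footprint(b"x" * 1_000, pointer)
large = on_chain_footprint(b"x" * 5_000_000, pointer)
print(small, large)  # both equal 32 + len(pointer), whatever N is
```

The recipient can recompute the digest over the data retrieved through the pointer and compare it with the stored value, so integrity is verified without the data itself ever being placed on-chain.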

Limitations of our FHIRChain DApp Case Study
Since our FHIRChain DApp was designed based on several assumptions, it incurs the following limitations:
• Does not address semantic interoperability. FHIRChain cannot address data exchange challenges related to semantic interoperability that are not yet fully captured by the FHIR standards. To provide semantics to clinical data, therefore, manual inspection and mapping of predefined ontologies from medical and health data experts are required, which remain the focus of our future research in this space.
• May not be compatible with legacy systems not supporting FHIR. Many legacy systems may use other messaging standards, such as the more prevalent HL7 v2 standards Dolin et al. [57], and do not support FHIR protocols. The goal of this paper, however, is to present the underlying representations and theories of our blockchain-based system. Although we advocate FHIR in the paper because it has been used quite frequently and it supports fine-grained data exchange, the
To overcome these limitations in future work, we will deploy our DApp in a permissioned consortium blockchain platform with trusted parties to ensure consensus through a variation of proof-of-work that incentivizes mining with cryptocurrency rewards. For instance, Ekblaw et al. [35] propose to use aggregated data as mining rewards in their system, while MultiChain Greenspan [60] enforces a round-robin mining protocol in its blockchain. With the ability to replace monetary incentives to maintain consensus on the blockchain, the cost to use this blockchain-based service will be lower in the long run, although the initial deployment may still be expensive.
Although permissioned systems may be prone to collusion due to the 51% attack problem Buterin et al. [29], the permissioned system used for healthcare would be maintained and managed by relatively large-scale entities/stakeholders within the healthcare industry. Unless a majority of them (major hospitals, insurance companies, etc.) collude, therefore, the chance of experiencing this type of attack is quite low. Moreover, legal action would most likely follow immediately upon such an attack.

Concluding Remarks
This paper described the FHIRChain prototype we designed to provide patients with more collaborative clinical decision support using blockchain technology and the FHIR data standards. Complemented by the adoption of public key cryptography, our FHIRChain design addressed five key requirements provided by the ONC interoperability roadmap, including user identifiability and authentication, secure data exchange, permissioned data access, consistent data formats, and system modularity.
The following are the key lessons we learned from designing and implementing our DApp based on FHIRChain:
• FHIRChain can provide trustless, decentralized storage for necessary meta information and audit logs. FHIRChain alleviates the proprietary vendor lock-in found in conventional health IT systems by using its blockchain component as decentralized storage for the reference information that serves as secure access points into those systems' databases. It enables the sharing of clinical data without established trust, providing clinicians with secure and scalable collaborative care decision support. In addition, each public key generated for a user is stored in the blockchain via a smart contract used to associate healthcare participants with their digital identities. Similarly, permission authorizations established between those participants are recorded in a smart contract as well, creating a traceable permission database with an audit log of data exchange history (i.e., meta information involved during the data exchange and not the actual data). Storing these data on the blockchain ensures that our app is not subject to a single point of failure or corruption of records, so that it is always accessible to healthcare participants.
• FHIRChain facilitates data exchange without the need to upload/download data and thus maintains data ownership. The FHIR standards provide resource APIs to reference specific pieces of structured data while maintaining original data ownership. By adopting FHIR and combining it with blockchain technologies, FHIRChain creates lightweight reference pointers to siloed databases and exchanges these pointers via the blockchain component instead of the actual data. For telemedicine clinics or clinics in rural areas in particular, this approach can overcome network limitations by enabling scalable data sharing without requiring data to be uploaded to some other centralized repository from which other parties would then download it. In addition, this approach reduces the risk of compromised data and ensures that original data ownership is respected. The reference pointers are encrypted with the intended recipient's public key (i.e., digital identity) to permission data access. When a recipient is successfully authenticated (i.e., the reference pointers are correctly decrypted), the data is downloaded directly from the source and presented to the user in a properly formatted form.
• Public key cryptography can be effective for managing digital health identity in data sharing. FHIRChain creates public keys as digital health identities associated with each collaborating care entity (provider or organization administrator). The benefits of this strategy include: (1) easy authentication, since a clinician only needs to provide the private key associated with their identity, (2) integrity, since signing the exchanged reference pointers lets FHIRChain easily verify that they were provided by the signing provider and have not been modified, and (3) a remedy for lost or stolen keys, since a new key can be created easily to replace the old key and be associated with the same user. There is a drawback, however, to using digital identities for patients in a general clinical setting. Managing these identities (i.e., private keys) is hard because private keys are harder to remember than conventional passwords, and patients would require technical training to manage their own keys. Nevertheless, there are approaches for managing private keys for larger populations, such as using key wallets Even et al. [61]; Nakamoto [25] or embedding private keys in physical medical ID cards Anthes [62].
In summary, our FHIRChain-based DApp demonstrates the potential of blockchain to foster effective healthcare data sharing while maintaining the security of original data sources. FHIRChain can be further extended to address other healthcare interoperability issues, such as coordinating other stakeholders (e.g., insurance companies) across the industry and providing patients with easier (and secure) access to their own medical records.
In our future work, we plan to refine the simulations for more rigorously evaluating the performance of our FHIRChain prototype. We will do so by deploying and comparing a number of different blockchain configurations in a testbed environment, such as using the blockchain template provided by Amazon Web Services (AWS, [63]). Moreover, we will research techniques for identity management targeting the patient population.