Balancing the demand for open data with the need to protect sensitive data

With the rise of open data narratives, access to sensitive, legally controlled data via Trusted Research Environments (TREs) is sometimes criticized as contrary to the FAIR (Findable, Accessible, Interoperable, Reusable) principles of openness. But TREs that follow international best practice can in fact increase, not diminish, the FAIRness of data. Adopting the internationally recognized Five Safes framework could help to better protect sensitive data while still meeting demands for a more open and accessible service that follows the FAIR principles.

DOI: 10.34879/gesisblog.2024.74


The need for a new framework for secure data access

In recent years we’ve seen a strong drive towards open data and open science, underpinned by the FAIR principles. At the same time, we’ve seen increasing demand for sensitive, legally protected data, especially for more detailed data. These data carry a higher risk of disclosure, so they are subject to stricter access controls and are accessible only via TREs.

Strict access controls are often seen as contrary to open science and the FAIR principles, but this is not necessarily accurate. The international TRE community works to make these sensitive, detailed data accessible to researchers by building trustworthy governance infrastructures. A key component of these efforts is the Five Safes Framework, which was developed in the UK in 2003 as a way of thinking about how to provide safe and ethical access to potentially disclosive, sensitive data.1

The Framework consists of five principles which together help to assess potential risks and inform appropriate controls. The Five Safes are not meant to be used in a rigidly prescriptive way. Rather, each principle is intended to be adjusted to the level of risk and the access requirements of the specific TRE.

Areas of potential improvement for TREs

By applying the Five Safes, we’ve identified several areas of potential improvement for any TRE. We illustrate them here using the Secure Data Center at GESIS. The Five Safes have not yet been adopted by the Secure Data Center, but many of its current procedures fit within the framework, and work is underway to adopt the Five Safes more fully and formally. In the current security model, many of the data protection controls focus on the data and the setting. Whilst this is secure, it leads to a more restrictive system for the research community.

Safe Projects

Currently, researchers must complete a Data Use Agreement (DUA), which outlines what data they want and what they want to use it for. Researchers don’t always specify a clearly defined project, and often the proposed analysis is quite exploratory or focused on a single paper. Current work includes redefining what a project is and writing clearer guidance for researchers on how to complete the DUAs.

Safe People

In countries using the Five Safes, there is often mandatory training for researchers applying to access data. This training focuses on data protection and producing safe outputs. Apart from a short in-person discussion with new researchers on their first visit, there is no formal ‘safe researcher’ training in Germany. The UK example shows that the benefits of such training are clear, so a new training program is being developed by the Secure Data Center in partnership with the German Human Genome Phenome Archive and Ludwig Maximilian University.

Safe Setting

At present, the Secure Data Center at GESIS makes data available via its Safe Room in Cologne and via remote access points in partner Safe Rooms in Mannheim and the UK. These physical safe rooms have a range of physical controls in line with long-established good practice across the TRE community. A new IT infrastructure launched in 2022 ensures that GESIS has up-to-date security features. Additional measures, including two-factor authentication, are being developed.

Safe Outputs

All research outputs should be checked twice, with expert staff carrying out two independent statistical disclosure control (SDC) checks. Recent work includes implementing new sensitivity rules to further harmonize with internationally recognized best practice and testing a new semi-automated output checking system.
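
To make the idea of a sensitivity rule more concrete, here is a minimal sketch of one common kind of automated pre-check: flagging table cells whose counts fall below a minimum threshold. This is an illustration only, not the system being tested at GESIS. The function name, the threshold of 10, and the decision to leave empty cells unflagged are all assumptions.

```python
from typing import Dict, List

# Hypothetical threshold; the actual sensitivity rules are not stated in this post.
MIN_CELL_COUNT = 10


def flag_small_cells(table: Dict[str, int], min_count: int = MIN_CELL_COUNT) -> List[str]:
    """Return the labels of cells whose count is above zero but below min_count.

    Assumption: empty cells are not flagged here. Real rules may treat them differently.
    """
    return [label for label, count in table.items() if 0 < count < min_count]


if __name__ == "__main__":
    # Made-up frequency table for illustration.
    counts = {"18-29": 124, "30-49": 57, "50-64": 8, "65+": 3}
    print(flag_small_cells(counts))  # cells a human checker should review
```

In a semi-automated workflow, a check like this would only pre-sort outputs. Flagged tables would still go to staff for the independent checks described above.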

Advancing the FAIR Principles

Using the Five Safes to review and update the Secure Data Center will enable it to offer a more efficient and researcher-focused service. But there are further benefits. The Five Safes will help to ensure that GESIS offers a service that both supports and advances the FAIR principles.

Accessible

While controlled access may not appear “open”, in reality these data would not be shared at all without TREs. The Five Safes is a well-established system and, crucially, ensures that access does not require altering the data by removing disclosive but valuable information. Adopting the Five Safes puts a broader ecology of protections in place, meaning that disclosive data can be shared safely.

TREs combined with the Five Safes are expanding data access in several ways. Firstly, the smoother, more efficient processes resulting from the Five Safes review should make accessing these data a more viable option for researchers by reducing waiting times for output checks and making the application process easier to navigate.

Secondly, by adopting the Five Safes, GESIS adheres to internationally recognized standards. This is key to opening up access to international data, as common standards build trust and encourage more data providers to share their data. For example, researchers can now access sensitive data from the UK and France in the GESIS Safe Rooms at Cologne and Mannheim.

Reusable

The Five Safes also works to enhance reusability. Sensitive data are often only factually anonymized (i.e. reidentification is potentially possible) and are therefore considered personal data under data protection laws. Without additional safeguards, these data would not be available for reuse. Adopting internationally recognized frameworks like the Five Safes helps to overcome these legal and ethical barriers, so more data can be reused.

Conclusion

The Five Safes Framework offers many benefits to Trusted Research Environments. Active engagement with the international TRE community is leading to access to a widening array of sensitive data that researchers in Germany would otherwise not be able to use, and adopting internationally recognized data governance standards will serve to advance this work even further.

References

  1. Ritchie, Felix (2008). “Secure access to confidential microdata: four years of the Virtual Microdata Laboratory”. Economic & Labour Market Review. 2 (5): 29–34. doi:10.1057/elmr.2008.73.
