Evaluating FAIR Digital Object and Linked Data as distributed object systems

FAIR Digital Object (FDO) is an emerging concept that is highlighted by the European Open Science Cloud (EOSC) as a potential candidate for building an ecosystem of machine-actionable research outputs. In this work we systematically evaluate FDO and its implementations as a global distributed object system, using five different conceptual frameworks that cover interoperability, middleware, the FAIR principles, EOSC requirements and the FDO guidelines themselves. We compare the FDO approach with established Linked Data practices and the existing Web architecture, and provide a brief history of the Semantic Web while discussing why these technologies may have been difficult to adopt for FDO purposes. We conclude with recommendations for both the Linked Data and FDO communities to further their adoption and alignment.


INTRODUCTION
The FAIR principles (Mark D. Wilkinson et al. 2016) encourage sharing of scientific data with machine-readable metadata and the use of interoperable formats, and are being adopted by a wide range of research infrastructures. They have been widely recognised by the research community and policy makers as a goal to strive for. In particular, the European Open Science Cloud (EOSC) has promoted the adoption of FAIR data sharing across electronic research infrastructures (Mons et al. 2017). The EOSC Interoperability Framework (Corcho, Eriksson, et al. 2021) puts particular emphasis on how interoperability can be achieved technically, semantically, organisationally and legally, laying out a vision of how data, publications, software and services can work together to form an ecosystem of rich digital objects.
Specifically, the EOSC Interoperability Framework highlights the emerging FAIR Digital Object (FDO) concept (Schultes and Wittenburg 2019) as a possible foundation for building a semantically interoperable ecosystem to fully realise the FAIR principles beyond individual repositories and infrastructures. The FDO approach has great potential, as it proposes strong requirements for identifiers, types and access, and formalises interactive operations on objects.
In other discourse, Linked Data (Bizer et al. 2009) has been seen as an established set of principles based on Semantic Web technologies that can achieve the vision of the FAIR principles (Bonino Da Silva Santos et al. 2016; Hasnain and Rebholz-Schuhmann 2018). Yet regular researchers and developers of emerging platforms for computation and data management are reluctant to fully adopt such a FAIR Linked Data approach (Verborgh and Vander Sande 2020), opting instead for custom in-house models and JSON-derived formats from RESTful Web services (Meroño-Peñuela et al. 2021a; Neumann et al. 2021). While such focus on simplicity enables rapid development and highly specialised services, it raises wider concerns about interoperability (Turcoane 2014; S. R. Wilkinson et al. 2022).
One challenge that may, perhaps counter-intuitively, steer developers towards a not-invented-here mentality (Stefi 2015;Stefi and Hess 2015) when exposing their data on the Web is the heterogeneity and apparent complexity of Semantic Web approaches themselves (Meroño-Peñuela et al. 2021b).
These approaches thus form two of the major avenues for allowing developers and the wider research community to achieve the goal of FAIR data. Given their importance, in this article we aim to systematically evaluate FDO and its implementations, and to compare them with established Linked Data practices. Current FDO realisations follow three broad approaches:
• Building on the Digital Object concept, using the simplified DOIP v2.0 (2018) specification, which details how to exchange JSON objects through a text-based protocol 1 (usually TCP/IP over TLS). The main DOIP operations are retrieving, creating and updating digital objects. These are mostly realised using the reference implementation Cordra. FDO types are registered in the local Cordra instance, where they are specified using JSON Schema (Wright et al. 2022), and PIDs are assigned using the Handle system. Several type registries have been established.
• Following the traditional Linked Data approach, but using the DOIP protocol, e.g. using JSON-LD and schema.org within DOIP (NIST for material science).
• Approaching the FDO principles from existing Linked Data practices on the Web (e.g. WorkflowHub's use of RO-Crate and schema.org).
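As an informal illustration (not taken from the cited specifications), a DOIP 2.0 retrieval request is a small JSON object naming the target digital object and the operation to invoke; the Handle below is hypothetical, while the attribute and operation names follow the DOIP v2.0 specification:

```json
{
  "requestId": "1",
  "targetId": "21.T11967/abc123",
  "operationId": "0.DOIP/Op.Retrieve"
}
```

The server would respond with a status code such as `0.DOIP/Status.001` and the digital object's attributes and element descriptions as JSON.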
From this it becomes apparent that there is a potentially large overlap between the goals and approaches of FAIR Digital Objects and Linked Data, which we'll cover on page 4.

Next steps for FDO
The FAIR Digital Object Forum (FAIR Digital Objects Forum - 2022) working groups have prepared detailed requirement documents (FDO Specification Documents - November 2022 2022) setting out the path for realising FDOs, named FDO Recommendations. As of 2023-02-02, most of these documents are open for public review, while some are still in draft stages for internal review. As these documents clarify the future aims and focus of FAIR Digital Objects (Lannom, Schwardmann, Christophe Blanchi, et al. 2022), we provide brief summaries of them below:

FAIR Digital Object Overview and Specifications (FDO-Overview-PEN-2.0) is a comprehensive overview of the FAIR Digital Object specifications listed below. It serves as a primer that introduces FDO concepts and the remaining documents. It is accompanied by an FDO Glossary (Broeder and Wittenburg 2022).
The FDO Forum Document Standards (WD-DocProcessStd-1.1) documents the recommendation process within the forum, starting at Working Draft (WD) status within the closed working group and later within the open forum, then Proposed Recommendation (PR) published for public review, finalised as FDO Forum Recommendation (REC) following any revisions.In addition, the forum may choose to endorse existing third-party notes and specifications.
The FDO Requirement Specifications (PR-RequirementSpec-3.0) is an update of (Bonino et al. 2019) as the foundational definition of FDO. It sets the criteria for classifying a digital entity as a FAIR Digital Object, allowing for multiple implementations. The requirements shown in Table 3 on page 11 are largely equivalent, but are in this specification clarified with references to other FDO documents.
Machine actionability (PR-MachineActionDef-2.2) sets out to define what is meant by machine actionability for FDOs. Machine readable elements are bit-sequences defined by a structural specification; machine interpretable elements can be identified and related with semantic artefacts; while machine actionable elements have a type associated with operations in a symbolic grammar. The document largely describes requirements for resolving an FDO to metadata, and how types should be related to possible operations.
Configuration Types (PR-ConfigurationTypes-2.1) classifies different granularities for organising FDOs in terms of PIDs, PID Records, metadata and bit sequences, e.g. as a single FDO or several daisy-chained FDOs. Different patterns used by current DOIP deployments are considered, as well as FAIR Signposting (Van de Sompel et al. 2022).

PID Profiles & Attributes (PR-PIDProfileAttributes-2.1) specifies that PIDs must be formally associated with a PID Profile, a separate FDO that defines the attributes required and recommended by FDOs following said profile. This forms the kernel attributes, building on recommendations from RDA's PID Information Types working group (Weigel et al. 2018). This document makes a clear distinction between a minimal set of attributes needed for PID resolution and FDO navigation, which need to be part of the PID Record (Islam 2023), and a richer set of more specific attributes forming part of the metadata for an FDO, possibly represented as a separate FDO.
Kernel Attributes & Metadata (PR-KernelAttributes-2.0) elaborates on the categories of FDO Mandatory, FDO Optional and Community Attributes, recommending kernel attributes like dateCreated, ScientificDomain, PersistencePolicy, digitalObjectMutability, etc. This document expands on the RDA Recommendation on PID Kernel Information (Weigel et al. 2018). It is worth noting that both documents are relatively abstract and do not establish PIDs or namespaces for the kernel attributes.
Granularity, Versioning, Mutability (PR-Granularity-2.2) considers how granularity decisions for forming FDOs must be agreed by different communities depending on their pragmatic usage requirements. The effect on versioning, mutability and changes to PIDs is considered, based on use cases and existing PID practices.
DOIP Endorsement Request (PED-DOIPEndorsement-1.1) is an endorsement of the DOIP v2.0 (DOIPv2.0 2018) specification as a potential FDO implementation, as it has been applied by several institutions (Wittenburg, Anders, et al. 2022). The document proposes that DOIP shall be assessed for completeness against FDO - in this initial draft this is justified as "we can state that DOIP is compliant with the FDO specification documents in process" (the documents listed above).
Upload of FDO (PEN-FDO-Upload) illustrates the operations for uploading an FDO to a repository, and what checks the repository should perform (for instance conformance with the PID Profile, and whether PIDs resolve). ResourceSync (ResourceSync Framework Specification (ANSI/NISO Z39.99-2017) 2017) is suggested as one type of service to list FDOs. This document highlights potential practices by repositories and their clients, without adding any particular requirements.
Typing FAIR Digital Objects (PR-TypingFDOs-2.0) defines what type means for FDOs, primarily to enable machine actionability and to define an FDO's purpose. This document lays out requirements for how FDO Types should themselves be specified as FDOs, and how an FDO Type Framework allows organising and locating types. The operations applicable to an FDO are not predefined for a type; however, operations will naturally require certain FDO types to work. How to define such FDO operations is not specified.
Implementation of Attributes, Types, Profiles and Registries (WD-ImplAttributesTypesProfiles) details how to establish FDO registries for types and FDO profiles, and their association with PID systems. This document suggests policies and governance structures, together with guidelines for implementations, but without mandating any explicit technology choices. Differences in use of attributes are exemplified using FDO PIDs for scientific instruments, and the proto-FDO approach of DARIAH-DE (Schwardmann and Kálmán 2022).
It is worth pointing out that, except for the DOIP endorsement, all of these documents are conceptual, in the sense that they permit any technical implementation of FDO, if used according to the recommendations. See the bibliography on page 39 for the citation for each document above.

From the Semantic Web to Linked Data
In order to describe Linked Data as it is used today, we'll start with an (opinionated) description of the evolution of its foundation, the Semantic Web.

A brief history of the Semantic Web
The Semantic Web was developed as a vision by Tim Berners-Lee (T. Berners-Lee and Fischetti 1999), at a time when the Web had already become widely established for information exchange, being a global set of hypermedia documents which are cross-related using universal links in the form of URLs. The foundations of the Web (e.g. URLs, HTTP, SSL/TLS, HTML, CSS, ECMAScript/JavaScript, media types) were standardised by W3C, Ecma, IETF and later WHATWG. The goal of the Semantic Web was to further develop the machine-readable aspects of the Web, in particular adding meaning (or semantics) not just to the link relations, but also to the resources that the URLs identified, thus enabling machines to meaningfully navigate across such resources, e.g. to answer a particular query.
Through W3C, the Semantic Web was realised with the Resource Description Framework (RDF) (RDF 1.1 Primer 2022), which uses triples of subject-predicate-object statements, with its initial serialisation format (Resource Description Framework (RDF) Model and Syntax Specification 2022) being RDF/XML (XML was at the time seen as a natural data-focused evolution from the document-centric SGML and HTML).
While triple-based knowledge representations were not new (Stanczyk 1987), the main innovation of RDF was the use of global identifiers in the form of URIs 2 as the primary identifier of the subject (what the statement is about), the predicate (a relation/attribute of the subject) and the object (what is pointed to). By using URIs not just for documents 3, the Semantic Web builds a self-described system of types and properties, where the meaning of a relation can be resolved by following its hyperlink to the definition within a vocabulary. By applying these principles to any kind of resource that can be described at a URL, this forms a global distributed Semantic Web.
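To make this concrete, a single (hypothetical) statement written in the later Turtle syntax uses a URI in each of the three positions; resolving the predicate's URI leads to the Dublin Core definition of the creator relation:

```turtle
@prefix dct: <http://purl.org/dc/terms/> .

# subject: a hypothetical article; predicate: dct:creator; object: an ORCID URI
<http://example.org/article/42>
    dct:creator <https://orcid.org/0000-0002-1825-0097> .
```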
The early days of the Semantic Web saw fairly lightweight approaches with the establishment of vocabularies such as FOAF (to describe people and their affiliations) and Dublin Core (for bibliographic data). Vocabularies themselves were formalised using RDFS or simply as human-readable HTML web pages defining each term. The main approach of this Web of Data was that a URI identified a resource (e.g. an author) with an HTML representation for human readers, along with an RDF representation for machine-readable data about the same resource. By using content negotiation in HTTP 4, the same identifier could be used for both views, avoiding exposing index.html vs index.rdf in the URLs. The concept of namespaces gave a way to assign a common prefix to a group of RDF resources sharing the same URI base from a Semantic Web-aware service, avoiding repeated long URLs.
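Sketched as simplified HTTP exchanges (exact headers and status codes vary per server; the URI is hypothetical), content negotiation lets the same identifier serve both views:

```http
GET /author/alice HTTP/1.1
Host: example.org
Accept: text/html

HTTP/1.1 200 OK
Content-Type: text/html


GET /author/alice HTTP/1.1
Host: example.org
Accept: application/rdf+xml

HTTP/1.1 200 OK
Content-Type: application/rdf+xml
```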
The mid-2000s saw large academic interest in and growth of the Semantic Web, with the development of more formal representation systems for ontologies, such as OWL (W3C OWL Working Group 2022), allowing complex class hierarchies and logic inference rules following the open world paradigm (e.g. if ex:Parent is equivalent to a subclass of foaf:Person which must ex:hasChild at least one foaf:Person, and we know :Alice a ex:Parent, then we can infer :Alice ex:hasChild [a foaf:Person] even if we don't know who that child is). More human-readable syntaxes of RDF such as Turtle (shown in this paragraph) evolved at this time, and conferences such as ISWC (Horrocks and Hendler 2002) gained traction, with a large interest in knowledge representation and logic systems based on Semantic Web technologies evolving at the same time.
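The inference example above can be written out in Turtle/OWL roughly as follows (the ex: and : prefixes denote hypothetical vocabularies and data):

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/vocab#> .
@prefix :     <http://example.org/data#> .

ex:Parent owl:equivalentClass [
    a owl:Class ;
    owl:intersectionOf ( foaf:Person
        [ a owl:Restriction ;
          owl:onProperty ex:hasChild ;
          owl:someValuesFrom foaf:Person ] )
] .

:Alice a ex:Parent .

# Under the open world assumption, an OWL reasoner can infer:
#   :Alice ex:hasChild [ a foaf:Person ] .
```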
Established Semantic Web services and standards include SPARQL (SPARQL 1.1 Overview 2022) (pattern-based triple queries), named graphs (Wood et al. 2014) (triples expanded to quads to indicate statement source or represent conflicting views), triple/quad stores (graph databases such as OpenLink Virtuoso, GraphDB, 4Store), mature RDF libraries (including Redland RDF, Apache Jena, Eclipse RDF4J, RDFLib, RDF.rb, rdflib.js), and numerous graph visualisation tools (many of which struggle with usability for more than 20 nodes).
The creation of RDF-based knowledge graphs grew particularly in fields like bioinformatics, e.g. for describing genomes and proteins (Goble and Stevens 2008; Williams et al. 2012). In theory, the use of RDF by the life sciences would enable interoperability between the many data repositories and support combined views of the many aspects of bio-entities - however, in practice most institutions ended up making their own ontologies and identifiers for what, to the untrained eye, would mean roughly the same thing. One can argue that the toll of adding the semantic logic system of rich ontologies meant that small, but fundamental, differences in opinion (e.g. should a gene identifier signify just the particular DNA sequence letters, or those letters as they appear in a particular position on a human chromosome?) led to large differences in representational granularity, and thus the need for different identifiers.
Facing these challenges, and thanks to the use of universal identifiers in the form of URIs, mappings could retrospectively be developed not just between resources, but also across vocabularies. Such mappings can themselves be expressed using lightweight and flexible RDF vocabularies such as SKOS (SKOS Simple Knowledge Organization System Primer 2022) (e.g. dct:title skos:closeMatch schema:name to indicate near equivalence of two properties). Automated ontology mappings have identified large potential overlaps (e.g. 372 definitions of Person) (Hu et al. 2011).
The move towards Open Science data sharing practices did, from the late 2000s, encourage knowledge providers to distribute collections of RDF descriptions as downloadable datasets 5, so that their clients could avoid thousands of HTTP requests for individual resources. This enabled local processing, mapping and data integration across datasets (e.g. Open PHACTS (Groth et al. 2014)), rather than relying on the providers' RDF and SPARQL endpoints (which could become overloaded when handling many concurrent, complex queries).
With these trends, an emerging problem was that adopters of the Semantic Web primarily utilised it as a set of graph technologies, with little consideration of existing Web resources. This meant that links stayed mainly within a single information system, with little URI reuse even where large term overlaps existed (Kamdar et al. 2017). Just as link rot affects regular Web pages and their citations from scholarly communication (Klein et al. 2014), a majority of the RDF resources described in the Linked Open Data (LOD) Cloud's gathering of more than a thousand datasets unfortunately do not actually link to (still) downloadable (dereferenceable) Linked Data (Polleres et al. 2020). Another challenge facing potential adopters is the plethora of choices: not just to navigate, understand and select for reuse the many possible vocabularies and ontologies (Carriero et al. 2020), but also technological choices on RDF serialisation (at least 7 formats), type system (RDFS (RDF Schema 1.1 2022), OWL (W3C OWL Working Group 2022), OBO (Tirmizi et al. 2011), SKOS (SKOS Simple Knowledge Organization System Primer 2022)), hash vs slash URIs, HTTP status codes and PID redirection strategies (Sauermann et al. 2011).

Linked Data: Rebuilding the Web of Data
The Linked Data concept (Bizer et al. 2009) was kickstarted as a set of best practices (T. Berners-Lee 2006) to bring the Web aspect back into focus. Crucial to Linked Data is the reuse of existing URIs, rather than making new identifiers. This means a loosening of the semantic restrictions previously applied, and an emphasis on building navigable data resources, rather than elaborate graph representations.
Vocabularies like schema.org evolved not long after, intended for lightweight semantic markup of existing Web pages, primarily to improve search engines' understanding of types and embedded data. In addition to several such embedded microformats (The Open Graph Protocol 2022; RDFa 1.1 Primer - Third Edition 2022; Microdata 2023), we find JSON-LD (Sporny, Longley, et al. 2020) as a Web-focused RDF serialisation that aims for improved programmatic generation and consumption, including from Web applications. JSON-LD is, as of 2023-05-18, used 6 by 45% of the top 10 million websites (Usage Statistics of JSON-LD for Web 2023).
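A typical (hypothetical) example of such markup, embedded in a Web page inside a `<script type="application/ld+json">` element, could be:

```json
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "@id": "https://example.org/dataset/1",
  "name": "Example measurement dataset",
  "creator": {
    "@type": "Person",
    "name": "Alice Example"
  }
}
```

The `@context` maps plain JSON keys like `name` to schema.org property URIs, so the same document can be consumed both as ordinary JSON and as RDF.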
Recently there has been a renewed emphasis on improving the Developer Experience (Verborgh 2018) for consumption of Linked Data. For instance, RDF Shapes - expressed in SHACL (Shapes Constraint Language (SHACL) 2022) or ShEx (Shape Expressions (ShEx) 2.1 Primer 2022) - can be used to validate RDF data (Gayo et al. 2017; Thornton et al. 2019) before consuming it programmatically, or to reshape data to fit other models. While a varied set of tools for Linked Data consumption has been identified, most of them still require developers to gain significant knowledge of the underlying Semantic Web technologies, which hampers adoption by non-LD experts (Klímek et al. 2019), who then tend to prefer non-semantic two-dimensional formats such as CSV files.
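As a sketch, a SHACL shape requiring (hypothetical) schema:Dataset descriptions to carry at least one string-valued name might look like:

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix xsd:    <http://www.w3.org/2001/XMLSchema#> .
@prefix schema: <https://schema.org/> .
@prefix :       <http://example.org/shapes#> .

:DatasetShape a sh:NodeShape ;
    sh:targetClass schema:Dataset ;
    sh:property [
        sh:path schema:name ;
        sh:minCount 1 ;
        sh:datatype xsd:string
    ] .
```

A validator reports which nodes violate the shape, letting a consuming application reject or repair non-conforming data before processing it.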
A valid concern is that the Semantic Web research community has still not fully embraced the Web, and that the "final 20%" engineering effort is frequently overlooked in favour of chasing new trends such as Big Data and AI, rather than making powerful Linked Data technologies available to the wider groups of Web developers (Verborgh and Vander Sande 2020). Bridging efforts by the Linked Data movement include "linked data by stealth" approaches such as structured data entry spreadsheets powered by ontologies (K. Wolstencroft et al. 2011), the use of Linked Data as part of REST Web APIs (Page et al. 2011), and, as shown by the big uptake among publishers, annotating the Web using schema.org (Bernstein et al. 2016), with vocabulary use patterns documented by copy-pastable JSON-LD examples, rather than by formalised ontologies or developer requirements to understand the full Semantic Web stack.

METHOD
Our main motivation for this article is to investigate how the promises of FAIR Digital Objects may differ from the learnt experiences of Linked Data and the Web.We also reflect back from FDO's motivation of machine-actionability to consider the Web as a distributed computational system.
To better understand the relationship between the FDO framework and other existing approaches, we use the following for analysis:
1. An Interoperability Framework and Distributed Platform for Fast Data Applications (Delgado 2016), which proposes quality measurements for comparing how frameworks support interoperability, particularly from a service architectural view.
2. The FAIR Digital Object guidelines (Bonino et al. 2019), validated against its current implementations for completeness.
3. A Comparison Framework for Middleware Infrastructures (Zarras 2004), which suggests dimensions like openness, performance and transparency, mainly focused on remote computational methods.
4. Cross-checks against RDA's FAIR Data Maturity Model (Bahim et al. 2020) to find how the FAIR principles are achieved in FDO, in particular considering access, sharing and openness.
5. The EOSC Interoperability Framework (Corcho, Eriksson, et al. 2021), which gives recommendations for technical, semantic, organisational and legal interoperability, particularly from a metadata perspective.
The reason for this wide-ranging comparison is to exercise the different dimensions that together form FAIR Digital Objects: Data, Metadata, Service, Access, Operations, Computation. We have left out further comparisons on type systems, persistent identifiers and social aspects, as principles and practices within these dimensions are still taking form within the FDO community (as detailed on page 3). Some of these frameworks invite a comparison on a conceptual level, while others relate better to implementations and current practices. For these we consider FAIR Digital Objects and the Web conceptually, and for implementations we contrast the main FDO realisation using the DOIPv2 protocol (DOIPv2.0 2018) against Linked Data in general practice.

Considering FDO/Web as interoperability framework for Fast Data
The Interoperability Framework for Fast Data Applications (Delgado 2016) categorises interoperability between applications along 6 strands, covering different architectural levels: from symbiotic (agreement to cooperate) and pragmatic (ability to choreograph processes), through semantic (common understanding) and syntactic (common message formats), to low-level connective (transport-level) and environmental (deployment practices).
We have chosen to investigate using this framework as it better covers the higher levels of the OSI Model (Stallings 1990) with regard to automated machine-to-machine interaction (and thus interoperability), which is a crucial aspect of the FAIR principles. In Table 1 on the next page we use the interoperability framework to compare the current FAIR Digital Object approach against the Web and its Linked Data practices.
Based on the analysis shown in Table 1, we draw the following conclusions: The Web has already shown us how one can compose workflows of heterogeneous Web Services (Katherine Wolstencroft et al. 2013). However, this is mostly done via developer or human interaction (Lamprecht et al. 2021). Similarly, FDO does not enable automatic composition, because operation semantics are not well defined. There is a question as to whether the extensive documentation and broad developer usage available for Web APIs could potentially be utilised for FDO.
A difference between Web technologies and FDO is the stringency of the requirements for both syntax and semantics. Whereas the Web allows many different syntactic formats (e.g. from HTML to XML and PDFs), FDO realised with DOIP requires JSON. On the semantic front, FDO mandates that every object have a well-defined type and structured form. This is clearly not the case on the Web.
In terms of connectivity and the deployment of applications, the Web has a plethora of software, services and protocols that are widely deployed. These have shown interoperability. The Web standards bodies (e.g. IETF and W3C) follow the OpenStand principles (The Modern Standards Paradigm - Five Key Principles 2023) to embrace openness, transparency and broad consensus. In contrast, FDO has a small number of implementations and corresponding protocols, although with a growing community, as evidenced at the first international FDO conference (Loo 2022). This is not to say that it is not worth developing further Handle+DOIP implementations in the future, but we note that the current FDO functionality can easily be implemented using Web technologies, even as DOIP-over-HTTP (DOIP API for HTTP Clients - Cordra Documentation 2023).
It is also a question whether a highly constrained protocol revolving around persistent identifiers is in fact necessary. For example, DOIs are mostly resolved on the Web (DOI Resolution Documentation 2020) using HTTP redirects with the common https://doi.org/ prefix, hiding their Handle nature as an implementation detail ("DOI Handbook - Resolution" 2017). Cordra storage backends include the file system, S3 and MongoDB (itself scalable). The unique DOIP protocol can be hard to add to existing Web application frameworks, although proxy services have been developed (e.g. the B2SHARE adapter).
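For instance, resolving a DOI over HTTPS is an ordinary redirect exchange; the DOI and target URL below are invented for illustration:

```http
GET /10.1234/example HTTP/1.1
Host: doi.org

HTTP/1.1 302 Found
Location: https://repository.example.org/records/42
```

The client never needs to speak the Handle protocol; the Handle lookup happens server-side at doi.org.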
HTTP services are widely deployed in a myriad of ways, ranging from single-instance servers, through horizontally and vertically scaled application servers, to (for static content) multi-cloud Content Delivery Networks (CDNs). Current scalable cloud technologies for Web hosting may not support HTTP features previously seen as important for the Semantic Web, e.g. content negotiation and semantic HTTP status codes.

Mapping of Metamodel concepts
The Interoperability Framework for Fast Data also provides a brief metamodel, which we use in Table 2 to map and exemplify corresponding concepts in FDO's DOIP realisation and the Web using HTTP semantics (Roy T. Fielding, Nottingham, and Reschke 2022).
From this mapping we can identify the conceptual similarities between DOIP and HTTP, often with common terminology. Notably, neither DOIP nor HTTP has strong support for transactions (explored further on page 15), and HTTP has poor direct support for processes, as the Web is primarily stateless by design.

Assessing FDO implementations
The FAIR Digital Object guidelines (Bonino et al. 2019) set out recommendations for FDO implementations. In Table 3 on the following page we evaluate the two current implementations, using DOIPv2 (DOIPv2.0 2018) and using the Linked Data Platform (Speicher et al. 2015), as proposed by (Bonino da Silva Santos 2022).
Note that the draft update to the FDO specification (PR-RequirementSpec-3.0) clarifies these definitions with equivalent identifiers 7 and relates them to further FDO requirements such as FDO Data Type Registries.
A key observation from this is that simply using DOIP does not achieve many of the FDO guidelines. Rather, the guidelines set out how a protocol like DOIP should be used to achieve FAIR Digital Object goals. The DOIP Endorsement (PED-DOIPEndorsement-1.1) sets out that, to comply, DOIP must be used according to the set of FDO requirement documents (details on page 3). Achieving FDO compliance thus requires more than DOIP, and full compliance is left to system designers. Likewise, a Linked Data approach will need to follow the same requirements to comply as an FDO implementation.
From our evaluation, we can observe:
• G1 and G2 call for stability and trustworthiness. While the foundations of both the DOIP and Linked Data approaches are now well established, the FDO requirements - and in particular how they can be implemented - are still taking shape and subject to change.
• Machine actionability (G4, G6) is a core feature of both FDOs and Linked Data. Conceptually they differ in the way types and operations are discovered, with FDO seemingly more rigorous. In practice, however, we see that DOIP also relies on dynamic discovery of operations and that operation expectations for types (FDOF7) have not yet been defined.
• FDO proposes that types can have additional operations beyond CRUD (FDOF5, FDOF6), while Linked Data mainly achieves this with RESTful patterns using CRUD on additional resources, e.g. order/152/items. These differences are mainly stylistic, but affect the architectural view - FDOs have more of an object-oriented approach.
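The RESTful pattern mentioned above can be sketched as a plain HTTP exchange against a hypothetical service, where the "additional operation" of adding an item to an order is expressed as CRUD on a nested resource rather than as a typed object operation:

```http
POST /order/152/items HTTP/1.1
Host: shop.example.org
Content-Type: application/json

{"product": "widget", "quantity": 2}

HTTP/1.1 201 Created
Location: /order/152/items/3
```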

Comparing FDO and Web as middleware infrastructures
In this section we take the perspective that FDO principles are in effect proposing a global infrastructure of machine-actionable digital objects.As such we can consider implementations of FDO as middleware infrastructures for programmatic usage, and can evaluate them based on expectations for client and server developers.
We argue that the Web, with its now ubiquitous use of REST APIs (Roy Thomas Fielding 2000), can be compared as a similar global middleware. Note that while early moves to develop Semantic Web Services (Fensel et al. 2011) attempted to merge the Web Service and RDF aspects, we here consider mainly the current programmatic Web and its mostly lightweight use of 3 out of a possible 5 stars of Linked Data (Michael Hausenblas et al. 2012).
For this purpose, we here utilise the Comparison Framework for Middleware Infrastructures (Zarras 2004), which formalises multiple dimensions of openness, scalability and transparency, as well as characteristics known from object-oriented programming such as modularity, encapsulation and inheritance.
Based on the analysis in Table 4 on the next page, we make the following observations:
• With respect to Performance, it is interesting to note that while the first version of DOIP (Reilly 2009) supported multiplexed channels similar to HTTP/2 (allowing concurrent transfer of several digital objects), multiplexing was removed for the much simplified DOIP 2.0 (DOIPv2.0 2018). Unlike DOIP 1.0, DOIP 2.0 requires a DO response to be sent back completely, as a series of segments (which again can split the bytes of each binary element into sized chunks), before transmission of another DO response can start on the transport channel. It is unclear what the purpose is of splitting a binary into chunks on a channel which can no longer be multiplexed and where the only property of a chunk is its size 8.
• HTTP has strong support for scalability and caching, but this mostly assumes read operations on static resources. FDO has no view on immutability or validity of retrieved objects, but this should be taken into consideration to support large-scale usage.
• HTTP optimisations for performance (e.g. HTTP/2, multiplexing) are largely used for commercial media distribution (e.g. Netflix), and not commonly used by providers of FAIR data.
• Cloud deployment of Web applications gives many middleware benefits (scalability, distribution, access transparency, location transparency) - it is unclear how DOIP as a custom protocol would perform in a cloud setting, as most of this infrastructure assumes HTTP as the protocol.
• Programmatically, the Web is rather unstructured as middleware, as there are many implementation choices. Usually it is undeclared what to expect from a given URI/service, and programmers follow documented examples for a particular service rather than automated programmatic exploration across providers. This means one can consider the Web as an ecosystem of smaller middlewares with commonalities.
• Many providers of FAIR Linked Data also provide programmatic REST API endpoints, e.g. UniProt and ChEMBL, but keeping the FAIR aspects, such as retrieving metadata, in such a scenario may require combining different services using multiple formats and identifier conventions.

Distribution transparency: application perceived as a consistent whole rather than independent elements
Each FDO is accessed separately along with its components (typically from the same endpoint).FDOs should provide the mandatory kernel metadata fields.FDOs of the same declared type typically share additional attributes (although that schema may not be declared).DOIP does not enforce metadata typing constraints, this need to be established as FDO conventions.
Each URL accessed separately.
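To make the protocol comparison above concrete, the following sketch builds a minimal DOIP 2.0 retrieve request next to its rough HTTP counterpart. This is an illustration only: it shows the JSON payload naming a target object and the built-in retrieve operation, but omits DOIP's segment/terminator framing, and the Handle PID is the example identifier cited earlier in this article.

```python
import json

# Example Handle PID (as referenced elsewhere in this article).
PID = "21.14100/90ec1c7b-6f5e-4e12-9137-0cedd16d1bce"

def doip_retrieve_request(target_id: str) -> bytes:
    """Sketch of a DOIP 2.0 request: a JSON object naming the target
    digital object and the built-in retrieve operation. The actual
    wire protocol wraps this in segments over TLS (not shown here)."""
    request = {
        "targetId": target_id,
        "operationId": "0.DOIP/Op.Retrieve",
    }
    return json.dumps(request).encode("utf-8")

def http_retrieve_request(pid: str) -> str:
    """The rough HTTP equivalent: a GET via a Handle resolver,
    relying on standard HTTP semantics rather than DOIP operations."""
    return (f"GET /{pid} HTTP/1.1\r\n"
            "Host: hdl.handle.net\r\n"
            "Accept: application/json\r\n\r\n")

msg = json.loads(doip_retrieve_request(PID))
print(msg["operationId"])  # the DOIP built-in retrieve operation
```

Note that the HTTP variant can directly benefit from the caching, multiplexing and cloud infrastructure discussed above, while the DOIP request requires a dedicated client and server implementation.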

Modularity: application as collection of connected/distributed elements.
FDO: FDOs are inherently modular using global PID spaces and their cross-references. In practice, FDOs of a given type are exposed through a single server shared within a particular community/institution.
Web: The Web is inherently modular in that distributed objects are cross-referenced within a global URI space. In practice, an API's set of resources will be exposed through a single HTTP service, but modularity enables fine-grained scalability in the backend.

Assessing FDO against FAIR
In addition to having "FAIR" in its name, the FAIR Digital Object guidelines (PR-RequirementSpec-3.0) also include G3: FDOs must offer compliance with the FAIR principles through measurable indicators of FAIRness.

Here we evaluate to what extent the FDO guidelines and their implementations with DOIP and Linked Data Platform (Bonino da Silva Santos 2022) comply with the FAIR principles (Mark D. Wilkinson et al. 2016). We use RDA's FAIR Data Maturity Model (FAIR Data Maturity Model Working Group 2020), as it decomposes the FAIR principles into a structured list of FAIR indicators (Bahim et al. 2020), importantly considering data and metadata separately. In our interpretation for Table 5 on the following page we have for simplicity chosen to interpret "data" in FDOs as the associated bytestream of arbitrary formats, with the remaining JSON or RDF structures always considered as metadata.
From this evaluation we observe:
• Linked Data in general is strong on metadata indicators, but the LDP approach is weak as it has little concrete metadata guidance.
• FDO/DOIP are stronger on identifier indicators, while the Linked Data approach to identifiers relies on best practices.
• Indicators on standard protocols (RDA-A1-04M, RDA-A1-04D, RDA-A1.1-01M, RDA-A1.1-01D) favour LDP's mature standards (HTTP, URI) - the DOIPv2 specification (DOIPV2.0 2018) currently has only a couple of implementations and is expressed informally. The underlying Handle system for PIDs is arguably mature and commonly used by researchers (this article alone references about 80 DOIs), however DOIs are more commonly accessed as HTTP redirects through resolvers like https://doi.org/ and http://hdl.handle.net/ rather than through the Handle protocol.
• RDA-A1-02M and RDA-A1-02D highlight access by manual intervention, which is common for http/https URIs, but also possible using the above PID resolvers for the DOIP implementation Cordra (e.g. https://hdl.handle.net/21.14100/90ec1c7b-6f5e-4e12-9137-0cedd16d1bce); yet neither the LDP, FDO nor DOIP specifications recommend that human-readable representations be provided.
• Neither DOIP nor LDP require a license to be expressed (RDA-R1.1-01M, RDA-R1.1-02M, RDA-R1.1-03M), yet this is crucial for re-use and for machine actionability of FAIR data and metadata to be legal.
• Machine-understandable types, provenance and data/metadata standards (RDA-R1.1-03M, RDA-R1.3-02M, RDA-R1.3-02D) are important for machine actionability, but are currently unspecified for FDOs. (WD-ImplAttributesTypesProfiles) explores possible machine-readable FDO types, however the type systems themselves have not yet been formalised. Linked Data, on the other hand, has too many semantic and syntactic type systems, making it difficult to write consistent clients.
• Indicators for FAIR data are weak in either approach, as too much reliance is put on metadata. For instance in Linked Data, given the URL of a CSV file, what is its persistent identifier or license information? FAIR Signposting (Van de Sompel et al. 2022) can improve findability of metadata using HTTP Link relations, which enable an FDO-like overlay for any HTTP resource. In DOIP, responses for bytestreams can include the data identifier: if that is a PID (not enforced by DOIP), its metadata is accessible.
• Resolving FDOs via Handle PIDs to the corresponding DOIP server is currently undefined by the FDO and DOIP specifications; the 0.TYPE/DOIPServiceInfo lookup is only possible once the DOIP server is known.

Firstly, we observe that the EOSC IF recommendations are at a high level, mainly affecting governance and practices by communities. This Organisational level is also highlighted by the FDO recommendations; for instance the FDO Typing (PR-TypingFDOs-2.0) proposes a governance structure to recognise community-endorsed services. While these community aspects are not mandated by Linked Data practices, best practices have become established for aspects like ontology development (Norris et al. 2021). EOSC IF's Technical layer is likewise at an architecturally high level, such as service-level agreements, but also highlights PID policies, which are strongly required by FDO, while Linked Data communities choose PID practices separately. The recommendations for the Semantic layer are largely already implemented by Linked Data practices, yet for FDO mostly consist of encouragements. For instance, clear definitions of semantic concepts are required by the FDO guidelines, but how to technically define them has not been formalised by the FDO specifications.
The Legal layer of interoperability is perhaps the one most emphasised by EOSC, enabling collaboration across organizational barriers to jointly build a research infrastructure, but this is an area that both FDO and Linked Data are relatively weak in directly supporting. The EOSC IF recommendations in this layer are still largely related to governance practices and metadata, for instance licensing, privacy and usage policies; yet these are essential for cross-institutional and cross-repository access of FAIR objects.
Likewise, search and indexing is an important FAIR aspect for Findability, but is poorly supported globally by both FDO and Linked Data. Efforts such as the Open Research Knowledge Graph (ORKG) (Jaradeh et al. 2019), DataCite's PID Graph (Fenner and Aryani 2019) and the Google Knowledge Graph (Singhal 2012) have improved programmatic findability to some degree, however not significantly for domain-specific semantic artefacts, currently scattered across multiple semantic catalogues (Corcho, Ekaputra, et al. 2023). There is a strong role for organizations like EOSC to provide such broader registries, moving beyond scholarly output metadata federations. The EOSC Marketplace has for instance recently been expanded to include training material, software and data sources.

DISCUSSION
We have evaluated the FAIR Digital Object concept using multiple frameworks, and contrasted FDO against existing experiences from Linked Data on the Web.In this section we discuss the implications of this evaluation, and propose how these two approaches can be better combined.

Framework evaluation
Having considered FDO and the Web architecture as interoperability (Table 1 on page 7), we observe that neither is a magic bullet, but each brings different aspects of interoperability. The Web comes with a large degree of flexibility and openness, however this means interoperability can suffer, as services have different APIs and data models, although with common patterns. This is also true for Linked Data on the Web, with many overlapping ontologies and frequent inconsistencies in resolution mechanisms; although this has been somewhat alleviated in recent years by schema.org becoming a common metadata model for semantic markup inline in Web pages. The Web is based on the common HTTP protocol, which has remained stable architecturally throughout its 32 years of largely backwards-compatible evolution. FDO, on the other hand, sets down multiple rigid rules for identifiers, types, methods etc. that are advantageous for interoperability and predictability for FAIR consumption. Yet there is a large degree of freedom in how the FDO rules can be implemented by a given community; for instance there is no common metadata model or identifier resolution mechanism, and DOIP is just one possible transport method for FDOs, which itself does not enforce these rules.
When evaluating FDO implementations against the FDO guidelines (Table 3 on page 10) we see that several technical pieces and community practices still need to be developed and further defined, for instance the FDO type system, how to declare FDO actions, how to resolve persistent identifiers, or how to know which pattern of FDO composition is used. Achieving fully interoperable FAIR digital objects would require further convergence on implementation practices, and it is not a given that this needs to diverge from the established Web architecture. It is not clear from the FDO guidelines whether moving from HTTP/DNS to DOIP/Handle as a way to expose distributed digital objects will benefit FAIR practitioners, when both approaches require additional restrictions, equally implementable in either, such as using persistent identifiers or pre-defining an object's type.
Considering this, by comparing FDO and the Web as middleware (Table 4 on page 15) we saw that programmatic access to digital objects, a core promise of FDO, is not particularly improved by the use of the DOIP protocol as compared to HTTP, e.g. the lack of concurrency transparency. Recent updates to HTTP have added many features needed for large-scale usage such as video streaming services (e.g. caching, multiplexing, cloud deployments), and having the option to transparently apply these also to FDOs seems like a strong incentive. Many programmatic features are however missing, or in need of custom extensions, in both approaches, such as transactions, asynchronous operations and streaming.
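The HTTP caching benefit mentioned above can be sketched with a few lines of code. This is a deliberately minimal illustration of the freshness/revalidation logic that HTTP caches apply (not a full RFC 9111 implementation): a cached FDO representation could be reused while fresh, then revalidated cheaply with an ETag, behaviour that a custom protocol like DOIP would have to re-specify and re-implement.

```python
from typing import Optional

def max_age(cache_control: str) -> Optional[int]:
    """Extract the max-age= directive from a Cache-Control header, if any."""
    for directive in cache_control.split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    return None

def cache_decision(cache_control: str, age: int, etag: Optional[str]) -> str:
    """Decide between reusing a cached copy, revalidating it, or refetching.
    Simplified: ignores directives like no-store and no-cache."""
    limit = max_age(cache_control)
    if limit is not None and age < limit:
        return "reuse cached copy"
    if etag:
        return f"revalidate with If-None-Match: {etag}"
    return "fetch again"

# A repository serving immutable FDO bytestreams could set a long max-age,
# letting CDNs and proxies absorb most read traffic transparently.
print(cache_decision("public, max-age=3600", 120, '"abc123"'))
# → reuse cached copy
```

This also illustrates the earlier observation that FDO currently has no view on immutability or validity of retrieved objects: without an equivalent of Cache-Control or ETag semantics, such reuse decisions cannot be made safely.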
By assessing FDO against the FAIR principles (Table 5 on page 20) we found that both FDO implementations are underspecified in several aspects (licences, provenance, data references, data vocabularies, metadata persistence). While there are implementations of each of these in general Linked Data examples, there is no single set of implementation guides that fully realizes the FAIR principles. FAIRification efforts like the FAIR Cookbook (Rocca-Serra et al. 2023) and FAIR Implementation Profiles (Schultes, Magagna, et al. 2020) are bringing existing practices together, but there remains a potential role for FDO in providing a coherent set of implementation practices that can practically achieve FAIR. Significant effort, also within EOSC, is now moving towards FAIR metrics (Devaraju et al. 2021), which in practice need to make additional assumptions on how the FAIR principles are implemented, but these are not always formalized (Mark D Wilkinson et al. 2022) nor can they be taken to be universally correct (Verburg et al. 2023). Given that most of the existing FAIR guides and assessment tools are focused on the Web and Linked Data, it would be reasonable for FDO to provide a profile of such implementation choices that can achieve the best of both worlds.
EOSC has been largely supportive of FDO, FAIR and related services. By contrasting the EOSC Interoperability Framework (Table 6 on page 27) with FDO, we found that there are important dimensions that are not solved at a technical level, but through organizational collaboration, legal requirements and building community practices. FDO recommendations highlight community aspects, but at the same time the largest FAIR communities in many science domains are already producing and consuming Linked Data. Just as the Linked Data community has a challenge in convincing more research fields to use Semantic Web technologies, FDO currently needs to build many new communities in areas that have shown interest in that approach (e.g. materials science). It may be advantageous for both of these efforts to be aligned and jointly promoted under the EOSC umbrella.

What does FDO mean for Linked Data?
The FAIR Digital Object approach raises many important points for Linked Data practitioners. At first glance, the explicit requirements of FDOs may seem easy to fulfil by different parts of the Semantic Web Cake (T. Berners-Lee 2000, slide 10), as we have previously proposed (Soiland-Reyes, Castro, et al. 2022). However, this deeper investigation, based on multiple frameworks, highlights that the openness and variability of how Linked Data is deployed can make it difficult to achieve the FDO goals without significant effort.
While RDF and Linked Data have been suggested as prime candidates for making FAIR data, we argue that when different developers have too many degrees of freedom (such as serialization formats, vocabularies, identifiers, navigation), interoperability is hampered - this makes it hard for machines to reliably consume multiple FAIR resources across repositories and data providers. Indeed, this may be one reason why the initial FDO effort steered away from Linked Data approaches, but it now seems in danger of opening up many of the same degrees of freedom within FDO.
We therefore identify the need for a new explicit FDO profile of Linked Data that sets pragmatic constraints and stronger recommendations for consistent and developer-friendly deployment of digital objects. Such a combination of efforts will utilise both the benefits of mature Semantic Web technologies (e.g. federated knowledge graph queries and rich validation) and data management practices that follow FDO guidance, in order to grow a rigid (yet flexible) ecosystem of machine-actionable scholarly objects. It is beyond the scope of this work to detail such a profile, but its main priorities could be:
• Use HTTP(S) as the protocol
• Use URIs as identifiers, with persistent identifier promises
• Provide consistent identifier resolution that does not require heuristics
• A common core metadata model
• References are always URIs, and should be persistent identifiers
• Types, attributes and actions are self-defined by their identifier
The FAIR and Linked Data communities likewise need to recognize the need for simpler, more pragmatic approaches that make it easier for FAIR practitioners to adopt the technologies with "just enough" semantics. We have previously proposed the combination of RO-Crate (Soiland-Reyes, Sefton, Crosas, et al. 2022) and Signposting (Van de Sompel et al. 2022) as a means to implement FDO (Soiland-Reyes, Sefton, Castro, et al. 2022) over HTTP using a common Linked Data metadata model.
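The profile priorities listed above can be illustrated with a small, entirely hypothetical metadata record and a toy conformance check. The identifiers, profile URI and checks below are invented for illustration; they show the shape such a profile could take (HTTPS persistent identifiers, a common core model such as schema.org, a self-declared type and an explicit license), not an actual specification.

```python
# Hypothetical record following the sketched FDO profile of Linked Data.
# All identifiers below are invented examples.
record = {
    "@context": "https://schema.org/",
    "@id": "https://w3id.org/example/fdo/dataset-42",  # assumed persistent URI
    "@type": "Dataset",                                # self-declared type
    "name": "Example sequencing dataset",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "conformsTo": "https://w3id.org/example/profiles/fdo-linked-data",
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/data/42.csv",
        "encodingFormat": "text/csv",
    },
}

def check_profile(rec: dict) -> list:
    """Toy conformance check for a few of the profile priorities."""
    problems = []
    if not rec.get("@id", "").startswith("https://"):
        problems.append("identifier is not an HTTPS URI")
    if "@type" not in rec:
        problems.append("no self-declared type")
    if "license" not in rec:
        problems.append("no license (required for legal re-use)")
    return problems

print(check_profile(record))  # → []
```

The value of such a profile would lie less in the individual checks than in their predictability: a client could consume any conformant record without per-repository heuristics.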
However, it may be sufficient to use HTTP-based FAIR Signposting alone to achieve the above list, if one considers only a small metadata model, and instead references from the signposting which additional metadata resources are available. This would allow any Linked Data resource to gradually participate in the FDO ecosystem, with minimal effort and non-intrusive implementation changes. FDO implementations like Cordra typically already use HTTP APIs that align with DOIP (DOIP API for HTTP Clients - Cordra Documentation 2023); these can be augmented with Signposting headers without necessarily moving to a Linked Data metadata model.
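As a sketch of this approach, the snippet below parses a FAIR Signposting Link header to discover the FDO-like attributes of an ordinary HTTP resource: its persistent identifier (cite-as), machine-readable metadata (describedby) and content (item). The header value is a plausible invented example, and the parser is deliberately minimal rather than a full RFC 8288 implementation.

```python
import re

# Invented example of a FAIR Signposting Link header for some resource.
link_header = (
    '<https://doi.org/10.1234/example>; rel="cite-as", '
    '<https://example.org/meta/42.jsonld>; rel="describedby"; '
    'type="application/ld+json", '
    '<https://example.org/data/42.csv>; rel="item"; type="text/csv"'
)

def parse_links(header: str) -> dict:
    """Minimal Link-header parser: maps each rel value to its target URI."""
    links = {}
    for target, params in re.findall(r'<([^>]+)>((?:;\s*[a-zA-Z]+="[^"]*")*)', header):
        rel = re.search(r'rel="([^"]+)"', params)
        if rel:
            links[rel.group(1)] = target
    return links

links = parse_links(link_header)
print(links["cite-as"])      # the persistent identifier
print(links["describedby"])  # where machine-readable metadata lives
```

A client following this pattern needs no prior knowledge of the repository's API: the HTTP response itself advertises the persistent identifier and metadata, giving the FDO-like overlay described above.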

CONCLUSION
In this work we have considered FAIR Digital Objects (FDO) as a potential distributed object system and compared FDO with established Web approaches focusing on Linked Data.We have described the background of the Semantic Web and FAIR Digital Objects, and evaluated both using multiple conceptual frameworks.
We find that the FDO and Linked Data approaches can significantly benefit from each other and should be aligned further. Namely, Linked Data proponents need to make their technologies more approachable, agreeing on predictable and consistent implementations of the FAIR principles.
The FDO recommendations show that FAIR thinking in this regard needs to move beyond data publishing and into machine actionability across digital objects, and with broader community consensus. As flexibility for extensions is a necessary ingredient alongside rigidity for core concepts, the FDO community likewise needs to settle on directly implementable specifications rather than just guidelines, and avoid making mistakes similar to those of the early Semantic Web adopters.
By implementing the goals of FAIR Digital Objects with the mature technology stack developed for Linked Data, EOSC research infrastructures and researchers in general can create and use FAIR machine-actionable research outputs for decades to come.

Table 1. Comparing FDO and Web according to the quality levels of the Interoperability Framework for Fast Data (Delgado 2016).

Table 2. Mapping metamodel concepts from the Interoperability Framework for Fast Data (Delgado 2016) to equivalent concepts for FDO and Web.

Table 3. Checking the FDO guidelines (Bonino et al. 2019; PR-RequirementSpec-3.0) against its current implementations as DOIP (DOIPV2.0 2018) and Linked Data Platform (LDP) (Bonino da Silva Santos 2022), with suggestions for required additions.
• FDO collections are not yet defined for DOIP, while Linked Data seemingly has too many alternatives; LDP has specific native support for containers.
• Tombstones for deleted resources are not well supported, nor specified, for either approach, although the continued availability of metadata when data is removed is a requirement of the FAIR principles (see RDA-A2-01M in Table 5 on page 23).
• DOIP supports multiple chunks of data for an object (FDOF3), while Linked Data can support content negotiation. In either case it can be unclear to clients what is the meaning or equivalence of any additional chunks.
• Linked Data frequently has multiple representations, but these are often not sufficiently linked (perhaps with prov:specializationOf (PROV-O: The PROV Ontology 2023)) or are linked with custom keys, particularly due to the lack of namespaces and the favouring of local types rather than type/property re-use.

Table 4. Comparing FAIR Digital Object (with the DOIP 2.0 protocol, DOIPV2.0 2018) and Web technologies (using Linked Data) as middleware infrastructures (Zarras 2004).
Scalability: application should be effective at many different scales.
FDO: No defined methods for caching or mirroring, although this could be handled by the backend, depending on the exposed FDO operations (e.g. Cordra can scale to multiple backend nodes).
Web: Cache control headers reduce repeated transfer and assist explicit and transparent proxies for speed-up. HTTP GET can be scaled to world-population-wide with Content Delivery Networks (CDNs), while write-access scalability is typically managed by the backend.

Table 5. Assessing RDA's FAIR Data Maturity Model (FAIR Data Maturity Model Working Group 2020; Bahim et al. 2020) (first 2 columns) against the FDO guidelines (Bonino et al. 2019), FDO implemented with the protocol DOIPv2 (DOIPV2.0 2018), Linked Data Platform (LDP) (Bonino da Silva Santos 2022) and examples from Linked Data practices in general. (- indicates Unspecified, maybe possible with additional conventions.)

The European Open Science Cloud (EOSC) is a large EU initiative to promote Open Science by implementing a joint research infrastructure, federating existing and new services and focusing on interoperability, accessibility and best practices as well as technical infrastructure (Commission High Level Expert Group on the European Open Science Cloud n.d.). The EOSC Interoperability Framework (Corcho, Eriksson, et al. 2021) details the principles for creating a common way to achieve interoperability between all digital aspects of research activities in EOSC, including data, protocols and software. The recommendations are realized through 4 layers: Technical (e.g. protocols), Semantic (e.g. metadata models), Organisational (e.g. recommendations) and Legal (e.g. agreements), with a particular aim to address the FAIR interoperability principles and building on the concept of FAIR Digital Objects. In Table 6 on page 29 we review the EOSC Interoperability Framework (EOSC IF) recommendations, and evaluate to what extent they are addressed by the principles of FDO and Linked Data or their common implementations.