Characterising the Digital Twin: A systematic literature review. CIRP Journal

While there has been a recent growth of interest in the Digital Twin, a variety of definitions remain in use across industry and academia. There is a need to consolidate research so as to maintain a common understanding of the topic and ensure future research efforts are based on solid foundations. Through a systematic literature review and a thematic analysis of 92 Digital Twin publications from the last ten years, this paper provides a characterisation of the Digital Twin, identification of gaps in knowledge, and required areas of future research. In characterising the Digital Twin, the state of the concept, key terminology, and associated processes are identified, discussed, and consolidated to produce 13 characteristics (Physical Entity/Twin; Virtual Entity/Twin; Physical Environment; Virtual Environment; State; Realisation; Metrology; Twinning; Twinning Rate; Physical-to-Virtual Connection/Twinning; Virtual-to-Physical Connection/Twinning; Physical Processes; and Virtual Processes) and a complete framework of the Digital Twin and its process of operation. Following this characterisation, seven knowledge gaps and topics for future research focus are identified: Perceived Benefits; Digital Twin across the Product Life-Cycle; Use-Cases; Technical Implementations; Levels of Fidelity; Data Ownership; and Integration between Virtual Entities; each of which is required to realise the Digital Twin.


Introduction
Typically described as consisting of a physical entity, a virtual counterpart, and the data connections in between, the Digital Twin is increasingly being explored as a means of improving the performance of physical entities by leveraging computational techniques that the virtual counterpart enables. Interest in the Digital Twin has greatly increased in the past five years across both academia and industry, accompanied by a growth in the number of related publications, processes, concepts, and envisaged benefits (see Figure 1). Missing from literature, however, is a consolidated and consistent view on what the Digital Twin is, and how the concept is evolving to meet the needs of the many use-cases to which it is being tied. This lack of consistency has led to a breadth of characterisations and definitions for digital twins and the digital twinning process which, combined with the range of frameworks applied across industry, risks diluting the concept and missing the benefits that the Digital Twin was originally devised to deliver.

The origin of the Digital Twin
The origin of the Digital Twin is attributed to Michael Grieves and his work with John Vickers of NASA, with Grieves presenting the concept in a lecture on product life-cycle management in 2003 [33]. In a time when Grieves describes virtual product representations as ". . . relatively new and immature" and data collected about physical products as ". . . limited, manually collected, and mostly paper-based", Grieves and Vickers saw a world where a virtual model of a product would provide the foundations for product life-cycle management.
The initial description defines a Digital Twin as a virtual representation of a physical product containing information about said product, with its origins in the field of product life-cycle management. In an early paper [33] Grieves expands on this definition by describing the Digital Twin as consisting of three components: a physical product, a virtual representation of that product, and the bi-directional data connections that feed data from the physical to the virtual representation, and information and processes from the virtual representation to the physical. Grieves depicted this flow as a cycle between the physical and virtual states (mirroring or twinning): of data from the physical to the virtual, and of information and processes from the virtual to the physical (see Figure 2). The virtual spaces themselves consist of any number of sub-spaces that enable specific virtual operations: modelling, testing, optimisation, etc.

Concept: Description
Digital Twin: A complete virtual description of a physical product that is accurate to both micro and macro level.
Digital Twin Prototype: The virtual description of a prototype product, containing all the information required to create the physical twin.
Digital Twin Instance: A specific instance of a physical product that remains linked to an individual product throughout that product's life.
Digital Twin Aggregate: The combination of all the Digital Twin Instances.
Digital Twin Environment: A multiple domain physics application space for operating on Digital Twins. These operations include performance prediction and information interrogation.

The Digital Twin in the Product Life-cycle
In a later paper [34], Grieves further aligned the Digital Twin to the product life-cycle by introducing the Digital Twin Prototype, Digital Twin Instance, Digital Twin Aggregate, and Digital Twin Environment (defined in Table 1). In the context of the product life-cycle [82] (see Figure 3), and using the terms within Table 1, the Digital Twin starts life as a Digital Twin Prototype (design phase). Digital Twin Instances are created for each manufactured product during the realise phase, and the accumulation of the Instances forms the Digital Twin Aggregate. Both the Instances and Aggregate exist within the Digital Twin Environment (the virtual representation of the environment within which the physical product exists), which enables virtual techniques such as simulation, modelling, and evaluation. The Digital Twin Instances/Aggregates and Environment persist beyond the actual life of the physical product, which ends in the Retire/Dispose phase.
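As an illustration only, the relationship between the Prototype, Instance, and Aggregate can be sketched as simple data structures. The Python class and field names below (for example `serial_number` and `realise`) are assumptions made for this sketch and are not taken from Grieves' papers; the Digital Twin Environment is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalTwinPrototype:
    """Virtual description of a product type, created in the design phase."""
    product_type: str
    design_data: dict


@dataclass
class DigitalTwinInstance:
    """Virtual description linked to one manufactured product for its whole life."""
    serial_number: str                 # hypothetical identifier of the physical product
    prototype: DigitalTwinPrototype
    history: list = field(default_factory=list)  # states recorded during the product's life


@dataclass
class DigitalTwinAggregate:
    """The accumulation of all Digital Twin Instances."""
    instances: list = field(default_factory=list)

    def realise(self, prototype, serial_number):
        """Create an Instance for a newly manufactured product (realise phase)."""
        instance = DigitalTwinInstance(serial_number, prototype)
        self.instances.append(instance)
        return instance


prototype = DigitalTwinPrototype("motor", {"rated_speed_rpm": 1500})
aggregate = DigitalTwinAggregate()
first = aggregate.realise(prototype, "SN-001")
second = aggregate.realise(prototype, "SN-002")
first.history.append({"temperature_c": 40.0})
```

Each Instance keeps a reference to the Prototype it was realised from, and the Aggregate can outlive any individual physical product because it only holds virtual records.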
This core concept of the Digital Twin envisaged a system that couples physical entities to virtual counterparts, leveraging the benefits of both the virtual and physical environments to the benefit of the entire system. Product information is captured, stored, and evaluated, and the resulting learning is applied to the current product as well as to future products. As envisioned by Grieves, this process in essence enables the application of a knowledgeable, data-driven approach to the monitoring, management, and improvement of a product throughout its life-cycle.
Since the inception of the Digital Twin in 2003 the concept has grown in interest, and is now listed by Gartner as a key strategic technology trend for 2019. This growth is largely driven by advances in related technologies and initiatives such as the Internet-of-Things, big data, multi-physical simulation, Industry 4.0, real-time sensors and sensor networks, data management, and data processing, and by a drive towards a data-driven and digital manufacturing future. As a consequence both academia and industry have been researching, developing, and seeking to apply Digital Twins or the principles they represent. As will be demonstrated in this work, however, this growth has led to inconsistent application and divergence beyond the original descriptions of Grieves, leading to a need for consolidation of the concept in light of current research and industry application.
This paper initially revisits the concept of the Digital Twin and through a systematic literature review attempts to characterise the Digital Twin including the key processes and associated terminology. Through this process, gaps in knowledge within the wider field are identified and discussed, setting directions for future work required to realise the Digital Twin and its envisaged benefits.

Methodology
The research presented in this paper follows a systematic approach [93] and therefore outlines a clear aim which is addressed in a repeatable and thorough manner. The aim is to characterise the Digital Twin, including the key processes and associated terminology; Figure 4 shows the methodology used to gather a corpus of Digital Twin related literature, and the structured technique used for its analysis.
The results described in the following sections are all based on a corpus of papers relating to the Digital Twin. This corpus was collected between the 29th of September 2018 and the 2nd of October 2018 using Google Scholar and the search query "digital twin". The first 50 pages of results (500 papers) were stored for review. All papers that cite one of three seminal papers, Grieves [33] [34] or Tao et al. [88], were also reviewed. At the time of writing both Grieves and Tao ...
Table 3 highlights that approximately two thirds of the corpus are journal articles, with a third being conference papers, and the remaining 3 being book sections. Finally, Table 4 shows the corpus broken down by publication year; evident is the vast increase in the number of publications in 2017 and 2018 (to October 15th). Combining these, the corpus and research area appear to be heavily manufacturing/production related and, from the relatively high number of conference papers compared to journal articles and the breakdown by year, booming.
A thematic analysis [59] was then performed on the corpus. This form of analysis utilises a structured approach to identify themes within published work and involves six stages: become familiar with data; generate initial codes; search for themes; review themes; define themes; write up. Following this process, 19 core themes were identified. These themes were then divided into those that relate to the characteristics of the Digital Twin (Section 3.1), and those that relate to general research areas, gaps and future directions (Section 3.2).
Within these two sections, and where appropriate, additional analysis of the corpus was performed to better understand and elicit results. This includes the identification of common parameters (Section 3.1.5), a mapping of the corpus against the product life-cycle (Section 3.2.2), and use-cases (Section 3.2.3). In a bid to further underpin the 12 characteristics discovered, Section 4 then selects a number of papers from similar research fields to show how the characteristics are not unique to the Digital Twin.

Results
As detailed in Section 2, the first part of the review involved a thematic analysis. Table 5 shows the identified core themes relating to the Digital Twin and their descriptions, with Table 6 showing each theme mapped to related papers. Each of the identified themes represents a key concept identified across literature as part of the Digital Twin. Themes 1 to 12 form the basis of the characteristics of the Digital Twin, while themes 13 to 19 form the basis of future directions and gaps in research. It is worth noting that themes 18 and 19, data ownership and integration between virtual entities, exemplify gaps in research in that they were highlighted as important within literature, yet no papers held them as their focus.

Characterising the Digital Twin
Drawing on themes 1 to 12 from Table 5 and Table 6, this section explores each theme in detail before formally describing and characterising the Digital Twin.

Physical Entity
In discussing the physical entity, papers are typically domain-specific in their terminology. Examples include: 'vehicle', 'component', 'product', 'system', 'models', and 'artefact'. The commonality in these entities lies in their 'real-world' existence and that they are, needless to say, physical. While these terms all refer to man-made entities, as interest in the Digital Twin has grown the Digital Twins of children [62], farms [98], and agricultural supply chains [39] have also been considered. To encompass all types, and in line with some of the literature, this paper proposes the use of the term 'physical entity' [112] [18] for when the physical entity is twinned.

Table 5: Identified core themes relating to the Digital Twin and their descriptions.
No. Theme: Description
4. Virtual Environment: Any number of virtual 'worlds' or simulations that replicate the state of the physical environment and are designed for specific use-case(s), e.g. health monitoring, production schedule optimisation.
5. Fidelity: The number of parameters transferred between the physical and virtual entities, their accuracy, and their level of abstraction. Examples found in literature include: fully comprehensive, ultra-realistic, high-fidelity, data from multiple sources, micro-atomic level to the macro-geometrical level.
6. State: The current value of all parameters of either the physical or virtual entity/environment.
7. Parameters: The types of data, information, and processes transferred between entities, e.g. temperature, production scores, processes.
8. Physical-to-Virtual Connection: The connection from the physical to the virtual environment. Comprises physical metrology and virtual realisation stages.
9. Virtual-to-Physical Connection: The connection from the virtual to the physical environment. Comprises virtual metrology and physical realisation stages.
10. Twinning and Twinning Rate: The act of synchronisation between the two entities and the rate with which synchronisation occurs.
11. Physical Processes: The physical purposes and processes within which the physical entity engages, e.g. a manufacturing production line.
12. Virtual Processes: The computational techniques employed within the virtual world, e.g. optimisation, prediction, simulation, analysis, integrated multi-physics, multi-scale, probabilistic simulation.
13. Perceived Benefits: The envisaged advantages achieved in realising the Digital Twin, e.g. improved design, behaviour, structure, manufacturability, conformance, etc.
14. Digital Twin across the Product Life-Cycle: The life-cycle of the Digital Twin (whole life cycle, evolving digital profile, historical data).
15. Use-Cases: The applications of the Digital Twin, e.g. reducing cost, improving service, supporting decision making.
16. Technical Implementations: The technology used in realising the Digital Twin, e.g. Internet-of-Things.
17. Levels of Fidelity: The number of parameters, their accuracy, and level of abstraction that are transferred between the virtual and physical twin/environment.
18. Data Ownership: The legal ownership of the data stored within the Digital Twin.
19. Integration between Virtual Entities: The methods required to enable communication between different virtual entities.

Virtual Entity
As with the physical entity, the virtual entity is also referred to by a number of similar-yet-domain-specific terms, for example 'product', 'world', 'model', 'cyber', 'device', and 'object'. For symmetry with the physical entity and in line with some literature, this paper proposes the use of the term 'virtual entity' [99] [58] in the general case and 'virtual twin' [75] [1] [46] [49] [51] [1] when the virtual entity is twinned.
In line with Grieves' concept, there are multiple Virtual Entities present in a Digital Twin, each with a specific purpose, e.g. scheduling, health monitoring, etc. Yet to be presented in literature is how these different Virtual Entities interact, cooperate, and are aggregated. Take, for example, a case where the health monitoring Virtual Entity predicts a faulty component at the same time as the scheduling Virtual Entity is optimising to meet a deadline: which entity is prioritised, and how is that decision made?

Physical Environment
The physical environment refers to the 'real-world' space within which the physical entity is situated; 'real-space', 'real-world', and 'factories' are all examples of terms used in literature. Aspects of these environments are measured and fed into the virtual environment to ensure that it is accurate, as it is upon the virtual environment that simulations, optimisation, and/or decisions (for example) will be made; achieving this requires the capture of all relevant parameters (see Section 3.1.5). This paper proposes the term 'physical environment' to include all parameters that may influence the physical entity, noting that these need not be limited to those measured as part of the Digital Twin, and indeed that capture of all parameters may not be viable. Does measurement of the physical environment of a factory include, for example, the weather, regional holidays, or the schedule of a local sports team's home games? Arguably each of these factors could have an influence on the production output of said factory and as such should be included in the virtual environment. The term 'factory' implies not, whereas the term 'physical environment' implies any affecting parameter could be measured. The term 'physical environment' is also widely used in literature [36].

Virtual Environment
The virtual environment exists within the digital domain and is a mirror of the physical environment, with twinning achieved through physical metrology (i.e. sensors) relaying key measures from the physical to the virtual. In line with terms used to describe the physical environment, there are many similar terms used in place of the virtual environment, e.g. 'virtual-space', 'virtual-world', 'data-model', 'multi-domain models'. Unlike the physical environment, the virtual environment is sometimes referred to by the underlying technology, such as 'database', 'data-warehouse', 'cloud-platform', 'server', and 'API'. In an ever-changing technological landscape it may be unwise to link the concept to a particular technology outside of the specific use-cases such papers present. As such, this paper proposes the term 'Virtual Environment'.

Table 7: Examples of parameters mentioned in the corpus, classified into overarching themes.
Location: The entity's geographic position. Examples: with respect to the entity [15], with respect to the environment [98], layouts [54], [106], [94], manufacturing [16].
Process: The activities within which the entity is engaged. Examples: scheduling parameters (sequence, idle time) [108], models [21], logistics [36], general [

Parameters
Parameters refer to the types of data, information, and processes that are passed between the physical and virtual twins. Table 7 shows examples of parameters mentioned in the corpus classified into overarching themes. These themes were again developed through a thematic analysis of the corpus. Table 7 shows how parameters can be classified into just ten classes, a relatively small set given the range of examined literature.

Fidelity
The term fidelity describes the number of parameters, their accuracy, and level of abstraction that are transferred between the virtual and physical twin/environment. Terms such as 'comprehensive physical and functional description' [80], and 'fully mirroring its (physical twin) characteristics and functionalities' [85] are used to describe the fidelity, with the term fidelity itself used in [ [89]. Bar a small number of occasions where an appropriate, use-case specific fidelity is called for [24] [20] [9], the fidelity of the virtual model is described as a highly accurate replication of the physical entity. Grieves himself describes the virtual twin as accurate "from a micro-atomic level to the macro-geometrical level". Correspondingly, this paper adopts the term fidelity.

State
The state refers to the current condition of both the physical and virtual twins, or the current values for each of the measured parameters. Specific examples of this include operational and health [68], processes and behaviour [104], mechanical and thermodynamic [111], as-built [63] [55], and even the state of a disease within a human being [92]. Considering the state of the virtual twin on par with the physical twin enables functionality such as real-time state estimation [14], and the presentation and prediction of past, current, and future states [32] [22] [107] [63] [98]. The term state is both appropriate and widely used, and is thus proposed to be applicable to both the virtual and physical entities.

Physical-to-Virtual Connection
The physical-to-virtual connections are the means by which the state of the physical entity is transferred to, and realised in, the virtual environment, i.e. the updating of virtual parameters such that they reflect the values of physical parameters. These include Internet-of-Things sensors [ [16], and customer requirements [103].
All descriptions of the Digital Twin within literature contain physical-to-virtual connections. The connection itself consists of a Metrology phase, in which the state of the physical entity is captured, and a Realisation phase, in which the delta between the physical and virtual entities is determined and the virtual entity is updated accordingly. Figure 5a shows this process. As an example, a change in temperature of a physical motor is measured using an Internet-of-Things thermometer (metrology phase). The temperature measurement is then transferred to the virtual environment via a web service, where a virtual process determines the difference in temperatures between the physical motor and the virtual motor, and updates the virtual motor such that both measures are the same (realisation phase). There is no widely used term for this process and so, in line with the terms presented in this paper, physical-to-virtual connection is proposed.
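The motor example above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: plain dictionaries stand in for the physical motor, the Internet-of-Things thermometer, and the web service, and the function names `measure` and `realise` are hypothetical.

```python
def measure(physical_motor):
    """Metrology phase: capture the state of the physical entity.

    A dictionary lookup stands in for reading a real thermometer.
    """
    return {"temperature_c": physical_motor["temperature_c"]}


def realise(virtual_motor, measured_state):
    """Realisation phase: determine the delta, then update the virtual entity."""
    delta = {
        name: value - virtual_motor[name]
        for name, value in measured_state.items()
        if value != virtual_motor[name]
    }
    virtual_motor.update(measured_state)
    return delta


physical_motor = {"temperature_c": 71.5}  # stand-in for the real motor
virtual_motor = {"temperature_c": 65.0}   # last known virtual state

delta = realise(virtual_motor, measure(physical_motor))
second_delta = realise(virtual_motor, measure(physical_motor))
```

After the first pass the virtual motor reports the same temperature as the physical motor; the second pass finds no delta because nothing has changed in between.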
This continuous connection between the physical and virtual is one differentiator between the Digital Twin and more traditional simulation and modelling exercises, where analysis is frequently performed 'off-line'. The physical-to-virtual connection allows for the monitoring of state changes that occur both in response to conditions in the physical environment and in response to interventions enacted by the Digital Twin itself; i.e. should a change in motor speed be enacted due to some temperature measurements, the physical-to-virtual connection would also measure the effect of this intervention.

Virtual-to-Physical Connection
Grieves describes the virtual-to-physical connection as the flow of information and processes from the virtual to the physical; that is, the Digital Twin contains the functionality to physically realise a change in the physical state. Examples of this in practice include changes in display terminals [111], PLCs [111] [6], process control [48], machine parameters [55], and production management [112]. The process of virtual-to-physical connection mirrors that of the physical-to-virtual, in that it contains both metrology and realisation phases (see Figure 5b). Virtual processes and metrology methods determine and measure an optimal set of parameter values within a physical entity or environment, and realisation methods determine the delta between these new values and the existing state, and update the state of the physical entity accordingly. For example, in response to an increased motor temperature that exceeds a safety threshold, the effect of changing motor speed is modelled, a speed that sufficiently reduces the temperature is measured, and the physical motor speed is adjusted.
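A hedged sketch of this motor-speed example follows. The linear thermal model, its coefficients, the candidate speeds, and all function names are assumptions chosen for illustration; a real virtual process would use a validated model of the motor.

```python
def predict_temperature(speed_rpm, ambient_c=25.0, heating_c_per_rpm=0.04):
    """Toy virtual process: an assumed linear steady-state temperature model."""
    return ambient_c + heating_c_per_rpm * speed_rpm


def select_speed(candidate_speeds, threshold_c):
    """Virtual metrology: the fastest speed predicted to stay under the threshold."""
    safe = [s for s in candidate_speeds if predict_temperature(s) < threshold_c]
    return max(safe) if safe else None


def realise_physical(physical_motor, target_speed):
    """Physical realisation: apply the delta between target and current speed."""
    delta = target_speed - physical_motor["speed_rpm"]
    physical_motor["speed_rpm"] += delta
    return delta


physical_motor = {"speed_rpm": 1500, "temperature_c": 85.0}
target = select_speed([750, 1000, 1250, 1500], threshold_c=70.0)
if target is not None:  # no safe speed would call for a different intervention
    applied_delta = realise_physical(physical_motor, target)
```

Under these assumed coefficients the model selects 1000 rpm, so the realisation step lowers the physical speed by 500 rpm.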
In comparison to the physical-to-virtual connection, the virtual-to-physical connection is not always included in descriptions of Digital Twins, even though it is included in Grieves' original definition. The reason for this is not clear. Conceptually it is possible to generate a 'Digital Twin' with just a one-way physical-to-virtual connection (the state of the virtual entity will reflect the state of the physical, hence the two could be characterised as 'twinned'), although it is challenging to understand how benefits of the Digital Twin may be realised without a virtual-to-physical connection. The CIRP Encyclopedia of Production Engineering definition of the Digital Twin [83] is one such example that does not specifically include the virtual-to-physical connection. A potential benefit of this definition is that it is more universal, but this comes at the expense of context of application relating to the fundamental paradigm of twinning and its origins, i.e. a bi-directional relationship between the virtual and the physical.
The value of the virtual-to-physical connection is that, when used in conjunction with a physical-to-virtual connection, it closes the loop between hypotheses generated in the virtual environment and the actual consequences realised in the physical environment. Effectively, the Digital Twin with both physical-to-virtual and virtual-to-physical connections can hypothesise, and subsequently perform, test, and adjust that hypothesis in a continuously adapting and improving cycle. It is this continuous loop that can set the Digital Twin apart from more traditional modelling methods, where hypothesis testing is a far more involved and labour-intensive task.
An aspect of this which is again frequently not discussed in literature is the role of the human-in-the-loop: if one were to, for example, use the virtual twin to determine the health of a particular component using a predictive model, and then send a mechanic to replace that component, the mechanic in this scenario performs the realisation process of virtual-to-physical twinning.
If one did not have the virtual-to-physical connection, i.e. the information generated in the virtual environment is not acted on in the physical environment, then it becomes difficult to separate the concept from those of more traditional multi-physics simulation and modelling approaches that can be considered to represent an instance of a system at a predefined set of inputs/conditions. Leaving this point open for future debate and returning to the aim of this review, there is no widely used term for this process and so, in line with the terms presented in this paper and physical-to-virtual twinning, virtual-to-physical connection/twinning is proposed as a key tenet of the paradigm.

Twinning/Twinning Rate
Twinning is 'simply' the act of synchronising the virtual and physical states; for example, the act of measuring the state of the physical entity and realising that state in the virtual environment such that the virtual and physical states are 'equal', in that all of the virtual parameters have the same values as the physical parameters. The process is depicted in Figure 6 and includes both physical-to-virtual and virtual-to-physical twinning. A change that takes place in either the physical or virtual entity is measured before being realised in the equivalent virtual/physical twin; when both states are equal, the entities are 'twinned'. The combination of both connections allows for a continuous cycle of optimisation, as possible physical states are predicted in the virtual environment and optimised for a specific goal. That is, a virtual optimisation process is performed using the current state of the Physical/Virtual Entity and, once determined, the optimal set of virtual parameters is propagated through to the Physical Twin. The Physical Twin then responds to the change, and the loop cycles around to update the Virtual Twin with the measured physical state. The delta between the actual and predicted states can then be compared and the optimisation process re-run with the updated information.
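The cycle described above can be sketched as follows. This is an illustrative toy, not a method drawn from the reviewed literature: the linear thermal model, the hidden "true" coefficient, and the refitting step used to represent "re-run with the updated information" are all assumptions.

```python
AMBIENT_C = 25.0
LIMIT_C = 70.0


class PhysicalMotor:
    """Stand-in for the Physical Twin; its true heating coefficient is hidden."""

    def __init__(self):
        self.speed_rpm = 1000.0
        self._true_coeff = 0.05  # degrees C per rpm, unknown to the Virtual Twin

    def temperature(self):
        return AMBIENT_C + self._true_coeff * self.speed_rpm


class VirtualMotor:
    """Virtual Twin holding an estimated model of the motor."""

    def __init__(self):
        self.speed_rpm = 1000.0
        self.coeff = 0.03  # initial, inaccurate estimate

    def predict(self, speed_rpm):
        return AMBIENT_C + self.coeff * speed_rpm

    def optimal_speed(self):
        # Virtual process: the speed predicted to sit exactly at the limit.
        return (LIMIT_C - AMBIENT_C) / self.coeff


def twinning_cycle(physical, virtual):
    """One loop: optimise virtually, realise physically, measure, update."""
    target = virtual.optimal_speed()
    predicted = virtual.predict(target)
    physical.speed_rpm = target             # virtual-to-physical twinning
    measured = physical.temperature()       # physical metrology
    virtual.speed_rpm = physical.speed_rpm  # physical-to-virtual realisation
    virtual.coeff = (measured - AMBIENT_C) / physical.speed_rpm  # assumed refit
    return measured - predicted             # delta between actual and predicted


physical, virtual = PhysicalMotor(), VirtualMotor()
deltas = [twinning_cycle(physical, virtual) for _ in range(3)]
```

In this toy the first cycle overshoots the temperature limit because the virtual model is wrong; once the measured delta has been used to refit the model, later cycles predict the physical response closely.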
The Twinning Rate is then the frequency with which twinning occurs. In literature, this twinning rate is only described as being in 'real-time'; that is, a change in a physical state will near-instantly be reflected by the same change in the virtual state. The value of a near real-time state is that it enables the Virtual and Physical Twins to act both simultaneously and together, and theoretically results in a near real-time response to change. An example is an assembly line that automatically adjusts scheduling to counter production losses when a faulty batch of components is detected.
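To illustrate why the twinning rate matters, the sketch below simulates a steadily drifting physical value that is twinned every `twinning_period` ticks and reports the largest gap seen between the physical and virtual states. The drift rate, tick counts, and names are assumptions for this example only.

```python
def run(ticks, twinning_period):
    """Return the worst physical/virtual gap for a given twinning period."""
    physical = 20.0
    virtual = 20.0
    worst_gap = 0.0
    for tick in range(1, ticks + 1):
        physical += 0.5                  # a physical process changes the state
        if tick % twinning_period == 0:  # a twinning event synchronises the states
            virtual = physical
        worst_gap = max(worst_gap, abs(physical - virtual))
    return worst_gap


slow = run(ticks=60, twinning_period=10)  # low twinning rate
fast = run(ticks=60, twinning_period=1)   # high twinning rate
```

With twinning on every tick the two states never diverge, whereas twinning every tenth tick lets the virtual state lag by up to nine ticks of drift.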
Twinning and the Twinning Rate are in effect the live connection between the Physical Entity/Environment and the Virtual Entity/Environment. A key aspect of Grieves' Digital Twin is the collection and reuse of historical data. As such, all these interactions are stored within the virtual environment and made accessible to future Virtual Processes. This effectively means the Digital Twin can learn from its past, both in terms of actual historical performance and in terms of historical virtual processes.

Physical Processes
Physical processes refer to the activities being performed by the physical entity in the physical environment. Reported examples of these are largely manufacturing related [7] [5], iron and steel manufacturing process (coking, sintering, blast furnace iron-making, steel-making, continuous casting and rolling production) [103], 3D printing [40] [18], mobile robot control [107] [12], engineering design [87] [110], and medical health, disease and bio-mechanical processes [92]. It is during physical processes that changes in Physical Twin parameters occur, and it is these state changes that are captured and translated to the Virtual Twin.

Virtual Processes
Virtual processes refer to the activities performed using the virtual entity within the virtual environment. The vast majority of these processes can be covered by the activities of simulation, modelling, and optimisation [ [36], welding sequence optimisation [80] and 'what-if scenario' analyses of alternative management scenarios [3]. These processes result in changes in Virtual Twin parameters, the state of which can then be analysed and/or realised in the Physical Twin.

The Digital Twin and Twinning Process
Through a thematic analysis of literature, this section has identified a range of themes core to the Digital Twin concept. Here, these themes are consolidated and formalised as characteristics of the Digital Twin, with Table 8 presenting these characteristics and their descriptions, and Figure 7 presenting the Twinning process and the inter-relationship of terms within the overall Digital Twin concept. Figure 7 shows how physical/virtual processes act on the corresponding physical/virtual entity, where these processes generate a change in the state of that entity via its parameters. This change in state is captured using metrology methods, transferred via physical-to-virtual and virtual-to-physical connections, and realised in the other (virtual/physical) environment by synchronisation of all parameters. Both virtual and physical environments contain the means to measure and realise state changes. The process of change → metrology → realise is the twinning process, and runs in both directions from virtual-to-physical and physical-to-virtual. The twinning rate is the frequency at which the virtual and physical twins are synchronised. This is the Digital Twin.
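Because the change → metrology → realise process runs identically in both directions, it can be sketched as a single function applied with the roles of source and target swapped. The dictionary-based entities, parameter names, and the functions `twin` and `is_twinned` are hypothetical and serve only to illustrate the symmetry.

```python
def twin(source, target, parameters):
    """One pass of change -> metrology -> realise from `source` to `target`.

    Returns the parameters whose values had to change in `target`.
    """
    measured = {name: source[name] for name in parameters}  # metrology
    changed = {n: v for n, v in measured.items() if target.get(n) != v}
    target.update(changed)                                   # realisation
    return changed


def is_twinned(physical, virtual, parameters):
    """The entities are twinned when every shared parameter has the same value."""
    return all(physical[name] == virtual[name] for name in parameters)


PARAMETERS = ("speed_rpm", "temperature_c")
physical = {"speed_rpm": 1500, "temperature_c": 82.0}
virtual = {"speed_rpm": 1500, "temperature_c": 65.0}

# A physical process has raised the temperature: physical-to-virtual twinning.
p2v = twin(physical, virtual, PARAMETERS)

# A virtual process selects a lower speed: virtual-to-physical twinning.
virtual["speed_rpm"] = 1000
v2p = twin(virtual, physical, PARAMETERS)
```

After both passes every listed parameter holds the same value in the physical and virtual entities, which is the condition the paper describes as 'twinned'.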

Future Directions and Gaps in Research
This subsection examines the literature in more detail to elicit gaps and future challenges based on the identified themes 13 to 19.

Table 8: Characteristics of the Digital Twin and their descriptions.
Physical-to-Virtual Connection/Twinning: The data connections/process of measuring the state of the physical entity/twin/environment and realising that state in the virtual entity/twin/environment.
Virtual-to-Physical Connection/Twinning: The data connections/process of measuring the state of the virtual entity/twin/environment and realising that state in the physical entity/twin/environment.
Physical Processes: The processes within which the physical entity/twin is engaged, and/or the processes acting with or upon the physical entity/twin.
Virtual Processes: The processes within which the virtual entity/twin is engaged, and/or the processes acting with or upon the virtual entity/twin.

Figure 7: The physical-to-virtual and virtual-to-physical twinning process.

Perceived Benefits
There are many potential and perceived benefits highlighted in literature and industry relating to the digital twin concept. These include: reducing costs [33] [4] [40] [22], risk and design time [22], complexity and reconfiguration time [85]; improving after-sales service [17] [72], efficiency [2], maintenance decision making [58], security [6], safety and reliability [80], manufacturing management [67], processes and tools [11]; enhancing flexibility and competitiveness of manufacturing system [111]; and finally, from Grieves himself, the fostering of innovation [33]. There are, however, very few examples of validation and quantification of such perceived benefits against existing processes and systems, with very few papers showing tangible improvement over current norms. Given the potential costs and challenges of the infrastructure and work-flow changes needed to effectively implement digital twins in an industrial context, a lack of tangible understanding of scale and nature of benefits is a substantial obstacle. It is difficult to justify substantial change without clarity in return on investment, and similarly difficult to identify the characteristics and nature of the digital twin to employ in order to realise the benefits each industry context requires. Without substantial effort to describe and quantify benefits, it is challenging even to suggest that the digital twin concept itself may be the most appropriate solution to the challenges faced by each particular industry. Future work in this area is needed to evaluate the Digital Twin and associated processes and determine where quantifiable improvement may be achieved, the limits of this improvement, and the context / cases in which it may be operationalised. Establishing this cost-benefit is essential for industrial uptake.

The Digital Twin across the Product Life-Cycle
Grieves depicts the life-cycle of the Digital Twin as starting as a Digital Twin Prototype in the concept phase of the product life-cycle, and continually evolving throughout the entire life-cycle. As the virtual entity may be stored in perpetuity, it will eventually outlast the physical entity itself, with continual potential value for future analysis and insights even following physical entity disposal. For the papers reviewed in which the life-cycle is described, authors are in agreement with Grieves, with terms such as over, throughout, and entire lifecycles used [ If this is where the Digital Twin is envisaged, Figure 8 shows where research effort is focused through the classification of papers against Stark's product life-cycle [82] of: Imagine, Define, Realise, Support/Use, Retire/Dispose. Papers are classified by their focus on digital twins as a concept, the methodology of their implementation, implementation cases, or a general literature review. This classification shows that research is largely focused on the Realise and Support/Use phases of the life-cycle, and that the majority of papers present methodologies, followed by reports on implementations. There are relatively few papers that place focus on the core concept of the Digital Twin or consider the digital twin across the entire life-cycle, while there are a relatively high number of methodologies and implementations that present an interpretation of the Digital Twin for specific use-cases. Such focus highlights several areas in which knowledge gaps exist, in particular the applicability of digital twins to earlier life-cycle phases and disposal, and the core concepts of digital twins both in the general case and in specific life-cycle phases. As digital twins are a means for generating information and supporting optimisation, analysis, and understanding, a lack of detailed study of digital twins across the life-cycle implies that opportunities for benefit may to date have been missed.
Further work is then needed to understand the requirements of the Digital Twin across the entire life-cycle, and determine whether the existing methodologies and implementations from other phases are applicable. Performing this work could see the realisation of benefits such as reduced cost, risks and design time, fostering innovation, general reliability, and decision making, particularly in the Imagine, Define and Retire/Dispose phases.

Use-Cases
The use-cases presented here simply refer to how and where literature is applying the Digital Twin. The vast majority of identified use-cases are manufacturing related [ [94]. Other use-cases include: product design (bicycle [87], pump [29], and automotive wiring harness [90]), model-based engineering [61] [20] [74], 5G communication for factories [16], air-frame health monitoring [53], composite optimisation [41], smart cars [22], farming [98] [28], and human health and the agriculture supply chain [39]. Figure 9b shows the use-cases mapped against the twinning cycle, with the mapping based on the most appropriate placement for each use-case. For example, Simulation, Modelling, and Optimisation are all virtual processes and so are positioned on that side of the cycle, while Smart Cars and Farms are both physical entities and so are placed on the physical side of the cycle. Those use-cases situated in the centre of the cycle relate to the entire cycle. Learning, for example, contains both physical and virtual entities with the connections between.
From the mapping it is evident that the majority of use-cases involve both the physical-to-virtual and virtual-to-physical aspects of the Digital Twin, even if the virtual-to-physical involves a human-in-the-loop, such as training of the virtual twin before engaging with the physical twin. Two use-cases do not, however, involve both forms of twinning, and there are specific reasons for each of these. Geometry assurance is a stage in a larger process (manufacturing), and this larger process does involve both forms of twinning. Recycling is an end-of-life activity and, as such, there is no longer a physical entity to twin. Future research may then benefit from an increased focus on the virtual-to-physical connection if benefits of reduced times/costs and increased safety are to be realised.
Examining the use-cases from a product life-cycle perspective, Table 9 shows the literature classified by use-case across Stark's product life-cycle [82]. These results show that largely, research is concerned with data management and data usage techniques of simulation, modelling, and optimisation. Outside of these, there are some more specific use-cases such as geometry assurance, health monitoring, traceability, etc. The 'Other' category covers those use-cases that are either very specific (the design of an automotive wiring harness [90] for example) or at too high a level to fall into the other categories, "Emergence of Digital Twins" [23] for example.
Through the sparseness of several areas of research across many life-cycle phases, Table 9 highlights many opportunities for research and implementation. Aspects studied in each use-case may have potential for application across life-cycle phases, with concurrent potential benefit and opportunity for improvement. Examples include opportunities for learning across phases, the importance of design traceability throughout earlier phases to capture rationale, and the nature of effective data management in earlier phases or disposal. In addition to the sparse and unpopulated areas of Table 9, there is a lack of literature studying the entire life-cycle. The requirements of the Digital Twin at each phase of the life-cycle, such as the required fidelity at each phase, are not yet fully understood. There are also questions over how many Digital Twins exist: is one Digital Twin across the entire life-cycle appropriate, or is a new one implemented at each phase? Either way, how are transitions between phases managed? Once a product goes into production, do all instances have a single common Digital Twin ancestor, or is that ancestor cloned and duplicated across all instances? If this is the case, then what is that Digital Twin ancestor: a finished design, or some smaller subset of the finished design? There are then many interesting gaps in this area that require future work.

Technical Implementation
Research into technical solutions to the Digital Twin is largely focused on leveraging existing technologies. These include: 5G [16], Internet-of-Things [ [80]. Figure 9a shows the technology involved in enabling the Digital Twin presented in literature mapped to the twinning cycle. The technology is placed in the area of the cycle where it is used. Those technologies placed in the centre of the cycle are applicable to the entire cycle; for example, 5G and wireless communication technology are used for both physical-to-virtual and virtual-to-physical connections. Figure 9a shows that the Digital Twin is largely dependent on the (Industrial) Internet-of-Things for twinning, for both physical-to-virtual and virtual-to-physical connections. In line with this, sensors (including RFIDs) are being used for data capture, and actuators are being used to realise change in the physical environment. In terms of managing the virtual, the technologies discussed relate to the Internet-of-Things and general internet technology. Finally, the technologies relating to the physical entity are those entities themselves, such as smart factories. The Digital Twin is being constructed on existing, state-of-the-art, and off-the-shelf technologies that are being developed independently of the Digital Twin. While this has benefits in terms of cost and availability of technology, the counter to this is whether these technologies are optimised for the purpose of the Digital Twin and the challenges of industrial applications. There is then a need to ensure future standards are suitable for Digital Twin purposes and, if this is not possible, to develop those standards.

Levels of Fidelity
In the earlier discussion on fidelity, it was shown that most papers that discuss fidelity (including Grieves) advocate the highest levels feasible, with only a few papers (3) presenting fidelity levels specific to particular use-cases. Fidelity is important as it governs the processes that can be performed in both the virtual and physical environments, i.e. the higher the fidelity, the closer the virtual and physical twins are aligned and, for example, the more accurate the simulation, modelling, and optimisation will be.
Placing fidelity on a scale from abstract (low) to precise (high) with medium fidelity in the centre, those cases identified in the corpus are typically situated around the centre. That is, the use-cases use a subset of parameters (medium fidelity) and not the full set (high fidelity) called for in the original Digital Twin concept. Literature is yet to present an exhaustive high-fidelity implementation, where parameters for every aspect of the physical twin are captured. The reality of doing so may see challenges in elements of the system, such as network speeds and computational processing power, that mean a true high-fidelity Digital Twin is not currently achievable. While this could change with future advances in technology, research should also be exploring techniques to mitigate this. Is there a 'divide and conquer' approach to twinning complex systems, for example? Or do we explore the importance of fidelity further and determine the most achievable or appropriate level(s) of multi-scale and multi-fidelity for a given use-case?
Equally, literature is yet to visit the abstract level of fidelity, e.g. a spreadsheet of requirements: is this part of a Digital Twin of a concept? Both these levels of fidelity raise challenges. Dependent on the parameters present and recorded, it is questionable whether an exhaustive high-fidelity Digital Twin is an achievable goal. If it is not, there arises a question of what level of fidelity is appropriate and realistic for a given case in order to maximise benefit while minimising expense and technical difficulty of implementation. The abstract level of fidelity challenges the concept of the Digital Twin itself, i.e. can you twin prior to there being a finalised physical and virtual design, can evolving twins of design prototypes be created while the physical entity itself varies substantially in fidelity, and can even a concept or idea itself be twinned? With proposed benefits in simulation capability and information generation, earlier process stages characterised by a need for information, and up to 70% [96] of budget dedicated in early life-cycle phases, the creation of such abstract and early-stage digital twins has substantial potential benefits. The answers to both these questions will likely depend on whether benefits are realised: at what level of fidelity does one maximise the improvement in decision making for maintenance, and are there benefits in being able to switch between physical and virtual working in early-stage product design?

Data Ownership
The ownership of data, such as personal data (online activity for example), is increasingly seen as controversial. This is also true within the field of engineering, e.g. car airbag 'black box' data [19]. If the aim of the Digital Twin is the exhaustive capture of all physical environment parameters, then there is a high possibility that those parameters can in some way directly or indirectly relate to aspects of people's lives, intellectual property, and everything in between. Determining how this information is shared between organisations and individuals poses a major challenge.
For example, when an individual purchases a car they own that physical entity. There is an unanswered question, however, as to whether they also own the virtual twin and associated data. This is particularly relevant if the individual is also an actor in the virtual environment: if the car is involved in a collision, there are a number of parties (insurance company, engineers, accident investigators) who may want access to data on how the car was being driven. The question of ownership is pivotal to who accesses data and for what purpose. There are then social and cultural implications associated with the large scale collection, storage, and sharing of data through the Digital Twin that need to be fully addressed.

Integration between Virtual Entities
Grieves described a Digital Twin consisting of multiple virtual entities and environments, each with its own specific use case: for example, a production line virtual entity for health monitoring, and another for scheduling. This specificity is mirrored in the corpus, with literature discussing virtual entities at the level of specific use cases and, in general, singular virtual entities. Literature is, however, yet to step back to the higher-level view from which the interaction of virtual entities can be addressed; for example, balancing the need to deliver to a production deadline against a predicted future fault, each of which may be managed by separate digital twins. As with the integration of all discrete digital systems, automatically taking the output from one virtual entity (health monitor) and using it to trigger a re-run of another virtual entity's analysis (production scheduler) may prove a non-trivial challenge. As the quantity of twins increases, and with it the potential complexity of managing disparate twins, there may prove a need for specific research into twin integration and control. For example, there is potential value in operation of digital twins as agent-based systems that cooperate toward specific goals with emergent benefits for the wider system. This detailed problem assumes that 1) the virtual entities are on the same platform and 2) the virtual entities have a common means of interaction. Standardisation and interoperability such that virtual entities can communicate is key to realising this aspect of the Digital Twin and, again, has potential to be a complex, non-trivial challenge. Examples of this can be seen in the Building Information Modelling field, where Stadler et al. [81] discuss and attempt to address the challenges of integrating geographical city data with semantic information using CityGML [45], itself an open source XML-based data structure designed for the storage and sharing of virtual cities.
Similar discussions and research must be performed to address challenges in implementation of digital twins, and realise potential benefits.
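The health monitor and production scheduler example discussed in this section can be sketched as a minimal publish/subscribe arrangement. This is an illustrative sketch only and is not taken from the reviewed literature: the `EventBus`, `HealthMonitor`, and `ProductionScheduler` classes, the `fault_predicted` topic, the machine names, and the vibration limit are all hypothetical. It also assumes the two conditions noted above, namely that both virtual entities sit on the same platform and share a common means of interaction.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class EventBus:
    """Hypothetical shared interaction layer between virtual entities."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        for handler in self._subscribers[topic]:
            handler(payload)


class HealthMonitor:
    """Virtual entity that flags a predicted fault (the limit is an assumption)."""

    def __init__(self, bus: EventBus, vibration_limit: float = 5.0) -> None:
        self.bus = bus
        self.vibration_limit = vibration_limit

    def ingest(self, machine: str, vibration: float) -> None:
        if vibration > self.vibration_limit:
            self.bus.publish("fault_predicted", {"machine": machine})


class ProductionScheduler:
    """Virtual entity that re-runs its analysis when a fault is predicted."""

    def __init__(self, bus: EventBus, machines: List[str]) -> None:
        self.available = list(machines)
        self.reruns = 0
        bus.subscribe("fault_predicted", self.on_fault)

    def on_fault(self, event: dict) -> None:
        if event["machine"] in self.available:
            self.available.remove(event["machine"])
        self.reruns += 1  # stands in for re-running the scheduling analysis


bus = EventBus()
scheduler = ProductionScheduler(bus, ["press_1", "press_2"])
monitor = HealthMonitor(bus)
monitor.ingest("press_1", vibration=2.0)  # below the limit: no event published
monitor.ingest("press_2", vibration=7.5)  # above the limit: scheduler re-runs
print(scheduler.available, scheduler.reruns)
```

The scheduler never calls the monitor directly; the only coupling is the shared topic name and payload shape, which is where standardisation between virtual entities would be needed in practice.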

The Digital Twin Characteristics within the Context of Related Literature
The systematic literature review presented in this paper was deliberately confined to those papers with a contribution specific to the Digital Twin, such that research could be consolidated and a common understanding developed. There are however a number of related fields that both predate the Digital Twin (Virtual Manufacturing Systems for example), and are developing in parallel (Building Information Modelling for example). To provide greater underpinning to the characteristics developed, this paper now considers the developed characteristics within the wider context of these other fields. Unlike the systematic approach used to generate the characteristics of the Digital Twin and due to the vast number of publications in all these fields, this section considers only seminal works from the related fields.

Computer-Integrated Manufacturing
In the late 1980s, the CIM Reference Model Committee International Purdue Workshop on Industrial Computer Systems published a reference manual for Computer-Integrated Manufacturing [102]. With the advent of computers with processing power fit for the 'real-time' control of production lines, Computer-Integrated Manufacturing was seen as the means of developing robust and dynamic production lines with the ability to adapt to and compensate for changes caused by disruptions such as breakdowns, and changes in customer demands. This was achieved through the closing of information loops, i.e. computers could both monitor and enact change in the physical entity.
In 2018, to mark the 30-year anniversary of the International Journal of Computer Integrated Manufacturing, Laengle et al. [50] produced a bibliometric analysis of the journal's 1687 papers. Amongst the analysis the top 30 global keywords are presented, with the top three (and their position in brackets) being 'simulation' (1), 'scheduling' (2), and 'process planning' (3). While not taking the full list out of context, it is effectively made up of virtual techniques ('simulation' (1), 'modelling' (6), 'optimisation' (22)), the means of realising change in the physical entity ('Step-NC' (5), 'CNC' (20)), metrology and data management techniques ('interoperability' (7), 'RFID' (24)), and specific use-cases ('supply chain management' (10), 'supplier selection' (25)). These topics all align with those presented in the characterisation of the Digital Twin in this paper.

Virtual Manufacturing Systems
In the 1990s, Onosato published a paper on the development of a Virtual Manufacturing System [65], a system aimed at generating a virtual representation of a physical production line such that manufacturing processes could be modelled without the need for the physical entity. Specifically, Onosato presented the means to model the factory, product life-cycle, and manufacturing processes over time, with the desired use-cases of shop-floor layout, modelling, testing and simulation of control strategies, programs and scheduling. Later that decade, Iwata et al. [43] [44] built on Onosato's work, defining architectures and information infrastructures required to deliver the Virtual Manufacturing System. While communication technology and processing power have since developed and are capable of processing more and faster, changing the landscape of the challenges, Onosato's Virtual Manufacturing System was aimed at being "...manufacturing systems which pursue the informational equivalence with real manufacturing systems", capable of replicating the physical production line such that accurate and useful models could be created and evaluated.
In comparison with the characteristics of the Digital Twin, Virtual Manufacturing Systems are discussed in terms of physical and virtual entities, with the virtual entity being a high-fidelity representation of the physical. The key difference is the lack of connection between physical and virtual entities. The aim of the Virtual Manufacturing System was to be useful through its ability to replicate 'real-world' operations through high-fidelity virtual representations of the physical. The concept of using a virtual representation of a physical entity is therefore one that has existed since the 1990s, albeit one that relies on accurate models rather than 'real-world' data.

Model-Based Predictive Control
Originally developed as a means to control chemical processes in the oil and gas industry, Model Predictive Control is simply the means of controlling a process based on some form of model (e.g. linear, non-linear) [31]. Physical processes are measured and compared to a virtual model that is able to predict the future states of the process, and optimise/adapt/control the process appropriately. As a means of control, model-based prediction is automated and robust, and as such is now widely used across engineering disciplines and has evolved through a number of generations [69]. In a review published in 2014, Mayne [60] gives a good general overview of the field, highlighting both the theoretical and mathematical aspects of the models, as well as the more physical sensors, actuators and the practical network challenges in delivering closed-loop control through "sensor-to-controller" and "controller-to-actuator" connections.

The similarities between the Digital Twin and Model-Based Predictive Control are in the capture and interpretation of the current state of the physical entity and the ability to use that current state to change the future state, whether to optimise or to react to problems. The similarity between the "sensor-to-controller" and "controller-to-actuator" connections and the physical-to-virtual and virtual-to-physical connections that appear in the characteristics of the Digital Twin speaks to the benefits of the closed-loop approach as originally conceived by Grieves and Vickers.
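The closed loop described in this subsection can be illustrated with a minimal receding-horizon sketch. It is not drawn from the cited works. The scalar model x[k+1] = A*x[k] + B*u[k], its coefficients, the horizon, the grid of candidate inputs, and the simplification that the input is held constant over the prediction horizon are all assumptions chosen for brevity, and the physical process is assumed to match the virtual model exactly.

```python
A, B = 0.9, 0.5   # assumed model coefficients (hypothetical process)
HORIZON = 5       # assumed prediction horizon, in time steps


def predict(x, u):
    """Roll the virtual model forward over the horizon with a constant input u."""
    trajectory = []
    for _ in range(HORIZON):
        x = A * x + B * u
        trajectory.append(x)
    return trajectory


def choose_input(x, setpoint, candidates):
    """Pick the candidate input whose predicted trajectory best tracks the setpoint."""
    def cost(u):
        return sum((xp - setpoint) ** 2 for xp in predict(x, u))
    return min(candidates, key=cost)


def run_closed_loop(x0, setpoint, n_steps=30):
    candidates = [i / 10 for i in range(-20, 21)]  # inputs from -2.0 to 2.0
    x = x0
    history = [x]
    for _ in range(n_steps):
        measured = x                                      # sensor-to-controller
        u = choose_input(measured, setpoint, candidates)  # prediction and optimisation
        x = A * x + B * u                                 # controller-to-actuator, process responds
        history.append(x)
    return history


history = run_closed_loop(x0=0.0, setpoint=4.0)
print(round(history[-1], 2))
```

Only the first input of each optimisation is applied before the state is measured again, which is what closes the loop between the physical process and the virtual model.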

Advanced Control Systems
In a review of control techniques in factory automation, Dotoli et al. [ Algorithms have all been integrated and applied to the control of industrial machines. Adaptive Control systems simply adapt the manner in which they control based on input parameters from the system that they control. Discrete Event Systems Based Control systems are based on the occurrence of asynchronous discrete events, i.e. the control enters particular states based on inputs from the controlled system. Event-Triggered Control systems respond to the detection of particular states in the controlled system. Self-Triggered Control systems respond to predicted states in the controlled system, i.e. they are able to react in anticipation of the controlled system entering a particular state.
Similarly to the description of Model-Based Predictive Control systems in the previous subsection, Advanced Control Systems use measurement of data from a physical entity, perform some form of virtual analysis on that data, and use it to realise change in the physical entity/environment. As such, the similarity between Advanced Control Systems and the Digital Twin mirrors that of Model Predictive Control. In addition, however, the advanced techniques speak to the challenges of control of complex systems, with the need for intelligent control approaches.
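The distinction drawn above between responding to detected states and responding to predicted states can be made concrete with a small sketch. It is illustrative only: the rising measurement series, the limit, and the constant-rate prediction model are hypothetical, and the two functions are deliberately reduced to the triggering decision rather than a full controller.

```python
def event_triggered(levels, limit):
    """Return the first time step at which a measured level exceeds the limit."""
    for step, level in enumerate(levels):
        if level > limit:
            return step
    return None


def self_triggered(level, rate, limit):
    """Predict how many steps remain before the limit is exceeded.

    Assumes the level changes by a constant, known rate per step.
    """
    if level > limit:
        return 0
    if rate <= 0:
        return None  # the limit is never reached under this model
    steps = 0
    while level <= limit:
        level += rate
        steps += 1
    return steps


levels = [1.0 + 0.5 * k for k in range(10)]  # hypothetical measurements
detected_at = event_triggered(levels, limit=3.0)
predicted_at = self_triggered(level=levels[0], rate=0.5, limit=3.0)
print(detected_at, predicted_at)
```

Both functions identify the same time step here, but the event-triggered check only knows once the measurement arrives, whereas the self-triggered prediction is available from the first measurement and so allows action in anticipation.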

Machine Health Monitoring/Prognostics
In their paper on rotating machinery prognostics, Heng et al. [38] describe machine prognostics as the '...forecast of the remaining operational life, future condition, or probability of reliable operation of an equipment based on the acquired condition monitoring data' and state how the challenge of maintaining the health of machinery has moved from breakdown maintenance (fixing a broken machine), through to intelligent predictive maintenance systems, i.e. the automated collection, analysis and prediction of the state of a machine. Heng et al. describe the process of sensors being used to measure the state of a machine (vibration, acoustics, etc.), and these measures being stored and analysed using physical-based (mathematical models) and data-driven (artificial neural networks on historical and current states) prognostics models. Reviewing the state-of-the-art in the field, Heng et al. concluded, amongst other findings, that too many prognostics models were based on data collected in laboratory environments, rather than the 'real-world' operational environment.
Within the context of the characteristics of the Digital Twin, the techniques clearly map to metrology methods, physical-to-virtual data connections, and the state of the physical entity. The Heng et al. paper is used as an example here as it is both a widely cited publication, and also speaks to the importance of the environment within which the physical entity is situated, something that Grieves' Digital Twin called for from the beginning.
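As a minimal illustration of the measure, store, and forecast process described above, the following sketch estimates remaining operational life from condition monitoring data. It deliberately swaps in a least-squares linear trend for the neural network models mentioned by Heng et al., and the operating hours, vibration readings, and failure level are invented for the example.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    variance = sum((t - mean_t) ** 2 for t in times)
    slope = covariance / variance
    return slope, mean_v - slope * mean_t


def remaining_useful_life(times, values, failure_level):
    """Extrapolate the fitted trend to the failure level (assumed threshold)."""
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None  # no degradation trend, so no forecast is made
    time_of_failure = (failure_level - intercept) / slope
    return max(0.0, time_of_failure - times[-1])


hours = [0, 100, 200, 300, 400]        # hypothetical operating hours
vibration = [1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical sensor readings
rul = remaining_useful_life(hours, vibration, failure_level=5.0)
print(rul)
```

A forecast of this kind is only as representative as the data behind it, which is the point Heng et al. make about laboratory versus operational environments.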

Building Information Modelling
As an engineering field that manages the life-cycle of an engineering asset (a building), Building Information Modelling is both a current and highly related research field. Being largely driven by government legislation requiring the capture of building information and making it accessible via a three-dimensional representation of the building, Building Information Modelling is aimed at providing a single source from which all stakeholders can operate, across the building's entire life-cycle. In their handbook on Building Information Modelling, Eastman et al. [26] provide an overview of the field. While a large portion of the book is dedicated to the various stakeholders involved in building projects, the delivery of systems is focused on interoperability (standardised file formats), parametric modelling, and light-weight representations (CAD models/visualisations). From its foundations, then, Building Information Modelling is a virtual representation of a physical entity, albeit with a greater focus on the users of the system than the Digital Twin.
To emphasise the similarity with the Digital Twin, Table 10 shows the 'BIM Levels' as outlined by the UK Government's Building Information Modelling Industry Task Group (2011) 2 . BIM Levels 1 to 3 map across to the Digital Model, Digital Shadow, and Digital Twin as described by Kritzinger et al. [48]. The Digital Model is the two and three-dimensional CAD model (corresponding to BIM Level 1), the Digital Shadow is the three-dimensional CAD model containing data from the actual physical construction (corresponding to BIM Level 2), and the Digital Twin is the three-dimensional model with two-way data connections (corresponding to BIM Level 3). Put within the context of the characteristics of the Digital Twin, Building Information Modelling aims to twin a building, using both virtual-to-physical and physical-to-virtual data connections, with the means to measure and realise change in the current state of the physical building.

BIM Level | Description
0 | Unmanaged two-dimensional CAD shared via paper/electronic paper.
1 | Managed two/three-dimensional CAD adhering to BS1192:2007 and within a Common Data Environment that allows collaboration.
2 | Level 1 within a three-dimensional virtual environment with attached data. Representations for Architectural, Structural, Facilities, Building Sources and Bridges.
3 | Level 2 plus interoperable data.
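The mapping between BIM Levels and the Digital Model, Digital Shadow, and Digital Twin described in this subsection can be written down as a small lookup. This is a sketch of the correspondence as stated in the text only; the `Integration` enumeration, the `classify` helper, and its boolean flags are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Optional


class Integration(Enum):
    DIGITAL_MODEL = "Digital Model"
    DIGITAL_SHADOW = "Digital Shadow"
    DIGITAL_TWIN = "Digital Twin"


# Correspondence as stated in the text; BIM Level 0 is not mapped there.
BIM_LEVEL_TO_INTEGRATION = {
    1: Integration.DIGITAL_MODEL,
    2: Integration.DIGITAL_SHADOW,
    3: Integration.DIGITAL_TWIN,
}


def classify(physical_to_virtual: bool, virtual_to_physical: bool) -> Optional[Integration]:
    """Classify a virtual representation by its data connections."""
    if physical_to_virtual and virtual_to_physical:
        return Integration.DIGITAL_TWIN    # two-way data connections
    if physical_to_virtual:
        return Integration.DIGITAL_SHADOW  # data from the physical entity only
    if virtual_to_physical:
        return None                        # combination not described in the text
    return Integration.DIGITAL_MODEL       # no data connection


def bim_level_for(integration: Integration) -> int:
    """Return the BIM Level that corresponds to an integration category."""
    return next(
        level for level, value in BIM_LEVEL_TO_INTEGRATION.items() if value is integration
    )


print(bim_level_for(classify(physical_to_virtual=True, virtual_to_physical=False)))
```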

Summary
The Digital Twin is seen as a relatively new research field. While not exhaustive, this section frames the characteristics of the Digital Twin with respect to concepts in related fields, many of which predate the Digital Twin. The purpose of this section is to highlight and further underpin the characteristics of the Digital Twin: Physical Entity/Twin; Virtual Entity/Twin; Physical Environment; Virtual Environment; State; Realisation; Metrology; Twinning; Twinning Rate; Physical-to-Virtual Connection/Twinning; Virtual-to-Physical Connection/Twinning; Physical Processes; and Virtual Processes. Whether explicitly stated or not, each of these characteristics appears elsewhere in literature. One could also argue that the Digital Twin is not a brand new concept; it can also be seen as the aggregation or evolution of a number of existing areas of research and industrial techniques.

Conclusion
The Digital Twin is undergoing an increase in interest from both an academic and industrial perspective. In response to this, this paper presented a systematic literature review in a bid to characterise the Digital Twin, identify gaps in research, and highlight directions for future research. The review methodology comprised a collection of 92 Digital Twin related papers from two sources: a Google Scholar search with the query 'digital Twin', and those papers that cite one of three seminal works. A thematic analysis was then performed on the corpus and 19 key themes were extracted (Tables 5 and 6). These themes were separated into those relating to the characteristics of the Digital Twin, and those that spoke to the gaps in research and future direction.
Starting with characterising the Digital Twin, the main contribution of this section of work was the 13 characteristics and processes that were generated and discussed in detail (Table 8 and Figure 7). These characteristics comprised: Physical Entity/Twin; Virtual Entity/Twin; Physical Environment; Virtual Environment; State; Realisation; Metrology; Twinning; Twinning Rate; Physical-to-Virtual Connection/Twinning; Virtual-to-Physical Connection/Twinning; Physical Processes; and Virtual Processes. These characteristics were mapped in Figure 7 to generate a complete description of the digital twin and, according to current literature, all elements and concepts it contains. The second part of this paper identified gaps in and future directions for research based on the remaining seven identified themes. These comprised: Perceived Benefits; Digital Twin across the Product Life-Cycle; Use-Cases; Technical Implementations; Levels of Fidelity; Data Ownership; and Integration between Virtual Entities (Tables 5 and 6). Each of these areas was discussed in detail and further analysis performed where required; i.e. mapping of the corpus to the product life-cycle, and highlighting the lack of research and implementation in earlier life-cycle phases or system disposal (Figure 3); identification of 11 use-cases associated to the Digital Twin (Table 9); and a mapping of use-cases and technology to the twinning cycle (Figures 9a and 9b).
The Digital Twin would benefit from a more detailed comparison and review in the context of similar and connected fields. Building Information Modelling shares many aspects of the Digital Twin, is arguably a more advanced field, contains both physical and virtual entities with data connections between, and yet is treated as a separate area of research. Computer-Integrated Manufacturing, Virtual Manufacturing Systems, Model-Based Predictive Control, Advanced Control Systems, and Health Monitoring/Prognostics are examples of well-established research areas that both predate the Digital Twin and underpin the characteristics presented in this paper. Some challenges of delivering the Digital Twin will not be unique and may have been addressed in these related fields. For example, Section 3.2.7 on Integration between Virtual Entities showed one attempt to manage integration between data sources in Building Information Modelling, something that the Digital Twin will also have to address.
This paper then contributes to both an understanding of the Digital Twin and its future direction. As shown by both the 2019 Gartner Hype Cycle and the breakdown of number of papers published by year (Table 4), the field appears to be undergoing a large increase in attention from both academia and industry. As an example of this, the CIRP Encyclopedia of Production Engineering [83] recently published a definition of the Digital Twin: "A digital twin is a digital representation of an active unique product (real device, object, machine, service, or intangible asset) or unique product-service system (a system consisting of a product and a related service) that comprises its selected characteristics, properties, conditions, and behaviors by means of models, information, and data within a single or even across multiple life cycle phases." This definition appears alongside a "Digital Twin 8-dimension model" for planning according to the purpose of the Digital Twin.
The 8-dimension model reinforces many of the findings presented in this paper, albeit with different terminology: the model talks of integration breadth, connectivity mode, update frequency, CPS intelligence, simulation capabilities, digital model richness, human interaction, and the product life-cycle. Table 11 shows the CIRP dimensions mapped to the findings presented in this paper.  The CIRP Encyclopedia also acknowledges the challenge of how best to represent the Digital Twin that future research efforts will need to address. The contribution of the characterisation of the Digital Twin is a step forward in addressing this. Through framing future Digital Twin use-cases with a consolidated common understanding and terminology, a multitude of Digital Twins of physical entities of all forms can be realised in a manner that holds true to the Digital Twin paradigm. It is only through these efforts that the envisaged benefits afforded by the Digital Twin can be fully realised and shared across domains.