Automated Container Terminal Production Operation and Optimization via an AdaBoost-Based Digital Twin Framework

Introduction
The container terminal is a key node of international transportation and an important hub for cargo transportation between land and sea [1]. As the construction of automated container terminals (ACTs) advances, container terminals are moving from automation toward intelligence [2]. Various advanced technologies, equipment, and intelligent systems have been applied in the container terminal to realize automated, highly efficient, and environmentally friendly operation. Loading and unloading efficiency, safety, and service capacity determine the degree of intelligence of the ACT [3]. At present, the development of the ACT faces major challenges, such as safety and efficiency [4]. ACTs suffer from high operation risk and limited emergency capacity. Loading and unloading is a high-risk operation, which places high safety requirements on the working environment, mechanical equipment, and operators. During operation, many kinds of potential safety hazards arise, such as equipment failure, equipment conflict, and human negligence. These hidden dangers have a negative impact on operation safety.
Another key problem is the loading and unloading efficiency, which restricts the ACT's economic development. The processing speed of handling equipment, horizontal transportation time, and waiting time between equipment are the key factors affecting the efficiency of ACT operation. Thus, it is of great significance to further improve the safety and operation efficiency of ACTs by using advanced technology and production control methods to monitor the operation in real time.
Some research on the operation control of ACTs has addressed production safety and operation efficiency. Zhong et al. [5] established the architecture of a command-and-control system using multiagent technology and improved operation efficiency while ensuring the safe operation of the terminal. Ma et al. [6] used the mutation hybrid frog leaping algorithm to improve the efficiency of horizontal transportation. However, researchers mostly focus on the analysis of historical data and seldom consider real-time and virtual data for the operation of the ACT. The common feature of the above research is that it is based on physical space. Physical space and virtual space are not integrated, and the data generated by each system are independent of each other, which provides little value for the optimization of ACT production control [7]. Therefore, there are still many problems in production control research for ACTs. First, few methods have been provided to achieve real-time iterative optimization of the virtual and physical space of the ACT [7]. In addition, most of the research on ACTs is based on single-type data, where the data among systems exist in isolation. In other words, the current research focuses on historical data and ignores real-time data, simulation data, and fusion data. This makes it more difficult to control the real-time process of the ACT. Finally, mathematical models and algorithms [6,8] improve the operation efficiency only under specific assumptions, which is not consistent with actual engineering application.
With the application of new-generation information technology in industry, such as digital twin (DT), artificial intelligence (AI), and the industrial Internet of things (IIoT), the interaction between physical and virtual space becomes possible [9]. DT is the mirror image of physical entities in virtual space, which has the characteristics of virtual-real mapping, real-time operation, prescience, and closed-loop control. Tao and Zhang [10] applied DT to the shop-floor and established a DT framework. Based on the framework, production characteristics of the shop-floor such as virtual-real integration, operation iteration, and optimization were explored. Eugene et al. [11] applied DT to the loading of air cargo. By constructing a closed-loop dynamic air cargo loading DT system, loading optimization and real-time visualization in a dynamic environment were realized. Besides, Wei et al. [12] deeply integrated DT with the passenger-cargo RORO port and proposed the DT system composition and operation mechanism. The applications of DT in previous research mostly focused on the manufacturing shop-floor, aviation, and RORO ports, but there is little research about ACT production control. On this basis, a DT application framework integrated with AdaBoost is proposed to optimize the production control of the ACT. It can not only deal with the uncertain factors in the operation of the ACT in time but also control the production process more accurately and flexibly.
This study combines DT and machine learning to explore an intelligent operation mode for ACTs. A DT application framework integrated with AdaBoost is proposed, which can comprehensively manage and control the uncertain factors in ACT operation. To handle uncertainty interference, the twin data are used to generate the dynamic models iteratively as the environment changes. Then, the AdaBoost algorithm is introduced to train the model repeatedly to better control the operation. Furthermore, the proposed DT application framework is applied to the ACT to develop the DT system. Conventional and DT-based operation tests at different scales are carried out to demonstrate the efficiency of the proposed DT application. The remainder of this paper is organized as follows: Section 2 summarizes the literature on DT research, machine learning, and terminal production optimization.
The DT theoretical reference model is established, and the proposed application framework is described in detail in Section 3. Section 4 illustrates the model training and experimental results. Section 5 briefly concludes the research.

Related Work
DT reflects the precise mapping between physical space and virtual space [13]. However, research on the theory and practical application of DT is still in its infancy. No universal definition, implementation framework, or protocol is available [14]. DT is an information model which is equivalent to a physical entity in virtual space [15]. It can be used to simulate, optimize, and control the behaviors of physical entities. The concept of "twins" can be traced back to NASA's Apollo program, in which a twin was built to mirror the conditions of the space vehicle during the mission [16]. Michael Grieves first proposed the concept of the DT in 2003 and defined its conceptual model [17]. This model is mainly composed of physical entities, virtual entities, and their connections. At that time, owing to immature data acquisition technology in the production process, as well as the difficulty for computers and algorithms of processing large amounts of data in real time, DT did not attract wide attention [18]. To further clarify the concept of DT, this paper reviews specific applications of DT in various fields. At present, DT is widely used in manufacturing, aerospace, health care, power systems, sea transportation, and other fields. Smart manufacturing is the most active research field in China. Several authors [19][20][21] proposed the concept of a DT workshop, expecting to realize smart manufacturing through the interaction and integration of the physical world and the information world. The US Air Force Research Laboratory applied DT to solve the maintenance and life prediction problems of future aircraft in a complex service environment [22]. Hänel et al. [23] designed a method based on planning and process data for machining processes to create a DT model, using components in the aerospace industry as an example.
Liu et al. [24] proposed a DT-based cloud medical system framework to realize the management of monitoring, diagnosis, and prediction of personal health. Zhou [25] applied the DT framework to the online analysis of the power grid and proposed the online analysis of DT (OADT) method to realize the digitalization of power grid dispatching rules. In particular, some researchers have begun to pay attention to the application of DT in the port field, mostly to port operation and terminal equipment maintenance. Wei et al. [12] deeply integrated DT technology with the traditional port and formulated the system architecture of the passenger-cargo RO/RO DT port to realize digital and intelligent operation of the port. Hofmann and Branding [26] used DT to continuously evaluate the current scheduling strategy and configuration alternatives so as to obtain the best scheduling strategy and enhance port resource utilization. In addition, Szpytko and Duarte [27] applied DT to the container terminal and established a comprehensive maintenance decision-making model under the crane operation state using the concept of DT. This greatly reduced the risk of gantry crane failure (GCI) and improved the operation efficiency.
Machine learning is an important branch of artificial intelligence. It is a method in which the computer builds a model from existing data and uses this model to predict the future. The major machine learning algorithms include neural networks, support vector machines, decision trees, clustering, and regression algorithms. With the rapid development of artificial intelligence technology, machine learning has attracted more and more attention. Chen et al. [28] used a multiview learning algorithm to extract ship descriptors from different ship feature sets to achieve efficient ship tracking. Pham et al. [29] proposed the application of data mining and machine learning technology in the metal industry. Shahrabi et al. [30] used reinforcement learning to deal with the dynamic job shop scheduling problem under the uncertainty of stochastic jobs and machine failures. Li and Xu [31] explored the application of machine learning in intelligent transportation, applying it to traffic flow prediction to solve the problem that traditional traffic flow prediction models cannot cope with complex changes in traffic flow. Choe et al. [32] applied machine learning to ACTs, dynamically adjusting the scheduling strategy of automated guided vehicles (AGVs) with an online preference learning algorithm. Min et al. [7] applied machine learning to the petrochemical industry and proposed a machine-learning-based DT framework for production optimization, in which the DT model is trained by machine learning and big data technology to optimize production control.
ACT operation includes a series of related operation processes, and its complexity involves many decision-making problems commonly discussed for terminals [5]. So far, abundant studies have focused on the operations of ACTs, such as operation safety, handling equipment scheduling, storage space allocation, and integrated scheduling of multiple resources [33].
Chen et al. [34] adaptively deployed the proposed ship behavior analysis framework in the networked autonomous vehicle detection system of the automated terminal to improve the capacity and security of the traffic network. As for the equipment scheduling of ACTs, previous studies have focused on improving the operational efficiency of automated quay cranes (AQCs), automated guided vehicles (AGVs), and automated yard cranes (AYCs). AQC scheduling aims to minimize the completion time of ship operation by determining the optimal operation sequence for handling containers. Daganzo [35] was the first to study the crane scheduling problem of the multicontainer ship terminal. In follow-up studies, a variety of constraints were considered, noncrossing constraints were introduced into the AQC scheduling problem, and heuristic algorithms were used to solve the model [36]. AYC scheduling focuses on obtaining an optimal plan for one or more AYCs to stack and reclaim containers in the terminal storage yard [37]. Many researchers have studied the stacking operation of a single crane. Besides, some studies have considered the setting of multiple cranes sharing one stack [38]. For AGV scheduling problems, most researchers have focused on seeking the optimized assignment and routing plan for vehicles [39]. Chen et al. [40] proposed a method based on computer vision technology to extract vehicle trajectories efficiently and accurately from video and to enrich the trajectory data sets available for traffic flow research. Tasoglu and Yildiz [41] proposed a simulation-optimization-based solution method for the integrated berth allocation and AYC scheduling problem under the influence of many factors such as the berth layout. Homayouni et al. [8] proposed a mixed integer programming model for the integrated scheduling of handling equipment in ACTs and used a simulated annealing algorithm to find the optimal solution.
From the above, the practical value of DT and machine learning in manufacturing, transportation, and other industries has been demonstrated in many existing studies. However, the application of the interaction between DT and machine learning in the ACT still needs further exploration.
Through the review and analysis of the existing literature, it is found that some issues remain to be explored. First, an application framework that can support the DT theory and method in ACT operation optimization is still needed. Second, how to dynamically update the DT model to respond to the influence of uncertain factors in real time remains to be explored. Finally, the real-time performance of DT must be realized: acquiring and processing time series data from different periods so that machine learning can complete model training is also a difficult problem. Thus, according to the above analysis, this paper proposes a new production and operation mode suitable for the ACT. Then, machine learning is used to enhance the DT application framework to optimize the production control of the ACT. Besides, the framework is applied to an actual automated terminal to verify its availability.

DT Theoretical Reference Model for ACT Production Control.
Under the current terminal operation mode, the data generated by the system usually act directly on the field as information flow. They are not used to update the virtual terminal iteratively to adapt to the dynamic operation environment. Moreover, traditional terminal production control methods are generally based on expert experience. In contrast, the DT-based terminal operation mode is an iterative interactive process between physical and virtual scenes. In the DT framework, the real-time data generated in the terminal operation are continuously transmitted to the virtual model through the IIoT technology. Real-time and historical data are used to train and update the model. Feedback is provided to the control system to achieve real-time control of production. In this section, the DT theoretical reference model for the operation of ACTs is discussed.
Through the exploration of the application practice of DT, the five-dimensional DT model was put forward [9] and verified to have satisfactory practicability [42]. It also provides a general reference model for the application of DT in different fields [43]. Thus, to address the aforementioned challenges of the ACT, a DT reference model for the operation optimization of the ACT, DT-TPOS, is proposed (see Figure 1). It can be expressed as a five-tuple, as shown in Equation 1.
where TPS, TVS, TSS, TDD, and T_C denote the terminal physical space, the terminal virtual space, the terminal service system, the terminal twin data, and the connection of different elements, respectively, and ::= means "is defined as". TPS is the foundation of the DT-TPOS model and provides the basic production environment for the operation of the terminal. TVS is the mapping of TPS, and it can also monitor and control TPS. TSS is the core component of DT-TPOS, which delivers the functions and application services of the various data, models, and algorithms during the operation of the ACT. In addition, it drives TVS to stay synchronized with TPS. TDD is the key driver of DT-TPOS, which can effectively solve the problem of information islands in ACT operation. T_C is not only the key link in constructing DT-TPOS but also the key to ensuring the continuous iterative optimization of the operation process. Interfaces (e.g., RPC, JDBC) and protocols (e.g., OPC, HTTP RESTful) are defined to support the connections among the DT-TPOS elements.
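Reading the element definitions above back into a single expression, the five-tuple of Equation 1 can be written as the following sketch. It is reconstructed from the definitions in the text rather than copied from the original equation, so the ordering of the elements and the typesetting are assumptions.

```latex
\begin{equation}
  \mathrm{DT\text{-}TPOS} ::= \left( \mathrm{TPS},\ \mathrm{TVS},\ \mathrm{TSS},\ \mathrm{TDD},\ \mathrm{T\_C} \right)
  \tag{1}
\end{equation}
```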

An Integrated DT Application Framework of ACT Operation.
Based on the DT theoretical reference model proposed in Section 3.1, an integrated DT application framework with AdaBoost for ACT operation optimization is discussed in this section. It consists of three parts: physical space, data service platform, and virtual space. During ACT operation, the application of DT realizes the interconnection between the virtual world and the real world. Furthermore, it provides a practical solution for whole-system physical mapping, whole-parameter dynamic modeling of the physical entities, and real-time iterative optimization of the operation process. Therefore, through the iterative optimization of the whole process and the interaction between the virtual and the real, intelligent decision-making in the ACT operation process is realized. The integrated DT application framework with AdaBoost for ACT operation optimization is shown in Figure 2. The construction process of the integrated framework for ACT production control is as follows. First, according to the layout, production factors, and technological process of the ACT, the corresponding virtual model is constructed. After that, based on the historical data of the terminal operation system, the virtual model is trained by machine learning. The virtual model is evaluated, verified, and optimized against a series of evaluation indexes. Then, the real-time data are used to drive the synchronization between the virtual and the real. Combining the input information of production demand with the real-time data of field operation, the optimal solution is simulated in the virtual model. The solution is fed back to the terminal operation system to guide the field operation. Finally, a circular interaction between the virtual and the real is formed. To adapt to the dynamic changes of the terminal production environment, the virtual model is optimized by real-time iteration on the basis of constantly updated data.

Physical Space.
Physical space refers to the collection of existing physical entities of the ACT. It not only ensures the basic operation of the ACT but also provides data on all elements for the virtual space. As shown at the bottom of Figure 2, the physical space is composed of physical entities, such as containers, handling equipment, and the environment, which provide a complex and dynamic production environment for ACT operation. These physical entities are distributed in different locations of the ACT and connected through the IIoT technology.
The formal description of physical space is shown in Equations (2)-(5), where PS is the set of all physical entities, PC is the set of containers, PE is the set of handling devices, PG is the set of scenarios and environments, and PN is the set of intelligent gateways. ⋈ denotes the natural connection between PC, PE, PG, and PN, indicating the autonomous interaction between them. n is the number of containers, and PC_i is the ith container. C_ID is the container number. R_ID is the unique code of the RFID tag bound to the container. C_a is the set of container attributes, which includes weight, type, length, width, and height. C_SD is the place of departure or destination of the container. C_pos is the current storage location of the container. C_t is the date of departure or arrival of the container. C_info is other information about the container. m is the number of handling devices, and PE_i is the ith handling device. E_ID is the set of device numbers. E_type is the set of device types, including handling and transportation devices. E_cp is the key parameter set of the device. E_cs is the current status of the device, such as idle, run, and fault. E_pos is the position of the device. Da_s is the set of real-time data in device operation. E_info is other information about the device.
The information perception of physical space is the key to establishing one-to-one mapping between virtual and physical entities. In this study, the data sensing methods for multisource data in the ACT are divided into three categories: manual static data sensing, sensor-based data sensing, and RFID-based data sensing. Furthermore, different interfaces (such as RS232, RFID, MODBUS) and communication protocols (such as TCP/IP, OPC, CAN) are defined for ACT multisource data acquisition and transmission. Multisource data acquisition and transmission are performed mainly through three modules: the control and execution module, the perception module, and the network module.
The control and execution module is composed of PLCs, servers, workers, mechanical controls, and display terminals, and it controls the operation of the devices. The perception module senses devices, personnel, and the environment during operation, and it collects the total-factor data of ACT operation. The network module is composed of RFID communication, intelligent gateway communication, and the wireless sensor network, and it transmits the perceived data upward. Considering the differences between devices, some devices can transmit data directly through the network module.
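The container tuple PC_i and device tuple PE_i defined above map naturally onto record types. The following is a minimal sketch, assuming Python dataclasses; the field names, the `DeviceStatus` enum, and the `idle_devices` helper are illustrative choices, not part of the original framework.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeviceStatus(Enum):
    """E_cs: current status of a handling device."""
    IDLE = "idle"
    RUN = "run"
    FAULT = "fault"


@dataclass
class Container:
    """One element PC_i of the container set PC."""
    c_id: str          # C_ID: container number
    r_id: str          # R_ID: code of the RFID tag bound to the container
    attributes: dict   # C_a: weight, type, length, width, height
    origin_dest: str   # C_SD: place of departure or destination
    position: str      # C_pos: current storage location
    date: str          # C_t: date of departure or arrival
    info: dict = field(default_factory=dict)  # C_info: other information


@dataclass
class Device:
    """One element PE_i of the handling-device set PE."""
    e_id: str              # E_ID: device number
    e_type: str            # E_type: e.g. "AQC", "AGV", "AYC"
    key_params: dict       # E_cp: key parameters of the device
    status: DeviceStatus   # E_cs: idle, run, or fault
    position: tuple        # E_pos: device position
    realtime_data: list = field(default_factory=list)  # Da_s
    info: dict = field(default_factory=dict)           # E_info


def idle_devices(devices, e_type):
    """Return the devices of the given type that are currently idle."""
    return [d for d in devices
            if d.e_type == e_type and d.status is DeviceStatus.IDLE]
```

A scheduler in the virtual space could call `idle_devices(fleet, "AGV")` to find the vehicles available for the next transport task.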

Data Service Platform.
The data service platform is the medium connecting physical space and virtual space during the operation of the ACT. Through the interaction of operation data in the data service platform, the bidirectional mapping and interconnection between physical and virtual space can be realized. As shown in the middle of Figure 2, the data service platform mainly includes data processing, data mapping, and data storage.
(1) Data Processing Module. The original data obtained by the IIoT and wireless sensor technology are mostly time series data. Fast processing of time series data ensures the synchronization between physical and virtual space. For machine learning to complete the model training successfully, the data need to be cleaned, resampled, analyzed for correlation, and reduced in dimension.
(1) Data cleaning. Data cleaning plays an important role in time series data processing. Its purpose is to filter out duplicate or redundant data, supplement missing data, correct or delete wrong data in the original data, and finally produce data that can be used further. In this paper, the data set includes the container number, AGV number, AGV position, AGV residual power, AGV speed, AGV current status, AGV engine current and voltage, AGV transportation time, AQC number, AQC position, AQC completion time, AQC placement speed, AQC engine current and voltage, AYC number, AYC position, AYC completion time, AYC engine voltage and current, and so on. First, the data set is preprocessed, which includes backing up the data, unifying the data format of each column, and deleting redundant empty rows. Second, the missing values of the data set are filled: the preprocessed data set is supplemented by the Lagrange interpolation method in Equation (6). Finally, the abnormal data are processed. Depending on the degree of abnormality, they are deleted or replaced with the average value.
$$f(x) = \sum_{i=1}^{n} y_i \prod_{j \neq i} \frac{x - x_j}{x_i - x_j} \qquad (6)$$
where f(x) is the fill value obtained, x_i is the position of the corresponding independent variable, y_i is the value of the corresponding function at this position, $\prod_{j \neq i} (x - x_j)/(x_i - x_j)$ is the Lagrange basis polynomial, and n is the number of basis polynomials.
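The Lagrange filling step can be sketched as follows. This is a minimal illustration, assuming that missing samples are marked with `None` and that a small window of known neighbours on each side is used as interpolation nodes; the `window` parameter and both function names are hypothetical.

```python
def lagrange_fill(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    if len(xs) != len(ys) or len(set(xs)) != len(xs):
        raise ValueError("xs and ys must match in length and xs must be distinct")
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0  # Lagrange basis polynomial for node i, evaluated at x
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def fill_missing(series, window=2):
    """Replace None entries using up to `window` known neighbours per side."""
    filled = list(series)
    for k, value in enumerate(series):
        if value is not None:
            continue
        left = [i for i in range(k - 1, -1, -1) if series[i] is not None][:window]
        right = [i for i in range(k + 1, len(series)) if series[i] is not None][:window]
        nodes = sorted(left + right)
        if nodes:  # leave the gap untouched if no neighbour is known
            filled[k] = lagrange_fill(nodes, [series[i] for i in nodes], k)
    return filled
```

Keeping the window small limits the polynomial degree, which avoids the oscillation that high-degree interpolation can show on noisy sensor data.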
(2) Data resampling. In the process of ACT operation, different types of time series data are sampled at different frequencies, but these data are continuous. Thus, data resampling is needed to unify the frequency of the time series data, that is, to transform time series data from one frequency to another. Generally, one data dimension is selected as a benchmark, and the data of the other dimensions are made consistent with the benchmark. An example of data resampling is shown in Equations (7) and (8). First, assume that two sets of data, X_1 and X_2, are collected at frequencies F_1 and F_2, respectively. Second, taking X_1 as the benchmark, the new data dimension X*_2 is generated from X_2 and is consistent with the sampling frequency of X_1. k and i are arbitrary positions in the X_1 and X_2 data sets.
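One way to realize this resampling is sketched below. Equations (7) and (8) are not reproduced here, so linear interpolation between neighbouring samples is an assumed rule, and the function name and argument layout are illustrative only.

```python
from bisect import bisect_left


def resample_to_benchmark(t_bench, t_src, x_src):
    """Generate X*_2 by evaluating (t_src, x_src) at the benchmark timestamps.

    t_src must be strictly increasing. Values between two source samples are
    linearly interpolated; outside the source range the nearest sample is held.
    """
    resampled = []
    for t in t_bench:
        pos = bisect_left(t_src, t)
        if pos == 0:
            resampled.append(x_src[0])
        elif pos == len(t_src):
            resampled.append(x_src[-1])
        else:
            t0, t1 = t_src[pos - 1], t_src[pos]
            w = (t - t0) / (t1 - t0)  # relative position between the samples
            resampled.append(x_src[pos - 1] + w * (x_src[pos] - x_src[pos - 1]))
    return resampled
```

For example, a 1 Hz AGV speed signal can be aligned to a 2 Hz position signal by passing the position timestamps as `t_bench`.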
(3) Data correlation analysis and dimension reduction. Data volume and data dimension should be considered when using machine learning to train the model. In most cases, a large volume of data makes machine learning run slowly. Moreover, an excessive data dimension easily leads to the curse of dimensionality. Thus, a data dimension reduction method is adopted. Generally, there are two ways to reduce the dimension of data. The first is principal component analysis (PCA), which extracts features at the cost of destroying the original structure of the data. The second is data correlation analysis, in which attributes of the data are selected by certain rules to achieve dimension reduction. In practical engineering problems, the collected data themselves have important physical significance and research value, and extracting principal features would destroy the information in the original data. Therefore, this study adopts the latter. Pearson correlation analysis is used to measure the correlation between two sets of data, as shown in Equation (9).
where n is the number of data points in the data set and the value of ρ_{X_1 X_2} lies between −1 and 1. The larger the absolute value, the stronger the correlation.
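The correlation-based attribute selection can be sketched as follows. The Pearson coefficient is computed in its standard form; the `select_features` helper and its threshold of 0.5 are hypothetical and only illustrate how weakly correlated attributes might be dropped.

```python
from math import sqrt


def pearson(x1, x2):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x1)
    if n != len(x2) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    m1, m2 = sum(x1) / n, sum(x2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1 = sqrt(sum((a - m1) ** 2 for a in x1))
    s2 = sqrt(sum((b - m2) ** 2 for b in x2))
    if s1 == 0 or s2 == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (s1 * s2)


def select_features(columns, target, threshold=0.5):
    """Keep the attributes whose |rho| with the target reaches the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) >= threshold]
```

Because the selected attributes are original columns rather than projections, their physical meaning is preserved, which is the reason the text gives for preferring this approach over PCA.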
(2) Data Storage and Data Mapping Module. The data storage module stores the processed data and provides a persistent data service for the virtual space. The stored data are mainly physical data and virtual data. Physical data mainly include task data, equipment data, container data, and so on. Virtual data mainly include simulation data, model data, and decision data. Relational, nonrelational, and temporal databases are used to store the various types of data. This provides reliable and reusable data resources for ACT operation analysis and decision-making. Data mapping establishes the mapping mechanism between data through data structure information. It supports the synchronous mapping between physical data and virtual operation, and it includes data timing analysis, data association, and data synchronization [8]. First, the temporal data model is built from the characteristics of the data. Second, fast indexing and multiscale transformation of the data sequence set are used to complete data sequence analysis, and complex network and related algorithms are used to realize data association. Finally, through the relationships between data and the virtual-real association rules, the ACT data network is established to complete the running state analysis.

Virtual Space.
Virtual space is mainly composed of the twin space and the service system, as shown in the upper part of Figure 2. The DT service system mainly provides scheduling scheme generation, monitoring of operation equipment and process status, production information statistics, and other services for ACT operation. The twin space provides a virtual operation environment for the ACT. The construction of the twin space mainly includes total-factor entity modeling, dynamic modeling of the operation process, and simulation modeling of the ACT. The operation plan generated by the service system is verified in the twin space. The twin space feeds the verification results back to the service system, which adjusts the operation plan accordingly. The adjusted operation plan is sent to the twin space for verification again until the set optimization goal is met. This iterative interaction between the twin space and the service system ensures the continuous operation of the terminal. The twin space plays an important role in virtual space. To reproduce the scene more faithfully, the total-factor entity modeling and the dynamic modeling of the operation process for the ACT are described below.
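The verify-and-adjust loop between the service system and the twin space can be sketched as below. This is only an illustration of the control flow: `simulate` stands in for verification in the twin space and `adjust` for replanning in the service system, and both toy functions, the plan layout, and the `max_rounds` cap are assumptions.

```python
def verify_until_target(plan, simulate, adjust, target, max_rounds=20):
    """Iterate plan -> twin-space verification -> adjustment until the goal is met.

    Returns the final plan, its simulated result, and the number of adjustments.
    """
    for round_no in range(max_rounds):
        result = simulate(plan)          # twin space verifies the plan
        if result <= target:             # optimization goal reached
            return plan, result, round_no
        plan = adjust(plan, result)      # service system revises the plan
    return plan, simulate(plan), max_rounds


def toy_simulate(plan):
    """Hypothetical twin-space model: completion time shrinks with more AGVs."""
    return 120.0 / plan["agvs"]


def toy_adjust(plan, result):
    """Hypothetical service-system rule: dispatch one more AGV."""
    return {**plan, "agvs": plan["agvs"] + 1}
```

The `max_rounds` cap is a practical safeguard so that the loop terminates even if the target cannot be met with the available resources.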
(1) Total Factor Entity Modeling of ACT Operation. The multidimensional twin model of the mechanical devices, containers, and other production factors of the ACT is constructed along the geometry, physics, behavior, and rule dimensions. The geometric model is constructed in compliance with the geometric feature parameters of the elements. The physical model integrates the deep physical characteristics of the device on the basis of the geometric model. The behavior model describes the behavior of the device according to the behavior relationships among its components. The rule model uses XML to describe the deduction and association rules of the device. Together, these models realize the multidimensional modeling of elements. The multidimensional twin model of the AGV is shown in Figure 3. The multidimensional twin model of each element is assembled and integrated according to the spatial layout and equipment connection relationships in the field operation. In this way, the multidimensional and multiscale model is constructed from parts to equipment to the whole working environment. In this study, 3D Max and Unity 3D are used to complete the rapid construction of the multidimensional and multiscale model, and the formal description is shown in Equations (10)-(14).
where PC′, PE′, PG′, and PN′ represent the DT multidimensional models corresponding to each production element. ⋈ refers to the natural connection between PC′, PE′, PG′, and PN′. PC_i′ represents the virtual entity corresponding to the ith container. PE_i′ represents the virtual entity corresponding to the ith handling device. ::= means "is defined as". G m S is the set of geometric models. P m S is the set of physical models. B m S is the set of behavior models. R m S is the set of rule models. ↔ 1:1 represents the one-to-one mapping relationship. More importantly, the virtual-real consistency of the DT model must be verified against the virtual-real consistency verification rules proposed by Tao et al. [44]. This ensures the validity, correctness, and accuracy of the model (see Figure 4).
(2) Dynamic DT Modeling of Operation Process for ACT. In the process of ACT operation, many factors affect the operation efficiency, and the operation states of the AQC, AGV, and AYC are coupled. It is therefore difficult to build an accurate dynamic mechanism model of the operation. For this reason, a DT-based dynamic mechanism model is proposed to respond accurately to the dynamic changes in ACT operation (see Figure 5). Machine learning is a method of learning from data. Through data training, the algorithm model is constructed, and goal optimization is completed based on the model. DT can provide powerful data support for production process analysis. Thus, machine learning is used to train the DT dynamic model. Before the machine learning stage, data acquisition, preprocessing, and feature engineering should be completed. These data operations are described in detail in the previous section on the data service platform. Next, the training and verification, optimization, and deployment of the model are described.

(1) Model Training and Verification
Before the DT model is trained and verified, the data are split. The processed data set is divided into a training set and a validation set, with the training set accounting for 80% of the total data set. The training set is used to build the prediction model. The trained model then makes predictions on the validation set, and the model is optimized according to the prediction results. The training goal is to build the mathematical relationship from the existing data and algorithm, as shown in Equation (15).
where Y_t is the control target, X_t is a real-time controllable variable, and ω is a real-time uncontrollable variable. The scenario of synchronized loading and unloading is considered. Assume that the number of container tasks to be loaded and unloaded is N, the time to complete container i is C_i, and its task start time is S_i. V_j refers to any scheduling scheme generated in ACT operation. The completion time of the loading and unloading operation is shown in Equation (16). To minimize the completion time, the machine learning goal for ACT operation optimization can be described by Equation (17).
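As a hedged illustration of this objective, the sketch below computes a completion time for each candidate scheduling scheme and selects the scheme with the smallest value. It assumes that the completion time of a scheme is the end time of the last task minus the start time of the first task; the task layout, the scheme names `V1` and `V2`, and all numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (start time S_i, time to complete C_i), both in seconds.
Task = Tuple[float, float]


def makespan(tasks: List[Task]) -> float:
    """End time of the last task minus the start time of the first task."""
    first_start = min(start for start, _ in tasks)
    last_end = max(start + duration for start, duration in tasks)
    return last_end - first_start


def best_scheme(schemes: Dict[str, List[Task]]) -> str:
    """Return the name of the scheduling scheme with the smallest makespan."""
    return min(schemes, key=lambda name: makespan(schemes[name]))


# Two hypothetical scheduling schemes for the same three container tasks.
schemes = {
    "V1": [(0.0, 120.0), (30.0, 150.0), (90.0, 100.0)],
    "V2": [(0.0, 120.0), (20.0, 110.0), (60.0, 100.0)],
}
chosen = best_scheme(schemes)
```

In the DT framework the completion time is predicted by the trained model rather than enumerated directly; the enumeration here only makes the minimization target concrete.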
In the process of training, machine learning algorithms such as decision tree, XGBoost, AdaBoost, and random forest are used. The decision tree is a machine learning algorithm that can solve classification or regression problems. Training a decision tree splits the samples into child nodes by repeatedly selecting split attributes. The goal of each split is to minimize the gain value of the node, which is calculated as shown in Equation (18).
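The split selection can be sketched as follows, under the assumption that the gain value is the standard least-squares criterion for regression trees: the sum of squared deviations from the mean in each of the two branches. The single feature, the function names, and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple


def squared_error(values: List[float]) -> float:
    """Sum of squared deviations of the values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (threshold, cost) of the split with the lowest total squared error.

    Samples with x <= threshold go to the left branch, the rest to the right.
    Returns None when x holds a single distinct value and cannot be split.
    """
    best: Optional[Tuple[float, float]] = None
    for threshold in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        cost = squared_error(left) + squared_error(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical data: one feature (e.g. a speed level) and a target time.
speeds = [1.0, 2.0, 3.0, 4.0]
times = [10.0, 10.0, 20.0, 20.0]
split = best_split(speeds, times)
```

A full tree applies this search recursively to each branch and over all features; only one level is shown here.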
where R_1 and R_2 represent the two branches produced by splitting a node, and c_1 and c_2 are the values returned by the two child nodes. XGBoost is a typical boosting algorithm. It applies a second-order Taylor expansion to the loss function and adds a regularization term to the objective function to find the overall optimal solution, as shown in Equation (19). This balances the objective function against the complexity of the model and helps to avoid overfitting.
where Ω(f_t) is the model complexity. AdaBoost is an iterative algorithm and a representative boosting algorithm. The weights α_i are updated according to the errors in each iteration. The main idea of the algorithm is to add a new weak classifier h_i(x) in each round until a predetermined, sufficiently small error rate is reached, as shown in Equation (20), where M is the number of weak classifiers. Random forest, in contrast, is a typical bagging algorithm based on decision trees. It uses bagging to train a number of small decision trees and then aggregates and averages their predictions to build the forest model.
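The boosting loop can be made concrete with a minimal sketch of the classic discrete AdaBoost for binary labels in {-1, +1}, using one-dimensional decision stumps as weak classifiers. This is an assumption made for illustration: the DT model in this paper predicts a continuous completion time, so the deployed variant would be a regression form of AdaBoost. All names and data below are hypothetical.

```python
import math
from typing import List, Tuple

Stump = Tuple[float, int]  # (threshold, polarity)


def stump_predict(stump: Stump, x: float) -> int:
    """Weak classifier h(x): polarity if x <= threshold, otherwise -polarity."""
    threshold, polarity = stump
    return polarity if x <= threshold else -polarity


def fit_stump(x: List[float], y: List[int], w: List[float]) -> Tuple[Stump, float]:
    """Pick the stump with the lowest weighted training error."""
    best, best_err = (x[0], 1), float("inf")
    for threshold in sorted(set(x)):
        for polarity in (1, -1):
            stump = (threshold, polarity)
            err = sum(wi for xi, yi, wi in zip(x, y, w)
                      if stump_predict(stump, xi) != yi)
            if err < best_err:
                best, best_err = stump, err
    return best, best_err


def predict(ensemble: List[Tuple[float, Stump]], x: float) -> int:
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


def adaboost(x: List[float], y: List[int], max_rounds: int = 10,
             target_error: float = 0.0) -> List[Tuple[float, Stump]]:
    """Add one weak classifier per round until the training error is low enough."""
    n = len(x)
    w = [1.0 / n] * n  # start from uniform sample weights
    ensemble: List[Tuple[float, Stump]] = []
    for _ in range(max_rounds):
        stump, err = fit_stump(x, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(x, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
        errors = sum(predict(ensemble, xi) != yi for xi, yi in zip(x, y))
        if errors / n <= target_error:
            break
    return ensemble


# Hypothetical data that no single stump can classify correctly.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_train = [1, 1, -1, -1, 1, 1]
model = adaboost(x_train, y_train)
```

No single threshold separates this data, but the weighted vote of several stumps does, which is the effect that reweighting the misclassified samples is meant to produce.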
The verification set and evaluation indexes are used to verify the different models and the training results of the different algorithms.
The accuracy score (A_c), root mean square error (RMSE), interpretable variance (Ivar), and fitting error (FE) are used to evaluate model quality. The calculation of model accuracy is shown in Equation (21), where ŷ_i and y_i are the predicted value and real value, respectively.
RMSE is the square root of the mean of the squared errors between the predicted values and the true values, as shown in Equation (22). The smaller the value, the smaller the error. A minimum value of 0 indicates a perfect model.
The Ivar index measures how closely the dispersion of the differences between predicted and real values matches the dispersion of the sample itself, as shown in Equation (23). The maximum value is 1. The larger the value, the closer the dispersion of the predictions is to that of the sample values.
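Two of these indexes can be sketched under the assumption that they follow the standard definitions: RMSE as the square root of the mean squared error, and Ivar as the explained variance score, that is, one minus the ratio of the residual variance to the variance of the real values. A_c and FE are omitted because their exact forms are not restated here. The function names and sample values are hypothetical.

```python
import math
from typing import List


def rmse(y_true: List[float], y_pred: List[float]) -> float:
    """Square root of the mean squared difference between real and predicted."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def variance(values: List[float]) -> float:
    """Population variance of the values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true: List[float], y_pred: List[float]) -> float:
    """1 - Var(y - y_hat) / Var(y); a value of 1 is the best possible score."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


# Hypothetical completion times in seconds: real values and model predictions.
real = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 110.0, 150.0, 150.0]
error = rmse(real, predicted)
ivar = explained_variance(real, predicted)
```

A perfect prediction gives an RMSE of 0 and an explained variance of 1, which matches the limiting values stated above.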
The calculation of FE is shown in Equation (24). Its true value and absolute value can be plotted as curves to compare the efficiency of different algorithms intuitively.
(2) Model Optimization and Deployment
Safety has always been the primary concern in ACTs. Before the model is put into production, real-time data must be used to test the model at the test site and verify its effectiveness and safety. The model is then optimized using real-world test results and feedback from the relevant technical departments. The optimized DT model can be deployed online. The DT model needs to be connected with the IIoT and TOS systems in the ACT to obtain real-time data in full. In turn, the DT model feeds its running results back to the system so that the system can correct improper settings in time. Then, the control command is sent to the control and execution module to realize the operation optimization of the ACT.

Background.
The Shanghai Yangshan phase IV automated terminal is located on the west side of the Yangshan deep water port and opened for trial operations in December 2017. It is the largest single container terminal with the highest degree of comprehensive automation in the world. At present, 21 AQCs, 108 AYCs, and 110 AGVs have been put into operation. The terminal handling operation mainly involves automated equipment for terminal handling, horizontal transportation, and yard handling, together with an automated terminal production control system. The system structure and operation process of the terminal are shown in Figure 6. With the development of artificial intelligence, automation, and other technologies, the ACT has obvious advantages in improving handling efficiency and production safety. To further improve the productivity and operation safety of the terminal, the Shanghai Port launched a terminal intelligent operation management and control project, which uses big data, machine learning, and other technologies to develop the DT system.

Parameter Setting.
Container loading and unloading is the core of terminal operation. For the loading and unloading experiment, the ship demand, the quantity of equipment required, the number of tasks, and the distribution of the container area are set as parameters (see Table 1). Before the experiment starts, the batteries should be fully charged and the tire pressure should be normal. In addition, power consumption and fault conditions under continuous operation are taken into account.

Model Training.
The target of DT model training is to minimize the makespan. The speed of the handling devices, the waiting time between devices, the failure rate, and other factors all change the target value. After the historical data of terminal operation are obtained, the missing data are supplemented using Equation (6). Then, among the many indicators that affect the completion time, the AGV running speed is chosen as the benchmark for the data sampling frequency. The data sampling interval is F_1 = 20 s. The other indexes are resampled using Equation (8). Finally, the correlation between indicators is analyzed through Equation (9).
Based on historical data, the DT model is trained by the AdaBoost, random forest, and XGBoost algorithms. To make the experiment more convincing, six widely spaced time points were randomly selected as reference points for training. The time interval is 30 days, and each algorithm is tested 5 times. In addition, the data for the 75 days before each time point are used as the training set, and the data for the 15 days after it are used as the verification set to compare the differences between the algorithms. The FE curve (see Figure 7) shows intuitively that the AdaBoost algorithm is superior to the random forest and XGBoost algorithms. Equations (21)-(23) are used for further comparative analysis to evaluate the model comprehensively (see Table 2). The results in Table 2 show that, in terms of A_c, RMSE, and Ivar, the model trained by the AdaBoost algorithm is better than the models trained by the other algorithms. Thus, the AdaBoost algorithm is chosen to train the DT model.
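The windowing scheme can be sketched as follows. The sketch assumes that the six reference points are spaced exactly 30 days apart and that each window is half-open, so the reference day itself falls into the verification window; the start date and the function names are hypothetical.

```python
from datetime import date, timedelta
from typing import List, Tuple

Window = Tuple[date, date]  # half-open interval [start, end)


def reference_points(first: date, count: int = 6,
                     spacing_days: int = 30) -> List[date]:
    """Reference time points, assumed to be evenly spaced 30 days apart."""
    return [first + timedelta(days=i * spacing_days) for i in range(count)]


def windows_for(reference: date, train_days: int = 75,
                validation_days: int = 15) -> Tuple[Window, Window]:
    """Training window before the reference point, verification window after."""
    train = (reference - timedelta(days=train_days), reference)
    validation = (reference, reference + timedelta(days=validation_days))
    return train, validation


# Hypothetical first reference point.
points = reference_points(date(2020, 4, 1))
train_window, validation_window = windows_for(points[0])
```

Each algorithm would then be trained on the records inside `train_window` and scored on the records inside `validation_window`, repeated for every reference point.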

Results and Discussion.
Through the verification, optimization, and deployment of the DT model described in the previous section, the DT-based experimental platform is built (see Figure 8). Because the system in actual use is classified, the pictures shown in this paper are screenshots of the simulation process on a laboratory computer. Driven by the real-time data of the physical space, the DT model is optimized by machine learning to keep the physical and virtual spaces running synchronously. Through the iterative interaction between the service system and the twin space, real-time management and control of the terminal operation process are realized. For synchronized loading and unloading, the goal of machine learning is set to minimize the completion time. Two groups of tests, small-scale and large-scale, have been carried out on the conventional and DT-based terminal operation modes. A problem with between 5 and 30 tasks is defined as small-scale, and a problem with between 30 and 200 tasks is defined as large-scale. The data from routine ACT operations are recorded during the actual tests according to the number of tasks. The DT-based ACT process completes the test in the same scenario and records the relevant data. The performance evaluation indicators of the DT-based ACT operation process are then analyzed against those of the conventional ACT operation process.
Among these indicators, the operation time of AQCs refers to the time taken by an AQC to complete a task multiplied by the average number of tasks for each AQC. The turnaround time of an AGV refers to the task completion time of the AGV minus the time at which it received the task. The operation time of AYC is the time taken by an AYC to complete a task. System time consumption refers to the time taken by the system to issue instructions during operation. The makespan is the time taken to complete the last task, that is, the difference between the end time of the last task and the start time of the first task.
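Two of these indicator definitions translate directly into arithmetic, as the following minimal sketch shows. The record layout, the function names, and all numbers are hypothetical and serve only to make the definitions concrete.

```python
from dataclasses import dataclass


@dataclass
class AgvTask:
    """Hypothetical AGV task record; times in seconds since the test started."""
    received_at: float
    completed_at: float


def agv_turnaround(task: AgvTask) -> float:
    """Task completion time of the AGV minus the time the task was received."""
    return task.completed_at - task.received_at


def aqc_operation_time(time_per_task: float, total_tasks: int,
                       aqc_count: int) -> float:
    """Time per task multiplied by the average number of tasks per AQC."""
    return time_per_task * (total_tasks / aqc_count)


# Hypothetical values: 20 tasks shared by 4 AQCs, 90 s per AQC task.
turnaround = agv_turnaround(AgvTask(received_at=30.0, completed_at=210.0))
aqc_time = aqc_operation_time(time_per_task=90.0, total_tasks=20, aqc_count=4)
```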

Test Results for Small-Scale Problems.
Handling operations with 20 tasks were tested under two modes: conventional operation and DT-based operation. The test cases are evaluated by the performance indicators of the AQCs, AGVs, and AYCs. Ten tests were carried out at this scale, and the evaluation results are shown in Table 3. The performance indicator values for conventional terminal operation come from real operation data and from calculations on those data. The DT-based terminal operation continuously optimizes and updates the DT model by sensing the data of the physical space in real time. Driven by the real-time data, on-site operation is adjusted to deal with congestion and waiting. It can be seen from Table 3 that the task completion time of the DT-based terminal operation is 23.34% less than that of conventional terminal operation, so the operation efficiency is clearly improved. However, the DT-based terminal operation involves an iterative interaction process, which is more time-consuming than the conventional operation mode. According to the statistics, the system time consumption is 51.47% higher than that of the conventional terminal operation system in small-scale problems. Moreover, the twin space provides feedback to the system based on the location and performance of the devices. Using this feedback, the system assigns each task to the nearest device to reduce the waiting time of the operation.
The operation time of AQCs and AYCs and the turnaround time of AGVs in DT-based terminal operation are also reduced, by 23.52%, 21.53%, and 24.76%, respectively, relative to the conventional operation mode. The operational efficiency of AQCs has also been improved. In terms of the failure rate, the number of failures is greatly reduced compared with the conventional mode because of the predictive capability of DT. During DT-based terminal operation, the system predicts the possible failure time of the devices from the motor power, tire pressure, and other relevant performance data. During the predicted period, the use of these devices is reduced to lower their failure rate.
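The percentage comparisons between the two modes follow the usual relative-change arithmetic, sketched below. The absolute times used here are hypothetical values chosen only to reproduce the reported 23.34% and 51.47% figures; they are not measurements from Table 3.

```python
def percent_reduction(conventional: float, dt_based: float) -> float:
    """How much smaller the DT-based value is, as a percentage of conventional."""
    return (conventional - dt_based) / conventional * 100.0


def percent_increase(conventional: float, dt_based: float) -> float:
    """How much larger the DT-based value is, as a percentage of conventional."""
    return (dt_based - conventional) / conventional * 100.0


# Hypothetical completion times (s) and system times (s) for the two modes.
completion_gain = percent_reduction(conventional=1000.0, dt_based=766.6)
system_overhead = percent_increase(conventional=10.0, dt_based=15.147)
```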

Test Results for Large-Scale Problems.
The two modes were also tested on a large-scale problem with 120 tasks (see Table 4). It can be seen from Table 4 that the DT-based terminal operation mode is still superior to the conventional terminal operation mode. Compared with small-scale operation, the improvement in operation efficiency of the DT-based operation mode rises from 23.43% to 31.46%. In addition, the extra system time consumption relative to the conventional mode falls from 51.47% to 38.55%. The main reason is that the DT-based terminal operation mode uses machine learning to continuously learn the states in the operation process and update the DT model. This makes the operation mode better suited to the actual production situation. Although the system time consumption remains inferior to that of the conventional terminal operation mode, the gap between the two modes is narrowing. It is expected that, after repeated iterations, the DT-based terminal operation mode will outperform the conventional operation mode in terms of system time consumption.
To sum up, the test results at both scales show that, with the goal of machine learning set to the minimum completion time, the DT-based terminal operation mode can effectively optimize production control under the same production environment. The continuous training and optimization of the DT model also helps to ensure the operation safety of the terminal. The DT application framework and system in this paper have theoretical and practical significance for realizing rapid and efficient production control of the ACT. At the same time, the application of DT in the ACT offers guidance for the port and transportation industry.

Conclusions
This study proposed a framework for safe operation optimization at ACTs that integrates DT with the AdaBoost algorithm. To provide a more complete data set for training the DT model, we employed data cleaning, data resampling, and correlation analysis to deal with missing values and inconsistent sampling frequencies in the original time series data. After that, a multidimensional and multiscale DT model was constructed. The AdaBoost, random forest, and XGBoost algorithms were used to train the DT dynamic mechanism model to deal with uncertainty in the operation. More specifically, the quality of the model was evaluated comprehensively by A_c, RMSE, Ivar, and FE to determine which algorithm to use for model training. Then, the trained model was tested with real-time data of terminal operation, and the model was optimized and deployed online according to the results. Based on the proposed framework, the DT system was developed in the ACT. Experiments at different scales were performed to demonstrate the integrated application of DT and the AdaBoost algorithm at the ACT. The experimental results show that the DT-based terminal operation mode has higher loading and unloading efficiency than the conventional terminal operation mode.
In the future, this research can be extended in the following directions. First, the construction of a high-fidelity virtual space model deserves further research. Second, the construction of complex DT dynamic mechanism models and the selection of model evaluation indexes are worthy of further study. Finally, the application of an integrated DT framework and modeling method to AGV scheduling will also be an important research direction. Based on DT, a new generation of AGV scheduling twin environment that combines the virtual and the real with intelligent decision-making can be established to solve the complex AGV scheduling problem with multiresource integration and strong dynamic real-time requirements.
Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.