Nonlinear Model Predictive Control for Heavy-Duty Hybrid Electric Vehicles Using Random Power Prediction Method

A primary challenge in the implementation of hybrid electric vehicles (HEVs) is the design of the energy management strategy. Most conventional strategies have been designed for passenger vehicles using rule-based or optimization-based control that relies on navigation support; therefore, they cannot achieve optimal performance in heavy-duty HEVs that lack navigation support. In this study, we propose a nonlinear model predictive control (NMPC) strategy for heavy-duty HEVs based on a random power prediction method. To obtain models of the multiple power sources, we analyzed the structure and powertrain of the vehicle using mathematical modeling methods. To account for the lack of navigation support, we used a data-driven prediction method that combines the grey model and Markov chain methods to obtain higher-accuracy ultra-short-term power prediction. Considering the predicted disturbance power, we established a multi-objective optimization function with explicit constraints to optimize fuel consumption, bus voltage, and battery state of charge. Under these constraints, the NMPC problem is posed as a nonlinear programming problem whose optimal solution is found numerically in real time. We validated the control strategy on a hardware-in-the-loop simulation platform and compared its results with those obtained using thermostat control, fuzzy control, and dynamic programming approaches. The proposed control strategy achieved considerably better all-round performance than the rule-based control strategies; moreover, its results were close to those of the offline global optimization strategy. Furthermore, the proposed method achieved excellent real-time operation capability, thereby providing a valuable reference for practical engineering applications.


I. INTRODUCTION
Hybrid electric vehicles (HEVs) with more than two power sources have the potential to save energy and reduce emissions and noise. As such, HEVs have gained the attention of major car manufacturers and are widely used in civil, industrial, and military fields. Over the years, engineers have conducted many studies on the energy management strategies that can be adopted by HEVs [1]-[18]. In terms of the working principle, such strategies can be roughly divided into two types: rule-based and optimization-based. Ramadan et al. [1] established a Petri net strategy based on a global positioning system (GPS) and used battery capacity management to reduce fuel consumption. Li et al. [2] proposed a torque-leveling threshold-changing strategy for parallel HEVs, which reduces hydrogen fuel consumption while maintaining the battery's state of charge (SOC) at close to the ideal value. Rule-based energy management strategies such as these enable flexible adjustment and excellent real-time performance, and they are easy to calculate; however, they have poor adaptability to external changes and lack clear optimization goals. To overcome these problems, Wang et al. [5] proposed a global optimization method based on the Pontryagin minimum principle (PMP) that improved both motor efficiency and fuel economy by 40%. Larsson et al. [6] used an analytic solution for the dynamic programming (DP) subproblem for a plug-in HEV to reduce the computation time. In general, calculating the global optimization strategy is complicated and requires that global operating conditions be preset. Even if an optimal solution can be obtained, the complete calculation process cannot be applied in real time [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Bohui Wang.
VOLUME 8, 2020. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
To achieve real-time optimization control in HEVs, Rezaei et al. [11] introduced the equivalent consumption minimization strategy (ECMS) to HEVs and improved its estimation method for equivalent factor bounds, thereby improving vehicle fuel economy. Real-time optimization requires less prior knowledge, and it can be used to obtain optimal or sub-optimal solutions in the finite time domain. Therefore, control strategies based on real-time optimization have gradually become a focus of current research. Before applying real-time optimization control to heavy-duty HEVs without navigation support, two fundamental problems must be solved. The first is ensuring the predictive accuracy of the reference information, which is used to describe the future state of the controlled object; the second is ensuring that the optimization algorithm can be solved within a limited time. In terms of the former, most previous studies adopted external navigation support such as GPS and intelligent transportation systems (ITSs) [14], [15]. However, heavy-duty HEVs often work under atypical road settings in which the ground conditions cannot be collected and modeled in advance. In such cases, it is impossible to apply prediction methods based on navigation support, which are usually used by civilian vehicles. Unique prediction methods must be developed for heavy-duty HEVs instead. Researchers have proposed a variety of data-driven prediction methods [16], [17] that have high predictive accuracy for steady-state trends but low accuracy under randomly changing trends. To address this gap in the literature, we propose a joint data-driven prediction method using the grey model and Markov chain approaches to improve the accuracy of load power prediction under unsteady conditions. To improve the vehicle control effect, researchers have applied real-time energy management strategies such as ECMS to HEVs, which achieve sub-optimal control effects.
However, these open-loop optimization control strategies may cause large deviations under inaccurate modeling and external disturbance conditions. Therefore, we further propose a nonlinear model predictive control (NMPC) method to repeatedly solve the closed-loop optimization problem within a finite time domain, which can help improve robustness and anti-interference ability.
The remainder of this paper is structured as follows. Section II introduces the topology of heavy-duty HEVs and establishes various power source models. Section III introduces the proposed power prediction method based on the grey model and Markov chains. The NMPC method is introduced in Section IV. In Section V, we describe the experimental validation results; finally, a summary is presented in Section VI.

II. HEAVY-DUTY HYBRID ELECTRIC VEHICLE MODELING
Heavy-duty HEVs are powered by hybrid electric systems that support the regular operation of various electrical components. According to the direction in which power flows, heavy-duty HEV drive systems can be divided into front and rear power chains. The front power chain includes three power sources (the engine-generator set, power battery pack, and supercapacitor), which are used to convert fossil fuel energy into electric energy and then store it. The rear power chain includes multiple drive motors to enable the mutual conversion between electric and mechanical energy [18]-[21]. The structure of a hybrid electric system is shown in Fig. 1. The development and validation of energy management strategies depend on the use of reliable simulation models [22]-[24]. In this study, principle and data modeling were used to analyze the external characteristics of the various controlled objects, which is in line with our goal of developing a control strategy.

A. ENGINE MODELING
In heavy-duty HEVs, a diesel engine is used to drive electric generators that serve as the main power source of the hybrid electric systems. The fuel characteristic surface of a diesel engine is shown in Fig. 2.
The fuel consumption rate of a diesel engine is determined by its speed and torque (Fig. 2), and it varies with the operating point as
$$m_{eng} = \phi(T_{eng}, n_{eng}), \tag{1}$$
where $m_{eng}$, $T_{eng}$, $n_{eng}$, and $\phi$ denote the fuel consumption rate, engine torque, engine speed, and two-dimensional mapping function, respectively. To achieve the best overall efficiency, the engine must track the best fuel consumption curve. In the HEV case, the engine speed and torque then depend only on the engine power through the following functional relationship:
$$T_{eng} = f_T(P_{eng}), \quad n_{eng} = f_n(P_{eng}), \tag{2}$$
where $f_T$ and $f_n$ are the best torque function and the best speed function of the engine, respectively, and $P_{eng}$ is the mechanical power output of the engine. Eqs. (1) and (2) can be combined and simplified to obtain
$$m_{eng} = f_{eng}(P_{eng}). \tag{3}$$

B. GENERATOR MODELING
A permanent magnet synchronous generator (PMSG) has a high power density and is suitable for use in vehicles as a high-power generating device. Following controllable rectification, the PMSG transmits DC power to the DC bus. A PMSG efficiency map is shown in Fig. 3. Fig. 3 indicates that generator efficiency is a two-dimensional function of speed and torque that can be expressed as
$$\eta_{gen} = \psi(T_{gen}, n_{gen}), \tag{4}$$
where $\psi$ denotes the two-dimensional efficiency map. As the external mechanical characteristics of the engine are consistent with those of the generator, they can be rigidly connected without using a reduction gear to obtain
$$T_{gen} = T_{eng}, \quad n_{gen} = n_{eng}. \tag{5}$$
Finally, Eqs. (4) and (5) can be combined and simplified to
$$\eta_{gen} = f_{gen}(P_{eng}). \tag{6}$$

C. BATTERY MODELING
Power batteries have high energy densities and can continuously store or release electrical energy at specific power levels. Because the external characteristics of the lithium-ion battery are the most important aspect of control-oriented modeling, the battery system can be represented as the first-order equivalent circuit model [25], [26] shown in Fig. 4. By applying Kirchhoff's voltage law, the voltage balance can be given as
$$U_{out} = U_{oc} - I_{batt} R_{batt}, \tag{7}$$
where $U_{out}$ and $U_{oc}$ denote the battery terminal voltage and open-circuit voltage, respectively, $I_{batt}$ denotes the current running through the circuit (positive when discharging), and $R_{batt}$ represents the internal battery resistance. The power balance relationship is given by
$$P_{batt} = U_{out} I_{batt} = U_{oc} I_{batt} - I_{batt}^2 R_{batt}, \tag{8}$$
where $P_{batt}$ denotes the external power of the battery. Treating Eq. (8) as a quadratic equation in $I_{batt}$ and keeping the physically meaningful (smaller) root, we get
$$I_{batt} = \frac{U_{oc} - \sqrt{U_{oc}^2 - 4 R_{batt} P_{batt}}}{2 R_{batt}}. \tag{9}$$
From the definition of the battery SOC [27], [28], we obtain
$$\mathrm{SOC} = \frac{Q_{rem}}{Q_{full}} = \frac{Q_{full} - Q_{used}}{Q_{full}}, \tag{10}$$
where SOC represents the current SOC of the battery, and $Q_{full}$, $Q_{rem}$, and $Q_{used}$ represent the rated, remaining, and used capacities of the battery, respectively. $Q_{used}$ can be expressed as
$$Q_{used} = (1 - \mathrm{SOC}_{ini})\, Q_{full} + \int_0^t I_{batt}\, d\tau, \tag{11}$$
where $\mathrm{SOC}_{ini}$ represents the initial SOC of the battery. Substituting Eq. (11) into Eq. (10) and differentiating with respect to time, we get
$$\dot{\mathrm{SOC}} = -\frac{I_{batt}}{Q_{full}}. \tag{12}$$
Eqs. (9) and (12) can be combined and simplified to
$$\dot{\mathrm{SOC}} = -\frac{U_{oc} - \sqrt{U_{oc}^2 - 4 R_{batt} P_{batt}}}{2 R_{batt} Q_{full}}. \tag{13}$$
In general, the open-circuit voltage and resistance are related to the temperature and SOC [29], [30]. To simplify this calculation, we assume that the working temperature of the power battery remains unchanged; therefore, the influence of temperature can be ignored. The functional relationship above can then be re-expressed as
$$U_{oc} = f_{oc}(\mathrm{SOC}), \quad R_{batt} = f_{R}(\mathrm{SOC}). \tag{14}$$
Substituting Eq. (14) into Eq. (13) and simplifying, we obtain
$$\dot{\mathrm{SOC}} = f_{batt}(P_{batt}, \mathrm{SOC}). \tag{15}$$
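The battery relations above (Kirchhoff voltage balance, the quadratic power balance, and the SOC definition) can be illustrated with a minimal Python sketch. It assumes a discharge-positive current convention and constant open-circuit voltage and resistance; the function names and all numerical values are hypothetical and are not taken from the paper.

```python
import math


def battery_current(p_batt, u_oc, r_batt):
    """Discharge-positive current (A) for a requested terminal power (W).

    Solves R*I^2 - U_oc*I + P = 0 and keeps the smaller root, which is the
    physically meaningful low-current solution.
    """
    disc = u_oc ** 2 - 4.0 * r_batt * p_batt
    if disc < 0.0:
        raise ValueError("requested power exceeds what the battery can deliver")
    return (u_oc - math.sqrt(disc)) / (2.0 * r_batt)


def soc_rate(p_batt, u_oc, r_batt, q_full_as):
    """dSOC/dt = -I_batt / Q_full, with Q_full given in ampere-seconds."""
    return -battery_current(p_batt, u_oc, r_batt) / q_full_as


# Hypothetical parameters for illustration only (not from the paper).
U_OC = 600.0              # open-circuit voltage, V
R_BATT = 0.1              # internal resistance, ohm
Q_FULL = 100.0 * 3600.0   # 100 Ah expressed in ampere-seconds

i_discharge = battery_current(50e3, U_OC, R_BATT)
```

In a fuller model, `u_oc` and `r_batt` would be looked up from the SOC at each step rather than held constant.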

D. DC/DC MODELING
The DC/DC efficiency loss represents a combination of switching and internal resistance losses [31]. Because the switching frequency is pre-determined, the switching loss can be approximated as a fixed value, leaving only the power effect to be considered in determining the DC/DC efficiency (Table 1). Table 1 indicates that the DC/DC efficiency is closely related to the current and can be fitted using the least-squares method to the following polynomial:
$$\eta_{dc} = f_{dc}(P_{batt}), \tag{16}$$
where $\eta_{dc}$ denotes the DC/DC efficiency and $P_{batt}$ denotes the power input from the battery to the DC/DC.

E. SUPERCAPACITOR MODELING
As energy storage devices with extremely high power densities, supercapacitors can be used in powertrains with fast power response speeds to support the DC bus voltage and protect the power battery. The charging and discharging processes of a supercapacitor can be simulated using a series RC model [32] (Fig. 5). Based on the characteristics of a supercapacitor, the relationship between instantaneous operating current and voltage can be expressed as
$$I_{sc} = -C \frac{dU_{sc}}{dt}, \tag{17}$$
where $I_{sc}$, $C$, and $U_{sc}$ denote the supercapacitor current, capacitance, and supercapacitor voltage, respectively, and the discharge current is taken as positive.
Multiplying the left- and right-hand sides of Eq. (17) by the voltage provides
$$P_{sc} = U_{sc} I_{sc} = -C\, U_{sc} \frac{dU_{sc}}{dt}, \tag{18}$$
where $P_{sc}$ represents the supercapacitor power. Because a supercapacitor has low internal resistance losses, its efficiency can be approximated as a fixed value (0.96), thereby providing the DC bus power
$$P_{sc\_bus} = 0.96\, P_{sc}, \tag{19}$$
which can be simplified to
$$\dot{U}_{sc} = f_{sc}(P_{sc\_bus}, U_{sc}). \tag{20}$$

F. MODELING VEHICLE DYNAMICS
In the driving process, a vehicle always obeys Newton's second law, i.e., force equals mass times acceleration (F = M × A) [33], [34]. Based on this, the vehicle dynamics balance equation can be derived as
$$F = \delta m \frac{dv}{dt} + \frac{1}{2} \rho C_d A_f v^2 + \mu m g \cos\alpha + m g \sin\alpha, \tag{21}$$
where $F$, $\delta$, $m$, $v$, $\rho$, $C_d$, $A_f$, $\mu$, $g$, and $\alpha$ denote the total driving force between the tire and the ground, equivalent rotational coefficient, vehicle mass, vehicle speed, air density, air resistance coefficient, windward surface area of the vehicle, rolling resistance coefficient, gravitational acceleration, and vehicle gradient angle, respectively. The power balance equation at the tire becomes
$$P_{tire} = F v. \tag{22}$$
The power converted to the DC bus side is then
$$P_{req} = \frac{P_{tire}}{\eta_{motor}\, \eta_{inv}}, \tag{23}$$
where $\eta_{motor}$ and $\eta_{inv}$ are the motor and inverter efficiencies, respectively.
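As a worked illustration of the force and power balance described above, the following Python sketch computes the DC bus power demand from speed and acceleration. It assumes the standard longitudinal form with inertial, aerodynamic, rolling, and grade resistance terms in SI units; the air density `rho`, the function name, and every numerical value are assumptions introduced here, and only the driving (non-regenerative) case is covered.

```python
import math


def dc_bus_power(v, dv_dt, m, delta, c_d, a_f, mu, alpha,
                 eta_motor, eta_inv, rho=1.225, g=9.81):
    """Traction power demand referred to the DC bus (W).

    v in m/s, dv_dt in m/s^2, m in kg, alpha in rad. rho (air density) is an
    assumed extra parameter; dividing by the efficiencies is only valid while
    the motors are driving, not regenerating.
    """
    f_inertia = delta * m * dv_dt               # equivalent inertial force
    f_aero = 0.5 * rho * c_d * a_f * v ** 2     # aerodynamic drag
    f_roll = mu * m * g * math.cos(alpha)       # rolling resistance
    f_grade = m * g * math.sin(alpha)           # grade resistance
    p_tire = (f_inertia + f_aero + f_roll + f_grade) * v
    return p_tire / (eta_motor * eta_inv)


# Hypothetical cruise condition on a flat road (values are illustrative only).
p_cruise = dc_bus_power(v=20.0, dv_dt=0.0, m=30000.0, delta=1.1, c_d=0.8,
                        a_f=6.0, mu=0.02, alpha=0.0,
                        eta_motor=0.92, eta_inv=0.95)
```

Evaluating this function along a speed trace yields a load power profile of the kind used as the disturbance input for prediction.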

III. RANDOM POWER PREDICTION METHOD

A. GREY MODEL METHOD
A grey model is a hybrid white- and black-box model through which an approximate exponential trend of generated data can be obtained by accumulating a messy and irregular original sequence. Based on this, a grey model can be constructed to predict the future output [35], [36] using the following steps.

1) GENERATE CUMULATIVE SEQUENCE
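An accumulation can be performed on the original sequence
$$x^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\} \tag{24}$$
according to
$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n, \tag{25}$$
from which the approximately exponential cumulative sequence is obtained as
$$x^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\}. \tag{26}$$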

2) CONSTRUCT A FIRST-ORDER DIFFERENTIAL EQUATION
The first-order generated sequence in Eq. (26) will follow an approximate exponential growth trend [37] expressed as
$$\frac{dx^{(1)}}{dt} + a\, x^{(1)} = b, \tag{27}$$
where $a$ and $b$ denote the model development and coordination coefficients, respectively.

3) ESTABLISH GREY FORECAST MODEL
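The grey differential equation is expressed as
$$x^{(0)}(k) + a\, z^{(1)}(k) = b, \tag{28}$$
in which the value of the intermediate variable $z$ is
$$z^{(1)}(k) = \tfrac{1}{2}\left[x^{(1)}(k) + x^{(1)}(k-1)\right], \quad k = 2, 3, \ldots, n. \tag{29}$$
The recursion over the sampled time domain can be written in matrix form as
$$Y = B \begin{bmatrix} a \\ b \end{bmatrix}. \tag{30}$$
By assigning the following parameters to the matrices in Eq. (30),
$$Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}, \quad B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix}, \tag{31}$$
the least-squares method can be used to find the optimal parameters
$$\begin{bmatrix} a \\ b \end{bmatrix} = (B^{T} B)^{-1} B^{T} Y. \tag{32}$$
By substituting these optimal parameters into the first-order model and discretizing its solution, we obtain
$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right) e^{-a k} + \frac{b}{a}. \tag{33}$$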

4) DATA PREDICTION AND RESTORATION
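The result of the direct prediction process is an accumulation that requires a first-order subtraction operation to restore the actual data. Following the restoration process, we obtain
$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k). \tag{34}$$
Substituting Eq. (33) into Eq. (34), we get
$$\hat{x}^{(0)}(k+1) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \frac{b}{a}\right) e^{-a k}. \tag{35}$$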
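The four steps above correspond to the textbook GM(1,1) procedure, which can be sketched in a few lines of Python. This is a minimal illustration assuming a strictly positive input sequence and the usual mean-value background sequence; the function name `gm11_forecast` and the four-sample minimum are choices made here, not details confirmed by the paper.

```python
import math


def gm11_forecast(x0, steps=1):
    """Forecast the next `steps` values of x0 with a GM(1,1) model."""
    n = len(x0)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least four samples")
    # 1) Cumulative (first-order accumulated) sequence.
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # 2) Background values z(k) and least-squares fit of x0(k) = -a*z(k) + b.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(x0[1:])
    cnt = n - 1
    s_z, s_y = sum(z), sum(y)
    s_zz = sum(zi * zi for zi in z)
    s_zy = sum(zi * yi for zi, yi in zip(z, y))
    det = cnt * s_zz - s_z ** 2
    a = (s_z * s_y - cnt * s_zy) / det
    b = (s_zz * s_y - s_z * s_zy) / det
    if abs(a) < 1e-12:
        # Flat trend: the model degenerates to a constant forecast.
        return [b] * steps

    # 3) Time response of the accumulated sequence (t = 0 is the first sample).
    def x1_hat(t):
        return (x0[0] - b / a) * math.exp(-a * t) + b / a

    # 4) Restore the forecasts by first-order differencing.
    return [x1_hat(t) - x1_hat(t - 1) for t in range(n, n + steps)]


# Illustrative data only: a sequence growing by 10 % per sample.
history = [100.0 * 1.1 ** k for k in range(5)]
prediction = gm11_forecast(history, steps=2)
```

In a rolling application, the window `history` would be refreshed with the newest power sample before each call.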

B. MARKOV CHAIN METHOD
A Markov chain can be used to determine the probability of transition from the current to the next state under the application of a random process [38], [39]. The proposed method applies the following Markov process.

1) DETERMINE DISCRETIZATION TIME AND STATE
First, the sampling time must be determined and the data must be recorded at this rate. The data universe can then be divided and discretized into m state variables, after which the data can be classified into different state intervals.

2) CALCULATE THE ONE-STEP STATE TRANSITION PROBABILITY MATRIX
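Defining $M_{ij}$ as the number of transitions from state $i$ to state $j$ and the row sum $M_i$ as the total number of transitions out of state $i$, the one-step state transition probability $p_{ij}$ is given by
$$p_{ij} = \frac{M_{ij}}{M_i}, \quad i, j = 1, 2, \ldots, m,$$
which can be used to construct the one-step state transition probability matrix $P$ as
$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1m} \\ p_{21} & p_{22} & \cdots & p_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & p_{mm} \end{bmatrix},$$
in which
$$\sum_{j=1}^{m} p_{ij} = 1, \quad p_{ij} \ge 0.$$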

3) FORECAST FUTURE STATE OUTPUT VALUE
If the observation at the current moment $x_k$ belongs to the state $E_i$, the maximum value in row $i$ of matrix $P$ is
$$p_{ij} = \max(p_{i1}, p_{i2}, \ldots, p_{im}),$$
from which it can be predicted that the observational value $x_{k+1}$ at the next moment is most likely to transfer to $E_j$.
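The three Markov steps above (discretize into states, count transitions, pick the most probable next state) can be sketched as follows. The sketch assumes equal-width state intervals and zero-based state indices, neither of which is specified in the text; all function names are hypothetical.

```python
def state_of(value, lo, hi, m):
    """Map a value to one of m equal-width intervals on [lo, hi]."""
    if value <= lo:
        return 0
    if value >= hi:
        return m - 1
    return min(m - 1, int((value - lo) / (hi - lo) * m))


def transition_matrix(states, m):
    """One-step transition probabilities p_ij = M_ij / M_i from a state trace."""
    counts = [[0] * m for _ in range(m)]
    for i, j in zip(states, states[1:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for states that were never left stay all-zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def most_likely_next(matrix, i):
    """Index j of the largest entry in row i (first one wins on ties)."""
    row = matrix[i]
    return max(range(len(row)), key=row.__getitem__)


# Illustrative state trace with three states.
trace = [0, 1, 0, 1, 2, 0, 1]
P = transition_matrix(trace, 3)
```

Here state 0 was always followed by state 1 in the trace, so `most_likely_next(P, 0)` selects state 1.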

C. GREY MARKOV CHAIN FRAMEWORK
The grey model is suitable for predicting steady-state conditions that follow a trend, while the Markov chain is suitable for predicting results under unsteady state conditions. As the load power follows a broadly ranging trend and contains small random changes [40], a single method cannot be used to accurately predict it. Therefore, the proposed method adopts a combined grey and Markov chain forecasting model in which the former and latter are used to predict the steady change and unsteady residual change components, respectively. Adopting this approach allows the proposed method to take full advantage of both models, thereby improving the accuracy of prediction (Fig. 6).
The driving cycle of a vehicle can be represented as a curve showing the relationship between vehicle speed and time to characterize specific driving types. To assess the effects of using different prediction methods, we chose the highway fuel economy test (HWFET) cycle as an example. This driving cycle is specially designed for heavy-duty HEVs, and its speed curve is shown in Fig. 7. The research object of this study was an 8 × 8 heavy-duty HEV with the primary parameters listed in Table 3. The power required to drive the vehicle to achieve the speeds shown in Fig. 7 can be calculated using Eq. (23), which can be used to obtain the power map on the DC bus shown in Fig. 8.
We first applied the grey model to predict the load power at a prediction step of 1 s, with the results shown in Fig. 9. We then calculated the residuals of the prediction results of the grey model, with the results shown in Fig. 10. Fig. 10 indicates that the prediction residuals of the grey model constitute a series of stationary random numbers with a mean value of zero. Markov chains are particularly suited to modeling this type of data, and based on the data characteristics of the residual sequence, we established a Markov prediction model by dividing the residual sequence into 25 state intervals and then establishing a final Markov state transition matrix (Fig. 11). The Markov model established using the measured residuals was then used to predict residuals that were used in turn to correct the grey model prediction values. The joint prediction results produced by the combined model are shown in Fig. 12. The application of the combined grey Markov chain joint model improves the forecast accuracy, thereby indicating the effectiveness of the proposed method. Table 2 lists the root mean square error (RMSE) results used as an evaluation indicator. The results in the table show that the grey Markov chain joint prediction method has a high degree of accuracy, meets the requirements of a predictive model, and can provide more accurate load power prediction than a simple predictive control algorithm.
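The residual correction and the RMSE indicator described above can be sketched in Python. The sketch assumes that the predicted residual is represented by the midpoint of its state interval and that the 25 intervals have equal width; both are assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def corrected_forecast(grey_forecast, residual_state, res_lo, res_hi, n_states=25):
    """Grey forecast plus the midpoint of the predicted residual interval.

    residual_state is a zero-based interval index on [res_lo, res_hi].
    """
    width = (res_hi - res_lo) / n_states
    return grey_forecast + res_lo + (residual_state + 0.5) * width
```

With a residual range of -25 to 25 and 25 states, the central state (index 12) has a midpoint of zero, so the grey forecast is returned unchanged.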

IV. NONLINEAR MODEL PREDICTIVE CONTROL ALGORITHM
NMPC is an advanced computer control method initially developed for use in the industrial field [41], [42]. It comprises three parts (a predictive model, receding horizon optimization, and feedback correction), and it can explicitly handle constraint optimization problems online while solving model mismatch and external disturbance problems [43], [44]. NMPC applies both self-tuning and optimization control, making it suitable for heavy-duty HEV control systems that require high real-time performance.

A. PREDICTIVE NMPC MODEL

1) NONLINEAR MODEL
First, the overall output power of the engine-generator set to the DC bus is given by
$$P_{eng\_bus} = \eta_{gen} P_{eng},$$
which can be simplified to
$$P_{eng\_bus} = f_{eng\_bus}(P_{eng}).$$
Second, the power output from the power battery pack and bidirectional DC/DC to the DC bus is given by
$$P_{batt\_bus} = \eta_{dc} P_{batt},$$
which can be simplified to
$$P_{batt\_bus} = f_{batt\_bus}(P_{batt}).$$
Third, from the power balance relationship, the complete power expression can be obtained as
$$P_{req} = P_{eng\_bus} + P_{batt\_bus} + P_{sc\_bus},$$
where $P_{req}$ is the power required by the load.
Fourth, the voltage relationship in the supercapacitors can be restated as
$$\dot{U}_{sc}(t) = f_{sc}(P_{sc\_bus}, U_{sc}).$$
Fifth, the state-space equation can be used to establish a complete prediction model in which the variables are
$$x = [\mathrm{SOC},\ U_{sc}]^{T}, \quad u = [P_{eng},\ P_{batt}]^{T}, \quad v = P_{req}, \quad y = [m_{eng},\ \mathrm{SOC},\ U_{sc}]^{T},$$
where $x$, $u$, $v$, and $y$ are the state, control, disturbance, and output variables, respectively, and $P_{req}$ is the power demand converted from the load to the DC bus. Finally, the hybrid power system can be uniformly described using the following nonlinear model [45], [46]:
$$\dot{x} = f(x, u, v), \quad y = g(x, u, v),$$
where $f$ and $g$ are the state and output update functions, respectively.

2) LINEARIZATION
The nonlinear model has a complicated form and is expensive to solve, making it ill-suited to online real-time application. To simplify the model, a first-order Taylor expansion can be performed at the steady-state equilibrium point [47]-[49]. The linearized prediction model then takes an affine state-space form whose coefficient matrices are the partial derivatives of f and g with respect to the state, control, and disturbance variables, evaluated at the equilibrium point.

3) DISCRETIZATION
By discretizing the predictive model using the forward Euler method [50], we obtain
$$x(k+1) = x(k) + T_s\, \dot{x}(k),$$
where $T_s$ is the discrete sampling period and $\dot{x}(k)$ is the state derivative given by the prediction model at step $k$.
If the initial state and future control sequence are known at a given moment, the future output sequence will be
$$\hat{y} = c x + D \hat{u} + E v + G f, \tag{53}$$
in which each variable and coefficient is expanded in Eq. (54).
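The forward Euler discretization and the resulting open-loop rollout over a horizon can be sketched as follows. The dynamics function `f` is passed in by the caller; the toy first-order system at the end is purely hypothetical and stands in for the linearized vehicle model.

```python
def euler_step(f, x, u, v, ts):
    """One forward Euler step: x(k+1) = x(k) + Ts * f(x(k), u(k), v(k))."""
    dx = f(x, u, v)
    return [xi + ts * dxi for xi, dxi in zip(x, dx)]


def rollout(f, x0, u_seq, v_seq, ts):
    """Predicted state trajectory for given control and disturbance sequences."""
    trajectory = [list(x0)]
    for u, v in zip(u_seq, v_seq):
        trajectory.append(euler_step(f, trajectory[-1], u, v, ts))
    return trajectory


def toy_dynamics(x, u, v):
    """Hypothetical first-order lag driven by control u and disturbance v."""
    return [u + v - x[0]]


states = rollout(toy_dynamics, [0.0], u_seq=[1.0, 1.0], v_seq=[0.0, 0.0], ts=0.1)
```

Forward Euler is only conditionally stable, so the sampling period has to be small relative to the fastest dynamics being modeled.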

B. OPTIMIZATION SOLVER OF NMPC
HEVs do not have external charging devices; instead, they use charge-sustaining (power maintenance-type) batteries [51], [52]. Combining this feature with the factors of fuel consumption and DC bus voltage, the objective function can be established as
$$J = \sum_{i=1}^{P} \left[ \alpha \left(m_{eng}(k+i|k) - m_{ref}\right)^2 + \beta \left(\mathrm{SOC}(k+i|k) - \mathrm{SOC}_{ref}\right)^2 + \gamma \left(U_{sc}(k+i|k) - U_{ref}\right)^2 \right],$$
where $J$ is the optimization objective function, $P$ is the prediction time domain, $\alpha$, $\beta$, and $\gamma$ are the weight coefficients of the fuel, SOC, and bus voltage terms, respectively, and $m_{ref}$, $\mathrm{SOC}_{ref}$, and $U_{ref}$ are the engine reference fuel consumption, battery reference SOC, and bus reference voltage, respectively. The variables listed above are subject to several hard constraints owing to the capacity limits of the actuators. Owing to these constraints, the optimization problems above cannot be solved analytically and must instead be calculated numerically. For the convenience of calculation, the nonlinear programming form can be transformed into the standard quadratic programming form [53] to obtain
$$\min_{\hat{u}} J = \tfrac{1}{2} \hat{u}^{T} H \hat{u} + b^{T} \hat{u},$$
where $H$ denotes the Hessian matrix and $b$ represents the bias vector.

The optimal control sequence can then be expressed as
$$\hat{u}^{*}(k) = \left[u^{*}(k|k),\ u^{*}(k+1|k),\ \ldots,\ u^{*}(k+M-1|k)\right]^{T},$$
where $M$ is the control time domain and $k+1|k$ denotes the value estimated for future time $k+1$ given the current time $k$. Only the first control variable, $u^{*}(k|k)$, is applied during each calculation cycle.
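The receding-horizon step (solve a constrained quadratic program, then apply only the first control move) can be sketched in Python. The paper does not state which numerical solver is used; projected gradient descent with simple box constraints is swapped in here purely for illustration, and the matrices in the example are hypothetical.

```python
def solve_box_qp(H, b, lo, hi, iters=500):
    """Minimize 0.5*u'Hu + b'u subject to lo <= u <= hi.

    Uses projected gradient descent and assumes H is symmetric positive
    definite. The step size 1/L uses the largest absolute row sum of H,
    which is an upper bound on its largest eigenvalue.
    """
    n = len(b)
    lipschitz = max(sum(abs(h) for h in row) for row in H)
    step = 1.0 / lipschitz
    u = [min(max(0.0, lo[i]), hi[i]) for i in range(n)]
    for _ in range(iters):
        grad = [sum(H[i][j] * u[j] for j in range(n)) + b[i] for i in range(n)]
        # Gradient step followed by projection onto the box constraints.
        u = [min(max(u[i] - step * grad[i], lo[i]), hi[i]) for i in range(n)]
    return u


def first_control(H, b, lo, hi):
    """Receding horizon: solve over the horizon, apply only the first move."""
    return solve_box_qp(H, b, lo, hi)[0]


# Hypothetical two-step horizon; the unconstrained optimum is [2, 5].
H_demo = [[2.0, 0.0], [0.0, 2.0]]
b_demo = [-4.0, -10.0]
u_opt = solve_box_qp(H_demo, b_demo, lo=[0.0, 0.0], hi=[3.0, 3.0])
```

In this example the second component is clipped to its upper bound of 3 while the first stays at its unconstrained value of 2. A production controller would more likely rely on a dedicated QP solver with general linear constraints.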

C. FEEDBACK CORRECTION OF NMPC
As a result of the adverse effects of model mismatch, time-varying parameters, and external disturbances, a prediction model will not be entirely consistent with the actual physically controlled object. To improve the stability and robustness of the predictive system, closed-loop control can be introduced.
The systematic error $e(k)$ is defined as
$$e(k) = y(k) - y(k|k),$$
where $y(k)$ and $y(k|k)$ are the actual output of the controlled object and the prediction model's output at the current moment, respectively. Assuming the future error remains unchanged, the forecast output will be
$$y_p(k+i) = y(k+i|k) + e(k), \quad i = 0, 1, \ldots, P-1,$$
where $y_p(k+i)$ and $y(k+i|k)$ are the outputs of the prediction system after and before correction, respectively, and $P$ is the width of the prediction time domain.
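The feedback correction above amounts to shifting the whole predicted output sequence by the current prediction error. A minimal Python sketch for a scalar output, with a hypothetical function name and illustrative numbers:

```python
def corrected_predictions(y_measured, y_pred_horizon):
    """Apply y_p(k+i) = y(k+i|k) + e(k) over the prediction horizon.

    y_pred_horizon[0] is the model output y(k|k) at the current moment, so
    e(k) = y_measured - y_pred_horizon[0] is held constant over the horizon.
    """
    error = y_measured - y_pred_horizon[0]
    return [y_pred + error for y_pred in y_pred_horizon]


# Illustrative values: the model currently under-predicts by 1.0.
corrected = corrected_predictions(10.0, [9.0, 9.5, 10.5])
```

After correction, the first element equals the measurement, and the remaining predictions carry the same offset.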

D. FRAMEWORK OF REAL-TIME ENERGY MANAGEMENT STRATEGY
By combining random power prediction and NMPC, a new type of real-time energy management strategy can be established, as shown in Fig. 13. The NMPC predicts the future load demand power by applying the grey Markov chain model with the current value used as the initial value. It then compares the obtained value with the actual value in the preceding interval and applies the difference as feedback to correct the prediction result. Using the predicted output, the constrained optimization problem can be explicitly solved in the finite time domain, with the relative distributions of usage of power sources such as the engine-generator set and power battery pack determined through a real-time optimization algorithm for which the first optimization result is used as the actual control quantity. This use of feedback correction enables the closed-loop control to be performed as a whole, thereby enhancing its ability to combat model mismatch and external disturbance.

V. RESULTS AND DISCUSSIONS
To validate the real-time controllability and effectiveness of the proposed energy management strategy, we constructed a co-simulation platform based on Vortex (CM-LABS Company), RT-Lab (OPAL-RT Technologies), and dSPACE (dSPACE Company) (Fig. 14).
The heavy-duty HEV dynamics model and road surface model were established in Vortex, which was used as a simulation node and graphical output interface. The mathematical models of the vehicle were established in the host computer of RT-Lab. After converting the simulation models into C code using the built-in compiling function, we downloaded them into the real-time simulator of the RT-Lab lower computer via an Ethernet connection. The NMPC model was established in dSPACE. To download the control model to the actual vehicle electronic control units (ECUs), we first used the TargetLink toolbox in dSPACE to convert the packaged modular model into efficient C code, and then we downloaded the C code to the actual vehicle ECUs through CCS (TI Company). The Vortex, RT-Lab, and ECUs exchanged data through the CAN bus, with the CANoe (Vector Company) device observing and recording data in real time. The driver cab communicated with the actual vehicle ECUs and Vortex through a serial port and transmitted the operation intention of the driver to the HIL simulation platform. The real-time simulation platform was used to simulate real vehicle applications to improve confidence in the control strategy.
To investigate the working effect of various energy management strategies under different working conditions, we selected two typical driving cycles [54], [55], the heavy-duty urban dynamometer driving schedule (HUDDS, 0-1,060 s) and HWFET (1,061-1,820 s), and combined them sequentially to obtain a comprehensive test cycle. This working condition could fully simulate high-speed driving and rapid acceleration and deceleration of vehicles. The relationship curve between speed and time obtained under this working condition is shown in Fig. 15. The heavy-duty HEV was an improved in-wheel motor-driven model based on an eight-wheeled armored vehicle. Its primary technical parameters are listed in Table 3.
Because the parameters of the nonlinear model controller have a significant impact on the control effect [56], we ensured that the sampled data were not distorted. In accordance with the Nyquist sampling theorem, we set the discrete sampling period to 2 ms. In accordance with the computing capability and component response speed of the chip, the control command output period was set to 1 s; to overcome the internal delay in the actual components, the control time domain had to be less than the predicted time domain, and therefore, the two values were set to 3 s and 10 s, respectively. As the power battery could not be connected to an external charging device and required a feedback energy storage function, the SOC had to be maintained within a reasonable range and was therefore set to 0.5. Because overvoltage in the bus voltage would cause the electrical equipment to burn out, while insufficient voltage would cause the equipment to malfunction, the target voltage was set to 750 V to maintain it within a stable range. To ensure that the engine fuel consumption was always positive while keeping the fuel consumption data as low as possible, the target fuel consumption was set to 0 g/kWh.
To validate the control effect of the proposed energy management strategy, we selected three representative control strategies for comparison: a thermostat control strategy, fuzzy control strategy, and DP control strategy. After optimizing the control parameters of the respective strategies, we used them to control the heavy-duty HEV and then recorded the results.

A. THERMOSTAT CONTROL
As a typical fixed-rule-based control strategy, the thermostat control strategy is widely used in industrial control. It takes the battery SOC as a reference input and determines the start and stop states of the engine by setting the upper and lower boundaries of the SOC hysteresis interval. Based on safety and efficiency considerations, we set the lower and upper bounds of the SOC hysteresis to 40% and 80%, respectively. Fig. 16 indicates that the thermostat control strategy uses the battery and engine-generator set power as the main and auxiliary power sources, respectively, with the two jointly providing energy for the load. The dynamic response of the battery is extremely fast, and it can cope with a specific range of power changes over time, which reduces the charging and discharging frequencies of the supercapacitor. Owing to the extensive hysteresis range of the battery SOC, the switching frequency between the battery charging and discharging states is low. Although the battery can assist in adjusting the operating point of the engine-generator set, the process is not spontaneous and requires external control commands to guide it. The distribution of engine operating points obtained using the thermostat control is shown in Fig. 17. The operating points are widely scattered, primarily in low-efficiency areas, in a distribution that is likely to adversely affect fuel economy.
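The hysteresis logic of the thermostat strategy can be sketched in a few lines of Python. The 40% and 80% thresholds come from the text; the function name and the convention that the engine starts at the lower bound and stops at the upper bound are assumptions made here for illustration.

```python
def thermostat_step(soc, engine_on, soc_low=0.40, soc_high=0.80):
    """Return the new engine on/off state for the current battery SOC.

    The engine starts when the SOC falls to the lower bound, stops when the
    SOC reaches the upper bound, and otherwise keeps its previous state.
    """
    if soc <= soc_low:
        return True
    if soc >= soc_high:
        return False
    return engine_on
```

Between the two bounds the previous state is held, which is what keeps the switching frequency low over a wide SOC band.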

B. FUZZY CONTROL
Fuzzy control is a control strategy based on "fuzzy" (as opposed to "crisp") rules. Owing to the ability of this approach to process uncertain information and its high code execution efficiency, it has been successfully applied in several fields. According to the actual characteristics of the controlled object, we formulated several dozen fuzzy control rules before finally establishing a fuzzy controller with two inputs (demand power and battery SOC) and two outputs (engine-generator set target power and battery target power). The controller was then applied in the HIL simulator and the output results were recorded.
The fuzzy control strategy balances the two factors of load demand power and battery SOC and allocates the target power of each power source (Fig. 18). Because the engine-generator set undertakes the primary power output task, its amplitude and frequency closely follow the load power demand. However, because of its slow dynamic response speed, batteries and supercapacitors are required to provide auxiliary power during the dynamic adjustment process.
Under the guidance of fuzzy control strategies, batteries and supercapacitors can be given more opportunity to participate in power control, thereby reducing the amplitude and slowing the rate of changes in engine output power. Fig. 19 shows that the engine operating points obtained using fuzzy control are primarily distributed within the low-speed area, in which the fuel economy is suboptimal.

C. MPC
The proposed strategy uses MPC, which is representative of one branch of the real-time optimization control strategies. The MPC considers the bus voltage, battery SOC, and fuel consumption as optimization targets and performs real-time scheduling for multiple power sources. It is evident from the figure that up to 1,100 s, the load demand power changed rapidly; during this period, the engine was more involved in producing the output, and the battery and supercapacitor were charged and discharged rapidly. After 1,100 s, the load demand power changed relatively smoothly, with the engine-generator set outputting extra power to provide electrical energy for the battery, which was switched to the charging state.
As shown from the distribution map in Fig. 21, the operating points of the engine are distributed around the optimal fuel consumption curve, which covers the complete power range. This result mostly benefits from the fuel optimization item of the MPC control method, which ultimately reflects a better fuel economy.

D. DP CONTROL
DP is an offline optimization control strategy with an enormous computational burden. Because it produces optimal control results, these results can be used as a benchmark for evaluating other control strategies. To calculate the global optimal control sequence, the DP control strategy must know the load power over the full time domain in advance. The simulation results obtained using DP control are shown in Fig. 22. Because a DP controller has prior knowledge of the overall driving conditions, each power source can, in theory, be optimally controlled. The DP power distribution results in the figure show that the engine-generator set played the primary power output role and that the large inertia of the engine was fully considered in the simulation. As a result, the amplitude and frequency of the output power of the battery and supercapacitor were reduced, along with the burden on the auxiliary power sources to perform power compensation. Fig. 23 shows that the engine operating points are densely distributed in the high-efficiency area under the optimal control sequence, with only a few points scattered within the low-efficiency area because of factors such as dynamic adjustment. Intuitively, the working state of the engine has been optimized to a large extent, which helps to improve the working efficiency of the vehicle.
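The need to know the complete load profile in advance can be seen in a minimal backward-induction sketch in Python. The discretized SOC grid, the three battery actions, the quadratic fuel map, and the charge-sustaining terminal constraint are hypothetical simplifications for illustration and do not reproduce the DP formulation of this study.

```python
def fuel(p_eng_kw):
    """Hypothetical convex fuel map (arbitrary units per step)."""
    return 1e-3 * p_eng_kw ** 2

def dp_energy_split(load_kw, n_soc=5, start=2, step_kw=20.0):
    """Backward DP over a discretized SOC grid.

    State: SOC grid index. Action a in {-1, 0, +1}: the battery discharges,
    idles, or charges at step_kw, moving the SOC index by a.
    The final SOC index is constrained to equal the initial one.
    """
    inf = float("inf")
    horizon = len(load_kw)
    cost_to_go = [[inf] * n_soc for _ in range(horizon + 1)]
    cost_to_go[horizon][start] = 0.0          # terminal constraint
    policy = [[0] * n_soc for _ in range(horizon)]

    # Backward pass: requires the whole load profile in advance.
    for t in range(horizon - 1, -1, -1):
        for i in range(n_soc):
            for a in (-1, 0, 1):
                j = i + a
                p_eng = load_kw[t] + a * step_kw   # engine covers load + charging
                if not 0 <= j < n_soc or p_eng < 0:
                    continue
                c = fuel(p_eng) + cost_to_go[t + 1][j]
                if c < cost_to_go[t][i]:
                    cost_to_go[t][i] = c
                    policy[t][i] = a

    # Forward pass: roll out the optimal policy from the initial SOC.
    i, engine_kw = start, []
    for t in range(horizon):
        a = policy[t][i]
        engine_kw.append(load_kw[t] + a * step_kw)
        i += a
    return cost_to_go[0][start], engine_kw
```

For an alternating 40 kW / 80 kW load, the optimal engine power in this toy problem is a constant 60 kW, with the battery absorbing and releasing the difference; this is the same leveling effect on the engine that the DP results above exhibit.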

E. COMPARISON
To compare the control effects of the respective energy management strategies, we selected several performance indicators for horizontal comparison.
The change curves of battery SOC are shown in Fig. 24. The thermostat control strategy adopted a hysteresis control method to limit the battery SOC to within a reasonable range of 40% to 80% and, thus, produced a generally monotonic trend between the upper and lower SOC boundaries. In the first stage of the cycle test, the load power was at a low-medium level and the engine carried out the main powering task. As a result, the battery SOCs obtained under the other three control strategies were roughly stable at approximately 60%. In the second stage, however, the load power entered a steady high state. During this phase, the SOC rose under the fuzzy control and MPC strategies as the battery gradually absorbed electric energy from the engine-generator set, whereas the SOC under the DP control followed a downward trend as the battery continued to release electric energy to the DC bus.
Fig. 25 indicates that the voltage change curves are considerably different under the respective control strategies. In general, rising voltages were obtained under the thermostat and DP controls, whereas the fuzzy control and MPC produced falling and then rising voltages. In particular, the engine-generator set provided the main power components under the fuzzy control, which maintained a higher battery SOC; as a result, the bus voltage dropped to approximately 590 V, and the voltage stabilization performance was worse than under the other three control strategies. Around 1,800 s, the load power of the rear power chain quickly dropped to 0 kW, whereas the front power chain had inertia. The front and rear power were therefore temporarily unbalanced, which caused some fluctuations in the bus voltage.
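The hysteresis rule that keeps the thermostat SOC trace between its two boundaries can be sketched in Python as follows. The 40% and 80% thresholds come from the text; the switching directions (engine on at the lower bound, off at the upper bound), the constant charge and discharge rates, and the function names are assumptions made for illustration.

```python
def thermostat(soc, engine_on, soc_low=0.4, soc_high=0.8):
    """Hysteresis rule: switch only when a boundary is reached."""
    if soc <= soc_low:
        return True          # battery depleted: start the engine
    if soc >= soc_high:
        return False         # battery full: stop the engine
    return engine_on         # inside the band: keep the previous state

def simulate_soc(soc=0.6, engine_on=False, steps=200, rate=0.01):
    """Toy SOC trace: charges at `rate` per step when the engine is on,
    discharges at the same rate when it is off (hypothetical rates)."""
    trace, switches = [], 0
    for _ in range(steps):
        new_state = thermostat(soc, engine_on)
        switches += new_state != engine_on
        engine_on = new_state
        soc += rate if engine_on else -rate
        trace.append(soc)
    return trace, switches
```

Because the engine state changes only at the boundaries, the SOC in this sketch moves monotonically from one boundary to the other between switches, which matches the shape described for the thermostat curve.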
To evaluate the effects of the respective control strategies quantitatively, we used the RMSE to measure the deviation of the bus voltage and battery SOC from their target values. To compare the respective fuel economies, we used the average fuel consumption per 100 km as an evaluation indicator. The final indicator values for the respective control strategies are listed in Table 4.
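For reference, the two kinds of indicator can be computed as in the short Python sketch below. The function names and the sample values are hypothetical and are not taken from Table 4; a constant target value is assumed for simplicity.

```python
import math

def rmse(values, target):
    """Root-mean-square error of a sampled signal against a constant target."""
    return math.sqrt(sum((v - target) ** 2 for v in values) / len(values))

def fuel_per_100km(fuel_litres, distance_km):
    """Average fuel consumption in L/100 km over a driving cycle."""
    return 100.0 * fuel_litres / distance_km

# Hypothetical samples: bus voltage around a 600 V target.
voltage_rmse = rmse([598.0, 602.0], 600.0)
```

A lower RMSE indicates tighter regulation of the bus voltage or SOC around its target, and a lower L/100 km value indicates better fuel economy.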
Table 4 shows that the evaluation indicators differ significantly among the control strategies. For the bus voltage, the control effects can be ranked from best to worst as DP > Thermostat > MPC > Fuzzy. The best-performing strategy, DP, yields an RMSE of 17.27 V, which results from the explicit bus voltage term in the DP objective function. The thermostat control uses the battery as the primary power source, and because the battery has a fast dynamic response capability, this strategy also produces a good voltage stabilization characteristic. Similar to the DP control method, the objective function of MPC includes the bus voltage; however, because of its real-time calculation limitations, the results are optimized to a lesser degree. Because the fuzzy control does not consider fluctuations in the bus voltage, it cannot optimize the bus voltage; thus, it produces the worst voltage stabilization effect.
In terms of the battery SOC, the control effects can be ranked from best to worst as MPC > DP > Fuzzy > Thermostat. Because MPC and DP are both optimization-based control strategies whose objective functions include a battery SOC term, the results produced by the two methods are very similar, with the numerical difference attributable to the difference in performance between the offline and online calculation methods. Because battery SOC is one of the two inputs of the fuzzy controller, this approach has some ability to stabilize the SOC by increasing or decreasing the output of the other power sources. However, because the approach applies fixed rules, its optimization effect is weaker than that of the optimization-based control strategies. Because the thermostat control uses the battery as the primary power source, the SOC changes significantly within the boundary range, and the thermostat cannot stabilize the SOC within a narrow range. Consequently, the thermostat control performs the worst in terms of SOC.
In terms of fuel consumption rate, the respective control effects can be ranked from best to worst as DP > MPC > Fuzzy > Thermostat. Because fuel consumption is an essential term in the objective of an optimization-based control strategy, such strategies perform significantly better on this indicator than rule-based control strategies. Because the DP control performs offline optimization on a global scale, it obtains the best control effect in this case. MPC, by contrast, is a real-time optimization strategy and therefore obtains a control effect that is second only to that of the offline optimization strategy. In rule-based control strategies such as fuzzy and thermostat control, fuel consumption is not used as a control reference, and fuel economy is therefore significantly worse under these strategies.
The results of this comprehensive comparison of the four control strategies indicate that the control effect obtained using the proposed MPC strategy is significantly better than that obtained using rule-based control strategies such as fuzzy and thermostat control. The overall performance of MPC is also quite close to the theoretical optimal results obtained under DP control. Thus, MPC can achieve near-optimal control in real-time applications.

VI. CONCLUSION
To improve the performance of heavy-duty HEVs without navigation support, this study proposed a new energy management strategy based on load power prediction and real-time optimization. Our main contributions are summarized as follows: (1) For a hybrid electric system comprising multiple power sources with different working characteristics, we established an individual mathematical model for each power source. (2) We proposed a data-driven prediction method that combines the grey model and Markov chain approaches to deal with the unsteady prediction problem under off-road conditions. (3) We proposed a closed-loop real-time optimization control strategy based on NMPC as a trade-off between real-time performance and the optimization effect. (4) We validated the effectiveness of the proposed energy management strategy by comparing its results with those produced by three typical control strategies in multiple HIL simulation experiments. (5) The study results provide a valuable reference for practical engineering applications.
One limitation remains: the weight coefficients of each optimization item were determined based on a large amount of experimental data, but the combination of these weight coefficients is not necessarily optimal. In future work, the selection method for the weight coefficients can be further optimized.