State-of-the-art review of fuel cell hybrid electric vehicle energy management systems

The primary purpose of fuel cell hybrid electric vehicles (FCHEVs) is to tackle the challenge of environmental pollution associated with road transport. However, to benefit from the advantages presented by FCHEVs, an appropriate energy management system (EMS) is necessary for effective power distribution between the fuel cell and the energy storage systems (ESSs). The past decade has brought a significant increase in the number of FCHEVs, with different EMSs having been implemented due to technology advancement and government policies. These methods are broadly categorised into rule-based EMS methods, machine learning methods and optimisation-based control methods. This paper therefore presents a systematic literature review of the different EMSs and strategies used in FCHEVs, with special focus on fuel cell/lithium-ion battery hybrid electric vehicles. The contribution of this study is a quantitative evaluation of the selected EMSs, which are compared and categorised according to principles, technology maturity, advantages and disadvantages. In addition, considering the drawbacks of some EMSs, gaps are highlighted for future research to create the pathway for comprehensive emerging solutions. The results of this paper will be beneficial to researchers and electric vehicle designers responsible for implementing an efficient EMS for vehicular applications.


Introduction
The prevalent interest in electric vehicles (EVs) and fuel cell hybrid electric vehicles (FCHEVs) is largely supported by the decline in fossil fuel production and the quest for an environmentally friendly transport system. This interest has equally triggered extensive research on FCHEVs [1,2]. Furthermore, environmental pollution associated with internal combustion engine (ICE) vehicles, advancement in fuel cell (FC) technology, improvement in power electronics and cutting-edge energy management systems (EMSs) are some of the reasons why FCHEVs have received significant attention from both the transportation and environment sectors [3,4]. The primary function of an EMS in a FCHEV is to control, distribute and ensure effective management of the various energy sources and energy storage devices based on the drive cycle information and load demand [5]. However, the hybridisation of FC with batteries, ultracapacitors or both in FCHEVs is basically purposed to absorb the regenerative braking energy and maintain a balance between the load and fuel cell power [2]. Achieving these two conditions will ensure that the vehicle is always operated optimally without overloading individual components in the system. According to [6], FCHEVs have not achieved their pride of place in the EV sector due to concerns around high cost and FC degradation; however, with adequate inclusion of an efficient EMS, this can be solved swiftly. The ESS (lithium ion) can be used to provide additional power to the system, thereby ensuring the downsizing of the fuel cell, which will lead to reduced cost of materials. Again, the FC degradation problem can be solved by using the battery to supplement power during acceleration or transient loading [4]. This will enhance the FC lifespan, improve efficiency and optimise operation. Hence, an in-depth understanding of current EMS control systems for FCHEVs and relevant literature on EMSs are provided. 
The remaining part of the paper is organised as follows: Section 2 presents the methodology used in the systematic literature review; Section 3 presents the results based on the papers reviewed; Section 4 discusses the results by showing some advantages and disadvantages, and it unveils future research opportunities. Finally, some concluding comments are provided in Section 5.

Methodology
A systematic literature review (SLR) is mostly done in fields such as social sciences, education, global economics, retail business and medicine. However, in [7], the technique was used to provide an evidence-based approach in the engineering field because the fundamental philosophies of SLRs are restrictive, duplicative, algorithmic and collective. This study was conducted using the specified stages shown in Figure 1. In this figure, Stage 1 identifies, defines and delineates the problem, Stage 2 designs and develops a strategy on how to solve the identified problem, Stage 3 decides on the method of data collection, Stage 4 organises and groups the obtained data into different categories, Stage 5 uses specific tools to analyse and discuss the information obtained and Stage 6 compiles the information into an understandable report.
Stage 1: The number of FCHEVs has increased in the past decade due to technology improvements and government policies [8]. However, the effective energy management of FCHEVs has remained one of the major challenges that is confronting researchers. According to available literature, most researchers are focused primarily on the different powertrain topologies, power electronic configurations and the choice of ESS. However, identifying a suitable EMS for a FCHEV is crucial because it ensures component optimisation and increased travel range. Hence, this study provides different FCHEV EMSs available to researchers, with focus on well-established methods only.
Stage 2: The first phase of the literature review was aimed at providing a detailed overview of the different EMSs used in FCHEVs. Subsequently, the literature review looks at the suitability of the EMSs for different configurations based on available EMS technology, vehicle type, drive cycle and technology maturity.
Stage 3: Keywords are used to search for data/information on databases using search engines such as IEEE Xplore, Microsoft Academic, Scopus, Science Direct and Google Books. Google Scholar was used primarily to search for specific information on journal papers and theses where the citation was done directly from the website using Endnote X7 and the referencing was exported to Mendeley. The keywords used to search for information on FCHEV EMSs are shown in Table 1.
Stage 4: After the search for relevant literature on the topic, a PRISMA 2009 flow diagram was used to arrange the different EMSs according to the methods, advantages, disadvantages and technology maturity in the following sequence [7]:
a. 240 papers and theses were assembled from different databases using search engines such as Science Direct, Google Books, IEEE Xplore, Microsoft Academic and Scopus by using the keywords shown in Table 1. The first column shows the different keywords used in the search for FCHEV EMSs, and the second column lists the different parameters measured.
b. 80 more papers were added to the database by using the Google Scholar search engine.
c. Out of the 320 papers assembled, only 220 papers were kept for further screening after a proper check on duplicate papers was completed.
d. 50 papers were further disqualified because their focus was not the core of the study; rather, they discussed other topics such as powertrain topologies, EV electrical system configurations and EV modelling and analysis.
e. 170 papers were evaluated based on the set criteria for FCHEV EMS, and 38 papers were removed because they did not discuss FCHEV EMS as their core topic.
f. In this study, a total of 132 papers were used for qualitative evaluation, and 40 papers were used for quantitative evaluation; the selection criteria were set to only include papers from 1983-2021.
Stage 5: The study used inclusion and exclusion criteria to select the research papers that are most relevant to the topic. The inclusion criteria were based on the date the research was conducted. The search excludes the following: (a) unpublished work, (b) webpages and (c) papers that discuss EMS techniques that are not applicable to FCHEVs. This study used published papers from 1983-2021, while older papers were excluded because they were cited in the newer papers.
Stage 6: The outcomes are properly documented.
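The screening counts reported in Stage 4 can be checked with a few lines of arithmetic. The sketch below only reproduces the numbers stated in the text; the function name and dictionary keys are illustrative assumptions and are not part of the PRISMA template.

```python
def prisma_flow(database_hits, extra_hits, kept_after_dedup, off_topic, not_core):
    """Return the paper count at each screening step (names are illustrative)."""
    assembled = database_hits + extra_hits
    duplicates_removed = assembled - kept_after_dedup
    eligible = kept_after_dedup - off_topic
    included = eligible - not_core
    return {
        "assembled": assembled,
        "duplicates_removed": duplicates_removed,
        "eligible": eligible,
        "included": included,
    }

# Counts taken from items a to e of Stage 4.
flow = prisma_flow(database_hits=240, extra_hits=80,
                   kept_after_dedup=220, off_topic=50, not_core=38)
print(flow)
```

Under these counts, the 132 papers used for qualitative evaluation are what remains of the 320 assembled papers after the duplicate check and the two relevance screens.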

EMS requirements
The fundamental objective of FCHEV EMSs is to ensure effective power distribution among the multiple motive power sources to meet the drive cycle condition and load demand. This will provide dependable, robust and efficient operation, as well as lower fuel consumption, reduce cost and minimise losses. These can be obtained by developing an efficient EMS within established parameters [9]. However, for an EMS to achieve its fundamental aim of system optimisation, it must consider reliability, battery cell degradation, fuel economy and FC degradation. FCHEV EMSs must ensure the availability of power during acceleration and transient loading regardless of the battery state of charge (SOC) [10,11]. Furthermore, the EMS must ensure that the battery is operated above its minimum voltage to prevent deep discharging and over-cycling of the battery [1,12,13]. The FC voltage starts to drop substantially at high loads such that the mass transfer of different chemicals across the fuel cell becomes the restricting factor, as shown in Figure 2 [14].
In addition to ensuring the optimisation of the fuel cell, the efficiency of other components in a FCHEV must be evaluated for overall system efficiency. Such components include the DC/DC converters, traction motor, ESS and inverters. However, it is the battery SOC during braking that determines how much braking energy can be recovered. If the energy from braking is greater than the remaining battery capacity, or if the battery is fully charged, the excess energy at that moment is lost. Hence, it is vital to have a battery with enough capacity to absorb the energy during braking so that the recovered energy can be maximised [3].
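The dependence of recovered braking energy on battery SOC can be illustrated with a minimal sketch. It assumes a simple energy-headroom model and ignores charging power limits and conversion losses; the function name, units and the 1 kWh example battery are hypothetical.

```python
def recoverable_braking_energy(braking_energy_wh, soc, capacity_wh, soc_max=1.0):
    """Split a braking event into recovered and lost energy (Wh).

    Assumption: the battery can absorb at most its remaining headroom,
    (soc_max - soc) * capacity_wh. Power limits and losses are ignored.
    """
    headroom_wh = max(0.0, (soc_max - soc) * capacity_wh)
    recovered_wh = min(braking_energy_wh, headroom_wh)
    lost_wh = braking_energy_wh - recovered_wh
    return recovered_wh, lost_wh

# A 100 Wh braking event on a hypothetical 1 kWh battery at three SOC levels.
for soc in (0.50, 0.95, 1.00):
    print(soc, recoverable_braking_energy(100.0, soc, 1000.0))
```

With ample headroom the whole event is recovered, near full charge only part of it is, and a fully charged battery recovers nothing.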

Overview and taxonomy of FCHEV EMSs
The primary objective of an EMS is to ensure effective power distribution using the most suitable operational conditions [15]. Some of such objectives are to improve the components' lifespan, maximise fuel economy, increase vehicle travel range, reduce tailpipe emissions and reduce the electrical stress on both the primary and secondary power sources, as shown in Figure 3.
There has been a significant amount of research done in the past decade on FCHEV EMSs for vehicular applications, with slight differences in the categorisation. However, there is unanimous agreement on the broad categorisation of EMSs into three major types: optimisation-based (OB), rule-based (RB) and learning-based (LB). OB-EMSs are further categorised into online or offline OB-EMSs according to the drive cycle information used, whereas RB-EMSs, which depend on a group of pre-established arbitrary rules that ignore the road conditions, are further categorised into fuzzy logic (FL) and deterministic EMSs. According to [16], RB-EMSs are more widely and better implemented than OB-EMSs due to technology maturity and available knowledge. Among the advanced OB-EMSs, Pontryagin's minimum principle (PMP), dynamic programming (DP) and metaheuristic search methods, which include the genetic algorithm (GA), particle swarm optimisation (PSO) and simulated annealing (SA), are commonly implemented offline for global optimisation. On the other hand, the equivalent consumption minimisation strategy (ECMS) and model predictive control (MPC) are widely used as online OB-EMSs. Due to the significant advancements made in machine learning and artificial intelligence, LB-EMSs have gained more relevance, demonstrating huge potential to compete with other EMSs. This is significant because LB-EMSs have the capacity to self-learn from historical data and to use recorded drive cycle data for online learning. Presently, there are EMSs that combine the different techniques, i.e., the LB, RB and OB techniques, to create a complex but integrated EMS aimed at increasing the travel range and improving fuel economy [16-18]. EMSs that use global positioning system (GPS)-based information to improve their basic control parameters are known as adaptive EMSs; this type comprises telematics MPC, adaptive FL and adaptive ECMS.
Statistical and clustering analysis techniques are commonly implemented in drive cycle identification within predictive EMSs [19,20]. A set of parameters is gathered in a time window ranging between 15 and 200 seconds to evaluate the driving pattern critically and adequately for any drive cycle. These accumulated parameters basically include the acceleration/deceleration of the vehicle, average velocity, maximum speed, average speed and maximum acceleration, sub-classified to enable the classification model. A related set of algorithms, as shown in [21], consists of a decision tree, fuzzy clustering, a support vector machine [22,23], and a learning vector quantisation neural network [24]. When using a stochastic technique, the driver load demand, engine torque and vehicle velocity are implemented as a stochastic Markov chain in the form of a state vector. Again, different studies have shown that, to compensate for faulty drive cycle parameters, stochastic DP (SDP) and stochastic MPC techniques can be integrated into an existing EMS. In addition, the identification techniques comprise Gaussian mixture models, statistical analysis, fuzzy categorisation and jerk evaluation [25].
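The windowed parameters used for drive cycle identification can be computed as in the sketch below. It assumes a uniformly sampled speed trace and non-overlapping windows; the function names, the 1 s sample time and the synthetic trace are illustrative assumptions, while the 15 s window is the lower bound quoted above.

```python
def window_features(speeds_mps, dt_s=1.0):
    """Summary features for one time window of speed samples (m/s)."""
    # Finite-difference acceleration between consecutive samples.
    accels = [(b - a) / dt_s for a, b in zip(speeds_mps, speeds_mps[1:])]
    return {
        "average_speed": sum(speeds_mps) / len(speeds_mps),
        "maximum_speed": max(speeds_mps),
        "maximum_acceleration": max(accels),
        "maximum_deceleration": min(accels),
    }

def sliding_features(speeds_mps, window_s=15.0, dt_s=1.0):
    """Features for consecutive, non-overlapping windows of window_s seconds."""
    n = int(window_s / dt_s)
    return [window_features(speeds_mps[i:i + n], dt_s)
            for i in range(0, len(speeds_mps) - n + 1, n)]

# Synthetic 30 s trace: accelerate for 15 s, then cruise.
trace = [float(min(t, 15)) for t in range(30)]
for features in sliding_features(trace):
    print(features)
```

Each feature dictionary could then be passed to a classifier of the kind listed in [21] to label the current driving pattern.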

Rule-based EMSs
RB-EMSs operate without pre-established driving cycle parameters, depending on human skills or heuristic techniques based on prior knowledge. This makes them simple and easy to implement because they use real-time values, especially when using state machine logic or look-up tables for their execution. Their main setback is that they cannot optimise performance, since doing so requires information about the drive cycle parameters before implementation. Again, RB-EMSs require pre-calibration to ensure optimal performance within a confined acceptable travel range of any selected drive cycle. A calibrated RB-EMS can only be used for a specific powertrain, but the concept and technique can be applied to another powertrain within specified operational boundaries. Hence, to improve the performance of an RB-EMS, other recognition and optimisation techniques must be incorporated. Such strategies include a multi-mode technique incorporated into the EMS, a thermostat with driving recognition, state machine control built into the ECMS and a multi-mode technique built into the drive cycle recognition using a neural network. RB-EMSs may not be the best technique available, but they have gained wide acceptance because of their simplicity in real-time applications. They are further categorised as follows [26]:

Fuzzy logic strategy
An FL strategy translates human experience and thinking into a set of conditional statements [27,28]. The entire process proceeds as follows: parameter input, quantisation, fuzzification, fuzzy reasoning, defuzzification and output quantisation. Its performance is defined by the membership functions and the set of fuzzy rules at the fuzzy reasoning level. FL strategies are robust because they do not rely on the mathematical model of the control system, and they can handle the multi-domain and nonlinear conditions present in FCHEVs, as demonstrated by [29,30]. The FL strategy is sub-classified into adaptive FL control, predictive FL control, and optimal FL control.
Adaptive FL control
Fundamentally, adaptive processes are included in an existing FL rule-based strategy to enhance the system's ability to swiftly adapt to variations. Reference [31] recommended a decentralised adaptive control system as a technique that can be used to improve system adaptation to undefined parameters such as vehicle loading, the drive cycle and changes in tire parameters. This was meant to be implemented in a four-wheel-drive hybrid electric vehicle (HEV) powertrain system. To maximise the EV torque and reduce the amount of fuel consumed, Mohebbi M, et al. created an adaptive neuro-fuzzy inference system [32]. To improve the HEV performance, Dazhi W, et al. implemented a double neural network-based adaptive estimator for the torque and velocity of the electric motor (EM) and engine [33]. The result obtained indicated improved acceleration and deceleration of the HEV. Again, Chen Z, et al. proposed an intelligent power management strategy for a vehicle powertrain; it included multiple power sources and used FL and machine learning (learning optimal power sources (LOPPS)) [34]. The LOPPS was modelled to learn from previous data sets based on the SOC condition created, and then generate an effective power-sharing strategy amongst the available power sources for cloud implementation.
Predictive FL control
Predictive FL control is used for effective energy management based on the predicted future state of the vehicle. It is modelled to reduce emissions, reduce fuel consumption and improve vehicle performance [16]. Reference [35] developed a predictive FL-RB technique to ascertain the response of a vehicle to anticipated traffic conditions and road grade data from a GPS. The result showed that the vehicle predicted the road condition with higher accuracy than when using only RB-EMS or OB-EMS methods.
Optimised fuzzy rules control
Optimised FL control is utilised to optimise the driving performance of the vehicle via an established optimisation process and thus accomplish the set aims, such as reduced fuel consumption, a healthy SOC, reduced emissions, increased travel range and improved performance. However, to achieve the set objectives, the membership functions and fuzzy rules must be improved by applying transformative optimisation algorithms such as the comparative factor algorithm, direct algorithm, or bee algorithm for a FCHEV [28,29,36-38].
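The fuzzy reasoning chain described for FL strategies (membership evaluation, rule firing and conversion back to a crisp output) can be sketched as follows. This is a minimal illustration, not a controller from the reviewed papers: the two inputs, the membership breakpoints, the four rules and the singleton outputs are all assumed values, and defuzzification uses a simple weighted average.

```python
def falling(x, lo, hi):
    """Membership that is 1 below lo, 0 above hi and linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)

def fc_power_share(soc, demand):
    """Fraction of the load assigned to the fuel cell.

    soc and demand are normalised to [0, 1]; all numbers are assumptions.
    """
    # Fuzzification: degree to which each input is 'low' or 'high'.
    soc_low = falling(soc, 0.3, 0.7)
    soc_high = 1.0 - soc_low
    dem_low = falling(demand, 0.2, 0.8)
    dem_high = 1.0 - dem_low
    # Rule base: (firing strength using min as AND, singleton output).
    rules = [
        (min(soc_low, dem_high), 1.0),   # low SOC, high demand: FC carries the load
        (min(soc_low, dem_low), 0.8),    # low SOC, low demand: FC also recharges
        (min(soc_high, dem_high), 0.6),  # high SOC, high demand: share the load
        (min(soc_high, dem_low), 0.2),   # high SOC, low demand: battery leads
    ]
    # Defuzzification: weighted average of the rule outputs.
    total = sum(weight for weight, _ in rules)
    return sum(weight * out for weight, out in rules) / total

print(fc_power_share(soc=0.2, demand=1.0))
print(fc_power_share(soc=0.9, demand=0.0))
```

In an adaptive or optimised variant, the breakpoints and rule outputs above would be the quantities tuned online or by an optimisation algorithm.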

Deterministic strategy
In a deterministic rule-based strategy, predetermined knowledge of the fuel efficiency, behaviour of individual components, power distribution in the drivetrain and other physical experience is used to build a look-up table that defines how the available power sources are distributed in the system [1,2,39]. The primary energy sources are modelled and simulated so that they operate either at their optimal point or within a high-efficiency range that will improve fuel consumption and reduce energy losses during transmission. This technique is mostly used in vehicular applications because of the ease of implementation and applicability to real-time operations [40]. Deterministic strategies are broadly sub-divided into frequency-decoupling strategies and optimal working condition-based strategies. In the case of frequency-decoupling control, slow dynamic energy sources, such as the FC in an FCHEV, provide the low-frequency power, while fast dynamic energy sources, such as the battery, supply the necessary power during peak and/or high-frequency demand. Optimal working condition-based strategies ensure that the vehicle is operated using the same battery size and FC pack while keeping both on their optimal operating line and within their optimal efficiency range, which slows their ageing, as proposed in [41,42].
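Frequency decoupling can be sketched with a first-order low-pass filter: the filtered demand goes to the fuel cell and the remainder to the battery. The filter type, the 10 s time constant and the step-shaped demand profile are assumptions chosen for illustration only.

```python
def frequency_decoupling(demand_kw, dt_s=1.0, tau_s=10.0):
    """Split a power demand trace into FC (slow) and battery (fast) parts.

    Assumption: a discrete first-order low-pass filter with time constant
    tau_s defines the low-frequency component supplied by the fuel cell.
    """
    alpha = dt_s / (tau_s + dt_s)
    fc_kw, battery_kw = [], []
    fc_prev = demand_kw[0]           # start the filter at the first sample
    for p in demand_kw:
        fc_prev = fc_prev + alpha * (p - fc_prev)
        fc_kw.append(fc_prev)
        battery_kw.append(p - fc_prev)   # battery covers the fast remainder
    return fc_kw, battery_kw

# Step from 10 kW to 40 kW, e.g. a sudden acceleration.
demand = [10.0] * 5 + [40.0] * 5
fc, battery = frequency_decoupling(demand)
for d, f, b in zip(demand, fc, battery):
    print(f"demand={d:5.1f}  fc={f:6.2f}  battery={b:6.2f}")
```

The fuel cell output ramps gradually after the step while the battery supplies the transient, which is the behaviour the strategy aims for.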

Learning-based EMS
LB-EMSs use a more sophisticated data-retrieving algorithm to extract large amounts of previously stored and real-time data to establish an improved control rule. One of the major benefits of LB-EMSs is their ability to produce an accurate control decision without much emphasis on exact model prediction [31,43,44]. However, it is time-consuming and very difficult to create a precise database and to have an algorithm that can model the correct size with the capacity to directly impact the performance of the controller [45]. Data-based techniques, including machine learning techniques, are flexible and can be used to manage large datasets effectively for different drive cycles under different conditions [40,46]. LB-EMSs can be used together with model-based techniques (e.g., RB-EMSs and OB-EMSs) to control the parameters during different drive cycles, such as on street roads or highways, and with good or rough drivers [10,47]. LB-EMSs can be subcategorised into supervised learning, unsupervised learning, neural network learning (NNL) and reinforcement learning (RL) systems according to the learning mode [48].

Supervised learning
When using a supervised LB-EMS, a prototype is subjected to a learning process that allows it to make the required predictions and to adjust itself according to the prediction errors. This process continues until the prototype attains the desired level of accuracy on the data set. The data sets are labelled and categorised to simplify the learning exercise and are used in an error-correction learning method [39,49]. In other words, the learning data sets are labelled and the required outcome for each learning input is predefined, so that the parameters can be fitted and the anticipated result simulated. Hence, Li Q, et al. used the root mean square error method to evaluate the general performance of the chosen technique, which was predefined for an anticipated condition in the central database system that records the sensor data set for the EM and ICE using the engine temperature, gasoline level and gear level [50].
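Error-correction learning on labelled data, with the root mean square error (RMSE) as the stopping criterion, can be sketched as below. The linear model, the learning rate and the synthetic labelled pairs are assumptions for illustration; they are not the model or data used in [50].

```python
import math

def rmse(model, data):
    """Root mean square error of a linear model (w, b) on labelled pairs."""
    w, b = model
    return math.sqrt(sum((w * x + b - y) ** 2 for x, y in data) / len(data))

def train(data, lr=0.1, target_rmse=0.01, max_epochs=5000):
    """Error-correction (delta rule) training until the RMSE target is met."""
    w, b = 0.0, 0.0
    for _ in range(max_epochs):
        for x, y in data:
            error = y - (w * x + b)   # prediction error on one labelled sample
            w += lr * error * x       # correct the parameters along the error
            b += lr * error
        if rmse((w, b), data) <= target_rmse:
            break
    return w, b

# Hypothetical labelled set: normalised demand -> normalised FC power reference.
DATA = [(x, 0.6 * x + 0.1) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
model = train(DATA)
print(model, rmse(model, DATA))
```

The loop mirrors the description above: predictions are made, parameters are adjusted according to the prediction errors, and training stops once the desired accuracy is reached.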

Unsupervised learning
When using an unsupervised LB-EMS, a prototype is developed by decreasing the number of parameters entered as input data and ensuring that it can extract the overall set of rules, arrange available data according to shared features and logically model a mathematical algorithm capable of eliminating redundancy. Reference [51] used c-means clustering to categorise the input parameters of a database that contains the optimal hybridisation levels across standard driving cycles together with the equivalent state vector of the vehicle, which comprises the battery SOC, operational temperature and vehicle speed. In that study, a knowledge-based control technique that uses a fuzzy c-means clustering algorithm was trained along the entire driving cycle. Reference [52] used a previously implemented clustering technique to create a set of clusters from which the RB control strategies for a parallel HEV were extracted. However, the cost analysis showed that the input data attracted some extra operational cost due to the minimisation of components.
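A compact fuzzy c-means routine shows how unlabelled state vectors can be grouped by shared features. It is a generic textbook form of the algorithm under assumed settings (two clusters, fuzzifier m = 2, fixed iteration count); the sample points, given as (SOC, speed) pairs, are invented for illustration and unscaled.

```python
import random

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=50, seed=0):
    """Return (centres, memberships) for a list of equal-length tuples."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([value / total for value in row])
    centres = []
    for _ in range(n_iter):
        # Centre update: mean of the points weighted by membership ** m.
        centres = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(len(points))]
            total = sum(weights)
            centres.append([
                sum(weights[i] * points[i][d] for i in range(len(points))) / total
                for d in range(dim)
            ])
        # Membership update from relative distances to every centre.
        for i, p in enumerate(points):
            dists = [
                max(1e-12, sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5)
                for c in centres
            ]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum(
                    (dists[k] / dj) ** (2.0 / (m - 1.0)) for dj in dists
                )
    return centres, u

# Invented (SOC, speed in m/s) samples forming two groups.
POINTS = [(0.20, 10.0), (0.25, 12.0), (0.30, 11.0),
          (0.80, 30.0), (0.85, 28.0), (0.90, 31.0)]
centres, memberships = fuzzy_c_means(POINTS)
print(centres)
```

In practice the inputs would be normalised first so that no single state variable dominates the distance measure.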

NNL
NNL was developed to mimic the neurons present in the human brain as its basic functionality. Like human neurons, the nodes of a neural network have several connections; each node is an element of the network with multiple inputs and outputs. Different behaviours of the vehicle can be modelled by using these neurons to form different layers and combining them differently [16,43]. Lin C, et al., Venditti M. and Murphey Y, et al. proposed a machine learning technique that incorporates an artificial neural network to treat the different types of roads (streets and highways), including the level of traffic jam prediction, with a DP algorithm for optimal energy control [46,53,54]. Again, Martinez C, et al. were able to reduce the computational time by 60% by using a real-time optimal control set of rules based on an EMS to teach an Elman neural network (ENN), and to keep the value of the battery SOC within a high range of efficiency [55]. The ENN is basically trained to mimic the human brain using neural network algorithms; hence, it enhances the knowledge acquired and neuron size for optimal functionality. Li W, et al. and Hu Y, et al. implemented neural DP and a backpropagation neural network for HEV EMSs, respectively [56,57]. Both results showed that the vehicle made decisions independently, including obeying traffic signs, allowing other vehicles the opportunity to drive next to it without collision and maintaining a normal speed level.

RL
RL systems are made up of two distinct but connected parts, namely, the learning agent and the corresponding environment. The learning agent regularly gathers information about the environment. Based on the accumulated data, the learning agent decides on a corresponding action to be implemented. Subsequently, the environment moves to the next state based on the action, and the reward caused by this transition is evaluated and sent back to the learning agent. The agent receives this instant reward, which forms the basis for a control scheme that maps the present condition to the most suitable control action. After the learning agent has acquired the necessary training for optimal operation, the best policy instructs the learning agent to take the set of actions that maximises the reward gathered over a specific time. Therefore, the control policy determines the best decision at every cycle [51,57]. Chin H. and Jafari A. proposed the use of an RL-EMS for implementation in a series HEV [51]. They proposed the use of a recursively updating algorithm to represent a real-time demand of power based on previous and predictive power demands. It used a measure of the difference between the previous probability distribution and future probability distribution to evaluate the power demand transition based on the Kullback-Leibler (KL) divergence method, also known as relative entropy. Zhang W, et al. implemented a temporal-difference-learning algorithm for effective power distribution in a plug-in HEV [58]. Qi X, et al. proposed the use of RL with constant state and action spaces to obtain an optimal control strategy for a plug-in HEV [59]. Hence, the RL strategy triggered an online update of the EMS whenever the power demand transition probability changed, as measured by the KL divergence rate. In addition, Li Y, et al. proposed a closed-loop RL network for a parallel HEV wherein the internal-loop RL reduces the running cost and the external loop controls the degradation in battery SOC [60]. Zhang Q. and Li G. developed a deep RL-based EMS for a plug-in hybrid electric vehicle (PHEV) [19]. The study used a constant target Q network with the capacity to satisfy the straight driving condition. The results showed that the major problem was how to achieve constant output conditions without the torque being negatively affected by the variation caused by the discretised output condition.
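Two ingredients named above, the KL divergence between transition probability distributions and a temporal-difference update, can be sketched in a few lines. The direction of the divergence (new relative to old), the 0.05 update threshold, the learning rate, the discount factor and the toy Q-table are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) of two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)

def needs_policy_update(old_probs, new_probs, threshold=0.05):
    """Flag an update when the power demand transition statistics have drifted."""
    return kl_divergence(new_probs, old_probs) > threshold

def td_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One tabular Q-learning (temporal-difference) step, applied in place."""
    best_next = max(q_table[next_state].values())
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error

# Transition probabilities of power demand out of one state (hypothetical).
old = [0.9, 0.1]
new = [0.5, 0.5]
print(kl_divergence(new, old), needs_policy_update(old, new))

# Toy Q-table: states are demand levels, actions are FC power levels.
q = {"low": {"fc_low": 0.0, "fc_high": 0.0},
     "high": {"fc_low": 1.0, "fc_high": 0.0}}
td_update(q, "low", "fc_low", reward=-2.0, next_state="high")
print(q["low"]["fc_low"])
```

Here the reward would typically be the negative of a fuel or running cost, so that maximising the accumulated reward minimises consumption.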

Optimisation-based EMSs
The OB control strategy is used mathematically to establish the control boundaries and objectives in a cost function based on the cost of fuel consumption, losses in the system and overall cost of the system. This is achieved by employing fundamental global optimisation values to define the control techniques that are informed by predetermined drive cycle data. However, there are some setbacks with this method, because utilising global optimisation creates design challenges for real-time applications; nevertheless, it is still a useful design strategy for assessing several control strategies. OB control is broadly categorised into two types, online and offline strategies, based on their reliance on previous information and knowledge of the driving conditions [61].

Online strategy
An online strategy does not require previous information of the driving condition, nor does it guarantee the optimal operation of the system in real-time situations, making it localised and fundamental [15,17,56]. In theory, online strategies replace the global optimisation problem with an instantaneous optimisation problem that can be executed within the constrained computational time and storage available in real time, as shown in Figure 4. ECMS and MPC are the most implemented real-time online OB-EMSs used in automotive applications [9,62,63].
i. ECMS
The ECMS controls the ESS SOC when the load is supplied, and it ensures that fuel consumption is kept at an optimal level without overstretching the available energy sources in the system [1]. It always ensures localised optimal operation of the individual components by taking into consideration the overall energy consumption and monitoring the SOC regularly. This type of strategy also supports multiple topologies and a flexible configuration, as the energy sources operate at optimal levels [58].
In addition, the equivalent coefficient, which is equal to the power ratio of all available power sources in the system, must be stated for effective operation under the ECMS, and the fundamental value of the co-state must be established. The equivalent coefficient is a vital parameter when using ECMS; it must be defined for enhanced performance alongside the co-state because both are linked to the drive cycle [58]. In addition, the expected drive cycle is normally pre-defined using a combination of various rules that will guarantee optimal operation, making it different from the offline strategy. The results from [64-66] showed that complete information on the drive cycle is necessary to ensure optimal operation and to use the equivalence factor, marginal cost technique and shooting method effectively. However, when using the equivalence factor method, all the necessary but unstable factors, such as the direction of electric current, battery SOC, drive cycle knowledge and charging/discharging cycle, must be updated accordingly. Fundamentally, ECMSs are used in FCHEVs to reduce the amount of hydrogen consumed by transforming the electric energy generated by the ESS into corresponding hydrogen consumption. In a previous study conducted on an FCHEV [67], a projected probability was introduced to two equivalent factors based on the charging and discharging of the supercapacitor to eliminate major changes to the supercapacitor's SOC; the results showed an increase in the supercapacitor's lifespan.
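The core ECMS step, converting battery power into an equivalent hydrogen flow and minimising the instantaneous total, can be sketched with a grid search. Every number here is an assumption: the quadratic hydrogen map, the SOC-dependent equivalence factor and the power limits are hypothetical and only serve to show the mechanism.

```python
def h2_rate_g_per_s(p_fc_kw):
    """Hypothetical convex hydrogen consumption map of the fuel cell (g/s)."""
    return 0.012 * p_fc_kw + 0.0001 * p_fc_kw ** 2

def equivalence_factor(soc, s0=0.018, soc_ref=0.6, gain=2.0):
    """Equivalent hydrogen cost of battery power (g/s per kW).

    Assumed form: the factor grows when the SOC is below its reference,
    which makes battery discharge more expensive.
    """
    return s0 * (1.0 + gain * (soc_ref - soc))

def ecms_split(demand_kw, soc, fc_max_kw=60.0, batt_max_kw=30.0, step_kw=1.0):
    """Return (p_fc, p_batt) minimising the equivalent hydrogen consumption."""
    s = equivalence_factor(soc)
    best = None
    for i in range(int(fc_max_kw / step_kw) + 1):
        p_fc = i * step_kw
        p_batt = demand_kw - p_fc        # positive: discharge, negative: charge
        if abs(p_batt) > batt_max_kw:
            continue                     # outside the battery power limits
        cost = h2_rate_g_per_s(p_fc) + s * p_batt
        if best is None or cost < best[0]:
            best = (cost, p_fc, p_batt)
    if best is None:
        raise ValueError("demand cannot be met within the power limits")
    return best[1], best[2]

for soc in (0.3, 0.6, 0.9):
    print(soc, ecms_split(demand_kw=40.0, soc=soc))
```

With these assumed values, a low SOC pushes the split towards the fuel cell (which also recharges the battery), while a high SOC shifts load to the battery.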
ii. MPC
Based on past and present models of a system, the MPC strategy makes predictions of the future and anticipated outcomes within a defined set of tested rules by using a quadratic cost function and component classification based on other tested models. The operation of MPC is based on a receding horizon control algorithm: the optimal input parameters are first established across a prediction horizon to minimise the objective function within the set boundaries. Thereafter, the initial components of the derived optimal parameters are implemented in real time, the prediction horizon is shifted ahead and the entire process is repeated. The finite-horizon optimal control problem is solved at each sampling stage, while the control signals are obtained using online optimisation [67]. It improves the existing strategy whilst maintaining potential outcomes; hence, it requires future parameters [43]. Generally, MPCs are connected to a GPS to offer real-time optimisation by distributing the area of operation of the driving force system against a group of linear configurations combined with established parameters. This includes providing a solution that can be connected with a DP algorithm. Analysing an MPC strategy based on its prediction algorithm allows it to be sub-categorised into stochastic and deterministic MPCs. Banvait H, et al. used different types of deterministic MPCs to ensure optimal operation [68]. Firstly, a prescient MPC was used to extract the existing information of a specific power demand for a localised but anticipated horizon window; the result showed a 96% improvement in the optimisation of the DP. Secondly, a frozen-time MPC presumes that the active power demand is stable over the prediction horizon. Thereafter, an exponential varying MPC assumes a rapid decrease in the unspecified driver demand torque along the prediction horizon.
Hence, because of its impractical assumptions, the deterministic MPC is considered a baseline amongst the MPCs and is used to evaluate others in this category. SDP-MPC is an OB control strategy that uses time-invariant results based on the vehicle characteristics and the probability of moving to a different operational condition to develop an effective EMS [43]. It enjoys a design flexibility that permits the utilisation of several drive cycle data sets for effective on-board direct implementation and practical (real-time) assessment using a Markov chain [69]. However, its major drawback is the computational complexity, which affects real-time implementation in some instances. This problem can be solved by representing the cost function as a linear quadratic control or a quadratic equation combined with predetermined drive cycle information [70].
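The receding-horizon idea behind MPC can be sketched with a brute-force search over a short horizon, using the frozen-time assumption that the present demand persists over the prediction window. The discrete fuel cell set-points, battery capacity, cost weights and three-step horizon are hypothetical values chosen to keep the example small.

```python
from itertools import product

FC_LEVELS_KW = (0.0, 20.0, 40.0, 60.0)   # hypothetical FC set-points
CAPACITY_KWH = 2.0                        # hypothetical battery capacity
DT_H = 1.0 / 3600.0                       # 1 s control step, in hours

def sequence_cost(fc_sequence, soc, demand_forecast, soc_ref=0.6):
    """Quadratic cost of one candidate FC power sequence over the horizon."""
    cost = 0.0
    for p_fc, demand in zip(fc_sequence, demand_forecast):
        p_batt = demand - p_fc                      # battery covers the rest
        soc -= p_batt * DT_H / CAPACITY_KWH         # simple SOC prediction
        cost += 0.012 * p_fc + 0.0001 * p_fc ** 2   # assumed hydrogen term
        cost += 1000.0 * (soc - soc_ref) ** 2       # assumed SOC tracking term
    return cost

def mpc_step(soc, demand_now_kw, horizon=3):
    """Optimise over the horizon and return only the first FC set-point."""
    forecast = [demand_now_kw] * horizon            # frozen-time prediction
    best = min(product(FC_LEVELS_KW, repeat=horizon),
               key=lambda seq: sequence_cost(seq, soc, forecast))
    return best[0]

# Closed loop: apply the first move, update the SOC, then re-optimise.
soc = 0.3
for demand in (40.0, 40.0, 10.0):
    p_fc = mpc_step(soc, demand)
    soc -= (demand - p_fc) * DT_H / CAPACITY_KWH
    print(f"demand={demand:4.1f} kW  fc={p_fc:4.1f} kW  soc={soc:.4f}")
```

Only the first element of each optimised sequence is applied before the horizon moves forward, which is what distinguishes this scheme from a one-off offline optimisation.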

Offline strategies
The output of offline strategies depends on future values of their inputs and on prior knowledge of a similar drive cycle to ensure optimal operation. Hence, it is imperative to establish an optimal reference point against which other offline strategies can be measured [71,72]. The type of powertrain topology determines the power flow route in the system and the type of problem created, but boundaries such as the drive cycle, vehicle power demand, travel range and battery SOC are usually the same for most configurations. Therefore, once the necessary boundaries have been set and the problem identified, a suitable algorithm is selected to perform the optimisation. This includes power sharing between the fuel cell, ESS and other power sources in the system. Based on the type of problem created and the intended solution, offline OB strategies are broadly sub-categorised as game theory (GT), derivative-free, gradient, direct and indirect algorithms [43,73].
i. GT GT EMSs offer a promising approach to energy management by introducing an independent optimisation operation for each controlled device. GT is widely used in different powertrain systems because of its high flexibility and suitability to different element models [74]. The control system of a GT strategy consists of the major components, namely the primary and secondary power sources that supply the powertrain, together with the power-consuming components. The latter include the propulsion loads such as the grade, aerodynamic drag, acceleration and traction, as well as the ESSs. Owing to its distinctive ability to handle problems associated with interfacing components in systems with multiple power sources, GT has gained wide acceptance in smart grid applications, HEVs and sustainable energy applications. However, to enhance system performance, a set of defined rules is usually optimised by using an offline optimisation algorithm for a specific drive cycle [75]. These may include a DP algorithm, PSO, direct global optimisation, a GA and SA optimisation. Each of these optimisation algorithms is modelled to handle a specific aspect of the optimisation target without affecting other aspects of the system. Furthermore, when GT is used for energy management in a HEV or any other hybrid power system, each power source is modelled as a separate player that decides on the specific amount of power that will ensure its own optimisation. This, however, depends on the drive cycle information, which can be forecast using predictive techniques such as Markov chain models, neural networks, a support vector machine and sophisticated sensor tools [19]. For example, Gielniak M. and Shen Z. used GT to optimise powertrain efficiency and improve the performance of a FCHEV [76].
ii. Derivative-free algorithm (DFA) DFAs are used in EMSs when derivative information is uncertain, unavailable or impractical to obtain. Unlike gradient algorithms, DFAs also have the capacity to converge to a global solution; they include PSO, GAs, SA and multi-objective GAs (MOGAs). PSO was first introduced by Kennedy and Eberhart [77]. The technique mimics the way social organisms behave when operating in groups in their natural habitat. It offers a useful platform for members of the group to share valuable information that will enhance group and individual optimisation. In 2006, Wang Z, et al. applied the technique to HEV optimisation to reduce fuel consumption and CO2 emissions [78]. Lin X, et al. used the same strategy to optimise the training of neural networks and the regulation and control of the operational parameters in a real-time controller [79]. Similarly, Hegazy O, et al. and Desai C, et al. used a PSO strategy to optimise the design of electrochemical systems such as supercapacitors and fuel cells [80,81].
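As a rough illustration of how PSO shares information between particles, the following sketch implements a basic global-best PSO using only the Python standard library. The objective `surrogate_fuel` is a hypothetical stand-in for a fuel-consumption simulation over two EMS tuning parameters, and the inertia and acceleration coefficients are assumed values rather than settings reported in the cited works.

```python
import random


def pso(objective, bounds, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's own best
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # best shared by the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for j, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][j] = (w * vel[i][j]
                             + c1 * r1 * (best_pos[i][j] - pos[i][j])
                             + c2 * r2 * (g_pos[j] - pos[i][j]))
                # Keep the particle inside the search box.
                pos[i][j] = min(hi, max(lo, pos[i][j] + vel[i][j]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def surrogate_fuel(x):
    """Hypothetical fuel-cost surrogate with its minimum at (0.6, 0.3)."""
    return 1.0 + (x[0] - 0.6) ** 2 + (x[1] - 0.3) ** 2


best_params, best_cost = pso(surrogate_fuel, [(0.0, 1.0), (0.0, 1.0)])
```

No derivative of the objective is evaluated at any point, which is what makes the method usable when the cost comes from a black-box vehicle simulation.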
The GA method involves the three fundamental stages of reproduction, crossover and mutation; it is a widely used strategy, introduced in 1975, that is inspired by natural selection and evolution. It has the capacity to solve complex multimodal, nonlinear, discontinuous-time and concave optimisation problems and to reach global optima by escaping local optima [38]. Piccolo A, et al. implemented this strategy in 2001 to optimise the energy management of a HEV [82]. To establish the optimal engine-on power level in consideration of the reactive and active power of a battery, Chen Z, et al. applied a GA strategy to a power-split PHEV [83]. The results showed that the fuel consumption was reduced with increased travel range. In addition, Desai C and Williamson S used a MOGA strategy to optimise individual components of a powertrain, reduce fuel consumption and lower CO2 emissions [84].
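The three stages named above (reproduction, crossover and mutation) can be sketched as a small real-coded GA. This is only an illustration under assumed settings: binary tournament selection, arithmetic crossover, Gaussian mutation and one elite individual per generation are common choices, not necessarily those of the cited studies, and `toy_cost` is a hypothetical objective.

```python
import random


def genetic_algorithm(objective, bounds, pop_size=30, generations=60,
                      mutation_rate=0.2, sigma=0.1, seed=0):
    """Real-coded GA; returns (best, best_value, best_value_history)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=objective)
        history.append(objective(pop[0]))
        children = [pop[0][:]]  # elitism: the best individual survives
        while len(children) < pop_size:
            # Reproduction: two binary tournaments pick the parents.
            p1 = min(rng.sample(pop, 2), key=objective)
            p2 = min(rng.sample(pop, 2), key=objective)
            # Crossover: random convex combination of the parents.
            a = rng.random()
            child = [a * x + (1.0 - a) * y for x, y in zip(p1, p2)]
            # Mutation: occasional Gaussian perturbation, clipped to bounds.
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[j] = min(hi, max(lo, child[j] + rng.gauss(0.0, sigma)))
            children.append(child)
        pop = children
    best = min(pop, key=objective)
    return best, objective(best), history


def toy_cost(x):
    """Hypothetical cost with its minimum (zero) at (0.6, 0.3)."""
    return (x[0] - 0.6) ** 2 + (x[1] - 0.3) ** 2


best, best_value, history = genetic_algorithm(toy_cost, [(0.0, 1.0), (0.0, 1.0)])
```

Because the elite individual is copied unchanged into each new generation, the best cost recorded in `history` can never increase from one generation to the next.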
SA, introduced in 1983, was inspired by the metal annealing process [85]. Its primary objective is to search for the best solution by using a stochastic method that explores candidate solutions and accepts changes according to their effect on the objective function. SA techniques do not guarantee a global optimum, but they can be used together with complementary techniques, such as a GA and PSO, to pursue optimisation iteratively [86]. However, Hui S, used the GA technique because of its flexibility to implement a reliable global convergence and optimise an HEV [87]. Sharma A, implemented a hybridised SA and PSO system to improve the convergence ability of the SA algorithm [88]. The results showed enhanced system optimisation and reduced CO2 emission, as well as reduced fuel consumption. In addition, Chen Z, et al. capitalised on the ability of SA to locate the maximum current coefficient using PMP, and thus to locate the battery current stability [89].
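A minimal SA sketch, assuming a geometric cooling schedule, Gaussian neighbourhood moves and the standard Metropolis acceptance rule, is given below. The function names, the cooling parameters and the objective `toy_cost` are hypothetical and serve only to show how worse candidates are occasionally accepted while the temperature is high.

```python
import math
import random


def simulated_annealing(objective, x0, bounds, t0=1.0, cooling=0.95,
                        iters=500, step=0.1, seed=0):
    """Return (best_point, best_value) found by simulated annealing."""
    rng = random.Random(seed)
    x, fx = list(x0), objective(x0)
    best, f_best = x[:], fx
    t = t0
    for _ in range(iters):
        # Propose a random neighbour, clipped to the search bounds.
        cand = [min(hi, max(lo, xi + rng.gauss(0.0, step)))
                for xi, (lo, hi) in zip(x, bounds)]
        f_cand = objective(cand)
        # Metropolis rule: always accept improvements; accept a worse
        # candidate with probability exp(-(increase in cost) / temperature).
        if f_cand <= fx or rng.random() < math.exp((fx - f_cand) / t):
            x, fx = cand, f_cand
            if fx < f_best:
                best, f_best = x[:], fx
        t *= cooling  # geometric cooling schedule
    return best, f_best


def toy_cost(x):
    """Hypothetical cost with its minimum (zero) at (0.6, 0.3)."""
    return (x[0] - 0.6) ** 2 + (x[1] - 0.3) ** 2


best, f_best = simulated_annealing(toy_cost, [0.0, 0.0],
                                   [(0.0, 1.0), (0.0, 1.0)])
```

The returned point is the best one visited, not necessarily the final state, which is one simple way to compensate for the lack of a global-optimality guarantee.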
iii. Gradient algorithms Several studies have tried to provide a solution that will reduce the calculation time of OB-EMSs while enhancing their reliability. These solutions are aligned with established objectives and have effectively improved the number of applicable equations in gradient algorithm EMSs. This is necessary because vehicle powertrains have become more complex and, in most instances, have nonlinear limitations. However, these solutions use derivative data subject to mathematical conditions such as differentiability, continuity or a Lipschitz condition to reach the optimisation solution. Gradient-based EMSs are broadly categorised into quadratic programming (QP), convex programming (CP), linear programming (LP) and sequential QP EMSs. The powertrain in a QP-based EMS is approximated to obtain a QP formulation defined by a quadratic cost function within the established linear constraints. Reck R, et al. and Koot M, et al. used predictive information of the drive cycle over a limited prediction horizon and a QP-based EMS in a mixed-integer formulation within established constraints to reduce the calculation time and improve optimisation, respectively [90,91]. When using the CP method, vehicle models are simplified to meet the convexity constraints by relaxing the constraint conditions, removing the ON/OFF decision of the ICE, replacing the battery SOC with the battery energy capacity, etc. In instances in which vehicles are modelled using quadratic equations, EM losses and the power level at a particular speed are estimated using second-order quadratic equations and battery power is represented by using quadratic linear equations [92][93][94]. In addition, the hydrogen utilisation of an FCHEV can also be estimated by using a quadratic equation [95,96].
The fuel economy problem in LP-based EMSs is treated as a convex nonlinear optimisation problem that is approximated by using piecewise-linear functions or solved by using a set of linear matrix equations [97]. LP techniques seek a solution for a linear objective with linear constraints, QP techniques seek a solution for a quadratic objective with linear constraints and CP techniques seek a solution for a convex objective with inequality constraints. Nevertheless, standalone gradient algorithms cannot independently provide total optimisation solutions because the simplifications they require reduce the accuracy of the vehicle model.
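The piecewise-linear estimation mentioned above can be sketched as follows. A convex fuel curve is bounded from below by its tangent lines, so the maximum over a few tangents gives a piecewise-linear approximation that an LP can represent with one linear inequality per tangent. The quadratic curve `fuel` and its coefficients are hypothetical, chosen only for illustration.

```python
def tangent_cuts(f, df, breakpoints):
    """Return (slope, intercept) of the tangent to f at each breakpoint."""
    return [(df(p), f(p) - df(p) * p) for p in breakpoints]


def piecewise_linear(cuts, p):
    """Evaluate the piecewise-linear approximation: max over all tangents."""
    return max(m * p + c for m, c in cuts)


def fuel(p):
    """Hypothetical convex fuel-rate curve versus fuel cell power p."""
    return p + 0.1 * p ** 2


def fuel_slope(p):
    """Derivative of the hypothetical fuel curve."""
    return 1.0 + 0.2 * p


cuts = tangent_cuts(fuel, fuel_slope, [0.0, 2.0, 4.0, 6.0])
approx_at_3 = piecewise_linear(cuts, 3.0)  # lies slightly below fuel(3.0)
```

In an LP, the fuel rate at each time step would be replaced by an auxiliary variable `t` with constraints `t >= m * p + c` for every `(m, c)` in `cuts`; minimising `t` then reproduces `piecewise_linear`. Adding breakpoints tightens the approximation at the cost of more constraints.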
iv. Direct algorithms DP, also known as deterministic DP (DDP), is the most used optimisation EMS for offline applications. It is expressed in the form of quadratic equations and is better implemented using established drive cycle information. The primary purpose of DDP is to express the solution in a nonlinear dynamic format that is discretised into separate time steps. DP is an OB control strategy that uses time-invariant results based on the vehicle characteristics and the probability of transitioning to a different operating condition to develop an effective EMS [43]. Its design flexibility permits the use of several drive cycle data sets for effective on-board implementation and practical (real-time) assessment using a Markov chain [69]. However, its major drawback is its computational complexity, which affects real-time implementation in some instances. This problem can be mitigated by representing the cost function as a linear quadratic control problem or as a quadratic equation combined with predetermined drive cycle information [70]. The Markov chain function is created at every sample time and can be implemented using a backward recursive technique. Lin C, et al. and Chen Z, et al. implemented a DDP in a HEV [72,97] and Gong Q, et al. implemented it in a PHEV [98]. Sundström O, and Stefanopoulou A, utilised DP to reduce the cost function created using a sequential duplication function for the battery SOC variations of an FCHEV, the oxygen excess ratio and the hydrogen consumption [99]. In addition, Santucci A, et al. proposed a DP design that has the capacity to approximate the obtainable growth over the lifespan of the battery by using a hybrid ESS [100]. As indicated above, DP operates effectively only with specific drive cycle information, and it does not ensure optimisation under different drive cycle conditions.
In addition, the rule extraction process is very complex and takes time to implement; also, the feedback response cannot be implemented promptly. To address the above problems, Lin C, et al. developed an SDP using a Markov chain with transition probabilities [46]. The EMS was implemented by using different drive cycle information selected at random. The results showed that the SDP improved the battery SOC control with fewer components requiring adjustment because of a reduction in the total costs. The problem created by the SDP was then solved using different methods such as barycentric interpolation, constraint generation and LP.
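The backward recursion at the heart of deterministic DP can be sketched on a toy fuel cell/battery power split. The sketch assumes a fully known demand profile, integer power and SOC levels, a hypothetical convex hydrogen-cost curve `h2_cost` and a charge-sustaining terminal constraint (final SOC equal to initial SOC). None of these values comes from the cited studies.

```python
def h2_cost(p_fc):
    """Hypothetical convex hydrogen cost for fuel cell power p_fc."""
    return p_fc + 0.1 * p_fc ** 2


def dp_power_split(demand, soc_levels=11, soc0=5, max_batt=2, fc_max=6):
    """Backward DP over a discrete SOC grid; returns (cost, fc_plan)."""
    inf = float("inf")
    n = len(demand)
    # cost_to_go[k][s]: minimum hydrogen cost from step k at SOC level s.
    cost_to_go = [[inf] * soc_levels for _ in range(n + 1)]
    policy = [[0] * soc_levels for _ in range(n)]
    cost_to_go[n][soc0] = 0.0  # charge-sustaining terminal constraint
    for k in range(n - 1, -1, -1):              # backward in time
        for s in range(soc_levels):
            for b in range(-max_batt, max_batt + 1):  # b > 0 discharges
                p_fc = demand[k] - b            # fuel cell covers the rest
                s_next = s - b
                if not (0 <= p_fc <= fc_max and 0 <= s_next < soc_levels):
                    continue                    # infeasible decision
                c = h2_cost(p_fc) + cost_to_go[k + 1][s_next]
                if c < cost_to_go[k][s]:
                    cost_to_go[k][s] = c
                    policy[k][s] = b
    if cost_to_go[0][soc0] == inf:
        raise ValueError("no feasible charge-sustaining plan")
    # Forward pass: follow the stored optimal decisions from the start.
    s, fc_plan = soc0, []
    for k in range(n):
        b = policy[k][s]
        fc_plan.append(demand[k] - b)
        s -= b
    return cost_to_go[0][soc0], fc_plan


total_cost, fc_plan = dp_power_split([1, 5, 1, 5])
```

The triple loop over time steps, SOC levels and decisions shows why the computational burden grows quickly with grid resolution, and the need for the whole `demand` list in advance shows why the result is non-causal.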
v. Indirect algorithms PMP is the most widely used indirect algorithm for optimal control problems [101]. The Russian mathematician Lev Pontryagin was the first person to develop this technique in 1956 to provide a solution to constrained global optimisation problems. The PMP provides necessary conditions for optimality, whereas sufficient conditions are given by the Hamilton-Jacobi-Bellman equation; its fundamental purpose is to reduce the constrained global optimisation problem to a local Hamiltonian minimisation problem. The Hamiltonian includes a co-state, which can be interpreted as a weighting factor on the use of electrical energy [102]. When the complete drive cycle information is provided, the ideal value of the first co-state can be established using a recursive algorithm; with undefined drive cycle information, however, the first co-state will take different values. Moreover, the look-up table will grow significantly because of the complex computational process and the increased number of components. This implies that the storage and computational capacities of the chosen controllers will require corresponding increases, thereby making the PMP unsuitable for direct implementation in real-time applications. In 2001, Sebastien D, et al. proposed an application that used the PMP to optimise the EMS of a parallel HEV [103]. Serrao L, and Rizzoni G, did a similar study by implementing the same concept to determine the optimal power split for a hybrid electric truck [104], and Bernard J, et al. developed a mechanism that guaranteed effective power sharing between the FC and ESS of an FCHEV to reduce the amount of hydrogen consumed for a particular drive cycle [105]. In addition, Hemi H, et al. implemented a combination of an optimal control solution using PMP and a Markov chain for efficient power distribution in an FCHEV [106]. The results showed that the power was controlled with an increase in the travel range.
The results in [107] indicated that the solution provided by the PMP is similar to the DP results, and that the co-state significantly affects the battery SOC variation. Several studies have proposed solutions to approximate the first co-state and rectify the problem associated with it, including Pham [110]. These were used to determine the error signal between the actual battery SOC level and the different reference levels that are obtained based on the past, present and anticipated information. Hence, Pham T, et al. utilised the battery energy variations and the battery temperature (double proportional feedback controllers) [108]. Researchers have also tried to improve the constrained optimisation problem by implementing a damped Newton technique to mitigate the computational complexity associated with the Hamiltonian optimisation method [111]. In addition, Hou C, et al. proposed an approximate PMP (A-PMP) technique to reduce the computational time of the PMP by implementing recorded patterns in the numeric PMP [112]. The study was conducted by establishing the turning point of the ICE fuel level by using a piecewise linear approximation technique. Fundamentally, when using the A-PMP strategy, the system will calculate and determine the Hamiltonian candidate, incorporating the optimal point of the PHEV by adding a convex estimation to the host Hamiltonian.
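The role of the co-state can be illustrated with a small sketch in which the Hamiltonian is minimised independently at each time step and a constant co-state is tuned by bisection until the battery ends the cycle with zero net discharge. It assumes a fully known demand profile, a constant co-state (reasonable only when battery losses do not depend on SOC), a unit time step and a hypothetical convex hydrogen-rate curve `h2_rate`; all names and numbers are illustrative.

```python
def h2_rate(p_fc):
    """Hypothetical convex hydrogen consumption rate for power p_fc."""
    return p_fc + 0.1 * p_fc ** 2


def best_battery_power(demand, lam, b_max=2.0, steps=81):
    """Minimise the Hamiltonian H = h2_rate(demand - b) + lam * b over b."""
    candidates = [-b_max + 2.0 * b_max * i / (steps - 1) for i in range(steps)]
    feasible = [b for b in candidates if demand - b >= 0.0]
    return min(feasible, key=lambda b: h2_rate(demand - b) + lam * b)


def pmp_split(demand, lam_lo=0.0, lam_hi=5.0, iters=40):
    """Bisection on a constant co-state until net battery use is not positive."""
    for _ in range(iters):
        lam = 0.5 * (lam_lo + lam_hi)
        net_discharge = sum(best_battery_power(d, lam) for d in demand)
        if net_discharge > 0.0:
            lam_lo = lam   # battery drained overall: price its energy higher
        else:
            lam_hi = lam   # battery not drained: this price is high enough
    fc_plan = [d - best_battery_power(d, lam_hi) for d in demand]
    return lam_hi, fc_plan


co_state, fc_plan = pmp_split([1.0, 5.0, 1.0, 5.0])
```

The outer bisection needs the whole cycle in advance, which mirrors the dependence of the first co-state on complete drive cycle information; the inner minimisation, by contrast, is purely local and cheap.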

Discussion and future research work
A number of studies on various FCHEV EMS technologies have been carried out in the past decade, with significant results presented in the previous sections of this paper. However, the fast growth in electromobility and emerging technologies has introduced complex computational tasks as well as huge opportunities to improve the performance of FCHEVs and their corresponding EMSs. This technological development has presented some advantages and disadvantages, including the need to provide improved fuel economy and an increased travel range, even as FCHEVs have become more sophisticated, as shown in Table 2. Hence, there is a need to highlight a future research path within this area.
Offline OB-EMSs have shown computational complexities when used online, while RB-EMSs have demonstrated significant success when used for real-time applications. However, RB-EMSs cannot ensure the best optimisation within the established constraints because of their fundamental configuration. This is because a significant amount of time and predefined drive cycle information are needed for a specific application to regulate the control components. Nevertheless, DP, GA and PSO strategies offer a non-causal output that cannot be implemented in real time.
Several FCHEV EMSs have been presented according to their performance in fuel economy, tailpipe emission and travel range. The results showed that no single EMS has the capacity or technological capability to provide solutions to all the problems presented. Hence, several studies combined different optimisation strategies that complemented each other to improve the overall performance of EMSs. One such study combined the PMP with CP to enhance optimisation for ICE ON/OFF switching and power distribution in a HEV [113,114]. In this study, the PMP systematically combined the ICE ON/OFF algorithm and convex optimisation to determine the optimal point. Elbert P, et al. combined CP with DP and provided solutions to optimisation problems associated with mixed-integer EMSs, making it possible to combine the ICE ON/OFF method and gearshift with a convex optimisation [115].
Most of the studies focused on the use of somewhat old algorithms, such as PSO algorithms, SA algorithms and GAs, for OB-EMS control to analyse the optimisation level. According to [116], over 30 different fundamental algorithms have been used in several studies and presented in various papers. However, most of these have not been successfully implemented in FCHEV EMS optimisation; hence, it would be interesting to investigate their applicability, reliability, suitability and effectiveness for online applications. Therefore, research on some of these emerging algorithms could provide the necessary improvements in computational time, in the ability to compute very complex and multidimensional configurations and in the integration and combination of different EMS control strategies aimed at increasing the overall performance. This can be a random combination of any such technologies, such as social-based, bio-inspired, swarm-based and chemistry-based techniques. In addition, EMSs can be extended to multi-time scales, multi-vehicle interaction and multi-information levels. There is also huge research potential in combining machine learning with OB techniques. In this instance, the EMS will consider more than one vehicle interacting with a smart grid network by using a smart charging system to ensure enhanced optimisation.
Different integration possibilities can be considered for future research from an integrated EMS perspective, starting with the single powertrain level, where the EMS is integrated with other sub-systems such as aftertreatment [121], a waste heat recovery (WHR) system [122] or thermal loads [123]. Achieving the above will help improve fuel economy while considering the tailpipe emissions of hydrocarbons, nitrogen oxides (NOx) and carbon monoxide. A diesel engine aftertreatment-WHR system is a promising energy recovery technology with significant potential in heavy-duty vehicle and truck applications [121]. As powertrain topologies become more complex with associated computational difficulties, there is an increased need for the development of a complete hierarchical EMS with the ability to efficiently coordinate a multi-scale time horizon that will ensure optimal driver safety. Hence, future research should consider combining the different control layers into a single EMS framework, e.g., integrated powertrain control with a sensor-based emission evaluation system [121], an integrated optimal EMS framework [18] and a multi-level EMS [124]. This will take into account road conditions such as road grade, speed limits and altitude when considering the vehicle's trajectories. Finally, with improvements in cyber-physical systems [125], the integration of environmentally friendly driving into an EMS at the double-vehicle level via adaptive/predictive control [126], or at the multiple-vehicle level [127], is appealing for research consideration.

Conclusions
This paper has presented a comprehensive literature review on various EMSs used for FCHEV applications. Scientific and technical literature on EMSs was adequately presented and categorised according to the type of technology. Fundamentally, FCHEV EMSs are proposed to provide basic control, such as ESS charge maintenance, optimisation of the vehicle travel range, emissions reduction, vehicle performance enhancement and fuel consumption reduction. Hence, the study has provided basic EMS requirements and included a comprehensive categorisation of current EMS techniques and their unique contributions, operational differences, fundamental principles, advantages and disadvantages. Presently, there are several EMS techniques available for effective power distribution in FCHEVs. However, they differ by complexity, technology maturity, cost and accuracy. Hence, the necessary skills and knowledge are required when choosing an EMS for any application. In addition, several strategies, findings and results were investigated, with specific interest in the FC lifespan, battery degradation and travel range. Again, because the primary focus was EMS techniques, the paper has presented, in detail, types of EMSs and discussed their suitability and technology maturity; research gaps were also identified for future research aimed at improving reliability and effectiveness.