Quantum Deep Learning for Fast Switching of Full-Bridge Power Converters

Abstract: With the qualitative development of DC microgrids, loads with unique conditions and features can now be used in electric power grids. Due to the negative impedance of some loads, called constant power loads (CPLs), the control of DC power converters faces significant stability challenges. Despite the significant advances in semiconductors, the control of gate drivers has not been upgraded to exploit the full potential of power electronic systems. In this paper, quantum computation is incorporated into artificial intelligence (AI) to stabilize a full-bridge (FB) DC-DC boost converter feeding a CPL. To improve the bus voltage stabilization of the FB DC-DC boost converter, a quantum deep reinforcement learning (QDRL) control methodology is developed. By defining a reward function according to the specification of the FB power converter, the desired performance and control objectives are fulfilled. The main task of QDRL is to adjust the control coefficients embedded in the feedback controller to suppress the negative impedance effect resulting from the CPLs. By exploiting the potential advantages of quantum fundamentals, deep reinforcement learning improved by quantum specifications will not only enhance the performance of the DRL algorithm on conventional processes but also advance related research areas such as quantum computing and AI. Unlike basic quantum theory, which requires real quantum hardware, QDRL can be executed on classical computers. To examine the feasibility of the QDRL scheme, hardware-in-the-loop (HiL) examinations are conducted using the OPAL-RT. The comparison of the proposed controller with classic state-of-the-art methodologies reveals the superiority and feasibility of QDRL-based control schemes in both transient and steady-state conditions. Analysis using various performance criteria, including the integral absolute error (IAE), integral time absolute error (ITAE), mean absolute error (MAE), and root mean square error (RMSE), demonstrates the dynamic improvement of the proposed scheme over sliding mode control (approximately 50%) and proportional integral control (approximately 100%).


Introduction and Preliminaries
With the vast penetration of non-conventional energy sources (photovoltaic (PV), hydropower, wind, geothermal, etc.) into modern power systems, the concept of integrating these technologies into microgrid (MG) form has drawn a lot of academic interest over the past 15 years [1][2][3]. The benefits of MGs include low-cost energy, high local resiliency, simple connection to power source units, and growth in the number of users who can be connected to them. Unlike AC microgrids (ACMGs), which face challenges such as harmonics, reactive power, and frequency synchronization, DC microgrids (DCMGs) are projected to play a particular role in power networks. In practice, various power interface converters and filters are embedded in DCMGs to convert the energy of sustainable sources for supplying DC/AC loads. Moreover, the distributed structure of integrated power systems can be created by paralleling power electronic interfaces [4,5].

Designs 2023, 7, x FOR PEER REVIEW 3 of 14

the DCMGs architecture. In this application, a time-varying CPL is applied to the test system, which imposes the highest level of instability on the DCMG from a power electronics perspective. The main contributions of this work are as follows:

• A full-bridge boost converter feeding constant power loads is modeled in the context of microgrids. For this purpose, the average dynamics of the power interface system are provided.
• Quantum computation based on deep reinforcement learning is developed to control the FB power converter.
• Extensive examinations and comparative analyses are conducted to validate the efficiency of the proposed control scheme for the FB DC-DC power converter.
• HiL tests based on OPAL-RT are developed to test the feasibility of the proposed QDRL algorithm.

This article is organized as follows. In Section II, the model of a full-bridge power converter supplying a CPL is presented. Then, all parts of the suggested control methodology are introduced in Section III. Section IV is devoted to the real-time examinations of the power electronic case study. The work is concluded in Section V.

Dynamic Model of Full Bridge Converter under CPL
The isolated full-bridge (FB) DC/DC converter is a practical and extensively adopted solution for isolated power converter systems. Compared to other DC-DC converters, full-bridge converters are suitable for integrated power systems where maximum voltage and maximum power are required. The structure of the FB converter is depicted in Figure 1; it consists of a DC source, a boost converter, an isolated FB, and a constant power load [29,30]. On the left side of the converter, a boost converter is configured, which steps up the input voltage ($E$ or $V_{dc}$) to a higher level. In the context of MGs, many generation units, such as fuel cells and photovoltaics, can be adopted as the input source. The main role of the full-bridge stage is the transformation from a high-voltage bus to an intermediate level. The input voltage of the full-bridge converter should be set to 110 V according to the reference voltage $V_{ref}$. Moreover, a transformer with an LC filter is implemented to transfer power instantaneously from the output of the FB converter to the external CPL load.

The dynamic equation of the CPL is given as [31]:

$$i_{CPL} = \frac{P_{CPL}}{v_o}$$

where $v_o$ and $i_{CPL}$ denote the output voltage and current of the boost converter, respectively, and $P_{CPL}$ is the CPL's power. The average model of the boost converter is formulated in [31] in terms of $v_c$, the voltage of the capacitor $C$, and $i_L$, the current of the inductor $L$, in the boost converter.
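As an illustration only, the following Python sketch implements the textbook averaged model of an ideal boost converter feeding an ideal CPL. It is not taken from [31]: the state equations, the function names, and the numerical values (a 48 V source, a 300 W load, and the component values) are assumptions chosen for the example. The sketch also shows why a CPL behaves as a negative incremental resistance, which is the source of the stability problem discussed above.

```python
def cpl_current(v_o, p_cpl):
    """Current drawn by an ideal constant power load: i_CPL = P_CPL / v_o."""
    return p_cpl / v_o


def cpl_incremental_resistance(v_o, p_cpl):
    """Small-signal resistance dv/di of an ideal CPL: -v_o^2 / P_CPL (always negative)."""
    return -(v_o ** 2) / p_cpl


def boost_cpl_derivatives(i_l, v_c, duty, e_in, p_cpl, ind, cap):
    """Textbook averaged boost model (assumed form, not copied from the paper).

    L di_L/dt = E - (1 - d) v_c
    C dv_c/dt = (1 - d) i_L - P_CPL / v_c
    """
    di_l = (e_in - (1.0 - duty) * v_c) / ind
    dv_c = ((1.0 - duty) * i_l - cpl_current(v_c, p_cpl)) / cap
    return di_l, dv_c


def equilibrium(v_ref, e_in, p_cpl):
    """Duty cycle and inductor current that hold v_c at v_ref in steady state."""
    duty = 1.0 - e_in / v_ref
    i_l = p_cpl / e_in  # lossless converter: input power equals load power
    return duty, i_l


# Illustrative operating point; all numbers here are assumptions.
E_IN, V_REF, P_CPL = 48.0, 110.0, 300.0
IND, CAP = 1e-3, 1e-3  # hypothetical inductance [H] and capacitance [F]

duty0, i_l0 = equilibrium(V_REF, E_IN, P_CPL)
derivs0 = boost_cpl_derivatives(i_l0, V_REF, duty0, E_IN, P_CPL, IND, CAP)
r_inc = cpl_incremental_resistance(V_REF, P_CPL)
```

At the computed operating point both derivatives vanish, while `r_inc` is negative: a small rise in bus voltage lowers the load current, so the load does not damp disturbances the way a resistor would.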

Quantum Deep Reinforcement Learning for FB Power Converter
The control objective of the FB DC-DC converter is to regulate the output voltage at the load bus of the DC MG to its nominal value. This work aims to stabilize the output voltage $v_o$ under a time-varying CPL using quantum deep learning. For this purpose, a feedback controller with a classic proportional-integral (PI) structure is adopted for the FB DC-DC system. The overall control structure of the FB DC-DC boost converter with the quantum process is depicted in Figure 2. According to Figure 2, quantum deep reinforcement learning is built from three main components: deep belief nets, reinforcement learning, and the quantum process. The deep neural network of QDRL is trained to adjust the gains of the feedback controller ($k_p$ and $k_i$) so that good transient and steady-state behavior is achieved.

Principle of RL
Each subordinate RL section of the quantum technique includes a Q-value update procedure, a p-value update procedure, an action procedure, and a quantum computation. In this structure, deep belief neural networks are adopted to predict the states $O_i$ used to update the Q-value. In the p-value process, the p-value is updated to obtain the index of actions. The action is chosen from the action space $A$. In the final step, the real action $\delta_i$ is generated by the quantum procedure.
Since the quantum deep RL includes a general RL architecture, it can improve the efficiency and feasibility of power electronic equipment. The RL algorithm consists of four components: states, actions of the RL agent, returns (rewards), and the environment. The RL agent evaluates the environment, decides in accordance with the information acquired, and then communicates that choice to the system (action). According to the defined optimal policy, the agent is trained to obtain more reward from the system. In RL, the state represents the current condition of the system as seen by the agent.
The control actions are delivered to the system by updating the matrices of the Q-value and the p-value [24], where $\zeta_{RL}$ is the learning factor, $\gamma_{RL}$ is the discount factor, and $\mu_{RL}$ is the update factor. These coefficients are assumed to be selected in the range [0, 1]. Likewise, $O$ denotes the current state, $\delta$ denotes the action, and $O'$ denotes the predicted next state. For the FB DC-DC converter, the reward function of QDRL is defined in terms of the constant factors $\beta_1$ and $\beta_2$ and the error $E_{error}$, which is the difference between the output voltage and its reference. According to the defined reward function, the QDRL algorithm generates the action signals so as to mitigate the voltage fluctuations caused by the CPL.
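To make the roles of the learning factor and the discount factor concrete, the sketch below applies one standard tabular Q-learning update. This is a generic illustration under stated assumptions, not the paper's update rule: the reward shape (a weighted sum of the absolute and squared voltage error), the state labels, the action set, and all numerical values are hypothetical, and the p-value update of [24] is not reproduced.

```python
from collections import defaultdict


def reward(v_o, v_ref, beta1=1.0, beta2=0.1):
    """Hypothetical reward that penalises the voltage error E_error = v_o - v_ref."""
    e_error = v_o - v_ref
    return -(beta1 * abs(e_error) + beta2 * e_error ** 2)


def q_update(q, state, action, r, next_state, actions, zeta=0.5, gamma=0.9):
    """Standard Q-learning step: Q <- Q + zeta * (r + gamma * max_a' Q(O', a') - Q)."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = r + gamma * best_next - q[(state, action)]
    q[(state, action)] += zeta * td_error
    return q[(state, action)]


# Hypothetical discrete states and gain increments used as actions.
q_table = defaultdict(float)
candidate_actions = (-0.01, 0.0, 0.01)

r = reward(v_o=108.0, v_ref=110.0)
new_q = q_update(q_table, "below_ref", 0.01, r, "near_ref", candidate_actions)
```

With an all-zero table, the updated entry is simply the learning factor times the reward, which shows how a larger voltage error pushes the value of the chosen action down.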

Deep Belief Nets (DBNs) Based on Restricted Boltzmann Machines
Restricted Boltzmann machines (RBMs) are a variant of Boltzmann machines that can be used to train deep belief nets (DBNs) layer by layer using a greedy technique. A standard RBM consists of binary-valued visible and hidden units and a weight matrix $W$ of size $m \times n$. In an RBM, the relationship between the hidden unit $h_j$ and the visible unit $v_i$ is represented by the weight component $w_{ij}$. Moreover, bias weights (offsets) $b_{v_i}$ and $b_{h_j}$ are used for $v_i$ and $h_j$, respectively. Based on the biases and weights, the energy $E(v, h; \theta)$ of the RBM of deep belief networks is defined as in [23,32], where $N_{Layer}$ and $N_{Hidden}$ represent the numbers of hidden layers and hidden units, respectively. From this energy, the marginal probability of each possible hidden-layer configuration is obtained, and, according to Gibbs sampling theory, the corresponding conditional probability distributions follow.
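The quantities named above can be illustrated with the standard binary RBM definitions. The sketch below is a minimal example assuming the usual textbook energy function and sigmoid conditional; the exact expressions of [23,32], including the dependence on $N_{Layer}$ and $N_{Hidden}$, are not reproduced, and the weights and unit values are made up for the example.

```python
import math


def rbm_energy(v, h, w, b_v, b_h):
    """Standard RBM energy: -sum_i b_v[i] v[i] - sum_j b_h[j] h[j] - sum_ij v[i] w[i][j] h[j]."""
    visible = sum(b * x for b, x in zip(b_v, v))
    hidden = sum(b * x for b, x in zip(b_h, h))
    coupling = sum(
        v[i] * w[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return -(visible + hidden + coupling)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def p_hidden_given_visible(v, w, b_h):
    """P(h_j = 1 | v) = sigmoid(b_h[j] + sum_i v[i] w[i][j]), as used in Gibbs sampling."""
    return [
        sigmoid(b_h[j] + sum(v[i] * w[i][j] for i in range(len(v))))
        for j in range(len(b_h))
    ]


# Tiny 2-visible, 2-hidden example with hypothetical parameters.
v_units = [1, 0]
h_units = [1, 1]
weights = [[0.5, -0.2], [0.1, 0.3]]
bias_v = [0.0, 0.0]
bias_h = [0.0, 0.0]

energy = rbm_energy(v_units, h_units, weights, bias_v, bias_h)
hidden_probs = p_hidden_given_visible(v_units, weights, bias_h)
```

Lower energy corresponds to a more probable joint configuration, and the conditional probabilities are what one step of Gibbs sampling draws the hidden units from.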

Quantum Computation
A quantum bit, also known as a qubit, is the fundamental data carrier in the quantum process. It can exist in a superposition of its eigenstates $|0\rangle$ and $|1\rangle$, which is defined by the following expression [23]:

$$|\psi\rangle = \gamma|0\rangle + \zeta|1\rangle$$

where $\gamma$ and $\zeta$ are complex constants satisfying $|\gamma|^2 + |\zeta|^2 = 1$. The $i$th action of DRL with the quantum process is generated from the terms $a^{(k+1)}$ and $a^{(k-1)}$, which are actions of set $A$, and the action $\delta_k$, which is chosen from set $A$ using the learning procedure of QDRL. Additionally, $q_i(\delta)$ denotes the output probability, with $0 \le q_i(\delta) \le 1$; it is obtained from $|C_a|^2$, the likelihood that action $|\delta\rangle$ occurs among the sequence of candidate actions, and $K_Q$, the quantum bit count.
A detailed illustration of the quantum process based on DRL is depicted in Figure 3.
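Because QDRL runs on a classical computer, the quantum step amounts to simulating amplitudes and measurement probabilities. The sketch below shows that idea in its simplest form: amplitudes are normalized, squared magnitudes give probabilities, and an action is sampled accordingly. The action values, the amplitudes, the two-qubit register size, and the helper names are assumptions made for illustration; the paper's own action-generation formula is not reproduced.

```python
import math
import random


def normalize(amplitudes):
    """Scale complex amplitudes so that their squared magnitudes sum to 1."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    return [a / norm for a in amplitudes]


def measurement_probabilities(amplitudes):
    """Probability of observing each basis state: |amplitude|^2."""
    return [abs(a) ** 2 for a in amplitudes]


def sample_action(actions, amplitudes, rng):
    """Pick one action with probability |amplitude|^2 (a simulated measurement)."""
    probs = measurement_probabilities(amplitudes)
    return rng.choices(actions, weights=probs, k=1)[0]


# Single qubit gamma|0> + zeta|1> with |gamma|^2 + |zeta|^2 = 1.
gamma, zeta = normalize([1 + 1j, 1 - 1j])
qubit_probs = measurement_probabilities([gamma, zeta])

# A register of 2 qubits indexes 2**2 = 4 candidate actions
# (hypothetical gain increments, not values from the paper).
gain_actions = [-0.02, -0.01, 0.01, 0.02]
register = normalize([1, 2, 2, 1])
chosen = sample_action(gain_actions, register, random.Random(0))
```

Raising the amplitude of an action that earned a high reward makes it more likely to be observed on the next simulated measurement, which is how exploration and exploitation are balanced in this style of algorithm.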

Experimental Results
To demonstrate the robust performance and quick transient behavior of the proposed QDRL scheme, the real-time examinations of the FB system were accomplished using the OPAL-RT platform. A photograph of the OPAL-RT setup for the FB power converter is provided in Figure 4. The parameters of the FB power converter are shown in Table 1.
The initial values of the proportional and integral gains of the feedback controller were set as $k_{p0} = 0.3$ and $k_{i0} = 40$. The actions of QDRL were generated to adjust the gains of the feedback controller as $k_p = k_{p0} + \delta_p$ and $k_i = k_{i0} + \delta_i$. Here, $\delta_p$ and $\delta_i$ are the actions of QDRL used to adjust the coefficients of the feedback controller. A sliding mode control (SMC) and a classic PI controller were also designed for the FB DC-DC boost converter for comparison purposes.
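The gain-adjustment rule above can be sketched as a discrete PI controller whose gains are offset by externally supplied actions. This is a minimal illustration, not the authors' implementation: the class name, the sampling period, and the clamping of the output to a duty cycle in [0, 1] are assumptions, while the initial gains are the values stated in the text.

```python
class AdaptivePI:
    """Discrete PI controller whose gains are shifted by learning-agent actions."""

    def __init__(self, kp0=0.3, ki0=40.0, dt=1e-4):
        self.kp0 = kp0      # initial proportional gain from the text
        self.ki0 = ki0      # initial integral gain from the text
        self.dt = dt        # sampling period in seconds (assumed value)
        self.integral = 0.0

    def step(self, v_ref, v_o, delta_p=0.0, delta_i=0.0):
        kp = self.kp0 + delta_p  # k_p = k_p0 + delta_p
        ki = self.ki0 + delta_i  # k_i = k_i0 + delta_i
        error = v_ref - v_o
        # Accumulate ki * error so a gain change does not cause an output jump.
        self.integral += ki * error * self.dt
        u = kp * error + self.integral
        return min(max(u, 0.0), 1.0)  # clamp to a valid duty cycle (assumption)


controller = AdaptivePI()
duty = controller.step(v_ref=110.0, v_o=109.0, delta_p=0.05, delta_i=2.0)
```

In a training loop, the agent would supply `delta_p` and `delta_i` at each step and receive a reward computed from the resulting voltage error.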

Scenario I: In the first step, a constant power load of 300 W was connected to the full-bridge converter. The real-time responses of the FB DC-DC converter in terms of capacitor voltage and output voltage (CPL's voltage) are provided in Figure 5a,b, respectively. The real-time responses of the FB converter revealed that, despite the CPL imposing a high level of instability on the system, the designed controllers stabilized the output voltage within the desired range around the reference value. Moreover, a lower level of current fluctuations appeared using the proposed QDRL compared to the other designed controllers (SMC and classic PI controllers), as shown in Figure 6. Therefore, the proposed controller (realized by QDRL) improved the system stability and enhanced the system performance in terms of overshoot and settling time.

Scenario II: In this step, a time-varying CPL was connected to the full-bridge converter. Figure 7 demonstrates the real-time outcome of the FB power converter, including the bus voltage and the CPL's voltage under changes in the CPL's power [16]. The inductor current signals for the various controllers are depicted in Figure 8. When the CPL's power was changed during the experiment, the QDRL effectively stabilized the system outputs. With the QDRL method, the settling time of the FB DC-DC power converter was significantly reduced in comparison with the other two controllers. In addition, the outputs of the FB DC-DC power converter under the proposed QDRL-based controller experienced less overshoot than under SMC and the classic PI controller.

For quantitative analysis of the FB power converter, various performance indices were considered to evaluate the behavior of the designed controllers. For this purpose, the integral absolute error (IAE), integral time absolute error (ITAE), mean absolute error (MAE), and root mean square error (RMSE) were adopted to demonstrate the superiority of the suggested QDRL technique. By definition of these criteria, values closer to zero indicate better performance. The values of the performance indices for the PI controller, SMC, and QDRL control technique are shown in Table 2. For example, the IAE for case 1 with the proposed QDRL-based controller (IAE = 0.4889) was lower than with the SMC scheme (IAE = 0.6415) and the classic PI controller (IAE = 0.8839). Additionally, for case 2, the IAE with the proposed QDRL-based controller (IAE = 0.6202) was smaller than with the SMC scheme (IAE = 0.8343) and the classic PI controller (IAE = 1.2599). According to these outcomes, the values of IAE, ITAE, MAE, and RMSE for case 2 were higher than for case 1, which indicates that the time-varying CPL imposes more of a destabilizing effect than the ideal CPL. In addition, the proposed scheme (realized by the QDRL algorithm) obtained the lowest values of the performance indices under both the ideal and the time-varying CPL compared with the SMC scheme and the classic PI controller.

Scenario III: In many cases, the voltage level of the DC source may change while supplying the loads, which may affect the overall performance of the power electronic interfaces. Thus, in the final stage, the CPL's power and the input source were changed simultaneously to examine the feasibility of the designed controller in the worst condition of the power converter system. For this purpose, the DC source voltage was reduced by 5% from its nominal value at t = 0.3 s (to 45.6 V) and increased by 5% above its nominal value at t = 0.7 s (to 50.4 V) (see Figure 9). The capacitor voltage waveforms of the FB DC-DC power converter with the time-varying CPL and the changes of the DC source under the classic PI controller, SMC, and QDRL algorithm are depicted in Figure 10.
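The four performance indices used in the quantitative comparison can be computed from a sampled voltage-error signal as sketched below. The function name, the rectangle-rule approximation of the integrals, and the sample data are assumptions for illustration; the values in Table 2 come from the authors' own HiL recordings and are not reproduced by this example.

```python
import math


def performance_indices(errors, dt):
    """Return IAE, ITAE, MAE, and RMSE for a uniformly sampled error signal.

    The integrals are approximated with the rectangle rule, where sample k
    is taken at time k * dt.
    """
    n = len(errors)
    iae = sum(abs(e) for e in errors) * dt
    itae = sum(k * dt * abs(e) for k, e in enumerate(errors)) * dt
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return {"IAE": iae, "ITAE": itae, "MAE": mae, "RMSE": rmse}


# Made-up error samples in volts, sampled every 0.1 s.
indices = performance_indices([1.0, -1.0, 0.5, 0.0], dt=0.1)
```

ITAE weights late errors more heavily than IAE, so it rewards short settling times, while RMSE penalizes large excursions such as overshoot.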

The Justification and Advantages of the Proposed Scheme
With the progress in the production of wide-bandgap (WBG) semiconductors, the performance of power interface systems has been remarkably enhanced at the device level. However, much of that potential is being lost since the system level (drivers, control algorithms, etc.) has not experienced matching advancement. This gap motivated the researchers to develop advanced control algorithms to exploit the maximum potential of semiconductors for improvement of the system performance. In particular, quantum computation can be adopted as a promising technique to control semiconductor devices with high-speed motor drivers, which was addressed in this paper.
The advantages of the QDRL technique for the power electronic case study are as follows:
(i) In comparison with model-based schemes (MPC, backstepping, SMC, etc.), which need model identification, a model-free QDRL learning scheme was developed to regulate the coefficients of the feedback controller.
(ii) Since the QDRL-based controller was developed in a model-free framework, the proposed QDRL scheme can be applied to a wide range of power electronic test systems.
(iii) In comparison to conventional controllers, which only have optimal performance at the operating condition, the proposed controller was adaptively adjusted by QDRL, which ensured the high efficiency of the FB DC-DC boost converter for all changes to the CPLs.
(iv) While ideal CPLs were often considered in previous works, in this study, a time-varying CPL was applied to evaluate the flexibility and effectiveness of the suggested QDRL-based controller.

Conclusions
In this paper, an adaptive controller based on quantum computing was designed to suppress the effect of constant power loads in full-bridge converters in a microgrid setting. By employing the training capability of the quantum deep reinforcement learning (QDRL) technique, the control parameters embedded in the feedback controller were adjusted appropriately, resulting in a robust controller with a quick response. By using the voltage error in the reward function, the QDRL algorithm was trained to stabilize the output voltage of the FB DC-DC boost power converter feeding a constant power load. To verify the efficiency of the suggested technique (realized based on the QDRL algorithm), real-time examinations with the OPAL-RT platform were conducted under two typical scenarios of microgrids. It was validated that, despite the CPL being connected to the DC bus on the load side, the power electronic interface system operated in the ideal condition from a systematic point of view. In addition, a proportional-integral controller and a sliding mode controller were also designed and applied to the FB power converter for comparison purposes. The HiL outcomes of the feedback controller designed with the QDRL technique showed a higher level of dynamic performance than other state-of-the-art techniques. In future work, a prototype of the FB power electronic system should be built to assess the feasibility of the proposed quantum-based controller from an experimental point of view. Additionally, the quantum principle can be adopted as a promising option to design model-predictive control for the next generation of power electronic systems.