Fault Management Techniques to Enhance the Reliability of Power Electronic Converters: An Overview

The reliability of power electronic converters is a major concern in industrial applications because they rely on failure-prone elements such as high-power semiconductor devices and electrolytic capacitors. Hence, designing fault-tolerant inverters has been of great interest among researchers in both academia and industry over the last decade. Among the three stages of fault management, compensating for the fault is the most important and challenging part. The techniques for fault compensation can be classified into three groups: hardware redundancy methods, which use extra switches, legs, or modules to replace the faulty parts directly or indirectly; switching states redundancy methods, which omit and replace the unavailable switching states; and unbalance compensation, which includes the techniques that compensate for the unbalances caused in the system by a fault. In this paper, an overview of fault-tolerant inverters is presented. A classification of fault-tolerant inverters is given, and the major cases in each of its categories are explained.


I. INTRODUCTION
In recent decades, power electronics have been increasingly used in modern power systems. Power electronic converters are the main energy conversion system in a wide range of applications such as renewable energy systems, energy storage systems, smart and microgrid technologies, dc transmission and distribution systems, electric motor drives, and power supplies [1], [2], [3], [4]. The widespread use of power electronic converters in various industries has made their reliable performance a top priority [5].
The associate editor coordinating the review of this manuscript and approving it for publication was Bo Pu.
Reliability is defined as the ability of an item to perform a required function under stated conditions for a certain period [6]. It is often measured by the probability of failure, the frequency of failure, or in terms of availability. The essence of reliability engineering is to prevent failures and faults from arising. A fault in a power electronic system may not only cause an unscheduled interruption, which is not tolerated, but may even lead to a disastrous accident [7], [8]. These unplanned interruptions may also cause significant safety concerns and an increase in system operation costs [9]. It is therefore clear that the push toward ever-more reliable power electronic products is critical for all industries, including power networks in rural areas, critical medical equipment power supplies, aircraft and naval power systems, satellite systems where maintenance is unfeasible, and wind and solar farms with extensive and widely distributed parts [10]. Many manufacturers have become increasingly aware of the protection efficiency and maintenance costs of power electronic devices [4]. Hence, the reliability of power electronics is recognized as one of the top research topics, the importance of which is growing rapidly [11].
Inverters play an important role in the reliability of electrical systems such as renewable energy systems, motor drives, and electric vehicles. Industrial experience shows that converters are frequent failure sources in many applications such as wind and PV systems [12]. As an example, as shown in Fig. 1, inverters are responsible for about 37 percent of unscheduled maintenance events in PV systems. Therefore, the reliability of inverters is a major concern in industrial applications, especially due to the use of a large number of high-power semiconductor devices with high power densities and high failure rates.
Although many efforts have been made to make two-level inverters reliable, fault tolerance is a more significant challenge in multilevel inverters, as the possibility of failure is higher for these converters due to the higher number of switching devices [13], [14], [15]. Therefore, the study of fault management operation is mainly focused on multilevel converters.
As shown in Fig. 2, failures in converters can generally be classified into catastrophic faults due to single-event overstress and wear-out failures due to the long-term degradation of components. Fig. 3 demonstrates the general guideline for the reliability of power electronic converters. This figure divides the discussion of the reliability of power converters into two aspects: fault management and lifetime management. In research, there is usually a distinction between these two reliability areas [16]. Fault management is responsible for managing the catastrophic faults in converters, such as Short-Circuit (SC) and Open-Circuit (OC) faults that can cause destructive damage [17]. The other aspect of reliability is lifetime management, which is mainly concerned with the wear-out of the components and devices of the system. It consists of three major subcategories: lifetime analysis, lifetime prediction, and lifetime extension. The focus of this survey is only on the fault management aspect of reliability, especially fault compensation; a survey on lifetime management of power electronic converters is given in [16].
Power semiconductors and capacitors are the most vulnerable power electronic components [18]; therefore, studies on the reliability of power converters mostly focus on these failure-prone components. Capacitors are sensitive to thermal and electrical stresses and have the main disadvantages of a short lifespan and a high degradation failure rate [19]. It is demonstrated in Fig. 4 that about 18% of the faults in converters are caused by the degradation of capacitors. Electrolytic capacitors, which are mostly used as dc-link capacitors, have the shortest lifetime among all capacitors. These capacitors are among the major failure factors in PV inverters. Many efforts have been made to improve the reliability of power electronic converters by minimizing the dc-link capacitance so that small capacitors with a long lifetime can be used to replace electrolytic capacitors. Hence, some research has been devoted to reducing the size of capacitors in inverters as well as replacing the electrolytic capacitors with non-electrolytic capacitors. However, these approaches require extra components and increase the complexity of the switching patterns. In three-phase applications, a lower amount of capacitance can be used where the power pulsation is lower [20]. In fact, work on capacitor faults mostly focuses on fault prevention, and fault management techniques for capacitors are not discussed in the literature. For switches, however, there are many methods that can ameliorate the converter's fault conditions.
As mentioned earlier, the faults in power devices can generally be divided into two cases: SC faults and OC faults. SC faults affecting the switches are the most serious faults [21]. An SC fault produces an abnormal overcurrent, causing serious damage to other parts within a very short period of time [22]. Therefore, SC fault-tolerant control strategies rely heavily on hardware [23]. This fault is caused primarily by continuous gate pulses, overvoltage, an internal fault caused by overheating, and freewheeling diode failure caused by high reverse recovery voltages [24]. Fast fuse devices connected in series with power devices can convert short-circuit faults to open-circuit faults whenever the fusible element opens [25].
A variety of mechanisms can cause OC faults, including bond-wire lift-off, gate driver failure, or internal connection rupture caused by thermal or mechanical shocks [26], [27]. The converter operates at low power quality after an OC fault, which places additional stresses on its circuit components and leads to secondary problems [28].
When a fault occurs, the fault management operation is activated, which consists of fault diagnosis and fault compensation.
This work focuses on methods to compensate for the faults, and it classifies the techniques of fault compensation in inverters. First, fault isolation techniques are explained, which are categorized into hardware-based and modulation-based methods. Then the redundancy methods, as the popular approach to reconfiguring the inverter after a fault, are discussed in detail. Finally, the control techniques to regain the converter's pre-fault performance are described using figures.

II. FAULT DIAGNOSIS
Fault diagnosis or fault detection is the first step once a fault occurs. Fault diagnostic techniques for inverters can be divided into model-based and data-driven methods. The model-based methods are based on the analytical model of the converter [29]. They usually need to consider the dynamic properties and operation mechanism of the system, and then establish an accurate mathematical model [30]. The data-driven fault diagnosis methods do not need to know the exact analytical model of the system. They directly analyze and process the measured data [31]. These techniques include signal processing methods, statistical analysis, and artificial intelligence. There is also a hybrid method that uses a combination of these two methods.
It is worth noting that the fault detection method is not the main concern of this paper. In [32], fault diagnosis techniques for Modular Multilevel Converters (MMCs) are reviewed. References [33] and [34] have evaluated different fault diagnosis approaches for IGBTs. References [35] and [36] present a comprehensive survey of fault diagnosis techniques.

III. FAULT ISOLATION
Fault isolation is the first step in tackling a fault in a system. When a fault occurs in an inverter, some switching states may be unavailable due to the SC or OC of the faulty switches. These switching states should be avoided, or the faulty components themselves should be isolated, so that the system continues to function and damage to the whole system is prevented. These schemes are performed by adding some extra elements such as fuses and TRIACs, and their goal is to isolate the faulty switch(es). Fault isolation usually results in the degradation of the system's performance, especially in the output voltage and THD. Therefore, there have to be solutions to compensate for the effects of the fault, which are discussed in Section IV.
In Fig. 5(a) [37], one phase of a three-phase two-level inverter is demonstrated. If the switch S a2 fails SC, the switch S a1 should first be turned off temporarily. Next, the TRIAC T a is turned on to create a shoot-through in the bottom dc bus and blow the fuse F a2. S a2 is then removed from the circuit. The limitations of this approach are the required access to the midpoint of the dc-link and the increased parasitic inductance because of the fuses.
In Fig. 5(b) [38], [39], when S a2 fails OC, S a2 and T a are turned on to create an SC across the top capacitor, which blows the fuse F a. The inverter leg which corresponds to the phase ''a'' is now isolated. On the other hand, if S a2 fails short, turning off S a1 is the first step. Then the TRIAC T a is turned on, which causes a shoot-through in the bottom dc bus capacitor and blows F a, isolating the whole leg ''a''.
In Fig. 5(c) [40], when the switch S a2 fails either SC or OC, S a1 is turned off and T a1 is turned on. This creates a shoot-through that blows the fuse F a2, which removes the switch S a2 from the circuit. The drawbacks of this technique are the high number of components and the increased parasitic inductance because of the fuses. Also, relatively large capacitors are needed to decrease the isolation time.
In another approach which is shown in Fig. 5(d), after isolating the faulty leg, the neutral point of the three-phase motor is forced to connect to the dc-link midpoint by turning on the TRIAC T a .
One leg of a three-phase Neutral Point Clamped (NPC) inverter with an isolation circuit is shown in Fig. 5(e) [41], [42]. If S a1 fails short, then the top dc-bus capacitor will experience an SC through D a2 and F a2 during the zero switching state in which S a2 and S a3 are turned on. To avoid this switching state, T a2 is turned on to blow the fuse F a2. The inverter is turned into a two-level inverter where the output voltage is the same as before, but its THD worsens.
In the modified NPC inverter in [43], [44], and [45] (Fig. 5(f)), the faulty phase is forced to connect to the dc-link midpoint via an additional TRIAC. After faults, the reconfigured system is similar to the structure where only four switches are used to drive a three-phase machine. Although the inverter is still capable of providing the full rated current, the maximum balanced line-to-line output voltage in post-fault operation is reduced to half of its nominal value. The limitations of this approach are the required access to the midpoint of the dc-link and the oversized dc-bus capacitors.
The ANPC converter shown in Fig. 5(g) can be operated as a three-level leg after a single-switch SC fault [41], [42]. For example, if S a1 fails short, thyristor T a2 is turned on to blow fuse F 2. However, unlike in the NPC, the zero state can still be obtained by turning on switches S a2 and S a5. The other switching states remain unchanged. After the fault, the output voltage experiences no change in value; however, the voltage stress on the healthy devices equals the dc-bus voltage.
The cascaded H-bridge (CHB) inverter shown in Fig. 6 is one of the popular converter topologies used in high-power medium-voltage motor drives to achieve medium voltage with low harmonic distortion [46]. The wide adoption of Cascaded Multilevel Converters (CMC) and Modular Multilevel Converters (MMC) in the high-voltage direct current (HVDC) industry is mainly due to their modularity, scalability, and inherent fault tolerance [47], [48], [49]. The CHB is composed of a number of modular H-bridge power cells and isolated dc voltage power sources, which can be obtained from the phase-shifting transformer and diode rectifiers [50].
The CMC topology (Fig. 7) has inherent module-level redundancy [47], [51]. If a module, e.g., A 1, experiences a failure, it is bypassed by the TRIAC T A1, and the corresponding healthy modules of the two other phases, B 1 and C 1, can be bypassed to keep the voltages balanced [50]. However, as demonstrated in Fig. 7, the symmetrical output voltages are achieved with a 33% amplitude reduction, which limits the operation range under fault.
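The amplitude reduction quoted above can be reproduced with a short calculation. The following is only a sketch: it assumes three identical modules per phase with equal dc sources (the module count is an assumption consistent with the 33% figure, not a value stated in the text), and the function name is hypothetical.

```python
def remaining_amplitude_fraction(modules_per_phase: int, bypassed_per_phase: int) -> float:
    """Fraction of the pre-fault phase-voltage amplitude that is still available.

    Assumes identical modules with equal dc sources, so the maximum phase
    voltage scales linearly with the number of modules left in the string.
    """
    if not 0 <= bypassed_per_phase <= modules_per_phase:
        raise ValueError("bypassed modules must be between 0 and modules_per_phase")
    return (modules_per_phase - bypassed_per_phase) / modules_per_phase


# Hypothetical case: three modules per phase, one module bypassed in every phase
# (the faulty module A1 plus the healthy modules B1 and C1).
fraction = remaining_amplitude_fraction(modules_per_phase=3, bypassed_per_phase=1)
reduction_percent = 100.0 * (1.0 - fraction)
print(f"remaining amplitude: {fraction:.3f}, reduction: {reduction_percent:.1f}%")
```

Bypassing the same number of modules in every phase keeps the three phase voltages equal in amplitude, which is why the healthy modules B 1 and C 1 are sacrificed in this scheme.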
In [52], the suggested structure (Fig. 8) has four relays in each module. These relays are connected so that in the event of an SC or OC failure, the defective module can be eliminated and isolated from the whole system, ensuring the normal operation of the inverter with two healthy modules. If a second fault occurs in the two remaining modules, the faulty module will be eliminated using the related relays, and the number of output voltage levels will be decreased from 7 to 3 so that the remaining module can continue to operate. The main drawbacks of this fault-tolerant scheme are the high voltage stress on the remaining healthy switches and the decreased number of output voltage levels. Therefore, the switching algorithm is modified to allow the inverter either to continue working in nominal condition, if possible, or to work in a derated operational mode [53].
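How the number of output levels depends on the modules left in the string can be illustrated by enumerating the H-bridge states. This is a minimal sketch, assuming that each healthy H-bridge cell contributes -V, 0, or +V of its own dc source; the per-unit source values below are hypothetical, and the actual source ratios and control of the structure in [52] may differ.

```python
from itertools import product


def chb_output_levels(dc_sources):
    """Return the sorted distinct output levels of one cascaded H-bridge phase.

    Each cell can insert its dc source with positive polarity, negative
    polarity, or bypass it, so the phase voltage is a signed sum of sources.
    """
    levels = set()
    for states in product((-1, 0, 1), repeat=len(dc_sources)):
        levels.add(sum(s * v for s, v in zip(states, dc_sources)))
    return sorted(levels)


# Hypothetical per-unit dc sources, all equal (symmetric CHB).
three_cells = chb_output_levels([1, 1, 1])
two_cells = chb_output_levels([1, 1])
one_cell = chb_output_levels([1])

for name, levels in (("3 cells", three_cells), ("2 cells", two_cells), ("1 cell", one_cell)):
    print(f"{name}: {len(levels)} levels {levels}")
```

With equal sources the count follows 2n + 1 for n healthy cells, so a single remaining cell produces the three-level waveform mentioned above.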

IV. FAULT COMPENSATION
After fault isolation, fault compensation schemes are required to guarantee the operation of the faulty inverter as close as possible to normal operation. In this paper, as shown in Fig. 9, fault compensation techniques are classified into three groups: hardware redundancy, switching states redundancy, and unbalance compensation control. In order to choose a suitable fault-tolerant method, output performance consisting of factors including the total harmonic distortion (THD) of output voltages or currents, system efficiency, and dynamic response should be considered. Cost is another important factor in comparing different techniques of fault compensation.
Redundancy means that when a feature of a system is down, it can be replaced by another feature that is already included in the system [54]. In the case of inverters, redundancy can be either switching state redundancy, which includes alternative current paths to obtain the same voltage level, or hardware redundancy, which includes extra switches, legs, and modules.

A. HARDWARE REDUNDANCY
Redundant hardware techniques involve adding some redundant hardware to the original system. The addition of hardware increases the cost of the system, but it provides advantages in post-fault operations, especially in applications where cost is not a major concern [8].
A simple solution for switch redundancy is shown in Fig. 10(a) which handles SC of the switches S1 and S2 [55], but it suffers from voltage sharing problems and doubled conduction losses in healthy conditions.
The topology in Fig. 10(b) can resolve both OC and SC faults by using TRIACs [56]. It does not have the problems of the previous topology; however, its cost is higher due to the number of components.
A fault-tolerant switch-redundant flying capacitor leg is introduced in [57] and demonstrated in Fig. 10(c). When one of the switches fails, its complementary switch is turned on. The redundant cell, which is composed of R 1, R 2, and C R, replaces the faulty one. The FC leg continues to provide the same output voltage. During normal mode, the additional cell is in a permanent on-state [57].
In the switch-redundant topology in Fig. 11, if one of the upper switches fails OC or SC, it can be replaced by the
The topology presented in [59] proposes a fault-tolerant five-level inverter for PV applications consisting of a two-level half-bridge inverter, a three-level diode-clamped inverter, and a bidirectional switch made with four diodes. As a result of a switch fault or dc-source fault, the topology operates as a three-level inverter, resulting in half the output voltage. Two additional switches and a center-tapped transformer are suggested by [59] in order to maintain the output voltage of the inverter at the same value as before the occurrence of the fault.
Some hardware redundant topologies use the redundancy of a whole leg to make the leg replaceable when a fault occurs. The redundant leg can be connected in parallel or in series.
The converter in [60] is based on a back-to-back converter and uses S 7 and S 8 as its redundant switches. One leg of this topology is shown in Fig. 12(a). If one of the switches, e.g., S 1, fails OC, the TRIAC T 1 is turned on, which replaces the faulty leg with the redundant leg containing S 7 and S 8. In the SC case, the faulty leg is isolated by very fast-acting fuses; consequently, the SC fault becomes an OC fault after the isolation of the faulty leg by the two fuses.
The topology in Fig. 12(b) does not use fuses or TRIACs. Turning on the relay removes the faulty leg and connects the redundant leg. However, it cannot handle two SC switches in one leg.
In [32], as shown in Fig. 13, switches S 2 and S 3 act as redundant switches. In normal operation, during the positive half cycle of the current i O, the TRIACs S 5 and S 6 are continuously on. The powering mode is obtained by turning on S 1 and S 4, and the freewheeling mode is obtained when either S 1 and D 2 or D 3 and S 4 conduct. During the negative half cycle of i O, the TRIACs S 7 and S 8 are on instead of S 5 and S 6. In this configuration, the switches S 1 and S 4 and the diodes D 2 and D 3 are utilized in the same way as in the positive half cycle.
When S 1 fails short, S 4 is turned on to obtain powering mode, and S 4 is turned off to obtain freewheeling mode. In the event of an OC fault in S 1 or S 4, the converter continues to function using S 2 and S 3. In this case, the TRIACs S 5 and S 6 are on during the negative half cycle, while S 7 and S 8 are on during the positive half cycle. During normal operation, S 2 and S 3 use the same switching strategy as S 1 and S 4. When S 7 experiences an OC fault, S 8 is turned off permanently, and S 5 and S 6 are permanently turned on. As a result, the converter is permanently reconfigured to the conventional VSI configuration. As soon as an SC fault occurs in S 7, S 8 is turned on and S 5 and S 6 are permanently turned off, and the circuit is permanently changed to the conventional VSI configuration.
In [62], a fourth leg is added to the conventional NPC inverter and is connected to the neutral point of the converter through an inductance. Apart from its duty as a redundant leg in post-fault operation, this fourth leg also works during normal operation to balance the neutral-point voltage by injecting the locally averaged current into the neutral point. Relays are added to reconfigure the converter as soon as a fault is detected in any of the switches. Because the relays are in series with the inductance, their parasitic inductance is negligible.
From an operational perspective, the first solution in Fig. 14(a) is the simplest. Converter reconfiguration does not require changing modulation indexes or blowing fuses. Nevertheless, the semiconductors must be able to withstand the total dc voltage. As a result, the converter is considerably more expensive, and its use is severely limited. The second solution in Fig. 14(b) can be useful in some applications such as controlling an induction motor. It does not require switches that can withstand the total dc voltage, and its price is the lowest. The third solution in Fig. 14(c), like the previous one, does not require switches that can withstand the total dc voltage, and it is not necessary to reduce the modulation index during the reconfiguration process. This solution can be a good option for grid-connected applications.
In the topology in [63], a redundant leg is added to a single-phase five-level NPC (Fig. 15(a)). If the switches S a2 and S b2 fail OC, in an NPC without the redundant leg, the connections to points P and O are not available for the NPC legs. Hence, the redundant leg R compensates for this fault by using S R1 with S R2 or S R4 to connect the input point P to legs A and B, respectively. It also turns on the switch S R6 with S R2 or S R4 to connect point O to legs A and B, respectively. Therefore, the five-level output voltage can be preserved. If the switches S A1 and S B1 fail SC, the NPC legs cannot provide connections to points P and O, and the fuses F 1 and F 4 must be blown. The switch combinations (S R1 and S R2) and (S R1 and S R4) are used to connect legs A and B to the point P, respectively. Also, the switch combinations (S R2 and S R6) and (S R4 and S R6) can connect the neutral point O to legs A and B, respectively. Some merits of this inverter are tolerating all types and locations of faults with full output ratings, a reduced component count, preserving high efficiency in post-fault operation, and avoiding the use of bidirectional switches.
In the topology in Fig. 15(b) proposed in [55], the main inverter comprises a conventional three-level NPC leg and a conventional three-level FC leg along with a redundant bridge at the output terminal as the redundant leg.
When an OC failure occurs on the switch S 3, the inverter loses its fourth voltage level. In the negative half of the fundamental cycle, this reduces the load current, which in turn reduces the current through the FC and leads to the loss of inherent capacitor voltage balancing. Switching R 2 from the redundant bridge generates the fourth level and preserves the output power of the inverter. When OC occurs, switches S 3, S 5, S 8, R 2, R 3, and R 4 are activated with appropriate pulses, resulting in the generation of a three-level output voltage waveform. Reference [64] classifies the SC faults of switches according to the part of the inverter they short-circuit: SC of the input voltage source and SC of the capacitor. In the first case, because of the SC across the input voltage source, a fuse is blown to turn the SC fault into an OC fault, which has already been discussed. In the second case, the SC of the capacitor, the inverter operates at the same power level as before the fault while generating a three-level voltage waveform at the output.
The nine-level leg-redundant topology proposed in [65] consists of two three-level flying capacitor legs that are connected by two controlled bidirectional switches (Fig. 15(c)). It can tolerate OC and SC faults in single and multiple switches and maintains the output power and voltage levels in post-fault operation. The switching scheme in this topology keeps the capacitor voltage balanced under pre- and post-fault operation.
There are also module-level redundancy approaches. For CMCs and MMCs, redundant modules are added in series with the basic topology as shown in Fig. 16. Normally, the redundant modules are inactive. Whenever a module experiences a fault, it is isolated, and the redundant module replaces the faulty module to restore normal operation [66].
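The benefit of a spare module can be quantified with a standard k-out-of-n reliability calculation. The sketch below is a simplification and not the assessment method of any cited reference: it assumes independent, identical modules, ideal fault detection and switchover, and a spare whose reliability equals that of an active module. The module count and the per-module reliability of 0.95 are hypothetical numbers chosen only for illustration.

```python
from math import comb


def k_out_of_n_reliability(n_total: int, k_required: int, module_reliability: float) -> float:
    """Probability that at least k_required of n_total modules are healthy.

    Uses the binomial distribution, which assumes independent and identical
    modules with the same survival probability over the mission time.
    """
    r = module_reliability
    return sum(
        comb(n_total, i) * r**i * (1.0 - r) ** (n_total - i)
        for i in range(k_required, n_total + 1)
    )


# Hypothetical string that needs 4 working modules to deliver rated voltage.
no_spare = k_out_of_n_reliability(n_total=4, k_required=4, module_reliability=0.95)
one_spare = k_out_of_n_reliability(n_total=5, k_required=4, module_reliability=0.95)

print(f"without spare: {no_spare:.4f}")
print(f"with one spare: {one_spare:.4f}")
```

Under these assumptions a single redundant module raises the string reliability noticeably, at the price of one extra module per string.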
Reference [67] investigates the effectiveness of using redundant cells by the means of reliability assessment and
In [69], a fault-tolerant CHB is proposed which uses an extra H-bridge module, as shown in Fig. 17. The redundant module operates only after a fault happens. If one of the top switches in any of the H-bridge cells, e.g., switch S1, fails, the top-side relay shown in red in the figure starts functioning. Whenever this relay is triggered, the normally open contacts of the inverter are closed and the normally closed contacts are opened, removing the faulted component from the inverter, and the redundant module joins the circuit. When a fault occurs in the bottom switches, the bottom relays shown in blue act, and the rest of the action is the same as in the previous case. If a switch from the top side and one from the bottom side fail together, all the existing dc sources connect to form a set of series-connected dc sources in parallel with the redundant module. However, if a second fault happens, all dc sources connect in series, which results in a simple three-level Cascaded Half Bridge (CHB). When one fault occurs, the shape of the output voltage remains unchanged, but when the second fault occurs, the number of output voltage levels reduces to three.
The fault-tolerant inverters using the system-level redundancy are cascaded inverters and parallel inverters. In the cascaded structure in Fig. 18(a), two inverters are connected in series. Although in this approach, several faults including single-switch SC, single-switch OC, and phase-leg OC can be handled, the power rating is reduced after the fault [70].
In the parallel structure in Fig. 18(b), if one inverter fails, the other inverter can replace it so that the system can operate continuously. However, the reduction of circulating currents between converters is an important problem to deal with when the dual converters are operating simultaneously in the normal mode [71].
In the system-level redundant topology in [72], which is shown in Fig. 19, all components including diodes and capacitors are replaceable by turning on the relay R 1 and turning off R 2. When one of the switches S 1 to S 4 fails SC or OC, switch S 4 is turned off. Then, with the assistance of the second diodes D 3 and D 4, the remaining healthy switches S 1 and S 2 are combined with the redundant switches S 5 and S 6 to reconstruct the inverter with the rated output voltage or power. When a diode fails OC, the faulty diode is automatically isolated, and when it fails SC, the fuse in series with it melts, thus bypassing it. When C 1 experiences an OC, the faulty capacitor is also isolated automatically, and if it experiences an SC, the secondary windings of the transformer T 1 are shorted, melting the fuse F 1. In the case of failures in just capacitors and diodes, since all switches S 1 to S 4 are healthy in this mode, there are four possible modulation strategies. For instance, in a possible switching set, S 3 could be constantly on, S 4 could be always off, and S 1 and S 2 could be complementary.

B. SWITCHING STATES REDUNDANCY
Switching states redundancy includes avoiding the unavailable switching states and minimizing the impact of the fault by a proper switching sequence. The Space Vector Modulation (SVM) approach is typically used to avoid the states involving the failed device. Here, the SVM is represented in an α-β frame, which is the conventional technique. The SVM can also be represented in a g-h coordinate system and a K-L coordinate system.
In most cases, avoiding the unavailable switching states may not be enough to obtain the desired performance of a faulted converter. By adding extra components, using redundant switches, and altering the control strategy, fault-tolerant performance can be achieved, which will be discussed in sections III and IV.
In the three-phase NPC inverter of which the leg ''a'' and its corresponding phase is shown in Fig. 20 Fig. 21. Since these states fall on the output perimeter of the hexagon, the maximum modulation index is reduced [43], [45]. A similar approach can be implemented for SC or OC of the other switches and also the diodes. With this approach, the fault is cleared, however, the modification of the PWM strategy to avoid unavailable states leads to dc-bus mid-point imbalance, spurious fault detection, and overrating of device voltage to full dc-bus voltage [45].
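The loss of perimeter vectors described above can be illustrated by enumerating the switching states of a three-level inverter in the α-β frame. The sketch below rests on an assumption that is not taken from the cited works: the fault is modeled simply as phase ''a'' being unable to connect to the positive rail P. The state names P, O, N and the helper functions are illustrative only.

```python
from itertools import product
from math import sqrt

# Pole voltage of each state in units of Vdc/2 (P: positive rail, O: midpoint, N: negative rail).
LEVELS = {"P": 1, "O": 0, "N": -1}


def space_vector(state):
    """Amplitude-invariant alpha-beta components of a state such as 'PON'."""
    a, b, c = (LEVELS[s] for s in state)
    return (2 * a - b - c) / 3, (b - c) / sqrt(3)


def vector_key(state):
    """Integer pair that identifies the space vector without rounding issues."""
    a, b, c = (LEVELS[s] for s in state)
    return 2 * a - b - c, b - c


all_states = ["".join(s) for s in product("PON", repeat=3)]
healthy_vectors = {vector_key(s) for s in all_states}

# Assumed fault model: phase 'a' can no longer reach the positive rail.
available_states = [s for s in all_states if s[0] != "P"]
post_fault_vectors = {vector_key(s) for s in available_states}

# States whose vector cannot be reproduced by any remaining state.
lost_states = sorted(s for s in all_states if vector_key(s) not in post_fault_vectors)

print(f"healthy: {len(all_states)} states, {len(healthy_vectors)} distinct vectors")
print(f"post-fault: {len(available_states)} states, {len(post_fault_vectors)} distinct vectors")
for s in lost_states:
    alpha, beta = space_vector(s)
    print(f"lost {s}: alpha={alpha:+.3f}, beta={beta:+.3f}")
```

In this model the small vectors and the zero vector survive because each has a redundant state that avoids P in phase ''a'', whereas the lost vectors are large and medium vectors on the outer hexagon, which is consistent with the reduced maximum modulation index.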
In another approach, the purpose is to change the modulation at the time of the fault in order to make the system survive the impact. Active Neutral Point Clamped (ANPC) converter, which is obtained by replacing diodes in NPC with switches, is widely used in high-power medium-voltage applications including distributed generation such as photovoltaic systems, motor control in traction systems, and industrial motor drives [73]. In this converter as shown in Fig. 20(b) [74], [75], if an OC fault occurs in the switch S a2 , the switches S a3 and S a6 can be turned on to connect the phase voltage to the dc-bus mid-point which minimizes the impact of the fault by reviving the three-phase system.
In this approach, in case of failure, redundancy in the switching states of the inverter enables the controller to choose an alternate conduction path to retain the same output voltage [76], [77].
In the flying capacitor inverter (FC) as shown in Fig. 22, in the normal mode, the voltage level can be provided by turning on switches S 1, S 3, and S 4 (the current flows through the capacitor C 2 and diode D 2). If, for example, the switch S 3 fails open while i L > 0, the same voltage level can be obtained by turning on the switches S 4 and S 1 (the current flows through the capacitor C 3 and diodes D 2 and D 3). On the other hand, in the healthy condition, turning on the switch S 2 (the current flows through capacitors C 1 and C 2 and the diodes D 1, D 3, and D 4) produces the voltage level. If S 3 fails short while i L < 0, the same output voltage is obtained when current flows through diodes D 1 to D 4. Therefore, the FC inverter benefits from the switching state redundancy, which makes it retain its output voltage level after an OC or SC fault occurs.
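The amount of switching state redundancy in a flying capacitor leg can be counted with a simplified model. This sketch assumes a leg with four cells whose flying capacitors are balanced, so that the output level equals the number of cells whose upper switch is on; it ignores the current direction and the diode conduction paths that the description above relies on, and the cell indexing is hypothetical rather than taken from Fig. 22.

```python
from itertools import product


def fc_states_by_level(cells: int):
    """Group the upper-switch states of a flying capacitor leg by output level.

    With balanced flying capacitors, each cell whose upper switch is on adds
    one step of Vdc/cells, so the level is the number of ones in the state.
    """
    by_level = {}
    for state in product((0, 1), repeat=cells):
        by_level.setdefault(sum(state), []).append(state)
    return by_level


states = fc_states_by_level(cells=4)
redundancy = {level: len(options) for level, options in states.items()}
print("states per level:", redundancy)

# Hypothetical fault: the upper switch of cell 3 (index 2) must stay off.
alternatives = [s for s in states[2] if s[2] == 0]
print("level-2 states that avoid cell 3:", alternatives)
```

The intermediate levels have several equivalent states, so a controller can pick one that avoids a failed device; only the extreme levels have a single state in this model.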
The four-level MAC converter demonstrated in Fig. 12(a) can always continue operating under a single-device SC or OC fault, maintaining at least three of the four levels by using redundant switching states. When an SC fault occurs, some switching states will be unavailable. However, they can be replaced by other switching states. For example, to produce level 1, there is sometimes more than one option to replace the unavailable switching state. Therefore, there are two sets of modulation strategies in the case of SC. One prioritizes the number of levels, and the other one prioritizes the blocking voltage. The produced voltage levels and the switches with overvoltages are demonstrated in Table 1 for both priorities. The possible voltage levels are colored blue. The ones that are achievable by the new switching states are colored with a lighter blue. In the case of two failed switches, the situation is similar to the previous case, with the difference that the number of voltage levels will be two, three, or even four.
The topology in Fig. 23(b) is created by adding two switches to a MAC inverter [78]. This inverter provides multiple conduction paths in each phase, so intra-phase redundancy can be achieved for certain switching states.
However, the post-fault control scheme can be complex, and the redundancy of this topology is limited to certain output levels or semiconductor devices. For example, if S11 fails, there is no alternative path and level 1 is lost. Furthermore, healthy devices may be subjected to increased blocking voltages under certain failure scenarios.

V. UNBALANCE COMPENSATION CONTROL
Although the most important function of an inverter when a fault occurs is to continue operating as close to normal as possible, other features should also be considered [79]. Imbalance control techniques alter the control strategy to correct the imbalances created by the fault and to reach an optimum operating point with respect to the voltage, THD, or any other objective. With these techniques, fault-tolerant control can be implemented without changing the inverter's topology, which saves hardware costs and keeps the topology simple [80]. Four control-based fault compensation techniques are discussed in this section.

A. PHASE SHIFT
In the topologies that use the dc-link midpoint connection, as discussed in section II, when a fault occurs, e.g., on a switch in leg "a", the corresponding phase "a" is connected to the midpoint of the dc-link by turning on a TRIAC. The reduced system behaves like a four-switch inverter. To create balanced line-to-line output voltages, the phase angles of the healthy phase voltages (b and c) should each be shifted by 30 degrees, as demonstrated in Fig. 24 [81].
In the case that the neutral point of the three-phase motor is connected to the dc-bus midpoint, the phase angle of the healthy phase currents (b and c) should be shifted by 30 degrees [82].
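The 30-degree shift of the voltage references can be checked with a short phasor calculation. The sketch below assumes ideal balanced pre-fault phase voltages and a faulty phase a clamped exactly to the dc-link midpoint (pole voltage zero); the function name `post_fault_references` is hypothetical. Subtracting the pre-fault phase a phasor from the two healthy references leaves every line-to-line voltage unchanged.

```python
import cmath
import math


def post_fault_references(v_phase=1.0):
    """Return (vb, vc) pole-voltage phasors after phase a is tied to the midpoint."""
    va = cmath.rect(v_phase, 0.0)
    vb = cmath.rect(v_phase, -2.0 * math.pi / 3.0)
    vc = cmath.rect(v_phase, 2.0 * math.pi / 3.0)
    # Phase a now sits at 0 V, so shift all references by -va to keep
    # the line-to-line voltages (va - vb, vb - vc, vc - va) unchanged.
    return vb - va, vc - va


vb_new, vc_new = post_fault_references(1.0)
print(abs(vb_new), math.degrees(cmath.phase(vb_new)))
print(abs(vc_new), math.degrees(cmath.phase(vc_new)))
```

Under these assumptions, the healthy references move from -120 and +120 degrees to -150 and +150 degrees (a 30-degree shift each, leaving 60 degrees between them), and their amplitude rises by a factor of sqrt(3), which is why the dc-link utilization of the reduced four-switch configuration is lower.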

B. NEUTRAL SHIFT
Bypassing a module reduces the voltage and power available from the drive, but the available current is not affected. As mentioned in section II, when a fault occurs in a module of an MMC or CMC, one option after isolating the faulty module is to bypass the corresponding modules in the other two phases to keep the output voltage balanced. However, the output voltage is then reduced.
With the NPS method, a balanced output is obtained without bypassing the corresponding healthy modules. As shown in Fig. 25(a), the line-to-line voltages in normal operation are 8.67 p.u.
In the conventional method, when modules W4, W5, and V5 experience a fault, the corresponding healthy modules V4, U4, and U5 are bypassed, and the new line-to-line voltages are 5.19 p.u. (Fig. 25(b)). In the NPS method, the angles between the phases that balance the line-to-line voltages are found by solving the balance equations given in [83].
As shown in Fig. 25(b), the output voltage is higher than that of the conventional method.
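The angle calculation behind the neutral shift can be sketched numerically. The code below is an illustration under stated assumptions, not the formulation of [83]: each healthy module contributes 1 p.u., the phase voltage magnitudes equal the number of healthy modules, and the three line-to-line magnitudes are forced equal through the law of cosines while the three inter-phase angles sum to 360 degrees. The function name `neutral_shift` and the bisection search are choices made for this example.

```python
import math


def neutral_shift(va, vb, vc, tol=1e-12):
    """Return (v_ll, angle_ab, angle_bc, angle_ca) in p.u. and degrees, or None.

    va, vb, vc: positive phase-voltage magnitudes (healthy modules per phase).
    None means no balanced solution with the neutral inside the triangle.
    """

    def angle(x, y, v_ll):
        # Law of cosines: v_ll^2 = x^2 + y^2 - 2*x*y*cos(angle)
        c = (x * x + y * y - v_ll * v_ll) / (2.0 * x * y)
        return math.acos(max(-1.0, min(1.0, c)))

    def total(v_ll):
        return angle(va, vb, v_ll) + angle(vb, vc, v_ll) + angle(vc, va, v_ll)

    # Feasible range of the common line-to-line magnitude.
    lo = max(abs(va - vb), abs(vb - vc), abs(vc - va))
    hi = min(va + vb, vb + vc, vc + va)
    if lo > hi or total(hi) < 2.0 * math.pi:
        return None
    # total() grows monotonically with v_ll, so bisect on total(v_ll) = 360 deg.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total(mid) < 2.0 * math.pi:
            lo = mid
        else:
            hi = mid
    v_ll = 0.5 * (lo + hi)
    return (v_ll,
            math.degrees(angle(va, vb, v_ll)),
            math.degrees(angle(vb, vc, v_ll)),
            math.degrees(angle(vc, va, v_ll)))


print(neutral_shift(5, 5, 5))  # healthy converter
print(neutral_shift(5, 4, 3))  # U: 5, V: 4, W: 3 healthy modules
print(neutral_shift(5, 2, 2))  # no solution with the neutral inside the triangle
```

For the example with five, four, and three healthy modules, this idealized model gives a balanced line-to-line voltage of about 6.77 p.u., compared with the 5.19 p.u. obtained by bypassing healthy modules. A `None` result corresponds to the situation in which the shifted neutral would fall outside the line-to-line voltage triangle.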

C. EXTENDED NEUTRAL SHIFT
This method is applied when the converter's neutral point obtained through the traditional neutral-shift approach lies outside the triangle of the output line-to-line voltages [84]. In this approach, the angle between the two voltages with the lowest amplitudes is set to 180 degrees, and the amplitude and angle of the remaining phase are calculated to maximize the output voltage. This method can increase the output voltage by 15%.

D. VOLTAGE EXTENSION
This method is similar to the NPS technique, with the difference that the output voltage can be sustained at its pre-fault level [85], [86]. An important drawback of the previous reconfiguration strategy is its effect on the common-mode voltage, which can put unbearable stress on the machine bearings [84]. Many papers, including [87], [88], [89], [90], [91], [92], [93], [94], have used this method in their fault-tolerant operation schemes.
To increase the converter's maximum output range, the average of the maximum and minimum reference phase voltages is injected into the common-mode voltage. Fig. 26(a) shows the phasor diagram of normal operation. When a fault occurs, the input dc-bus voltage of the faulty phase is increased to keep the total voltage, and hence the output voltage level, unchanged. As shown in Fig. 26(b), the three voltages are balanced and their values are the same as in normal operation. However, the modules in the phase containing the faulty modules experience overvoltage. Therefore, to share the increased voltage burden equally among all healthy modules of the three phases, the optimal angles of the phase voltages are calculated using equations (1) to (5), giving the result in Fig. 26(b).
The output harmonic distortion, however, may be significant, since this method requires the converter to work in the overmodulation region by injecting excess common-mode voltage [50]. Even though common-mode voltages do not appear in the output line-to-line voltages, they can impose unbearable voltage stress on motor bearings and shafts. With this method, the fundamental amplitude of the common-mode voltage is reduced, resulting in a more bearable operating condition for the load [84].
Moreover, for some fault conditions the equations have no solution or no unique solution, and the solution obtained may not yield the maximum available voltage.
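The effect of injecting the average of the maximum and minimum references can be illustrated for the healthy, balanced case. The sketch below assumes ideal sinusoidal references and uses the common min-max convention in which the negative of that average is added to all three references; the function names are hypothetical, and the post-fault equations (1) to (5) are not reproduced here.

```python
import math


def min_max_injection(va, vb, vc):
    """Add the common-mode offset -(max + min) / 2 to three phase references."""
    offset = -0.5 * (max(va, vb, vc) + min(va, vb, vc))
    return va + offset, vb + offset, vc + offset


def peak_reference(amplitude=1.0, inject=True, samples=3600):
    """Largest absolute phase reference over one fundamental period."""
    peak = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        refs = tuple(amplitude * math.cos(theta - i * 2.0 * math.pi / 3.0)
                     for i in range(3))
        if inject:
            refs = min_max_injection(*refs)
        peak = max(peak, max(abs(v) for v in refs))
    return peak


plain = peak_reference(inject=False)
shifted = peak_reference(inject=True)
print(plain, shifted, plain / shifted)
```

Because the same offset is added to every phase, the line-to-line voltages are unchanged, while the peak phase reference in this idealized case drops from 1 to sqrt(3)/2. The same dc-link voltage can therefore support a fundamental about 15.5% larger before the references saturate.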

VI. CONCLUSION
Because of their increased number of switches, fault-tolerant converters are more likely to experience switch breakdowns than standard converters. However, these solutions significantly decrease the probability of a complete converter failure, which is imperative for many industries that use power electronic converters. The first stage of fault management is fault diagnosis. Fault isolation follows and is the primary step in minimizing the aftermath of a fault; it uses extra hardware such as fuses and TRIACs. After fault isolation, the effects of the fault must be compensated. Switching state redundancy techniques may use a few extra components to make alternate post-fault conduction paths available; however, these extra components do not necessarily place them in the hardware redundancy category, which usually uses extra switches, legs, or modules to replace the faulty ones directly. After a fault occurs in the inverter, or even after hardware redundancy or switching state redundancy methods have been applied, imbalances such as voltage unbalance may remain, and these can be handled by enhancing the control strategy of the converter. Choosing the best fault compensation technique depends first on the type of inverter and on the application in which the inverter needs to be fault tolerant. Since fault tolerance strategies usually add extra components, cost is an important factor in selecting a fault compensation solution, along with losses and complexity.
Although various fault-tolerant topologies have been proposed in the last two decades, the growing demand for reliable systems leaves considerable potential for novel fault compensation ideas, especially imbalance control strategies.
As users may have different main considerations, suitable redundant designs are application dependent. For a noncritical application, economy is the main consideration. Although using redundant cells increases the cost of the system, a high level of reliability and survivability of the drive system is essential in some critical industrial processes involving high standstill costs and safety concerns. In addition, lifetime prediction and cost assessment make it possible to identify the redundant designs that are most cost-effective. Despite the abundance of redundant designs and fault-tolerant algorithms, their reliability improvements remain largely unquantified. Redundancy costs are assessed to determine the cost of fault-tolerant converters in order to provide manufacturers with an affordable option.

DMITRI VINNIKOV (Fellow, IEEE) received the Dipl.Eng., M.Sc., and Dr.Sc.Techn. degrees in electrical engineering from the Tallinn University of Technology, Tallinn, Estonia, in 1999, 2001, and 2005, respectively. He is currently the Head of the Power Electronics Group, Department of Electrical Power Engineering and Mechatronics, Tallinn University of Technology, Estonia. He is the Head of the Research and Development and a Co-Founder of Ubik Solutions LLC, an Estonian start-up company dedicated to innovative and smart power electronics for renewable energy systems. Moreover, he is one of the founders and leading researchers of ZEBE, the Estonian Centre of Excellence for zero energy and resource efficient smart buildings and districts. He has authored or coauthored two books, five monographs, and one book chapter, as well as more than 400 published articles on power converter design and development, and is the holder of numerous patents and utility models in this field. His research interests include applied design of power electronic converters and control systems, renewable energy conversion systems (photovoltaic and wind), impedance-source power converters, and implementation of wide bandgap power semiconductors. He is the Chair of the IEEE Estonia Section.

HADI TARZAMNI (Student Member, IEEE) was born in Tabriz, Iran, in 1992. He received the B.Sc. and M.Sc. degrees (Hons.) in power electrical engineering from the Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, in 2014 and 2016, respectively. He is currently pursuing the dual Ph.D. degree in power electronics engineering with the Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran, and the Department of Electrical Engineering and Automation, Aalto University, Espoo, Finland. He has authored and coauthored more than 30 journal and conference papers. He also holds six patents in the area of power electronics. Since January 2021, he has been a Researcher with the Department of Electrical Engineering and Automation and the Department of Electronics and Nanoengineering, Aalto University, Finland. His research interests include power electronic converters analysis and design, DC-DC and DC-AC converters, high step-up power conversion, soft-switching and resonant converters, and reliability analysis. He was a recipient of the Best Paper Award in 10th International Power Electronics, Drive Systems and Technologies Conference (PEDSTC), in 2019. He has been awarded a three-year Aalto ELEC Doctoral School grant, and a Jenny and Antti Wihuri Foundation Grant, in 2021 and 2022, respectively.