Infrared thermography‐based diagnostics on power equipment: State‐of‐the‐art

National Basic Research Program of China (973 Program), Grant/Award Number: Nos. 2018YFB0904400; National Natural Science Foundation of China, Grant/Award Number: Nos. 51877171

Abstract
As a non-contact temperature distribution measurement method, infrared thermography (IRT) has emerged as an indispensable tool in condition monitoring and fault diagnosis of electrical equipment based on absolute and relative temperature values. Manual fault inspection, an evaluation method based on expert experience, has formed a mature technical scheme with a large number of application cases. However, its efficiency and accuracy are being challenged by the rapid growth in the number of equipment in the power grid. The situation is improving with the advance of image processing techniques. Machine-assisted fault diagnosis provides a novel method to assist human beings in completing fault diagnosis under the intervention of human prior knowledge. However, the limitations of infrared images bring challenges to image analysis and processing, especially target detection. In pursuit of automatic fault diagnosis, deep learning algorithms are introduced to achieve target detection in complex environments. This study reviews the development of IRT-based diagnostics, beginning with the general procedures, objects and limitations of IRT-based fault inspection, and then gives an insight into the popular machine-assisted fault diagnosis as well as image-based intelligent fault identification. In addition, recommendations for the future of IRT are provided, covering the construction of intelligent infrared detection systems, the establishment of an open and shared infrared image database and the comprehensive utilization of joint visualization diagnosis technology.


| INTRODUCTION
A great deal of practical experience shows that the abnormal working state and insulation degradation of power equipment will, in all probability, give rise to heat accumulation, which is deemed a major cause of accelerated ageing or even failure of the whole equipment [1]. Accordingly, temperature rise monitoring is widely applied to detect early manifestations of insulation failure, overloading and inefficient operation of transmission and transformation equipment in the power system. Infrared thermography (IRT), as a highly sensitive, precise and non-contact temperature distribution measurement method [2], has become one of the most indispensable condition monitoring and fault diagnosis tools for electrical equipment in past decades [3], and has significantly enabled and improved on/off-line monitoring and routine inspection in substations and on transmission lines.
The development of IRT-based fault diagnostics on power equipment can, in retrospect, be summed up in three stages, namely manual fault inspection, machine-assisted (or semi-automatic) fault diagnosis and image-based intelligent fault identification. Manual IRT fault inspection relies on years of accumulated field experience with faults and on human prior knowledge, which can hardly be shared equally with others. In the past half century, manual fault inspection has formed a mature technical scheme with a large number of application cases [4,5]. In such inspection, well-qualified and experienced workers are of the essence throughout the entire process, including visual object identification, thermal image acquisition, overheating region searching and fault matching against judging indices. Admittedly, such a manual fault diagnosis method is flexible in field application, but it is over-reliant on subjective and indescribable experience and is time-consuming, especially for a whole substation. The efficiency and accuracy of manual IRT fault diagnosis are increasingly being challenged by the rapid growth in the number of equipment in the power grid [6]. As a consequence, some image processing technologies have been adopted to replace a part of the manual operation and have laid a foundation for more rapid and accurate machine-assisted fault diagnosis [7]. Machine-assisted fault diagnosis, also known as semi-automatic fault diagnosis, is a method that uses computers in place of human beings to complete primary target detection and temperature information extraction, and to assist human beings in completing fault diagnosis under the intervention of human prior knowledge. In general, the fault diagnosis is implemented in two steps, that is, extracting the object regions from the background with image segmentation and feature extraction algorithms, and evaluating the overheating fault with judgement criteria [8].
The bottleneck in this stage is the extraction of equipment regions and the recognition of multiple types of equipment against a complex background [3]. With deep learning (DL) algorithms, represented by the convolutional neural network (CNN), image processing has been promoted to a new stage in the field of object detection [9]. We refer to this stage as image-based intelligent fault identification, in which artificial intelligence algorithms are introduced to achieve target detection in complex environments. Compared with the machine-assisted fault diagnosis method, the self-learning and generalization abilities of artificial intelligence algorithms make it possible to recognize multiple types of devices simultaneously with a trained model and without human intervention. That is, this stage places greater emphasis on detecting a broad range of equipment types, even different instances of the same object class, instead of a specific object category. Some intelligent image processing models have been introduced and tailored for IRT-based fault identification and have demonstrated excellent results in terms of efficiency and accuracy [10].
In this study, the status and development of IRT-based diagnostics on power equipment are collated and summarized in terms of hardware capability, fault identification procedure, traditional and state-of-the-art thermal image processing approaches, clarifying the future development. The study is organized as follows: In Section 2, the fundamentals and the limitations of the IRT are briefly introduced. The typical thermal faults of power equipment and the general diagnosis procedure in field are demonstrated in Section 3. In Section 4, some traditional image processing approaches are collated. In Section 5, intelligent processing methods tailored for IRT-based fault diagnosis are presented. Finally, some thoughts on existing approaches and prospects for future development are drawn in Section 6. We hope this review provides a reference for innovative IRT-based applications, strategies and algorithm studies.

| PRINCIPLES AND LIMITATIONS OF IRT
In order to better apply IRT in fault diagnosis, it is necessary to fully understand its technical principles and limitations.

| Fundamental principles of IRT
In a nutshell, everything in nature with a temperature above absolute zero (−273°C) radiates infrared at all times, and this infrared radiation carries temperature characteristic information about the object [2]. The generation and propagation of thermal radiation follow the thermal radiation laws:

Planck's law: It shows the relationship between the radiance u(T) and the wavelength λ of radiation emitted from a blackbody, which absorbs all incident radiation and radiates a continuous spectrum, at any temperature T.

$$u(T) = \frac{2hc^{2}}{\lambda^{5}} \cdot \frac{1}{e^{hc/(\lambda k T)} - 1} \qquad (1)$$

where u(T) is the radiance of the blackbody per unit area and per solid angle for a particular wavelength λ (μm), W·cm⁻²·μm⁻¹·sr⁻¹; T is the blackbody temperature, K; h is the Planck constant, c is the speed of light in vacuum and k is the Boltzmann constant.

Stefan-Boltzmann law: The total power W_b radiated per unit area of a blackbody per unit time is proportional to the fourth power of the blackbody's temperature T.

$$W_b = \varepsilon \sigma T^{4} \qquad (2)$$

where W_b is the total radiant power, W/m²; σ is the Stefan-Boltzmann constant; ε is the radiation coefficient, and the radiation coefficients of common materials of power equipment are shown in Table 1 [11,12].

Wien displacement law: The peak wavelength λ_m of the radiation spectrum is inversely proportional to the blackbody temperature.

$$\lambda_{m} T \approx 2898\ \mu\text{m}\cdot\text{K} \qquad (3)$$
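To make the three radiation laws concrete, the following minimal Python sketch evaluates them numerically. It assumes the standard textbook forms (spectral radiance 2hc²/λ⁵ · 1/(exp(hc/λkT) − 1), radiant power εσT⁴, and peak wavelength b/T); the function names and the 300 K example are illustrative choices, not taken from the reviewed literature.

```python
import math

# Physical constants in SI units.
H = 6.62607015e-34       # Planck constant, J*s
C = 2.99792458e8         # speed of light in vacuum, m/s
K_B = 1.380649e-23       # Boltzmann constant, J/K
SIGMA = 5.670374419e-8   # Stefan-Boltzmann constant, W*m^-2*K^-4
WIEN_B = 2897.771955     # Wien displacement constant, um*K


def planck_radiance(wavelength_um: float, temperature_k: float) -> float:
    """Blackbody spectral radiance in W*cm^-2*um^-1*sr^-1 (Planck's law)."""
    lam = wavelength_um * 1e-6                      # um -> m
    exponent = H * C / (lam * K_B * temperature_k)
    si_value = (2.0 * H * C**2 / lam**5) / math.expm1(exponent)  # W*m^-3*sr^-1
    return si_value * 1e-10                         # m^-2 -> cm^-2, m^-1 -> um^-1


def radiant_power(temperature_k: float, emissivity: float = 1.0) -> float:
    """Total radiated power per unit area in W/m^2 (Stefan-Boltzmann law)."""
    return emissivity * SIGMA * temperature_k**4


def peak_wavelength_um(temperature_k: float) -> float:
    """Wavelength of maximum spectral radiance in um (Wien displacement law)."""
    return WIEN_B / temperature_k


if __name__ == "__main__":
    t = 300.0  # illustrative surface temperature, K
    print(f"peak wavelength: {peak_wavelength_um(t):.2f} um")
    print(f"radiant power:   {radiant_power(t):.1f} W/m^2")
    print(f"radiance @10 um: {planck_radiance(10.0, t):.3e} W/(cm^2 um sr)")
```

For an object near ambient temperature the peak falls at roughly 10 μm, so the sketch can be used to check which spectral band dominates for a given surface temperature and radiation coefficient.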
On this basis, an IRT imager collects the thermal radiation emitted by the object and converts it into temperature distribution pictures. This process can be described by the following model, which shows the conversion of thermal radiation signals into electrical signals and pixel values through the optical system, sampling system and detector, as shown in Figure 1 [13].
In Figure 1, $u_{P_0,\lambda}(r_0,\lambda,t)$ is the radiance emitted at point $P_0$ of the power equipment in direction $r_0$ at time t. $E_{P_i,\lambda}(\lambda,t)$ is the radiance received at point $P_i$ (corresponding to $P_0$) in direction $r_i$ through the solid angle $\Omega_{opt}$ subtended by the exit pupil at time t. $E_{P_i,\lambda}(\lambda,t)$ can be calculated by relation (4),
where θ is the angle of the direction $r_i$ to the optical axis. $\phi_\lambda(\lambda,t)$ is the spectral flux received by the detector area $A_{det}$, and S is the electrical signal taken from the detector output in the sensitive spectral range $[\lambda_{min}, \lambda_{max}]$ after integration during the time $T_{int}$,
where R(λ) is the spectral transmittance of the detector.
Therefore, each pixel value on the detector can be calculated by the relation (7).
In the conversion process model, the structure and performance of the detector directly determine the quality of the infrared images. The development of infrared detectors can, in retrospect, be summed up in four generations [14], as shown in Figure 2. The first generation produced infrared (IR) thermal images mainly by scanning the objects with a single charge-coupled device unit or with multiple units. In the second generation, focal plane array detectors with large numbers of units became the new mainstream. The progress of the third and fourth generations lies in the development of photosensitive materials: HgCdTe (mercury cadmium telluride), quantum-well IR photodetectors and type-II superlattice systems. Larger numbers of pixels, higher frame rates, better thermal resolution, multicolour functionality and other on-chip functions are the goals being pursued.
• Low resolution, strong spatial correlation: Infrared imaging obtains object images through the temperature difference between the object and the environment rather than through boundary or colour information. However, due to the small temperature difference and the low spatial resolution of infrared detectors, infrared images always have low resolution and strong spatial correlation.
• Low signal-to-noise ratio: Infrared images always have a low signal-to-noise ratio (SNR) because of the complex sources of noise during atmospheric propagation, photoelectric conversion, digital signal processing and non-uniformity correction.
• Heterogeneity [18]: The heterogeneity of infrared images is caused by the response characteristics of the detector units, the coupling characteristics of the circuit, the working state, the working environment and other factors, and eventually manifests as noise and distortion of the image.

| GENERAL DIAGNOSIS APPROACH OF THERMAL FAULTS OF POWER EQUIPMENT
This section collates the general procedures and objects of thermal diagnosis of power equipment. First, the typical faults of power equipment are classified according to the categories of equipment and underlying causes. Then the methods of equipment status evaluation are introduced, and the typical infrared images of multi-type faults of power equipment are given. In the end, the general diagnosis approach is demonstrated with a practical case.

| Thermal faults of power equipment
Electrical equipment operating in harsh environments continues to endure high field, heavy load, contamination and hazardous stress [19], most of which can give rise to internal or external heat accumulation and accelerate thermal ageing. Therefore, temperature anomaly is a critical indicator of a malfunctioning condition of the equipment [7].
According to the categories of equipment and underlying causes, thermal faults can be divided into four categories, that is, current-induced heating faults, voltage-induced heating faults, synthetic heating faults and non-electrical faults, as summarized in Table 2 [19]. Current-induced heating faults are common in current-carrying units because of an abnormal increase of electric resistivity or load current [20]. In general, this type of fault features a significant temperature rise. Voltage-induced heating faults are normally caused by an increase in dielectric loss, leakage current or locally enhanced field [21]. Synthetic faults, which are caused by electromagnetic effects such as eddy current and induction current, normally feature a relatively low temperature [22]. In addition, abnormal temperature change caused by gas leakage is also a common fault [23].

| Abnormal temperature rise judgement
There are three common methods of abnormal temperature rise judgement, namely the absolute temperature rise-based method, the image feature-based method and the homogeneous comparison-based method.

| Absolute temperature rise-based judgement
This method is suitable for voltage-induced heating faults and electromagnetic heating faults. The temperature value of the hotspots will be compared with temperature limits for the specific climatic conditions and operating load given in the standards.

| Image feature-based judgement
This method is suitable for voltage-induced heating faults. The temperature rise condition is rated by comparing the acquired infrared image to the similar images in normal and abnormal conditions matched in the image depot.

| Homogeneous comparison-based judgement
Considering that the temperature value is inevitably affected by the surrounding environment and operating conditions, the temperature difference (ΔT) between the hotspots of homogeneous equipment working under similar operating and environmental conditions, such as parallel connections or different phases, is a more practicable factor for status evaluation [24]. By considering the environment temperature, the relative temperature difference η can be calculated by relation (8):

$$\eta = \frac{\tau_1 - \tau_2}{\tau_1} \times 100\% = \frac{T_1 - T_2}{T_1 - T_0} \times 100\% \qquad (8)$$

where τ_1 is the hotspot temperature rise, T_1 is the hotspot temperature, τ_2 is the reference spot temperature rise, T_2 is the reference spot temperature and T_0 is the environment temperature. Table 3 gives the fault degree of power equipment based on η according to the Chinese power industry standard DL/T 664-2016 [25]. In addition, standards from the InterNational Electrical Testing Association [26], the American Society for Testing & Materials (E1934) [27] and the National Fire Protection Association (NFPA 70-B) [28] are also employed as guidelines for IRT inspection.
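As a worked illustration of the homogeneous comparison, the sketch below computes η from three temperature readings, assuming the usual definition η = (τ1 − τ2)/τ1 × 100% with τ1 = T1 − T0 and τ2 = T2 − T0. The function name and the example temperatures are hypothetical, and the mapping from η to a fault degree is deliberately left out because it depends on the thresholds of the applicable standard.

```python
def relative_temperature_difference(t_hot: float, t_ref: float, t_env: float) -> float:
    """Relative temperature difference in percent.

    t_hot: hotspot temperature T1
    t_ref: temperature T2 of the corresponding spot on the reference equipment
    t_env: environment temperature T0
    All three values must use the same unit (degrees Celsius or kelvin).
    """
    rise_hot = t_hot - t_env   # tau_1
    rise_ref = t_ref - t_env   # tau_2
    if rise_hot <= 0:
        raise ValueError("hotspot must be warmer than the environment")
    return (rise_hot - rise_ref) / rise_hot * 100.0


if __name__ == "__main__":
    # Illustrative readings only: hotspot 60 C, reference phase 30 C, ambient 20 C.
    print(relative_temperature_difference(60.0, 30.0, 20.0))
```

Because both temperature rises are referenced to the same environment temperature, η stays comparable across inspections carried out in different weather, which is the motivation for preferring it over the raw ΔT.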

| Typical fault infrared image cases
Infrared diagnosis relies on the existing experience, so it is of great significance to construct the typical fault infrared image library based on a large number of practical cases. By sorting out the practical cases in the literature [21,25,26,29], infrared case maps are given as examples according to the type of equipment and typical faults, as shown in Table 4 and Figure 3.

| General procedure of thermal fault identification
The general procedure of thermal fault identification, based on a great deal of field application experience, is shown in Figure 4.
i. Select a proper field of view covering the main object.
ii. Locate the hotspots on the specific devices of the object equipment.
iii. Evaluate the thermal status by using the abnormal temperature rise judgement guidelines described in Section 3.2.
iv. Identify the origin of the fault in light of past experience, or match the fault in the image case library.
If the equipment is not faulty, the inspection is complete and the hotspot temperature value is recorded for possible historical trend analysis.
An example of IRT fault diagnosis on a high voltage post insulator is shown in Figure 5 [25]. By acquiring an IR image covering the insulators of the three phases, the hotspot is located on the B phase post insulator with a temperature value of 26.2°C; the homogeneous comparison method is thus adopted in the fault diagnosis. Comparison with the temperature value of the A phase post insulator shows that the temperature rise (ΔT) is 1.3°C. It is speculated that the fault is due to an increase of leakage current, which means the B phase post insulator should be replaced immediately.

| Summary
The extensive application of IRT-based thermal fault diagnosis of power equipment has significantly improved the validity of early fault diagnosis and has formed a set of effective manual fault diagnosis methods. Admittedly, such a human-based method can flexibly complete fault diagnosis in complex test conditions, but it is over-reliant on subjective and indescribable experience. In addition, it is difficult to ensure its work efficiency as the number of power equipment increases greatly. Therefore, using machine vision technology instead of manual analysis has become a new development trend. In this process, studies on automatic IRT fault diagnosis have gone through two stages, that is, machine-assisted fault diagnosis and image-based intelligent fault identification, which are discussed in Section 4 and Section 5.

| MACHINE-ASSISTED FAULT DIAGNOSIS
Machine-assisted fault diagnosis, as a kind of semi-automatic method, can replace a part of the manual operation in image pre-treatment and target detection, and significantly improve the accuracy and speed of image processing. The basic flowchart of machine-assisted fault diagnosis is shown in Figure 6. In this section, the principles and advances of image pre-treatment, image segmentation, target identification, feature extraction and thermal status rating are reviewed in turn.

| Image pre-treatment
As mentioned in Section 2.2, a high-quality infrared image is the premise of effective target identification and fault location. Therefore, prior to target detection, image pre-processing should be performed to suppress the noise, enhance the contrast and improve the image quality.
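A minimal sketch of such a pre-treatment step is given below. It assumes a single-channel image held in a NumPy array and uses two generic operations chosen purely for illustration, a 3×3 median filter for noise suppression and a percentile-based contrast stretch; these are not the specific methods of any study cited here, and the function names are hypothetical.

```python
import numpy as np


def median_filter3(img: np.ndarray) -> np.ndarray:
    """Suppress impulse noise with a 3x3 median filter (edges are replicated)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Stack the nine shifted views of the image and take the per-pixel median.
    windows = np.stack(
        [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)], axis=0
    )
    return np.median(windows, axis=0)


def stretch_contrast(img: np.ndarray, low_pct: float = 1.0, high_pct: float = 99.0) -> np.ndarray:
    """Linearly rescale intensities between two percentiles to the range [0, 1]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi <= lo:  # flat image: nothing to stretch
        return np.zeros(img.shape, dtype=np.float64)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(100.0, 5.0, size=(64, 64))  # synthetic low-contrast frame
    image[10, 10] = 255.0                          # isolated noisy pixel
    cleaned = stretch_contrast(median_filter3(image))
    print(cleaned.min(), cleaned.max())
```

The median filter is used here because it removes isolated outliers without blurring edges as strongly as a mean filter; any other denoising or enhancement method could be substituted at the same point in the pipeline.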

| Image segmentation
Image segmentation is the essential element of target extraction. It divides the image into several specific regions of interest with unique features, including the grey scale, colour, texture and shape of the image, and extracts the target of interest [41]. In principle, it can be classified into threshold segmentation, edge detection and region correlation [42]. In addition, there are also some methods based on the roughness, contrast, direction and compactness of adjacent pixels with similar features space clustering [17]. To summarize the efforts made in this field in recent years, some representative examples of infrared image segmentation methods are listed in Table 5.
An example of IRT image segmentation results using various methods is shown in Figure 7 [56]. As shown in Figure 7, due to the limitations of infrared images, simple image segmentation methods tend to over-segment or under-segment the image. Fusion of segmentation methods is an effective way to solve the problem [43]. For instance, the Sobel operator can be introduced into the region growing algorithm as an additional condition in the growth criterion, which effectively reduces the over-segmentation and under-segmentation caused by noise signals while maintaining sufficient computing speed. In addition, morphological and random-process methods, such as morphological algorithms [46,55] and Markov random fields [56], are introduced to improve the accuracy of infrared image segmentation.
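To illustrate the threshold segmentation family named above, the following sketch implements the basic single-threshold OTSU criterion, which selects the grey level that maximizes the between-class variance of the histogram. It is a plain textbook version, not one of the improved variants used in the cited studies, and it assumes an 8-bit grey-scale image; the function name is an illustrative choice.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the grey level t that maximizes the between-class variance.

    gray must be an array of unsigned 8-bit grey levels (0..255).
    Pixels with value > t form the foreground (warmer) class.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)               # weight of class 0 (levels <= t)
    w1 = 1.0 - w0                      # weight of class 1 (levels > t)
    mu_cum = np.cumsum(prob * levels)  # cumulative mean up to level t
    mu_total = mu_cum[-1]

    # Between-class variance; defined only where both classes are non-empty.
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    sigma_b = np.zeros(256)
    sigma_b[valid] = (mu_total * w0[valid] - mu_cum[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(sigma_b))


if __name__ == "__main__":
    # Synthetic frame: cool background (level 50) with a warm square (level 200).
    frame = np.full((40, 40), 50, dtype=np.uint8)
    frame[10:20, 10:20] = 200
    t = otsu_threshold(frame)
    mask = frame > t
    print(t, int(mask.sum()))
```

On real infrared images a single global threshold of this kind is exactly where the over- and under-segmentation discussed above tends to appear, which is why the cited work combines thresholds with edge, region or morphological information.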

| Target identification
As mentioned in Section 3.4, accurate object recognition is the basis of automatic fault location and the subsequent diagnosis. In the specific realization process, the features of the equipment region in the infrared image should first be extracted, and then a prior classifier should be employed to recognize the object from the extracted features.
In principle, the essence of feature extraction is to describe the high-dimensional spatial features of images with low-dimensional features by means of mapping [58]. In brief, colour, texture and geometric features are the three most important features of images [59]. However, the colour feature of an infrared image corresponds to the temperature distribution rather than to colour information, so it is not suitable as a feature in infrared images. The texture feature can accurately describe the device, depending on the quality of the infrared image, and the typical texture feature is the histogram of oriented gradients [59]. Being invariant to rotation, translation and scale, a series of geometric invariants such as Hu moments [60,61] and Zernike moments [62] are used as equipment features in infrared images.
An effective classification model is another key factor of target identification, which is generally built via continuous renovation and perfection with the features extracted from new observation data in the application. In pursuit of training efficiency and model robustness, two classical classifiers should be mentioned, that is, support vector machine (SVM) [63] and artificial neural network (ANN) [64], which are well recognized in wide practical applications. The former has better generalization ability and interpretability, and is applicable to the optimal solution of small sample spaces. The latter has strong nonlinear fitting ability and self-learning ability, and is more suitable for the optimal solution of large sample spaces. For different research objects and modelling accuracy, the two methods achieve different results. Table 6 sums up the experience of infrared image target identification in recent years, which provides various attempts at target identification of different power equipment. It should be pointed out that although feature extraction and target identification have gained some traction, a generic model that accommodates diverse infrared image samples in practical use without human intervention is still lacking.
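As an illustration of geometric invariants, the sketch below computes the first two Hu moments of a segmented equipment mask from normalized central moments, using the standard definitions φ1 = η20 + η02 and φ2 = (η20 − η02)² + 4η11². The function name and the L-shaped test mask are hypothetical; the resulting feature vector would then be passed to a classifier such as an SVM, which is not shown here.

```python
import numpy as np


def hu_first_two(mask: np.ndarray) -> tuple[float, float]:
    """First two Hu moment invariants of a binary or grey-level region."""
    img = np.asarray(mask, dtype=np.float64)
    m00 = img.sum()
    if m00 == 0:
        raise ValueError("empty region")

    ys, xs = np.mgrid[: img.shape[0], : img.shape[1]]
    xc = (xs * img).sum() / m00  # centroid column
    yc = (ys * img).sum() / m00  # centroid row

    def eta(p: int, q: int) -> float:
        # Central moment mu_pq, normalized for scale by m00^(1 + (p+q)/2).
        mu = (((xs - xc) ** p) * ((ys - yc) ** q) * img).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    phi1 = eta(2, 0) + eta(0, 2)
    phi2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4.0 * eta(1, 1) ** 2
    return phi1, phi2


if __name__ == "__main__":
    shape = np.zeros((20, 20))
    shape[3:10, 4:7] = 1    # vertical bar of an L-shaped region
    shape[8:10, 4:12] = 1   # horizontal bar
    print(hu_first_two(shape))
    print(hu_first_two(np.roll(shape, (5, 6), axis=(0, 1))))  # translated copy
    print(hu_first_two(np.rot90(shape)))                      # rotated by 90 degrees
```

The three printed pairs agree up to floating-point error, which is the property that makes such invariants attractive when the same equipment appears at different positions and orientations in the field of view.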

| Temperature information extraction
An open-source infrared imager provides a header file that includes a spatial temperature matrix, from which the temperature distribution of the region of interest (ROI) can be extracted conveniently [68]. For non-digital imagers still in use and source-restricted commercial imagers, however, the temperature information of the ROI has to be extracted from the infrared image directly. As mentioned in Section 2.1, the infrared imager converts the temperature information of each detecting pixel into the image with a colour mapping function, which makes it possible to extract the temperature information with a fitted function describing the relationship between the red, green and blue components (or the grey value) and the temperature value. Toward this end, linear/nonlinear continuous functions [66,68] or more complex functions such as piecewise functions [69] are utilized. Based on this, the temperature range can be mapped onto the colorimetric value range, and the ROI can be coded into a spatial temperature matrix. The general process of temperature information extraction is described in Figure 8.
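The simplest case of such a fitted function is a linear mapping between grey value and temperature. The sketch below assumes that the image is grey-scale, that the lowest and highest grey levels correspond to the lower and upper limits of the temperature scale bar, and that the relationship in between is linear; real imagers may require the nonlinear or piecewise fits mentioned above. All names and the example values are illustrative.

```python
import numpy as np


def grey_to_temperature(grey, t_min: float, t_max: float,
                        grey_min: float = 0.0, grey_max: float = 255.0) -> np.ndarray:
    """Map grey values to temperatures with a linear colour-bar model."""
    grey = np.asarray(grey, dtype=np.float64)
    scale = (t_max - t_min) / (grey_max - grey_min)  # degrees per grey level
    return t_min + (grey - grey_min) * scale


def roi_hotspot(temperature: np.ndarray, rows: slice, cols: slice) -> float:
    """Maximum temperature inside a rectangular region of interest."""
    return float(temperature[rows, cols].max())


if __name__ == "__main__":
    image = np.array([[0, 51, 102],
                      [153, 204, 255]], dtype=np.uint8)
    # Assumed scale bar of the image: 20 C (black) to 70 C (white).
    temps = grey_to_temperature(image, t_min=20.0, t_max=70.0)
    print(temps)
    print(roi_hotspot(temps, slice(0, 2), slice(0, 2)))
```

With the ROI coded into a temperature matrix in this way, hotspot and reference temperatures for the judgement criteria of Section 3.2 can be read directly from the array.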

| Thermal fault identification and diagnosis
Thermal fault recognition is the final step of the whole IRT-based fault diagnosis. Provided that target identification and temperature information extraction are fully completed, fault recognition can be easily realized by comparing the relative temperature difference η of the ROI with the corresponding criteria of the standard references [60,64,68]. In reality, however, it is hard to guarantee the confidence of such an evaluation by using a single characteristic parameter. On account of the variety of thermal faults of power equipment and the complexity of the underlying causes, thermal fault recognition is the step that is most dependent on manual intervention and experience transplantation. Even so, the use of machine assistance can still greatly improve the efficiency of diagnosis, and the evolution and application of intelligent algorithms make it possible to replace human labour.
ANN and its extended models have achieved good performance in thermal fault diagnosis [24]. Hongying […] ambient temperature and relative humidity [71]. A probabilistic neural network model with the temperature eigenvector as characteristic parameters was constructed to diagnose the low or zero resistance faults and pollution faults of the insulator string [72]. A multilayer perceptron network was applied to evaluate the equipment condition in reference [73]. Test results of the above networks show that the classification accuracy of these systems can reach more than 80%, and even 90%.

FIGURE 6 The flowchart of machine-assisted fault diagnosis. IR, infrared
As a matter of fact, even though intelligent approaches have been adopted, such machine-assisted diagnosis still falls short of practical requirements. The variety of thermal faults of power equipment and the complexity of the underlying causes are still the main challenges in reality. At the present stage, machine-assisted diagnosis has proven effective for objects with a simple structure and a single fault cause. In the example of a thermal fault of a contact connector studied in reference [4], as shown in Figure 9, the η between the A phase and C phase is 87%, and the fault degree is determined as major. By matching the temperature rise range and the potential fault causes of the contact connector, the thermal fault is accurately recognized as a poor connection due to the loosening of a bolt. However, for equipment with multiple fault causes, such diagnosis is inadequate without the input of auxiliary information. For example, as shown in Table 4, the causes of overheating faults of the mutual inductor electromagnetic unit mainly include internal damp, turn-to-turn short circuit and ferromagnetic resonance, which lead to similar temperature distributions on the equipment surface; therefore, the fault cause cannot be concluded from the infrared image alone. To solve this problem, combining IRT diagnosis with other sensing information, such as dielectric loss measurements, moisture content, dissolved gas analysis (DGA) and voltage monitoring, is a feasible approach.

TABLE 5 Representative infrared image segmentation methods
Method | Model and algorithm | Object | Reference
Threshold method | Fuzzy Renyi entropy and chaos differential evolution algorithm | Isolating switch | [43]
Threshold method | OTSU algorithm and normalized cross correlation (NCC) template matching | Transformer bushing | [44]
Threshold method | Bi-threshold OTSU algorithm | Insulator | [8]
Threshold method | Mathematical morphology improved OTSU algorithm | Insulator | [45]
Threshold method | Mathematical morphology improved OTSU algorithm | Control components for electric machines | [46]
Edge detection method | Roberts operator | Solar panel connecter | [47,48]
Edge detection method | Prewitt operator

| Summary
In machine-assisted thermal fault diagnosis, the accuracy of fault recognition depends to a great extent on feature extraction and target detection. However, the features are still manually selected by means of complex enhancement and segmentation, and traditional target detection is still inadequate to identify multicategory power equipment against a complex background. The problems remaining in machine-assisted fault diagnosis, such as weak universality, a complicated process and slow recognition speed, hinder its extensive use and are expected to be addressed by more intelligent and universal strategies.

| IMAGE-BASED INTELLIGENT FAULT IDENTIFICATION
Compared with traditional target detection, the intelligent algorithms represented by DL have raised the level of target detection and realized image-based intelligent fault identification. The contributions of the intelligent algorithms are mainly reflected in the process of target detection, while temperature extraction and fault diagnosis remain similar to the previous methods. Therefore, this section mainly reviews the application of DL algorithms in target detection.

| Overview of the deep learning
In 2006, the concept of DL was proposed by Geoffrey Hinton in reference [74]. DL is a representation-learning method with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level [75]. Compared with traditional learning algorithms such as SVM, DL is a data-driven method without a manual feature extraction process. Owing to the use of complex models, deep features have more accurate and general expressive ability [76]. DL techniques have been widely applied to transcribe speech into text, match news items, posts or products with users' interests and select relevant search results, and especially to detect targets in images [75]. Typical DL network structures include the Deep Belief Network, CNN, recurrent neural network and Capsule Network (CapsNet). Among them, CNN and its extended frameworks have made great achievements in the area of image processing [77].
In terms of network architecture, CNN is a kind of deep feedforward network whose basic structure consists of an input layer, multiple convolutional layers, multiple pooling layers, a full connection layer and an output layer. The basic structure of the CNN is shown in Figure 10 [78].

| Input layer
It is the gateway to the entire network for data input.

| Multiple convolutional layers
In these layers, convolution features are extracted from the input data by convolution kernels (also called filters). The results are then nonlinearly mapped through the activation function to generate two-dimensional feature maps, which are stacked along the depth direction to obtain the output volume of the convolutional layers. Commonly used activation functions include the Sigmoid function, the Tanh function and the Rectified Linear Unit (ReLU).

| Multiple pooling layers
The pooling layer combines the response values of the feature maps based on a pooling function, so as to reduce and abstract the convolutional features, which reduces the spatial size of the feature maps and the amount of network computation. Common pooling functions include the average pooling function and the maximum pooling function.

| Full connection layer
Full connection layer integrates and classifies the features after multiple convolution and pooling operations.

| Output layer
It is the outlet for classification results.
In this study, the application of CNN and its extended structures in infrared image processing of power equipment is introduced. Given the scope of this study, more details about the background knowledge and the basic network architecture of CNN can be found in the literature [75,77,79].
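The layer sequence described above can be sketched as a single forward pass in NumPy. This is a toy illustration with one convolutional layer, ReLU activation, 2×2 maximum pooling, one full connection layer and a softmax output; the weights are random, nothing is trained, and all function names are illustrative rather than part of any cited model.

```python
import numpy as np


def conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid 2-D cross-correlation, the operation used in CNN convolutional layers."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x: np.ndarray) -> np.ndarray:
    """Rectified Linear Unit activation."""
    return np.maximum(x, 0.0)


def max_pool(x: np.ndarray, size: int = 2) -> np.ndarray:
    """Non-overlapping maximum pooling; trailing rows/columns that do not fit are dropped."""
    h, w = x.shape[0] // size * size, x.shape[1] // size * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


def softmax(z: np.ndarray) -> np.ndarray:
    e = np.exp(z - z.max())  # subtract the maximum for numerical stability
    return e / e.sum()


def forward(image, kernels, fc_weights, fc_bias) -> np.ndarray:
    """Input -> convolution + ReLU -> pooling -> full connection -> class probabilities."""
    maps = [max_pool(relu(conv2d(image, k))) for k in kernels]  # one feature map per kernel
    features = np.concatenate([m.ravel() for m in maps])        # flatten and stack
    return softmax(fc_weights @ features + fc_bias)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((6, 6))                                  # toy single-channel input
    kernels = [rng.standard_normal((3, 3)) for _ in range(2)]   # two 3x3 filters
    fc_w = rng.standard_normal((3, 8))                          # 2 maps x (2x2) -> 3 classes
    print(forward(image, kernels, fc_w, np.zeros(3)))
```

Practical networks stack many such convolution and pooling stages and learn the kernels and full connection weights by backpropagation, which is handled by DL frameworks rather than written by hand.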

| CNN-based target detection on IR images
There are two frameworks for CNN-based target detection methods used in IR image processing [77]:
i. Two-stage detection framework, or region proposal framework. In this framework, a pre-processing step builds category-independent region proposals from the image before CNN features are extracted from the proposals, and the labels of the proposals are then determined by classifiers. The representative algorithms are Region-based CNN (R-CNN), SPPNet, Fast R-CNN, Faster R-CNN, Region-based Fully CNN, Mask R-CNN and Light Head R-CNN. These methods are satisfactory in terms of detection accuracy, but they inevitably take a lot of computing resources and processing time.
ii. One-stage detection framework, or region proposal free framework. This framework runs the overall pipeline in a single stage, without a separate proposal step. Here the CNN is used as a regressor, and the whole image to be detected is regarded as a candidate region. By directly inputting the image into the convolutional neural network, the position information of the target in the image can be detected. The representative algorithms include DetectorNet, OverFeat, You Only Look Once (YOLO), YOLOv2, Single Shot Detector and YOLOv3, most of which are usually superior to the methods of the first framework in terms of detection speed.
To further understand the applications of CNN to IRT image target detection, some of the latest research advances for practical use are presented as follows.
In 2017, Zhenbing Zhao et al. [10] presented a deep CNN feature extraction strategy and a vector of locally aggregated descriptors (VLAD) feature map aggregation method for detecting insulator strings in infrared images, as shown in Figure 11. Unlike the conventional CNN-VLAD method, this network extracted deep activations from the convolutional feature maps of infrared insulator images and modified the deep feature extraction framework by replacing the last three full connection layers with a VLAD pooling layer. Finally, an SVM classifier was trained for infrared insulator classification and detection. Compared with previous work, this method greatly enhances the invariance of the extracted deep features. Similar work has been reported in the literature [80]. The authors achieved an identification accuracy of 99.9% on seven types of equipment, that is, insulator strings, lightning arrester, circuit breaker, current transformer, capacitor voltage transformer, disconnecting switch and high voltage bushing, by inputting 4119 samples into a CNN model with 10 layers.
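As a rough illustration of the VLAD aggregation step described above, the following Python sketch pools the local descriptors of a convolutional feature map into a single image vector. It is not the implementation of [10]: the random feature map stands in for CNN activations, the codebook `centroids` is a hypothetical placeholder (it would normally be learned, for example by k-means), and the final L2 normalization is an assumed, commonly used choice.

```python
import numpy as np

def vlad_encode(descriptors: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Aggregate local descriptors (N, D) into one VLAD vector of length K*D."""
    # Squared Euclidean distance from every descriptor to every centroid: (N, K)
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the closest centroid per descriptor
    k, d = centroids.shape
    vlad = np.zeros((k, d), dtype=np.float64)
    for i in range(k):
        members = descriptors[nearest == i]
        if len(members):
            # Accumulate the residuals between descriptors and their centroid
            vlad[i] = (members - centroids[i]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    # L2 normalization (assumed here; other normalizations are possible)
    return vlad / norm if norm > 0 else vlad

# Toy data: a 4x4 feature map with 8 channels gives 16 local descriptors.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 4, 8))   # placeholder for CNN activations
descriptors = feature_map.reshape(-1, 8)
centroids = rng.normal(size=(3, 8))        # hypothetical codebook with K = 3
image_vector = vlad_encode(descriptors, centroids)
```

The resulting fixed-length vector (here K x D = 24 values) is what a classifier such as an SVM would take as input, regardless of the spatial size of the feature map.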
In 2018, an automatic segmentation and recognition system for IRT images of power equipment based on CNN was reported, in which a segmentation method named JSEG was used to extract the overheated spots and a YOLO network was applied to identify the equipment type of the hotspots [81]. Xiaojin Gong et al. also reported a deep CNN based on YOLO for power equipment detection in thermal images [82]. An overview of the model is shown in Figure 12. The deep CNN took an IRT image as input and output both oriented bounding boxes and the associated class probabilities, followed by a non-maximum suppression step to obtain the final detection results. By predicting tightly fitting oriented boxes instead of simple upright bounding boxes, this model overcomes the loss of identification accuracy caused by orientation changes and suppresses background noise. In addition, it predicts the coordinates, orientation angle and class type of each equipment part in complex scenarios.
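The non-maximum suppression step mentioned above can be sketched as follows. This is a minimal greedy version under simplifying assumptions: it uses upright boxes in `[x1, y1, x2, y2]` form rather than the oriented boxes of [82], and the function names, the example boxes and the threshold of 0.5 are illustrative choices, not values taken from the cited work.

```python
import numpy as np

def iou(box: np.ndarray, boxes: np.ndarray) -> np.ndarray:
    """Intersection over union between one box and an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    # Width and height of the overlap are clipped at zero for disjoint boxes
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Two near-duplicate detections of one object and one separate detection.
boxes = np.array([[10, 10, 50, 50],
                  [12, 12, 52, 52],
                  [100, 100, 140, 150]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)  # indices of the boxes that survive suppression
```

For oriented boxes, the same greedy loop applies, but the overlap would have to be computed between rotated rectangles instead of with the axis-aligned formula used here.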
In 2019, Li Lianqiao et al. realized target detection and automatic readout of the temperature of the hottest spot based on YOLO [83]. In addition, the authors compared its recognition accuracy and processing speed with those of an R-CNN-based model. The results showed that, despite a slight loss of recognition accuracy, introducing YOLO significantly improves the speed of target recognition in IRT images.
To enhance the feature extraction ability and improve the target positioning accuracy, Guangjun Yang introduced a Faster R-CNN model based on a Feature Pyramid Network, which showed good potential for small target detection in infrared images and reached an identification speed of 15 frames per second on a graphics processing unit for streaming media applications [84].
In 2020, our research group first applied Mask R-CNN to the instance segmentation of infrared images, by which the outlines of objects are predicted at the pixel level and different objects belonging to the same category are distinguished [68]. The principle of Mask R-CNN-based instance segmentation for insulator infrared images is shown in Figure 13.
To address the small sample size (SSS) problem, which is one of the most common causes of training failure, we adopted transfer learning to complete the preliminary training of the Mask R-CNN model. Specifically, the Common Objects in Context (COCO) dataset was tailored with a limited number of annotated infrared images, while the other infrared images were used as the verification set and testing set to determine the Mask R-CNN weights. The design and flowchart of the instance segmentation process for infrared images are shown in Figure 14. In fact, during the training of a neural network, the adjustment of network parameters such as the learning rate, number of hidden units, mini-batch size, training strategy and number of iterations per epoch still depends on human experience. Therefore, when the sample size is small, manual intervention has a direct impact on the network precision.

FIGURE 11 The global image descriptor generating framework. An infrared image of an insulator is fed into an ImageNet pre-trained CNN model, and then deep convolutional activations are extracted. The feature maps of the convolutional layer are vectorized and pooled by VLAD coding and quantifying. Finally, the final image representation is generated. CNN, convolutional neural network; VLAD, vector of locally aggregated descriptors
With high-quality instance segmentation, the temperature distribution of the target can be easily obtained by converting the grey values into a temperature matrix through function fitting. Figure 15 shows a practical diagnosis case of an insulator. The main steps of the automatic fault diagnosis are as follows: First, a trained Mask R-CNN model performs instance segmentation on the original infrared image and returns the mask region coordinates, as shown in Figure 15a,b.
Second, the temperature information is extracted from the pixels of the insulator mask, as shown in Figure 15c. By image greying, the false colour information of the mask areas is converted to grey values, as in Figure 15c (1) to (2). On this basis, the temperature values are fitted from the grey values of every pixel in the mask areas, as in Figure 15c (3) and (4).
Finally, the insulator was judged to be in a fault state according to the abnormal temperature value and, combined with the temperature distribution, the potential fault cause was considered to be transverse or longitudinal cracks.
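The second step above, fitting temperatures from grey values inside the mask, can be sketched in Python as follows. This is only an illustration under stated assumptions: the calibration pairs `bar_grey` and `bar_temp` are hypothetical samples (for instance read from the colour bar of the thermogram), a first-degree polynomial is assumed for the fit, and the small `grey` image and `mask` stand in for the greyed image and the Mask R-CNN output. The actual fitting function used in the study is not specified here.

```python
import numpy as np

# Hypothetical calibration pairs: grey value -> temperature in degrees Celsius.
bar_grey = np.array([0.0, 64.0, 128.0, 192.0, 255.0])
bar_temp = 10.0 + 40.0 * bar_grey / 255.0  # assumed 10 to 50 degC scale

# Fit a first-degree polynomial mapping grey values to temperatures.
to_temperature = np.poly1d(np.polyfit(bar_grey, bar_temp, 1))

# Toy greyed image and a boolean mask marking the segmented insulator pixels.
grey = np.array([[20, 30, 40, 25],
                 [35, 120, 130, 30],
                 [28, 125, 255, 33],
                 [22, 31, 29, 240]], dtype=float)
mask = np.zeros(grey.shape, dtype=bool)
mask[1:3, 1:3] = True

# Temperature matrix of the whole image, then restricted to the mask.
temperature = to_temperature(grey)
masked = np.where(mask, temperature, np.nan)

# Hottest pixel inside the mask; pixels outside the mask are ignored.
row, col = np.unravel_index(np.nanargmax(masked), masked.shape)
hot_spot_temp = masked[row, col]
mean_temp = np.nanmean(masked)
```

Restricting the statistics to the mask is the point of the segmentation step: the bright pixel at the bottom-right corner of the toy image lies outside the mask and therefore does not affect the hot-spot readout.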

| Summary
Image-based intelligent fault identification, as a data-driven feature extraction method, has a more accurate and general expression ability than the semi-automatic method. It can extract power equipment accurately against complex backgrounds and recognize multiple types of equipment simultaneously. However, because the accuracy and generalization ability of these models depend on infrared datasets of equipment, whose establishment requires a great deal of manpower and time, the performance of the existing target detection models still cannot reach the level of manual recognition. At the same time, as the models become more complicated, they demand more computing power from the hardware.

FIGURE 13 Flowchart of instance segmentation for insulator

FIGURE 14 Design and flowchart of the insulator instance segmentation process for infrared images. R-CNN, Region-based convolutional neural network

| FUTURE RECOMMENDATIONS
With the increase in manpower cost and the expansion of system scale, automation and intelligence are the inevitable trends in the development of IRT for power equipment. In this study, the development of IRT is summarized, and several mainstream intelligent methods are described. This section provides the following suggestions for future application and engineering practice based on the particular characteristics of power equipment:

| Construction of intelligent infrared detection system
Whether the machine-assisted method or the image-based intelligent method is applied, large-scale models and operations require computational power beyond the capacity of portable or online monitoring devices. By drawing on cloud computing and edge computing technologies, which open up a new solution for distributing computing power and sharing storage resources, intelligent IRT diagnosis is expected to achieve higher efficiency and processing speed with limited local hardware resources [85]. A framework of the future edge-cloud synergetic intelligent IRT diagnosis system is shown in Figure 16.

| Establishment of an open and shared infrared image database
Like machine vision applications in other industrial and consumer markets, the development and application prospects of IRT depend strongly on openness and data sharing by manufacturers and users. Therefore, an ideal data ecochain should be established through open access to the real data interfaces and software development kits provided by IRT device manufacturers, together with a large number of infrared fault cases provided by power equipment managers.

| Comprehensive utilization of joint visualization diagnosis technology
As mentioned in Section 4.5, a thermal fault of power equipment is the combined result of various underlying causes, which are accompanied by other physical phenomena such as electrical discharges and mechanical vibrations and are reflected in specific ranges of the optical or acoustic spectrum, as shown in Figure 17. It is believed that, in the foreseeable future, the knowledge dimensionality of fault diagnosis can be significantly increased by adopting more visual perception technologies [76,86–89].

| CONCLUSION
In this study, we reviewed IRT-based fault diagnostics on power equipment from its development history to recent advances. As the driving force of this development, the rapidly growing number of equipment in the power grid necessitates the replacement of manpower with automatic and intelligent technologies. Following the evolution of image processing algorithms and hardware computing power, the development of IRT-based fault diagnosis went through two stages, that is, machine-assisted (or semi-automatic) fault diagnosis and image-based intelligent fault diagnosis, both of which show great potential to replace part or even all of the manual work in infrared image analysis. In this review, we attempted to sum up the experience needed to answer two critical questions, that is, how to use image processing technology to extract the thermal status of the power equipment from the infrared image, and how to identify the fault causes with a priori knowledge and new knowledge mined from the temperature distribution information.
To solve these questions, the technical essentials including image pre-treatment, image segmentation, target identification, temperature information extraction and thermal fault recognition have been linked into a generic procedure for manual, semi-automatic as well as intelligent IRT diagnosis. It should be mentioned that the algorithms involved in such a procedure, especially those for image segmentation and target identification, are not dedicated to infrared image analysis and thus should be tailored to low-resolution infrared images and targets in complex backgrounds.

FIGURE 16 The framework of the future edge-cloud synergetic intelligent IRT diagnosis system. DL, deep learning; IRT, infrared thermography

FIGURE 17 Power equipment state visual perception technology and detection objects. (a) Optical visual means (b) Acoustic visual means
With the advent of DL algorithms represented by CNN, image-based intelligent fault identification has gained momentum in recent years. Such a data-driven method provides fully automatic feature extraction without human intervention. Compared with the conventional approach, target detection based on deep features has been shown to have higher accuracy and stronger generalization ability. However, the limited local computational power and the small sample size of infrared image cases are the major bottlenecks that prevent the widespread use of intelligent IRT diagnosis on power equipment. In view of this, more effort is required to improve the algorithmic efficiency of intelligent IRT diagnosis and to establish an open and sharable infrared image database in the future.