User-transformer relation identification based on power balance model and adaptive AFSA

User-transformer relation identification plays an important role in the correct management of low-voltage area archives and in reducing line loss. To identify user-transformer relations accurately, this paper proposes an identification method for low-voltage areas based on a power balance model and an adaptive artificial fish swarm algorithm (AFSA). The method uses the summation relationship between the total meter of the transformer and the users' meters/meter boxes, fits the coefficients of the power balance equation with the AFSA, and then uses the coefficients and related statistical values to judge the user-transformer relation. The main innovations are: a power balance model is proposed to solve the problem of user-transformer relation identification, which is simpler than previous methods and easy to operate; the AFSA is used to fit the regression coefficients of the power balance equation, which has advantages in calculation accuracy and efficiency over the traditional least squares method; an adaptive step length strategy is proposed to strengthen the optimization ability of the AFSA. Verification on real station data shows that the method can quickly and accurately identify users' meters/meter boxes with an abnormal user-transformer relationship, and that it achieves high computational efficiency and recognition accuracy without additional labor or hardware costs.


Introduction
The line loss rate of a transformer area is an important indicator of the operation and management level of a power company. Improving the calculability of the line loss rate is the fundamental way to reduce losses and increase efficiency [1]. A correct user-transformer relationship in low-voltage stations is conducive to the calculation and analysis of line losses. In practice, however, the transformer archives often contain deviations caused by untimely updates or loss of recorded information. As a result, the exact ownership of an electric meter cannot always be identified during transformer inspections, which easily causes customer disputes, affects the social image of the power company, and reduces the efficiency of power grid operation [2].
At present, there are two kinds of methods for user-transformer relation identification in low-voltage stations. The first is to install a transformer identification device. This device mainly uses power line carrier technology, but the carrier signal cannot be effectively isolated by the transformer, which often causes crossover communication between transformers and reduces the accuracy of user-transformer relation identification. The literature [3] improves the accuracy of user-transformer relation identification by applying deep learning to the detection of the transformer recognition instrument, but its use requires installing hardware such as current transformers at the outlet end of the transformer, which often raises safety problems. The other kind of method extracts electrical data of the transformer, such as voltage and current, from the power consumption information collection system and applies an algorithm model to identify the user-transformer relation. In the literature [4], the voltage zero-crossing drift data are clustered and compared with the voltage zero-crossing drift data of the transformer meter to determine the user-transformer relationship; however, the zero-crossing drift data must be measured in advance with professional equipment, and the proposed algorithm model is relatively complex and not practical. The literature [2] analyzes the voltage correlation between the user meter and the transformer meter in the two dimensions of space and time, and then determines the user-transformer relationship. This method requires manually obtaining the attribution relationship between the transformer and some subordinate meters as training samples, which brings additional labor costs.
Therefore, it is important to develop a user-transformer relation identification method that ensures recognition accuracy and computational efficiency while reducing labor and hardware costs.
This paper proposes a user-transformer relation identification method based on the power balance relationship. The user-transformer relationship is determined by the electricity additive relationship between the transformer meter and the user's electricity meter.

Power balance model
Without considering line loss, there is a summation relationship between the daily power consumption of the transformer meter and the daily power consumption of each user meter, as shown in the following formula:

$$y_t = \sum_{i=1}^{n} a_i x_{ti}, \quad t = 1, 2, 3, \ldots, D \qquad (1)$$

In formula (1), $y_t$ represents the power collected on day $t$ by the transformer meter to be analyzed ($t = 1, 2, 3, \ldots, D$), and $x_{ti}$ represents the power collected on day $t$ by user meter $i$ to be analyzed ($i = 1, 2, 3, \ldots, n$). The number of analysis days is $D$, and there are $n$ meters to be analyzed in the transformer. $a_i$ is the coefficient of user meter $i$ to be solved. Ideally, $a_i = 1$ indicates that user meter $i$ belongs to the transformer to be analyzed and the archived user-transformer relation is correct; $a_i = 0$ indicates that user meter $i$ does not belong to the transformer to be analyzed and the archived user-transformer relation is wrong.
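The summation relationship can be checked on a toy example. The following is a minimal sketch in which the readings, the function name `balance_residuals` and the two coefficient vectors are all made up for illustration: the residual of the balance equation vanishes for the true coefficients and equals minus the consumption of the misfiled meter when the archive is wrong.

```python
def balance_residuals(y, x, a):
    """Return y_t - sum_i a_i * x[t][i] for every day t (balance without line loss)."""
    return [y_t - sum(a_i * x_ti for a_i, x_ti in zip(a, row))
            for y_t, row in zip(y, x)]

# Hypothetical daily energy readings (kWh): rows are days, columns are user meters.
x = [
    [5.0, 3.0, 7.0],
    [6.0, 2.0, 1.0],
    [4.0, 4.0, 9.0],
    [7.0, 1.0, 2.0],
]
# In this toy setup the transformer meter only supplies meters 0 and 1.
y = [row[0] + row[1] for row in x]

true_a = [1, 1, 0]      # meter 2 actually belongs to another transformer
archive_a = [1, 1, 1]   # the archive wrongly lists meter 2 under this transformer

print(balance_residuals(y, x, true_a))     # zero on every day
print(balance_residuals(y, x, archive_a))  # minus the consumption of meter 2
```

With real data the coefficients are unknown and have to be fitted, which is why more analysis days than meters are needed.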
In normal power grid operation, line loss always exists. According to electrical engineering theory, within the same transformer and other factors aside, the normal line loss of each line is determined by the current of the line. The power consumption of user meters is closely related to the current, so there is a positive correlation between line loss and the power consumption of users' meters [5]. Because the electrical distances of the users' meters differ, the contribution ratio of the power consumption of each user's meter to the line loss also differs, so the additive relationship of the power consumption in the transformer is:

$$y_t = \sum_{i=1}^{n} a_i (1 + \lambda_i) x_{ti} = \sum_{i=1}^{n} w_i x_{ti} \qquad (2)$$

In formula (2), $\lambda_i$ represents the contribution ratio of user meter $i$ to the line loss of the transformer (the power consumption of users in the low-voltage station area fluctuates little, so $\lambda_i$ can be approximated as a fixed value). $w_i$ represents the actual coefficient of the power consumption of user meter $i$ considering the line loss. Ideally, $w_i > 1$ and close to 1 means that the user's meter belongs to the transformer to be analyzed and the archive is correct; $w_i = 0$ means that the user's meter does not belong to the transformer.

Normally, the users' meters are installed in meter boxes. Because an abnormal affiliation between a meter box and its users' meters is unlikely, this article aggregates the power consumption by meter box: the power consumption of the users' meters is summed into the power consumption of their meter box. By judging the relationship between the meter box and the transformer, the affiliation between each user's meter and the transformer is determined. This article therefore focuses on analyzing the affiliation between the transformer and its subordinate meter boxes.
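The aggregation of meter readings into meter-box totals can be sketched as follows. The record layout `(meter_box_id, meter_id, day, kwh)` and all identifiers are assumptions made for illustration, not the field names of the actual collection system.

```python
from collections import defaultdict

def aggregate_by_meter_box(records):
    """Sum daily meter readings into meter-box totals.

    records: iterable of (meter_box_id, meter_id, day, kwh) tuples (assumed layout).
    Returns {meter_box_id: {day: total_kwh}}.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for box_id, _meter_id, day, kwh in records:
        totals[box_id][day] += kwh
    return {box: dict(days) for box, days in totals.items()}

# Hypothetical readings for two meter boxes.
records = [
    ("BOX-A", "M1", "2020-06-01", 3.0),
    ("BOX-A", "M2", "2020-06-01", 2.5),
    ("BOX-B", "M3", "2020-06-01", 4.0),
    ("BOX-A", "M1", "2020-06-02", 1.0),
]
box_totals = aggregate_by_meter_box(records)
print(box_totals)
```

Working at meter-box level reduces the number of regression coefficients, which makes it easier to have more usable days than unknowns.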

Artificial fish swarm algorithm
It can be seen from the power balance model that fitting the coefficients of the power consumption data requires multiple linear regression, so as to obtain the regression coefficient of each user's meter and then determine the attribution relationship between the user's meter and the transformer. At present, the least squares method is the common method for fitting regression coefficients, but it has certain limitations for data sets with complex structures and large scales [6]. There are many electrical data fields in the transformer area, and the amount of data is huge. Therefore, the traditional least squares method cannot achieve ideal results in user-transformer relation identification. In recent years, intelligent algorithms such as the bee colony algorithm [7] and the particle swarm algorithm [8] have been widely used to solve the parameter estimation problem of multiple linear regression, with the advantages of higher accuracy and faster convergence. Among these intelligent optimization algorithms, the artificial fish swarm algorithm is more robust and has better optimization capabilities [9]. Therefore, this article chooses the artificial fish swarm algorithm (AFSA) to solve the parameter estimation problem of the multiple regression equation.
The AFSA is an intelligent algorithm derived from the foraging behavior of fish schools. Its principle is to continuously update the state of the individuals through behaviors such as foraging, clustering and rear-end (following) behavior, to evaluate them with a fitness function, and to search iteratively for the optimal value [10].
The basic steps of the AFSA are as follows:
(1) Initialize settings. Set the algorithm parameters, including the number of fish $N$, the initial positions of the individuals $X$, the field of view $visual$, the step length $step$, the crowding factor $\delta$, and the number of repetitions $Trynumber$.
(2) Calculate fitness value. The bulletin board records the optimal individual found by the algorithm. After initialization, the evaluation function is applied to the position of each individual in the initial fish swarm, and the information of the individual with the highest food concentration is written to the bulletin board.
(3) Execute behavior functions. The behavior functions, including foraging, clustering, rear-end and random behavior, are executed for all individuals. The execution rules and the operation process of each behavior function are as follows [11]:
① Foraging behavior: this behavior simulates individuals swimming in the direction of more food. The individual $W_i$ randomly selects a position $W_j$ in its field of view. If the food concentration at $W_j$ is higher than at $W_i$, $W_i$ moves a step toward $W_j$. If the forward condition is not met, $W_i$ continues to select a state $W_j$ in its field of view, and executes the random behavior once the number of attempts is reached.
② Clustering behavior: this behavior reflects the fact that fish often gather in swarms to find food or avoid harm.
③ Rear-end behavior: this behavior simulates that when an individual finds food, other individuals also swim toward it. The individual $W_i$ looks for the position $W_j$ with the highest food concentration in its field of view. If the surroundings of this optimal individual are not crowded, $W_i$ moves a certain distance toward the position of the optimal individual; otherwise it performs the foraging behavior.
④ Random behavior: this behavior indicates that the individual moves randomly, with the range of swimming limited to the size of the field of view. The individual $W_i$ moves a random distance and arrives at a new location.
(4) Update fish swarm. The fish swarm of the previous round executes the behavior functions of step (3) to generate a new fish swarm. The evaluation function is used to calculate the fitness values of all individuals in the new fish swarm. If an individual has a better fitness value than the individual on the bulletin board, the bulletin board is updated.
(5) Algorithm termination conditions. When the optimal solution reaches the set threshold or the maximum number of iterations is reached, the algorithm stops; otherwise, it returns to step (3).
Summarizing the above process, the overall flow of the AFSA is shown in Figure 1.
Fig. 1 The flow chart of the AFSA.
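The steps above can be sketched in a few dozen lines of Python. This is a minimal illustration that follows the common textbook form of the AFSA rather than the exact update formulas of this paper: the crowding test `fitness(target) / n_neighbours > delta * fitness(current)`, the default parameter values and the toy objective `food` are all assumptions, and the fitness is treated as a positive food concentration to be maximised.

```python
import math
import random

def afsa(fitness, dim, bounds, n_fish=20, visual=1.0, step=0.3,
         delta=0.6, try_number=10, max_iter=100, seed=0):
    """Maximise a positive `fitness` with a basic artificial fish swarm."""
    rng = random.Random(seed)
    lo, hi = bounds

    def clip(p):
        return [min(hi, max(lo, v)) for v in p]

    def move_toward(p, target):
        # Move a random fraction of `step` along the direction p -> target.
        d = math.dist(p, target)
        if d == 0.0:
            return list(p)
        r = rng.random()
        return clip([pi + (ti - pi) / d * step * r for pi, ti in zip(p, target)])

    def random_move(p):
        return clip([pi + visual * rng.uniform(-1, 1) for pi in p])

    def forage(p):
        for _ in range(try_number):
            q = random_move(p)               # candidate inside the field of view
            if fitness(q) > fitness(p):
                return move_toward(p, q)
        return random_move(p)                # attempts exhausted: random behavior

    def neighbours(p, swarm):
        return [q for q in swarm if q is not p and math.dist(p, q) < visual]

    def cluster(p, swarm):
        nb = neighbours(p, swarm)
        if nb:
            centre = [sum(c) / len(nb) for c in zip(*nb)]
            if fitness(centre) / len(nb) > delta * fitness(p):   # richer, not crowded
                return move_toward(p, centre)
        return forage(p)

    def rear_end(p, swarm):
        nb = neighbours(p, swarm)
        if nb:
            best = max(nb, key=fitness)
            if fitness(best) / len(nb) > delta * fitness(p):
                return move_toward(p, best)
        return forage(p)

    swarm = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fish)]
    board = max(swarm, key=fitness)          # bulletin board: best individual so far
    for _ in range(max_iter):
        # Each fish tries clustering and rear-end behavior and keeps the better result.
        swarm = [max(cluster(p, swarm), rear_end(p, swarm), key=fitness) for p in swarm]
        best = max(swarm, key=fitness)
        if fitness(best) > fitness(board):
            board = list(best)
    return board

def food(p):
    # Hypothetical positive food concentration with its peak at the origin.
    return 1.0 / (1.0 + sum(v * v for v in p))

best = afsa(food, dim=2, bounds=(-5.0, 5.0))
print(best, food(best))
```

The bulletin board only changes when a strictly better individual appears, so the reported solution never gets worse from one iteration to the next, even though individual fish may move to poorer positions.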

Adaptive artificial fish swarm algorithm
In the traditional AFSA, the search range of an individual is determined by the field of view, and the convergence accuracy and speed are determined by the step size. Both parameters are fixed in the original algorithm, so individuals cannot converge quickly to the optimal value in the later stage of the algorithm [12]. To reach the optimal solution in fewer iterations, this article improves the traditional AFSA with a variable step size. The basic idea is:
(1) After each round of iteration is completed, all individuals transmit their positions and fitness values to the bulletin board.
(2) The field of view ($visual$) is determined by each individual according to the current state of the fish swarm and is therefore not a fixed value.
(3) A new parameter, the visual step coefficient $a$ ($0 < a < 1$), is introduced, and the step length of an individual is set to $step = a \cdot visual$.
(4) When an individual executes the rear-end and clustering behaviors, if the fitness value at the optimal position or at the center position in the field of view is greater than that at the current position, the individual moves a certain distance toward the position with the largest fitness value. The crowded state is not considered in the adaptive AFSA.
The step size setting rules of the adaptive AFSA are as follows. Suppose $Z_0$ is the current artificial fish, located at position $C$; $Z_1$ is the current optimal artificial fish; and $Z_2$ is the artificial fish nearest to $Z_0$. $Z_0$ uses two fields of view, $visual_1$ (the distance between $Z_1$ and $Z_0$) and $visual_2$ (the distance between $Z_2$ and $Z_0$), to search for a better position in its surroundings. It randomly selects two positions $A$ and $B$, then calculates and compares the fitness values of $A$, $B$ and $C$. If the fitness values of $A$ and $B$ are both smaller than that of $C$ and the maximum number of attempts has not been reached, it continues to search. If the fitness value of $A$ or $B$, or of both, is better than that of $C$, the individual swims toward the point with the highest fitness value ($A$ or $B$). The step length is set to $visual_1 \cdot a \cdot rand()$ when moving toward $A$ and to $visual_2 \cdot rand()$ when moving toward $B$, as shown in Figure 2. The improved algorithm can adaptively forage selectively in different fields of view, which both speeds up convergence and improves the optimization result.
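The adaptive step rule can be sketched as a single move function. This is an illustrative reading of the description, not the authors' implementation: the function name `adaptive_move`, the way candidate positions are sampled, the choice to stay in place when all attempts fail, and the clamping of the step so that the fish never overshoots its target are assumptions. A larger fitness value is treated as better, and the swarm must contain at least two fish.

```python
import math
import random

def adaptive_move(z0, swarm, fitness, a=0.5, try_number=10, rng=random):
    """One adaptive-step move of fish z0 (larger fitness = better)."""
    others = [z for z in swarm if z is not z0]
    z1 = max(swarm, key=fitness)                       # current optimal fish
    z2 = min(others, key=lambda z: math.dist(z0, z))   # fish nearest to z0
    visual1, visual2 = math.dist(z0, z1), math.dist(z0, z2)

    def sample(radius):
        # Random position within `radius` of z0 along each coordinate.
        return [c + radius * rng.uniform(-1, 1) for c in z0]

    def step_toward(target, length):
        d = math.dist(z0, target)
        if d == 0.0:
            return list(z0)
        length = min(length, d)                        # assumption: never overshoot
        return [c + (t - c) / d * length for c, t in zip(z0, target)]

    f_c = fitness(z0)
    for _ in range(try_number):
        pos_a, pos_b = sample(visual1), sample(visual2)
        f_a, f_b = fitness(pos_a), fitness(pos_b)
        if max(f_a, f_b) <= f_c:
            continue                                   # neither is better: search again
        if f_a >= f_b:
            return step_toward(pos_a, visual1 * a * rng.random())
        return step_toward(pos_b, visual2 * rng.random())
    return list(z0)                                    # assumption: stay if all attempts fail

def fitness(p):
    return -sum(v * v for v in p)     # illustrative objective, best at the origin

swarm = [[3.0, 4.0], [0.5, 0.5], [2.5, 4.5]]
new_pos = adaptive_move(swarm[0], swarm, fitness, rng=random.Random(1))
print(new_pos)
```

Because both fields of view are distances to other fish, the step shrinks automatically as the swarm contracts around the optimum, which is the intended effect of the adaptive strategy.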

User-transformer relation identification based on power balance model and adaptive AFSA
The main steps of the user-transformer relation identification method based on the power balance model and the adaptive AFSA are as follows:
(1) Data collection. The power data used in this article are all obtained from the power consumption information collection system. The main data information required is shown in Table 1 and includes the meter box number, the meter number and the daily power consumption.
(2) Adjacent transformers division. Considering that user-transformer relationship errors often occur between geographically adjacent transformers, a scientific and efficient division of adjacent transformers can narrow the scope of user-transformer relation identification and improve the accuracy of recognition. In view of the standardization of transformer naming, this paper uses the similarity of transformer names as the basis for division: transformers with highly similar names are grouped as adjacent transformers. For example, "PMS_10KV XX Bay 1# Transformer T1" and "PMS_10KV XX Bay 1# Transformer T2" are a group of adjacent transformers. Each group of adjacent transformers usually contains 2-4 transformers.
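Grouping by name similarity can be sketched with the standard library. The text does not say which similarity measure is used, so the `difflib.SequenceMatcher` ratio, the greedy grouping strategy and the threshold of 0.9 below are assumptions chosen for illustration; the third transformer name is hypothetical.

```python
from difflib import SequenceMatcher

def group_adjacent(names, threshold=0.9):
    """Greedily group transformer names whose similarity ratio reaches the threshold."""
    groups = []
    for name in names:
        for group in groups:
            # Compare with the first member of each existing group.
            if SequenceMatcher(None, name, group[0]).ratio() >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])   # no similar group found: start a new one
    return groups

names = [
    "PMS_10KV XX Bay 1# Transformer T1",
    "PMS_10KV XX Bay 1# Transformer T2",
    "PMS_10KV YY Road 3# Transformer T1",   # hypothetical unrelated transformer
]
for group in group_adjacent(names):
    print(group)
```

In practice the threshold would need tuning against the naming convention of the archive, since names that differ only in the feeder or bay number can also score highly.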
(3) Data preprocessing. Due to delayed or missed collection by the electricity collection equipment, the data collected from the system may have missing power consumption values and duplicated data fields. Data preprocessing is therefore required before user-transformer relation identification is carried out: duplicate records and meters with missing power consumption data are deleted.
(4) Multiple linear regression equation construction.
The independent variable $X$ of the regression equation is a $D \times n$ matrix, where $n$ is the number of meter boxes under the adjacent transformers and $D$ is the number of days that can actually be used after data preprocessing. As shown in (8), each column of $X$ represents the power consumption of one meter box on each of the $D$ days.

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{D1} & x_{D2} & \cdots & x_{Dn} \end{bmatrix} \qquad (8)$$

The dependent variable $Y$ is a $D \times s$ matrix, where $s$ is the number of transformers in the group of adjacent transformers and $D$ has the same meaning as above. Each column of $Y$ represents the power consumption of one transformer meter on each of the $D$ days.
According to the constraints of multiple linear regression analysis, the equations can be solved only if the number of independent power balance equations exceeds the number of coefficients to be solved. Because user load is uncertain, there are basically no two equations whose power consumption values are fully proportional (that is, the equations are linearly independent), so it is only necessary to ensure that the number of days $D$ available for electricity collection is larger than the number of meter boxes $n$ under the transformers to be analyzed. In addition, the regression is executed $s$ times for each group of adjacent transformers, each time using one column of $Y$ together with $X$.
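The construction of the two matrices can be sketched as follows. The input layout (dictionaries of daily totals keyed by meter box or transformer), the function name `build_matrices` and the sample values are assumptions for illustration. Meter boxes with a missing day are dropped, in line with the preprocessing step, and the condition that $D$ exceeds the number of meter boxes is checked explicitly.

```python
def build_matrices(box_daily, transformer_daily):
    """Build X (D x n) and Y (D x s) as lists of rows.

    box_daily: {box_id: {day: kwh}}; transformer_daily: {transformer_id: {day: kwh}}.
    """
    transformers = sorted(transformer_daily)
    # Days on which every transformer meter in the group has a reading.
    days = sorted(set.intersection(*(set(transformer_daily[t]) for t in transformers)))
    # Drop meter boxes that lack a reading on any of those days.
    boxes = sorted(b for b, s in box_daily.items() if all(d in s for d in days))
    if len(days) <= len(boxes):
        raise ValueError("need more usable days D than meter boxes n")
    X = [[box_daily[b][d] for b in boxes] for d in days]
    Y = [[transformer_daily[t][d] for t in transformers] for d in days]
    return days, boxes, transformers, X, Y

# Hypothetical daily totals; box B3 misses day 2 and is therefore excluded.
box_daily = {
    "B1": {1: 2.0, 2: 3.0, 3: 1.0},
    "B2": {1: 4.0, 2: 1.0, 3: 2.0},
    "B3": {1: 1.0, 3: 5.0},
}
transformer_daily = {
    "T1": {1: 6.1, 2: 4.1, 3: 3.1},
    "T2": {1: 1.0, 2: 0.5, 3: 5.2},
}
days, boxes, transformers, X, Y = build_matrices(box_daily, transformer_daily)
y_first = [row[0] for row in Y]   # the column of Y used in the first of the s regressions
print(boxes, X, y_first)
```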
(5) Use the adaptive AFSA to fit the multiple linear regression parameters. The goal is to find the optimal regression coefficients such that the fitted result of the power balance equation is close to the actual value. Therefore, the fitness function of the adaptive AFSA is set to the deviation between the actual power consumption of the transformer meter and the power consumption calculated from the equation.
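One possible form of this fitness function is sketched below. The exact deviation measure is not reproduced in the text, so the sum of squared deviations used here is an assumption, as are the function name `make_fitness` and the sample data. A smaller value means a better fit, so a swarm that maximises food concentration would use its negative or reciprocal.

```python
def make_fitness(X, y):
    """Return a function mapping coefficients w to the sum of squared deviations."""
    def deviation(w):
        return sum((y_t - sum(w_i * x_ti for w_i, x_ti in zip(w, row))) ** 2
                   for y_t, row in zip(y, X))
    return deviation

# Hypothetical data: two meter boxes, three days.
X = [[2.0, 4.0], [3.0, 1.0], [1.0, 2.0]]
y = [2.0, 3.0, 1.0]          # only the first meter box feeds this transformer
deviation = make_fitness(X, y)

print(deviation([1.0, 0.0]))   # coefficients matching the true affiliation
print(deviation([1.0, 1.0]))   # coefficients implied by a wrong archive
```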
(6) Calculate the t-test statistic of the multiple linear regression coefficients. The t statistic characterizes how well each term of the regression equation fits. The larger the statistic, the better the fit and the more reliable the regression coefficient. The t statistic is calculated as follows:

$$t_j = \frac{\hat{w}_j}{\hat{\sigma} \sqrt{c_{jj}}}$$

where $\hat{w}_j$ is the regression coefficient, $j$ is a natural number from 1 to $p$, $\hat{\sigma}$ is the estimated standard deviation, so that $\hat{\sigma}\sqrt{c_{jj}}$ is the standard deviation of $\hat{w}_j$, $c_{jj}$ is the element in row $j$ and column $j$ of $(X'X)^{-1}$, and $X$ is the independent variable matrix of the regression equation.
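The t statistic can be computed with NumPy as sketched below. Two points are assumptions rather than statements from the text: the residual standard deviation uses $D - p$ degrees of freedom (a model without intercept), and the coefficients in the usage example come from `numpy.linalg.lstsq` as a stand-in for coefficients fitted by the adaptive AFSA. The data are made up.

```python
import numpy as np

def t_statistics(X, y, w):
    """t statistic of each regression coefficient in w for the model y ~ X w."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    D, p = X.shape
    residuals = y - X @ w
    # Residual standard deviation with D - p degrees of freedom (assumed, no intercept).
    sigma = np.sqrt(residuals @ residuals / (D - p))
    c_diag = np.diag(np.linalg.inv(X.T @ X))   # c_jj: diagonal of (X'X)^{-1}
    return w / (sigma * np.sqrt(c_diag))

# Hypothetical data: the transformer is fed by the first meter box only, plus small noise.
X = [[2, 4], [3, 1], [1, 2], [5, 3], [4, 6]]
y = [2.1, 2.9, 1.05, 4.95, 4.0]
w, *_ = np.linalg.lstsq(np.asarray(X, float), np.asarray(y, float), rcond=None)
t = t_statistics(X, y, w)
print(w, t)
```

In this toy case the first coefficient is close to 1 with a large t statistic, while the second is close to 0 with a small one, which is the pattern the judgment step relies on.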
(7) Judgment of abnormal meter boxes. According to the power balance model, when the regression coefficient of a meter box with respect to its archived transformer approaches 0 while its regression coefficient with respect to another transformer in the group of adjacent transformers approaches 1, the meter box is considered possibly abnormal. In actual calculation, meter boxes with low or even zero power consumption and a small number of usable samples cause deviations when the coefficient criterion is used alone.
To address this problem, this article introduces the t statistic as an additional basis for judgment, following the theory of significance testing of multiple linear regression coefficients. Consider a group of adjacent transformers that contains two transformers. Suppose the regression coefficient between the meter box and the original transformer is $w_1$, with t-test statistic $t_1$, and the regression coefficient with the other transformer is $w_2$, with t-test statistic $t_2$. The abnormal judgment rule is then expressed mathematically by formula (12). In formula (12), how closely $w_1$ must approach 0 needs to be determined according to the data quality after preprocessing (including the actual line loss rate, the proportion of low power consumption meter boxes, and the ratio of the number of usable samples to the number of meter boxes). The same is true for how closely $w_2$ must approach 1; the value of $n$ is an integer greater than or equal to 2, and its specific value also needs to be determined according to the data quality. When there are more than two transformers in a group of adjacent transformers, the coefficient and the statistic are calculated for each transformer in the same way.
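A judgment rule of this kind can be sketched as a small predicate. Formula (12) is not reproduced in the text, so the form of the t-statistic condition used here, $|t_2| \ge n |t_1|$, is an assumption, as are the function name `is_abnormal` and the tolerance of 0.1 around 0 and 1.

```python
def is_abnormal(w1, t1, w2, t2, tol=0.1, n=2):
    """Flag a meter box that appears to belong to the other adjacent transformer.

    w1, t1: coefficient and t statistic with respect to the archived transformer.
    w2, t2: coefficient and t statistic with respect to the other transformer.
    """
    near_zero = abs(w1) <= tol            # coefficient on the archived transformer vanishes
    near_one = abs(w2 - 1.0) <= tol       # coefficient on the other transformer is about 1
    dominant = abs(t2) >= n * abs(t1)     # assumed form of the t-statistic condition
    return near_zero and near_one and dominant

print(is_abnormal(0.02, 0.4, 0.98, 12.0))   # misfiled meter box
print(is_abnormal(0.97, 15.0, 0.03, 0.5))   # correctly archived meter box
print(is_abnormal(0.02, 8.0, 0.98, 12.0))   # coefficients fit, t statistics do not
```

Keeping `tol` and `n` as parameters reflects the remark that both thresholds have to be chosen according to the data quality.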

Experimental results
All simulation experiments are run on Windows 10 Professional x64 with the Anaconda3 development tools. First, three typical benchmark functions [13] are selected to verify the effectiveness of the adaptive AFSA used in this paper. The related information of the test functions is shown in Table 2. The test results of the AFSA and the adaptive AFSA (A-AFSA) are shown in Table 3. The iteration stop condition is reaching the target accuracy. Adaptive AFSA parameter settings: population size
[Table 3. Test results of the AFSA and A-AFSA]
It can be seen from Table 3 that the adaptive AFSA converges faster and reaches higher optimization accuracy, which proves the effectiveness of the improvement measures proposed in this paper.
We select some test sets from Kaggle to verify the effect of the adaptive AFSA in multiple linear regression analysis; the relevant information of the test sets is shown in Table 4. The results are shown in Table 5 (the names of the test sets use the abbreviated forms of Table 4). It can be seen that the adaptive AFSA has advantages in calculation accuracy and efficiency over the least squares method (LSM) in fitting multiple linear regression coefficients. We then use the adaptive AFSA to identify the user-transformer relationship.
The power consumption data in this section come from the electricity consumption information collection system of a prefecture-level city in Jiangsu Province, covering June 2020 to November 2020. They include 20 transformers in two power supply stations, as shown in Table 6. Because the proportion of missing and repeated values in Jiangsu Province is small, the ratio of the number of usable samples to the number of meter boxes in each group of adjacent transformers is generally about 2-3, and the number of usable samples is large. Moreover, the quality of archives management in Jiangsu Province is relatively good. Therefore, the meter box with $w_1$ in [-0.1, 0.1], $w_2$ in [0.9, 1.1], and 2

Conclusions
To solve the problem of user-transformer relation identification in low-voltage areas, this paper proposes a user-transformer relation identification method based on a power balance model and an adaptive AFSA. It can quickly determine the meter boxes with an abnormal user-transformer relationship without additional measurement devices, thereby correcting the archives of the original low-voltage area and effectively improving the management level of the transformer areas. The disadvantage is that the proposed method depends heavily on the quality of the electricity data. When the electricity data are severely missing or the overall power consumption of a meter box is small, the recognition performance of the method declines. Future work will focus on improving the robustness of the model.