Demand-side Response User Selection Based on Improved BPSO

To use demand-side response to smooth peak load, balance power supply and demand, and improve the quality and efficiency of the power grid, this paper constructs demand-side response user competition models for different stages. An improved BPSO algorithm is used to optimize the subsidy cost weighted model within the saturated user competition model, and the algorithm is executed automatically through a blockchain smart contract. Simulation and experimental results show that, compared with other user selection schemes, these models can improve the actual user response and reduce the subsidy cost of the demand response centre. The results indicate that, under saturated user competition, the improved BPSO algorithm is feasible for demand-side response user selection and for optimizing demand-side management.


Introduction
With the advancement of air pollution prevention and control [1], changes in the power consumption structure [2], and the explosive growth of new energy power generation [3], grid operation faces new challenges, such as gaps between power supply and demand, prominent seasonal peak loads, and increasing peak-regulation pressure. It is therefore necessary to fully tap the potential of demand-side response to narrow the gap between power supply and demand and to ensure the safety and economy of power grid operation. Power market dispatch agencies, power companies, and other power grid operators draw up demand response plans to smooth peak loads, balance power supply and demand, and improve the quality and efficiency of power grids. At present, Tianjin, Shandong, Shanghai, Jiangsu, Zhejiang, Henan and other provinces have issued power demand response work plans to cope with the power load gap that may occur during the summer and winter peak periods of 2021.
In the existing demand-side response mode, the power company typically issues bidding information, including demand response capacity, maximum price, and response time period. The demand response centre then sends an invitation to users, and users declare their response capacity and compensation price according to their own situation. The demand response centre selects users according to the principle of "price first, time first" until the response capacity demand is reached. However, this process considers only the price factor, and there is no guarantee that the selected users can complete the declared response volume in full. According to the demand-side response results publicized 3 relevant regulations on demand-side response, and then introduce the demand-side response unsaturated user competition model and the saturated user competition model respectively.

Relevant Regulations on Demand-side Response
Normally, demand-side response can be divided into four types: real-time peak-shaving response, real-time valley-filling response, contracted peak-shaving response, and contracted valley-filling response. To keep the research focused, this article considers only peak-shaving response for the time being. Contracted response means that the response invitation and confirmation process is completed on the day before the response day, and the response is executed at the contracted time on the response day. Real-time response means that the responding user responds immediately after receiving the response instruction when the grid has insufficient power supply capacity due to emergencies. A user implements contracted response no more than 2 times a day, with a cumulative duration of no more than 2 hours, and implements real-time response no more than 2 times a day, with each response lasting no more than 30 minutes.
In the demand-side response process, when the actual response volume is less than 80% of the bid response volume, the response is invalid and the user cannot receive response subsidies. When the actual response volume is within the range of 80% to 120% of the bid response volume, the corresponding subsidy is calculated based on the actual quantity. The part that exceeds 120% of the bid response volume earns no further subsidy. After the demand-side response, the relevant data are reported to the provincial power company for approval, and the demand response centre publicizes the approved effect evaluation data and notifies the users.
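The 80% and 120% thresholds above can be written as a small helper. This is a minimal sketch, not the settlement rule of any specific province: the function name `subsidized_volume` is hypothetical, and it assumes that exactly 80% counts as a valid response and that the subsidy is proportional to the returned volume.

```python
def subsidized_volume(bid_kw: float, actual_kw: float) -> float:
    """Return the response volume (kW) that is eligible for subsidy.

    Assumption: the boundary value of exactly 80% is treated as valid.
    """
    if bid_kw <= 0:
        raise ValueError("bid_kw must be positive")
    if actual_kw < 0.8 * bid_kw:
        return 0.0  # invalid response: no subsidy at all
    # Volume above 120% of the bid earns no additional subsidy.
    return min(actual_kw, 1.2 * bid_kw)


if __name__ == "__main__":
    for actual in (700, 800, 1000, 1500):
        print(actual, subsidized_volume(1000, actual))
```

Multiplying the returned volume by the applicable unit price would give the subsidy; the unit price itself is not specified in this section.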
At present, when the declared response volume has not reached the demand response volume within the specified time, the principle of "demand response priority, orderly power consumption guarantee" is often adopted. The demand response centre reports to the provincial power company, which determines whether to start an orderly power consumption management process. However, orderly power consumption is difficult to implement in practice: supporting incentive policies and measures are lacking, plans are hard to carry out, and there is no continuous financial support for the construction and operation of the load management system. This paper therefore proposes the principle of "contracted response first, real-time response as the backstop". When the contracted response leaves a capacity gap, it is filled with real-time response. In this way, the enthusiasm of participating users can be mobilized and the feasibility of implementing the plan improved.
When a user declares a real-time demand response and the declaration passes review, a resource reserve is formed. The user resource reserve can be divided into two types according to response speed and control method, as shown in table 1. Self-control users reply whether they will participate within one hour after receiving the invitation, while direct-control users are directly controlled by the response centre. The subsidy fee for direct-control users is twice that of self-control users.

Demand-side Response to Unsaturated User Competition Model
In the unsaturated stage of demand-side response, few users take part in the competition. To increase users' enthusiasm for participation, penalties for non-performing users are not considered for the time being. Therefore, the estimated subsidy F can be expressed as equation (1). In equation (1), c is the total demand capacity of the current demand-side response, and the other terms include the response capacity of the k-th direct-control user.

Demand-side Response Saturated User Competition Model
When demand-side response has developed to a certain stage and the participating users have reached a certain scale, the demand-side response market mechanism should be brought into full play, and a working mechanism for power demand-side response that adapts to the medium- and long-term and spot market models should be explored. To this end, it is necessary to establish a power market demand response model that covers both demand-side adjustable capacity bidding and electric energy bidding. Such a model gives play to the decisive role of the market in resource allocation, further enhances user load management capabilities, increases user-side enthusiasm for peak shaving, and helps build a clean, low-carbon, safe and efficient new power system. From the perspective of the demand response centre, this section constructs a weighted model of user performance priority. Users with higher priority are more likely to be selected to participate in the response.

Demand-side Response Saturation User Competition Model.
To standardize user behaviour and encourage participating users to implement contracted responses for the full duration and in the full quantity, this section introduces a priority mechanism. Users with higher priority have an advantage when competing to participate in demand response. The priority is mainly determined by the user's performance of the contract. The user performance probability quantifies how well a user has executed past responses, and it can be expressed as equation (2). In equation (2), $T_i^{\mathrm{total}}$ is the number of times user i has historically participated in demand response, and $T_i^{\mathrm{valid}}$ is the number of times user i's participation was confirmed as valid, that is, the number of times the actual response volume was greater than or equal to 80% of the declared response volume. To keep users who participate in demand response for the first time competitive, their initial performance probability is set to 0.3. Users who have participated before but have zero effective responses are assigned a performance probability of 0.1.
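The rule above can be sketched in a few lines. Equation (2) itself is not reproduced in this text, so the ratio of valid to total responses used below is an assumption inferred from the definitions of the two counts; the function name `performance_probability` is hypothetical.

```python
def performance_probability(total: int, valid: int) -> float:
    """Estimate a user's contract performance probability.

    total: historical number of participations in demand response
    valid: number of those with actual volume >= 80% of declared volume
    Assumption: equation (2) is the plain ratio valid / total.
    """
    if total < 0 or valid < 0 or valid > total:
        raise ValueError("require 0 <= valid <= total")
    if total == 0:
        return 0.3  # first-time participant: initial probability
    if valid == 0:
        return 0.1  # participated before, but never responded validly
    return valid / total


if __name__ == "__main__":
    print(performance_probability(0, 0))   # first-time user
    print(performance_probability(5, 0))   # no valid responses so far
    print(performance_probability(10, 8))  # ratio of valid to total
```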

Subsidy Cost Weighted Model.
Generally, the existing demand-side response user selection scheme sorts the declared prices of users from low to high and preferentially selects users with low prices. However, a selection scheme determined in this way cannot guarantee sufficient response volume. In this section, priority is introduced into the subsidy fee calculation model, so that the declared price and the priority jointly affect the probability of a user being selected. For contracted response under user saturation, when the i-th user participates in the demand-side response, the weighted subsidy $r_i$ can be expressed as equation (3). The weighted subsidy cost $R_i$ can then be expressed as equation (4). In equation (4), $x_i$ is the selection factor, which indicates whether user i is selected to perform demand-side response. When $x_i = 0$, the demand response centre does not select user i to perform demand-side response, and when $x_i = 1$, the demand response centre selects user i. $c_i$ is the response capacity declared by user i, $c$ is the total demand response volume, $Num_i$ is the number of contracted responses per day by the i-th user, and $L_{Num_{ij}}$ is the total single-day duration of the i-th user.
Therefore, the demand response capacity C that can be completed by the demand-side response is given by equation (9). In equation (9), $b_i$ is the actual response capacity of responding user i.

Optimization of Subsidy Cost Weighting Model Based on Improved BPSO Algorithm
The BPSO algorithm can be used to optimize the objective function, but when the demand response capacity is close to the declared response capacity, it takes a very long time to run. To shorten the convergence time, this paper proposes an improved BPSO algorithm. In the power market demand-side response environment, this paper takes the response capacity, the number of responses, and the response time as constraints, and considers the impact of users' declared prices and performance probabilities on user selection. Under the premise of meeting the demand response volume, the improved BPSO algorithm is used to obtain the user selection strategy that minimizes the weighted subsidy cost, that is, to find the user response scheme that minimizes equation (4) under the constraints of equations (5) to (8).

Improved BPSO Algorithm
To mitigate the premature convergence of the BPSO algorithm, this paper introduces a roulette selection operator. Because the purpose is to find the minimum weighted subsidy cost, the roulette selection operator is adapted so that the smaller the fitness value, the greater the probability of being selected. The algorithm flow chart is shown in figure 1. In summary, it can be divided into 5 parts, namely:
1) Initialization: Initialize the particle swarm (M particles in total, each with N dimensions) by giving each particle a random initial position and velocity. Here the position of the particle corresponds to the decision factor $x_i$ in the weighted subsidy cost model, that is, when $x_i = 0$, the demand response centre does not select user i to perform demand-side response, and when $x_i = 1$, it selects user i. To ensure the universality of the algorithm, an M×N matrix of 0 and 1 elements is randomly generated as the initial position of the particles, where M is the population size, that is, the number of particles, each of which corresponds to one candidate solution, and N is the dimension, that is, the number of decision variables for each particle.
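The initialization step can be illustrated as follows. This is a sketch under stated assumptions rather than the paper's implementation: the function name `init_swarm`, the `seed` argument, and the uniform velocity range `[-v_max, v_max]` are all assumptions, since the initial-velocity formula (equation (10)) is not reproduced in this text.

```python
import random


def init_swarm(m, n, v_max=4.0, seed=None):
    """Create M random binary positions and M random velocities of length N.

    Each position is one candidate selection vector (x_1, ..., x_N).
    Assumption: initial velocities are uniform in [-v_max, v_max].
    """
    rng = random.Random(seed)
    positions = [[rng.randint(0, 1) for _ in range(n)] for _ in range(m)]
    velocities = [[rng.uniform(-v_max, v_max) for _ in range(n)]
                  for _ in range(m)]
    return positions, velocities


if __name__ == "__main__":
    pos, vel = init_swarm(m=3, n=5, seed=42)
    for row in pos:
        print(row)  # each row is one particle's 0/1 selection vector
```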

[Figure 1. Flow chart of the improved BPSO algorithm. Steps: begin; set the population size, iteration times, acceleration factor, inertia weight and other related parameters; initialize the velocity and position of each particle; calculate the fitness value of each particle according to equation (4), calculate the probability of each particle being selected, and determine pBest and gBest; update the speed and position of each particle according to equation (11) and equation (14), construct a roulette, and proceed to the next round of particle selection by simulating roulette; check whether the particle speed and position are within the feasible region; calculate the fitness value of the updated particle according to equation (4); check whether the end condition is met; output the best fitness individual and its fitness value.]

The position of the particle is changed by the particle speed to complete the optimization operation. The initial velocity of each particle can be randomly generated using equation (10).
The BPSO algorithm updates the particle velocity through equation (11).
In equation (11), $v_i$ is the velocity of the particle in the i-th iteration, $c_1$ and $c_2$ are constants, $r_1$ and $r_2$ are random numbers in the interval [0,1], $pBest_i$ is the individual optimal particle position, $gBest_i$ is the global optimal particle position, and $x_i$ represents the position of the particle in the i-th iteration.
$\omega_i$ is the dynamic inertia weight, which adjusts the balance between local and global search. When the inertia weight is large, the global search ability is strong and the local search ability is weak; when it is small, the global search ability is weak and the local search ability is strong. According to the linearly decreasing weight strategy, the inertia weight $\omega_i$ of the i-th iteration can be expressed as equation (12). In equation (12), $\omega_{\max}$ is the maximum inertia weight, that is, the initial inertia weight value, $\omega_{\min}$ is the minimum inertia weight, that is, the final inertia weight value, G is the maximum number of iterations, and $g_i$ is the current iteration number. In this paper, the Sigmoid function is used to update the position of the particle according to its velocity.
2) Calculate the fitness value: In this stage we calculate the weighted subsidy cost of each particle according to equation (4), and calculate the total fitness value.
3) Find the best fitness value of the individual and the group: For each particle, we compare the weighted subsidy cost of its current position with the subsidy cost of its best position in history (pbest) and of the global best position (gbest). If the subsidy cost of the current position is smaller than that of pbest, pbest is updated to the current position. If the subsidy cost of the current position is smaller than that of gbest, gbest is updated to the current position.
4) Update particle position and speed: The speed and position of each particle are updated according to equation (11) and equation (14). Here the roulette selection operator is introduced into the BPSO algorithm, and a roulette is constructed according to step 2).
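The update steps described above can be sketched as follows. Equations (11) to (14) are not reproduced in this text, so the code uses the standard textbook forms as assumptions: a linearly decreasing inertia weight, the usual PSO velocity update, and a Sigmoid-based binary position update. The roulette weighting `worst - f` is one simple way to favour small fitness values and is also an assumption, as are all function names and default parameter values.

```python
import math
import random


def inertia_weight(g, g_max, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight for iteration g out of g_max."""
    return w_max - (w_max - w_min) * g / g_max


def update_particle(x, v, p_best, g_best, w,
                    c1=2.0, c2=2.0, v_max=4.0, rng=random):
    """One velocity and position update for a single binary particle."""
    new_x, new_v = [], []
    for xd, vd, pd, gd in zip(x, v, p_best, g_best):
        r1, r2 = rng.random(), rng.random()
        vd = w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd)
        vd = max(-v_max, min(v_max, vd))         # clamp to feasible range
        prob_one = 1.0 / (1.0 + math.exp(-vd))   # Sigmoid gives P(x = 1)
        new_v.append(vd)
        new_x.append(1 if rng.random() < prob_one else 0)
    return new_x, new_v


def roulette_select(fitness, k, rng=random):
    """Draw k particle indices; a smaller fitness gets a larger slice."""
    worst = max(fitness)
    # Small epsilon keeps every weight positive, even for the worst particle.
    weights = [worst - f + 1e-9 for f in fitness]
    return rng.choices(range(len(fitness)), weights=weights, k=k)


if __name__ == "__main__":
    rng = random.Random(0)
    w = inertia_weight(g=10, g_max=100)
    x, v = update_particle([0, 1, 0], [0.0, 0.0, 0.0],
                           [1, 1, 0], [1, 0, 0], w, rng=rng)
    print(x, v)
    print(roulette_select([3.0, 1.0, 2.0], k=5, rng=rng))
```

Because selection is with replacement, low-cost particles may appear several times in the next round, which is what concentrates the search around cheaper selection schemes.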

Algorithm Execution Based on Blockchain Smart Contract
Blockchain, as a distributed database and decentralized P2P network, features smart contracts, distributed decision-making, collaborative autonomy, high security, tamper resistance, openness and transparency [11][12][13]. Blockchain smart contracts go through five stages: negotiation, development, deployment, operation, and destruction [14]. Generally, rules are programmed as smart contracts and the contract is published on the chain. The smart contract system regularly checks whether related events have occurred that meet the trigger conditions for contract execution [15]. In this paper, the multi-objective search optimization algorithm based on the improved BPSO algorithm is run as a programmed script through smart contracts to solve for the demand-side response user selection strategy.
When the demand response centre releases the response invitation to the contracted users through the platform, the demand response capacity, the peak price and the users' identifier information are stored on the chain, so that the information is not sent repeatedly when the scope of the invitation is later expanded. When a contracted user participates in the response, the user ID, the declared response capacity, the declared response price, the response time period and other information are signed with the user's private key and stored on the chain. When the declared response volume exceeds the demand, the smart contract system of the blockchain is triggered. The improved BPSO algorithm written into the smart contract in advance determines the winning users from their declared prices and contract performance probabilities so as to minimize the weighted subsidy fee. The winning bidders are then notified promptly through the platform, and relevant data such as the user ID, declared capacity, and declared price of each winning user are recorded on the chain. A bid-winning user carries out demand-side response during the contracted time period on the response day, and the user side automatically measures the electricity consumed during the response period. After the response, the user signs with its private key and stores on the chain its response start time, response end time, power at the beginning of the response, and power at the end of the response. The demand response centre monitors the load in real time through the electricity data acquisition system. The demand response centre then summarizes the response data of the participating users into a data table, hashes the actual response capacity, subsidy fees and other data, records the hashes on the blockchain, and distributes the table to power users for confirmation.
The demand response centre builds a Merkle tree over the demand response data table after the final verification, and the root hash is stored on the chain. The power company encrypts the financial data table with the public key of the subsidy bank and sends it to the bank. The bank decrypts it with its private key and automatically transfers the subsidy fee to the account bound to the responding user. The bank then encrypts the transfer information table with the public key of the power company, sends it to the power company, computes the Merkle tree of the transfer information table, and stores the root hash on the chain as evidence. The specific flow of this process is shown in figure 2. Blockchain data are tamper-resistant and traceable. The whole demand-side response process is stored on a trusted consortium chain, which not only ensures that the data on the chain cannot be tampered with, but also allows doubtful data to be traced to its source when necessary.
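The Merkle tree step can be illustrated with a short sketch that reduces the rows of a data table to a single root hash. The choice of SHA-256, the rule of duplicating the last node on odd-sized levels, and the row encoding in the example are assumptions; the text does not specify which hash function or tree layout the platform uses.

```python
import hashlib


def merkle_root(rows):
    """Return the hex Merkle root of a non-empty list of byte strings.

    Assumptions: SHA-256 leaves and nodes; on a level with an odd
    number of nodes, the last node is paired with itself.
    """
    if not rows:
        raise ValueError("at least one row is required")
    level = [hashlib.sha256(r).digest() for r in rows]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


if __name__ == "__main__":
    # Hypothetical rows: user ID, actual response (kW), subsidy (yuan).
    table = [b"user01,1200.0,3000.00",
             b"user02,800.0,1760.00",
             b"user03,950.0,2185.00"]
    print(merkle_root(table))
```

Only the root needs to be stored on the chain: changing any row of the table changes the root, so a party holding the table can later be checked against the recorded value.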

Experimental Data and Parameter Settings
Assume that 50 users respond to the request and declare their response capacity and price in a demand response. For each of these 50 users, the experimental data comprise the declared capacity, declared price, historical participation response times, historical effective response times, and historical average completion percentage.

[Figure 2. Flow of the demand-side response process. Steps: begin; the power company releases demand response information (including response capacity, maximum price, response period, etc.) and deposits the certificate on the blockchain; the demand response centre determines the scope of the response invitation and sends the invitation; the user promptly replies whether to participate in the response and declares the capacity and price; the demand response centre counts the load response volume and checks whether it meets the demand of the grid; if the full response amount is not solicited within the scheduled time, a real-time demand plan is prepared; the improved BPSO algorithm written into the smart contract in advance determines the winning users from their declaration information and related historical data, the users are notified in advance, and the relevant data are stored on the chain; the user performs the contracted response at the agreed time on the response day; power companies and demand response centres monitor the response in real time; the actual response of the user is calculated, its validity is determined, the response subsidy fee is calculated and automatically settled after the power company's review, and the relevant data are deposited on the chain.]

Analysis of Experimental Results
When the total response capacity declared by the contracted responding users is lower than the demand response volume, all responding users participate in the demand-side response, and real-time response is initiated for the shortfall. When the total capacity declared by the contracted response users is much higher than the demand response volume, the demand response centre comprehensively considers each user's declared price, performance probability and other factors, and selects the user response plan that minimizes the weighted subsidy. If two users have the same declared price and performance probability, the user who confirmed the response first is selected.
In the experiment, the improved BPSO algorithm is used to optimize the weighted model of the subsidy cost of the demand response. The related parameter settings of the improved BPSO algorithm are shown in table 3. When the demand-side response capacity is higher than the total capacity declared by the responding users, real-time response is required to fill the gap in the contracted response capacity. Taking a demand-side response volume of $6\times10^5$ kW as an example, the total declared contracted response capacity is 494963 kW, which is lower than the demand-side response volume. In this case, all contracted demand response users participate in the contracted response, the subsidy fee is 837306.26 yuan, and the remaining 105037 kW needs to be supplemented by real-time response. The method of user resource reserve classification has been elaborated in section 1.1 of this article. It is assumed that self-control real-time response users have declared a total response capacity of $5\times10^4$ kW, and the remaining 55037 kW will be completed by direct-control users. Assuming $p = 2.5$, the subsidy cost can be calculated by equation (1).
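The capacity split in the example above can be checked with a few lines of arithmetic. The helper name `split_real_time_gap` is hypothetical, and the sketch assumes, as the example does, that the self-control reserve is used first and direct-control users cover whatever remains. It computes capacities only; the subsidy cost of equation (1) is not reproduced here.

```python
def split_real_time_gap(demand_kw, contracted_kw, self_control_kw):
    """Split the contracted-response shortfall between real-time reserves.

    Returns (gap, self_control_part, direct_control_part) in kW.
    Assumption: self-control capacity is dispatched before direct-control.
    """
    gap = max(demand_kw - contracted_kw, 0)
    self_part = min(gap, self_control_kw)
    direct_part = gap - self_part
    return gap, self_part, direct_part


if __name__ == "__main__":
    # Figures from the unsaturated example in the text (kW).
    print(split_real_time_gap(600000, 494963, 50000))
    # -> (105037, 50000, 55037)
```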

Saturated Demand-side Response.
When the demand-side response capacity is lower than the total capacity declared by users, this paper proposes a weighted subsidy cost model that comprehensively considers the declared price and the performance of the contract. Users with a low weighted subsidy are preferentially selected, and the improved BPSO algorithm is used to optimize the operation strategy. Three schemes are compared. Plan One ranks users by weighted subsidy and uses the BPSO algorithm to select the users who participate in the response. Plan Two selects users in ascending order of subsidy cost. Plan Three preferentially selects users with a small weighted subsidy and optimizes the operation with the improved BPSO algorithm. To demonstrate the effectiveness of the proposed optimization model and algorithm, the decision response capacity, the subsidy cost and the probability-based completed response capacity of the three schemes are compared.
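As a point of reference, the price-ordered baseline (Plan Two) can be sketched as a simple greedy loop. The function name `select_by_price`, the tuple layout, and the three users in the example are hypothetical; the sketch assumes users are admitted whole, in ascending order of declared price, until the demand is covered.

```python
def select_by_price(users, demand_kw):
    """Greedy price-first selection.

    users: list of (user_id, declared_price, declared_capacity_kw)
    Returns (chosen_ids, total_selected_capacity_kw).
    """
    chosen, total = [], 0.0
    for user_id, _price, capacity in sorted(users, key=lambda u: u[1]):
        if total >= demand_kw:
            break  # demand already covered
        chosen.append(user_id)
        total += capacity
    return chosen, total


if __name__ == "__main__":
    # Hypothetical declarations, not taken from the experimental data.
    users = [("A", 1.0, 300.0), ("B", 2.0, 500.0), ("C", 1.5, 400.0)]
    print(select_by_price(users, demand_kw=600.0))
    # -> (['A', 'C'], 700.0)
```

Because each user is admitted with its full declared capacity, the selected total generally lands above the demand rather than exactly on it, and the loop never looks at performance probability.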
The following analyses the decision response capacity, the subsidy cost, and the probability-based actual completed response capacity of Plan One, Plan Two, and Plan Three under different demand response capacities.
Figure 3 shows the decision response capacity obtained by running the three schemes. It can be seen from figure 3 that the decision response volume obtained by all three schemes is higher than the demand response capacity. This is because users are selected with their whole declared capacity, so it is difficult to obtain a total that exactly meets the demand. Plan One and Plan Three are more flexible in choosing responding users, while Plan Two simply selects users in ascending order of subsidy and therefore has poor flexibility. As a result, in most cases the decision response capacity obtained by Plan One and Plan Three is lower than that obtained by Plan Two.
Figure 4 shows how the subsidy cost changes with the response capacity under the three schemes. Generally speaking, the subsidy cost increases with the response capacity. Plan Two selects users in ascending order of declared cost, while Plan One and Plan Three take both the declared cost and the performance probability into consideration. Plan One and Plan Three therefore do not always select the users with the minimum declared cost, so their subsidy cost is generally higher than that of Plan Two.
Figure 5 shows the actual response capacity of users, estimated from the user performance probability determined by equation (2). The actual response capacity increases with the demand response capacity. For Plan Two the change is basically linear, but for Plan One and Plan Three the growth gradually slows down as the demand response capacity increases. This is because users with a high performance probability are more likely to be selected when the demand response capacity is low. When the demand response capacity is high, more users with a low performance probability are selected, so the growth of the actual response capacity slows down.
The predicted actual response capacity in figure 5 is the response capacity estimated from the historical completion of the selected users. If the goal is to complete the demand response in full, the part of the demand that is not completed by the contracted response is completed by real-time response. Here, a demand response capacity of $2.5\times10^5$ kW is taken as an example. The actual response capacity of Plan One is 213833.33 kW, which requires a real-time response of 36166.67 kW. The actual response capacity of Plan Two is 155276.93 kW, which requires a real-time response of 94723.07 kW. The actual response capacity of Plan Three is 217519.48 kW, which requires a real-time response of 32480.52 kW. As in section 3.2.1, we assume that the self-control real-time response users declare a total response capacity of $10^4$ kW, and the remaining response volume is completed by the direct-control real-time response users. The real-time response subsidy fee is then calculated for each plan. Integrating the contracted response and real-time response subsidy costs, the total cost of Plan One is 521499.65 yuan, the total cost of Plan Two is 670666.74 yuan, and the total cost of Plan Three is 505788.28 yuan. The total costs of Plan One and Plan Three are much lower than that of Plan Two.
In summary, there is no obvious difference between Plan One and Plan Three in terms of decision response capacity, subsidy cost, or probability-based actual completed response capacity. However, in terms of convergence time, especially when the demand capacity is close to the declared response capacity, Plan Three is significantly better than Plan One. The details are shown in figure 6.

Conclusion
This article outlines the operating principles of smart contracts, reviews the demand-side response transaction framework based on blockchain technology, and constructs a demand-side response user optimization algorithm. The following conclusions are obtained.
1) Demand-side response management measures and management mechanisms are still under development, and the evaluation indicators and means of demand-side response effects need to be