Reliability Estimation and Optimization of a Smart Meter Architecture Using a Monte Carlo Simulation

Altenburg, Tobias; Staegemann, Daniel; Volk, Matthias; Turowski, Klaus

doi:10.1007/s42979-023-01917-8

Review Article
Open access
Published: 10 June 2023

Volume 4, article number 438, (2023)
SN Computer Science

Abstract

Technological evolution is a dominant driver of social change. The number of networked devices is constantly increasing, and the use of smart meters has become more relevant. A required feature of smart meter architectures is reliability, which ensures the continuous provision of the system's services. In this article, the research question “How can a low-fault smart metering architecture be achieved based on constructive methods?” is answered with the objective of designing a low-fault smart meter architecture. Applying optimization methods improves reliability. A structured approach to reliability optimization using topological modifications to the architecture is described, along with an evaluation of the improved reliability. For this purpose, reliability block diagrams are combined with a Monte Carlo simulation to obtain a realistic approximation of system reliability. Finally, the results of the individual reliability analyses are summarized and compared to show the optimization potential.

Introduction

The Internet of Things (IoT) is a megatrend that is dominating the current social transformation. The number of networked devices and the resulting volume of data are constantly increasing worldwide. By 2025, there are expected to be 75 billion networked devices [1] worldwide with a data volume of approximately 80 zettabytes [2]. The IoT has become a key technology for future-oriented scenarios. In line with Murphy’s law, “Anything that can go wrong will go wrong”, the reliability of computer systems is becoming more important. In particular, civil infrastructure systems have an extremely high societal relevance [3,4,5]. The services they provide, such as water or electricity supply, are increasingly dependent on highly available and functional information technology. Smart meters can record real consumption data and forward them to higher level instances, making this data available for the overall management of whole ecosystems. A fault, an impairment, or even a failure could have significant effects on public safety or other dramatic consequences [3,4,5]. The resulting dependence of modern society on complex information systems, especially for the above-mentioned infrastructures, is constantly growing [3,4,5].

This article is an extension of the paper "Reliability Estimation of a Smart Metering Architecture using a Monte Carlo Simulation" [6] published in IoTBDS 2022. In that paper, we approximated the reliability of the reference architecture for smart metering systems that is depicted in Fig. 1, and presented and interpreted the results. The present article focuses on improving the reliability of this reference architecture using specific optimization methods and validates the optimized smart meter architecture using the approximation method from the published paper. This approximation method aims to represent reality as accurately as possible, and therefore, a Monte Carlo simulation based on RBD models is used. The outcome is a more reliable smart meter architecture and, as a result, a low-fault overall system.

Figure 1 shows a Europe-wide reference architecture for smart metering systems (gas, water, heat, or electricity) [7, 8] in a schematic structure. The central concept in these specifications is a separate unit, the smart meter gateway (SMGW), which serves as the central communication device. It provides the interfaces between the diverse domains and the smart metering system. According to the last European Commission report [9] in 2020, the penetration rate of smart electricity meters is estimated at 43% (123 million) and that of smart gas meters at 27% (31 million). For 2030, a penetration rate of 92% (226 million) is projected for smart electricity meters. Furthermore, a penetration rate of 44% (51 million) is projected for smart gas meters in 2024. For comparison, about 53 million metering locations [10] will be equipped with smart metering systems in Germany. In this context, a metering location is a component that measures energy and includes all the technical equipment required to determine and transmit the metered values. This projected increase demonstrates that the Europe-wide and national rollout of smart meters will be driven by grid operators.

The general goals of this digital data collection are a more efficient and transparent energy distribution as well as the sustainable control of energy generation and the overall network utilization [11, 12]. To achieve these goals, reliability is a fundamental objective of the design phase [13]. The present article focuses on exactly this subject: the optimization of the constructive reliability of smart metering architectures. In general, smart metering systems are more fault-prone than conventional metering devices because of the more complex interaction between hardware and software components [11].

Using the reliability-oriented V-model, which is based on the ISO 26262 standard, this article follows a structured procedure for reliability optimization [14, 15]. The V-model is used as a structural approach and for a simplified explanation of the procedure. Figure 2 shows the V-model, which defines the reliability-oriented design of a system on the left side. At this level, the requirements for the system are set, so that a low-fault smart metering architecture can be developed. There are various optimization methods for these requirements in the literature [16], and some elements of these are used in this article. The right side of the V-model describes the reliability analysis and verification. Analytical approaches are not generally possible for common reliability problems at the component or system level. Therefore, approximative reliability methods, such as Monte Carlo simulation techniques, have become very popular [17]. In comparison with other reliability methods, the Monte Carlo simulation has the advantage that it is both precise and easy to implement. Therefore, the Monte Carlo simulation is used for the presented reliability analysis of a smart metering architecture. From the described challenges, we derived the following research question: “How can a low-fault smart metering architecture be achieved based on constructive methods?”.

Following this introduction, section “Foundations” presents the theoretical foundations for reliability and its optimization. After that, we will demonstrate and explain the optimized smart meter architectures in section “Optimized Smart Meter Architecture”. The approach for reliability analysis is described in section “Approach for Reliability Analysis”. Based on this approach the reliability is approximated using Reliability Block Diagrams (RBD) and a Monte Carlo simulation in section “Evaluation of the Smart Meter Architectures”. Section “Conclusion and Future Work” concludes this article by summarizing the paper and outlining future work.

Foundations

This chapter presents the basics of the reliability domain in a logically structured order. First of all, the reliability theory and the systematic reliability optimization are described, since this is the basis to develop the reliability-optimized smart meter architectures. Subsequently, the reliability analysis approach is explained.

Foundations of Reliability

The research field of reliability was characterized by Jean-Claude Laprié. He established a standard framework and general terminology for reliable and fault-tolerant systems [18]. According to Bertsche [19] and Laprié [18], the reliability R(t) is defined as the probability that a system performs its functions satisfactorily and without any failures under given functional and environmental conditions over a specific time period. The literature classifies four methods for a reliable system design: fault prevention, fault tolerance, fault removal, and fault forecast [18, 20]. This article focuses on fault prevention, because a previous literature review [16] indicates that the highest potential for reliability optimization lies in the design phase. Fault prevention refers to methods that are intended to prevent the occurrence of a fault condition or the introduction of faults into the entire system [18, 20]. To prove that the methods identified in the literature review [16] increase the reliability R(t) of a smart metering architecture, it is necessary to perform a validated reliability analysis. Reliability analysis is a methodical approach for determining the reliability of a system and the frequency of failures. This approach starts with the conception of the RBD model and ends with the statistical calculation of the overall reliability [21].

There are several techniques for quantitative and qualitative analysis of reliability in the literature [22]. The basis for our approach is a combination of quantitative methods. In this group, the most important techniques are the RBD [23], the network diagram method [24], Markov modeling [25] and the Monte Carlo simulation [26]. To calculate the reliability as precisely as possible, it is necessary to combine the above-mentioned techniques [22, 27]. The valuation approach we have chosen uses the RBD to model the entire system together with a Monte Carlo simulation to calculate the reliability per component, which is described in detail in section “Approach for Reliability Analysis”. An RBD is a schematic notation of the main components of the overall system, which represents their hierarchy and their interaction with each other for the function of the entire system [22, 28]. In the next step, the Monte Carlo simulation is used to simulate the reliability of each component. The Monte Carlo simulation is based on repeated random sampling and statistical analysis to estimate the reliability R(t) for complex system functions [29, 30]. This approximation technique helps to generate realistic values that we can use for the reliability analysis of the whole smart metering architecture.

Reliability Optimization

In the literature, the reliability R(t) is calculated with the following formula:

$$R\left(t\right)= e^{-\lambda t}.$$

Based on the conducted literature review [16], different methods for optimizing system reliability were identified. Here, it is evident that the failure rate λ(t) interacts with the time t. In many cases, the level of the failure rate, that is the reliability of a system component that is still intact, depends on the age that is already reached [31, 32]. The so-called bathtub curve in Fig. 3 describes the time history of the failure rate λ(t) for hardware components in three phases. The first phase of the bathtub curve is known as the period of early failure or "infant mortality" and is characterized by a decreasing failure rate λ(t). For example, through construction or material defects, a component may fail after a short period of operation [31,32,33]. If the component has passed a certain time without damage, then the risk of failure decreases significantly per time unit. The middle area of the bathtub curve is almost flat, so that the failure rate λ(t) is constant. The risk of the failure of a component stays unchanged over a longer time period. In this case, the reasons for downtimes are primarily random faults. The third area of the bathtub curve is characterized by a significantly increasing failure rate λ(t). These late failures are usually the result of wear and fatigue processes [32, 33].

As explained in section “Foundations of Reliability”, this article focuses on constructive reliability at the hardware level. Measures for optimizing the reliability of the system or for minimizing the risk of failures can be classified here into two fundamental categories—patterns for hardware reliability engineering and architecture patterns. Hardware reliability engineering is primarily characterized by quality control techniques that are used in the design and manufacturing of hardware [34]. In contrast, the architecture patterns define structured and strict design rules for the architectural structure or topology of the entire system. Each of these categories has a different impact on the three phases of the bathtub curve. Hardware reliability engineering patterns primarily have an impact on phase I and phase III, because early failure or wear and tear is due to hardware-specific conditions. However, the architecture patterns have a greater impact on the much longer service life (phase II), because faults in the architecture or system topology generally occur after the initial hardware or software faults at a higher level of maturity.

Design Science Research (DSR)

In the present article, we use the DSR approach, because a key feature of DSR is to solve societal and practical problems through the construction and evaluation of a scientific artifact [35]. Artifacts can be classified as concepts, models, methods, or realizations that contribute to a scientific result. According to Peffers [36], DSR consists of six major steps: problem identification and motivation, definition of the objectives for a solution, design and development, demonstration, evaluation, and communication (cf. Fig. 4). This article describes a practice-oriented problem that has to be solved by optimizing reliability. For this purpose, a low-fault smart meter architecture was defined in Fig. 5c and evaluated using reliability analysis in section “Evaluation of the Smart Meter Architectures”. This specific approach is detailed and implemented in the following chapter, and the result is interpreted and communicated.

Optimized Smart Meter Architecture

In this chapter, popular reliability optimization methods [16] are used to design a low-fault smart meter architecture. The individual optimization stages for a smart meter architecture with a lower error rate will be shown and explained. A smart meter architecture basically consists of three layers, as shown in Fig. 5 [7, 8]. The data layer is equivalent to the Local Metrological Network (LMN) from Fig. 1, which includes all smart meters in a house or household. Located above is the gateway layer, in which the SMGW as a telecommunication device provides all information to the application layer. The Home Area Network (HAN) and the Wide Area Network (WAN) from Fig. 1 have been combined into the Application Layer, because in both domains, the meter information can be read and visualized or a remote configuration can be executed [37]. Due to these features, it can also be summarized as a meter data management system (MDMS) [38].

Starting with the reference architecture [7, 8] up to the reliability-optimized smart meter architecture, Fig. 5 shows the individual optimization stages. For this purpose, the smart meter architectures have already been transferred to a simplified model. In section “Evaluation of the Smart Meter Architectures”, an approximation for the reliability of the individual smart meter architectures from Fig. 5 is provided by a simulation and the step-by-step optimization of the overall system is verified.

Figure 5a shows the simplest approach, which uses no constructive reliability methods for fault avoidance. All components of the smart meter architecture are connected to each other via a single channel, so that a failure of the SMGW or an interruption of the communication channels between the smart meters and the SMGW or between the SMGW and the application affects the overall system immediately. To achieve the first optimization level of the architecture in Fig. 5b, physical hardware redundancy is used as the reliability method and applied to the gateway layer. Because semiconductor components are becoming smaller and cheaper, the concept of hardware redundancy has become popular in recent years [39, 40]. This hardware redundancy can compensate for the failure of an SMGW and also enables a multi-channel connection between the smart meters and the two SMGWs as well as between the SMGWs and the application, so that a disruption of one connection can be tolerated. The result is that interruptions in the communication channels do not affect the service provision of the system.

The last optimization stage of the smart meter architecture as shown in Fig. 5c consists of the previous hardware redundancy of the SMGW and additional reliability methods in the data layer for smart meters. In this case, the principle of clustering is applied and two of the smart meters are defined as root nodes that act as data concentrators and aggregate all information of the subordinate smart meters [41, 42]. To guarantee this, all smart meters are interconnected multiple times and form a kind of mesh network topology [43]. Each of the two root nodes has a redundant connection to the superordinate SMGWs which act as pure telecommunications devices and send the information to the application [7, 8, 40]. The application, at the top level, aggregates all the information. From here, the smart meters can be managed and the smart meter data can be visualized or analyzed and used for general purposes (cf. MDMS) [38].

Approach for Reliability Analysis

In this chapter, the methodology for reliability analysis is presented; it is applied in section “Evaluation of the Smart Meter Architectures”. The smart meter architectures in Fig. 5a to 5c are the basis for the reliability analysis. These smart meter architectures already represent a simplification of the entire system, so they can be used directly for the methodology. In our reliability analysis, we assume five smart meters, because in the future, there will most likely be not only smart electricity meters in common use, but also smart water or gas meters. In the next step, it is necessary to transfer the simplified models from Fig. 5 into the logic of the RBD. An RBD configuration can consist of three basic component connections, which can be combined with each other: the series connection, the active redundancy, and the standby redundancy [44, 45]. Depending on the configuration, the failure of any component can cause the entire system to fail or restrict individual services of the entire system, so that the required system functions are not fulfilled [44, 45]. In Fig. 6, we have transferred the three smart meter architectures from Fig. 5 into the RBD logic. This formalization of the smart meter architectures allows the mutual dependencies of the hardware components to be evaluated with formulas.

To calculate the quantitative reliability of the entire system, the failure probabilities of each component are required. To obtain a validated value for the failure probability of the smart meters and the SMGW, we scanned five publicly available databases and contacted ten organizations. The analysed databases contain aggregated raw data of smart meters over a specific period of time. We examined the following databases: opennetzero.org, osf.io, kaggle.com, data.gov.uk and ieee-dataport.org. The list of organizations comprised national institutions with regulatory supervision over the entire energy supply and large companies (> 500 employees) that offer hardware products, such as smart meters or SMGWs, as well as services for the digitalisation of the energy sector. This research revealed that there are currently no validated values for the failure probability, because long-term data from practice and from the grid level are not yet available. Because validated values for the failure probabilities cannot be determined, we use a Monte Carlo simulation to calculate the values. The Monte Carlo simulation can be used to approximate the individual reliabilities of the hardware components, so that they correspond more closely to reality. The interaction of the hardware components, which is represented by the RBD in Fig. 6, determines the formula for calculating the reliability of the entire system.

Evaluation of the Smart Meter Architectures

This section presents the incremental approach for the reliability analysis of the three smart meter architectures from Fig. 5. Reliability distributions of systems must be modeled with suitable mathematical functions, so that real-world behavior can be represented. The bathtub curve from Fig. 3 can be approximately described as a combination of Weibull distributions [46]. Due to its versatility, the Weibull distribution has become one of the most commonly used distributions in reliability engineering. The Weibull distribution can be used to represent the decreasing, constant, and increasing failure rates λ(t) in technical systems. Therefore, it is able to represent different failure modes and all ranges of the bathtub curve [46]. Depending on the life phase (cf. Fig. 3) of a component, the Weibull distribution corresponds to an exponential distribution or a log-normal distribution [47]. As described in section “Reliability Optimization”, the focus of this reliability analysis is on the phase of the useful life, which has a constant failure rate λ(t). In this case, the reliability distribution corresponds to an exponential distribution. The exponential distribution is often used in the development of electronic systems, because it is accurate for reliability analysis [46]. Therefore, the following formula for the reliability R(t) is obtained [48,49,50]:

$$\text{Reliability } R\left(t\right)= e^{-\lambda t}.$$

(1)
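The relationship between the Weibull distribution and formula (1) can be illustrated with a minimal sketch. It uses the standard two-parameter Weibull form with a scale parameter `eta` and a shape parameter `beta`; these names and the numeric values are illustrative assumptions and do not come from the article. A shape below 1 gives a decreasing failure rate, a shape of 1 a constant one (the exponential case of formula (1)), and a shape above 1 an increasing one.

```python
import math

def weibull_reliability(t, eta, beta):
    """Survival probability R(t) = exp(-(t / eta) ** beta) of a Weibull lifetime."""
    return math.exp(-((t / eta) ** beta))

def weibull_failure_rate(t, eta, beta):
    """Failure (hazard) rate lambda(t) = (beta / eta) * (t / eta) ** (beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

ETA = 10.0                 # illustrative scale parameter (assumption)
TIMES = (1.0, 5.0, 10.0)   # illustrative points in time (assumption)

# One shape value per phase of the bathtub curve (values are assumptions).
for beta, phase in [(0.5, "early failures"), (1.0, "useful life"), (3.0, "wear-out")]:
    rates = [weibull_failure_rate(t, ETA, beta) for t in TIMES]
    print(f"{phase:15s} beta={beta}: {[round(r, 4) for r in rates]}")

# With beta = 1 the Weibull reliability coincides with exp(-lambda * t), lambda = 1 / eta.
lam = 1.0 / ETA
print(weibull_reliability(3.0, ETA, 1.0), math.exp(-lam * 3.0))
```

For the shape value 1, the printed failure rates are identical at all three points in time, which is the constant-rate assumption behind formula (1).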

For an overall reliability analysis, the system must be divided into individual components. These are shown in Figs. 5 and 6—the smart meter, the SMGW and the application. Because of the high technical similarities between the smart meter and the SMGW [11, 51], it is possible to use identical reliability analysis for these two components. For the application as a separate component, we assume that it is operated in a cloud environment. To obtain the reliability R_App of the application, the characteristic availability from the three major cloud providers (AWS, Azure, GCP) is used. The minimum availability is 99.90% [52, 53]. Therefore, for the reliability analysis of each smart meter architecture, there is a reliability R_App = 99.90%.

Reliability Simulation of Smart Meter and SMGW

In the following section, the reliability of the smart meter and SMGW is approximated. To calculate the reliability, the characteristic lifetime T and the failure probability G(t) of the components are required. Based on various European studies [51], a characteristic lifetime T of 12 years can be assumed. The failure probability G(t) can be assumed to be 2% on average [11, 54]. The following formulas show the calculation of the failure rate λ(t) and the lifetime t:

$$\begin{gathered}
\text{Failure rate } \lambda\left(t\right) = \frac{1}{T}, \\
\text{Lifetime } t = T \times G\left(t\right).
\end{gathered}$$

(2)

Using the exponential function, whose base is Euler's number e [55], the reliability R(t) for the two components can be calculated according to formula (1):

$$\text{Hypothetical reliability } R\left(t\right) = e^{-\lambda t} = e^{-\frac{1}{T} \times T \times G\left(t\right)} = e^{-0.02} \approx 98.02\%.$$

(3)
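The arithmetic behind formulas (1) to (3) can be checked with a few lines of Python. This is a minimal sketch using only the values stated in the text (T = 12 years, G(t) = 2%); the variable names are chosen here for illustration.

```python
import math

T_YEARS = 12.0   # characteristic lifetime T from the text
G = 0.02         # average failure probability G(t) from the text

failure_rate = 1.0 / T_YEARS                      # formula (2): lambda = 1 / T, per year
lifetime = T_YEARS * G                            # formula (2): t = T * G(t), in years
reliability = math.exp(-failure_rate * lifetime)  # formula (1): R(t) = exp(-lambda * t)

print(f"lambda = {failure_rate:.4f} per year, t = {lifetime:.2f} years")
print(f"R(t) = {reliability:.2%}")
```

Because λ and t are both derived from T, the characteristic lifetime cancels out and the result depends only on G(t).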

We use the principle of the Monte Carlo simulation to replace this hypothetical value with a more realistic approximation of the reliability of the smart meter and SMGW. The objective is to approximate a realistic value of the reliability R(t) based on the law of large numbers [56].

$$\text{Lifetime } t\left(x, \mu, \sigma\right) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{\left(x-\mu\right)^{2}}{2\sigma^{2}}},$$

(4)

$$x\in \left[0,1\right]; \quad \mu = 2{,}081.52 \text{ hours}; \quad \sigma = 5{,}256 \text{ hours}$$

This function [57] calculates the percentile for a specified mean and standard deviation. The parameters for the reliability calculation in formula (4) are described below:

For the parameter $x$, which indicates the probability in the normal distribution, we create a random number between 0 and 1.
The parameter μ, which indicates the arithmetic mean of the distribution, is equivalent to the lifetime tµ of our previously calculated reliability R(t) from formula (3), where 1.98% = 100% − 98.02% is the complement of R(t). This is calculated as follows:

$$\text{Lifetime } t_{\mu} = 12 \text{ years} \times 1.98\% \approx 2{,}081.52 \text{ hours}.$$

(5)

The parameter σ, which indicates the standard deviation of the distribution, is empirically assumed to be 5% [54] and inserted into the related formula (2) for the lifetime tσ:

$$\text{Lifetime } t_{\sigma} = 12 \text{ years} \times 5\% = 5{,}256 \text{ hours}.$$

(6)

Formula (4) is executed for 80,769 random samples to simulate the lifetime t(x, μ, σ). According to Liu, 80,769 random samples constitute an optimal number of trials for a Monte Carlo simulation [58]. Each simulated lifetime t(x, μ, σ) is inserted into formula (1), so that we can determine the reliability R(t) for 80,769 smart meters or smart meter gateways in a realistic manner. This statistical simulation of 80,769 samples was performed with an automated tool to reduce processing time and avoid human error. Finally, the average of the results is calculated to obtain an approximately real reliability of the two components:

$$\text{Reliability } R\left(t\right) \approx 96.93\%.$$

(7)
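The simulation procedure described above can be sketched as follows. This is not the authors' tool, which the text does not specify, but one possible reading of the procedure under stated assumptions: the lifetime is drawn as the percentile of a normal distribution with the μ and σ from formulas (5) and (6), negative lifetimes are clipped to zero so that no reliability exceeds 100%, and the failure rate is λ = 1/T with T expressed in hours. The seed and all function and variable names are illustrative.

```python
import math
import random
from statistics import NormalDist, fmean

HOURS_PER_YEAR = 8760
T_HOURS = 12 * HOURS_PER_YEAR   # characteristic lifetime T = 105,120 hours
MU_HOURS = 2081.52              # mean lifetime from formula (5)
SIGMA_HOURS = 5256.0            # standard deviation from formula (6)
N_SAMPLES = 80_769              # number of trials stated in the text

def simulate_reliabilities(n, seed=42):
    """Return n simulated component reliabilities (seed chosen arbitrarily)."""
    rng = random.Random(seed)
    dist = NormalDist(MU_HOURS, SIGMA_HOURS)
    results = []
    for _ in range(n):
        x = rng.uniform(1e-12, 1.0 - 1e-12)      # random probability strictly inside (0, 1)
        t = dist.inv_cdf(x)                      # percentile: simulated lifetime in hours
        t = max(t, 0.0)                          # assumption: clip negative lifetimes, so R <= 100%
        results.append(math.exp(-t / T_HOURS))   # formula (1) with lambda = 1 / T
    return results

samples = simulate_reliabilities(N_SAMPLES)
print(f"mean reliability: {fmean(samples):.4f}")
print(f"min: {min(samples):.4f}, max: {max(samples):.4f}")
```

Under these assumptions the mean lands close to 97%, in the neighbourhood of the value in formula (7); the exact figure depends on the random seed and on how values above 100% are handled.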

This reliability R(t) is the basis for the subsequent reliability analyses of the optimized smart meter architectures from Fig. 6. The simulation process of the reliability R(t) from formula (7) is shown in Fig. 7. The diagram shows the smoothed reliability R(t) for the smart meter and the SMGW. Due to the large number of samples, only every 327th random sample, 247 measurements in total, was included on the x-axis of the graph. The least reliable value is slightly above 85%, so we set the range of the y-axis from 0.85 to 1. The red trend line represents the moving average of the random samples. The diagram illustrates the strong variations of the calculated reliability R(t), which are caused by the simulated failures of the Monte Carlo simulation. We can see that the reliability R(t) of the smart meters and SMGW lies between approx. 85% and 100% because of the built-in randomness (cf. formula (4)). In this case, a reliability R(t) of 100% means that the characteristic lifetime T of the component is reached or even exceeded.
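The thinning and smoothing used for such a diagram can be sketched as follows. The data here are random stand-in values, not the article's simulation results, and the moving-average window of 10 points is an assumption, since the text does not state the window size.

```python
import random

def moving_average(values, window):
    """Trailing moving average; the first window - 1 points use a shorter window."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]   # drop the value that left the window
        out.append(running / min(i + 1, window))
    return out

rng = random.Random(0)
# Stand-in data: hypothetical reliabilities in [0.85, 1.0], NOT the article's results.
samples = [rng.uniform(0.85, 1.0) for _ in range(80_769)]

plotted = samples[::327]                     # keep every 327th sample
trend = moving_average(plotted, window=10)   # window size is an assumption
print(len(plotted), len(trend))
```

Since 80,769 = 327 × 247, taking every 327th sample yields exactly the 247 plotted measurements mentioned in the text.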

Reliability Approximation of Smart Meter Architectures

In this section, the overall reliability of the three architectures from Fig. 5 is calculated using the three RBD models from Fig. 6. The objective is to demonstrate the increased reliability of the smart meter architectures by calculating the overall system reliability. The reliability R(t) of the smart meter and SMGW simulated in section “Reliability Simulation of Smart Meter and SMGW” is used for the following reliability analyses. For the RBD model of the reference architecture from Fig. 6a, the reliability R_a(t) is calculated. The reliability R_b(t) is assigned to Fig. 6b and the reliability R_c(t) to Fig. 6c. These two approximations represent the optimization levels of the smart meter architectures.

Calculation of Reliability R_a(t)

In this section, the simulated reliability R(t) is merged with the defined RBD model from Fig. 6a to obtain the overall system reliability R_a(t). For the smart meters, we assumed a "k-out-of-n" dependency [44, 45]. The variable k corresponds to variable i in formula (8). The requirement is that none of the five smart meters from the architecture in Fig. 5a fails. Therefore, the following formula is obtained for the reliability R_SM1(t) of the smart meters based on the RBD model in Fig. 6a:

$$\begin{gathered}
R_{SM1}\left(i,n,R\left(t\right)\right) = \sum_{i}^{n} \binom{n}{i} R\left(t\right)^{i} \left(1 - R\left(t\right)\right)^{n-i}, \quad i = 5;\,\, n = 5;\,\, R\left(t\right) = 96.93\%, \\
R_{SM1}\left(t\right) = \left(R\left(t\right)^{5} \times \left(1 - R\left(t\right)^{0}\right)\right) \times \left(R\left(t\right)^{4} \times \left(1 - R\left(t\right)^{1}\right)\right) \times \left(R\left(t\right)^{3} \times \left(1 - R\left(t\right)^{2}\right)\right) \times \left(R\left(t\right)^{2} \times \left(1 - R\left(t\right)^{3}\right)\right) \times \left(R\left(t\right)^{1} \times \left(1 - R\left(t\right)^{4}\right)\right) \times \left(R\left(t\right)^{0} \times \left(1 - R\left(t\right)^{5}\right)\right), \\
R_{SM1}\left(t\right) = 88.35\%.
\end{gathered}$$

(8)

The following applies to this:

Variable i is the minimum number of units required for successful service provision of the system,
Variable n is the total number of parallel connected units,
And R(t) is the simulated reliability of the smart meter from section “Reliability Simulation of Smart Meter and SMGW”.
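The general "k-out-of-n" sum can be evaluated with a short function. This is a minimal sketch of the textbook binomial form, in which the sum runs from the minimum number of working units up to n; the function name and the example values (three units with a reliability of 0.9 each) are illustrative assumptions and are not taken from the article.

```python
from math import comb

def k_out_of_n(k, n, r):
    """Probability that at least k of n identical, independent units work.

    r is the reliability of a single unit; the sum runs over j = k, ..., n.
    """
    return sum(comb(n, j) * r**j * (1.0 - r) ** (n - j) for j in range(k, n + 1))

# Illustrative values only (assumptions, not the article's data).
print(f"2-out-of-3: {k_out_of_n(2, 3, 0.9):.4f}")
print(f"3-out-of-3: {k_out_of_n(3, 3, 0.9):.4f}")
```

When k equals n, only the last term of the sum remains, so the expression reduces to the product of the individual reliabilities, that is, a pure series connection of the units.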

The remaining components of the architecture in Fig. 6a are connected in series. Hence, it is a simple multiplication of the determined reliabilities to calculate the reliability R_a(t) of the entire system from the architecture in Fig. 6a.

$$\begin{gathered}
R_{a}\left(t\right) = R_{SM1} \times R\left(t\right) \times R_{App}, \\
R_{a}\left(t\right) = 88.35\% \times 96.93\% \times 99.90\%, \\
\text{Reliability } R_{a}\left(t\right) \approx 85.55\%.
\end{gathered}$$

(9)
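The series calculation of formula (9) is a plain product and can be reproduced as follows. The three input values are the ones reported in the text; the function and variable names are chosen here for illustration.

```python
from math import prod

def series_reliability(reliabilities):
    """Reliability of components connected in series: the product of all parts."""
    return prod(reliabilities)

R_SM1 = 0.8835   # smart meter group, reported in formula (8)
R_SMGW = 0.9693  # simulated component reliability, formula (7)
R_APP = 0.9990   # cloud application availability from the text

r_a = series_reliability([R_SM1, R_SMGW, R_APP])
print(f"R_a(t) = {r_a:.2%}")
```

Because every factor is below 1, each additional series component can only lower the overall reliability, which is why the single-channel architecture is the weakest of the three.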

Calculation of Reliability R_b(t)

In this section, the reliability analysis of the overall system of the architecture in Fig. 6b is presented. This is a reliability-optimized smart meter architecture, as described in section “Optimized Smart Meter Architecture”. The redundancy of the SMGW is equivalent to a parallel RBD model. This means that there are two identical communication channels, which are independently connected to all five smart meters. The reliability for this type of system is calculated below [43, 45]:

$$R_{Parallel}\left(t\right) = 1 - \prod_{i=1}^{N} \left(1 - R_{i}\left(t\right)\right).$$

(10)

For the reliability analysis, the calculated reliability R(t) is integrated into formula (10). As in the architecture in Fig. 6a, we have a "k-out-of-n" dependency for the smart meters, where the variable k corresponds to variable i in formula (8). The requirement is again that none of the five smart meters fails. Therefore, it is possible to take the result R_SM1(t) from formula (8) and integrate it into formula (10). The result is the following formula for the reliability R_Parallel(t) of the architecture in Fig. 6b:

$$\begin{gathered}
R_{Parallel}\left(t\right) = 1 - \left(1 - R_{SM1} \times R\left(t\right)\right) \times \left(1 - R_{SM1} \times R\left(t\right)\right), \\
R_{Parallel}\left(t\right) = 1 - \left(1 - 88.35\% \times 96.93\%\right) \times \left(1 - 88.35\% \times 96.93\%\right), \\
\text{Reliability } R_{Parallel}\left(t\right) = 97.94\%.
\end{gathered}$$

(11)

In the final step, the reliability R_App of the series-connected application must be multiplied by the calculated reliability R_Parallel(t) to get the reliability R_b(t) of the entire system from Fig. 6b. The redundant, and therefore more complex, system structure yields an improved reliability R_b(t) of the entire system, which we calculate as follows:

$$\begin{gathered} R_{b} \left( t \right) = R_{Parallel} \left( t \right) \times R_{App} \left( t \right), \hfill \\ R_{b} \left( t \right) = 97.94\% \times 99.90\% , \hfill \\ Reliability \,\,R_{b} \left( t \right) = 97.84\% . \hfill \\ \end{gathered}$$

(12)
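The redundancy calculation for architecture b can be checked numerically. The following sketch assumes that each of the two redundant channels is a series connection of the smart meter block R_SM1 and one block with reliability R(t); this reading is an assumption, chosen because it reproduces the reported values of 97.94% and 97.84%. The function and variable names are illustrative.

```python
from math import prod

def parallel_reliability(branch_reliabilities):
    """Formula (10): a parallel system fails only if every branch fails."""
    return 1.0 - prod(1.0 - r for r in branch_reliabilities)

R_SM1 = 0.8835   # block of five smart meters
R_T = 0.9693     # hypothetical reliability R(t)
R_APP = 0.9990   # application

# Assumption: one channel = smart meter block in series with one R(t) block.
channel = R_SM1 * R_T
r_parallel = parallel_reliability([channel, channel])
r_b = r_parallel * R_APP  # application connected in series, formula (12)

print(f"R_Parallel(t) = {r_parallel:.2%}")  # R_Parallel(t) = 97.94%
print(f"R_b(t) = {r_b:.2%}")                # R_b(t) = 97.84%
```

A single channel reaches only about 85.64%, so duplicating the channel accounts for the gain of roughly twelve percentage points.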

Calculation of Reliability R_c(t)

This section presents the reliability analysis for the architecture in Fig. 6c. This smart meter architecture is the last stage of our optimized architectures shown in Fig. 5. Based on the reliability optimization methods applied, we expect the highest reliability R_c(t) for this smart meter architecture. To calculate the reliability R_c(t) of the overall system, the reliability analysis for the "k-out-of-n" dependency of the smart meters is performed first. The variable k corresponds to the variable i in formula (13). For the case from Fig. 5c, the reliability R_SM2(t) is calculated for only four smart meters, because the root node described above is connected separately before them. Thus, based on formula (8) [44, 45] and the corresponding RBD model from Fig. 6c, the following formula is obtained:

$$\begin{gathered} R_{SM2} \left( {i,n,R\left( t \right)} \right) = \mathop \sum \limits_{i}^{n} \left( {\begin{array}{*{20}c} n \\ i \\ \end{array} } \right)R\left( t \right)^{i} \left( {1 - R\left( t \right)} \right)^{n - i} ,\,\,i = 4;\,\,n = 4;\,\,R\left( t \right) = 96.93\% , \hfill \\ R_{SM2} \left( t \right) = (R \left( t \right)^{4} \times \left( {1 - R\left( t \right)^{0} } \right)) \times \left( {R \left( t \right)^{3} \times \left( {1 - R\left( t \right)^{1} } \right)} \right) \times \left( {R \left( t \right)^{2} \times \left( {1 - R\left( t \right)^{2} } \right)} \right) \times (R \left( t \right)^{1} \times \left( {1 - R\left( t \right)^{3} } \right)) \times (R \left( t \right)^{0} \times \left( {1 - R\left( t \right)^{4} } \right)) , \hfill \\ R_{SM2} \left( t \right) = 91.16\% . \hfill \\ \end{gathered}$$

(13)

At the next level, the root node is connected to the SMGW in a parallel-series arrangement. This creates two completely independent communication channels that can still communicate with each other and exchange data if a fault occurs. The result is a significantly increased reliability R_c(t) of the overall system. The formula for a parallel-series RBD model is shown below:

$$R_{Parallel/Series}\left(t\right)=1-\prod_{i=1}^{M}\left(1-\prod_{j=1}^{N}R_{ij}\left(t\right)\right).$$

(14)

The next step is to merge the formulas, so that we get the reliability R_z(t) as an intermediate result by the following formula:

$$\begin{gathered} R_{z} \left( t \right) = 1 - \left( {\left( {1 - R\left( t \right)} \right) \times \left( {1 - R\left( t \right) \times R_{SM2} \left( t \right)} \right) \times \left( {1 - R\left( t \right)} \right) \times \left( {1 - R\left( t \right) \times R_{SM2} \left( t \right)} \right)} \right), \hfill \\ R_{z} \left( t \right) = 1 - \left( {\left( {1 - 96.93\% } \right) \times \left( {1 - 96.93\% \times 91.16\% } \right) \times \left( {1 - 96.93\% } \right) \times \left( {1 - 96.93\% \times 91.16\% } \right)} \right), \hfill \\ Reliability \,\,R_{z} \left( t \right) = 99.9987\% . \hfill \\ \end{gathered}$$

(15)
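The intermediate result R_z(t) follows directly from formula (14) once the four parallel branches of formula (15) are written out. The sketch below is illustrative only: the function name `parallel_series_reliability` and the list layout of the branches are assumptions, while the numerical values are those used in formula (15).

```python
from math import prod

def parallel_series_reliability(branches):
    """Formula (14): parallel arrangement of branches, each branch a series of blocks."""
    return 1.0 - prod(1.0 - prod(branch) for branch in branches)

R_T = 0.9693     # hypothetical reliability R(t)
R_SM2 = 0.9116   # block of four smart meters, result of formula (13)

# The four factors of formula (15), one list per parallel branch.
branches = [
    [R_T],
    [R_T, R_SM2],
    [R_T],
    [R_T, R_SM2],
]

r_z = parallel_series_reliability(branches)
print(f"R_z(t) = {r_z:.4%}")  # R_z(t) = 99.9987%
```

Representing each branch as a list of block reliabilities keeps the code close to the double product in formula (14), so further branches or blocks can be added without changing the function.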

In the final step, the series-connected reliability R_App of the application will be multiplied with the calculated intermediate result R_z(t) to get the reliability R_c(t) of the entire system from Fig. 6c. The defined system topology results in an optimized reliability R_c(t) of the overall system, which we calculate as follows:

$$\begin{gathered} R_{c} \left( t \right) = R_{z} \left( t \right) \times R_{App} \left( t \right), \hfill \\ R_{c} \left( t \right) = 99.99\% \times 99.90\% , \hfill \\ Reliability \,\,R_{c} \left( t \right) = 99.90\% . \hfill \\ \end{gathered}$$

(16)
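As a cross-check of the analytic value, the RBD of architecture c can also be sampled with a simple Monte Carlo experiment. The following is only a sketch under stated assumptions and not the simulation used for this article: every block is modeled as an independent Bernoulli variable with the reliabilities from formulas (15) and (16), and the function names, the seed and the number of trials are hypothetical choices.

```python
import random

R_T = 0.9693     # hypothetical reliability R(t)
R_SM2 = 0.9116   # block of four smart meters, result of formula (13)
R_APP = 0.9990   # application

def system_c_works(rng):
    """One random trial of the RBD structure behind formulas (15) and (16)."""
    def up(p):
        # A block is working with probability p (assumed independent of other blocks).
        return rng.random() < p

    branches_up = [
        up(R_T),
        up(R_T) and up(R_SM2),
        up(R_T),
        up(R_T) and up(R_SM2),
    ]
    # The system works if at least one branch and the application work.
    return any(branches_up) and up(R_APP)

def estimate_reliability(trials, seed=42):
    """Fraction of trials in which the system works."""
    rng = random.Random(seed)
    successes = sum(system_c_works(rng) for _ in range(trials))
    return successes / trials

analytic = (1 - ((1 - R_T) * (1 - R_T * R_SM2)) ** 2) * R_APP
estimate = estimate_reliability(200_000)
print(f"analytic R_c(t) = {analytic:.2%}, Monte Carlo estimate = {estimate:.4%}")
```

With 200,000 trials the standard error of the estimate is below 0.01 percentage points, so the sampled value should lie very close to the analytic 99.90%. The structure function is the only part that has to change when a different topology is evaluated.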

Consolidation of the Results

Finally, the performed calculations are summarized and evaluated in Table 1. In the first evaluation, the hypothetical reliability R(t) was calculated. This shows that the reliability R_a(t) for the smart meter architecture in Fig. 5a is approximately 5% lower than the hypothetical reliability R(t). The result illustrates that about 15% of the smart meter architectures could fail within the characteristic lifetime T of 12 years. Based on extrapolations for Germany, almost 7.6 million of the 53 million [10] metering locations would be affected annually. To counteract this, the reliability of the entire system has to be increased. For this purpose, several reliability optimization methods [16] were applied to the smart meter architecture. For the smart meter architecture in Fig. 5c, the reliability R_c(t) was optimized to 99.90%. Assuming the above-mentioned statistics for Germany, this corresponds to only 53,000 metering locations that would fail annually. In the best case, the possible number of annual failures of smart meter architectures is thus reduced by over 99%. The results of the reliability calculations R_a(t), R_b(t) and R_c(t) are listed in the column "Result by RBD models".
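The extrapolation to the German metering locations is a simple proportion. The sketch below assumes that the share of affected locations equals 1 − R for the respective architecture; the function name and constants are illustrative, and the reliabilities are the rounded results of formulas (9) and (16).

```python
METERING_LOCATIONS = 53_000_000  # metering locations in Germany [10]

def affected_locations(reliability, locations=METERING_LOCATIONS):
    """Expected number of affected locations, assuming a failure share of 1 - R."""
    return (1.0 - reliability) * locations

baseline = affected_locations(0.8555)   # R_a(t), formula (9)
optimized = affected_locations(0.9990)  # R_c(t), formula (16)
reduction = 1.0 - optimized / baseline

print(f"baseline:  {baseline:,.0f}")   # baseline:  7,658,500
print(f"optimized: {optimized:,.0f}")  # optimized: 53,000
print(f"reduction: {reduction:.1%}")   # reduction: 99.3%
```

Under this assumption the optimized architecture reduces the number of affected locations by about 99.3%, which agrees in magnitude with the figures quoted above.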

Table 1 Summary of the results

Conclusion and Future Work

Among the three essential design dimensions for computing systems (cost, performance, reliability), reliability is the least understood [20] and offers the most scientific potential. Therefore, we focused on the reliability aspect in this article. Three architectural options for the reliability optimization of a smart meter architecture were presented, and a structured reliability analysis was performed to calculate the reliability of the entire system for each of the architectural approaches from Fig. 5. First, the theoretical basics and the optimized smart meter architectures were described. Then, the reliability of the entire systems was calculated by a Monte Carlo simulation based on the defined RBD models. The results are realistic reliability values, which are summarized in Table 1. The performed approximation demonstrates the optimization potential in the design phase and the need for reliability optimization in the context of smart meter architectures. However, because improving reliability with a positive cost–benefit ratio is an important academic and industrial objective [59], amending the findings of this article with a cost–benefit analysis appears to be a necessary next step for future work.

Besides the architecture, optimization methods at the software and hardware level also have an impact on the reliability of the overall system [18]. For example, more reliable hardware components could increase reliability through improved quality control during manufacturing or the use of higher-quality materials [34]. Moreover, it is also possible to implement reliability optimization at the network level [59]. For example, the cloud layer offers high potential through the specific optimization of cloud business performance in relation to technical risks [60]. Furthermore, according to Laprié [18], other approaches, such as fault tolerance, can also be applied for reliability optimization. However, the optimization methods mentioned here were not considered in this article and have high potential for future scientific research. The objective is to obtain a cost-neutral and easy-to-implement concept that enables a low-fault system. All these methods can be grouped under the "Reliability by Design" approach. This approach offers the greatest potential, because early consideration of reliability through defined design criteria forms the basis for a robust and low-fault system.

Data availability

The data that support the findings of this research article are subject to limitations. Information about the data sources and the tool used is available from the authors upon request.

References

“Internet of Things (IoT) connected devices installed base worldwide from 2015 to 2025 (in billions).” Statista, 2018. https://www.statista.com/statistics/471264/iot-number-of-connected-devices-worldwide/. Accessed 19 Mar 2023.
“Data volume of internet of things (iot) connections worldwide in 2019 and 2025.” Statista, 2021. https://www.statista.com/statistics/1017863/worldwide-iot-connected-devices-data-size. Accessed 19 Mar 2023.
“Die Lage der IT-Sicherheit in Deutschland 2020.“ Bundesamt für Sicherheit in der Informationstechnik (BSI), 2020. https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/Lageberichte/Lagebericht2020.pdf?__blob=publicationFile&v=1. Accessed 19 Mar 2023.
“The NIS2 Directive - A high common level of cybersecurity in the EU.” European Union, 2022. https://www.europarl.europa.eu/RegData/etudes/BRIE/2021/689333/EPRS_BRI(2021)689333_EN.pdf. Accessed 19 Mar 2023.
“Proposal for a Directive of the European Parliament and of the Council on resilience of critical entities.” European Commission, 2020. https://eur-lex.europa.eu/resource.html?uri=cellar:74d1acf7-3f94-11eb-b27b-01aa75ed71a1.0001.02/DOC_1&format=PDF. Accessed 19 Mar 2023.
Altenburg T, Volk M, Staegemann D, Turowski K. Reliability Estimation of a Smart Metering Architecture using a Monte Carlo Simulation. In: 7th International Conference on Internet of Things, Big Data and Security (IoTBDS 2022), 2022. https://doi.org/10.5220/0010988100003194.
“Smart Grid Coordination Group. Smart Grid Reference Architecture.” CEN-CENELEC-ETSI, 2021. https://ec.europa.eu/energy/sites/ener/files/documents/xpert_group1_reference_architecture.pdf. Accessed 19 Mar 2023.
“Das Smart-Meter-Gateway - Cyber-Sicherheit für die Digitalisierung der Energiewirtschaft.“ Bundesamt für Sicherheit in der Informationstechnik (BSI), 2022. https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Publikationen/Broschueren/Smart-Meter-Gateway.pdf?__blob=publicationFile&v=6. Accessed 19 Mar 2023.
“Benchmarking smart metering deployment in the EU-28 Final Report.” European Commission, 2020. https://op.europa.eu/o/opportal-service/download-handler?identifier=b397ef73-698f-11ea-b735-01aa75ed71a1&format=pdf&language=en&productionSystem=cellar&part=. Accessed 19 Mar 2023.
“Monitoringbericht 2021.“ Bundesnetzagentur für Elektrizität, Gas, Telekommunikation, Post und Eisenbahnen, 2022. https://www.bundesnetzagentur.de/SharedDocs/Mediathek/Monitoringberichte/Monitoringbericht_Energie2021.pdf?__blob=publicationFile&v=9. Accessed 19 Mar 2023.
“Kosten-Nutzen-Analyse für einen flächendeckenden Einsatz intelligenter Zähler.“ Ernst & Young GmbH, 2013. https://www.erneuerbare-energien.de/EE/Redaktion/DE/Downloads/Studien/kosten-nutzen-analyse-fuer-einen-flaechendeckenden-einsatz-intelligenter-zaehler.pdf;jsessionid=86EAE859DADCB2FB104D40D3798B22B4?__blob=publicationFile&v=3. Accessed 19 Mar 2023.
Huang Y, Grahn E, Wallnerström CJ, Jaakonantti L. Smart meters in Sweden – lessons learned and new regulations. In: 3rd AIEE Energy Symposium on Energy Security, 2018.
Müller KJ. Verordnete Sicherheit - das Schutzprofil für das Smart Metering Gateway. Datenschutz und Datensicherheit (DuD). 2011. https://doi.org/10.1007/s11623-011-0135-6.
Papadopoulos Y, Walker MG, Parker D, Sharvia S, Bottaci L, Kabir S, Azevedo LS, Sorokos I. A synthesis of logic and bio-inspired techniques in the design of dependable systems. Annu Rev Control. 2016;41:170–82.
“Guidelines, processes and recommendations for the design of dependable IoT Systems.” IoT4CPS Consortium, 2019. https://iot4cps.at/wp-content/uploads/2019/07/IoT4CPS_D3.2_V1.1.pdf. Accessed 19 Mar 2023.
Altenburg T, Bosse S, Turowski K. Safety in distributed sensor networks – A literature Review. In: 5th International Conference on Internet of Things, Big Data and Security (IoTBDS 2020), 2020. https://doi.org/10.5220/0009311701610168.
Wang Z, Broccardo M, Song J. Hamiltonian Monte Carlo methods for subset simulation in reliability analysis. Struct Saf. 2019;76:51–67. https://doi.org/10.1016/j.strusafe.2018.05.005.
Laprié JC. Dependable Computing: Concepts, Limits, Challenges. In: 25th IEEE International Symposium on Fault-Tolerant Computing, 1995.
Bertsche B. Reliability in automotive and mechanical engineering. Heidelberg: Springer; 2008. https://doi.org/10.1007/978-3-540-34282-3.
Avizienis A, Laprie JC, Randell B, Landwehr C. Basic concepts and taxonomy of dependable and secure computing. IEEE Trans Dependable Secure Comput. 2004. https://doi.org/10.1109/TDSC.2004.2.
Yuan R, Tang M, Wang H, Li H. A reliability analysis method of accelerated performance degradation based on bayesian strategy. IEEE Access. 2019;7:169047–54. https://doi.org/10.1109/ACCESS.2019.2952337.
Niknafs H, Faridkhah M, Kazemi C. Analytical approach to product reliability estimation based on life test data for an automotive clutch system. Mech Mech Eng. 2018. https://doi.org/10.2478/mme-2018-0065.
Bobalo Y, Seniv M, Yakovyna V, Symets I. Method of Reliability Block Diagram Visualization and Automated Construction of Technical System Operability Condition. In: International Conference on Computer Science and Information Technologies, (CSIT), 2018. https://doi.org/10.1007/978-3-030-01069-0_43.
Ridzuan MIM, Rusli MAZ, Saad NM. Reliability Performance of Low Voltage (LV) Network Configuration. In: 11th Annual Energy Conversion Congress and Exposition (ECCE), 2019. https://doi.org/10.1007/978-981-15-2317-5_65.
Aggarwal AK, Kumar S, Singh V. Markov modeling and reliability analysis of urea synthesis system of a fertilizer plant. J Ind Eng Int. 2014. https://doi.org/10.1007/s40092-014-0091-5.
Wang Z, Broccardo M, Song J. Hamiltonian Monte Carlo methods for subset simulation in reliability analysis. Struct Saf. 2019. https://doi.org/10.1016/j.strusafe.2018.05.005.
Li G, Zhang K. A combined reliability analysis approach with dimension reduction method and maximum entropy method. Struct Multidiscipl Optim. 2011. https://doi.org/10.1007/s00158-010-0546-2.
Raso A, de Vasconcelos V, Marques R, Soares W, Mesquita A. Use of reliability engineering tools in safety and risk assessment of nuclear facilities. In: International Nuclear Atlantic Conference (INAC), 2017.
Harrison RL. Introduction to Monte Carlo simulation. AIP Conf Proc. 2010. https://doi.org/10.1063/1.3295638.
Mason SJ, Hill RR, Mönch L, Rose O, Jefferson T, Fowler JW. Introduction to Monte Carlo Simulation. In: Winter Simulation Conference, 2008. https://doi.org/10.1109/WSC.2008.4736059.
Menčík J. Bathtub curve. Concise Reliab Eng. 2016. https://doi.org/10.5772/62357.
Bonart T, Bar J. Quantitative Betriebswirtschaftslehre Band III—Marketing und Marktforschung, technische Zuverlässigkeit. Wiesbaden: Springer Gabler; 2020.
Ohring M. Engineering material science. Cambridge: Academic Press; 1995. https://doi.org/10.1016/B978-0-12-524995-9.X5023-5.
Avizienis A, Laprie JC, Randell B. Fundamental Concepts of Dependability 2001.
March ST, Storey VC. Design science in the information systems discipline: an introduction to the special issue on design science research. MIS Q. 2008;32(4):725–30. https://doi.org/10.2307/25148869.
Peffers K, Tuunanen T, Rothenberger MA, Chatterjee S. A design science research methodology for information systems research. J Manag Inf Syst. 2008;24(3):45–77. https://doi.org/10.2753/MIS0742-1222240302.
Henneke D, Freudenmann Ch, Wisniewski L, Jasperneite J. Implementation of Industrial Cloud Applications as Controlled Local Systems (CLS) in a Smart Grid Context. In: 22nd IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 2017. https://doi.org/10.1109/ETFA.2017.8247687.
“Advanced Metering Infrastructure and Customer Systems-Results from the smart grid investment grant program.” U.S. Department of Energy, 2016. https://www.energy.gov/sites/prod/files/2016/12/f34/AMI%20Summary%20Report_09-26-16.pdf. Accessed 19 Mar 2023.
Johnson BW. Design and analysis of fault tolerant digital systems. Addison-Wesley Longman. 1988. https://doi.org/10.1007/978-3-642-75002-1_5.
Avizienis A. Design of fault-tolerant computers. In: 31st American Federation of Information Processing Societies (AFIPS), 1967.
López G, Matanza J, Vega D, Castro M, Arrinda A, Moreno J, Sendin A. The role of power line communications in the smart grid revisited: applications, challenges, and research initiatives. IEEE Access. 2019;7:117346–68. https://doi.org/10.1109/ACCESS.2019.2928391.
Jan H, Paul A, Minhas A, Ahmad A, Jabbar S, Kim M. Dependability and reliability analysis of intra cluster routing technique. Peer-to-Peer Netw Appl. 2015. https://doi.org/10.1007/s12083-014-0311-1.
Parvin JR. An overview of wireless mesh networks. Int J Adv Res Eng Technol (IJARET). 2019;11(7):533–42. https://doi.org/10.34218/IJARET.11.7.2020.053.
Ahmed W, Hasan O, Pervez U, Qadir J. Reliability modeling and analysis of communication networks. J Netw Comput Appl. 2017;78:191–215. https://doi.org/10.1016/j.jnca.2016.11.008.
Ahmed W, Hasan O, Tahar S. Formalization of reliability block diagrams in higher-order logic. J Appl Log. 2016;18:19–41. https://doi.org/10.1016/j.jal.2016.05.007.
Lienig J, Brümmer H. Fundamentals of electronic systems design. Cham: Springer International Publishing; 2017. https://doi.org/10.1007/978-3-319-55840-0.
Härtler G. Wahrscheinlichkeitsmodelle der Zuverlässigkeit. Berlin, Heidelberg: Springer Spektrum; 2016. https://doi.org/10.1007/978-3-662-50303-4_3.
Gelman L, Martin N, Malcolm A, Liew E. Advances in Condition Monitoring and Structural Health Monitoring. Singapore: Springer; 2021. https://doi.org/10.1007/978-981-15-9199-0.
Ram M, Davim JP. Diagnostic techniques in industrial engineering. Cham: Springer International Publishing; 2018. https://doi.org/10.1007/978-3-319-65497-3.
Dey S, Bhale P, Nandi S. ReFIT: reliability challenges and failure rate mitigation techniques for IoT systems. Innov Community Serv (I4CS). 2020. https://doi.org/10.1007/978-3-030-37484-6_7.
“Erkenntnisse zu Umweltwirkungen von Smart Metern - Erfahrungen aus dem Einsatz von Smart Meter in Europa.“ Institut für ökologische Wirtschaftsforschung Berlin, 2021. https://www.umweltbundesamt.de/sites/default/files/medien/5750/publikationen/2021-05-06_cc_34-2021_umweltwirkungen_smart_meter.pdf. Accessed 19 Mar 2023.
Hauer T, Hoffmann P, Lunney J, Ardelean D, Diwan A. Meaningful Availability. In: Networked Systems Design and Implementation (NSDI), 2020.
Wong W, Zavodovski A, Corneo L, Mohan N, Kangasharju J. SPA: harnessing availability in the AWS spot market. IEEE Infocom. 2021. https://doi.org/10.1109/INFOCOMWKSHPS51825.2021.9484646.
Zhou J, Zonghuan W, Zhonghua Y. Research on the reliability allocation method of smart meters based on DEA and DBN. Appl Sci. 2021. https://doi.org/10.3390/app11156901.
Humenberger H, Schuppar B. Mit Funktionen Zusammenhänge und Veränderungen beschreiben - Mathematik Primarstufe und Sekundarstufe I + II. Heidelberg: Springer-Verlag; 2019. https://doi.org/10.1007/978-3-662-58062-2.
Hartbecke K, Schütte C. Naturgesetze: Historische und systematische Perspektiven. Mentis Verlag; 2005.
Ji X. Normal Inverse Function in teaching inference about population mean and population proportion. In: 9th International Conference on Teaching Statistics (ICOTS9), 2014.
“Optimal number of trials for monte carlo simulation” Liu M. https://mliu.org/wp-content/uploads/2019/03/simulation-trials.pdf. Accessed 19 Mar 2023.
Sun J, Zhu G, Sun G, Liao D, Li Y, Sangaiah AK, Ramachandran M, Chang V. A reliability-aware approach for resource efficient virtual network function deployment. IEEE Access. 2018;6:18238–50. https://doi.org/10.1109/ACCESS.2018.2815614.
Chang V. Presenting cloud business performance for manufacturing organizations. Inf Syst Front. 2020;22:59–75. https://doi.org/10.1007/s10796-017-9798-3.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Very Large Business Applications Lab, Faculty of Computer Science, Otto-von-Guericke University Magdeburg, Magdeburg, Germany
Tobias Altenburg, Daniel Staegemann, Matthias Volk & Klaus Turowski

Corresponding author

Correspondence to Tobias Altenburg.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Altenburg, T., Staegemann, D., Volk, M. et al. Reliability Estimation and Optimization of a Smart Meter Architecture Using a Monte Carlo Simulation. SN COMPUT. SCI. 4, 438 (2023). https://doi.org/10.1007/s42979-023-01917-8

Received: 27 December 2022
Accepted: 15 May 2023
Published: 10 June 2023
DOI: https://doi.org/10.1007/s42979-023-01917-8
