
Modeling interconnected ICT and power systems for resilience analysis

Abstract

Increasing interdependencies between power and ICT systems amplify the possibility of cascading failures. Resilience against such failures is an essential property of modern and sustainable power systems and networks. To assess the resilience and predict the behaviour of a system consisting of interdependent subsystems, the interconnection requires adequate modeling. This work presents an approach to model and determine the state of these so-called interconnectors in future cyber-physical energy systems with strongly coupled ICT and power systems for a resilience analysis. Upon suitable modification, the approach can be used to capture the impact of various parameters on system performance. A hierarchical modeling approach is developed with atomic models that demonstrate the interdependencies between a power and an ICT system. The modeling approach using stochastic activity nets is applied to an exemplary redispatch process in a cyber-physical energy system. The performance of an interconnector when facing limited performance from the ICT subsystem and its subsequent impact on the power system is analysed using the models. The state of the interconnector and the service level are mapped to a resilience state-space diagram. The representation of system state on the resilience state-space diagram allows interpretation of system performance and quantification of resilience metrics.

Introduction

Energy supply systems are changing. The purpose of consistently providing power, gas or heat to customers under the objective of cost-efficiency is extended by the objective of decarbonisation. This led and continues to lead to renewable energy resources being connected to existing systems. Solar and wind power plants connected to the power system (PS) are typically smaller, area-wide distributed and can be controlled rapidly, although only within a volatile range of capacity. While conventional energy supply systems were operated separately, the emerging cyber-physical energy system (CPES) relies on information and communication technologies (ICT) and spans multiple subsystems.

In CPES, an interconnector (IC) is defined as a technical instance that exists in two or more subsystems. These instances are physically, virtually, geographically or logically dependent on all subsystems the IC connects. Focusing on the subsystems PS and ICT, an example of an interconnector is controllable generation. It consists of generators from PS and ICT deployed to control the generators. As interconnectors exist in more than one subsystem, they are vulnerable to challenges from all the subsystems they exist in. Such challenges could be high-impact low-probability events such as cascading failures. As a result, challenges faced by ICT, such as loss of control leading to degraded system performance, could propagate to the power system through an interconnector and vice versa. Therefore, an interconnector must be resilient against such propagating failures to protect the subsystems. For the assessment of resilience of interconnectors, system measures affected by such high-impact low-probability events must be identified (e.g., voltage, latency). To determine these measures in strongly coupled power and ICT systems, a suitable model is required that identifies the impact of these measures as well as improvements associated with implementation of resilience mechanisms on system operation. Such models, upon adequate modification, can predict system behaviour under various conditions.

In this paper, a new method to model interconnectors is presented using stochastic activity nets (SAN) to analyse their performance. Stochastic activity nets are a state-based modeling approach that facilitate unified performance evaluation using discrete event simulation. SANs facilitate the expressiveness required for events to occur in such diverse systems. They also allow the integration of two systems characterised by heterogeneity of parameters and variables. Discrete event simulation is used to determine changes in system state caused by events occurring at discrete time intervals. With a combination of discrete event simulation and SANs, the models may be used to predict system behaviour in hypothetical scenarios. This simulation technique allows the specification of events that lead to a change of system state at a high resolution. In the approach presented, multiple stochastic activity nets are combined in a hierarchical manner to portray the interaction between subsystems. By running simulations on the models, markings are obtained which are used to compute the state of an interconnector. Finally, the state of the interconnector is depicted within a resilience state-space diagram (Sterbenz et al. 2010) to illustrate the state evolution of an interconnector with respect to its operational state and service level. The representation of the state on the resilience state-space diagram sets the basis for the resilience assessment of CPESs. The approach presented utilises a high resolution through a discrete quantification of system features and hence can be applied to any CPES, regardless of the CPES’s specific performance metric thresholds. It is assumed that it is possible to adapt system features to the necessary resolution of the parameter of interest and derive the system state and service level based on this resolution. It is also assumed that the system of interest is well defined at a granular level, i.e., components, topology, properties, etc. 
To abstract any system level to the necessary resolution, the granularity of the system definition may be increased while maintaining a suitable degree of detail. The proposed methodology thus presents an efficient way to assess the resilience of interconnectors. The contributions of this paper include the introduction of the concept of interconnectors in CPESs, a methodology for performance modeling of interconnectors using stochastic activity nets, and a comprehensive mapping to a resilience state-space diagram. The methodology is developed based on a medium voltage SimBench grid (Meinecke et al. 2020) and is demonstrated using the example of redispatch.

The paper is structured as follows: “Related work” section presents the related work and compares the research contributions of this paper to previous work in similar domains. “System modeling assumptions” section gives a description of the CPES under investigation as well as of its interconnectors. “Redispatch in cyber physical energy systems” section introduces redispatch as a power system service that relies on interconnectors containing ICT and can be extrapolated to load-demand management in future energy supply systems. “Stochastic activity net model of interconnectors” section presents the SAN model of CPES and its evaluation, including results. In “State representation” section, a resilience state-space diagram is introduced. Also, an approach to map the interconnector state onto the aforementioned diagram is introduced, enabling the illustration of an interconnector’s state evolution.

Related work

The resilience of energy supply infrastructures and respective metrics are the subject of continuous improvement. The general ideas and reasons can be found in (Watson et al. 2014; Vugrin et al. 2017). The resilience of ICT systems in terms of state analysis has been defined by the authors of (Sterbenz et al. 2010). The authors define a two-dimensional state diagram onto which the state of a system is mapped for a resilience analysis. The resilience state-space diagram is extended to interconnectors in “State representation” section in this paper. On a theoretical level, interdependencies of ICT and energy supply domain systems, specifically the power system, are outlined in (Rinaldi et al. 2001) and reviewed in (Martins et al. 2017) from a resilience perspective, providing insights regarding the relevant categories yet without modeling the interconnection. In (Kamps et al. 2018), ICT dependent smart grid technologies for reliability calculations of distribution grids are modeled to determine the impact on customers and distributed generation (DG) reliability. Three states of ICT components are introduced but the shown impacts are not used to determine the power system state. The authors of (Wäfler and Heegaard 2013) categorise smart grid components and services and show the interactions between them. The authors propose a combined ICT and power system state meta-model for state estimation. State machines are used to determine the dependability, putting focus on the interconnection. The work presented in this paper bridges the gaps mentioned above by modeling the interconnection between power and ICT system to assess system behaviour while also determining the state of an interconnector considering all subsystems involved. 
The resulting models can be used to predict interconnected system behaviour under various conditions by feeding them with events that are considered challenging for at least one of the subsystems, e.g., an ICT server failure or the complete loss of a power system branch.

The qualitative approach in (Laprie et al. 2007) models interdependencies using state machines and stochastic Petri nets (SPNs) to depict the event chains of escalating, cascading and common-cause failures. It illustrates how failures propagate across systems but does not allow the identification of the operational state. Furthermore, resilience is not quantified using the SPNs. SPNs are also used by the authors of (Chen et al. 2011) to model cyber-physical attacks on smart grids. The authors consider a hierarchical modeling approach, similar to the approach presented in this paper. However, the use of SPNs limits the flexibility required to model interconnections between systems, due to the lack of conditional state-variable based events in an interconnected system, resulting in a generic model for smart grid attack scenarios. Markov chains are used by the authors of (Longo et al. 2017) to model large scale systems. The authors use a sub-model approach, which allows the modeling of different subsystems in complex cases to quantify resilience, but do not address interdependencies between the different systems. The use of SAN modeling (Sanders and Meyer 2000), described further in “Stochastic activity net model of interconnectors” section, introduces flexibility that allows multiple causes for an event to be considered. The first steps to model interdependencies using SANs were taken by the authors of (Chiaradonna et al. 2007), who do not, however, utilise the model to quantify resilience.

In summary, an approach that extends the individual processes for analysing state changes of power system and ICT by an explicit consideration of the ICT service state is crucial for the quantification of resilience of future CPESs. The connection between ICT-enabled grid services and the power system requires explicit modeling. Existing work has either focused strictly on one of the subsystems, omitting their mutual influences, or is based on abstract representation levels that leave out critical information needed for a realistic resilience analysis.

System modeling assumptions

The integrated ICT and power system studied in this paper is defined to focus on the interconnection between the two subsystems. The modeling and selection of parameters make the results applicable to future CPESs. It is assumed that the degree of automation in those CPESs down to the distribution level is at least as high as in present transmission systems, even though the services as well as the means of communication used will probably differ from present SCADA systems. To reduce the complexity of the model, the variety of failures in ICT is reduced to transmission failures, because these failures are most relevant from a power system services perspective. While the degrees of detail of ICT and PS reflect all features needed for static load flow analysis, the modeled interconnection is reduced to a single PS service: an automated, decentralised, curative distribution system redispatch (Kunz and Zerrahn 2013). Other services, like tap changing, topological reconfiguration or reactive power provision, are not explicitly modeled.

Subsystems

A SimBench semi-urban distribution system is used as the PS instance (Fig. 1). It contains 122 buses and 126 lines. Load and generation are represented either as subordinate distribution (SD) grids or as large single loads and generators. The system is connected to a higher voltage level, where the high voltage connection serves as the slack bus. The grid has a high penetration of PV and wind generation. Those DGs are connected via power electronic converters and can thus be operated with negligible delay, their ramping rates being very high. Line loading is expected to reach critical levels due to a high level of renewable penetration like wind, solar power and biogas power plants (McCollum et al. 2017) and new loads like electric vehicles at the nodes under consideration, which enable unfavourable allocation of load and generation for specific points in time. Controllable generation like the SDs labeled R and B in Fig. 1 are connected via a PS line and to the ICT control center via ICT links. To depict a future power system, all plants used for the curative redispatch are renewables that are operated below their maximum output such that the plants can not only reduce but also increase their output (McCollum et al. 2017; Van den Bergh et al. 2015). The PS properties are derived from time series of load and generation for all nodes. For the example in this paper, a one-year time series of the SimBench grid is used. Time series analysis reveals the behaviour of the load and generation in terms of output increase and decrease rates for individual SDs, resulting from the stochasticity introduced by the load and weather dependent generation. Load flow calculation results on all time-steps of a time series provide information on the branch loading and, respectively, the rates at which lines become overloaded or relieved. ICT is a system that supports the PS by providing services. 
It can be seen as a set of devices and links that collect, transfer and assess data, as well as send control signals to the PS. Links between devices are characterised by their bandwidth, latency, and transmission delay. In this paper, the ICT system comprises the server at the control center, the networking devices for communication, and sensor network at the interconnector that receives and transmits data and controls the generators.

Fig. 1. SimBench semiurban PS with two ICT-connected, remotely controllable generation interconnectors

Interconnector description

An Interconnector (IC) is a technical instance that is present in all subsystems it connects. ICs are physically, virtually, geographically, or logically dependent on one or more of the subsystems in the interconnected CPES. Let the set of s components in the entire PS be defined by \( PS\,=\,\{p_{1}, p_{2}, \dots, p_{s} \}\) and the set of t components in the ICT system be defined by \(ICT~=~\{c_{1}, c_{2}, \dots, c_{t} \}\). Let \(P~=~\{p_{1}, \dots, p_{m}\}~\subseteq ~PS,~m~\leq ~s\) and \(C~=~\{ c_{1}, \dots, c_{n}\}~\subseteq ~ICT,\,n~\leq ~t\). For subsets P and C, a relation from C to P defines an interdependency. For instance, the relation \(R_{1} = \{(c_{i}, p_{j}) \mid c_{i} \text{ supplies set points to } p_{j},\ (c_{i}, p_{j}) \in C \times P\}\). Therefore, the set CO describing the IC is \(CO(R_{1}) = \{c_{1}, p_{1}\}\), which defines the set of elements with a relation required to form the IC. From Fig. 2, \(c_{1}\) is an IED and \(p_{1}\) is a generator. Hence, the IC is defined by a set containing elements that exhibit a relation between them. By defining more relations between elements, the set CO can be expanded. A second relation \(R_{2}\) from P to C could be defined as \(R_{2} = \{(p_{i}, c_{j}) \mid p_{i} \text{ supplies active power output level to } c_{j},\ (p_{i}, c_{j}) \in P \times C\}\). The set describing the relation can be, for instance, \(CO(R_{2}) = \{(p_{1}, c_{2}), (p_{2}, c_{3})\}\), where \(c_{2}\) and \(c_{3}\) are sensors in ICT system and \(p_{1}\) a generator in PS. The relations defined should describe the interdependencies between PS and ICT. To determine which relations exist and which elements are relevant, the role of the IC must be specified in a system service, such as controllable generation in redispatch. Finally, the IC is defined by \(CO(R_{1}, R_{2}) = \{c_{1}, c_{2}, c_{3}, p_{1}, p_{2}\}\). Generally, the IC can be described as \(CO(R_{1},R_{2}, \dots, R_{l}) = \{c_{1}, \dots, c_{n}, p_{1}, \dots, p_{m}\}\).
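The set construction above can be illustrated with a minimal Python sketch. It assumes each relation is stored as a set of ordered pairs and that the interconnector set is the union of all elements appearing in those pairs. The function name `interconnector` and the string labels are hypothetical and are not part of the original model.

```python
from typing import Set, Tuple

Relation = Set[Tuple[str, str]]


def interconnector(*relations: Relation) -> Set[str]:
    """Return CO(R_1, ..., R_l): every element that appears in some relation."""
    members: Set[str] = set()
    for relation in relations:
        for source, target in relation:
            members.update((source, target))
    return members


# R1 (from C to P): IED c1 supplies set points to generator p1.
R1: Relation = {("c1", "p1")}
# R2 (from P to C): PS elements report their active power output to ICT sensors.
R2: Relation = {("p1", "c2"), ("p2", "c3")}

print(sorted(interconnector(R1)))        # elements of CO(R1)
print(sorted(interconnector(R1, R2)))    # elements of CO(R1, R2)
```

Adding a further relation only requires passing another set of pairs, which mirrors how the set CO is expanded in the text.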

Fig. 2. Interconnector example: controllable generation

ICs are described in terms of their capabilities to observe each of the connected subsystems and to act within them to modify variable values. The IC, shown in Fig. 2, illustrates the interdependencies of the subsystems PS and ICT. It consists of a controllable generator and an intelligent electronic device (IED). The IC needs power to function (self-consumption) from the PS and provides active power P and reactive power Q to the PS. Conversely, it sends schedules of available P and Q generation capacities, based on local information, via the ICT system to the control center and receives set-point schedules from the ICT system. The IED can decide whether or not to apply the set point in controlling the generator based on local considerations. The IC is powered by the generator in combination with small local backup-storage capacities.

Interconnector model layers

Figure 3 shows the layers of abstraction followed in the modeling approach, with abstraction increasing from left to right. The most concrete layer on the left is the SimBench grid which is represented in system description and serves as the basis for the next layer, interconnector description. The following layer is a SAN model of ICs in the SimBench grid, which abstracts the processes at the lower level. Simulation on SANs provides information about the IC. This information is shown on the right most layer, where the state of the IC is represented on the resilience state-space diagram. The state-space diagram is the highest level of abstraction.

Fig. 3. Model abstraction layers

Redispatch in cyber physical energy systems

Redispatch is a possible solution to avoid overloading of branches in a PS. The system operator analyses the expected load flow based on the schedules of the generation units within the grid and requests generation unit operators to change their scheduled generation when it would lead to overloading. The redispatch relies on ICT services to perform the following:

  • Information exchange: DG operators submit schedules

  • Processing: PS operator identifies overloading (by load flow)

  • Processing: PS operator identifies DG that could solve overloading

  • Information exchange: PS operator requests change of schedule

  • Operation: DG operator controls their generation unit respectively

Excessive power flows can cause critical situations in the distribution lines. When operational limits of a line are violated, physical damage is prevented by a protection system, that can cause it to trip, e.g., by activation of a circuit breaker. Such a contingency can lead to neighboring lines overloading and, as a result, causing them to trip as well. Cascading failures of line trips could lead to large scale blackouts (Minkel 2008). Redispatch is a suitable solution to this problem. For an exemplary overloading between SD grid R and SD grid B in Fig. 1, due to an increased load flow from R to B, the operator sends a control signal via ICT to R to lower its output. Therefore, controllable generator ICs at grid R may be controlled via ICT to turn off some generation units. As a result, the power demand in grid B would no longer be met. The remaining capacities in B are activated based on a control signal from the control center, satisfying the demand at B. Hence, the branch RB is not critically overloaded anymore. This example describes how a curative redispatch can be utilised to change the distribution of energy sources supplying power to maintain grid stability. Normally redispatch is a part of operational planning. It is a scheduling process that considers points in time far enough in the future to allow for low ramping rates of (slow) conventional power plants. In this paper, all participating generators have the very high ramp rates of converter-coupled DGs which allows the application of automated, probably decentralised curative redispatch as a part of operation in the future (Van den Bergh et al. 2015). The rest of the paper uses redispatch as an example in the modeling method, as it depicts a service that relies on synchronised operation of both considered subsystems. Local control approaches, e.g., droop control, do not rely on synchronised operation of the subsystem. 
Such services are omitted here, as they would only add more SANs that do not share places from more than one subsystem.

Stochastic activity net model of interconnectors

A stochastic activity net (SAN) formalism, described further in “Stochastic activity nets” section, is adopted as the modeling approach to assess the resilience of ICs in smart grids. In “Hierarchical model” section, SANs are applied to an example of redispatch in the system described in “System modeling assumptions” section.

Stochastic activity nets

SANs (Sanders and Meyer 2000) are an extension to the stochastic Petri net formalism, allowing specification of pre- and post-conditions of an event, called an activity. The condition for an activity to occur is coded in an input gate, while the impact of an activity occurring (called “firing”) may be determined by an output gate condition, facilitating highly expressive models. This expressiveness, along with SANs being a state-based simulation approach, supports the determination of the state of a system before and after an event occurs. The determination of system state via discrete event simulation enables its mapping to the resilience state-space diagram. This factor, combined with the ability to represent multiple systems independent of a specific implementation in a single model, makes SANs a suitable approach when compared to approaches such as co-simulation. SANs, like Petri nets, contain places that indicate state variables by markings. They also include the reward formalism from stochastic reward nets. The reward formalism used in this paper is averaged over a simulation iteration. Other modeling techniques such as stochastic Petri nets, reward nets and Markov chains lack the expressiveness that SANs gain from pre- and post-conditions. The Möbius tool (Gaonkar et al. 2009) is used to develop the model. The tool allows the definition of atomic models that depict the subsystems of CPESs. The atomic models may have shared places, i.e., places that appear in two or more models. These models are composed together to form the overall system.
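To make the roles of places, gates and rewards concrete, the following is a minimal Python sketch of a SAN-like discrete event simulation. It is not the Möbius implementation. The `Activity` class, the `simulate` and `move` helpers, the two-place up/down example and both rates are assumptions chosen purely for illustration. Only exponentially timed activities are covered.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

Marking = Dict[str, int]


@dataclass
class Activity:
    name: str
    rate: float                              # rate of the exponential firing time
    input_gate: Callable[[Marking], bool]    # pre-condition on the marking
    output_gate: Callable[[Marking], None]   # marking update applied on firing


def simulate(marking: Marking, activities: List[Activity],
             reward: Callable[[Marking], float],
             horizon: float, seed: int = 0) -> float:
    """Return the reward averaged over one simulation run of length `horizon`."""
    rng = random.Random(seed)
    now, accumulated = 0.0, 0.0
    while True:
        enabled = [a for a in activities if a.input_gate(marking)]
        if not enabled:
            accumulated += reward(marking) * (horizon - now)
            break
        # Exponential firing times are memoryless, so resampling every enabled
        # activity after each event is equivalent to keeping earlier samples.
        delay, winner = min(((rng.expovariate(a.rate), a) for a in enabled),
                            key=lambda pair: pair[0])
        if now + delay >= horizon:
            accumulated += reward(marking) * (horizon - now)
            break
        accumulated += reward(marking) * delay
        now += delay
        winner.output_gate(marking)
    return accumulated / horizon


def move(src: str, dst: str) -> Callable[[Marking], None]:
    """Build an output gate that moves one token from `src` to `dst`."""
    def gate(m: Marking) -> None:
        m[src] -= 1
        m[dst] += 1
    return gate


# Hypothetical two-place example: a component that fails and is repaired.
activities = [
    Activity("fail", 0.1, lambda m: m["up"] > 0, move("up", "down")),
    Activity("repair", 1.0, lambda m: m["down"] > 0, move("down", "up")),
]
average_up = simulate({"up": 1, "down": 0}, activities,
                      reward=lambda m: float(m["up"]), horizon=10_000.0)
print(f"time-averaged marking of 'up': {average_up:.3f}")
```

The reward function plays the role of the time-averaged reward variable mentioned above. Here it simply reports the fraction of time a token sits in the hypothetical place `up`.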

Hierarchical model

The model presented in this paper follows a hierarchical approach, as shown in Fig. 4. The models are split into a hierarchy, which is intended to show the operational state of the subsystems and the service provided by them, and to highlight the interdependencies between them. The atomic models in the hierarchy are tractable and can be modified accordingly to assess system performance under various conditions. The hierarchical approach also ensures scalability, as simulations are performed on specific atomic models. Therefore, only those places relevant to the simulation are utilised. In the hierarchy, the PS and ICT atomic models characterise the operational state of the respective systems. The lower level models are aggregated into a composed model through shared places, which characterises the redispatch service. The shared places hence underline the interdependencies between the systems. In Fig. 4, \(\text{PS model} = \{P_{1}, P_{2}, \dots, P_{i}\}\) is the set of PS-specific atomic models, where each P describes a model for one of i SD grids. Similarly, \(\text{Server model} = \{S_{1}, S_{2}, \dots, S_{j}\}\) is the set of models of redundant servers, where each S describes a model of servers at one of j control centers. The intermediate level in the hierarchy shows \(\text{Overload model} = \{O_{1}, O_{2}, \dots, O_{k}\}\), which is a set of models where each model O describes the state of a branch between two SD grids. The cardinality k of this set can be determined by the number of branches that exist between two SD grids. The state of the ICT system is determined in the intermediate level model, \(\text{ICT model} = \{I_{1}, I_{2}, \dots, I_{l}\}\), where each model I describes the availability of the ICT system when threatened by a certain type of failure. Multiple models I may be used to show l types of failures and the impact of resilience mechanisms like redundancy, and to compute the availability. At the top level, \(\text{Redispatch model} = \{R_{1}, R_{2}, \dots, R_{m}\}\) is the set of m dispatchers that relieve the loading on k branches by performing redispatch. 
Note that the PS model overlaps with both the Overload model and the Redispatch model, as places are shared between the PS model and the two higher level models. The models are evaluated based on baseline values of valid systems described further in “Results” section. The following subsection describes atomic models used for redispatch.
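As a rough illustration of composition through shared places, the sketch below lists which models of the hierarchy are coupled by common place names. The place names follow the text where they are given. The places "R sufficient" and "B sufficient", the assignment of places to models and the helper `shared_places` are assumptions made for this example only.

```python
from itertools import combinations
from typing import Dict, Set, Tuple

# Place sets per model (assumed assignment, see lead-in).
models: Dict[str, Set[str]] = {
    "PS": {"R surplus", "R sufficient", "R deficit",
           "B surplus", "B sufficient", "B deficit"},
    "Server": {"server ok"},
    "Overload": {"R surplus", "R deficit", "B surplus", "B deficit",
                 "Branch load ok", "Branch overloaded"},
    "ICT": {"server ok", "OK", "Transmission failure",
            "ICT available", "Not available"},
    "Redispatch": {"R surplus", "R deficit", "B surplus", "B deficit",
                   "ICT available"},
}


def shared_places(model_places: Dict[str, Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each pair of models to the places they share, skipping empty overlaps."""
    result: Dict[Tuple[str, str], Set[str]] = {}
    for a, b in combinations(sorted(model_places), 2):
        common = model_places[a] & model_places[b]
        if common:
            result[(a, b)] = common
    return result


couplings = shared_places(models)
for (a, b), places in couplings.items():
    print(f"{a} <-> {b}: {sorted(places)}")
```

Pairs that do not appear in the output, such as PS and Server in this assumed layout, interact only indirectly through higher level models.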

Fig. 4. SAN model hierarchy

Atomic models

In this subsection, the hierarchical atomic models based on the System modeling assumptions are described.

Power system model

Each SD grid has a generation and a demand and is capable of decentralised control (see “System modeling assumptions” section). The load and demand levels are not explicitly determined as a result of the granularity. Instead, the load balance of each SD grid takes one of three levels: “deficit”, where the generation is lower than the actual demand, leading to loads being shed or consumption being reduced in general; “sufficient”, where the demand and the generation are in balance; and “surplus”, where the actual generation exceeds the demand, leading to surplus power in the grid. Each SD grid is modeled in the same manner as shown in Fig. 5.

Fig. 5. SAN models of grid R and branch loading

The activities model the events where the load balance at an SD grid changes. The activity firing times are exponentially distributed to model the time lapse for a rise (or fall) in the sum of generation and load. The distribution is determined by finding the smallest residual sum of squares for the time series and a selection of probability distributions. It reflects the rates converging to a stable (constant) non-negative value for a given grid when the number of time-steps increases, which allows the rates to be described by an exponential distribution (Balakrishnan 2018). These activities have a very high firing rate in this proof-of-concept example, determined by the load flow calculations on the system described in “Subsystems” section. Power plants that have lower ramping rates (e.g., conventional coal fired plants) would decrease the activity rate. Further research could investigate the impact of decreased activity rates and other distribution functions. There are two important use cases to be analysed: firstly, the application of the concept of curative redispatch to lower voltage levels with many small generation units at a very high temporal resolution; and secondly, the integration of conventional power plants with low ramping rates.
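The three-level load balance with exponentially timed rise and fall activities can be sketched as follows. This is a minimal illustration, not the authors' model. The function `simulate_load_balance` and the rate values are hypothetical, whereas the rates in the paper are derived from load flow calculations on the SimBench time series.

```python
import random
from typing import Dict

LEVELS = ("deficit", "sufficient", "surplus")


def simulate_load_balance(rise_rate: float, fall_rate: float,
                          horizon: float, seed: int = 0) -> Dict[str, float]:
    """Return the fraction of time one SD grid spends in each load balance level."""
    rng = random.Random(seed)
    level, now = 1, 0.0                  # start in "sufficient"
    time_in_level = [0.0, 0.0, 0.0]
    while True:
        candidates = []
        if level < 2:                    # a rise is possible unless already in surplus
            candidates.append((rng.expovariate(rise_rate), +1))
        if level > 0:                    # a fall is possible unless already in deficit
            candidates.append((rng.expovariate(fall_rate), -1))
        delay, step = min(candidates)    # the activity with the shortest delay fires
        if now + delay >= horizon:
            time_in_level[level] += horizon - now
            break
        time_in_level[level] += delay
        now += delay
        level += step
    return {name: spent / horizon for name, spent in zip(LEVELS, time_in_level)}


# Hypothetical rates in events per hour over a 1000 hour run.
shares = simulate_load_balance(rise_rate=2.0, fall_rate=2.0, horizon=1_000.0)
print(shares)
```

Lowering `rise_rate` and `fall_rate` in this sketch corresponds to the slower ramping of conventional plants discussed above.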

Overload model

The process of redispatch is initiated when a branch is loaded higher than 60%. Overloading of branches occurs when the load flow is higher than expected due to generation on one end and load on the other end of a branch. This can be the result either of the behaviour of load and generation as in the time series or of inappropriate IC control. Such a situation is modeled in Fig. 5, where two places, i.e., Branch load ok and Branch overloaded, determine the state of the branch between the two SD grids. For the example in this paper, the focus is on branch RB in Fig. 1.

The model depicts a situation where the load balance in R is surplus and in B is deficit, so that the branch RB is overloaded. The branch may also be overloaded when the load balance in B is surplus and in R is deficit. In either situation, the activity Branch overload fires. The exponential distribution of the activity firing time is chosen in the same manner as for the rise and fall activities of the PS model. The rate is determined by load flow simulations on the SimBench grid (see “Subsystems” section). A marking in the place Branch overloaded indicates the branch is overloaded. The loading on the branch is relieved when load reduces. The activity Branch relieved fires when the load balance situation in SD grids R and B is neither deficit nor surplus, i.e., it is sufficient. This activity’s firing time is exponentially distributed, like those of the activities in the PS model. Consequently, a marking in the place Branch load ok indicates the branch loading is normal.
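The enabling conditions of the two branch activities can be written as predicates on the marking. The following is a minimal sketch under assumptions. The places "R sufficient" and "B sufficient" and all function names are hypothetical, and the exponential timing of the activities is left out.

```python
from typing import Dict

Marking = Dict[str, int]


def opposite_imbalance(m: Marking) -> bool:
    """True if one grid is in surplus while the other is in deficit."""
    return ((m["R surplus"] > 0 and m["B deficit"] > 0)
            or (m["R deficit"] > 0 and m["B surplus"] > 0))


def branch_overload_enabled(m: Marking) -> bool:
    return m["Branch load ok"] > 0 and opposite_imbalance(m)


def branch_relieved_enabled(m: Marking) -> bool:
    return (m["Branch overloaded"] > 0
            and m["R sufficient"] > 0 and m["B sufficient"] > 0)


def move_token(m: Marking, src: str, dst: str) -> None:
    """Move one token between the two branch places when an activity fires."""
    m[src] -= 1
    m[dst] += 1


marking = {"R surplus": 1, "R sufficient": 0, "R deficit": 0,
           "B surplus": 0, "B sufficient": 0, "B deficit": 1,
           "Branch load ok": 1, "Branch overloaded": 0}

if branch_overload_enabled(marking):
    move_token(marking, "Branch load ok", "Branch overloaded")
print(marking["Branch overloaded"])
```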

ICT model

The state of the ICT system that supports redispatch is modeled by the SAN as shown in Fig. 6. The model is divided into two parts: one details the ICT system availability and the other details a resilience mechanism, i.e., redundancy in servers. The ICT system components, from sensors to servers in the control center, either function normally or have failed, i.e., partial failures are not considered. ICT components do not have a uniform failure rate throughout their lifespan (Torell and Avelar 2004). The rates used in this paper model the useful life phase of the failure rate curve in (Torell and Avelar 2004), which features a quasi-constant failure rate leading to exponentially distributed failure times. As repair times for failures may vary, mean time to repair (MTTR) is used to calculate the repair rate (Matz et al. 2002; Theristis and Papazoglou 2013). Since MTTR is a constant, the repair rate \(\mu\) is also a constant. Hence, exponentially distributed times are utilised for failure and repair activities. The rate of the failure activities, \(\lambda\), is given by \(\lambda = \lambda_{1} \times \lambda_{2} \times \dots \times \lambda_{n}\), where \(\lambda_{n}\) denotes the failure rate of the nth component (Stimson 2017). In this example, the redispatch service is performed by a server at the control center, where a primary server is backed up by a redundant server. The server state is depicted in the server model, while the state of the ICT system service is shown in the ICT model. In this model, the impact of server failure on the process of redispatch is studied. These models may be appropriately modified to assess the impact of implementation-specific parameters and failures on system performance under various conditions. The respective state variables, such as sensor state or controller state, can be added.
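The step from MTTR to a constant repair rate can be sketched as below. The sketch assumes the standard relation that the repair rate is the reciprocal of the MTTR, and it adds the textbook steady-state availability of a two-state failure/repair process with exponential times for orientation. The numerical values and function names are hypothetical and are not taken from the paper.

```python
def repair_rate(mttr_hours: float) -> float:
    """Constant repair rate (per hour) implied by a mean time to repair."""
    if mttr_hours <= 0:
        raise ValueError("MTTR must be positive")
    return 1.0 / mttr_hours


def steady_state_availability(failure_rate: float, repair_rate_: float) -> float:
    """Long-run fraction of time a two-state (ok/failed) component is ok."""
    return repair_rate_ / (failure_rate + repair_rate_)


# Hypothetical values: one failure per 1000 h on average, 4 h mean repair time.
lam = 1.0 / 1000.0
mu = repair_rate(4.0)
availability = steady_state_availability(lam, mu)
print(f"mu = {mu:.3f} per hour, availability = {availability:.5f}")
```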

Fig. 6. SAN model of ICT system

The ICT system model identifies the availability of ICT for the service, as shown in Fig. 6. A transmission failure occurs if the server or networking devices fail to send signals, for example, due to a hardware or software failure. Such a failure could occur due to random failures, systematic failure, failed software patches, environmental damages or even malicious attacks (such as denial of service). These failures could cause the redispatch process to stall or fail. The SAN can be expanded to several failures impacting ICT by adding more activities and places. However, the goal remains the same, i.e., to determine if the ICT system is available to the PS or not. Redundancy of servers is characterised by the number of markings in the server model. The presence of markings in the place server ok indicates there is at least one server available. In this paper, the exact origin of the failure is not considered, but the resulting impact on the ICT system service is of interest. Based on the presence of a marking in either OK or Transmission failure, the ICT system is either ICT available or Not available. The activity status check, which is exponentially distributed as it is an event that occurs continuously at fixed time intervals and hence has a constant rate, is used to determine the state of the ICT system (Rajarajan et al. 2012). The ICT system status check is set to run every hour (Rajarajan et al. 2012).
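How server redundancy might feed into the ICT availability places can be sketched as follows. This is an assumption-laden illustration. The place "server failed", the coupling rule that transmission is OK while at least one server token remains, and all function names are hypothetical. The timing of the failure and status check activities is left out.

```python
from typing import Dict

Marking = Dict[str, int]


def server_fail(m: Marking) -> None:
    """One server fails: move a token from 'server ok' to 'server failed'."""
    if m["server ok"] > 0:
        m["server ok"] -= 1
        m["server failed"] += 1


def update_transmission(m: Marking) -> None:
    """Assumed coupling: transmission is OK while at least one server is available."""
    ok = m["server ok"] > 0
    m["OK"], m["Transmission failure"] = int(ok), int(not ok)


def status_check(m: Marking) -> None:
    """Map the transmission state onto the places read by the redispatch service."""
    available = m["OK"] > 0
    m["ICT available"], m["Not available"] = int(available), int(not available)


# Primary server plus one redundant server, as in the example of the paper.
marking = {"server ok": 2, "server failed": 0, "OK": 1,
           "Transmission failure": 0, "ICT available": 1, "Not available": 0}

history = []
for _ in range(2):                   # fail both servers, one after the other
    server_fail(marking)
    update_transmission(marking)
    status_check(marking)
    history.append(marking["ICT available"])
print(history)
```

In this sketch the first failure is masked by the redundant server, and only the second failure makes the ICT service unavailable.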

Redispatch model

The model shown in Fig. 7 describes the process of redispatch involving ICs for branch RB. The places R surplus, R deficit and ICT available determine the state of the respective IC, i.e., controllable generation at R. Similarly, places B surplus, B deficit and ICT available determine the controllable generation at B. The places connected to input gate IG1 indicate the requirements for redispatch to occur. The activity Redispatch is an instantaneous activity, indicated by the thin vertical rectangle in Fig. 7, i.e., it fires instantly after the gate checks whether the places required for redispatch have a marking. An instantaneous activity is used to highlight the need for redispatch to relieve the branch loading. Redispatch may occur when grid R faces a surplus and grid B faces a deficit or vice versa. If either is the case, the activity fires, as redispatch is required to relieve branch RB. When the activity fires, redispatch is successful, which results in grids R and B having sufficient load balance and branch RB being relieved. In future work, the reliability of the redispatch process could be a topic of interest, and failure paths may be added to show failures during the redispatch process.
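The enabling condition of input gate IG1 can be written down as a simple predicate over the markings. The sketch below is an illustration, not the gate definition from the authors' tool: the `Marking` class, its field names and the effect of firing (clearing the surplus and deficit tokens while keeping ICT available) are assumptions made here.

```python
from dataclasses import dataclass


@dataclass
class Marking:
    """Token counts of the places feeding the (assumed) input gate IG1."""
    r_surplus: int = 0
    r_deficit: int = 0
    b_surplus: int = 0
    b_deficit: int = 0
    ict_available: int = 0


def redispatch_enabled(m: Marking) -> bool:
    """Redispatch needs ICT plus opposite imbalances in grids R and B."""
    opposite_imbalance = (
        (m.r_surplus > 0 and m.b_deficit > 0)
        or (m.r_deficit > 0 and m.b_surplus > 0)
    )
    return m.ict_available > 0 and opposite_imbalance


def fire_redispatch(m: Marking) -> Marking:
    """Instantaneous activity: if enabled, both grids become balanced.

    Assumption: firing removes all surplus/deficit tokens and leaves the
    ICT available marking untouched.
    """
    if not redispatch_enabled(m):
        return m
    return Marking(ict_available=m.ict_available)


before = Marking(r_surplus=1, b_deficit=1, ict_available=1)
after = fire_redispatch(before)
print(redispatch_enabled(before), after)
```

Keeping the predicate separate from the firing effect mirrors the split between input gates and activities in a SAN and makes it straightforward to add failure paths later.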

Fig. 7 SAN model of redispatch service

The overload and redispatch submodels are higher level models compared to the PS and ICT atomic models, and are specific to the redispatch service. These submodels can be adapted to other PS services as well as ICT implementations to assess IC performance when specific parameters are impacted. Furthermore, multiple services may be added together to form a complete model of an entire ICT-reliant PS. The SAN models shown consist of a small number of places due to the example service chosen for this paper. Additionally, state-based modeling approaches face the problem of state-space explosion, which the method presented in this paper circumvents. Using a hierarchical atomic model approach limits the number of states in each submodel. To scale the approach to larger systems, the model hierarchy may be expanded, and more atomic models may be constructed and reused for multiple intermediate level models. For different use cases, only the relevant atomic models may be used instead of an entire monolithic model.

Evaluation

A simulation is conducted to investigate the interdependencies between the atomic models of the system. The model is solved using a desktop PC with an Intel i7 processor (2.7 GHz) and 16 GB memory. The simulation run consists of 10,000 simulation points. As the rates that determine the occurrence in the SAN result from yearly simulations, those 10k points in time represent the timesteps of one year (8760 hours). An evaluation of the redispatch service model is performed approximately once per hour of operation of the modeled grid. Hence, those 10k points over a year are mapped to the hours of one year. To map the results of the simulation to the resilience state-space diagram, the service provided by the IC and its operational state must be extracted from the results. In this paper, the service provided by the IC is measured in terms of the branch being loaded less than 60%, to account for situations that might endanger the compliance with the N-1 criterion as described for German TSOs in (Barrios Büchel et al. 2015; Agricola et al. 2012). The operational state of the IC is determined by the markings of the SANs relevant to the example, which are the ICT, SD grid R and B, and overload atomic models.
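One way to picture the mapping of 10,000 simulation points onto the 8760 hours of a year is a simple linear rescaling of the point index. The exact mapping used for the evaluation is not specified in the text, so the linear rule and the function name `point_to_hour` below are assumptions for illustration.

```python
HOURS_PER_YEAR = 8760


def point_to_hour(index: int, n_points: int = 10_000) -> int:
    """Map a zero-based simulation point index to a zero-based hour of the year.

    Assumption: points are spread evenly over the year (linear rescaling).
    """
    if not 0 <= index < n_points:
        raise ValueError("index out of range")
    return index * HOURS_PER_YEAR // n_points


print(point_to_hour(0), point_to_hour(5000), point_to_hour(9999))
```

With this rule roughly 1.14 simulation points fall into each hour, which is consistent with an evaluation approximately once per hour of operation.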

Model parameters

As systems with specifications similar to our example are currently not widely developed or deployed, there are few sources to gather realistic, relevant parameters. The numbers shown in Table 1 were chosen as a baseline either from present settings or from simulations and assumptions stated in previous sections. ICT and PS are modeled independently of the temporal resolution. Both systems operate at different timescales, which is a hurdle for modeling a joint system containing both systems. However, SANs offer the opportunity to decouple the systems from the timescales. The system processes are characterised by rates of activities, such as failure rates, which are derived from “per time” values and are independent of a temporal resolution. A simple example is a manual switch action, which has a delay and a very low failure rate. In such a case, the rate of the activity would represent the delay in performing the switching action manually. Simulation parameters such as failure rates of SCADA systems were taken from (Erickson et al. 2000; Çetinkaya 2001; Jensen et al. 2010). Generator rise, fall and overload rates were derived from load flow calculations performed on SimBench distribution grids (Meinecke et al. 2020), as shown in Table 2. The overload rates reflect the probability of a line being overloaded. More precisely, an overload rate indicates \(o = P[\text{line loading} > 60\%]\) over one year with time steps of 15-minute resolution in the SimBench distribution grid time series (Spalthoff et al. 2019).
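The overload probability can be estimated from a line loading time series as the share of time steps above the threshold. The following sketch uses made-up loading values rather than SimBench data, and the function name `overload_probability` is hypothetical.

```python
from typing import Sequence

# One year at 15-minute resolution: 365 days * 24 hours * 4 steps per hour.
STEPS_PER_YEAR = 365 * 24 * 4


def overload_probability(loadings_percent: Sequence[float],
                         threshold: float = 60.0) -> float:
    """Empirical P[line loading > threshold] over a loading time series."""
    if not loadings_percent:
        raise ValueError("time series must not be empty")
    overloaded = sum(1 for value in loadings_percent if value > threshold)
    return overloaded / len(loadings_percent)


# Made-up example series in percent of rated line capacity.
example = [10.0, 61.0, 60.0, 75.0]
print(overload_probability(example))  # two of four steps exceed 60 %
print(STEPS_PER_YEAR)
```

Note that a loading of exactly 60% is not counted as an overload here, following the strict inequality in the definition above.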

Table 1 Fixed simulation parameters
Table 2 Failure rates in ICT and branch overload rates

Results

In this section, the performance of the IC during the simulations is studied. Simulations were performed to evaluate the performance of the ICT system in terms of its availability to the PS, the availability of the IC and the loading of branch RB in the redispatch process. The simulation results can be seen in Figs. 8, 9 and 10. The output converged to a confidence level of 95% with a relative confidence interval of 1.23E−3.
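A relative confidence interval of this kind is commonly understood as the half-width of the confidence interval divided by the sample mean. How the simulation tool computes it internally is not stated here, so the sketch below is an assumption: it uses a normal approximation with z = 1.96 for the 95% level, and the sample values are invented.

```python
import math
import statistics
from typing import Sequence


def relative_half_width(samples: Sequence[float], z: float = 1.96) -> float:
    """Half-width of the confidence interval of the mean, relative to the mean.

    Assumption: normal approximation; z = 1.96 corresponds to a 95 % level.
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("relative width is undefined for a zero mean")
    std_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return z * std_error / abs(mean)


# Invented batch results of an availability estimate.
batches = [0.99, 1.01]
print(relative_half_width(batches))
```

A run could then be extended until this value falls below a chosen target, which is one way to read the reported convergence.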

Fig. 8 Availability of the ICT services needed for redispatch

Fig. 9 Probability of failure of redispatch process and the chosen state change thresholds at 0.985 and 0.970

Fig. 10 Probability of all branches being kept in safe range only using redispatch as a counter measure and the chosen state change thresholds at 0.995 and 0.985

ICT availability

Figure 8 shows the availability of the ICT-enabled redispatch service to the PS during the simulations. Each curve represents the simulation output for a different server failure rate from Table 2, obtained by monitoring the place ICT available in Fig. 6. The ICT system is equipped with a redundant server along with a primary server for redispatch. The ICT system is said to be unavailable only if both servers have failed. In this case the curve drops, indicating a lower availability. The curve does not drop to 0 in this case, as the values are averaged over each simulation iteration. Over the course of the curve, the spikes indicate a higher availability, as one server is available during that time. The curves for each system lie within a small range that indicates the actual availability of an ICT system susceptible to failures. Systems II-IV show results for a system with scaled values of failure and overload rates. As system II has the highest failure rate, it has the lowest availability as well. System I depicts the result for a real-world SCADA server. As can be seen in Fig. 8, the availability lies between 99.5% and 100%. This represents the actual availability of SCADA servers as demonstrated in (Jensen et al. 2010). The models can be adapted to any type of failure. Accordingly, the rate of the failure and the change in the state variable will have to be integrated into the model.

Result: The model can determine ICT system availability under various failure types.

Interconnector

Figure 9 shows the state of the IC indicated by its availability. The analysis is based on the load balance situation in grid R and the availability of the ICT system, as the IC is controllable generation at R. The curves in the graph follow a similar trajectory to those in Fig. 8. This is because ICT availability has a direct influence on the controllability of the IC. Hence, when the availability of ICT decreases, the availability of the service provided by the IC decreases too. Even though the trajectories are similar, the actual availability is lower, i.e., the IC has a lower availability than the ICT system. This is because the result in Fig. 9 also considers the load balance situation in grid R. The lower availability is a consequence of the volatility of sources (change of generator output) at grid R. As the availability of ICT decreases, the availability of redispatch as a service also decreases. Hence, the loads at R or B are less likely to be balanced. For example, in system IV, at the 3752nd hour of operation, ICT has an availability between 98% and 98.5% while the IC offers an availability between 97% and 97.5%. There are also steeper increases and decreases in the latter curve. This behaviour is due to the load balance situation in grid R. The graph is divided into three parts, each indicated by a different colour. The thresholds at 0.970 and 0.985 each indicate an operational state change, described further in the “State representation” section.

Result: The impact of ICT system failures can be traced through interconnectors onto PS.

Branch loading

Figure 10 shows the probability of branch RB’s load remaining below 60%. The results are obtained by an analysis of the place Branch load ok in the SAN shown in Fig. 5. Figure 10 shows results from the benchmark SimBench and the SAN model. With SimBench being a pure PS model, it can only capture branch loading in two cases: with ideal ICT, i.e., ICT that always successfully delivers its service such as redispatch, and without redispatch. Neither case is representative of reality, where ICT faces failures. Such a case with imperfect ICT is shown by the output of the SAN model, where an ICT system, such as the one defined in this paper, is susceptible to failures. The branch loading in Fig. 10 is initially less than 60%, hence all curves begin at probability 1. Due to ICT failures propagating through ICs, the availability of ICs decreases. Therefore, when redispatch is required, the lower availability of the IC hinders the process, which causes the branch to be overloaded for extended periods of time. For example, in system IV, the availability of the IC falls to 97%. As a consequence, the controllable generators cannot be appropriately controlled to relieve the load on branch RB. Hence, the probability of branch loading being below 60% decreases.

The spikes in the curves indicate that load and generation profiles vary and relieve branches without redispatch as well. However, as the values plotted are averages, the curves do not return to normal levels. They stay in the lower ranges due to higher overload rates. This means that due to volatile loads and generation, the branches are often loaded above 60%. As described in the “Overload model” section, such situations occur due to intensive load flows. In system II, the overload rates are very low, i.e., the need for extensive load flows is low. This shows the impact of the overload rate parameter. Even though the IC has a relatively low availability due to the high ICT failure rate, the branch load stays in the safe range due to the low overload rate. This implies that the impact seen on the PS is not only a result of ICT failures. By explicitly considering the ICT system state, Fig. 10 shows information that cannot be captured from SimBench alone. It is also divided into three parts, each indicated by a different colour. The thresholds at 0.985 and 0.995 each indicate a service level state change, described further in the “State representation” section.

Result: The model captures the impact of ICT failures on PS, unveiling that failure propagation is not necessarily intuitive.

Discussion

Using the model developed in this work, interdependencies between interconnected PS and ICT systems can be highlighted and investigated. Analysing Figs. 8, 9 and 10 together, knowledge about system performance under various ICT availabilities can be inferred. The figures show the cascading of failures from ICT onto the PS through ICs, the impact of which is captured by the models. The models use parameters from real-world systems, such as failure rates, overload rates, and generator rise and fall rates, to demonstrate their validity under the assumptions (granularity, well-definedness) made for modeling purposes. These assumptions enable the specification of aggregated state variables for ICs necessary for the SAN model while also maintaining a lower state-space size. Had the assumptions not been made, the state-space size would grow with the number of state variables. However, the integration of features that are now omitted (e.g., technology-specific ICT details) is possible by adding more state variables. This would improve the degree of detail and thus the validity, but also shift the focus from the resilience-relevant features to subsystem-specific features. It can be noted that the numbers on the ordinate of the graphs are very high probabilities. This is intended to show the focus on modeling the unforeseen high-impact low-probability events in CPESs (Trakas et al. 2016).

State representation

With the results of the simulation, the next step is to evaluate the performance of the system with respect to the service delivered and the operational state on the resilience state-space diagram. As shown in Fig. 11, the two-dimensional state-space is divided into a 3x3 matrix to provide the necessary abstraction and limit the number of regions (Sterbenz et al. 2010). The service level is classified as acceptable, impaired or unacceptable, while the operational state is classified as normal, partially degraded or severely degraded. While each region may contain multiple states, in this limiting case, each region represents just one state. Formally, the operational state \(\mathbb {N}\) is defined by a set of markings {m1,m2,...,mk} from the k SANs and service \(\mathbb {P}\) is determined by {p1,p2,...,pl} from l services of interest. Both dimensions may be multi-variate with a well-defined mapping between \(\mathbb {N}\) and \(\mathbb {P}\). In this paper, \(\mathbb {N}\) is defined by

$$ \{{m_{1},m_{2}}\} = \{\text{ICT availability}, \text{branch loading}\} \tag{1} $$

and the service p1 considered is branch loading relief. The state-space diagram provides an additional layer of abstraction on top of the SANs to illustrate an aggregated set of markings. The aggregated set of markings determines the system state. The graphs in Figs. 8, 9 and 10 show the aggregation of those markings over a time interval. The thresholds in Figs. 9 and 10 are used to identify the state change. The horizontal \(\mathbb {N}\) axis represents the markings that define the operational state of the IC, obtained from the SAN model of grid R and the ICT availability at the same grid. The vertical \(\mathbb {P}\) axis represents the likelihood that the branch overloads due to the unavailability of the redispatch service. From Fig. 10, 0.995 marks the threshold between acceptable and impaired and 0.985 marks the threshold between impaired and unacceptable. From Fig. 9, 0.985 is chosen as the threshold for the change from normal operation to partially degraded. This choice illustrates the high dependency of the redispatch service on the IC: as the IC is important for the operation, countermeasures need to be taken. The threshold of 0.970 marks the state change from partially degraded to severely degraded. These thresholds are shown in Fig. 11.

Fig. 11 State evolution of the redispatch service for systems I, II, III and IV on the resilience state-space diagram
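The threshold-based classification into the regions of the state-space diagram can be sketched as follows. The threshold values are those stated above; the treatment of values lying exactly on a threshold (counted towards the better state) and the function names are assumptions made for this illustration.

```python
from typing import Tuple


def classify(value: float, upper: float, lower: float,
             labels: Tuple[str, str, str]) -> str:
    """Return the label of the band that a probability value falls into.

    Assumption: a value exactly on a threshold belongs to the better band.
    """
    if value >= upper:
        return labels[0]
    if value >= lower:
        return labels[1]
    return labels[2]


def operational_state(ic_availability: float) -> str:
    """Operational state from IC availability (thresholds 0.985 and 0.970)."""
    return classify(ic_availability, 0.985, 0.970,
                    ("normal", "partially degraded", "severely degraded"))


def service_level(p_branch_ok: float) -> str:
    """Service level from P[branch loading < 60 %] (thresholds 0.995 and 0.985)."""
    return classify(p_branch_ok, 0.995, 0.985,
                    ("acceptable", "impaired", "unacceptable"))


# Illustrative input values, not read from the figures.
print(operational_state(0.972), "/", service_level(0.990))
```

Applying both functions to each point in time yields a sequence of regions, i.e., the state trajectory drawn on the diagram.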

Consider systems I and IV. System I is based on data from existing SCADA and SimBench systems, reflecting today’s distribution grids, while system IV is based on baseline values for future CPESs. The performance of these systems is shown on the state-space diagrams in Fig. 11. The IC in system I stays in the normal operation state as shown in Fig. 9, while the service delivered is shown by the branch loading in Fig. 10. From Fig. 9, it can be deduced that the IC in system I is initially in the normal operation state. It remains in the normal operation state after a reduction in ICT availability. This is because the IC suffers some disturbances due to the load balance in grid R, but not enough to cause a state change. The service level suffers only minimal disturbances, as the branch does not overload often, and hence remains in the acceptable range.

The IC in system IV is initially in the normal operation state as shown in Fig. 11. Since ICT availability reduces due to failures, the IC operational state changes to partially degraded. This IC state change impacts redispatch and hence the service level weakens to impaired and eventually unacceptable. Towards the end, the operational state changes from partially degraded to severely degraded. The state trajectory of system II shows an IC that is resilient. As the IC operational state degrades from normal operation to severely degraded, the service provided remains acceptable. The operational state sees a degradation due to the impact of ICT failures on the IC. However, the branch loading is not impacted to a great extent and hence does not indicate a state change. The branch state does not change since the load and generation profiles in that scenario do not often result in a load-demand mismatch, as indicated by the lower overload rate. Using the trajectories of the IC state, several metrics of interest can be derived. One such metric, the resilience R, is determined by calculating the area underneath the curve (Sterbenz et al. 2014). A resilient IC would have a lower value of R, as it indicates that the system state does not degrade too much. Such an IC would be resistant to quick changes in operational state and would deliver an acceptable level of service consistently. Comparing the trajectories of the ICs in systems II and IV in Fig. 11, R for system IV is evidently higher than that of system II. This indicates that the resilience of the IC in system II is higher than that of the IC in system IV, as it provides an acceptable level of service even when in the severely degraded operational state.

Result: With the mapping of IC’s state and service to the resilience state-space diagram, the resilience of ICs under various conditions can be quantified, compared and assessed.
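To make the area-based metric concrete, the sketch below integrates service degradation over operational degradation along a trajectory with the trapezoidal rule. This reading of "area underneath the curve" is an assumption, as are the integer encoding of the regions (0 = normal or acceptable, 2 = severely degraded or unacceptable) and the example trajectories, which are only loosely shaped like the behaviour described for systems II and IV and are not read from Fig. 11.

```python
from typing import List, Tuple

# Each point is (operational degradation n, service degradation p),
# both encoded as region indices 0, 1 or 2 (hypothetical encoding).
Trajectory = List[Tuple[float, float]]


def resilience_area(trajectory: Trajectory) -> float:
    """Area under the trajectory in the (n, p) plane via the trapezoidal rule.

    Segments that move back towards lower n contribute negative area,
    so the result is a signed area.
    """
    area = 0.0
    for (n0, p0), (n1, p1) in zip(trajectory, trajectory[1:]):
        area += (n1 - n0) * (p0 + p1) / 2.0
    return area


# Operational state degrades while service stays acceptable.
service_preserving = [(0, 0), (1, 0), (2, 0)]
# Operational state and service both degrade.
service_degrading = [(0, 0), (1, 1), (1, 2), (2, 2)]
# Service degrades while the operational state stays normal.
vertical_only = [(0, 0), (0, 2)]

print(resilience_area(service_preserving))  # 0.0
print(resilience_area(service_degrading))   # 2.5
print(resilience_area(vertical_only))       # 0.0
```

The third trajectory shows a limitation of this reading of the metric: a purely vertical move encloses no area, so a system whose service becomes unacceptable in the normal operation state receives the same value as one whose service never degrades.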

Conclusion and future work

The main contributions of this work include the introduction of the concept of interconnectors in CPESs, an approach to model these interconnectors and the depiction of the interconnector state on the resilience state-space diagram. The concept of interconnectors is defined and modeled using SANs. Using the models, it is possible to identify those parameters that have an impact on system performance. The models show the high degree of interdependence between PS and ICT subsystems. Based on a well-defined system, system features are adapted to a discrete quantification to specify the feature’s current state. With the mapping of system state to the resilience state-space diagram, the resilience of CPESs can be investigated. Redispatch is used as a case study where atomic models depict individual processes of interest in each subsystem. The process of redispatch itself is depicted in a top-level model in the hierarchical structure to highlight the requirements from both systems. The evaluation of the model reveals how failures occurring in one system, in this case an imperfect ICT, can propagate and impact the other system, namely the PS. Thus, using the model and the representation of its results on the state-space diagram, a basis for resilience assessment is defined. The state-space diagram can be evaluated for various resilience metrics as well as for the comparison between similar implementations of systems. In future work, more detail will be included in the models. Technology- and implementation-specific details such as latency and diversity can be added by modifying the atomic models. Similarly, PS topology information and protection system measures can be added to the PS atomic models. The ICT atomic model can be expanded to include more specific challenges faced by the system, but precaution must be taken to avoid a state-space explosion.
Additionally, the resilience metric obtained from the state-space requires further research, as the SAN model results reveal its drawbacks. For example, consider a hypothetical system that starts in the same initial state as system I in Fig. 11 and then provides unacceptable service while remaining in the normal operation state. By the definition of the resilience metric as the area under the curve, its R is the same as that of system II, i.e., 0, and hence it would be rated as resilient. Explicitly considering the subsystems of a CPES and their interconnections allows a more precise assessment of its resilience.

Availability of data and materials

Not applicable.

References

  1. Agricola, A-C, Höflich B, Richard P, Völker J, Rehtanz C, Greve M, Gwisdorf B, Kays J, Noll T, Schwippe J, Seack A, Teuwsen J, Brunekreeft G, Meyer R, Liebert V (2012) dena-verteilnetzstudie. ausbau- und innovationsbedarf der stromverteilnetze in deutschland bis 2030. (kurz: dena-verteilnetzstudie). Technical report, Deutsche Energie-Agentur GmbH (dena). https://www.dena.de/fileadmin/dena/Dokumente/Pdf/9106_Studie_dena-Netzstudie_II_deutsch.PDF. Accessed 30 Sept 2020.

  2. Balakrishnan, K (2018) Exponential Distribution: Theory, Methods and Applications. Routledge, London.

  3. Barrios Büchel, H, Natemeyer H, Winter S (2015) Leistungsflüsse und netzauslastung im europäischen Übertragungsnetz bis 2050. Technical report, RWTH Aachen University Institut für Hochspannungstechnik (IFHT). https://www.bmu.de/fileadmin/Daten_BMU/Pools/Forschungsdatenbank/fkz_um_11_41_130_energieinfrastruktur_europa_bf.pdf. Accessed 30 Sept 2020.

  4. Çetinkaya, E (2001) Reliability analysis of scada systems used in the offshore oil and gas industry. Master’s thesis, Missouri University of Science and Technology. https://scholarsmine.mst.edu/masters_theses/2040. Accessed 30 Sept 2020.

  5. Chen, T, Sanchez-Aarnoutse J, Buford J (2011) Petri net modeling of cyber-physical attacks on smart grid. IEEE Trans Smart Grid 2(4):741–749.

  6. Chiaradonna, S, Lollini P, Di Giandomenico F (2007) On a modeling framework for the analysis of interdependencies in electric power systems. In: 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07), 185–195. IEEE, New York City.

  7. Erickson, K, Miller A, Stanek E, Dunn-Norman S (2000) Survey of scada system technology and reliability in the offshore oil and gas industry. MMS TA&R Program Program SOL 1435-01-99-RP3995:38–43.

  8. Gaonkar, S, Keefe K, Lamprecht R, Rozier E, Kemper P, Sanders W (2009) Performance and dependability modeling with möbius. ACM SIGMETRICS Perform Eval Rev 36(4):16–21.

  9. Jensen, M, Sel C, Franke U, Holm H, Nordström L (2010) Availability of a scada/oms/dms system - a case study. In: 2010 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT Europe), 1–8. IEEE, New York City.

  10. Kamps, K, Möhrke F, Zdrallek M, Awater P, Schwan M (2018) Modeling of smart grid technologies for reliability calculations of distribution grids. In: 2018 Power Systems Computation Conference (PSCC), 1–7. IEEE, New York City.

  11. Kunz, F, Zerrahn A (2013) The benefit of coordinating congestion management in germany. In: 2013 10th International Conference on the European Energy Market (EEM), 1–8. IEEE, New York City.

  12. Laprie, J-C, Kanoun K, Kaâniche M (2007) Modelling interdependencies between the electricity and information infrastructures. In: International Conference on Computer Safety, Reliability, and Security, 54–67. Springer, Berlin.

  13. Longo, F, Ghosh R, Naik V, Rindos A, Trivedi K (2017) An approach for resiliency quantification of large scale systems. ACM SIGMETRICS Perform Eval Rev 44(4):37–48.

  14. Martins, L, Girao-Silva R, Jorge L, Gomes A, Musumeci F, Rak J (2017) Interdependence between power grids and communication networks: A resilience perspective. In: DRCN 2017-Design of Reliable Communication Networks; 13th International Conference, 1–9. VDE, Frankfurt.

  15. Matz, S, Votta L, Malkawi M (2002) Analysis of failure and recovery rates in a wireless telecommunications system. In: Proceedings International Conference on Dependable Systems and Networks, 687–693. IEEE.

  16. McCollum, D, et al. (2017) SDG7: Ensure access to affordable, reliable, sustainable and modern energy for all. In: Griggs D, et al. (eds) A Guide to SDG Interactions: From Science to Implementation. International Council for Science, Paris.

  17. Meinecke, S, Drauz S, Klettke A, Sarajlic D, et al. (2020) Simbench documentation - electric power system benchmark models. Technical Report EN-1.0.0, University of Kassel, Fraunhofer IEE, RWTH Aachen University, TU Dortmund University. www.simbench.net. Accessed 30 Sept 2020.

  18. Minkel, J (2008) The 2003 northeast blackout-five years later. Sci Am 13:1–2.

  19. Rajarajan, M, Piper F, Wang H, Kesidis G (2012) Security and Privacy in Communication Networks: 7th International ICST Conference, SecureComm 2011, London, September 7-9, 2011, Revised Selected Papers, Vol. 96. Springer, Berlin, Heidelberg.

  20. Rinaldi, S, Peerenboom J, Kelly T (2001) Identifying, understanding, and analyzing critical infrastructure interdependencies. IEEE Control Syst Mag 21(6):11–25.

  21. Sanders, W, Meyer J (2000) Stochastic activity networks: Formal definitions and concepts. In: School Organized by the European Educational Forum, 315–343. Springer, Berlin/Heidelberg.

  22. Spalthoff, C, Sarajlić C, Kittl D, Drauz S, Kneiske T, Rehtanz C, Braun M (2019) Simbench: Open source time series of power load, storage and generation for the simulation of electrical distribution grids. In: International ETG Congress, 2019, Esslingen, Germany. VDE, Frankfurt.

  23. Sterbenz, J, Hutchison D, Çetinkaya E, Jabbar A, Rohrer J, Schöller M, Smith P (2010) Resilience and survivability in communication networks: Strategies, principles, and survey of disciplines. Comput Netw 54(8):1245–1265.

  24. Sterbenz, J, Hutchison D, Çetinkaya E, Jabbar A, Rohrer J, Schöller M, Smith P (2014) Redundancy, diversity, and connectivity to achieve multilevel network resilience, survivability, and disruption tolerance invited paper. Telecommun Syst 56(1):17–31.

  25. Stimson, W (2017) Forensic Systems Engineering: Evaluating Operations by Discovery. John Wiley & Sons, Hoboken, NJ.

  26. Theristis, M, Papazoglou I (2013) Markovian reliability analysis of standalone photovoltaic systems incorporating repairs. IEEE J Photovolt 4(1):414–422.

  27. Torell, W, Avelar V (2004) Mean time between failure: Explanation and standards. White Paper 78:6–7.

  28. Trakas, D, Hatziargyriou N, Panteli M, Mancarella P (2016) A severity risk index for high impact low probability events in transmission systems due to extreme weather. In: 2016 IEEE PES Innovative Smart Grid Technologies Conference Europe (ISGT-Europe), 1–6. IEEE, New York City.

  29. Van den Bergh, K, Couckuyt D, Delarue E, D’haeseleer W (2015) Redispatching in an interconnected electricity system with high renewables penetration. Electr Power Syst Res 127:64–72.

  30. Vugrin, E, Castillo A, Silva-Monroy C (2017) Resilience metrics for the electric power system: A performance-based approach. https://doi.org/10.2172/1367499.

  31. Wäfler, J, Heegaard P (2013) Interdependency modeling in smart grid and the influence of ict on dependability. In: Meeting of the European Network of Universities and Companies in Information and Communication Engineering, 185–196. Springer, Berlin.

  32. Watson, J-P, Guttromson R, Silva-Monroy C, Jeffers R, Jones K, Ellison J, Rath C, Gearhart J, Jones D, Corbet T, Hanley C, Walker L (2014) Conceptual framework for developing resilience metrics for the electricity, oil, and gas sectors in the united states. https://doi.org/10.2172/1177743.

Funding

This work was supported by the German Research Foundation DFG as part of the project “Multi-Resilience” with the project identification number 360352892 of the priority program DFG SPP 1984 - Hybrid and multimodal energy systems: System theory methods for the transformation and operation of complex networks. Publication costs were covered by the DACH+ Energy Informatics Conference Organizers, supported by the Swiss Federal Office of Energy.

Author information

Contributions

The authors ADP and JH contributed equally to this work. ADP provided the ICT perspective and implemented the SAN, while JH provided the PS perspective and ran the simulations on the SimBench grid. MB and HdM provided guidance throughout the work. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Amit Dilip Patil.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Cite this article

Patil, A., Haack, J., Braun, M. et al. Modeling interconnected ICT and power systems for resilience analysis. Energy Inform 3, 17 (2020). https://doi.org/10.1186/s42162-020-00120-w

Keywords

  • Interdependency
  • Joint system modeling
  • Cyber-physical energy systems
  • Resilience