A stochastic activity net (SAN) formalism, described further in the “Stochastic activity nets” section, is adopted as the modeling approach to assess the resilience of ICs in smart grids. In the “Hierarchical model” section, SANs are applied to an example of redispatch in the system described in the “System modeling assumptions” section.

### Stochastic activity nets

SANs (Sanders and Meyer 2000) extend the stochastic Petri net formalism by allowing pre- and postconditions to be specified for an event, called an *activity*. The condition for an activity to occur is encoded in an *input gate*, while the effect of an activity occurring (called “firing”) may be determined by an *output gate*, which makes highly expressive models possible. This expressiveness, together with SANs being a state-based simulation approach, supports determining the state of a system before and after an event occurs. Determining the system state via discrete event simulation enables its mapping to the resilience state-space diagram. This, combined with the ability to represent multiple systems in a single model independently of a specific implementation, makes SANs a suitable approach compared to alternatives such as co-simulation. SANs, like Petri nets, contain *places* whose markings represent state variables. They also include the reward formalism from stochastic reward nets; the reward used in this paper is averaged over a simulation iteration. Other modeling techniques, such as stochastic Petri nets, reward nets and Markov chains, lack the expressiveness that the pre- and postconditions give SANs.

The Möbius tool (Gaonkar et al. 2009) is used to develop the model. The tool allows the definition of atomic models that depict the subsystems of CPESs. Atomic models may have shared places, i.e., places that appear in two or more models. The atomic models are composed to form the overall system.
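The roles of places, input gates and output gates can be sketched in a few lines of Python. This is a minimal illustration and not the Möbius representation: the `Activity` class, the `Marking` type and the two-place failure example are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A marking maps each place name to its number of tokens (assumed representation).
Marking = Dict[str, int]


@dataclass
class Activity:
    """An activity guarded by an input gate and applied through an output gate."""
    name: str
    input_gate: Callable[[Marking], bool]    # precondition: is the activity enabled?
    output_gate: Callable[[Marking], None]   # postcondition: update the marking on firing

    def enabled(self, marking: Marking) -> bool:
        return self.input_gate(marking)

    def fire(self, marking: Marking) -> None:
        if not self.enabled(marking):
            raise ValueError(f"activity {self.name!r} is not enabled")
        self.output_gate(marking)


def _move_token_to_failed(m: Marking) -> None:
    # Output gate of the hypothetical example: move one token from "ok" to "failed".
    m["ok"] -= 1
    m["failed"] += 1


# Hypothetical example: a component that can fail while it holds a token in "ok".
fail = Activity(
    name="fail",
    input_gate=lambda m: m["ok"] > 0,
    output_gate=_move_token_to_failed,
)

marking: Marking = {"ok": 1, "failed": 0}
fail.fire(marking)  # the state before and after the event is explicit in the marking
```

Because the marking is an explicit data structure, the system state before and after each firing can be inspected directly, which is the property the text relies on for mapping states to the resilience state-space diagram.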

### Hierarchical model

The model presented in this paper follows a hierarchical approach, as shown in Fig. 4. The hierarchy is intended to show the operational state of the subsystems and the service they provide, and to highlight the interdependencies between them. The atomic models in the hierarchy are tractable and can be modified to assess system performance under various conditions. The hierarchical approach also supports scalability, as simulations are performed on specific atomic models, so only the places relevant to a simulation are used. In the hierarchy, the PS and ICT atomic models characterise the operational state of the respective systems. The lower-level models are aggregated through shared places into a composed model that characterises the redispatch service. The shared places hence represent the interdependencies between the systems.

In Fig. 4, PS model $= \{P_1, P_2, \ldots, P_i\}$ is the set of PS-specific atomic models, where each $P$ describes a model for one of $i$ SD grids. Similarly, Server model $= \{S_1, S_2, \ldots, S_j\}$ is the set of models of redundant servers, where each $S$ describes a model of the servers at one of $j$ control centers. The intermediate level of the hierarchy contains Overload model $= \{O_1, O_2, \ldots, O_k\}$, a set in which each model $O$ describes the state of a branch between two SD grids. The cardinality $k$ of this set is given by the number of branches that connect two SD grids. The state of the ICT system is determined in the intermediate-level ICT model $= \{I_1, I_2, \ldots, I_l\}$, where each model $I$ describes the availability of the ICT system when threatened by a certain type of failure. Multiple models $I$ may be used to represent $l$ types of failures, to show the impact of resilience mechanisms such as redundancy, and to compute the availability.

At the top level, Redispatch model $= \{R_1, R_2, \ldots, R_m\}$ is the set of $m$ dispatchers that relieve the loading on $k$ branches by performing redispatch. Note that PS model $\subset$ Overload model and PS model $\subset$ Redispatch model, as places are shared between the PS model and the two higher-level models. The models are evaluated based on baseline values of valid systems, described further in the “Results” section. The following subsection describes the atomic models used for redispatch.
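The idea of composing atomic models through shared places can be illustrated with a small sketch. The assignment of places to models below is an assumption made for illustration (including the hypothetical place names `R sufficient` and `B sufficient`); the actual assignment is the one in the figures of the paper.

```python
from collections import defaultdict
from typing import Dict, Set

# Assumed place sets per atomic model, for illustration only.
atomic_models: Dict[str, Set[str]] = {
    "PS_R": {"R deficit", "R sufficient", "R surplus"},
    "PS_B": {"B deficit", "B sufficient", "B surplus"},
    "ICT": {"OK", "Transmission failure", "ICT available", "Not available"},
    "Overload_RB": {"R deficit", "R surplus", "B deficit", "B surplus",
                    "Branch load ok", "Branch overloaded"},
    "Redispatch_RB": {"R deficit", "R surplus", "B deficit", "B surplus",
                      "ICT available"},
}


def shared_places(models: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Return every place that appears in two or more models, with its owners."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for model_name, places in models.items():
        for place in places:
            owners[place].add(model_name)
    return {place: names for place, names in owners.items() if len(names) >= 2}


shared = shared_places(atomic_models)
```

In this sketch, the places returned by `shared_places` are exactly the points at which the lower-level models and the higher-level models depend on each other.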

### Atomic models

This subsection describes the atomic models of the hierarchy, which are based on the “System modeling assumptions” section.

#### Power system model

Each SD grid has a generation and a demand and is capable of decentralised control (see “System modeling assumptions” section). Because of the chosen granularity, the generation and demand levels are not determined explicitly. Instead, each SD grid is in one of three load balance states: *“deficit”*, where the generation is lower than the actual demand, leading to loads being shed or consumption being reduced in general; *“sufficient”*, where the demand and the generation are in balance; and *“surplus”*, where the actual generation exceeds the demand, leading to surplus power in the grid. Each SD grid is modeled in the same manner, as shown in Fig. 5.

The activities model the events in which the load balance at an SD grid changes. The activity firing times are exponentially distributed to model the time lapse for a rise (or fall) in the sum of generation and load. The distribution is selected by fitting a selection of probability distributions to the time series and choosing the one with the smallest residual sum of squares. This reflects the rates converging to a stable (constant) non-negative value for a given grid as the number of time steps increases, which allows the firing times to be described by an exponential distribution (Balakrishnan 2018). These activities have a very high firing rate in this proof-of-concept example, determined by the load flow calculations on the system described in the “Subsystems” section. Power plants with lower ramping rates (e.g., conventional coal-fired plants) would decrease the activity rate. Further research could investigate the impact of decreased activity rates and of other distribution functions. Two important use cases remain to be analysed: first, the application of the concept of curative redispatch to lower voltage levels with many small generation units at a very high temporal resolution; and second, the integration of conventional power plants with low ramping rates.
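The behaviour of one SD grid with exponentially distributed rise and fall activities can be sketched as a small discrete event simulation. This is a minimal sketch under stated assumptions: the function name, the rate values and the choice of starting in *sufficient* are hypothetical, and the rates in the paper come from load flow calculations that are not reproduced here.

```python
import random
from typing import Dict

STATES = ["deficit", "sufficient", "surplus"]


def simulate_load_balance(rise_rate: float, fall_rate: float,
                          horizon: float, seed: int = 0) -> Dict[str, float]:
    """Return the fraction of time one SD grid spends in each load balance state."""
    rng = random.Random(seed)
    t = 0.0
    level = 1  # index into STATES; start in "sufficient" (assumption)
    time_in = {state: 0.0 for state in STATES}
    while True:
        # Collect the enabled activities: "rise" moves up, "fall" moves down.
        rates = {}
        if level < 2:
            rates[+1] = rise_rate
        if level > 0:
            rates[-1] = fall_rate
        # The time to the next firing is exponential with the summed rate.
        dwell = rng.expovariate(sum(rates.values()))
        if t + dwell >= horizon:
            time_in[STATES[level]] += horizon - t
            break
        time_in[STATES[level]] += dwell
        t += dwell
        # Choose which enabled activity fires, proportionally to its rate.
        step = rng.choices(list(rates), weights=list(rates.values()))[0]
        level += step
    return {state: time_in[state] / horizon for state in STATES}


# Hypothetical rates (per hour) and horizon (hours).
fractions = simulate_load_balance(rise_rate=2.0, fall_rate=2.0, horizon=1000.0)
```

Lowering `rise_rate` and `fall_rate` in this sketch corresponds to the slower ramping behaviour mentioned for conventional power plants.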

#### Overload model

The process of redispatch is initiated when the loading of a branch exceeds 60%. Overloading of branches occurs when the load flow is higher than expected due to generation on one end of a branch and load on the other end. This can be the result either of the behaviour of load and generation as given in the time series or of inappropriate IC control. Such a situation is modeled in Fig. 5, where two places, *Branch load ok* and *Branch overloaded*, determine the state of the branch between the two SD grids. For the example in this paper, the focus is on branch *RB* in Fig. 1.

The model depicts a situation in which the load balance in *R* is *surplus* and in *B* is *deficit*, so that the branch *RB* is overloaded. The branch may also be overloaded when the load balance in *B* is *surplus* and in *R* is *deficit*. In either situation, the activity *Branch overload* fires. The exponential distribution of the activity firing time is chosen in the same manner as for the rise and fall activities of the PS model. The rate is determined by load flow simulations on the SimBench system (see “Subsystems” section). A marking in the place *Branch overloaded* indicates that the branch is overloaded. The loading on the branch is relieved when the load reduces. The activity *Branch relieved* fires when the load balance in SD grids *R* and *B* is neither *deficit* nor *surplus*, i.e., it is *sufficient*. The firing time of this activity is exponentially distributed, like those of the activities in the PS model. Consequently, a marking in the place *Branch load ok* indicates that the branch loading is normal.
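The enabling conditions of the two timed activities described above can be written as a single predicate. This is an illustrative sketch: the function name and the string encoding of the load balance are assumptions, and the firing delay itself is omitted.

```python
from typing import Optional


def enabled_branch_activity(r_balance: str, b_balance: str,
                            overloaded: bool) -> Optional[str]:
    """Return the name of the enabled timed activity for branch RB, or None."""
    # One grid in surplus and the other in deficit, in either direction.
    opposite = {r_balance, b_balance} == {"surplus", "deficit"}
    both_sufficient = r_balance == "sufficient" and b_balance == "sufficient"
    if not overloaded and opposite:
        return "Branch overload"
    if overloaded and both_sufficient:
        return "Branch relieved"
    return None
```

For example, `enabled_branch_activity("surplus", "deficit", overloaded=False)` returns `"Branch overload"`, while two grids in *surplus* enable neither activity.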

#### ICT model

The state of the ICT system that supports redispatch is modeled by the SAN shown in Fig. 6. The model is divided into two parts: one details the ICT system availability and the other details a resilience mechanism, i.e., redundancy in servers. The ICT system components, from sensors to servers in the control center, either function normally or have failed, i.e., partial failures are not considered. ICT components do not have a uniform failure rate throughout their lifespan (Torell and Avelar 2004). The rates used in this paper model the useful life phase of the failure rate curve in (Torell and Avelar 2004), which features a quasi-constant failure rate leading to exponentially distributed failure times. As repair times for failures may vary, the mean time to repair (MTTR) is used to calculate the repair rate (Matz et al. 2002; Theristis and Papazoglou 2013). Since the MTTR is a constant, the repair rate $\mu$ is also a constant. Hence, exponentially distributed times are used for the failure and repair activities. The rate of the failure activities, $\lambda$, is given by $\lambda = \lambda_1 \times \lambda_2 \times \dots \times \lambda_n$, where $\lambda_n$ denotes the failure rate of the $n$th component (Stimson 2017).

In this example, the redispatch service is performed by a server at the control center, where a primary server is backed up by a redundant server. The server state is depicted in the *server model*, while the state of the ICT system service is shown in the *ICT model*. This model is used to study the impact of server failure on the process of redispatch. The models may be modified to assess the impact of implementation-specific parameters and failures on system performance under various conditions, and the respective state variables, such as sensor state or controller state, can be added.
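The relation between a constant MTTR, the repair rate and the resulting availability can be written out for the simplest case. The expressions below assume a single repairable component with two states (working, failed) and constant rates; this simplification and the numerical values are assumptions for illustration and are not taken from the paper.

```latex
\begin{align*}
  \mu &= \frac{1}{\mathrm{MTTR}}, &
  \mathrm{MTTF} &= \frac{1}{\lambda}, &
  A &= \frac{\mu}{\lambda + \mu}
     = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}} \\
  % Hypothetical values: MTTR = 4 h, lambda = 10^{-4} per hour
  \mu &= \frac{1}{4\,\mathrm{h}} = 0.25\,\mathrm{h}^{-1}, &
  \lambda &= 10^{-4}\,\mathrm{h}^{-1}, &
  A &= \frac{0.25}{0.25 + 0.0001} \approx 0.9996
\end{align*}
```

Here $A$ is the steady-state probability that the component is working, which is the quantity a time-averaged reward on an availability place would estimate in this two-state case.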

The ICT system model identifies the availability of ICT for the service, as shown in Fig. 6. A transmission failure occurs if the server or networking devices fail to send signals, for example, due to a hardware or software failure. Such a failure could be caused by random failures, systematic failures, failed software patches, environmental damage or even malicious attacks (such as denial of service). These failures could cause the redispatch process to stall or fail. The SAN can be expanded to cover several failures impacting ICT by adding more activities and places. However, the goal remains the same, i.e., to determine whether or not the ICT system is available to the PS. Redundancy of servers is characterised by the number of markings in the server model. The presence of markings in the place *server ok* indicates that at least one server is available. In this paper, the exact origin of the failure is not considered; the resulting impact on the ICT system service is of interest. Depending on whether a marking is present in *OK* or in *Transmission failure*, the ICT system is either *ICT available* or *Not available*. The activity *status check* is used to determine the state of the ICT system (Rajarajan et al. 2012). Its firing time is modeled as exponentially distributed, as the check recurs at fixed time intervals and hence has a constant rate. The ICT system status check is set to run every hour (Rajarajan et al. 2012).
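The effect of server redundancy on availability can be sketched by simulating the number of tokens in *server ok*. This is a minimal sketch under stated assumptions: servers are assumed to fail and be repaired independently with exponential times, the function name and all rate values are hypothetical, and the *status check* activity is left out.

```python
import random


def simulate_server_pool(n_servers: int, fail_rate: float, repair_rate: float,
                         horizon: float, seed: int = 0) -> float:
    """Return the fraction of time with at least one token in 'server ok'."""
    rng = random.Random(seed)
    t = 0.0
    ok = n_servers            # tokens in "server ok"; all servers start working
    available_time = 0.0
    while True:
        # Assumption: each working server fails and each failed server is
        # repaired independently, so the rates scale with the token counts.
        total_fail = ok * fail_rate
        total_repair = (n_servers - ok) * repair_rate
        total = total_fail + total_repair
        dwell = rng.expovariate(total)
        end = min(t + dwell, horizon)
        if ok > 0:
            available_time += end - t
        if end >= horizon:
            break
        t = end
        # Decide whether the next event is a failure or a repair.
        if rng.random() < total_fail / total:
            ok -= 1
        else:
            ok += 1
    return available_time / horizon


# Hypothetical rates (per hour): compare a single server with a redundant pair.
single = simulate_server_pool(1, fail_rate=0.01, repair_rate=0.25, horizon=10000.0)
redundant = simulate_server_pool(2, fail_rate=0.01, repair_rate=0.25, horizon=10000.0)
```

Comparing `single` and `redundant` gives a rough indication of how much an additional marking in the server model contributes to the availability seen by the redispatch service.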

#### Redispatch model

The model shown in Fig. 7 describes the process of redispatch involving ICs for branch *RB*. The places *R surplus*, *R deficit* and *ICT available* determine the state of the respective IC, i.e., controllable generation at *R*. Similarly, the places *B surplus*, *B deficit* and *ICT available* determine the controllable generation at *B*. The places connected to the input gate *IG1* indicate the requirements for redispatch to occur. The activity *Redispatch* is an instantaneous activity, indicated by the thin vertical rectangle in Fig. 7, i.e., it fires instantly after the gate has checked that the places required for redispatch have a marking. An instantaneous activity is used to highlight the need for redispatch to relieve the branch loading. Redispatch may occur when grid *R* faces a surplus and grid *B* faces a deficit, or vice versa. In either case, the activity fires, as redispatch is required to relieve branch *RB*. When the activity fires, redispatch is successful, which results in grids *R* and *B* having a sufficient load balance and branch *RB* being relieved. In future work, the reliability of the redispatch process could be a topic of interest, and failure paths may be added to show failures during the redispatch process.
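The gate logic of the redispatch model can be sketched as a predicate and a marking update. This sketch makes several assumptions that are not confirmed by the text: that *IG1* requires a marking in *ICT available*, that the outcome is written to hypothetical places `R sufficient` and `B sufficient`, and that the branch places are updated by the same firing.

```python
from typing import Dict

Marking = Dict[str, int]


def ig1_enabled(m: Marking) -> bool:
    """Input gate IG1 (assumed): opposite imbalances at R and B, and ICT available."""
    r_to_b = m["R surplus"] > 0 and m["B deficit"] > 0
    b_to_r = m["R deficit"] > 0 and m["B surplus"] > 0
    return (r_to_b or b_to_r) and m["ICT available"] > 0


def redispatch(m: Marking) -> None:
    """Effect of firing: both grids become sufficient and branch RB is relieved."""
    for grid in ("R", "B"):
        m[f"{grid} surplus"] = 0
        m[f"{grid} deficit"] = 0
        m[f"{grid} sufficient"] = 1
    m["Branch overloaded"] = 0
    m["Branch load ok"] = 1


def settle(m: Marking) -> int:
    """Fire the instantaneous activity, in zero time, for as long as it is enabled."""
    fired = 0
    while ig1_enabled(m):
        redispatch(m)
        fired += 1
    return fired


# Example marking: R in surplus, B in deficit, ICT available, branch overloaded.
marking: Marking = {
    "R surplus": 1, "R deficit": 0, "R sufficient": 0,
    "B surplus": 0, "B deficit": 1, "B sufficient": 0,
    "ICT available": 1, "Branch overloaded": 1, "Branch load ok": 0,
}
fired = settle(marking)
```

If `ICT available` holds no marking in this sketch, `settle` fires nothing and the branch stays overloaded until a timed activity changes the marking, which is how a dependency of the PS on the ICT system would show up in the composed model.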

The overload and redispatch submodels are higher-level models than the PS and ICT atomic models and are specific to the redispatch service. These submodels can be adapted to other PS services as well as to other ICT implementations to assess IC performance when specific parameters are impacted. Furthermore, multiple services may be combined to form a complete model of an entire ICT-reliant PS. The SAN models shown consist of a small number of places because of the example service chosen for this paper. State-based modeling approaches generally face the problem of state-space explosion, which the method presented in this paper circumvents: using a hierarchy of atomic models limits the number of states in each submodel. To scale the approach to larger systems, the model hierarchy may be expanded, and more atomic models may be constructed and reused for multiple intermediate-level models. For a given use case, only the relevant atomic models need to be used instead of an entire monolithic model.