Currently, five relevant research topics have been addressed: surrogate modeling, energy system simulation, correlation modeling, model composition and uncertainty analysis. The state of the art and related work on these topics are presented in this section.

Surrogate modeling has its origins in, and is part of, the statistical design of experiments, which is a domain-independent tool for describing the observed behavior of a system. There is extensive literature on the subject; for deeper insight, the reader is referred to Response Surface Methodology by Myers et al. (Myers et al., 2016), which gives a comprehensive account of the state of the art. A variety of practical applications of surrogate models can be found. In the research project D-Flex,^{Footnote 1} the integration of renewable energy resources based on a centrally controlled load and generation management approach was compared to a decentralized approach. The evaluation was model-based and carried out as part of a scenario analysis. To estimate the required computation time, a benchmark scenario with 70,000 units to be simulated was defined and run for a simulated time of 1 day, 1 week and 1 month. The latter took almost 5 days to complete. Surrogate models successfully reduced the computation time and thus made it possible to carry out the planned evaluations, with several scenarios of 1 year each, within the available time. Dalal et al. investigated outage scheduling for components of the power system (Dalal et al., 2018). Outage scheduling is necessary to organize maintenance and replacement activities of components. They presented a framework to assess outage schedules and proposed an optimization method to create a schedule for a list of required outages. In outage scheduling, several future scenarios are evaluated in terms of feasibility. The authors considered the direct evaluation of a large number of possible scenarios impracticable and therefore used machine learning to generate a surrogate model that evaluates these scenarios.

An increasing number of distributed energy resources means that not only a few large energy generators have to be controlled, but also many small ones. In addition, there are consumers whose consumption can be controlled in terms of time or quantity. The coordination of all these units requires a functioning communication network. The combination of the power grid and information and communication technology is called a smart grid. Since developing and testing technologies and algorithms for smart grids is practically not feasible in the real power grid, simulation is the tool of choice. Steinbrink et al. (Steinbrink et al., 2017) gave an overview of state-of-the-art simulation-based approaches, which is summarized in the following. To simulate the individual components of the smart grid, simulation models of these components are required, some of which can be very complex. These models are often built by domain experts for their preferred simulation environment. To couple all these simulation models and environments, co-simulation can be used, i.e. each simulator only needs to implement the interface of the co-simulation framework, which handles the communication among the different simulators. Steinbrink et al. conclude that co-simulation is one of three suitable tools for smart grid simulation. They also point out that future research needs to include the development of surrogate models to improve simulation performance. The other two simulation tools, multi-domain simulation as well as real-time simulation and hardware-in-the-loop, will not be discussed here.
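The coupling idea behind co-simulation can be illustrated with a minimal sketch. It is not the interface of any particular co-simulation framework: the `Simulator` protocol, the three toy simulators and the function `run_cosimulation` are hypothetical names chosen for illustration, and the power values are made up.

```python
from typing import Dict, List, Protocol


class Simulator(Protocol):
    """Hypothetical common interface every coupled simulator implements."""

    def step(self, time: int, inputs: Dict[str, float]) -> Dict[str, float]: ...


class PVPlant:
    def step(self, time: int, inputs: Dict[str, float]) -> Dict[str, float]:
        # Toy generation profile: 3 kW between hour 8 and hour 16, else 0 kW.
        return {"p_gen_kw": 3.0 if 8 <= time % 24 < 16 else 0.0}


class Household:
    def step(self, time: int, inputs: Dict[str, float]) -> Dict[str, float]:
        # Toy constant load of 1 kW.
        return {"p_load_kw": 1.0}


class GridNode:
    def step(self, time: int, inputs: Dict[str, float]) -> Dict[str, float]:
        # Net power drawn from the grid: load minus local generation.
        net = inputs.get("p_load_kw", 0.0) - inputs.get("p_gen_kw", 0.0)
        return {"p_net_kw": net}


def run_cosimulation(sources: List[Simulator], sink: Simulator, steps: int) -> List[float]:
    """Step all source simulators, pass their outputs to the sink, collect results."""
    results = []
    for t in range(steps):
        exchanged: Dict[str, float] = {}
        for sim in sources:
            exchanged.update(sim.step(t, {}))
        results.append(sink.step(t, exchanged)["p_net_kw"])
    return results


net = run_cosimulation([PVPlant(), Household()], GridNode(), steps=24)
```

The simulators know nothing about each other; only the coordinating loop moves data between them, which is the role the text assigns to the co-simulation framework.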

To model dependencies between two random variables, different correlation functions can be used. Linear dependency is often described with the Pearson correlation, which returns a value between −1 and 1. Positive correlation implies that both variables simultaneously attain high or low values. Negative correlation, on the other hand, implies that when the first variable attains a high value, the second attains a low value. However, dependencies can also be non-linear, and the Pearson correlation is too restricted to model these. Such dependencies of two or more random variables can be fully described with multivariate or joint distribution functions. When a large number of dependent random variables is expected, a partial correlation network can be useful. Partial correlation is the dependency between two random variables after removing the influence of the other variables. A network of partial correlations visualizes the dependencies and is used, e.g., in psychological science (Epskamp & Fried, 2018).
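The difference between these measures can be made concrete with a small numerical sketch. The synthetic data and all variable names are assumptions made for illustration; the partial correlations are obtained from the inverse covariance (precision) matrix, which is one standard way to compute them.

```python
import numpy as np


def pearson(x: np.ndarray, y: np.ndarray) -> float:
    """Pearson correlation coefficient of two samples."""
    return float(np.corrcoef(x, y)[0, 1])


def partial_correlations(data: np.ndarray) -> np.ndarray:
    """Partial correlation matrix for data of shape (n_samples, n_variables).

    Entry (i, j) is the correlation of variables i and j after removing
    the linear influence of all remaining variables.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


rng = np.random.default_rng(0)
n = 5000

# x and y both depend on a common driver z, but not on each other.
z = rng.normal(size=n)
x = z + 0.5 * rng.normal(size=n)
y = z + 0.5 * rng.normal(size=n)

r_xy = pearson(x, y)  # clearly positive, induced by z
p_xy = partial_correlations(np.column_stack([x, y, z]))[0, 1]  # close to zero

# A purely non-linear dependency that the Pearson correlation misses.
u = rng.uniform(-1.0, 1.0, size=n)
r_nonlinear = pearson(u, u**2)  # close to zero although u**2 is a function of u
```

In this sketch the Pearson correlation of `x` and `y` is high only because of the shared driver `z`, which the partial correlation removes, while the last case shows a full functional dependency with a Pearson correlation near zero.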

The basic idea of model composition is to utilize dependencies between two random variables, which can come from the same model or from different models. In her PhD thesis, Blank proposed a method to assess the reliability of coalitions of renewable power units for the provision of ancillary services (Blank, 2015). The composition of such a coalition is connected to the planning of how much power the units produce and when. The units considered by Blank are wind and photovoltaic systems that are located spatially close to each other. This is relevant for risk analysis, as it can be assumed that dependencies exist between the forecasting errors of the units. These dependencies can be modeled by correlations or, in non-linear cases, by joint distributions. Another approach for model composition is the so-called cokriging method. Han et al. (Han et al., 2010) adapted the general idea of cokriging and used it for variable-fidelity modeling, i.e. combining two datasets describing the same model. The assumption is that one of these datasets comes from a high-fidelity model, which is expensive to compute, so that this dataset contains only few samples. The other dataset is produced by a low-fidelity model and has many more samples than the first one. Cokriging interpolates between these two datasets and thereby aggregates the low-fidelity and the high-fidelity model.
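The benefit of combining many cheap samples with few expensive ones can be sketched as follows. Note that this is not cokriging and not the method of Han et al.: it is a much simpler additive-correction scheme, in which a linear discrepancy term fitted at the high-fidelity points is added to an interpolated low-fidelity model. Both model functions are hypothetical stand-ins.

```python
import numpy as np


def high_fidelity(x: np.ndarray) -> np.ndarray:
    """Hypothetical expensive model (stand-in for a costly simulation)."""
    return np.sin(2.0 * np.pi * x) + 0.5 * x


def low_fidelity(x: np.ndarray) -> np.ndarray:
    """Hypothetical cheap model that misses the linear trend."""
    return np.sin(2.0 * np.pi * x)


x_low = np.linspace(0.0, 1.0, 41)    # many cheap samples
x_high = np.array([0.0, 0.5, 1.0])   # few expensive samples


def low_surrogate(x: np.ndarray) -> np.ndarray:
    """Piecewise-linear interpolation of the low-fidelity samples."""
    return np.interp(x, x_low, low_fidelity(x_low))


# Fit a linear discrepancy between high-fidelity data and the low-fidelity surrogate.
delta = high_fidelity(x_high) - low_surrogate(x_high)
correction = np.polyfit(x_high, delta, deg=1)


def combined(x: np.ndarray) -> np.ndarray:
    """Low-fidelity surrogate plus fitted discrepancy."""
    return low_surrogate(x) + np.polyval(correction, x)


x_test = np.linspace(0.0, 1.0, 201)
truth = high_fidelity(x_test)
err_combined = float(np.max(np.abs(combined(x_test) - truth)))
# Baseline: interpolate the three high-fidelity samples only.
err_high_only = float(np.max(np.abs(np.interp(x_test, x_high, high_fidelity(x_high)) - truth)))
```

With only three high-fidelity samples, the baseline cannot capture the oscillation at all, whereas the combined model inherits the shape from the low-fidelity data and the trend from the high-fidelity data.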

In his PhD thesis, Steinbrink (Steinbrink, 2017) developed a modular concept for uncertainty quantification in smart grid co-simulation. A prototypical implementation for the co-simulation framework mosaik^{Footnote 2} was also provided and applied to energy system simulations of different sizes. To quantify the uncertainty of a model, sample data from simulation steps is needed, and a higher number of samples can improve the accuracy of the uncertainty quantification. The author also utilizes simple interpolation models as surrogates to reduce the number of required samples compared to a Monte Carlo sampling approach. Wilson et al. (Wilson et al., 2018) investigated a computer model of the UK’s electricity supply with regard to uncertainty. This model calculates electricity price projections from 2010 to 2030 and uses uncertain inputs, such as future energy demand. The aim was to quantify the uncertainty of the computer model caused by these and other influencing factors. Since a single evaluation of the model took up to 1 hour, the authors used a Bayesian linear model as a surrogate model. In this way, the number of necessary computer model evaluations could be greatly reduced, although this added another source of uncertainty. This surrogate model was used together with a probability distribution over the inputs to study the uncertainty of the overall model. The authors conclude that surrogate models are a useful approach for the quantification of uncertainties, especially if the number of evaluations of the original model is to be kept low for time reasons.
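The general pattern of fitting a surrogate from a few model runs and then propagating an input distribution through it can be sketched as follows. This is not the Bayesian linear model of Wilson et al.: an ordinary least-squares polynomial is swapped in for brevity, and the function `expensive_model`, its coefficients and the assumed demand distribution are purely hypothetical.

```python
import numpy as np


def expensive_model(demand: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a slow simulation mapping demand to price."""
    return 20.0 + 0.8 * demand + 0.01 * demand**2


rng = np.random.default_rng(1)

# Step 1: run the expensive model only a handful of times.
train_x = np.linspace(30.0, 70.0, 8)
train_y = expensive_model(train_x)

# Step 2: fit a cheap surrogate (least-squares quadratic, an assumption).
surrogate = np.poly1d(np.polyfit(train_x, train_y, deg=2))

# Step 3: propagate an assumed input distribution through the surrogate.
demand_samples = rng.normal(loc=50.0, scale=5.0, size=100_000)
price_samples = surrogate(demand_samples)

mean_price = float(price_samples.mean())
std_price = float(price_samples.std())
```

Only eight evaluations of the slow model are needed here, while the 100,000 Monte Carlo samples are evaluated on the surrogate. In a real application the surrogate is only an approximation, so its own error has to be accounted for as an additional source of uncertainty, as noted in the text.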

Surrogate models are an established tool to speed up simulations, and there are many applications in the energy domain that make use of them. There is also work that addresses the composition of different units or models by determining interdependencies and correlations. However, a methodology is still missing that uses this dependency information to build a surrogate model comprising two or more simulation models. The contribution of this PhD project is to link these topics and to derive a methodology that closes this gap.