
Evaluation of neural networks for residential load forecasting and the impact of systematic feature identification


Energy systems face challenges due to climate change, distributed energy resources, and political agendas. These challenges particularly affect distribution system operators (DSOs), who are responsible for ensuring grid stability. Accurate predictions of the electricity load can help DSOs better plan and maintain their grids. The study aims to test a systematic data identification and selection process to forecast the electricity load of Danish residential areas. The five-ecosystem CSTEP framework maps relevant independent variables on the cultural, societal, technological, economic, and political dimensions. Based on the literature, a recurrent neural network (RNN), long short-term memory network (LSTM), gated recurrent unit (GRU), and feed-forward network (FFN) are evaluated and compared. The models are trained and tested using different data inputs and forecasting horizons to assess the impact of the systematic approach and the practical flexibility of the models. The findings show that the models achieve similar performance of around 0.96 adjusted R2 score and 4–5% absolute percentage error for the 1-h predictions. Forecasting 24 h gave an adjusted R2 of around 0.91 and increased the error slightly to 6–7% absolute percentage error. The impact of the systematic identification approach depended on the type of neural network, with the FFN showing the highest increase in error when removing the supporting variables. The GRU and LSTM did not rely on the identified variables, showing minimal changes in performance with or without them. The systematic approach to data identification can help researchers better understand the data inputs and their impact on the target variable. The results indicate that a focus on curating data inputs affects the performance more than choosing a specific type of neural network architecture.


Energy systems face challenges due to climate change, distributed energy resources, and political agendas. For instance, in Denmark, carbon emissions should be reduced by 70% by 2030, with the goal for 2050 being carbon neutrality (Danish Energy Agency 2022a; Ma and Jørgensen 2018). To achieve this goal, the Danish government has introduced initiatives to accelerate the energy system transition to a total reliance on renewable energy sources. Among the initiatives are state-of-the-art energy islands, investments in technologies such as Power-to-X and Carbon Capture, and a green transition of the industry (Danish Energy Agency 2022b). However, the changes to the energy system will lead to an increasing number of distributed energy resources (DERs), introducing new challenges, such as grid balancing (Ma et al. 2017, 2019a; Billanes et al. 2017). In addition, the electrification of vehicles and the heating of households through heat pumps increase the overall electricity consumption (Ma et al. 2021; Fatras et al. 2021). These challenges are significant to distribution system operators (DSOs), who are responsible for the electricity grids (Ma et al. 2016; Christensen et al. 2021). Furthermore, DSOs face many other challenges, e.g., the resilience of the grid after natural disasters (Hu et al. 2021), an increasing number of DERs (Sauter et al. 2017), the security of supply (Ma et al. 2019b), and the cost of grid maintenance and upgrades (Gören et al. 2022).

There are three types of electricity consumers: residential, commercial, and industrial (Billanes et al. 2018), and in many cases, they are located separately. Households make up around 12% of the total energy accounts and close to 13% of the emission accounts of Denmark (Statistics Denmark 2022). During peak consumption hours, households account for 35% of the total electricity load (Andersen et al. 2017). Furthermore, the adoption of DERs such as photovoltaics, electric vehicles, and heat pumps influences households’ electricity consumption patterns, potentially resulting in grid overloads (Christensen et al. 2019).

Thus, it is important for DSOs to understand the state of their grid in the short and long term to ensure operational quality, plan maintenance, and identify areas in the grid for renovations or investments. Some research has pursued accurate forecasts on short- to long-term horizons by applying machine learning (ML) and deep learning (DL) methods. Several types of neural networks, ML algorithms, and hybrids have been tested with excellent results. Furthermore, the electricity load forecasts have been tested with various independent variables and applications (Vanting et al. 2021).

However, in the literature, the independent variables are not systematically identified beforehand, often leading to the questions: why were the variables chosen in the first place, and how do they relate to the target variable? Moreover, the argument for specific supporting data does not appear until the features are analyzed for selection criteria such as correlation analysis (Friedrich and Afshari 2015; Pindoriya et al. 2010; Vonk et al. 2012). Additionally, the related literature does not explain the composition of the electricity load, i.e., the sources of electricity consumption in the aggregated load data, which may lead to a better understanding of the performance of the proposed models. Based on the challenges the DSOs face regarding the distribution grid, this study seeks to improve the prediction accuracy of load forecasts using a systematic data identification approach.

To fill the research gap, this paper aims to identify variables related to residential area aggregated electricity load systematically. The identified variables will be used to forecast the aggregated electricity consumption of two residential areas in Denmark. The systematic identification and subsequent selection will be made using the CSTEP framework (Ma 2022), which maps data within an ecosystem in several dimensions. The identification ensures that any possible data is accounted for and a strong foundation for supporting data is available, which was missing in related works. The impact of the systematic identification on the model performance will be assessed by testing and evaluating multiple types of neural networks based on related works. Moreover, the data is analyzed using the K-Means clustering algorithm to investigate the composition of the electricity load before it is aggregated.

Furthermore, to determine the impact of different electricity consumption sources, such as heat pumps and electric heating, the performance of the selected neural networks will be compared on subsets of the data set containing households with and without electric-based heating. The types of neural networks are based on the applications in the literature. The most popular models included in this paper are feed-forward networks (FFN), recurrent neural networks (RNN), and Long Short-Term Memory (LSTM) networks. Additionally, because the related publications have rarely applied Gated Recurrent Units (GRU), it will also be used in this experiment. Finally, to test the flexibility of the neural networks, each tuned model will be used to predict a single-step (1 h) and 24-step (24 h) of the electricity load.

This paper is structured as follows. First, the literature related to electricity load forecasting is presented. Afterward, the data processing and analysis is described in the methodology section, including the systematic identification and selection using the CSTEP framework. Thirdly, the forecasting results of the models are presented, compared, and analyzed. Finally, the impact of the systematic identification approach is discussed based on the results of the forecasts.

Related works

Electricity load forecasting using machine learning algorithms and deep neural networks has been a major area of research in the last decade. The increasing amount of data available and rising interest in artificial intelligence research has led researchers to experiment with different types of networks, algorithms, and hybrids to achieve high accuracies or low errors for their forecasts (Vanting et al. 2021).

Based on the literature, electricity load forecasting can be placed into three horizons: short-, medium-, and long-term (Gebreyohans et al. 2018; Solyali 2020). Short-term forecasting covers predictions from minutes ahead, sometimes referred to as very short-term forecasting, up to 1 week, as seen in Samuel et al. (2020), Houimli et al. (2020), and Yong et al. (2020). Medium-term forecasts start from 1 week and go up to several months or a year (Shirzadi et al. 2021; Salama et al. 2009; Gungor et al. 2020). Finally, long-term horizons are forecasts focused on predicting more than a year, sometimes several decades, depending on the data (Parlos and Patton 1993; Ekonomou 2010; Ghods and Kalantar 2008). Other than the length, each forecasting horizon is characterized by several parameters, including the independent variables, applications of the forecast, and models used for the prediction.

Long-term forecasts leverage socioeconomic data as independent variables and are usually applied to problems concerning larger areas, such as states, provinces, and countries (Elkamel et al. 2020; Tanoto et al. 2011). Furthermore, weather data are used on long-term forecasts for the electricity load of states and countries (Gao et al. 2019). In the literature, weather data includes outdoor temperature, humidity, wind speed and direction, precipitation, and solar irradiation. Moreover, electricity load forecasting on medium-term is applied to larger areas such as countries, states, and residential areas. Variables include weather, electricity prices, and socioeconomic data (Salama et al. 2009; Ilseven and Gol 2017). Short-term forecasts are applied to electricity grids and microgrids, power and substations, residential and office buildings, cities, provinces, and countries, using weather data and temporal features as independent variables (Li et al. 2021; Xu et al. 2019; Panapongpakorn and Banjerdpongchai 2019; Ahmad and Chen 2018; Ruiming 2008). Short-term forecasts are essential to determine if the load exceeds the capacity of a transformer, which can prevent power outages (Dung and Phuong 2019; Giamarelos et al. 2021; Al-Rashid and Paarmann 1996).

Additionally, the short-term forecast can indicate windows for flexibility to achieve sector coupling, leading to a more efficient energy system (Yan et al. 2012; Pramono et al. 2019; Xypolytou et al. 2017). The model selection varies within each forecasting horizon, meaning a single preferred type of model cannot be identified. Instead, researchers have tested several statistical methods, machine learning algorithms, and different types and combinations of neural networks to reach accurate predictions, leading to a highly diverse research field with a wide range of applications and independent variables.

In the literature, several types of neural networks have been applied. One network type is the recurrent neural network (RNN), designed to work with sequential data. The strength of an RNN is that it can take information from prior inputs together with the input at a given timestamp to better decide on the output. Furthermore, one of the more popular networks is the Long Short-Term Memory (LSTM) network, a type of RNN specifically designed to deal with long data sequences. It was first introduced in 1997 by Schmidhuber and Hochreiter and improved upon the regular RNN by dealing with the vanishing gradients problem (Hochreiter and Schmidhuber 1997). Gated Recurrent Units (GRUs) (Cho et al. 2014), which are another type of specialized RNN similar to the LSTM network, have also been applied to short-term load forecasting (Ribeiro et al. 2020; Zhu et al. 2019). Finally, a fully connected feed-forward network has also been a popular choice to forecast electricity load in the literature. Researchers have experimented with different configurations and combinations of networks and algorithms to improve forecast accuracy. While many apply regular neural networks, some combine several into hybrid ones, as seen in Panapongpakorn and Banjerdpongchai (2019) and Pramono et al. (2019). Others transform the forecast into an image recognition problem and use state-of-the-art convolutional neural networks to predict the load (Li et al. 2017; Sadaei et al. 2019).


This paper systematically identifies and selects data relevant to forecasting the electricity load of residential areas to build a strong foundation of supporting data to improve the performance metrics of the forecasting model. To identify the possible features, the CSTEP framework proposed in Ma (2022) is used to analyze and evaluate an ecosystem by mapping the features to the five influential dimensions: Cultural, Societal, Technology, Economy and Finance, and Policies and Regulation. For this paper, the CSTEP framework is extended with different data variables dimensions to include supporting, embedded, exogenous variables and the impact of the variables on the electricity load. Supporting variables include sensor readings and statistical data, i.e., weather and climate measurements or electricity prices. Embedded variables are data that can be embedded in the target variable or other data sources, for example, temporal features or the sun's position. Exogenous variables are considered data that cannot be directly given as an input to a model but still impact the target or supporting variables. Finally, the impact on the target variable describes how each dimension and the different types of variables affect the increasing or decreasing electricity consumption of residential areas.

So far, no literature has systematically identified and selected the relevant data using the CSTEP framework. Researchers often rely on correlation analysis of features or tree-based methods for determining feature importance to decide on independent variables for multivariate forecasting. Before identifying the CSTEP variables, the electricity load is analyzed to examine the composition of the aggregated load. This step aims to better understand the performance of the model during inference.

Furthermore, this can help make the black box of neural networks more transparent by understanding the inputs better. The analysis of the electricity load will be done using descriptive statistics and by clustering the daily load profiles of each household in the area to investigate the different load patterns. The algorithm applied for the clustering is K-Means using dynamic time warping as the distance measure. Afterward, the identified CSTEP variables are examined for data availability and sourced for the subsequent data analysis. Next, the electricity load is used to engineer temporal features and lagged electricity load features. Finally, all selected features undergo a feature selection process using correlation coefficients and tree-based methods for feature importance.
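The distance measure named above can be illustrated with a minimal sketch. The code below is not the implementation used in this paper, which is not specified in the text; it is a plain dynamic time warping (DTW) distance with absolute differences as the local cost, together with the assignment step of K-Means. The function names `dtw_distance` and `assign_to_nearest` and the example profiles are hypothetical. A complete K-Means with DTW would also need a centroid update step (for example, a DTW-based averaging scheme), which is omitted here.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Uses the absolute difference as local cost and no warping window.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to an already used b element
                cost[i][j - 1],      # b[j-1] aligned to an already used a element
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def assign_to_nearest(profiles, centroids):
    """Assignment step of K-Means: index of the DTW-closest centroid per profile."""
    return [
        min(range(len(centroids)), key=lambda k: dtw_distance(p, centroids[k]))
        for p in profiles
    ]


# Hypothetical, shortened load profiles (a real daily profile has 24 hourly values).
centroids = [[0, 1, 2, 1, 0], [2, 2, 2, 2, 2]]
profiles = [[0, 0, 1, 2, 1], [2, 2, 3, 2, 2]]
labels = assign_to_nearest(profiles, centroids)
```

Because DTW allows elements to be repeated during alignment, two profiles with the same shape but a shifted peak are treated as close, which is the motivation for preferring it over a pointwise distance when comparing daily load curves.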

After the data processing and analysis section, the evaluation and selection of neural networks are conducted based on related works and the research gap. This paper tests the performance of four separate neural networks on the aggregated electricity load. Baseline models of a feed-forward network (FFN), recurrent neural network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) are established and used as the starting point to tune hyperparameters and select the optimal architecture. Each tuned model is trained on the aggregated load data with and without including the selected CSTEP variables and used to forecast a single hour and 24 h. Then, each model is also trained on aggregated electricity consumption data containing households exclusively with heat pumps or electric heating.

To assess the performance of the models in this paper, four different metrics will be used, presented in the equations below.

Mean Absolute Error

$$MAE= \frac{1}{n} \sum_{i=1}^{n}\left|{x}_{i}^{pred}-{x}_{i}^{true}\right|$$

Mean Absolute Percentage Error

$$MAPE=\frac{100\mathrm{\%}}{n}\sum_{i=1}^{n}\left|\frac{{x}_{i}^{true} - {x}_{i}^{pred}}{{x}_{i}^{true}}\right|$$

Root Mean Squared Error

$$RMSE={\left[\sum_{i=1}^{n}\frac{{\left({x}_{i}^{pred}-{x}_{i}^{true}\right)}^{2}}{n} \right]}^\frac{1}{2}$$

Adjusted R2 Score

$${R}^{2}=1-\frac{\sum_{i=1}^{n}{\left({x}_{i}^{true}-{x}_{i}^{pred}\right)}^{2}}{\sum_{i=1}^{n}{\left({x}_{i}^{true}-{\overline{x} }^{true}\right)}^{2}}$$
$$Adj{R}^{2}=1-\frac{\left(1-{R}^{2}\right)\times \left({n}_{samples}-1\right)}{\left({n}_{samples}-{p}_{variables}-1\right)}$$
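As a minimal sketch, the four metrics can be written directly from the equations above. The function names and the example values are illustrative and not taken from the paper; `n_variables` corresponds to p_variables in the adjusted R2 equation. Note that MAPE is undefined when a true value is zero.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (true values must be non-zero)."""
    n = len(y_true)
    return 100.0 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(y_true, y_pred, n_variables):
    """R2 penalized by the number of independent variables."""
    n = len(y_true)
    return 1.0 - (1.0 - r2(y_true, y_pred)) * (n - 1) / (n - n_variables - 1)


# Hypothetical hourly loads in kWh and matching predictions.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
scores = {
    "MAE": mae(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "AdjR2": adjusted_r2(y_true, y_pred, n_variables=1),
}
```

MAE and RMSE are expressed in the unit of the load, so they scale with the size of the aggregated load, whereas MAPE and the adjusted R2 are relative measures.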

The CSTEP framework

The CSTEP framework consists of five critical business ecosystem dimensions: Climate, environment, and geographic situation; Societal culture and demographic environment; Technology (infrastructure, technological skills, technology readiness); Economy and finance; and Policies and regulation. Each dimension has several sub-dimensions with specific explanations as defined in Table 1 in Ma (2022). Additionally, the dimensions can be viewed on a macro and micro level based on the focus of the business ecosystem. For instance, the sub-dimensions of Climate, environment, and geographic situation can be divided into a macro level considering the general weather conditions and natural features of a place (climate and geographic situation). Meanwhile, the micro level considers the living, working, and production environment or conditions (environmental situation). The macro and micro levels of a dimension differ depending on the perspective of either the ecosystem or the individual stakeholder, focusing on either the general or specific levels of the business ecosystem (Ma 2022).

Analysis of electricity load

The electricity load data used in this paper is collected from two residential areas in Denmark in connection with a national project called Flexible Energy Denmark (Flexible Energy Denmark 2019). The data ranges from January 1st, 2019, to May 15th, 2022, and includes 211 households after processing and cleaning the data. The main data set includes only households without photovoltaic panels, electric heating, or heat pumps, and without an electric vehicle (EV) that is charged at home. Households with any of these characteristics are separated from those with pure electricity consumption and central or district heating.

The building information is sourced from the Danish building registry, which by law collects information about all buildings in Denmark (Bygnings- og Boligregistret 2022). For EV owners, a different method had to be used, as this information is not registered anywhere. Instead, each household’s data was analyzed to detect possible EV owners by clustering the load to identify outliers using K-Means. Subsequently, the load was searched for minimum–maximum consumption ranges that exceed 7.2 kWh, which is a typical consumption pattern for EV charging. By separating the households that have adopted these DERs, the impact of their load on forecast accuracy can be investigated.
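The range-based search can be sketched as follows. Only the 7.2 kWh threshold comes from the text; the daily window, the `min_days` parameter, the function names, and the example loads are assumptions made for illustration, and the preceding K-Means outlier step is not shown.

```python
EV_RANGE_THRESHOLD_KWH = 7.2  # threshold stated in the text


def daily_ranges(hourly_load, hours_per_day=24):
    """Max minus min consumption for each consecutive block of hours.

    Assumption: ranges are evaluated per calendar day of hourly readings.
    """
    days = (
        hourly_load[i:i + hours_per_day]
        for i in range(0, len(hourly_load), hours_per_day)
    )
    return [max(day) - min(day) for day in days if day]


def is_possible_ev_owner(hourly_load, min_days=1):
    """Flag a household if enough days show a range above the threshold.

    min_days is a hypothetical parameter; the paper does not state how
    many exceedances are required.
    """
    exceedances = sum(
        r > EV_RANGE_THRESHOLD_KWH for r in daily_ranges(hourly_load)
    )
    return exceedances >= min_days


# Hypothetical households: two days of hourly readings in kWh.
with_charging = [0.3] * 24 + [0.3] * 20 + [8.0] * 4
without_charging = [0.3, 1.5] * 24
flags = [is_possible_ev_owner(with_charging), is_possible_ev_owner(without_charging)]
```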

Figure 1 shows each household’s average daily consumption profiles, where the red line indicates the average load within each cluster. The most typical consumption profiles can be seen in Cluster 2 and Cluster 5. Clusters 0 and 3 can be considered outlier profiles, while Clusters 4 and 1 are somewhere in between with equally many households, as seen in the distribution of clusters in Fig. 2.

Fig. 1 Clusters of daily load profiles

Fig. 2 Distribution of daily profile clusters

The average consumption pattern over a year for the two residential areas can be seen in Fig. 3. In Denmark, household consumption usually increases during winter and decreases as summer nears. Many factors can influence the consumption pattern, such as the sun, amount of light, temperature, rain, and wind. The figure also shows a very distinct spike towards the end of Christmas, which is a recurring pattern. These factors point to several supporting variables, for instance, the position of the sun, the weather, the length of days during the year, and special days, such as religious or national holidays.

Fig. 3 Average yearly consumption pattern

Figure 4 shows the average aggregated daily load of the two residential areas. The pattern shows a slight increase during morning hours and a peak at 17:00. The period in the afternoon is essential to forecast correctly, as this is where the grid is challenged by high electricity loads that approach the grid's capacity. Each residential area is connected to a similar type of transformer with a capacity of 400 kWh.

Fig. 4 Average daily load profile of the aggregated load

Identification of CSTEP variables

As described earlier, any supporting data for the electricity load will be identified and mapped using the CSTEP framework. Table 1 shows the relevant variables identified for this research experiment. The variables are based on applications in related literature and from domain experts. The supporting variables include sensor readings or statistical data, such as weather and electricity prices. The embedded variables include data such as holidays, day lengths, demographics, and building information. The exogenous variables are data that cannot directly be used as an input for a model but add additional information about the other variables. The variables in this dimension can help explain irregularities or unexpected results. The final column describes how each CSTEP dimension’s identified data impacts the target variable, which in this case is the electricity consumption of households.

Table 1 Systematically identified CSTEP variables

After the systematic identification, each variable is investigated for availability and feasibility. Using openly available sources, the following CSTEP variables have been collected:

  • Holidays (Denmark)

  • Day lengths

  • Sun azimuth

  • Sun altitude

  • Electricity prices

While many researchers insist on the importance of weather data to support the electricity load forecast (Vanting et al. 2021; Friedrich and Afshari 2015), it is not necessarily meaningful to include it in this experiment. The aggregated electricity load data is collected from two residential areas with some distance between them, meaning local weather data is unavailable. There may be a correlation between some weather data and the electricity load. However, causation cannot directly be determined in such an instance.

Feature selection and analysis

After selecting CSTEP variables, the data is analyzed with the aggregated electricity load using correlation coefficients and feature importance. The coefficients are calculated using Pearson’s R, and the feature importance is the gain from gradient-boosted trees using the Python library XGBoost. Figure 5 shows a correlation heatmap of the coefficients of each variable. No feature is strongly correlated with the electricity load, but day length shows a slight negative relationship and the sun’s azimuth a slight positive relationship.
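For reference, Pearson’s R between one candidate variable and the load can be computed as in the sketch below. The function name and the example values are hypothetical and do not reproduce the coefficients reported in the heatmap.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical values: day length in hours and aggregated load in kWh.
day_length = [7.0, 9.0, 12.0, 15.0, 17.0]
load = [95.0, 90.0, 80.0, 72.0, 70.0]
r = pearson_r(day_length, load)  # negative: longer days, lower load
```

Pearson’s R only captures linear relationships, which is one reason to complement it with a tree-based importance measure.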

Fig. 5 Correlation heatmap of CSTEP variables

Looking at the relative feature importance of each variable in relation to the electricity load, the sun’s azimuth is calculated to be the most important feature, as seen in Fig. 6. The gain signifies the relative contribution of the feature across all decision trees in the gradient boosting model.

Fig. 6 Gradient-boosted feature importance

At this point, each feature has also been analyzed individually for any irregularities. The analysis resulted in a decision to discard the electricity price variable due to a substantial increase in the price in 2022. This increase would only be visible in the test data set, potentially resulting in unexpected predictions, as the increase is not reflected in the electricity consumption. The variable is visualized in Fig. 7.

Fig. 7 Historical electricity prices

In summary, the target variable of the electricity load is analyzed using K-Means clustering to identify different load profiles. The load profiles will give a better understanding of the input data to make the black box of neural networks more transparent. Furthermore, supporting independent variables have been systematically identified, selected, and analyzed using correlation coefficients and feature importance of gradient-boosted trees. Finally, each independent variable was analyzed for missing or broken data, potential irregularities, and seasonal patterns and trends, resulting in discarding the electricity price as an independent variable.

Model selection

The model selection is based on neural networks from related works, which are a fully connected feed-forward neural network (FFN), a recurrent neural network (RNN), and a long short-term memory network (LSTM). Finally, to fill a gap in the literature, a gated recurrent unit (GRU) is also included in the experiments of this paper.

Baseline performance and models

First, a baseline performance of the forecasting problem is conducted using a simple multivariate linear regression model to predict the electricity load based on the CSTEP variables as input. The baseline performance resulted in the metrics seen in Table 2. These baseline metrics are considered the minimum to beat by the proposed models.

Table 2 Baseline performance

Secondly, each selected model is trained and evaluated on the data once without any hyperparameter tuning or feature engineering to assess the base performance of each neural network. From here, the baselines will be iteratively improved by tuning training, data, and model parameters. Table 3 presents the baseline metrics of each model using the CSTEP variables as independent variables for the electricity load. At this point, all models perform equally without any feature engineering or hyperparameter tuning.

Table 3 Baseline neural networks

Model tuning

Each model from Table 3 will undergo a tuning process, where several parameters are tested in different combinations. To do this, the experiment tracking tool Weights and Biases is leveraged to find the best size and combination of the tunable parameters (Biewald 2020). An iterative random search process can be conducted by setting up a training loop that tests all four models, ending with a greedy search. The tunable parameters are seen in Table 4 below. Each tunable parameter has several values that are chosen uniformly and randomly. The feature engineering includes lags from 1 to 168 h, and the temporal features have been encoded cyclically using sine and cosine transformations.
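The two feature engineering steps can be sketched as follows. The text does not state which temporal features were encoded, so the hour of day with a period of 24 is used here as an assumption; the function names and example values are likewise illustrative.

```python
import math


def encode_cyclical(value, period):
    """Map a periodic value (e.g. hour of day) to a (sin, cos) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def lagged_features(load, lags):
    """Build one row of lagged load values per usable time step.

    Rows start at t = max(lags), the first step for which every lag exists.
    """
    max_lag = max(lags)
    return [[load[t - lag] for lag in lags] for t in range(max_lag, len(load))]


# Hour 23 and hour 0 end up close together, unlike with a raw 0-23 encoding.
late_evening = encode_cyclical(23, 24)
midnight = encode_cyclical(0, 24)

# Hypothetical short load series with lags of 1 and 3 hours; the experiment
# described above would use lags such as range(1, 169) on hourly data.
rows = lagged_features([0, 1, 2, 3, 4, 5], lags=[1, 3])
```

The sine and cosine pair is needed because either component alone maps two different hours to the same value.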

Table 4 Tunable hyperparameters

After running several tests and calculating metrics for each model, the best parameters could be found. Table 5 summarizes the tuned parameters for each model. These four tuned models are subsequently trained on data ranging from January 1st, 2019, to May 15th, 2021, and evaluated on the test data from May 15th, 2021, to May 15th, 2022. Each model will be trained four times, resulting in 16 different prediction results: a 1-h forecast using CSTEP variables, a 1-h forecast without CSTEP variables, a 24-h forecast using CSTEP variables, and a 24-h forecast without CSTEP variables.
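The difference between the 1-h and 24-h setups can be illustrated by how the training samples are built. The sketch below assumes a direct multi-step formulation in which the model outputs all 24 values at once; the text does not state whether the 24-h forecasts are direct or recursive, and the function name `make_windows` is hypothetical.

```python
def make_windows(series, n_lags, horizon):
    """Split a series into (input window, target window) pairs.

    Each input holds n_lags consecutive values and each target holds the
    following `horizon` values (1 for the 1-h case, 24 for the 24-h case).
    """
    inputs, targets = [], []
    for start in range(len(series) - n_lags - horizon + 1):
        end_of_input = start + n_lags
        inputs.append(series[start:end_of_input])
        targets.append(series[end_of_input:end_of_input + horizon])
    return inputs, targets


# Hypothetical short series; the experiment would use hourly load values.
series = list(range(10))
x_single, y_single = make_windows(series, n_lags=3, horizon=1)
x_multi, y_multi = make_windows(series, n_lags=3, horizon=2)
```

Windows should be created separately within the training and test periods so that no target value from the test period leaks into training.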

Table 5 Results of the hyperparameter tuning


One-hour forecast

The prediction results of the 1-h forecasts with and without the identified CSTEP variables are presented in Table 6. Overall, the metrics look similar for each model. For example, the lowest error was found using the feed-forward network with CSTEP variables at 3.9064 kWh mean absolute error and the highest adjusted R2 score of 0.9681. However, the same model without the CSTEP variables gives the highest error and lowest adjusted R2 score, while no substantial difference is seen in the recurrent neural networks. This change in performance can indicate that the FFN is more dependent on the CSTEP variables than the recurrent networks.

Table 6 One-hour forecast metrics

Figure 8 visualizes each model’s first week of hourly predictions with the actual load during the period. The models mostly capture the peaks and valleys with some larger errors, especially between the midday and afternoon peaks. Because these predictions look similar, it may be more informative to investigate the performance on specifically challenging days to better assess the models. One such day is Christmas, which usually sees very high peaks in the afternoon to evening hours and different consumption patterns throughout the day. Figure 9 presents the forecast during Christmas 2021, where there is a greater difference in the models’ ability to forecast hourly. The actual load is shaped differently than on a regular day. December 23rd and 25th have much flatter peaks, where the morning and afternoon are similar, while the 24th has a high afternoon to evening peak. The FFN, RNN, and LSTM models cannot capture these peaks as well as on a regular day. However, the GRU predicts the high increase of the afternoon peak surprisingly well. This factor could be another performance criterion to consider when assessing different neural network architectures, as it cannot be seen from the error metrics and adjusted R2 scores.

Fig. 8 First week of 1-h forecasts

Fig. 9 One-hour forecasts during Christmas

24-hour forecast

Table 7 presents the prediction results of the 24-h forecasts with and without the CSTEP variables. Generally, the errors are higher than for the 1-h forecasts, which is expected because multi-step predictions carry higher uncertainty at each timestep. The FFN is slightly more accurate than the other three models. Furthermore, the change in the FFN’s performance when excluding the CSTEP variables is not as visible in the 24-h forecasts as in the 1-h forecasts.

Table 7 24-h forecast metrics

The forecasts for the first 24 h of the test data set are visualized in Fig. 10 below, where the point of the multi-step forecast starts on May 15th 23:00. There is no substantial difference in the first day of prediction for all four models. The ability to predict 24 h accurately using the same model architecture as for the 1-h forecasts means that the models are flexible in their application. To further assess the ability of the 24-h forecast models, they will also be investigated during Christmas 2021. Figure 11 presents the forecasts on Christmas day with the first prediction starting at midnight on the 24th of December 2021. The 24-h forecasts generally underestimate the actual load but follow the pattern correctly. The GRU neural network performs the best during this period, coming much closer to the peak load than the other models. Error metrics and R2 scores are critical indicators to assess the performance of models. However, they are not the only factor to base performances on for electricity load forecasting. Looking solely at the error metrics, one would choose the FFN model as it shows the lowest overall error. However, DSOs might think it necessary to predict as accurately as possible on specific days when the grid is nearing capacity, such as Christmas. Because of this, the GRU model might be the better model to use.

Fig. 10 First 24-h forecast

Fig. 11 24-h forecast during Christmas

Comparison with electric-based heating

All four models are tested on a data set of household electricity consumption containing heat pumps and electric heating to determine the importance of analyzing the composition of the aggregated electricity load and investigating the prediction performance of electrically heated households. It must be noted that the sample size has decreased compared to the original dataset, from 211 to 22. Because of the smaller sample size, the initial data set was sampled to have the same size, and all models were applied to the subset to compare them better.
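The down-sampling step can be sketched as below. The text does not describe how the households were drawn or how their load was aggregated, so uniform random sampling without replacement, a fixed seed, and summation across households are all assumptions; the function name and the synthetic data are hypothetical.

```python
import random


def sample_and_aggregate(household_loads, sample_size, seed=0):
    """Draw households at random and sum their hourly loads.

    household_loads maps a household id to a list of hourly values of
    equal length. Summation is an assumed aggregation method.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = rng.sample(sorted(household_loads), sample_size)
    n_hours = len(household_loads[chosen[0]])
    aggregated = [
        sum(household_loads[h][t] for h in chosen) for t in range(n_hours)
    ]
    return chosen, aggregated


# Synthetic stand-in for the 211 households of the full data set.
loads = {f"household_{i}": [0.1 * (i % 5 + 1)] * 3 for i in range(211)}
chosen, aggregated = sample_and_aggregate(loads, sample_size=22)
```

Matching the sample size of the two subsets keeps the comparison from being dominated by the smoothing effect of aggregating more households.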

Table 8 presents the error metrics of the models applied to electric-based heating household load and the sampled non-electric-based heating electricity consumption. The results give several insights. Firstly, the sample size of the aggregated load data affects the prediction ability of the models. For instance, the subset of the data set with a sample size of 22 has an adjusted R2 of 0.8273 for the FFN model, while the same model on the full data set reaches a score of 0.9681. This change is seen across all models, indicating the sample size of the aggregated load to be an essential factor. Secondly, multiple metrics are crucial to correctly assess neural networks’ performance. Due to the increased average hourly load for electric-based heating households, the absolute and squared errors change relative to the load. For the sampled data set, the average hourly load is around 0.37 kWh, whereas the electric-based heating households have an average load of around 0.96 kWh. Thirdly, while there is a difference in absolute and squared errors, the adjusted R2 score does not substantially change when predicting electric-based heating and district heating households. Finally, the addition of CSTEP variables impacts the performance differently depending on the model.
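The second and third observations can be illustrated with a small numerical sketch. It assumes the standard definitions of MAE, RMSE, MAPE, and adjusted R2 (with `n_features` as the number of model inputs); the toy load values and the choice of two features are invented for illustration and do not reproduce Table 8. The scale factor uses the two average loads reported above.

```python
import numpy as np


def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mape(y_true, y_pred):
    # In percent; assumes y_true contains no zeros.
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)


def adjusted_r2(y_true, y_pred, n_features):
    n = y_true.size
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R2 for the number of input features.
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1))


# Toy hourly loads (kWh) and predictions; values are illustrative only.
y_true = np.array([0.30, 0.35, 0.40, 0.45, 0.38, 0.33, 0.36, 0.42])
y_pred = np.array([0.32, 0.34, 0.37, 0.47, 0.39, 0.31, 0.35, 0.40])

# Ratio of the two average loads reported in the text (0.96 kWh vs. 0.37 kWh).
scale = 0.96 / 0.37

for label, factor in [("original", 1.0), ("scaled", scale)]:
    t, p = y_true * factor, y_pred * factor
    print(label, mae(t, p), rmse(t, p), mape(t, p), adjusted_r2(t, p, n_features=2))
```

Under a pure rescaling of the load, MAE and RMSE grow by the scale factor, whereas MAPE and adjusted R2 stay the same. Real load patterns differ in shape as well as level, so this only shows why scale-dependent and scale-free metrics should be read together.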

Table 8 Comparison metrics with electric heating load

The FFN model sees a slight performance increase when removing the CSTEP variables. The error of the RNN model increases without the CSTEP variables. The LSTM model has the worst performance, but the error slightly decreases when removing the supporting variables. Finally, the GRU model sees almost no change in performance with or without the CSTEP variables.


This paper systematically identified and analyzed data to forecast the aggregated electricity load of residential areas using the CSTEP framework. The data were used as inputs, together with feature-engineered variables, to predict the next hour and the next 24 h. Four different neural networks were tuned, trained, and evaluated on the data sets with and without the CSTEP variables to assess the impact of the systematic identification process. The models' 1-h forecasts perform equally well in terms of the error metrics and the adjusted R2 score; however, further investigation of the predictions shows that the GRU model captures the actual load better. Examining the models on certain days, such as Christmas, which usually sees very high consumption peaks, adds a further factor to the assessment of model performance. Finally, 24-h forecasts were conducted to examine the flexibility of the models. Overall, the metrics show minimal variation across the models, but comparing the predictions through visualizations indicates where the models may differ.

Furthermore, to determine how the composition of the aggregated load data affects the forecast, the models were applied to a separate data set containing households with heat pumps or electric heating. It was found that the number of households in the aggregated load affects the forecast, meaning a smaller sample size increases the forecast error. To validate this, 22 households were sampled from the initial data set to match the number of electric heating households, and the two sets were compared. No substantial differences were found in the adjusted R2 scores; however, the MAE, MAPE, and RMSE metrics differed due to the increased average load of households with heat pumps or electric heating. The systematically identified CSTEP variables did not improve the forecast accuracy significantly; however, they gave the authors an increased understanding of the target variable and improved its explainability. The complexity of the consumption pattern can be better understood by considering as many factors as possible. This understanding can help explain increases or decreases in the electricity load during specific periods, explain changing behavioral patterns of residents, and identify peaks and valleys in the load pattern.

Furthermore, the FFN model saw an increase in error after removing the CSTEP variables, indicating that it relies more on the supporting variables than the recurrent neural networks do. The popular neural network architectures in the literature are LSTMs and FFNs; however, this paper has shown that GRUs perform very well, both on the performance metrics and when the predictions are visualized to understand where the model predicts well. This paper also demonstrated that choosing the optimal neural network architecture is not as important as curating good data inputs, which was shown by testing the models on different load profiles with and without electric heating or heat pumps. Moreover, it was found that the sample size of the aggregated load impacts the accuracy of the forecast, with smaller sample sizes giving more volatile consumption patterns.
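The effect of the aggregation size on volatility can be sketched with synthetic data. This is an illustration under strong assumptions, not a reproduction of the study: households are simulated as a shared sinusoidal daily profile multiplied by independent gamma-distributed noise, and the names `loads`, `profile`, and `relative_noise` are hypothetical. Only the two sample sizes, 211 and 22, are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_hours = 24 * 60      # 60 synthetic days of hourly readings (assumption)
n_households = 211     # size of the full data set reported above
subset_size = 22       # size of the sampled subset reported above

# Shared daily profile (kWh) with an evening peak; the shape is an assumption.
hour_of_day = np.arange(n_hours) % 24
profile = 0.4 + 0.2 * np.sin(2 * np.pi * (hour_of_day - 12) / 24)

# Gamma(shape=2, scale=0.5) has mean 1, so households average out to the profile.
noise = rng.gamma(shape=2.0, scale=0.5, size=(n_hours, n_households))
loads = profile[:, None] * noise


def relative_noise(columns):
    """Std of the aggregated load's relative deviation from the shared profile."""
    aggregated = loads[:, columns].mean(axis=1)
    return float(np.std(aggregated / profile - 1.0))


# Draw 22 households without replacement, analogous to the subsampling step.
subset = rng.choice(n_households, size=subset_size, replace=False)

noise_full = relative_noise(np.arange(n_households))
noise_subset = relative_noise(subset)
print(f"211 households: {noise_full:.3f}, 22 households: {noise_subset:.3f}")
```

For independent households, the relative noise of the average load shrinks roughly with the square root of the number of households, so the 22-household aggregate is noticeably more volatile than the 211-household one. Real households are correlated through weather and shared routines, so the actual effect may be weaker or stronger than in this sketch.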

The systematic identification and selection of supporting data were valuable with certain neural network types, such as the RNN and FFN. As described in the literature, the LSTM and GRU networks are specialized in long data sequences due to their ability to remember patterns, which could explain why they rely less on the CSTEP variables. The results of this study were less encouraging than expected, because the systematic identification process did not significantly impact the performance metrics. However, the process gave a better understanding of the complex electricity load forecasting problem. The data sets used for this study were cleaned and filtered to consist of households without DERs and to contain only the households’ pure electricity consumption, meaning no electricity-based heating installations. The results indicate that the sample size of the aggregated load plays a large part in the forecasting accuracy.

Moreover, the expectation that different consumption patterns, such as those of households with heat pumps or electric heating, would be more challenging to predict was rejected, because the adjusted R2 was found to be close to equal for both load patterns. Overall, this study achieved excellent results in forecasting the electricity load for the next hour and the next 24 h, which is underlined by the satisfyingly low errors and high adjusted R2 scores. Furthermore, visualizing the predictions showed that the models could get very close to the actual load.


The purpose of the study was to test a systematic data identification and selection process to forecast the aggregated electricity load of two Danish residential areas. In the literature, the data selection process often relied on correlation analysis of the supporting data. This paper added an initial step, using the CSTEP framework, to build a robust data foundation for the forecast. Forecasting with neural networks is a major research field, and this paper tested and compared different types of neural networks from the literature. The research has shown that the systematic identification of variables has potential but does not substantially affect the models' performance metrics. However, the process did give a greater understanding of the target variable, which can help curate better data in the future. The results from testing multiple neural networks indicate that choosing the optimal architecture is not as impactful as having good data inputs. The findings of this study will be of interest to researchers who seek to make their data processing and analysis more systematic by applying the CSTEP framework.

Moreover, the findings underline the importance of curated data for researchers and the industry, e.g., DSOs. The main limitation of this study is the data availability for the target variable and some of the supporting data. The target variable had a small sample size for electric-based heating households, which meant that the original data set had to be sampled to be of equal size. Larger sample sizes would give a clearer answer regarding the differences between load patterns. Furthermore, the CSTEP variables were limited by the sources of external data, such as weather data. A majority of researchers use weather data in their forecasting models, but for this research, it was not feasible due to the location of weather stations. Finally, the results of this study are based on electricity consumption from Danish residential areas, meaning they are not directly generalizable to all parts of the world.

Despite these limitations, the study shows the models’ flexibility across different consumption patterns, multiple types of independent variables, and forecasting horizons from one hour to 24 h ahead. Further research should be conducted using the CSTEP framework to systematically identify independent variables and better assess the method's impact on the forecasting problem. Furthermore, the findings suggest that better performance metrics are needed to compare the predictions of neural networks, as the intricacies could only be seen by visually inspecting the forecasts. Future work is planned to test the forecasting ability further with a more complex selection of models and more complicated data sets. Finally, to address the limitations of this study, a larger sample of residential houses should be used.

Availability of data and materials

Not applicable.



Abbreviations

DSO: Distribution system operator
DER: Distributed energy resources
ML: Machine learning
DL: Deep learning
FFN: Feed-forward network
RNN: Recurrent neural network
LSTM: Long short-term memory
GRU: Gated recurrent unit
MAE: Mean absolute error
MAPE: Mean absolute percentage error
RMSE: Root mean squared error


  • Ahmad T, Chen H (2018) Utility companies strategy for short-term energy demand forecasting using machine learning based models. Sustain Cities Soc 39:401–417.

    Article  Google Scholar 

  • Al-Rashid Y, Paarmann LD (1996) Short-term electric load forecasting using neural network models. In: Cameron G, Hassoun M, Jerdee A, Melvin C (eds) Proceedings of the 1996 IEEE 39th midwest symposium on circuits & systems part 3 (of 3). IEEE, Piscataway, pp 1436–1439

  • Andersen FM, Baldini M, Hansen LG, Jensen CL (2017) Households’ hourly electricity consumption and peak demand in Denmark. Appl Energy 208:607–619

    Article  Google Scholar 

  • Biewald L (2020) Experiment tracking with weights and biases. Accessed 01 June 2022

  • Billanes JD, Ma Z, Jørgensen BN (2017) Consumer central energy flexibility in office buildings. J Energy Power Eng 11(10):621–630

    Google Scholar 

  • Billanes JD, Ma Z, Jørgensen BN (2018) The bright green hospitals case studies of hospitals’ energy efficiency and flexibility in Philippines. In: 2018 8th international conference on power and energy systems (ICPES). pp 190–195

  • Bygnings- og Boligregistret (2022) BBR information. Accessed 20 June 2022

  • Cho K, Van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H et al (2014) Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv preprint. arXiv:14061078

  • Christensen K, Ma Z, Værbak M, Demazeau Y, Jørgensen BN (2019) Agent-based decision making for adoption of smart energy solutions. In: IV international congress of research in sciences and humanities science and humanities international research conference (SHIRCON 2019). Lima, Peru. IEEE

  • Christensen K, Ma Z, Jørgensen BN (2021) Technical, economic, social and regulatory feasibility evaluation of dynamic distribution tariff designs. Energies 14(10):2860

    Article  Google Scholar 

  • Danish Energy Agency (2022a) Danish climate policies. Accessed 20 June 2022

  • Danish Energy Agency (2022b) Klimaaftale for energi og industri mv. 2020. In: Agency DE, editor. p 16

  • Dung NT, Phuong NT (2019) Short-term electric load forecasting using standardized load profile (SLP) and support vector regression (SVR). Eng Technol Appl Sci Res 9(4):4548–4553

    Article  Google Scholar 

  • Ekonomou L (2010) Greek long-term energy consumption prediction using artificial neural networks. Energy 35(2):512–517.

    Article  Google Scholar 

  • Elkamel M, Schleider L, Pasiliao EL, Diabat A, Zheng QP (2020) Long-term electricity demand prediction via socioeconomic factors-a machine learning approach with Florida as a case study. Energies.

    Article  Google Scholar 

  • Fatras N, Ma Z, Jørgensen BN (2021) System architecture modelling framework applied to the integration of electric vehicles in the grid. Springer International Publishing, Cham, pp 205–209

    Google Scholar 

  • Flexible Energy Denmark (2019) About the flexible energy Denmark project. Accessed 20 June 2022

  • Friedrich L, Afshari A (2015) Short-term forecasting of the Abu Dhabi electricity load using multiple weather variables. In: Yan J, Shamim T, Chou SK, Li H (eds) 7th international conference on applied energy, ICAE 2015. Elsevier Ltd, pp 3014–3026

  • Gao Y, Fang C, Ruan Y (2019) A novel model for the prediction of long-term building energy demand: LSTM with attention layer. In: Sustainable built environment conference 2019 Tokyo: built environment in an era of climate change: how can cities and buildings adapt? SBE 2019 Tokyo, 1st edn. Institute of Physics Publishing

  • Gebreyohans G, Saxena NK, Kumar A (2018) Long-term electrical load forecasting of Wolaita Sodo Town, Ethiopia using hybrid model approaches. In: 2018 IEEE 8th power India international conference (PIICON). pp 1–6

  • Ghods L, Kalantar M (2008) Methods for long-term electric load demand forecasting; a comprehensive investigation. In: 2008 IEEE international conference on industrial technology, IEEE ICIT 2008. Chengdu

  • Giamarelos N, Zois EN, Papadimitrakis M, Stogiannos M, Livanos NAI, Alexandridis A (2021) Short-term electric load forecasting with sparse coding methods. IEEE Access 9:102847–102861.

    Article  Google Scholar 

  • Gören G, Dindar B, Gül Ö (2022) Artificial neural network based cost estimation of power losses in electricity distribution system. In: 2022 4th global power, energy and communication conference (GPECOM). pp 455–460

  • Gungor O, Garnier J, Rosing TS, Aksanli B (2020) LENARD: lightweight ensemble learner for medium-term electricity consumption prediction. In: 2020 IEEE international conference on communications, control, and computing technologies for smart grids, SmartGridComm 2020. Institute of Electrical and Electronics Engineers Inc.

  • Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

    Article  Google Scholar 

  • Houimli R, Zmami M, Ben-Salha O (2020) Short-term electric load forecasting in Tunisia using artificial neural networks. Energy Syst 11(2):357–375.

    Article  Google Scholar 

  • Hu Z, Ye C, Wang S, Tan C, Tian J, Li Y (2021) Guaranteed load prediction for distribution network considering extreme disasters. In: 2021 IEEE sustainable power and energy conference (iSPEC). pp 1375–1380

  • Ilseven E, Gol M (2017) Medium-term electricity demand forecasting based on MARS. In: 2017 IEEE PES innovative smart grid technologies conference Europe, ISGT-Europe 2017. Institute of Electrical and Electronics Engineers Inc., pp 1–6

  • Li L, Ota K, Dong M (2017) Everything is image: CNN-based short-term electrical load forecasting for smart grid. In: 2017 14th international symposium on pervasive systems, algorithms and networks & 2017 11th international conference on frontier of computer science and technology & 2017 third international symposium of creative computing (ISPAN-FCST-ISCC). pp 344–351

  • Li Y, Chen Y, Shi Y, Zhao X, Li Y, Lou Y (2021) Research on short-term load forecasting under demand response of multi-type power grid connection based on dynamic electricity price. In: 2021 IEEE international conference on power, intelligent computing and systems, ICPICS 2021. Institute of Electrical and Electronics Engineers Inc., pp 141–144

  • Ma Z (2022) The importance of systematical analysis and evaluation methods for energy business ecosystems. Energy Inform 5(1):2.

    Article  Google Scholar 

  • Ma Z, Jørgensen BN (2018) A discussion of building automation and stakeholder engagement for the readiness of energy flexible buildings. Energy Inform 1(1):54.

    Article  Google Scholar 

  • Ma Z, Sommer S, Jørgensen BN (2016) The smart grid impact on the Danish DSOs’ business model. In: 2016 IEEE electrical power and energy conference (EPEC). IEEE, pp 1–5

  • Ma Z, Friis HTA, Mostrup CG, Jørgensen BN (2017) Energy flexibility potential of industrial processes in the regulating power market. In: Proceedings of the 6th international conference on smart cities and green ICT systems. SCITEPRESS—Science and Technology Publications, Lda, Porto, Portugal, pp 109–115

  • Ma Z, Værbak M, Rasmussen RK, Jørgensen BN (2019a). Distributed energy resource adoption for campus microgrid. In: 2019a IEEE 17th international conference on industrial informatics (INDIN). IEEE, pp 1065–1070

  • Ma Z, Broe M, Fischer A, Sørensen TB, Frederiksen MV, Jøergensen BN (2019b) Ecosystem thinking: creating microgrid solutions for reliable power supply in India’s power system. In: 2019 1st global power, energy and communication conference (GPECOM). pp 392–397

  • Ma Z, Christensen K, Jorgensen BN (2021) Business ecosystem architecture development: a case study of electric vehicle home charging. Energy Inform 4:37.

    Article  Google Scholar 

  • Panapongpakorn T, Banjerdpongchai D (2019) Short-term load forecast for energy management systems using time series analysis and neural network method with average true range. In: 1st international symposium on instrumentation, control, artificial intelligence, and robotics, ICA-SYMP 2019. Institute of Electrical and Electronics Engineers Inc., pp 86–89

  • Parlos AG, Patton AD (1993) Long-term electric load forecasting using a dynamic neural network architecture. In: 1993 joint international power conference on Athens Power Tech: planning, operation and control in today’s electric power systems, APT 1993. Institute of Electrical and Electronics Engineers Inc., pp 816–820

  • Pindoriya NM, Singh SN, Singh SK (2010) Forecasting of short-term electric load using application of wavelets with feed-forward neural networks. Int J Emerg Electr Power Syst.

    Article  Google Scholar 

  • Pramono SH, Rohmatillah M, Maulana E, Hasanah RN, Hario F (2019) Deep learning-based short-term load forecasting for supporting demand response program in hybrid energy system. Energies.

    Article  Google Scholar 

  • Ribeiro AMNC, Do Carmo PRX, Rodrigues IR, Sadok D, Lynn T, Endo PT (2020) Short-term firm-level energy-consumption forecasting for energy-intensive manufacturing: a comparison of machine learning and deep learning models. Algorithms 13(11):1–19.

    Article  MathSciNet  Google Scholar 

  • Ruiming F (2008) A hybrid rough sets and support vector regression approach to short-term electricity load forecasting. In: IEEE power and energy society 2008 general meeting: conversion and delivery of electrical energy in the 21st century, PES. Pittsburgh, PA

  • Sadaei HJ, de Lima e Silva PC, Guimarães FG, Lee MH (2019) Short-term load forecasting by using a combined method of convolutional neural networks and fuzzy time series. Energy 175:365–377.

    Article  Google Scholar 

  • Salama HAE, El-gawad AFA, Sakr SM, Mohamed EA, Mahmoud HM (2009) Applications on medium-term forecasting for loads and energy scales by using artificial neural network. In: 20th international conference and exhibition on electricity distribution, CIRED 2009. 550 CP ed. Prague

  • Samuel IA, Ekundayo S, Awelewa A, Somefun TE, Adewale A (2020) Artificial neural network base short-term electricity load forecasting: a case study of a 132/33 kv transmission sub-station. Int J Energy Econ Policy 10(2):200–205.

    Article  Google Scholar 

  • Sauter P, Karg P, Pfeifer M, Kluwe M, Zimmerlin M, Leibfried T et al (2017) Neural network-based load forecasting in distribution grids for predictive energy management systems. In: International ETG congress 2017. pp 1–6

  • Shirzadi N, Nizami A, Khazen M, Nik-Bakht M (2021) Medium-term regional electricity load forecasting through machine learning and deep learning. Designs.

    Article  Google Scholar 

  • Solyali D (2020) A comparative analysis of machine learning approaches for short-/long-term electricity load forecasting in Cyprus. Sustainability.

    Article  Google Scholar 

  • Statistics Denmark (2022) Energy and air emission accounts. Accessed 20 June 2022

  • Tanoto Y, Ongsakul W, Marpaung COP (2011) Levenberg-Marquardt recurrent networks for long-term electricity peak load forecasting. Telkomnika 9(2):257–266.

    Article  Google Scholar 

  • Vanting NB, Ma Z, Jørgensen BN (2021) A scoping review of deep neural networks for electric load forecasting. Energy Inform 4(2):49.

    Article  Google Scholar 

  • Vonk BMJ, Nguyen PH, Grand MOW, Slootweg JG, Kling WL (2012) Improving short-term load forecasting for a local energy storage system. In: 2012 47th international universities power engineering conference, UPEC 2012. London

  • Xu B, Sun Y, Wang H, Yi S (2019) Short-term electricity consumption forecasting method for residential users based on cluster classification and backpropagation neural network. In: 11th international conference on intelligent human-machine systems and cybernetics, IHMSC 2019. Institute of Electrical and Electronics Engineers Inc., pp 55–59

  • Xypolytou E, Meisel M, Sauter T, IEEE (2017) Short-term electricity consumption forecast with artificial neural networks—a case study of office buildings. In: 2017 IEEE Manchester Powertech

  • Yan X, Chen H, Zhang X, Tan C (2012) Energy storage sizing for office buildings based on short-term load forecasting. In: 2012 IEEE 6th international conference on information and automation for sustainability, ICIAFS 2012. Beijing. pp 290–295

  • Yong B, Shen Z, Wei Y, Shen J, Zhou Q (2020) Short-term electricity demand forecasting based on multiple LSTMs. In: Ren J, Hussain A, Zhao H, Cai J, Chen R, Xiao Y et al (eds) 10th international conference on brain inspired cognitive systems, BICS 2019. Springer, pp 192–200

  • Zhu J, Yang Z, Guo Y, Zhang J, Yang H (2019) Short-term load forecasting for electric vehicle charging stations based on deep learning approaches. Appl Sci.

    Article  Google Scholar 

Download references


Not applicable.

About this supplement

This article has been published as part of Energy Informatics Volume 5 Supplement 4, 2022: Proceedings of the Energy Informatics.Academy Conference 2022 (EI.A 2022). The full contents of the supplement are available online at


This paper is part of the ANNEX 81 project (Project IEA EBC ANNEX 81 Data-Driven Smart Buildings, funded by EUDP Denmark, Case no. 64019-0539) by the Danish funding agency (the Danish Energy Technology Development and Demonstration (EUDP) program, Denmark) and part of the Lighthouse South project (project title: AI-based forecasting for sector coupling of the electricity grid and district heating grid) by the European Regional Development Fund.

Author information

Authors and Affiliations



NBV conducted the first draft, ZM and BNJ revised and edited the final manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Nicolai Bo Vanting.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit

About this article

Cite this article

Vanting, N.B., Ma, Z. & Jørgensen, B.N. Evaluation of neural networks for residential load forecasting and the impact of systematic feature identification. Energy Inform 5 (Suppl 4), 63 (2022).
