
Application of improved DBN and GRU based on intelligent optimization algorithm in power load identification and prediction


Non-intrusive load monitoring is a key technology of intelligent power management systems and plays a crucial role in smart grids. To achieve accurate identification and prediction of electricity load, intelligent optimization algorithms are introduced to improve deep learning models. A load recognition model combining the sparrow search algorithm and a deep belief network is designed, as well as a gated recurrent network prediction model based on particle swarm optimization. The results showed that the sparrow search algorithm used in the study performed well on the solution-performance evaluation metrics, with a minimum inverted generational distance of 0.209 and a maximum hypervolume of 0.814. The precision and recall of the optimized load identification model were both high: at a precision of 0.9, the recall reached 0.94. The recognition accuracy of the model on the test set reached up to 0.924, and the lowest classification error was only 0.05. The maximum F1 value of the bidirectional gated recurrent network optimized by particle swarm optimization converged to 90.06%. The loss function was optimized by particle swarm optimization, and both its convergence value and convergence speed were markedly enhanced. The mean absolute error and root mean square error of the prediction model were both below 0.3. Compared with the bidirectional gated recurrent model before optimization, the particle swarm optimization strategy significantly improved prediction detail. In addition, the proposed method showed superior recognition response speed and adaptability in real application environments. This study helps to understand the load demand of the power system, optimize grid operation, and strengthen the reliability, efficiency, and sustainability of the power system.


The Power System (PSY) is a key component of modern social development, and advances in information, communication, and automation technology have driven its transformation and evolution. The PSY is developing towards cleanliness, intelligence, flexibility, and sustainability, and implementing energy management and improving energy utilization efficiency through advanced means has become a research hotspot (Rafati et al. 2022; Himeur et al. 2022). Smart grid monitoring technology can achieve real-time monitoring and control of various links in the PSY, ensuring the operational efficiency and reliability of the power grid. Non-Intrusive Load Monitoring (NILM) uses non-invasive sensing to monitor and analyze electrical loads and Energy Consumption (EC) without any modification of, or contact with, the equipment. Load identification and prediction are the core components of NILM, and understanding the usage status of electricity loads helps to reasonably dispatch power supply and optimize power distribution. It is also necessary to coordinate the power generation of the various supply departments to ensure the stable supply of the PSY (Chen and Wang 2022; Lu et al. 2023). However, in the face of complex load environments, changing household electricity habits, and sparse EC data, existing NILM load identification and prediction still suffer from low accuracy and poor stability (Kaselimi et al. 2022). To achieve accurate and stable NILM load identification and prediction, the study chooses Deep Learning (DL), which has clear advantages here, as the technical foundation. On the one hand, the Sparrow Search Algorithm (SSA), a swarm intelligence Optimization Algorithm (OA), is used to optimize the traditional Deep Belief Network (DBN) and achieve power load identification.
On the other hand, the study uses the Particle Swarm Optimization (PSO) algorithm to improve the Gated Recurrent Unit (GRU), a variant of the Recurrent Neural Network (RNN), and designs a bidirectional weighted PSO-GRU Load Forecasting (LF) model. This design enriches both the theoretical basis and the practical application of intelligent OAs and DL, and raises the technical level of DBN and GRU in the field of identification and prediction. Moreover, the study supports energy conservation, emission reduction, and PSY scheduling by effectively optimizing energy management and adjusting user electricity consumption behavior.

The study comprises four parts. First, the current research status of NILM around the world is reviewed. Then, the construction of the SSA-DBN recognition model and the bidirectional weighted PSO-GRU prediction model is explained. Next, performance testing is conducted on the designed recognition and prediction models. Finally, the experimental results are summarized.

Related work

The construction of smart grids is an essential direction for the digital transformation and reform of the PSY, and NILM is an important approach to building them, so scholars around the world have carried out extensive research on it. Current research often frames the NILM task as a multi-class classification problem, which sometimes makes it difficult to identify unknown loads that did not participate in training. Kang et al. designed an adaptive NILM method that uses the fast Fourier transform to analyze harmonic current characteristics. A pre-trained convolutional auto-encoder Neural Network (NN) was used to obtain voltage-current trajectory features, and the TOPSIS algorithm compared the similarity of feature vectors to achieve load monitoring. The results verified the recognition accuracy of this method, with a maximum accuracy of 97%, and showed that it could achieve load recognition in embedded systems (Kang et al. 2022). Due to the limitations of transmission cost and network bandwidth, NILM often uses low-frequency data, which poses significant challenges to monitoring and identification accuracy. Yin et al. investigated the correlation between household electricity consumption habits and load state decomposition methods, and presented a non-invasive load identification model based on a Gaussian mixture and hidden Markov model. Validation on the dataset confirmed that this method significantly improved the accuracy of device recognition (Yin et al. 2022). Laouali et al. designed a NILM framework based on low-frequency power data, using a DL architecture combining a random approximation convex-hull data selection method, a hybrid Convolutional Neural Network (CNN), and a bidirectional Long Short-Term Memory (LSTM). The experiments showed that the F1 values of the four devices on the test dataset were between 0.95 and 0.99, and the accuracy values were between 0.88 and 0.98 (Laouali et al. 2022).
NILM is the process of decomposing individual EC from the total EC. Lee and Moon established an energy decomposition model based on RNN, LSTM, and gated recurrent units, and estimated the performance of the designed method against hidden Markov models. The experiments showed that the method improved on indicators such as F1 value, accuracy, Mean Absolute Error (MAE), and Root Mean Square Error (RMSE), and was highly consistent with actual power and electricity consumption (Lee and Moon 2023).

The accuracy of NILM recognition based on a single feature is relatively low. Chen et al. chose to integrate the advantages of multiple features, using a matrix heatmap to identify loads based on voltage-current trajectory features, odd-harmonic phase and amplitude, and fundamental amplitude. The results showed that this method could achieve non-invasive monitoring of household loads, with a recognition accuracy of 96.24% (Chen et al. 2023). Yin and Ma designed a voltage-current representation attention mechanism for NILM tasks. This method improved the classification function of NNs and was validated on the public Plug Load Appliance Identification Dataset, confirming its excellent performance (Yin and Ma 2023). To identify flexible loads, Kianpoor et al. designed an adaptive integrated filtering framework based on LSTM filtering and representation capability, which learns the long-term dependency of flexible loads from total power and achieves load decomposition. The experiment was based on a residential example in British Columbia, and the results showed that the method reduced the power consumption estimation error for different appliances by up to 57.4% (Kianpoor et al. 2023). Considering that the input features of NILM include only total power, Luo et al. used sliding windows to extract different frequency bands, characterized the different windows to form multi-dimensional feature inputs, and then used an NN composed of convolution and bidirectional LSTM to complete LF. In addition, the study designed an optimized loss function suitable for binary classification problems to achieve information sharing. Verified on real household energy datasets, this method improved the F1 value, MAE, and signal aggregation error indicators by nearly 20%, 25.15%, and 17.83%, respectively (Luo et al. 2023).

Nowadays, NILM is still mostly limited to models developed in regression form. Kim and Park identified appliance activation from historical profiles and converted the NILM results into information appropriate for service utilization to develop pre-trained NILM models (Kim and Park 2023). Djordjevic and Simic conducted NILM based on steady-state current harmonic analysis: first, the steady-state changes of current harmonic vectors were used to classify household appliances, and then the appliances were identified from the current harmonic components. Experimental results showed that this method could accurately distinguish linear from nonlinear electrical switching power supplies (Djordjevic and Simic 2023). To improve the accuracy of NILM load recognition, Liu et al. designed a NILM load recognition model based on an adaptive PSO algorithm and CNN, using PSO to determine the network layers and convolution kernels of the CNN. The experimental results showed that the overall recognition accuracy of the method was 97.26% and the F1 value was 96.92% (Liu et al. 2023). Liu et al. fused an improved clustering algorithm with a three-layer Bayesian network to design a NILM equipment operating-state recognition model, and simulation experiments verified the effectiveness of the method (Liu et al. 2024). Existing NILM methods have mainly focused on identifying known loads. Given this, Lu et al. designed a NILM identification model for unknown loads based on a Siamese network containing a fixed CNN and a retrained Backpropagation NN (BPNN); a public dataset verified the practicality and scalability of the method (Lu et al. 2024).

In summary, there have been many studies on NILM load identification and prediction, indicating that DL has clear application advantages in power load monitoring. However, facing sparse individual household EC data and various external factors, NILM load identification and prediction still fall short in stability and accuracy. This study builds on DL and improves electricity load identification and prediction by introducing intelligent OAs.

Improvement of DBN and GRU for power load identification and prediction on the grounds of intelligent OAs

The identification and prediction of loads through NILM is meaningful for the operation planning, power generation scheduling, and energy regulation of the power grid. This study conducts load recognition and prediction research based on DL and intelligent OAs.

Construction of a power load identification model integrating SSA and DBN

DL is a common machine learning method that can solve complex tasks by simulating the working principles of human brain NNs (Gharehchopogh et al. 2023). A DL model contains many Hidden Layers (HLs) and can learn high-level abstract features. It has the advantage of learning features automatically and has been widely applied in many fields, promoting the development of artificial intelligence technology (Abdulhammed 2022; Bhosle and Musande 2023). This study introduces DL into the field of load recognition to deeply explore electrical features and their underlying correlations.

The DL framework chosen for the study is DBN. DBN is a multi-layer stacked generative NN composed of multiple Restricted Boltzmann Machines (RBMs). An RBM is a probabilistic generative model consisting of independent visible and hidden layers, containing observation-data nodes and abstract-feature nodes. Nodes in different layers are fully connected, and the RBM serves as part of the DBN to extract high-level abstract features and provide input for subsequent tasks. The schematic diagram of the RBM structure and its solution method is shown in Fig. 1.

Fig. 1
figure 1

Schematic diagram of RBM structure and solution method

The energy function \(E\left( {v,h} \right)\) of RBM is expressed in Eq. (1). In Eq. (1), \(v\) and \(h\) denote the visible and hidden layers. \(c\) and \(b\) represent the bias vectors of \(v\) and \(h\), respectively, with \(c \in {\mathbb{R}}^{I}\) and \(b \in {\mathbb{R}}^{J}\). \(W\) represents the connection weight matrix between \(v\) and \(h\).

$$E\left( {v,h} \right) = - \sum\limits_{i = 1}^{I} {c_{i} } v_{i} - \sum\limits_{j = 1}^{J} {b_{j} } h_{j} - \sum\limits_{j = 1}^{J} {\sum\limits_{i = 1}^{I} {W_{ji} } v_{i} h_{j} }$$

The learning of RBM is divided into training and generation stages. In the training stage, the distribution of input data is learned by iteratively updating the connection weights, and the probability density distribution is calculated using the Gibbs sampling method of Markov Chain Monte Carlo. The probability density distributions \(p\left( v \right)\) and \(p\left( h \right)\) of visible and HLs are shown in Eq. (2).

$$\left\{ {\begin{array}{*{20}c} {p\left( v \right) = \sum\limits_{h} {p\left( {v,h} \right) = \frac{{\sum\nolimits_{h} {e^{{ - E\left( {v,h} \right)}} } }}{{\sum\nolimits_{v,h} {e^{{ - E\left( {v,h} \right)}} } }}} } \\ {p\left( h \right) = \sum\limits_{v} {p\left( {v,h} \right) = \frac{{\sum\nolimits_{v} {e^{{ - E\left( {v,h} \right)}} } }}{{\sum\nolimits_{v,h} {e^{{ - E\left( {v,h} \right)}} } }}} } \\ \end{array} } \right.$$

The generation stage of RBM produces new data based on the learned parameters: the visible-layer node states are input, and the hidden layer is sampled to reconstruct the visible-layer data in reverse. Given a vector of visible or hidden units, the activation probability of the neurons in the other layer is calculated using Eq. (3).

$$\left\{ {\begin{array}{*{20}c} {p\left( {h_{j} = 1\left| v \right.} \right) = \sigma \left( {b_{j} + \sum\limits_{i = 1}^{I} {W_{ji} v_{i} } } \right)} \\ {p\left( {v_{i} = 1\left| h \right.} \right) = \sigma \left( {c_{i} + \sum\limits_{j = 1}^{J} {W_{ji} h_{j} } } \right)} \\ \end{array} } \right.$$

The traditional method for solving the RBM bias and weight gradients requires a large amount of computation, so the contrastive divergence algorithm based on Gibbs sampling is used instead. The weight and bias updates of RBM are shown in Eq. (4), where \(E_{o}\) denotes the expectation under the data distribution \(p_{0} = p\left( {h\left| v \right.} \right)\), \(E_{k}\) denotes the expectation under the model distribution after \(k\) Gibbs sampling steps, and \(\eta\) is the learning rate.

$$\left\{ {\begin{array}{*{20}l} {\widetilde{W}_{{ji}} = \eta \left( {E_{o} \left[ {v_{i} h_{j} } \right] - E_{k} \left[ {v_{i} h_{j} } \right]} \right)} \\ {\widetilde{b}_{j} = \eta \left( {E_{o} \left[ {h_{j} } \right] - E_{k} \left[ {h_{j} } \right]} \right)} \\ {\widetilde{c}_{i} = \eta \left( {E_{o} \left[ {v_{i} } \right] - E_{k} \left[ {v_{i} } \right]} \right)} \\ \end{array} } \right.$$
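As a concrete illustration of Eqs. (3)-(4), a single contrastive divergence (CD-1) update can be sketched in NumPy as follows. This is a minimal sketch, not the authors' implementation; the function name `cd1_update`, the learning rate `eta`, and the toy sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, eta=0.1):
    """One CD-1 step for a single visible vector v0, following Eqs. (3)-(4)."""
    # Positive phase: p(h=1|v0), first line of Eq. (3)
    ph0 = sigmoid(b + W @ v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)  # sample hidden states
    # Reconstruction: p(v=1|h0), then p(h=1|v1)
    pv1 = sigmoid(c + W.T @ h0)
    ph1 = sigmoid(b + W @ pv1)
    # Eq. (4): data expectation minus model expectation, scaled by eta
    W += eta * (np.outer(ph0, v0) - np.outer(ph1, pv1))
    b += eta * (ph0 - ph1)
    c += eta * (v0 - pv1)
    return W, b, c

# Toy usage: I = 6 visible units, J = 4 hidden units
I, J = 6, 4
W = 0.01 * rng.standard_normal((J, I))
b, c = np.zeros(J), np.zeros(I)
v0 = rng.integers(0, 2, I).astype(float)
W, b, c = cd1_update(v0, W, b, c)
```

In practice, this update would be averaged over mini-batches and iterated until reconstruction error stabilizes, which corresponds to the repeated training described above.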

DBN consists of an input layer, an Output Layer (OL), and a multi-layer RBM structure, with the weight parameters between layers assigned by greedy unsupervised training. The learning structure and composition of DBN are shown in Fig. 2. DBN training contains two stages: pre-training and fine-tuning. The pre-training stage is completed by unsupervised learning with the RBM structure, repeated until the network converges. During the fine-tuning stage, the backpropagation algorithm is used to fine-tune the entire network and optimize its Classification Performance (CP). Overall, DBN has an auto-encoder-like structure that automatically learns feature representations, overcoming the gradient vanishing of traditional deep networks through layer-by-layer pre-training and fine-tuning, and its overall performance is good. The weights of each DBN layer are computed in a layer-by-layer recursive process, and small initial weight values lead to slow training in the initial stage. Consequently, the study introduces the intelligent OA SSA to optimize the initial weight assignment of DBN. First, SSA is used to initialize the DBN weights, with a DBN performance index defined as the fitness function. Then, the foraging, following, chasing, and escaping behaviors of sparrows in the SSA algorithm are mapped to adjustments of the DBN structure, and the SSA search results help DBN find appropriate weights and parameter configurations for each RBM layer.

Fig. 2
figure 2

Schematic diagram of DBN learning structure and composition

SSA is an OA based on local search that gradually approaches the global optimal solution by continuously searching for local solutions. The algorithm is easy to implement and has significant advantages in solving discretized problems (Gad et al. 2022). Individuals in the Sparrow Population (SP) are usually divided into discoverers and joiners. Discoverers are responsible for searching for food and providing foraging areas and directions for the rest of the SP; joiners obtain food under the guidance of discoverers. The roles of discoverers and joiners can change dynamically, and the Fitness Value (FV) of discoverers is usually higher than that of joiners. However, joiners can also monitor discoverers and compete for food to increase their predation rate. Additionally, sparrows engage in vigilance behavior, abandoning food and flying to safe areas when they sense danger. The SSA workflow is shown in Fig. 3.

Fig. 3
figure 3

SSA workflow diagram

The Position Update (PU) of the discoverer is shown in Eq. (5), where \(X_{i,j}^{t}\) represents the position of the \(i\)-th sparrow in the \(j\)-th dimension at the \(t\)-th iteration. \(iter_{\max }\) represents the maximum number of iterations. \(\alpha\) represents a random constant between 0 and 1, and \(Q\) a random number drawn from the standard normal distribution. \(R_{2}\) and \(ST\) denote the alarm value and the alert threshold. When \(R_{2} \ge ST\), the sparrows abandon the wide-area search and fly to safe areas to forage. \(L\) represents a \(1 \times d\) matrix of ones.

$$X_{i,j}^{t + 1} = \left\{ {\begin{array}{*{20}l} {X_{i,j}^{t} \cdot \exp \left( {\frac{ - i}{{\alpha iter_{\max } }}} \right)} & if\,R_{2} < ST \\ {X_{i,j}^{t} + Q \cdot L} & if\,R_{2} \ge ST \\ \end{array} } \right.$$

The PU of the joiners (followers) is shown in Eq. (6), where \(X_{p}\) denotes the optimal position of the discoverer, \(X_{worst}\) the global worst position, and \(A\) a \(1 \times d\) matrix with \(A^{ + } = A^{T} \left( {AA^{T} } \right)^{ - 1}\). When \(i > n/2\), the joiner has a low FV and must fly to food-rich areas to forage.

$$X_{{i,j}}^{{t + 1}} = \left\{ {\begin{array}{*{20}l} {Q \cdot \exp \left( {\frac{{x_{{worst}}^{t} - x_{{i,j}}^{t} }}{{i^{2} }}} \right)} & {if{\text{ }}i > n/2} \\ {X_{p}^{{t + 1}} + \left| {X_{{i,j}}^{t} - X_{p}^{{t + 1}} } \right| \cdot A^{ + } \cdot L} & {if{\text{ }}i \le n/2} \\ \end{array} } \right.$$

Some sparrows are responsible for vigilance, and their PU is shown in Eq. (7). In Eq. (7), \(\beta\) and \(K\) represent constants controlling the step length and the sparrow's flight direction, respectively. \(f_{i}\) represents the FV of the current sparrow. \(f_{g}\) and \(f_{w}\) denote the current global best and worst FVs, respectively. \(\varepsilon\) is a small constant that prevents the denominator from becoming zero. Introducing SSA into the DBN structure to optimize the initial weights can enhance the DBN model.

$$X_{i,j}^{t + 1} = \left\{ {\begin{array}{*{20}l} {X_{best}^{t} + \beta \cdot \left| {X_{i,j}^{t} - X_{best}^{t} } \right|} & if \, f_{i} > f_{g} \\ {X_{i,j}^{t} + K \cdot \left( {\frac{{\left| {X_{i,j}^{t} - X_{worst}^{t} } \right|}}{{\left( {f_{i} - f_{w} } \right) + \varepsilon }}} \right)} & if \, f_{i} = f_{g} \\ \end{array} } \right.$$
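The discoverer, joiner, and sentry updates of Eqs. (5)-(7) can be sketched as follows. This is a simplified NumPy sketch under assumed settings: the \(A^{+} \cdot L\) term is approximated by a random sign vector, the discoverer and sentry ratios `pd`/`sd` are illustrative, and fitness is re-evaluated once per iteration for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def sparrow_search(f, dim=2, n=30, iters=100, pd=0.2, sd=0.1, st=0.8):
    """Minimal SSA sketch following Eqs. (5)-(7), minimizing f."""
    X = rng.uniform(-5, 5, (n, dim))
    fit = np.apply_along_axis(f, 1, X)
    for t in range(1, iters + 1):
        order = np.argsort(fit)              # best first: discoverers lead
        X, fit = X[order], fit[order]
        best, worst = X[0].copy(), X[-1].copy()
        n_d = int(pd * n)
        r2 = rng.random()                    # alarm value R2
        # Discoverer update, Eq. (5)
        for i in range(n_d):
            if r2 < st:
                X[i] *= np.exp(-i / (rng.random() * iters + 1e-12))
            else:
                X[i] += rng.standard_normal() * np.ones(dim)
        # Joiner update, Eq. (6); A+ approximated by a random sign vector
        for i in range(n_d, n):
            if i > n // 2:
                X[i] = rng.standard_normal() * np.exp((worst - X[i]) / (i ** 2))
            else:
                A = rng.choice([-1.0, 1.0], dim)
                X[i] = X[0] + np.abs(X[i] - X[0]) * A / dim
        # Sentry (vigilance) update, Eq. (7)
        for i in rng.choice(n, max(1, int(sd * n)), replace=False):
            if fit[i] > fit[0]:
                X[i] = best + rng.standard_normal() * np.abs(X[i] - best)
            else:
                X[i] += rng.uniform(-1, 1) * np.abs(X[i] - worst) / (fit[i] - fit[-1] + 1e-12)
        fit = np.apply_along_axis(f, 1, X)
    return X[np.argmin(fit)], fit.min()

# Toy usage on the Sphere test function also used in the experiments
x_best, f_best = sparrow_search(lambda x: np.sum(x ** 2))
```

In the SSA-DBN model described above, `f` would be replaced by the DBN performance index, and the best particle would supply the initial DBN weights.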

Design of bidirectional adaptive power LF model on the grounds of PSO and GRU

LF for the PSY is based on historical load data and other related factors, using methods such as time-series analysis or machine learning. Load is influenced by climate, weather, and other factors and therefore carries significant uncertainty. When predicting, historical data, temperature, and date information need to be considered as input data for LF. Based on the predicted load, appropriate adjustments can be made to the PSY, with the aim of minimizing LF errors and ensuring forecasting accuracy.

The Artificial Neural Network (ANN) is an autonomous learning model that mimics the information processing of biological neural systems and has strong learning and adaptive abilities. A recurrent ANN has a memory mechanism that can effectively handle the temporal relationships of sequential data; propagating gradients repeatedly across time steps allows it to better capture long-term dependencies. The study chooses a recurrent ANN for load prediction, relying on the GRU framework. GRU is a variant of the traditional RNN with stronger modeling and long-term dependency processing capabilities. GRU includes two gating functions: an Update Gate (UG) and a Reset Gate (RG). The UG controls the degree to which the Previous Hidden State (PHS) is carried into the current state, and the RG determines the degree to which information from the previous state affects the current state (ArunKumar et al. 2022). GRU updates the UG and RG based on the current input and the PHS. The expression for the UG \(z_{t}\) is shown in Eq. (8). In Eq. (8), \(t_{m}\) denotes the input of the UG activation function, \(w_{m}\) and \(u_{m}\) represent the UG weights, \(\sigma\) represents the activation function, \(h\) represents the HL state, \(t\) denotes the current time step, and \(x_{t}\) represents the input at the current time.

$$\left\{ {\begin{array}{*{20}c} {t_{m} = w_{m} x_{t} + u_{m} h_{t - 1} } \\ {z_{t} = \sigma \left( {t_{m} } \right)} \\ \end{array} } \right.$$

The expression for resetting gate \(r_{t}\) is shown in Eq. (9). In Eq. (9), \(t_{n}\) denotes the input information of the RG excitation function. \(w_{n}\) and \(u_{n}\) are resetting the gate weight.

$$\left\{ {\begin{array}{*{20}c} {t_{n} = w_{n} x_{t} + u_{n} h_{t - 1} } \\ {r_{t} = \sigma \left( {t_{n} } \right)} \\ \end{array} } \right.$$

The final output \(h_{t}\) of GRU is calculated using Eq. (10). \(\widetilde{{h_{t} }}\) represents the candidate update value, determined jointly by \(r_{t}\), \(h_{t - 1}\), and the current input.

$$h_{t} = \left( {1 - z_{t} } \right)h_{t - 1} + z_{t} \widetilde{{h_{t} }}$$
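A single GRU step following Eqs. (8)-(10) can be sketched as follows. This is a minimal NumPy sketch: the candidate state uses the standard tanh form, and the weight names collected in the dictionary `p` (`wm`, `um`, `wn`, `un`, `wh`, `uh`) are assumptions mirroring the symbols above, not the original implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, p):
    """One GRU step following Eqs. (8)-(10)."""
    z = sigmoid(p["wm"] @ x_t + p["um"] @ h_prev)              # update gate, Eq. (8)
    r = sigmoid(p["wn"] @ x_t + p["un"] @ h_prev)              # reset gate, Eq. (9)
    h_tilde = np.tanh(p["wh"] @ x_t + p["uh"] @ (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * h_tilde                    # output, Eq. (10)

# Toy usage: input size 3, hidden size 4, a length-5 input sequence
n_in, n_h = 3, 4
p = {k: 0.1 * rng.standard_normal(s) for k, s in
     [("wm", (n_h, n_in)), ("um", (n_h, n_h)),
      ("wn", (n_h, n_in)), ("un", (n_h, n_h)),
      ("wh", (n_h, n_in)), ("uh", (n_h, n_h))]}
h = np.zeros(n_h)
for x_t in rng.standard_normal((5, n_in)):
    h = gru_step(x_t, h, p)
```

Because Eq. (10) forms a convex combination of the previous state and a tanh-bounded candidate, the hidden state remains bounded, which is one reason GRU trains more stably than a plain RNN.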

However, the traditional GRU model has a low utilization rate of future data information. Therefore, this study has improved the GRU model and designed a bidirectional GRU (BiGRU) model with a network structure shown in Fig. 4.

Fig. 4
figure 4

Schematic diagram of bidirectional loop GRU structure

BiGRU consists of two directional GRU NN models, which predict the entire load information through forward and backward prediction. The forward calculation process is consistent with the GRU model calculation process, while the reverse calculation process is shown in Eq. (11). In Eq. (11), \(z_{t}^{a}\) and \(r_{t}^{a}\) represent reverse gating. \(w_{z}^{a}\), \(u_{z}^{a}\), \(w_{r}^{a}\), and \(u_{r}^{a}\) represent the corresponding weights.

$$\left\{ {\begin{array}{*{20}c} {z_{t}^{a} = \sigma \left( {w_{z}^{a} x_{t} + u_{z}^{a} h_{t + 1} } \right)} \\ {r_{t}^{a} = \sigma \left( {w_{r}^{a} x_{t} + u_{r}^{a} h_{t + 1} } \right)} \\ \end{array} } \right.$$

The bidirectional weighted merging strategy combines the bidirectional HL states, which input information from the past and future time periods of the predicted point. The merging strategy adopted in the study is weighted sum, and the calculation process of the HL state \(h_{t}\) is shown in Eq. (12). In Eq. (12), \(k\) serves as the weighted proportion. \(h_{t}^{f}\) and \(h_{t}^{b}\) represent the forward and backward HL states, respectively.

$$h_{t} = k \times h_{t}^{f} + \left( {1 - k} \right) \times h_{t}^{b}$$

Finally, the prediction result \(y_{t}\) of the BiGRU model is shown in Eq. (13), where \(w_{y}\) represents the weight from the HL to the OL of the system.

$$y_{t} = \sigma \left( {h_{t} \times w_{y} } \right)$$
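The weighted merge of Eq. (12) and the output projection of Eq. (13) can be sketched as follows. This is a minimal sketch: in the actual model \(k\) and \(w_y\) are learned parameters, whereas here they are set to illustrative values.

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bigru_output(h_forward, h_backward, k, w_y):
    """Weighted-sum merge of the two directional hidden states (Eq. (12)),
    followed by the hidden-to-output projection (Eq. (13))."""
    h_t = k * h_forward + (1.0 - k) * h_backward  # Eq. (12)
    return sigmoid(h_t @ w_y)                     # Eq. (13)

# Toy usage: hidden size 4, scalar prediction
h_f, h_b = rng.standard_normal(4), rng.standard_normal(4)
w_y = 0.1 * rng.standard_normal(4)
y = bigru_output(h_f, h_b, k=0.6, w_y=w_y)
```

A weight of `k=0.6` would bias the merge toward the forward (past-to-future) pass; tuning \(k\) is exactly one of the parameters PSO optimizes in the next subsection of the model design.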

The improvement strategy of the BiGRU model improves the prediction results, but BiGRU has many parameters, including the forward and reverse calculation weights, the HL-to-OL weights, and the weighting ratio coefficient, and the parameter values have a crucial impact on the model. Therefore, the study introduces the PSO algorithm to globally optimize and train the parameters of the BiGRU model; PSO is also used to optimize the loss function. The workflow of the PSO-optimized BiGRU model is shown in Fig. 5. First, the particle swarm is initialized and a set of particles is randomly generated; different particles represent different candidate solutions, namely GRU weight parameters. Then, the initial velocity of each particle is randomly generated, and the corresponding FV is calculated from the particle's position; the fitness function is a performance metric of the NN on the training data. Next, individual positions and velocities are updated by comparing particle positions with the individual optimal positions, and the update process is repeated until the convergence condition is satisfied. PSO can continuously optimize the weights of the GRU model to improve network performance.

Fig. 5
figure 5

Workflow diagram of BiGRU model on the grounds of PSO

PSO is a biomimetic intelligent OA inspired by the foraging behavior of bird flocks in nature. PSO treats candidate solutions of the problem as particles in the solution space, simulates collaboration and information exchange between individuals, and searches for the optimal solution. The PSO algorithm is easy to implement, converges quickly, and is widely used to solve optimization problems (Pradhan et al. 2022; Alsaidy et al. 2022). The working mechanism and process of PSO are shown in Fig. 6.

Fig. 6
figure 6

Working mechanism and flowchart of PSO

Each particle is randomly assigned an initial position and velocity, and both are adjusted based on the global and individual optimal positions. The update of particle velocity \(v_{i}^{m + 1}\) and position \(x_{i}^{m + 1}\) is shown in Eq. (14), where \(w\) is the inertia weight, \(c_{1}\) and \(c_{2}\) are the individual and population acceleration factors, \(r_{1}\) and \(r_{2}\) are random numbers, \(\alpha\) is the constraint factor, and \(m\) is the iteration index. \(p_{i}\) and \(s\) represent the optimal particle positions found in local and global search, respectively; determining the individual and global optima is the key to the PSO algorithm.

$$\left\{ {\begin{array}{*{20}c} {v_{i}^{m + 1} = wv_{i}^{m} + c_{1} r_{1} \left( {p_{i}^{m} - x_{i}^{m} } \right) + c_{2} r_{2} \left( {s_{{}}^{m} - x_{i}^{m} } \right)} \\ {x_{i}^{m + 1} = x_{i}^{m} + \alpha v_{i}^{m + 1} } \\ \end{array} } \right.$$

The particle velocity is usually limited to a certain range during training. Meanwhile, the inertia weight \(w\) has a significant impact on the optimization behavior of the particles. The study adopts a dynamic adjustment strategy to optimize \(w\), as calculated in Eq. (15). In Eq. (15), \(x\) represents the current iteration number, \(x_{\max }\) the maximum number of iterations, and \(w_{\max }\) and \(w_{\min }\) the upper and lower bounds of the inertia weight.

$$w = w_{\max } - \frac{{w_{\max } - w_{\min } }}{{x_{\max } }} \cdot x$$
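The PSO update of Eq. (14) with the linearly decreasing inertia weight of Eq. (15) can be sketched as follows. This is a minimal NumPy sketch; the population size, velocity limit `v_lim`, and acceleration factors are illustrative assumptions, and the fitness is shown on the Sphere test function that also appears in the experiments.

```python
import numpy as np

rng = np.random.default_rng(4)

def pso(f, dim=2, n=20, iters=60, c1=2.0, c2=2.0,
        w_max=0.9, w_min=0.4, alpha=1.0, v_lim=1.0):
    """Minimal PSO sketch per Eqs. (14)-(15), minimizing f."""
    X = rng.uniform(-5, 5, (n, dim))
    V = rng.uniform(-v_lim, v_lim, (n, dim))
    P = X.copy()                                   # individual best positions p_i
    p_fit = np.apply_along_axis(f, 1, P)
    g = P[np.argmin(p_fit)].copy()                 # global best position s
    for m in range(iters):
        w = w_max - (w_max - w_min) * m / iters    # decreasing inertia, Eq. (15)
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        V = w * V + c1 * r1 * (P - X) + c2 * r2 * (g - X)  # velocity, Eq. (14)
        V = np.clip(V, -v_lim, v_lim)              # limit the update speed
        X = X + alpha * V                          # position, Eq. (14)
        fit = np.apply_along_axis(f, 1, X)
        better = fit < p_fit                       # refresh individual bests
        P[better], p_fit[better] = X[better], fit[better]
        g = P[np.argmin(p_fit)].copy()
    return g, p_fit.min()

# Toy usage on the Sphere test function
g_best, f_best = pso(lambda x: np.sum(x ** 2))
```

In the PSO-BiGRU model, each particle position would encode the BiGRU weights and the weighting ratio \(k\), with `f` returning the network's loss on the training data.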

Performance testing and effectiveness evaluation of power load identification and prediction models

To verify the performance of the SSA-DBN load identification model and the PSO-BiGRU model, performance testing and application analysis experiments were designed, and the results are discussed below.

Experimental environment and dataset settings

Experimental environment settings: the DL framework is PaddlePaddle, the operating system is Windows 10, the processor is an Intel i9-9900K, and the development language is Python 3.6. Publicly available electricity load datasets are chosen as experimental datasets, including the REDD dataset, the Pecan Street dataset, and the Almanac of Minutely Power dataset (AMPds). The REDD dataset mainly provides electricity load data at the second, minute, or hour level for electrical equipment of different types and functions. The Pecan Street dataset contains electricity consumption data for residential and commercial buildings in the Texas area of the United States, including instantaneous power consumption, electricity consumption, and load-curve information. The AMPds dataset is a mixed household and commercial electricity load dataset covering various building types, with additional information related to electricity consumption such as building features, household background, and consumption behavior. Data meeting the experimental requirements are selected and divided into training and testing sets at a 9:1 ratio. For the OA tests, the single-peak test functions Sphere and Schwefel, the multi-peak test functions Rastrigin and Ackley, and the rotated multi-peak functions Griewank and Easom are selected for analysis.

Performance testing of SSA-DBN load identification model

The experiment selects common intelligent OAs, the Firefly Algorithm (FA), the Whale Optimization Algorithm (WOA), and the Artificial Bee Colony algorithm (ABC), for performance comparison. The normalized statistics of the various indicators are shown in Table 1. Table 1 shows that the SSA selected in the study performs better overall on each indicator; its performance on the training and testing sets is superior to the other algorithms, and the experimental reproducibility is high. Generational distance and inverted generational distance measure the average distance to, and coverage of, the true front by the approximate solution set; the smaller the value, the closer the algorithm is to the true front and the better it covers it. The minimum distance indicators of SSA are 0.379 and 0.209, respectively, indicating good solving performance. The maximum hypervolume of SSA is 0.814, covering more of the true frontier solutions than the other algorithms. Finally, the distribution of the solution sets of the different algorithms is evaluated comprehensively. From the Spacing and Spread indicators, the highest SSA values are 0.787 and 0.765, indicating good population diversity and distribution in the solution set.

Table 1 Comparison of optimization quality of different intelligent OAs

The K-Nearest Neighbor (KNN) algorithm, BPNN, traditional DBN, and SSA-DBN are used to classify and recognize the same load data. First, the Precision-Recall (P-R) and Receiver Operating Characteristic (ROC) curves of the different recognition algorithms are compared; the results are shown in Fig. 7. In Fig. 7a, the P-R curve of SSA-DBN sits at the top of the coordinate axes, with high precision and recall, balancing these two conflicting indicators. When the precision of SSA-DBN is 0.9, the recall can reach its highest level of 0.94. In Fig. 7b, the maximum Area Under the Curve (AUC) value of the SSA-DBN ROC curve is 0.904, which is 0.180 higher than that of the traditional DBN model. Overall, SSA-DBN achieves better accuracy and comprehensiveness in recognition and classification.
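The ROC AUC reported above can be computed directly from classifier scores with the rank-sum (Mann-Whitney) formulation. The scores and labels below are illustrative only, and this simple version does not handle tied scores:

```python
import numpy as np

def roc_auc(scores, labels):
    """ROC AUC via the rank-sum (Mann-Whitney U) statistic:
    the probability a random positive outranks a random negative."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

scores = np.array([0.1, 0.4, 0.35, 0.8])  # classifier confidence
labels = np.array([0, 0, 1, 1])           # ground-truth classes
print(roc_auc(scores, labels))  # 0.75
```

An AUC of 0.5 corresponds to random ranking, and 1.0 to perfect separation of the two classes.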

Fig. 7 Comparison of P-R and ROC curves for different classification algorithms

The average recognition accuracy, classification error during training, and classification results of the model training set are shown in Fig. 8. In Fig. 8a, the SSA-DBN model achieves higher accuracy in identifying the loads of different categories of household appliances based on the recognition features set in the study. In Fig. 8b, the classification error of the SSA-DBN model decreases as the number of fine-tuning iterations increases, reaching a minimum of around 0.050. In Fig. 8c, the recognition accuracy of the SSA-DBN model reaches up to 0.924, which is superior to the other models under the same experimental conditions.

Fig. 8 Recognition accuracy and classification error of SSA-DBN

Performance testing of improved PSO-BiGRU LF model

The F1 values and loss function curves of the different models are analyzed, with LSTM, traditional GRU, and BiGRU selected for comparison. The results are shown in Fig. 9. Figure 9a shows that the F1 value of the PSO-BiGRU model is higher than that of the other models, with the maximum value converging to 90.06%, significantly better than the 69.08% of the weakest model, LSTM, and nearly 15 percentage points higher than the GRU and BiGRU models before optimization. The F1 value is the harmonic mean of precision and recall, indicating that the PSO-BiGRU model has strong predictive classification ability. In Fig. 9b, the loss function curve of the PSO-BiGRU model converges to a minimum value of around 0.18 with the fastest convergence speed; PSO has a good optimization effect on the loss function, and the model fits well.
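Since the F1 value is the harmonic mean of precision and recall, it can be computed in one line; the input values below are arbitrary examples, not figures from the experiments:

```python
def f1_score(precision, recall):
    """F1: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative values only
print(round(f1_score(0.9, 0.94), 4))  # 0.9196
print(round(f1_score(0.6, 0.9), 4))   # 0.72
```

Because the harmonic mean is dominated by the smaller operand, a high F1 requires precision and recall to be high simultaneously, which is why it is used to summarize the two conflicting indicators.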

Fig. 9 F1 values and loss function curves for different models

Comparing the prediction errors of the different LF models, the statistical results of MAE and RMSE on the different datasets are shown in Fig. 10. In Fig. 10a, the MAE values of the optimized PSO-BiGRU model are lower than those of the other models on all datasets, and the BiGRU model is in turn lower than the LSTM and GRU models; the bidirectional recurrent strategy and the PSO parameter optimization designed in the research achieve significant results. In Fig. 10b, RMSE is more sensitive to changes in error than MAE; even so, the median RMSE values of the PSO-BiGRU model are all below 0.30, indicating that the PSO-BiGRU model exhibits lower errors in LF.
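The two error measures compared above can be sketched as follows; the load values are made up for illustration. Squaring before averaging is what makes RMSE more sensitive to large individual errors than MAE:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of prediction errors."""
    return np.abs(y_true - y_pred).mean()

def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE."""
    return np.sqrt(((y_true - y_pred) ** 2).mean())

y_true = np.array([0.8, 1.2, 0.9, 1.5])  # made-up load values (kW)
y_pred = np.array([0.7, 1.1, 1.1, 1.4])
print(round(mae(y_true, y_pred), 3), round(rmse(y_true, y_pred), 3))
```

By the power-mean inequality RMSE is always at least as large as MAE on the same errors, so the two together indicate whether the error mass comes from a few large misses or many small ones.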

Fig. 10 Comparison of prediction errors of various models

Electricity load data of residents in a certain community is collected at half-hour intervals over a continuous period of 30 days, yielding a total of 1379 sets of data, which are divided into training and testing sets in an 8:2 ratio for the experiment. The REDD, Pecan Street, and AMPds datasets are jointly selected for applied analysis. The comparison between predicted and actual values is shown in Fig. 11. Figure 11 shows that, compared to the BiGRU model, the prediction curve of the PSO-BiGRU model fits the real data more closely. Although the approximate trends of both models differ slightly from the true values, the PSO-BiGRU model handles the details of the prediction results better, with more accurate trend directions and smaller numerical errors. This indicates that PSO plays an essential role in optimizing the weights and combination proportion coefficients of the BiGRU model.
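For half-hourly load forecasting of this kind, the series is typically converted into sliding input windows before being fed to a (Bi)GRU. A minimal sketch; the 48-step window (one day of half-hour readings) and the synthetic sine series are assumed choices, not details from the paper:

```python
import numpy as np

def make_windows(series, window=48, horizon=1):
    """Convert a load series into (input window, target value) pairs
    for sequence-model training."""
    X, y = [], []
    for i in range(len(series) - window - horizon + 1):
        X.append(series[i:i + window])
        y.append(series[i + window + horizon - 1])
    return np.array(X), np.array(y)

series = np.sin(np.linspace(0, 30, 1379))  # stand-in for half-hourly load
X, y = make_windows(series)
print(X.shape, y.shape)  # (1331, 48) (1331,)
```

With 1379 half-hourly readings and a 48-step window, 1331 supervised pairs result; a chronological 8:2 split of these pairs then avoids training on data later than the test period.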

Fig. 11 Comparison between predicted and true values

A qualitative analysis of the designed method is carried out to measure its effectiveness and performance in practical applications. The results of the recognition response speed and adaptability evaluation are shown in Fig. 12. From Fig. 12a, the computational efficiency and response speed of the research method are optimal, and real-time recognition can be achieved in practical application, which facilitates prediction of the load state. From Fig. 12b, the PSO-BiGRU model is adaptable and suitable for practical application scenarios.

Fig. 12 Response speed and adaptability evaluation

Finally, confidence interval estimation for load forecasting is analyzed. Given a constant \(\alpha\) between 0 and 1, the confidence level for the true load value is set to \(1 - \alpha\). The confidence interval is determined from the predicted value and the inverse of the probability distribution function \(\widehat{G}\left( x \right)\), and represents the interval containing the true value with probability \(1 - \alpha\). The prediction interval coverage is then calculated to evaluate the confidence interval estimates, as shown in Table 2. In Table 2, the prediction interval coverage is higher than the corresponding confidence level in each case, indicating that the confidence interval estimates designed in the study are effective at every confidence level.
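The prediction interval coverage used above corresponds to the fraction of true values that fall inside the estimated interval (often called the prediction interval coverage probability, PICP). A minimal sketch with made-up interval bounds:

```python
import numpy as np

def picp(y_true, lower, upper):
    """Fraction of true load values falling inside the prediction interval."""
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

y_true = np.array([1.0, 2.0, 3.0, 4.0])  # illustrative true loads
lower  = np.array([0.5, 1.5, 2.5, 4.5])  # illustrative interval bounds
upper  = np.array([1.5, 2.5, 3.5, 5.5])
print(picp(y_true, lower, upper))  # 0.75
```

A well-calibrated interval at confidence level \(1 - \alpha\) should yield a coverage of at least \(1 - \alpha\), which is the criterion checked in Table 2.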

Table 2 Calculated results for confidence levels of 80%, 85%, 90%, and 95%


Aiming at enhancing the accuracy of load recognition and prediction, this study first designed a DL framework for load recognition by combining SSA and DBN. Then, on the basis of GRU, the PSO algorithm was introduced to achieve bidirectionally weighted, improved LF. The results demonstrated that the SSA selected in the study had excellent solving ability in the solution space, and its solution-set distribution index values were better than those of the other OAs, which benefited the parameter optimization of the deep belief network. The precision, recall, and AUC value of SSA-DBN were superior to those of the other classification models, with a maximum AUC of 0.904. SSA-DBN had a high average recognition accuracy for different categories of electrical appliances on the training set, with a minimum classification error of around 0.050, and its classification performance on the test set was optimal. The PSO-BiGRU model, optimized with the PSO algorithm, achieved better convergence performance and more refined prediction results: the loss function curve converged to a minimum of around 0.18, the F1 value converged to 90.06%, and both MAE and RMSE were below 0.3, indicating the best overall performance among the compared models.

The successful application of this method can help the power system achieve effective electricity consumption management and optimized use, ensuring stable system operation. Accurate load identification and forecasting helps power companies better plan and manage power networks, adjust generation, optimize transmission lines, and prepare for peak loads in advance. It also further promotes the mature application of NILM, supporting the automation, intelligence, and sustainable development of power systems. However, in a real smart grid environment, data acquisition and processing still introduce uncertain errors into load identification and prediction. NILM involves a large amount of energy data, so user privacy and data security need to be strengthened, and its reliance on high-performance hardware poses additional challenges. Load identification involves a large number of electrical appliances whose characteristics differ across types, and different regions and suppliers use different monitoring systems and technical standards. Cross-platform data integration and standardization, as well as unified recognition of multiple features, will be directions for future research.

Availability of data and materials

All the data is in the text.



Funding

There is no funding.

Author information

Authors and Affiliations



In this paper, to achieve accurate identification and prediction of electricity load, intelligent optimization algorithms are introduced into deep learning optimization for improvement. A load recognition model combining sparrow search algorithm and deep confidence network is designed, as well as a gated recurrent network prediction model on the grounds of particle swarm optimization. JW analyzed the data, XT and WD helped with the constructive discussion. JW, DZ, and QC made great contributions to manuscript preparation. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Wenyuan Deng.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.



About this article


Cite this article

Wu, J., Tang, X., Zhou, D. et al. Application of improved DBN and GRU based on intelligent optimization algorithm in power load identification and prediction. Energy Inform 7, 36 (2024).
