An optimisation-based energy disaggregation algorithm for low frequency smart meter data
Energy Informatics volume 2, Article number: 13 (2019)
Abstract
An algorithm for the non-intrusive disaggregation of energy consumption into its end-uses, also known as non-intrusive appliance load monitoring (NIALM), is presented. The algorithm solves an optimisation problem where the objective is to minimise the error between the total energy consumption and the sum of the individual contributions of each appliance. The algorithm assumes that a fraction of the loads present in the household is known (e.g. washing machine, dishwasher, etc.), but it also considers unknown loads, treating them as a single load. The performance of the algorithm is then compared to that obtained by two state-of-the-art disaggregation approaches implemented in the publicly available NILMTK framework. The first one is based on Combinatorial Optimization, the second one on a Factorial Hidden Markov Model. The results show that the proposed algorithm performs satisfactorily and outperforms the other two algorithms on some metrics, notably the false positive rate.
Introduction
The introduction of smart meters makes it possible to collect energy consumption readings at fine-grained spatio-temporal resolution (i.e., measurements with a granularity down to a few seconds, for single households), thus enabling the extraction of detailed information about individual energy usage habits. In turn, such knowledge allows for the construction of more accurate mathematical models to characterize individual and collective energy consumption behaviours. Energy end-use disaggregation aims at breaking down the total energy consumption measured at household level into the contributions of single electrical appliances. The use of such disaggregated information is twofold: on one side, it can be leveraged to develop predictive models capable of forecasting future energy consumption behaviours; on the other side, it can be directly provided to customers, so that household members gain detailed knowledge of their energy usage. For instance, through an app developed in the context of the enCOMPASS project^{Footnote 1}, customers can visualize their hourly consumption, as well as charts of their energy end-use patterns across major end-use categories (e.g., washing machine, dishwasher, clothes dryer, fridge), and they can be alerted to consumption anomalies as they occur. Furthermore, personalized hints for reducing energy consumption can be directly delivered to the users. These stimuli are aimed at fostering the adoption of energy saving actions, such as replacing low-efficiency appliances with high-efficiency ones and reducing energy waste (e.g. turning off lights when rooms are empty).
In this paper we present a novel algorithm for end-use energy disaggregation that extends a previous work by Piga et al. (2016) to account for the coarse granularity of standard smart metering systems (one data point every 15 min) and for the presence of unknown loads. To this purpose, we first briefly review the main approaches discussed in the literature for solving the energy disaggregation problem, then we introduce our algorithm, and finally we evaluate its performance by comparing it against two state-of-the-art disaggregation algorithms applied to a publicly available dataset.
State of the art of energy use characterization
There is a rich literature on automatic disaggregation methods, known as Non-Intrusive Appliance Load Monitoring (NIALM) algorithms (Batra et al. 2014), aimed at decomposing the aggregate household energy consumption data collected from a single measurement point into device-level consumption data, requiring limited or even no interaction with the user.
The first algorithm for NIALM was proposed by Hart (1992). Hart's approach is based on the segmentation of the aggregate power signal into successive steps, which are then matched to the appliance signatures. However, this method cannot detect multi-state appliances, nor can it decompose power signals containing simultaneous on/off events of multiple appliances. Since Hart's contribution, the NIALM problem has been extensively studied in the literature. The survey papers by Zoha et al. (2012) and by Zeifman & Roth (2011) give a comprehensive review of the state of the art of NIALM methods.
Note that the vast majority of the studies on NIALM algorithms validate the proposed solutions using publicly accessible datasets of real energy consumption measurements. The most widely used datasets made available in recent years are reported in Table 1. Alternatively, synthetic load consumption traces generated by open source software such as Loadprofilegenerator^{Footnote 2} can be adopted.
An optimisation-based algorithm for low frequency disaggregation
Motivation
The algorithm presented here is based on the approach described in (Piga et al. 2016), which exploited the assumption that the power demand profiles of each appliance are piecewise constant over time. The disaggregation problem was treated as a least-squares error minimization problem, with an additional (convex) penalty term enforcing the disaggregated signals to be piecewise constant over time. However, the assumption of piecewise constant behaviour is less likely to hold at the coarse energy measurement granularity made available by standard smart metering systems (i.e., 15 min resolution). Moreover, the approach in Piga et al. (2016) could not be applied in the presence of unknown loads. We have therefore extended the load disaggregation algorithm in Piga et al. (2016) to take into account the presence of unknown electrical devices. In the following, we formalize the final version of the energy end-use disaggregation problem as a quadratic programming (QP) model.
Quadratic programming model for energy disaggregation
We now define the problem inputs (sets and parameters), the output variables, the objective function and the problem constraints. The problem is formulated as a Mixed Integer Quadratic Program as follows.
The input data sets to the problem are:

T, the set of time epochs (t=1,2,…,T);

A, the set of appliances;

L_{a}, the set of energy consumption levels of appliance a, with a∈A.
The input parameters are:

c_{t}, the aggregate energy consumption during time epoch t∈T;

m_{a}, the maximum daily energy consumption of appliance a∈A;

d_{a}, the maximum daily usage duration (i.e., maximum number of consecutive epochs in which the appliance is on) of appliance a∈A;

w_{a}, the minimum daily usage duration (i.e., minimum number of consecutive epochs in which the appliance is on) of appliance a∈A;

u_{a,t}, a binary parameter set to 1 if appliance a∈A can be turned on at time t∈T;

α_{a}, the multiplicative weight of the level-change penalty for appliance a∈A.
The model includes the following variables:

x_{a,l,t}, a binary variable set to 1 if appliance a∈A operates at consumption level l∈L_{a} during time epoch t∈T;

y_{a,t}, a binary variable set to 1 if appliance a∈A changes consumption level at time epoch t∈T;

o_{a,t}, a binary variable set to 1 if appliance a∈A is on at time epoch t∈T;

f_{a}, a binary variable set to 1 if appliance a∈A is on during at least one time epoch of the considered time horizon;

wm, an integer variable indicating the last epoch of activity of the washing machine;

cd, an integer variable indicating the first epoch of activity of the clothes dryer.
The objective function minimizes the sum of two contributions: the first one is the quadratic error (i.e., the difference between the observed aggregated measurement and the sum of the reconstructed consumption of every appliance, at every time epoch), the second one is a penalty for every change of consumption level experienced by each appliance during the optimization horizon.
By tuning the weights α_{a}, the penalty attributed to a non-piecewise-constant energy consumption of certain appliances can be strengthened or relaxed. Note that the quadratic term accounts for the consumption of all the unknown loads. Note also that, if the contributions of unknown appliances to the aggregated energy consumption pattern are significant, the minimization of such a quadratic term would push the appliances in set A to the on state most of the time. To avoid this drawback, constraints that limit the length of the activity period and the maximum consumption of the appliances in set A must be inserted (as discussed in the following paragraphs).
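For reference, the objective can be written out with the symbols defined above. The two terms are the ones that appear in the discussion of parameter α_{a}; that they are simply added, with no further scaling, is an assumption based on the verbal description:

\[ \min \; \sum_{t \in T} \Big( c_{t} - \sum_{a \in A} \sum_{l \in L_{a}} l \cdot x_{a,l,t} \Big)^{2} \; + \; \sum_{t \in T} \sum_{a \in A} \alpha_{a} \cdot y_{a,t} \]

The first sum is the quadratic reconstruction error at every epoch, the second one counts the level changes of each appliance, weighted by α_{a}.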
The problem includes the following set of constraints.
Constraint 1 imposes that each appliance operates at a single energy consumption level during each time epoch. Constraints 2 and 3 set variable y_{a,t}=1 if appliance a∈A changes consumption level at epoch t∈T. Constraint 4 imposes that the daily energy consumption of appliance a∈A does not exceed the daily limit m_{a}. Constraint 5 sets variable o_{a,t}=1 if appliance a∈A is on at epoch t∈T. Constraint 6 imposes that the usage duration of appliance a∈A does not exceed d_{a}. Constraint 7 ensures coherence between the values of variable x_{a,l,t} and of variable f_{a}. Constraint 8 imposes that the usage duration of appliance a∈A (if activated) is not lower than the minimum duration w_{a}. This way, the disaggregation of load curves of appliances such as dishwasher, washing machine and clothes dryer takes into account the minimum duration of a washing/drying cycle. Constraint 9 sets variable wm to the last epoch of activity of the washing machine (if the washing machine is activated during the day). Constraint 10 sets variable cd to the first epoch of activity of the clothes dryer (if the clothes dryer is activated during the day). Constraint 11 imposes that the clothes dryer is turned on after the end of the operational period of the washing machine. Constraint 12 imposes that each appliance belonging to set \(\tilde {A}\) works at the highest energy consumption level for at least one time epoch, if activated during the day. In our formulation, set \(\tilde {A}\) contains the dishwasher, the washing machine, and the clothes dryer. The energy consumption profiles of a typical operation cycle of such appliances normally include one or multiple peak consumption periods, corresponding e.g. to water heating or spinning. Therefore, this constraint imposes that at least one peak consumption epoch is included in the disaggregated consumption profile of such appliances.
Finally, constraint 13 imposes that the sum of the disaggregated energy consumption profiles does not exceed the total energy usage measured by the smart meter located at the user's premises.
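To make the structure of the model concrete, the following is a minimal sketch that solves a toy instance by exhaustive search. It is not the AMPL/Gurobi implementation used in this work: it only covers the objective function and constraints 1, 4 and 13 (duration limits and the washing machine/clothes dryer ordering are left out), it counts level changes between consecutive epochs only, and the function name, argument names and example values are hypothetical.

```python
from itertools import product


def disaggregate_brute_force(c, levels, alpha, m):
    """Exhaustively search level assignments for a tiny instance.

    c      : aggregate consumption per epoch (c_t)
    levels : appliance -> list of consumption levels (L_a, containing 0)
    alpha  : appliance -> level-change penalty weight (alpha_a)
    m      : appliance -> maximum daily consumption (m_a)
    """
    apps = sorted(levels)
    # Choosing exactly one level per appliance and epoch enforces constraint 1.
    # Joint choices whose sum exceeds the meter reading violate constraint 13.
    per_epoch = [
        [combo for combo in product(*(levels[a] for a in apps)) if sum(combo) <= c_t]
        for c_t in c
    ]
    best_cost, best_plan = float("inf"), None
    for plan in product(*per_epoch):
        # Constraint 4: daily consumption cap of each appliance.
        if any(sum(step[i] for step in plan) > m[a] for i, a in enumerate(apps)):
            continue
        # Quadratic reconstruction error (unknown loads end up in this term).
        cost = sum((c_t - sum(step)) ** 2 for c_t, step in zip(c, plan))
        # Weighted number of level changes between consecutive epochs.
        for i, a in enumerate(apps):
            changes = sum(plan[t][i] != plan[t - 1][i] for t in range(1, len(plan)))
            cost += alpha[a] * changes
        if cost < best_cost:
            best_cost, best_plan = cost, plan
    if best_plan is None:
        raise ValueError("no feasible assignment")
    return best_cost, {a: [step[i] for step in best_plan] for i, a in enumerate(apps)}


# Toy instance with four epochs and consumption expressed in Wh.
cost, plan = disaggregate_brute_force(
    c=[100, 2100, 2100, 100],
    levels={"dishwasher": [0, 1000, 2000], "fridge": [0, 100]},
    alpha={"dishwasher": 1, "fridge": 1},
    m={"dishwasher": 5000, "fridge": 1000},
)
```

Exhaustive search grows exponentially with the number of epochs and appliances, so it is only useful for checking small cases by hand; a full day at 15 min resolution requires a mixed integer solver.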
Parameter training and QP model solution
We now discuss how each input set and parameter of the QP model introduced in the previous subsection is dimensioned.
Set T: the number of epochs depends on the duration of the scheduling horizon and on the resolution of the aggregated consumption measurements collected by the smart meters. As an example, assuming that the scheduling horizon is 24 hours and the granularity of consumption measurements is 15 min, the number of epochs is 96, thus we can define set T={1,2,…,96}.
Set A: the set of main electrical appliances installed in a building. Those appliances may include: dishwasher, washing machine, clothes dryer, oven, electric vehicle, heat pump, air conditioner.
Set L_{a}: we assume that each appliance can operate at a predefined number of consumption levels. The number of levels and the energy consumption per epoch associated to each level can be determined by collecting statistics over historical individual consumption data (if available) or over publicly available datasets containing load consumption curves of the main categories of electrical appliances (see Table 1). Note that set L_{a} always contains the element 0 (corresponding to the appliance off state). In the following, we report the algorithm we used to extract consumption levels from consumption curves of individual appliances, when available to be used as training data.

1. Create a histogram by defining a set of energy consumption bins and computing the number of measurements falling into each bin, where the number of bins is a predefined system parameter (e.g., 50 bins of width 100 W in the range 0–5 kW);

2. Identify the histogram peaks with prominence greater than p measurements, where p is a predefined system parameter that depends on the total number of available measurements, i.e. on the temporal window covered by the training dataset;

3. Retrieve the extremes [b_{low},b_{high}] of the energy consumption bins associated with the selected peaks and calculate the corresponding energy consumption level as (b_{high}−b_{low})/2+b_{low}.
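The three steps above can be sketched in plain Python as follows. This is an illustrative sketch rather than the code used in this work. Three choices are assumptions: measurements at or below on_threshold are treated as the off state and left out of the histogram (level 0 is added explicitly), bins outside the histogram range are treated as empty so that a peak may sit in the first or last bin, and the prominence of a peak is taken as its height above the higher of the two surrounding valleys. All function and parameter names are hypothetical.

```python
def _prominence(counts, i):
    """Height of the peak in bin i above the higher of its two surrounding valleys."""
    padded = [0] + list(counts) + [0]  # virtual empty bins outside the range
    pos, height = i + 1, counts[i]
    bases = []
    for step in (-1, 1):
        j, lowest = pos + step, height
        # Walk outwards until a strictly higher bin (or the edge) is reached.
        while 0 <= j < len(padded) and padded[j] <= height:
            lowest = min(lowest, padded[j])
            j += step
        bases.append(lowest)
    return height - max(bases)


def extract_levels(samples, bin_width=100.0, n_bins=50, p=5, on_threshold=0.0):
    """Return the sorted consumption levels found in one appliance's samples."""
    # Step 1: histogram with n_bins equal-width bins starting at 0.
    counts = [0] * n_bins
    for s in samples:
        if s <= on_threshold:
            continue  # off state, represented by level 0 below
        idx = int(s // bin_width)
        if idx < n_bins:
            counts[idx] += 1
    levels = {0.0}  # the off state is always part of L_a
    for i, height in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < n_bins - 1 else 0
        # Step 2: local maximum with prominence greater than p measurements.
        if height > left and height >= right and _prominence(counts, i) > p:
            # Step 3: midpoint of the bin associated with the peak.
            b_low, b_high = i * bin_width, (i + 1) * bin_width
            levels.add((b_high - b_low) / 2 + b_low)
    return sorted(levels)


# Synthetic samples in W: off state, a low-power phase, noise, a heating phase.
samples = [0.0] * 20 + [120.0] * 10 + [550.0] + [1010.0] * 2 + [2030.0] * 15
levels = extract_levels(samples)  # -> [0.0, 150.0, 2050.0]
```

In the example, the isolated measurements at 550 W and 1010 W form peaks with prominence 1 and 2, which is below p=5, so they are discarded as noise.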
Parameter c_{t}: aggregate energy consumption measurements are collected by smart meters installed at the users' premises. Note that, in case disaggregated consumption measurements collected via smart plugs are available, those are subtracted from c_{t} and disaggregation is performed excluding the directly monitored appliances from set A.
Parameters m_{a}, d_{a} and w_{a}: as for set L_{a}, the maximum daily energy consumption and minimum/maximum duration of the operational period of each appliance can be calculated either based on historical individual consumption patterns or on publicly available datasets. In our implementation, maximum and minimum durations were computed by identifying the epochs of activity of every appliance within the training dataset, computing the minimum (resp. maximum) number of consecutive activity epochs in the dataset and setting the values of w_{a} and d_{a} accordingly. To set the value of m_{a}, the average energy consumption c_{aver} during the activity epochs was calculated and we set m_{a}=c_{aver}·d_{a}.
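The computation of w_{a}, d_{a} and m_{a} described above can be sketched as follows. The rule used to decide whether an epoch counts as active (consumption strictly above on_threshold) is an assumption, and the function names are hypothetical.

```python
def run_lengths(profile, on_threshold=0.0):
    """Lengths of the maximal runs of consecutive active epochs."""
    runs, current = [], 0
    for value in profile:
        if value > on_threshold:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def train_duration_parameters(profile, on_threshold=0.0):
    """Return (w_a, d_a, m_a) from one appliance's per-epoch training profile."""
    runs = run_lengths(profile, on_threshold)
    if not runs:
        raise ValueError("appliance is never active in the training data")
    active = [v for v in profile if v > on_threshold]
    w_a = min(runs)                     # minimum number of consecutive active epochs
    d_a = max(runs)                     # maximum number of consecutive active epochs
    c_aver = sum(active) / len(active)  # average consumption over active epochs
    m_a = c_aver * d_a                  # m_a = c_aver * d_a
    return w_a, d_a, m_a


# Two activity periods lasting 2 and 4 epochs (consumption in Wh per epoch).
profile = [0, 200, 400, 0, 0, 300, 300, 300, 300, 0]
params = train_duration_parameters(profile)  # -> (2, 4, 1200.0)
```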
Parameter u_{a,t}: this parameter can be used to prevent some appliances from being turned on at certain time periods. For example, if absence from home is inferred by motion detectors, the off state of oven, dishwasher, washing machine and clothes dryer (unless they support automatic deferral of their operational period) can be enforced^{Footnote 3}.
Parameter α_{a}: the value of the coefficients used to impose piecewise constant behaviour of the consumption curve was tuned depending on the appliance type and time granularity. For appliances that exhibit pronounced energy consumption fluctuations even in relatively short time intervals (e.g. washing machines and dishwashers, depending on the phase of the washing cycle such as water heating or spinning), α_{a} is set to 0, whereas for appliances that do not show abrupt variations during the operating period (e.g., the recharge of an electric vehicle, especially if the charger does not support multiple charging rates) α_{a} is set to a higher positive value. Moreover, the coarser the time granularity, the lower the value of α_{a}, since consumption variations between consecutive epochs are more likely. For example, if the measurement granularity is 30 min, a washing machine that runs a 1-hour washing program is expected to have a lower consumption during the first 30 min of the cycle and a higher consumption during the next 30 min, when the spinning typically occurs; if the granularity is 5 min, a piecewise constant consumption across time epochs can reasonably be expected. Finally, as the main objective is the minimization of the quadratic error, weights were chosen so that the term \({\sum \nolimits }_{t \in T, a \in A}{\alpha _{a} \cdot y_{a,t}}\) was at least one order of magnitude lower than the term \({\sum \nolimits }_{t \in T}{(c_{t} - {\sum \nolimits }_{a \in A, l \in L_{a}}{l \cdot x_{a,l,t}})^{2}}\) (i.e., if multiple solutions minimizing the quadratic error exist, the one ensuring the minimum value of \({\sum \nolimits }_{t \in T, a \in A}{\alpha _{a} \cdot y_{a,t}}\) is selected).
Performance assessment
We trained and validated our algorithm using the UK-DALE 2015 dataset (see Table 1), which contains consumption measurements of 6 houses for different time periods. Three of those (buildings 3, 4 and 6) were monitored for a period shorter than two months, thus we excluded them from our analysis. For the remaining 3 buildings, we considered the following periods: building 1 from April 1, 2013 to May 31, 2013, building 2 from May 1, 2013 to June 30, 2013, building 5 from July 1, 2014 to August 31, 2014.
In the numerical assessment, we considered a scenario where we performed the disaggregation of the 5 top consuming appliances, which are identified beforehand based on their individual consumption during the training period. Note that the type of such appliances may vary from household to household, but is generally restricted to a subset of the following list: dishwasher, washing machine, fridge, freezer, electric oven, clothes dryer, air conditioner, space heater.
The performance of the disaggregation algorithm described in the previous Section, referred to as ILP in the following, is compared to that obtained by two state-of-the-art disaggregation approaches implemented in the publicly available NILMTK framework by Batra et al. (2014). The first one is based on Combinatorial Optimization (CO), the second one on a Factorial Hidden Markov Model (FHMM) (see Batra et al. (2014) for further details on their implementation and on the choice of their input parameters). The training of these two algorithms and of our algorithm was performed using as training set the first month of disaggregated measurements for each of the three buildings we selected from the UK-DALE dataset.
The CO and FHMM models are implemented in Python, whereas the ILP model has been implemented in AMPL and solved with the Gurobi solver, running on a Linux machine with 2 × Intel Xeon E5-2620 v4 2.1 GHz (20/32 cores have been allocated) and 16 GB of RAM. A computational time limit of 180 seconds per instance was imposed.
Performance metrics
The following performance metrics, proposed in (Batra et al. 2014), have been used to compare the performance of the three disaggregation algorithms: The Fraction of Total Energy Assigned Correctly (FTEAC), defined as:
where the estimated consumption of appliance a at time t in the case of the ILP algorithm is computed as \(\hat {z}_{a,t}={\sum \nolimits }_{l \in L_{a}}x_{a,l,t} \cdot l\), whereas in the case of the CO and FHMM algorithms it is obtained as output of the NILMTK implementation. Conversely, z_{a,t} is the true consumption of appliance a∈A at time t∈T obtained from the UK-DALE dataset.
The Normalized Error in Assigned Energy (NEAE) for each appliance a, defined as:
The Root Mean Square Error (RMSE) for each appliance a, defined as:
The True/False Positive Rate (TPR/FPR) for each appliance a, defined as:
Where:
The Accuracy (ACC) and Precision (PRE) for each appliance a, defined as:
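As an illustration, the per-appliance RMSE, TPR/FPR, ACC and PRE can be computed from the true and estimated consumption series as sketched below. The sketch uses the textbook definitions of these quantities, which may differ in normalization from the expressions adopted in this work, and it assumes that an appliance is considered on whenever its consumption is strictly above on_threshold. FTEAC and NEAE are not covered. Function names are hypothetical.

```python
from math import sqrt


def rmse(true, est):
    """Root mean square error between true and estimated consumption."""
    return sqrt(sum((z - zh) ** 2 for z, zh in zip(true, est)) / len(true))


def confusion_counts(true, est, on_threshold=0.0):
    """Count (TP, FP, TN, FN) over the epochs, based on on/off states."""
    tp = fp = tn = fn = 0
    for z, zh in zip(true, est):
        actual_on, predicted_on = z > on_threshold, zh > on_threshold
        if actual_on and predicted_on:
            tp += 1
        elif predicted_on:
            fp += 1
        elif actual_on:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(true, est, on_threshold=0.0):
    """TPR, FPR, ACC and PRE; a ratio with zero denominator is reported as NaN."""
    tp, fp, tn, fn = confusion_counts(true, est, on_threshold)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "TPR": ratio(tp, tp + fn),
        "FPR": ratio(fp, fp + tn),
        "ACC": ratio(tp + tn, tp + fp + tn + fn),
        "PRE": ratio(tp, tp + fp),
    }


# True and estimated consumption of one appliance over six epochs (Wh).
z_true = [0, 2000, 2000, 0, 0, 1000]
z_est = [0, 2000, 0, 0, 1000, 1000]
metrics = classification_metrics(z_true, z_est)
error = rmse(z_true, z_est)
```

In this example the estimate misses one active epoch and switches the appliance on once by mistake, which gives two true positives, one false negative, one false positive and two true negatives.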
Testing and validation
In Fig. 1 we report seven different performance indicators obtained by validating the three algorithms on the UK-DALE dataset, for different epoch granularities ranging from 5 to 60 min. As the appliances belonging to the top consuming set differ from building to building, global metrics are computed by averaging the results obtained for the 5 appliances belonging to each building.
It can be noted that the fraction of energy consumption correctly assigned by the ILP algorithm to the top 5 consuming appliances is slightly lower than that assigned by the CO and FHMM algorithms. However, the normalized error achieved by the ILP algorithm is consistently smaller than the one obtained by the two benchmark algorithms, while the root mean square error achieved by the ILP algorithm is slightly lower than that obtained by CO and FHMM.
The true positive rate of the ILP algorithm remains lower than that of the CO and FHMM algorithms, with FHMM outperforming CO. However, an increase in the true positive rate of the ILP algorithm is observed at coarse granularities (45 and 60 min epochs). The relatively poor performance of the ILP in terms of true positive rate is compensated by the very low false positive rate, which is much smaller than that achieved by the benchmark algorithms. This means that, though the ILP algorithm sometimes does not detect some activity periods of the appliances, it almost never fails in detecting off periods, whereas the CO and FHMM algorithms often incorrectly turn on appliances. Overall, the ILP algorithm achieves accuracy and precision ranges comparable to those of the benchmarks, slightly outperforming the benchmarks and showing remarkably smaller interquantile ranges at coarse measurement granularities^{Footnote 4}.
While no single algorithm clearly dominates the others, the low false positive rate and the relatively good precision and accuracy seem to be features of some importance when feedback is provided to real users, as a higher number of false positives might eventually reduce the user's confidence in the algorithm output.
Conclusions
In this paper we have described a novel algorithm for the disaggregation of the overall energy consumption pattern of a household into the single end-uses of each appliance. The proposed algorithm is based on the solution of a quadratic programming problem with mixed integer constraints. We reported the training and the validation of the algorithm on one well-known publicly available dataset and evaluated its performance for different granularities of the aggregated energy consumption measurements. The evaluation shows that the disaggregation results degrade gracefully and that accurate results can still be obtained with data at 15 min resolution, a temporal resolution commonly available in commercial smart-metering solutions, where sub-metering devices are not or cannot be installed.
Availability of data and materials
The source code and references to the datasets can be found at https://github.com/encompassprojecteu/disaggregator.
Notes
Free download available at www.loadprofilegenerator.de (accessed on March 31, 2019)
As in the dataset used for our numerical assessment no information on presence/absence of house dwellers was included, u_{a,t} was set to 1 by default.
Note that the basic version of the algorithm in (Piga et al. 2016) achieves lower accuracy than the ILP algorithm with the considered dataset, mainly because the coarse granularity of measurements at 15 min resolution and above violates the assumption of piecewise constant consumption required in (Piga et al. 2016) and because the fraction of energy consumed by unknown appliances (which are not modelled in (Piga et al. 2016)) is erroneously attributed to the top 5 consuming appliances.
References
Batra, N, Kelly J, Parson O, Dutta H, Knottenbelt W, Rogers A, Singh A, Srivastava M (2014) NILMTK: An open source toolkit for non-intrusive load monitoring. In: Proceedings of the 5th International Conference on Future Energy Systems, 265–276. ACM, Cambridge.
Hart, G (1992) Nonintrusive appliance load monitoring. Proc IEEE 80(12):1870–1891.
Piga, D, Cominola A, Giuliani M, Castelletti A, Rizzoli AE (2016) Sparse optimization for automated energy end-use disaggregation. IEEE Trans Control Syst Technol 24(3):1044–1051.
Zeifman, M, Roth K (2011) Nonintrusive appliance load monitoring: Review and outlook. IEEE Trans Consum Electron 57(1):76–84.
Zoha, A, Gluhak A, Imran M, Rajasegarar S (2012) Non-intrusive load monitoring approaches for disaggregated energy sensing: A survey. Sensors 12(12):16838–16866.
About this supplement
This article has been published as part of Energy Informatics Volume 2 Supplement 1, 2019: Proceedings of the 8th DACH+ Conference on Energy Informatics. The full contents of the supplement are available online at https://energyinformatics.springeropen.com/articles/supplements/volume2supplement1
Funding
This work has been partially supported by the Horizon 2020 project enCOMPASS (Grant N. 723059). Publication of this supplement was funded by the Austrian Federal Ministry for Transport, Innovation and Technology.
Contributions
All authors have contributed equally to this paper. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Rottondi, C., Derboni, M., Piga, D. et al. An optimisationbased energy disaggregation algorithm for low frequency smart meter data. Energy Inform 2 (Suppl 1), 13 (2019). https://doi.org/10.1186/s4216201900898