
Generating synthetic load profiles of residential heat pumps: a k-means clustering approach


The creation of synthetic heat pump load profiles is essential for energy system modeling and simulations. This paper proposes a methodology to create synthetic heat pump load profiles based on the k-means algorithm and a data set from water-to-water heat pumps from Hamelin, Germany. The quality of the generated load profiles is shown according to load factors, load distribution curves and the Pearson correlation coefficient, and is also applied on two exemplary geographies in Germany. We publish our work open-source and provide a web-based heat pump load profile generator.


Germany plans to install up to 500,000 heat pumps annually from 2024 onwards. In doing so, the government aims to transition the heating sector to carbon-free, electricity-based heating (DW 2023). While heat pumps offer several benefits, such as high energy efficiency, reduced reliance on gas, and the absence of direct carbon emissions, their adoption might also entail certain pitfalls. For instance, Protopapadaki and Saelens (2017) show that heat pump penetrations of more than 20-30% could cause severe issues in distribution grids. Hence, it is essential to analyze the impact of increasing heat pump loads on our energy system and to develop heat pump operation strategies.

However, the absence of widely available heat pump load data has led to limitations in existing heat pump-related studies: often, heat pump load profiles are derived from simulation models like TRNSYS (Maranghi et al. 2023). While this approach is feasible for small-scale, building-level investigations, it is not viable for large-scale studies with a large number of heat pump loads. Gunkel et al. study the impact of heat pumps on national peak load hours. The authors use electricity consumption data of 720,000 households and find that heat pump installations lead to 14% more peak load hours in Denmark than electric vehicles. Although the study yields relevant insights into the impact of heat pumps on national electricity demand and peak loads, the underlying data is not available as an open-source data set and hence cannot be used for further studies.

In 2022, Schlemminger et al. published the first high-quality and high-resolution data set of heat pump and household loads for 38 households from Hamelin, Germany (Schlemminger et al. 2022), which has since served as the data basis for multiple studies in the field. For instance, Yang et al. present a model to manage and coordinate loads in order to reduce distribution grid operation costs. They use the data from Schlemminger et al. to model daily load peaks (Yang et al. 2023). Zhu et al. use the combined household and heat pump data from the data set to model a household in Hamburg and present a carbon reduction- and savings-aware operation mechanism for a combined PV-BES-EV system (photovoltaics-battery storage-electric vehicle). Their study highlights the problems of using a limited heat pump load data set, as using the Hamelin-based data set to model a Hamburg-based household increases simulation inaccuracy.

To overcome spatial and temporal restrictions of open-source data sets, researchers have worked on methods to generate synthetic load profiles, especially for residential customers (Pinceti et al. 2019; El Kababji and Srikantha 2020). Further studies deal with the synthetic generation of industrial and commercial heat load profiles. Jesper et al., for instance, apply a k-means clustering method to 797 annual gas load profiles to create synthetic industrial and commercial heat load profiles. However, to the best of our knowledge, there are no studies or open-source tools for the creation of synthetic heat pump load profiles. In this study, we aim to fill this research gap by introducing a k-means-based clustering model to create synthetic regional heat load profiles based on regional weather data and the data set of Schlemminger et al. (2022).

The contributions of this work are summarized in the following. This paper presents the first model for synthetic heat pump load profile generation, applying a k-means clustering approach. We validate our model based on metrics from existing literature (Li et al. 2020). We also contribute a novel method to determine the optimal number of clusters by finding a trade-off between load profile diversity and accuracy. Furthermore, we publish our model and a web-based heat pump load profile generator open-source to enable research models in the field of energy informatics to integrate heat pump energy consumption, thereby helping to overcome the lack of publicly available heat pump load profiles (Footnote 1).

The remainder of the paper is structured as follows. First, related work is presented and the implications for this study are discussed. Second, the methodology of our approach and the respective evaluation metrics are introduced. Third, we present the case study on which our methodology is applied. Fourth, we evaluate our results according to the introduced metrics and discuss the optimal selection of clusters. Finally, we summarize our work in the conclusion and give an outlook on further research questions.

Related work

Most synthetic load profile generation studies focus on household load profiles. Pillai et al. use artificial neural networks to create normalized residential load profiles based on weather data (Pillai et al. 2014). The authors show the opportunities of synthetic load profile creation, especially for simulations in regions without adequate data, which would otherwise have to rely on inaccurate methods, such as a constant load assumption. While Pillai et al. focus on standardized load profiles for whole regions based on temperature profiles, Fischer et al. (2015) introduce a stochastic model to create synthetic residential household load profiles with high resolution, incorporating socio-economic features as well as seasonal effects. In a later work, the model is extended with space heating and hot water load profiles (Fischer et al. 2016). That study particularly addresses the importance of diverse load profiles. In a further study, Li et al. use an iterative process, based on geographic locations and load compositions, to create bus-level load time series (Li et al. 2020). The authors also discuss the validation of their results in detail, which serves as the basis for the evaluation of our case study. In another relevant study, Jesper et al. use a k-means clustering approach to create synthetic industrial and commercial heat load profiles, based on 797 annual natural gas profiles. In their study, the correlation between heat loads and ambient temperature is used to create synthetic heat profiles. Another relevant contribution in the field is made by Ruhnau et al., who introduce the “When2Heat” data set, which includes national heat pump load profiles and coefficients of performance (COP) for 16 cold-temperature climate countries in the European Union (Ruhnau et al. 2019). The authors underline the need for open energy data for electricity market simulations. However, the data set is targeted at nation-wide studies and is therefore less appropriate for simulations at household level and in smaller grid-level aggregations.

While the studies described above made valuable contributions to the field of synthetic load profile generation, to the best of our knowledge no prior work focuses on the synthetic generation of household-level heat pump load profiles for varying geographies. Hence, we introduce our k-means-based model in the following section.


In this section, we describe our methodology to create synthetic heat pump load profiles based on the underlying data set. We employ a k-means clustering approach based on the overall methodology introduced by Jesper et al. to create synthetic heat profiles. First, we describe the overall functionality of the k-means algorithm. Then, we depict the necessary steps to use the k-means algorithm to obtain synthetic load profiles. Finally, we describe suitable synthetic heat profile validation metrics from the existing literature.

K-means algorithm: The k-means algorithm was introduced by MacQueen in 1967 (MacQueen 1967). The clustering algorithm is computationally efficient and easy to implement. Hence, many studies in the field of energy informatics and other domains rely on the k-means algorithm (Panapakidis and Christoforidis 2017; Azad et al. 2014; Jessen et al. 2022). The algorithm iteratively partitions a data set into K clusters, with the aim of minimizing the sum of squared Euclidean distances from every observation to its cluster centroid \(\mu _i\). Every cluster centroid has the same dimension as the observations. In this study, we aim to cluster temperature profiles day-wise with an hourly resolution. Hence, every cluster centroid \(\mu _i\) and observation \(x_j\) is represented as a 24-dimensional vector, whereby each dimension represents one hour. After the initialization of cluster centroids and the respective assignment of observations to a specific cluster by minimization of the Euclidean distance, each cluster centroid is iteratively redefined as \(\mu _{i*}\), by calculating the mean of all \(M_i\) observations \(x_m\) that are assigned to cluster centroid \(\mu _i\) (Eq. 1):

$$\begin{aligned} \mu _{i*}=\frac{1}{M_i} \sum _{m=1}^{M_i} x_m \end{aligned}$$
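
For reference, the minimization goal described above can be written compactly. This is the standard k-means formulation; the name \(J\) and the symbol \(C_i\) for the set of observations currently assigned to centroid \(\mu _i\) are notation assumed here for illustration and do not appear in the original text:

$$\begin{aligned} J = \sum _{i=1}^{K} \sum _{x_j \in C_i} \Vert x_j - \mu _i \Vert ^2 \end{aligned}$$

Neither the assignment of observations to their nearest centroid nor the recomputation of centroids as cluster means can increase \(J\), which is why the iteration settles on a stable partition.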

Overall, the k-means algorithm is based on the following steps:

  1. Initialization: we randomly choose K cluster centroids.

  2. Assignment of observations: every observation \(x_j\) is assigned to its nearest cluster centroid \(\mu _i\), based on the Euclidean distance.

  3. Update of centroids: new cluster centroids \(\mu _{i*}\) are calculated based on Eq. 1.

  4. Termination: the algorithm is terminated when there are no further changes in partitions. Otherwise, the algorithm is repeated again from step 2 onwards.
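
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration and not the authors' published implementation: the function name `kmeans`, the `seed` and `max_iter` parameters, the choice to initialize centroids from randomly drawn observations, and the toy temperature values are all assumptions made for the example.

```python
import math
import random


def kmeans(observations, k, seed=0, max_iter=100):
    """Cluster equal-length vectors into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: initialization, here with k randomly chosen observations (assumption).
    centroids = [list(x) for x in rng.sample(observations, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every observation to its nearest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda i: math.dist(x, centroids[i]))
            for x in observations
        ]
        # Step 4: terminate once the partition no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: move each centroid to the mean of its assigned observations.
        for i in range(k):
            members = [x for x, lab in zip(observations, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Toy data (illustrative): three cold and three warm days as 24-hour vectors.
days = [[-5.0 + 0.1 * d] * 24 for d in range(3)]
days += [[20.0 + 0.1 * d] * 24 for d in range(3)]
centroids, labels = kmeans(days, k=2)
print(labels)  # cold days share one label, warm days the other
```

In practice, a library implementation with a more careful initialization would usually be preferable; the sketch only mirrors the textual description.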

Synthetic heat pump profile clustering model: We use the k-means algorithm to cluster daily temperature profiles into K clusters, based on the previously introduced procedure. Thereby, for every day d and every household h, the respective heat pump load profile \(P_{d,h}\) belongs to the cluster of that day's temperature profile: \(P_{d,h} \rightarrow k\). Consequently, every cluster k has, per household, its own set of associated heat pump load profiles: \(k_h : \{P_{d,h}, ...\}\). As there are multiple possible temperature measurements, we calculate the correlation between all measurements and the target heat pump load to select the most representative temperature profile as the basis for the clustering process.

We can use the fact that every temperature cluster has its own set of associated daily heat pump load profiles to create synthetic profiles (e.g., for new geographic locations) with the following steps:

  1. K-means initialization: We apply the k-means algorithm on our underlying data set and create k clusters and the associated set of heat pump load profiles.

  2. Temperature processing: We transform the temperature profile of the desired geographic location and time horizon into 24-dimensional vectors.

  3. Clustering: We map the daily temperature profiles of the target geography to the previously determined clusters.

  4. Heat pump profile creation: For every day, we randomly draw a heat pump load profile from the set associated with the respective cluster.
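
Steps 2 to 4 can be illustrated with a short Python sketch for a single household. It assumes that step 1 has already produced cluster centroids and a cluster label for each historical day; the function names, the restriction to clusters that hold at least one stored profile, and all numeric values are hypothetical and only serve the example.

```python
import math
import random


def to_daily_vectors(hourly_values):
    """Step 2: split an hourly series into complete 24-hour day vectors."""
    n_days = len(hourly_values) // 24
    return [hourly_values[d * 24:(d + 1) * 24] for d in range(n_days)]


def build_cluster_pools(labels, load_profiles):
    """Group one household's daily load profiles by the cluster of their day."""
    pools = {}
    for label, profile in zip(labels, load_profiles):
        pools.setdefault(label, []).append(profile)
    return pools


def generate_synthetic_profile(target_days, centroids, pools, rng):
    """Draw one historical daily load profile per target day."""
    synthetic = []
    for day in target_days:
        # Step 3: map the day's temperature vector to its nearest centroid.
        # Only clusters with at least one stored profile are considered (assumption).
        cluster = min(pools, key=lambda c: math.dist(day, centroids[c]))
        # Step 4: random draw from the load profiles associated with that cluster.
        synthetic.append(rng.choice(pools[cluster]))
    return synthetic


# Toy training result (illustrative): a cold and a warm centroid, plus three
# historical days of one household with their cluster labels and loads in W.
centroids = [[-5.0] * 24, [20.0] * 24]
labels = [0, 0, 1]
load_profiles = [[900.0] * 24, [1100.0] * 24, [50.0] * 24]
pools = build_cluster_pools(labels, load_profiles)

# Two days of hourly temperatures for a hypothetical target location.
target_hourly = [-3.0] * 24 + [18.0] * 24
target_days = to_daily_vectors(target_hourly)
synthetic = generate_synthetic_profile(target_days, centroids, pools, random.Random(1))
```

Because the draw in step 4 is random, repeated calls with different random states can return different profiles for the cold day, while the warm day always receives the single profile stored for its cluster.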

Generally, we are able to apply this procedure for every household h in the set of all underlying households H, to create a synthetic household load profile. Thereby, we can implement different usage patterns and sizes of households and heat pumps. Through the random selection process in step 4, every synthetically generated heat pump load profile based on the same household is unique. To create a synthetic heat pump data set of N households, we can randomly draw N households from our underlying set of households H.
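
As a small illustration of the last sentence, the household draw could look as follows. The text does not state whether households are drawn with or without replacement; the sketch assumes drawing with replacement, which allows N to exceed the number of underlying households. The function name and the household identifiers are hypothetical.

```python
import random


def draw_households(household_ids, n, rng):
    """Draw n household ids from the base set H, with replacement (assumption)."""
    return rng.choices(household_ids, k=n)


# Hypothetical identifiers for a base set of 21 households.
base_households = [f"household_{i:02d}" for i in range(1, 22)]
sample = draw_households(base_households, 100, random.Random(42))
```

Each drawn identifier would then be passed through the four generation steps independently, so two synthetic households based on the same real household still differ through the random profile selection.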

Validation metrics: Our synthetically generated load profiles can be validated from two perspectives. First, they should follow the distribution of the underlying data set (Snoke et al. 2018). Second, they should vary over different iterations, exhibiting a desired degree of diversity. For the first point, we evaluate key characteristics of the synthetic data for a given test period using validation metrics suggested by Li et al. (2020). In detail, we regard load factors over time, which depict the ratio of mean loads to peak loads, as well as load distribution curves, which depict the percentage of loads in relation to the mean load. For a detailed introduction of the metrics, we refer to Li et al. (2020). Furthermore, we compare the deviation between weekly real and synthetic heat pump electricity consumption over the regarded test period, and the correlation of synthetic and real profiles, as in Fischer et al. (2016).
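
Two of these quantities are simple enough to sketch directly. The following Python snippet is an illustration under stated assumptions, not the evaluation code of the study: it assumes equally spaced load samples (so that sums are proportional to energy), the function names are hypothetical, and the load values are invented toy numbers.

```python
def load_factor(loads):
    """Ratio of mean load to peak load for one period (e.g., one month)."""
    return (sum(loads) / len(loads)) / max(loads)


def relative_energy_error(real, synthetic):
    """Absolute deviation of total synthetic consumption, relative to the real total."""
    return abs(sum(synthetic) - sum(real)) / sum(real)


# Toy load samples in W for the same period (illustrative values only).
real = [200.0, 400.0, 800.0, 600.0]
synthetic = [250.0, 350.0, 750.0, 700.0]

print(load_factor(real))                       # mean 500 W over peak 800 W
print(load_factor(synthetic))                  # mean 512.5 W over peak 750 W
print(relative_energy_error(real, synthetic))  # 50 of 2000 units deviation
```

A load factor close to 1 indicates a flat profile, while a low value indicates pronounced peaks relative to the average load.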

Besides the quality of the synthetically generated load profiles, we want to evaluate their diversity. Fischer et al. underline the importance of diversity in synthetic load profiles to avoid aggregations of peak loads (Fischer et al. 2016). To analyze the diversity of our generated profiles, we generate L synthetic heat pump load profiles. Then, for every time step t in the test period, the variance over all L synthetically generated profiles, \(\sigma _{L,t}^2\), is calculated. Finally, we calculate the mean variance MV as the average of these variances over all T time steps:

$$\begin{aligned} \textrm{MV} = \frac{1}{T} \sum _{t=1}^{T} \sigma _{L,t}^2 \end{aligned}$$

For robust results, we compare the aggregation of synthetic and real loads for all underlying households H in the respective test period, \(P_{agg, gen}\) and \(P_{agg, real}\). To analyze the diversity of generated load profiles, we create a large number L of aggregated synthetic profiles and evaluate the MV of \(P_{agg, gen}\).
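
The mean variance can be computed in a few lines. The sketch below is an illustration only: the text does not specify whether the population or the sample variance is used, so the population variance is assumed here, and the function name and the toy profiles are hypothetical.

```python
import statistics


def mean_variance(profiles):
    """Mean over time steps of the variance across the given profiles."""
    # zip(*profiles) yields, for each time step, the values of all profiles.
    # Population variance is an assumption; the source does not specify it.
    per_step_variance = [statistics.pvariance(values) for values in zip(*profiles)]
    return sum(per_step_variance) / len(per_step_variance)


# Two toy profiles with three time steps that differ only at the second step.
diverse = [[1.0, 2.0, 3.0], [1.0, 4.0, 3.0]]
# Three identical toy profiles, i.e., a fully deterministic generator.
identical = [[1.0, 2.0, 3.0]] * 3

print(mean_variance(diverse))    # per-step variances 0, 1, 0
print(mean_variance(identical))  # no diversity at all
```

A generator that always returns the same profile yields a mean variance of zero, which is the deterministic limit that the cluster number selection tries to avoid.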

Cluster number selection: Previous studies, such as Jesper et al., use the elbow method to find the number of clusters K beyond which additional clusters yield only a small further reduction of distortion within clusters. However, the elbow method is solely focused on the temperature clustering itself. In this study, we adopt a broader view when selecting the optimal number of clusters: an increase in the number of clusters reduces the number of associated heat pump profiles per cluster and thereby makes the clustering model more deterministic and less diverse. When the number of clusters K equals the number of days in the data set D, each cluster is associated with only a single day, so every synthetically generated heat pump load profile for a given day d and household h is identical across runs. Hence, we propose comparing the accuracy of synthetic profiles with the achieved diversity to find the optimal cluster number K.

Case study

We apply the methodology introduced previously to the data set of Schlemminger et al. (2022). The data set consists of household and heat pump load measurements for 38 single-family homes in Hamelin, Germany, for the period between May 2018 and the end of 2020. The households have an average annual household load of 2829 kWh and a heat pump load of 4993 kWh. To the best of our knowledge, the data set is the first of its kind, providing high-quality and high temporal resolution heat pump load profiles. The households in the data set are equipped with water-to-water heat pumps and an additional 6 kWh heating rod as backup. Furthermore, the houses are equipped with a 300 liter storage tank. In addition, the households have solar thermal systems installed, which mainly take over the production of hot water during summer. Although this alters the heat pump load profile in summer months compared to heat pump households without additional solar thermal systems, we argue that this is an acceptable limitation, as the most critical load peaks occur in winter months, when the solar thermal systems remain inactive. We also note that the main type of installed heat pumps in Germany are air-to-water heat pumps (BWP 2023), which can exhibit different load profiles and react differently to cold temperatures. However, we argue that, due to comparable coefficients of performance across heat pump types, our results can also be indicative of other heat pump types, especially in high-load winter weeks (Çakır et al. 2013).

We utilize the 21 out of the 38 household load profiles that have no missing data in the period between January 2019 and December 2020 to train our clustering algorithm. We then use the period from May 2018 to the end of 2018 for model testing by creating synthetic heat pump loads and comparing them to the aggregated real loads.

The data set of Schlemminger et al. (2022) includes various temperature features. To ensure that there is a sufficient connection between temperature and heat pump loads, we conduct an initial correlation analysis between the aggregated heat pump load and the temperature features. We observe that the temperature and apparent temperature have the highest correlation with the aggregated heat pump load, as depicted in Fig. 1. Hence, we base our clustering model on the apparent temperature.
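
The feature selection described here amounts to computing a Pearson correlation coefficient per candidate feature and keeping the one with the strongest absolute correlation. The following sketch shows the idea in plain Python. The feature names, the second feature `wind_speed`, and all numbers are hypothetical toy values, not figures from the data set.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def most_correlated_feature(features, load):
    """Return the feature name whose series has the largest |r| with the load."""
    return max(features, key=lambda name: abs(pearson(features[name], load)))


# Toy observations (illustrative): load in W falls as temperature in °C rises.
features = {
    "apparent_temperature": [-2.0, 3.0, 8.0, 15.0, 22.0],
    "wind_speed": [3.0, 5.0, 2.0, 4.0, 3.0],
}
aggregated_load = [1200.0, 900.0, 600.0, 250.0, 40.0]

best = most_correlated_feature(features, aggregated_load)
print(best)
```

The absolute value is used because a strong negative correlation, as expected between temperature and heating load, is just as informative for clustering as a strong positive one.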

Fig. 1. Correlation of temperature features with aggregated heat pump load


In this section, we evaluate our synthetically generated heat pump load profiles. First, we depict the general outcome of the clustering algorithm. Second, we compare the aggregated synthetic heat pump profiles for the test period with their actual values. For this period, we also calculate the previously introduced validation metrics. Third, we compare the interplay of accuracy and diversity over a range of possible numbers of clusters K to indicate the optimal number of clusters. Finally, we create synthetic data for exemplary cities in Germany to show the generalizability of our approach. In showing the internal validity of the synthetically generated load profiles, rather than benchmarking our model against other possible models, we follow the overall approach of other notable works in the field of synthetic load profile generation (Li et al. 2020).

Clustering: Figure 2 shows the result of the clustering process, based on \(K=10\) clusters. We observe ten different centroids of the k-means model, based on varying temperature profiles, and illustrate the corresponding mean heat pump load profiles. For instance, centroid 9 exhibits constantly negative temperatures. The corresponding load profile also exhibits the highest heat pump loads, with especially high peaks in the afternoon, as would be expected. Overall, we observe that most mean heat pump load profiles have afternoon peaks. Centroid 9, having the highest temperatures, has mean heat pump loads close to 0 W.

Fig. 2. Clustering results with \(K=10\)

Validation: To validate the results of our clustering process, we create synthetic heat pump load profiles for all considered households in 2018 and compare them with the real heat pump load profiles, which were unseen in the training process of the k-means algorithm. Figure 3 presents the results of this comparison. The synthetic heat pump profiles match the real profile, showing low loads during summer and increasing heat pump loads during the colder winter months. This is confirmed in Fig. 4, where synthetic versus real weekly heat pump energy consumption is displayed. After repeating the synthetic creation for all considered households 50 times and comparing the heat pump energy consumption over the whole testing period with the actual consumption, we find a relatively low error of 2.4%. We conclude that, during the test period and for the Hamelin data set, the synthetic heat pump load profiles match the shape of the real observed profiles.

Fig. 3. Synthetic versus real heat pump load profile for all considered households

Fig. 4. Synthetic versus real weekly heat pump energy consumption for all considered households

We further validate our synthetic heat pump generation process according to two metrics from Li et al. (2020), namely load factors and load distribution curves. In Fig. 5a, we depict the load factors over time, comparing the real and synthetic heat pump load profiles. The load factor depicts the ratio of monthly average loads to peak loads. We observe that the overall shape of the synthetic data represents the real load profile well. In particular, the higher load factors in winter months are reproduced. In Fig. 5b, we depict the load distribution curves of the real and synthetic data, which show the percentage of load being at different values, relative to its mean value. Both curves follow the same pattern. Setting our results side by side with the results of Li et al. (2020), the deviation of load factors and load distribution curves between synthetic and real profiles is comparable, although the overall structure of the metrics varies significantly, since Li et al. generate bus-level electricity load profiles of power grids.

As another validation metric, we regard the Pearson correlation coefficient. For our test case, the Pearson correlation between real and synthetic profiles lies at 0.88. For comparison, in Fischer et al. (2016), where synthetic energy demand profiles are created with a stochastic bottom-up method, the correlation lies only slightly higher at 0.92.

Overall, comparing the various depicted validation metrics, we come to the conclusion that our synthetically generated heat pump load profiles match the distribution of the underlying data well and can be used for generating synthetic heat pump load time series.

Fig. 5. Validation of synthetically generated heat pump load profiles

Accuracy vs. diversity: To find the optimal number of clusters, we compare accuracy, in terms of the error of the heat pump energy consumption during our testing period, with the diversity of the generated load profiles, represented by the previously introduced mean variance (MV). To reduce the impact of stochastic effects, we calculate the average of both metrics for 50 generated synthetic heat pump profiles during our testing period for all possible numbers of clusters. Then, we scale the metrics to a range from 0 to 1 to ensure comparability. Figure 6a depicts the results of this analysis. We find that a low number of clusters leads to high annual consumption errors, providing inaccurate load profiles. On the other hand, a small number of clusters comes with a high degree of variance and thus diversity in load profiles. We suggest working with up to 10 clusters for a balanced trade-off between accuracy and diversity. In comparison, using the elbow method, we would work with two clusters, which corresponds to the recommended trade-off between distortion and the number of clusters, as depicted in Fig. 6b. However, as shown in Fig. 6a, this would lead to a relatively high annual error of the associated synthetic heat pump load profile generation process. Hence, we recommend considering the trade-off between mean variance and the error of the generated profiles, instead of the clustering-focused elbow method.

We note that our approach is subject to stochastic effects and depends on the underlying data set, although we aimed to reduce the stochastic effects by regarding the average results of multiple runs. We recommend a case-specific selection of the number of clusters K; in our case study, using up to 10 clusters yields a good trade-off between mean variance and accuracy.
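
The scaling step that makes the two metrics comparable can be sketched as a plain min-max normalization. The snippet below is only an illustration: the function name is hypothetical, and the candidate cluster numbers, errors and mean variances are invented toy values, not results of the case study.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # a constant series carries no contrast
    return [(v - low) / (high - low) for v in values]


# Hypothetical averages per candidate cluster number K (illustrative only).
cluster_numbers = [2, 10, 50]
energy_errors = [0.12, 0.04, 0.02]
mean_variances = [9.0, 6.0, 1.0]

scaled_error = min_max_scale(energy_errors)
scaled_mv = min_max_scale(mean_variances)
for k, err, mv in zip(cluster_numbers, scaled_error, scaled_mv):
    print(k, round(err, 3), round(mv, 3))
```

After scaling, both curves can be plotted against K on a common axis, and a suitable K can be chosen where the error is already low while the mean variance has not yet collapsed.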

Fig. 6. Cluster amount selection

Transferability: We create synthetic load profiles for two German municipalities, Kuehnhaide and Koeln-Stammheim, for the year 2019, according to our previously introduced model. Kuehnhaide, with a mean temperature of 7.6\(^{\circ }\)C in 2019, is among the coldest German regions, whereas Koeln-Stammheim, with a mean temperature of 11\(^{\circ }\)C, is among the warmest regions in Germany. Figure 7 illustrates the transferability of our model and shows that the synthetic aggregated load profile in Kuehnhaide exhibits significantly higher peaks than the profile of Koeln-Stammheim, as would be expected. Furthermore, the overall base heat pump load level in Kuehnhaide is constantly higher than in Koeln-Stammheim. We interpret this observation as an indication that we can use our approach for the synthetic generation of representative heat pump load profiles in other locations with temperature profiles and building structures comparable to those in Hamelin, e.g. in Germany, Austria or Switzerland. However, we also observe that days with particularly high heat pump loads in Kuehnhaide follow a more similar load pattern than in Koeln-Stammheim. This might indicate that load profiles on these days are drawn from a low-temperature cluster with fewer observations. Hence, the publication of further open-source heat pump load data sets, especially from regions with colder temperatures, could contribute to the overall quality of synthetic heat pump load profile generation approaches.

Fig. 7. Aggregated synthetic heat pump load profiles of 100 households in Kuehnhaide and Koeln-Stammheim


This work presents a k-means-based model to generate synthetic heat pump load profiles. We show that the synthetically generated data follows the structure of the real data according to load factors, load distribution curves and the Pearson correlation coefficient, which underlines the applicability of the proposed heat pump load profile generator for power system simulations. We suggest choosing the number of clusters for the model by comparing the accuracy of the generated profiles with their diversity, expressed by the mean variance of the generated load profiles. Future research can work on alternative synthetic load profile generation methods, such as Generative Adversarial Networks, and benchmark them against the presented method. Furthermore, future studies may also apply the presented methodology to data sets of other heat pump types and use time series transformation techniques to increase the diversity of the synthetically generated heat pump load profiles.

Availability of data and materials

The underlying data set is publicly available at Schlemminger et al. (2022). The underlying source code is made public as open-source publication at The web-based heat pump profile generator is available at


  1. Web-based heat pump load profile generator available at Source-code available at


  • Azad SA, Ali AS, Wolfs P (2014) Identification of typical load profiles using k-means clustering algorithm. In: Asia-Pacific World Congress on Computer Science and Engineering, pp. 1–6. IEEE

  • Bundesverband Wärmepumpe (BWP) (2023) Wärmepumpenabsatz 2022: Wachstum Von 53 Prozent Gegenüber dem Vorjahr. Accessed 01 Feb 2023

  • Çakır U, Çomaklı K, Çomaklı Ö, Karslı S (2013) An experimental exergetic comparison of four different heat pump systems working at same conditions: As air to air, air to water, water to water and water to air. Energy 58:210–219


  • Deutsche Welle (DW) (2023) How Germany plans to phase out oil and gas heating. Accessed 11 Mar 2023

  • El Kababji S, Srikantha P (2020) A data-driven approach for generating synthetic load patterns and usage habits. IEEE Trans Smart Grid 11(6):4984–4995


  • Fischer D, Härtl A, Wille-Haussmann B (2015) Model for electric load profiles with high time resolution for german households. Energy Build 92:170–179


  • Fischer D, Wolf T, Scherer J, Wille-Haussmann B (2016) A stochastic bottom-up model for space heating and domestic hot water load profiles for german households. Energy Build 124:120–128


  • Jessen SH, Ma ZG, Wijaya FD, Vasquez JC, Guerrero J, Jørgensen BN (2022) Identification of natural disaster impacted electricity load profiles with k means clustering algorithm. Energy Inform 5(4):1–29


  • Li H, Yeo JH, Bornsheuer AL, Overbye TJ (2020) The creation and validation of load time series for synthetic electric power systems. IEEE Trans Power Syst 36(2):961–969


  • MacQueen J (1967) Classification and analysis of multivariate observations. In: 5th Berkeley Symp. Math. Statist. Probability, pp. 281–297. University of California Los Angeles, LA, USA

  • Maranghi F, Gosselin L, Raymond J, Bourbonnais M (2023) Modeling of solar-assisted ground-coupled heat pumps with or without batteries in remote high north communities. Renew Energy. 207:484–498


  • Panapakidis IP, Christoforidis GC (2017) Implementation of modified versions of the k-means algorithm in power load curves profiling. Sustain Cities Soc 35:83–93


  • Pillai GG, Putrus GA, Pearsall NM (2014) Generation of synthetic benchmark electrical load profiles using publicly available load and weather data. Int J Electr Power Energy Syst 61:1–10


  • Pinceti A, Kosut O, Sankar L (2019) Data-driven generation of synthetic load datasets preserving spatio-temporal features. In: 2019 IEEE Power & Energy Society General Meeting (PESGM), pp. 1–5. IEEE

  • Protopapadaki C, Saelens D (2017) Heat pump and pv impact on residential low-voltage distribution grids as a function of building and district properties. Appl Energy 192:268–281


  • Ruhnau O, Hirth L, Praktiknjo A (2019) Time series of heat demand and heat pump efficiency for energy system modeling. Sci Data 6(1):189


  • Schlemminger M, Ohrdes T, Schneider E, Knoop M (2022) Dataset on electrical single-family house and heat pump load profiles in Germany. Sci Data 9(1):56


  • Snoke J, Raab GM, Nowok B, Dibben C, Slavkovic A (2018) General and specific utility measures for synthetic data. J R Stat Soc Ser A (Statistics in Society) 181(3):663–688


  • Yang Z, Yang F, Min H, Tian H, Hu W, Liu J, Eghbalian N (2023) Energy management programming to reduce distribution network operating costs in the presence of electric vehicles and renewable energy sources. Energy 263



Author information




LS, PJ and CW conceived and developed the general idea of the paper. LS developed and implemented the synthetic load profile generation model. All authors read and approved the final manuscript.

About this supplement

This article has been published as part of Energy Informatics Volume 6 Supplement 1, 2023: Proceedings of the 12th DACH+ Conference on Energy Informatics 2023. The full contents of the supplement are available online at

Corresponding author

Correspondence to Leo Semmelmann.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.


Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit


About this article


Cite this article

Semmelmann, L., Jaquart, P. & Weinhardt, C. Generating synthetic load profiles of residential heat pumps: a k-means clustering approach. Energy Inform 6 (Suppl 1), 37 (2023).
