Identification of natural disaster impacted electricity load profiles with k means clustering algorithm

Natural disasters threaten the resilience of the electricity system. However, little literature has investigated the electricity system's recovery process and progress after a natural disaster strikes, although both strongly influence system operators' planning and the security of supply for electricity customers. To fill this research gap, this paper applies an unsupervised machine learning method, the k means clustering algorithm, to investigate normal and abnormal electricity load profiles and to identify natural disaster- and electrical fault-impacted electricity load profiles, using a case study of the Lombok electricity system, Indonesia, with ½-hourly electricity load data from 2015 until 2021. The results show that electricity consumption in Lombok has increased over the years, which matches the installed production capacity of Lombok. The results show that disturbance-induced electricity load patterns, and especially natural disaster-impacted load profiles, can be identified by the k means clustering algorithm. In particular, the pre-, during-, and post-natural disaster load patterns can be portrayed. Furthermore, the investigation of the impacts of natural disasters and electrical faults on the performance of the electricity system shows that natural disaster-induced and electrical fault-induced load reductions differ from both short- and long-term perspectives. Moreover, the results can help electricity system operators to better understand the load patterns, predict the impact of natural disaster (ND) strikes on the electricity system, and conduct better long-term energy management strategies.

), e.g., hospitals, to function in island microgrid mode and paralyze the cyber security infrastructure. One of the major challenges to the security of the electricity supply is natural disasters, which have led to severe economic, environmental, and technical consequences. These materialize as brownouts intentionally generated by the electricity operator or, in the worst case, uncontrollable blackouts that shut down an entire society for a few minutes to days, or even longer (Ma 2022). Many countries, especially those located on islands, e.g., Japan, the Philippines and Indonesia, suffer strongly from natural disasters such as earthquakes, volcanic eruptions and floods.
The climate resilience of any given electricity system is subject to potentially devastating stress in the event of a natural disaster, especially in the days after the disaster, when the available electricity generation capacity can be limited due to damage to key components. Climate resilience is defined by the International Energy Agency (IEA) (2021) as "The ability of the system and its component parts to absorb, accommodate and recover from both short-term shocks and long-term changes. These shocks can go beyond conditions covered in standard adequacy assessments".
The root causes of the degradation of climate resilience are manifold. Climate resilience is subdivided into three dimensions that together govern the overall performance of the electricity system. Therefore, to address the climate resilience challenge, at least three dimensions (robustness, resourcefulness, and recovery) must be investigated and analyzed. The challenges with the deterioration of climate resilience in electricity systems can be divided into three phases relative to a natural disaster (ND) strike: pre-ND strike, the long-term impacts of climate change; during the ND strike, the existing infrastructure and the emergency response actions (outbreak of disruption); and post-ND strike, the recovery phase (restoring the system's functionalities).
The long-term impacts of climate change on electricity systems and the resilience of the existing infrastructure toward natural disasters have, to a great extent, already been investigated in the literature (Organization for Security and Co-operation in Europe (OSCE) 2016; Korbatov et al. 2017). The recovery of the electricity system from a non-steady state to a steady state is also a point of action in which the majority of electricity system operators are experienced. The determination and prediction of the amount of emergency response capacity (load shedding, production curtailment, etc.) that should be aggregated and deployed during the unfolding of a natural disaster is only partially covered in the existing literature. Furthermore, little literature has investigated the correlation between the size of a natural disaster strike and the corresponding decrease in electricity system performance after the strike.
A great amount of literature has deployed machine learning techniques to investigate and discuss how a "normal" residential load profile (without any significant variability in the profile) can be clustered into different groups (weekday, weekend, etc.) (Zhong et al. 2016; Damayanti et al. 2017; Vanting et al. 2021). However, no literature has identified natural disaster-impacted electricity load profiles in order to compare the power load reduction (performance decline) with the size and type of the ND that caused it. Furthermore, little literature has investigated the electricity system's recovery process and progress after a natural disaster strikes, although both strongly influence the security of supply for electricity customers as well as system operators' planning and decision making (Christensen et al. 2019).
Therefore, this paper aims to investigate the load profile characteristics and identify natural disaster-impacted electricity load profiles by applying the k means clustering algorithm to the electricity load profile for Lombok Island, Indonesia. In contrast to fault detection methods, which mainly focus on detecting predefined faults, the k means clustering algorithm is used to identify dynamics in behaviour that range over several consecutive time stamps. Thus, variations in pattern dynamics can be captured by deploying the k means clustering algorithm.
The k means clustering algorithm is a widely used method for clustering electricity load profiles, especially residential load profiles, where it is used to separate weekends, weekdays, and holidays. However, no literature on load profile clustering has investigated the differentiation between normal load profiles and abnormal load profiles caused by natural disasters. By applying the k means clustering algorithm, this paper aims to identify natural disaster clusters and post-natural disaster clusters and to capture the characteristics of natural disaster events.
The application of the k means clustering algorithm is based on the 2015-2022 electricity load profile with a ½-hourly timestep for the entire island of Lombok, Indonesia. The value of this particular power load profile compared to other load profiles (often purely residential) is that natural disaster occurrences have had a direct influence on the variation in load consumption. Therefore, this case study data can potentially provide valuable insights into how the electricity system is controlled before, during and after natural disaster events. This paper first introduces the background and related works on natural disasters and their impacts on electricity systems, with a particular focus on the application of clustering algorithms to electricity loads. In the methodology section, five steps are introduced (data collection and preparation; clustering algorithm selection and implementation; yearly, monthly and daily distribution analyses; cluster labelling; and electricity system climate resilience capability evaluation). In the case study section, the electricity system of Lombok is introduced. The results section is divided into four parts and is followed by the discussion and conclusion sections. In the results section, the clustering results for the whole period of 2015-2022 are introduced first, and then the clustering results of years without significant earthquake impact are discussed. Furthermore, the clustering results of years with significant earthquake impact are introduced and the electrical fault-affected load profiles are presented.

Background
Well-functioning electricity systems are a prerequisite for a modern society (Billanes et al. 2018). Therefore, the electricity supply chain, which spans the production, transmission, distribution, and consumption of electricity, is deemed a critical infrastructure (Agency 2015; Qingnan Li 2016). Due to the complex operation of electricity systems, ensuring a high security of supply is a top priority. Each part of the electricity supply chain involves several key components, such as generation units, transformer stations, high-voltage transmission lines and low-voltage distribution lines (Ma et al. 2021a). Each component can potentially be exposed to a great variety of natural, biological, technical, human-unintended or human-intended hazards (Organization for Security and Co-operation in Europe (OSCE) 2016). This implies that the performance of any electricity system can be quantified based on various metrics within different segments of the supply chain. Figure 1 shows how a disruption to the electricity system, defined as an interruption in the usual way that a system or process works (Cambridge Dictionary 2022), can be segmented into technical hazards (not nature-dependent) and natural hazards. A more complex hazard categorization can be found in Organization for Security and Co-operation in Europe (OSCE) (2016).
Technical hazards can be caused by electrical faults, which are defined as any abnormal condition of a system that involves electrical failure of equipment (Electronics Hub 2022). Thus, failures or power outages on power plants and malfunctioning transformer stations can be characterized as electrical faults. A natural hazard is a phenomenon that may cause property damage, social and economic disruption or environmental degradation (International Federation of Red Cross and Red Crescent Societies 2020).
Natural hazards that materialize as earthquakes, volcanic eruptions and storms are among the most impactful hazards for the performance of the electricity system. According to Korbatov et al. (2017), the U.S. insurance industry has identified a USD 20-55 billion annual financial loss from power outages induced by flooding, hurricanes and extreme temperatures. Climate change increases the severity of these natural disasters, which imposes additional stress on the electricity system (International Energy Agency (IEA) 2021; Organization for Security and Co-operation in Europe (OSCE) 2016; Korbatov et al. 2017).
A great number of stakeholder groups are affected during a natural disaster strike and within the recovery period (Ma 2019). One affected stakeholder group is the industrial and commercial actors that conduct business activities within the given electricity system. Electricity is a prerequisite for their operations, so poor climate resilience can result in enormous economic losses for industrial and commercial companies, as exemplified in DAMVAD (2015), which attempts to quantify economic losses during electricity interruptions. Another stakeholder group that is highly reliant on a secure electricity supply is the tourism industry, where both short-term (days) and long-term (months) economic effects can be experienced after a natural disaster event (Tangkudung 2018). Hotels and resorts located in hot and humid environments consume substantial amounts of electricity for air-conditioning, lighting, etc. Thus, electricity availability must be ensured at all times.
One main challenge is the High-impact low-probability (HILP) natural disasters that can have a big effect on the electricity system performance. HILP natural disasters potentially can wipe out an entire electricity system infrastructure (Lee and Preston 2012). The marginal cost of ensuring a resilience level that can tackle HILP events is enormous, yielding that it is not cost-effective to design an electricity system that can mitigate HILP events. This implies that an electricity system should be designed to handle Low-impact high-probability (LIHP) natural disasters, which can be effectuated with the deployment of emergency response actions. Emergency response actions, such as power production up-and downscaling, electricity grid import/export enabling and load shedding, can aid the electricity system operation during a natural disaster disruption, and thereby potentially avoid a complete blackout of the electricity system.
To quantify the performance of an electricity system during the unfolding of disruption events (see Fig. 1), and particularly natural disasters, the concept of climate resilience is useful. The climate resilience concept is introduced in International Energy Agency (IEA) (2021), as shown in Fig. 2. The figure presents three elements that characterize how a natural disaster affects an electricity system:
• Impact size/magnitude: The impact of the natural disaster strike on the electricity system (how large the decrease in "performance" is)
• Disruption duration: The duration of the natural disaster strike (how long the electricity system's "performance" is decreased)
• Recovery duration: The time the electricity system takes to return from a non-steady state to a steady state (how long the electricity system's "performance" increases before it returns to the pre-natural disaster state).
Fig. 2 The concept of climate resilience (International Energy Agency (IEA) 2021)

Related works
The impact of natural disasters on the planning and operation of electricity systems worldwide has gained a lot of interest in recent years with the transformation towards electricity- and sustainability-founded societies (International Energy Agency (IEA) 2021; Korbatov et al. 2017; Nicolas et al. 2019; Engineering 2021; Karagiannis et al. 2017; Waseem and Manshadi 2020). A natural disaster's impact on the electricity system can cause a drop in load consumption. Therefore, when a natural disaster strike is observed simultaneously with a load consumption reduction, this is interpreted as a direct reduction of the electricity system's performance. There has been a great focus on the vulnerability of the electricity system's supply chain segments to natural disasters/hazards (Nicolas et al. 2019; Karagiannis et al. 2017). With accelerating climate change and frequently occurring unpredictable weather patterns, literature reviews and case studies have been used to outline the consequences of natural hazards and to quantify potential solutions to mitigate their effect on electricity systems (Organization for Security and Co-operation in Europe (OSCE) 2016; Waseem and Manshadi 2020). Whereas public institutions and organizations have made valuable contributions on the impacts of natural disasters on electricity system supply chain segments, academia has focused on the clustering and identification of electricity load profiles. The literature on the identification and clustering of electricity load profiles has employed a great variety of methods, including extreme points and demographic characteristics (Jeong et al. 2021), fuzzy c means and k harmonic means (Damayanti et al. 2017), the Hybrid Load Profile Clustering algorithm (HLPC) (Zhong et al. 2016) and self-organizing maps (SOM) (Toussaint and Moodley 2020).
The application domain of the clustering techniques varies widely, from accumulated daily load profiles for an entire electricity system to the identification and separation of specific electricity consumers (Damayanti et al. 2017; Jeong et al. 2021). As stated previously, HILP events could devastate an entire electricity system and thereby leave electricity consumers without power for a certain period (Lee and Preston 2012; Waseem and Manshadi 2020). However, little literature has investigated how electricity load profiles are affected by HILP events.
Regarding the specific use of the k means clustering algorithm for clustering and identification of electricity load profiles, there has been a great focus on two types of electricity consumers in particular, namely the residential (Damayanti et al. 2017; Toussaint and Moodley 2020; Amri et al. 2016) and industrial (Richard et al. 2017) sectors, to detect and classify consumer behaviors and irregular industrial process patterns, respectively.
The k means clustering algorithm is an easy-to-implement algorithm whose objective is to group electricity load profiles with similar characteristics into one cluster and electricity load profiles with different characteristics into other clusters. The method aims to minimize the within-cluster sum-of-squares (also known as the inertia metric) (Scikit learn 2022).
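For reference, the within-cluster sum-of-squares objective can be written as below. The notation is assumed here and is not taken from the paper: $x_i$ is a (scaled) daily load profile, $C_j$ is the set of profiles assigned to cluster $j$, $\mu_j$ is the centroid of cluster $j$, and $k$ is the number of clusters.

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i
```

The algorithm alternates between assigning each profile to its nearest centroid and recomputing the centroids, which never increases $J$.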
The parameter n_clusters is a key input when creating the k means object. It should not be set randomly (Brus 2021). However, Richard et al. (2017) claim that 6 clusters are generally enough to capture the range of electricity load profiles for processes related to small and medium industries. To ensure a research-based selection of the optimal number of clusters, the inertia method can be invoked, which facilitates the detection of the "elbow point" (Amri et al. 2016). The inertia method calculates the sum of squared distances of the samples to their closest cluster centers for a user-defined range of potential cluster numbers (e.g., from 1 to 20 clusters). A human-based evaluation is then conducted to decide on the number of clusters beyond which the marginal benefit of adding one extra cluster is negligible. Another method to determine the optimal number of clusters is the Davies-Bouldin Index (DBI) score (Damayanti et al. 2017). The score measures the average similarity of each cluster with the cluster most similar to it. Thus, a low DBI score indicates good clustering performance, since it reflects a good separation of the clusters.
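The two selection criteria described above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the paper's actual procedure: the helper name `scan_cluster_counts`, the random seed and the three artificial groups of profiles are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score


def scan_cluster_counts(profiles, k_values, seed=0):
    """Return {k: (inertia, dbi)} for each candidate number of clusters."""
    results = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(profiles)
        # The DBI needs at least two clusters, so report NaN for k = 1.
        dbi = davies_bouldin_score(profiles, model.labels_) if k > 1 else float("nan")
        results[k] = (model.inertia_, dbi)
    return results


# Synthetic stand-in for scaled daily profiles (hypothetical data):
# three well-separated groups of 40 days, each with 48 half-hourly values.
rng = np.random.default_rng(0)
levels = [0.2, 0.5, 0.8]
profiles = np.vstack([lvl + 0.01 * rng.standard_normal((40, 48)) for lvl in levels])

scan = scan_cluster_counts(profiles, range(1, 7))
for k, (inertia, dbi) in scan.items():
    print(f"k={k}: inertia={inertia:.3f}, DBI={dbi:.3f}")
```

On data like this, the inertia curve flattens after three clusters (the "elbow") and the DBI is lowest at the same value. On real load data the elbow is usually less sharp, which is why a human evaluation is still needed.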

Methodology
The methodology applied in this paper mainly consists of an exploratory data analysis (raw data investigation) and the application of the k means clustering algorithm. Five steps are applied for the identification of natural disaster-impacted electricity profiles:
1) Data collection and data preparation
2) Clustering algorithm selection and implementation
3) Seasonal (yearly, monthly, daily) distribution analyses
4) Cluster labelling
5) Electricity system climate resilience capability evaluation

Data collection and data preparation
The paper first lists all potential data needed for the experiment based on the data ecosystem concept (Ma et al. 2021b) to ensure that all essential data is collected. Therefore, besides the electricity load data, this research also collects natural disaster data. At monthly resolution (or higher if possible), the type of natural disaster, the number of houses damaged by the disaster, the number of refugees due to the disaster, etc., are listed in table form, as provided by the National Disaster Management Agency (2022).
A thorough examination is carried out for each month throughout the entire dataset. The types of natural disasters include flooding, tsunamis, volcanic eruptions, landslides, earthquakes, etc. Historically, earthquakes have had a significant impact on the performance of electricity systems in networks near the epicenter (Karagiannis et al. 2017; Agency 2022). Therefore, the main focus is on earthquakes, since they are anticipated to have the most significant impact on the electricity system. Furthermore, it is vital to collect the time of the earthquake strike, its magnitude on the Richter scale, and the epicenter location.
Before performing the k means clustering algorithm, exploratory data analysis (EDA) techniques (Dwivedi 2021) are applied to detect missing values throughout the dataset. For the period 2012-2014, major chunks of missing data were identified. Due to the severity of the missing data, this part of the dataset was excluded from further analysis. For the period 2015-2022, the dataset was found to contain no Not a Number (NaN) values. Therefore, the 2015-2022 data was chosen for further analysis and application of the k means algorithm. The dimensions of the 2015-2022 dataframe are 2589 rows and 49 columns.
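A minimal sketch of this preparation step is shown below, assuming the raw data is a half-hourly pandas series. The helper name `to_daily_profiles` and the synthetic load values are hypothetical; only the date range (January 1, 2015 to February 1, 2022) is taken from the paper.

```python
import numpy as np
import pandas as pd


def to_daily_profiles(load):
    """Pivot a half-hourly series into one row per day and 48 half-hour columns."""
    frame = pd.DataFrame({
        "date": load.index.date,
        # Half-hour slot of the day, 0 (00:00) to 47 (23:30).
        "slot": (load.index.hour * 2 + load.index.minute // 30).to_numpy(),
        "mw": load.to_numpy(),
    })
    return frame.pivot(index="date", columns="slot", values="mw")


# Synthetic half-hourly load in MW covering the study period (hypothetical values).
index = pd.date_range("2015-01-01 00:00", "2022-02-01 23:30", freq="30min")
hours = index.hour.to_numpy() + index.minute.to_numpy() / 60
load = pd.Series(150 + 50 * np.sin(2 * np.pi * hours / 24), index=index)

daily = to_daily_profiles(load)
print(daily.shape)                     # days x half-hour slots
print(int(daily.isna().sum().sum()))   # total number of missing values
```

This date range yields 2589 daily rows, which agrees with the row count reported above. The paper does not state what the 49th column holds; a date or identifier column alongside the 48 half-hourly values is one possibility.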

Clustering algorithm selection and implementation
Selecting a feasible method for identifying normal and abnormal daily electricity profiles is an important decision. The initial dataset contains no feature column with a "healthy/unhealthy" or "normal/abnormal" designation. Thus, an unsupervised machine learning (ML) technique is deemed feasible. Among unsupervised ML techniques, clustering has provided valuable insights by partitioning data into groups, or clusters. The most well-known and prominent clustering techniques are categorized as partitional, hierarchical or density-based clustering. In partitional clustering, data objects are divided into nonoverlapping groups/clusters, so that no data object (daily electricity load profile) can be a member of more than one cluster (Arvai 2022). Therefore, partitional clustering is a suitable solution for distinguishing normal and abnormal electricity load profiles from each other. The k means clustering algorithm is categorized as partitional clustering. Furthermore, it is one of the oldest and most straightforward algorithms to deploy. Based on these considerations, the k means clustering algorithm is selected as an effective technique to solve the problem.
The k means clustering algorithm was deployed with the Python package scikit-learn in a Jupyter Notebook. The original dataset was loaded into the Jupyter Notebook and stored in a dataframe format for the remaining data analysis activities.
The MinMaxScaler is applied since it preserves the shape of the original distribution, so the information embedded in the original data is retained (Gogia 2019). The MinMaxScaler subtracts the minimum value of the feature (electricity consumption in a specific ½-hour time stamp) from each value and then divides the result by the range, where the range is the difference between the original maximum and the original minimum electricity consumption value.
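The scaling and clustering steps can be sketched as follows. The example uses synthetic daily profiles (60 "normal" days and 5 strongly reduced days) and two clusters purely for illustration; these values, the seed and the variable names are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(1)
# Hypothetical daily profiles in MW: 48 half-hourly values per day.
normal = 180 + 5 * rng.standard_normal((60, 48))
disturbed = 40 + 5 * rng.standard_normal((5, 48))
daily_mw = np.vstack([normal, disturbed])

# MinMaxScaler works per column (per half-hour time stamp):
# (x - column minimum) / (column maximum - column minimum).
scaled = MinMaxScaler().fit_transform(daily_mw)
col_min, col_max = daily_mw.min(axis=0), daily_mw.max(axis=0)
manual = (daily_mw - col_min) / (col_max - col_min)
print(np.allclose(scaled, manual))

# Each daily profile is assigned to exactly one cluster.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
labels = model.labels_
print(np.bincount(labels))
```

Because each half-hour column is scaled independently, every feature ends up in the range 0 to 1 while the ordering of days within a time stamp is preserved.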
Depending on the number of clusters applied and the extent of the load profile (the full dataset or one year of data), the outputs of the clustering algorithm can be limited, since a small number of clusters yields only a rough separation of the daily load profiles. To decide on the optimal number of clusters, the elbow point method is used, which helps to evaluate the most cost-effective number of clusters.

Seasonal (yearly, monthly, daily) distribution analyses
The load profile clustering is carried out on two different load profile durations: the entire load profile duration (all years included) and a year-by-year load profile duration (each year separately). The year-by-year load profile clustering is expected to reveal finer variations in the load profile and might even identify clusters beyond a simple normal/abnormal differentiation. The load profile clustering analysis is divided into four parts:
• Clustering of the entire dataset (2015-2021) to understand the load profiles over time
• Yearly electricity load clustering for years without significant earthquake impact, to identify clusters that capture normal load profiles (e.g., weekday, weekend and holiday separation)
• Yearly electricity load clustering for years with significant earthquake impact, to capture the natural disaster-impacted load profiles
• Investigation of the differences between electrical fault-impacted load profiles and natural disaster-impacted load profiles
The outcome of applying the k means clustering algorithm is an allocation of each daily electricity load profile to a distinct cluster. After filtering each cluster's electricity load profiles from the original dataset, the electricity load profiles are plotted to manually evaluate potential similarities and differences. Additionally, a yearly, monthly, and daily (weekend/weekday overview) distribution analysis is conducted to ease the allocation of cluster labels/tags. Thus, the labelling of the clusters can differentiate between normal and disturbance-impacted clusters.
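The yearly, monthly and weekday/weekend distribution analysis of a single cluster can be sketched with pandas as below. The function name `cluster_distributions` and the six example dates with their cluster labels are hypothetical.

```python
import numpy as np
import pandas as pd


def cluster_distributions(dates, labels, cluster):
    """Percentage of a cluster's days per year, per month and per day type."""
    days = pd.DatetimeIndex(dates)[np.asarray(labels) == cluster]
    # Monday is 0 and Sunday is 6, so 5 and 6 are weekend days.
    day_type = np.where(days.dayofweek >= 5, "weekend", "weekday")

    def share(values):
        return (pd.Series(values).value_counts(normalize=True) * 100).round(1)

    return {
        "year": share(days.year),
        "month": share(days.month),
        "day_type": share(day_type),
    }


# Hypothetical example: six days assigned to two clusters.
dates = pd.to_datetime([
    "2019-10-07", "2019-10-08", "2019-11-04",
    "2019-11-09", "2019-03-17", "2019-03-18",
])
labels = [0, 0, 0, 0, 1, 1]

dist = cluster_distributions(dates, labels, cluster=0)
print(dist["day_type"])
print(dist["month"])
```

Shares such as these are what allow a cluster to be tagged, for instance, as a weekday cluster concentrated in particular months.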
In this paper, a normal daily electricity load profile is defined as an electricity profile that either (1) closely resembles the average daily electricity load profile of the specific year with regard to shape and magnitude (the average daily electricity load profile for 2015-2021 is presented in the Case study section); or (2) is a product of variations in electricity consumer behavior (e.g., holidays or special cultural/religious traditions). The opposite of a normal daily electricity profile is an abnormal profile. Abnormality is defined as "different from what is usual or average, especially in a way that is bad" (Cambridge Dictionary 2022). Therefore, in this paper, an abnormal electricity load profile is defined as a load profile that either (1) experiences a few significant power consumption drops; or (2) contains consecutive low-magnitude power consumption values that are not induced by consumer behaviour. Based on these definitions, lower electricity consumption on weekends or increased electricity consumption during special cultural events, for example, gives rise to normal daily electricity load profiles. In contrast, profiles with consecutive low-magnitude power consumption values whose dynamics the seasonal distribution analysis fails to correlate with a certain consumer behaviour are determined to be abnormal electricity load profiles.

Cluster labelling
The flowchart in Fig. 3 illustrates the process of allocating feasible labels/tags to the identified clusters. The figure shows that the first activity is to perform the seasonal distribution analysis. Based on that, if the majority of the load profiles in a cluster are evaluated to be normal load profiles (in accordance with the definition above), a normal cluster is identified. The daily and monthly distribution analysis then provides the foundation for allocating a proper label/tag to the normal cluster. However, if the cluster does not contain a majority of normal load profiles, abnormal electricity load profiles predominate, and a disturbance-impacted cluster is defined. A disturbance-impacted cluster is one of the following:
1) A natural disaster-impacted cluster
2) An electrical fault-impacted cluster (independent of a natural disaster strike).
The criterion for labelling a cluster as natural disaster-impacted is that a natural disaster strike has been identified to have occurred simultaneously. The labelling of the natural disaster-impacted clusters should refer to the pre-, during- or post-disaster dimension of climate resilience, as illustrated in Fig. 2. If no natural disaster occurred simultaneously, an electrical fault-impacted cluster is identified. The electrical fault can then have various causes, e.g., a component outage at a power plant or a malfunctioning transformer station.
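The labelling rule described above can be written as a small decision function. This is an interpretive sketch only: the function name, its arguments and the use of a 50% threshold for "majority" are assumptions, and in the paper the judgement of which profiles are normal is made manually from the distribution analysis.

```python
def label_cluster(normal_share, disaster_coincides):
    """Assign a coarse label to a cluster.

    normal_share: fraction (0 to 1) of the cluster's daily profiles judged normal.
    disaster_coincides: True if a natural disaster strike was recorded at the
        same time as the cluster's abnormal profiles.
    """
    if normal_share > 0.5:
        return "normal"
    if disaster_coincides:
        return "natural disaster-impacted"
    return "electrical fault-impacted"


# Hypothetical clusters: (share of normal profiles, disaster recorded?)
print(label_cluster(0.9, False))
print(label_cluster(0.2, True))
print(label_cluster(0.2, False))
```

A natural disaster-impacted cluster would additionally be tagged as pre-, during- or post-disaster, which this sketch leaves out.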

Electricity system climate resilience capability evaluation
After the clusters that contain natural disaster-impacted load profiles have been identified, the characteristics governing the three dimensions of climate resilience (robustness, resourcefulness, and recovery of the electricity system) are analyzed. A plot that contains the daily load profiles before, during and after natural disaster strikes is designed based on the components of climate resilience (shown in Fig. 2):
• Outbreak of disruption: The time at which the natural disaster or disruption is recorded to have occurred.
• Impact size: The decrease in power consumption in [MW] from the time of the disruption until the minimum power consumption is observed.
• Disruption duration of decline: The time from the disruption to the minimum power consumption.
• "Time at the bottom": The time that passes from when the minimum power consumption is reached until the recovery phase is initiated.
• Recovery duration: The time from the start of the recovery phase until steady-state operation is achieved.
• Average restoration rate: The slope of the power consumption curve during the recovery phase. This is equivalent to the ramp rate of the power production units, since electricity production and consumption should be balanced.
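The components listed above can be computed from a half-hourly load series as sketched below. The paper does not give formal definitions, so two assumptions are made here and should be adjusted to the data: the "bottom" lasts while the load stays within `bottom_tol_mw` of its minimum, and steady state is considered reached once the load returns to `recovered_fraction` of its pre-disruption value. The function name and the example values are hypothetical.

```python
import numpy as np


def resilience_metrics(load_mw, onset, bottom_tol_mw=1.0,
                       recovered_fraction=0.95, step_hours=0.5):
    """Compute resilience components for one disruption in a load series."""
    load = np.asarray(load_mw, dtype=float)
    # Index of the minimum power consumption at or after the disruption.
    i_min = onset + int(np.argmin(load[onset:]))
    # Assumption: the bottom lasts while the load stays close to the minimum.
    rec_start = i_min
    while rec_start + 1 < len(load) and load[rec_start + 1] <= load[i_min] + bottom_tol_mw:
        rec_start += 1
    # Assumption: steady state is a fixed fraction of the pre-disruption load.
    target = recovered_fraction * load[onset]
    reached = np.nonzero(load[rec_start:] >= target)[0]
    if reached.size == 0:
        raise ValueError("load does not recover within the supplied window")
    rec_end = rec_start + int(reached[0])
    steps = rec_end - rec_start
    rate = (load[rec_end] - load[rec_start]) / steps if steps > 0 else float("nan")
    return {
        "impact_mw": load[onset] - load[i_min],
        "decline_hours": (i_min - onset) * step_hours,
        "bottom_hours": (rec_start - i_min) * step_hours,
        "recovery_hours": steps * step_hours,
        "restoration_mw_per_step": rate,
    }


# Hypothetical half-hourly load in MW; the disruption starts at index 1.
example = [200, 200, 120, 40, 40, 40, 80, 120, 160, 200, 200]
metrics = resilience_metrics(example, onset=1)
print(metrics)
```

For this made-up series the impact is 160 MW, the decline and the time at the bottom each last 1 hour, the recovery takes 2 hours, and the average restoration rate is 40 MW per half hour.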

Case study
Indonesia is highly exposed to natural disasters, which is why the island of Lombok was selected as the case study area. A ½-hourly electricity consumption profile from 2015 to 2022, provided by Indonesian partners, lays the foundation for the investigation of the impact of natural disasters on the daily electricity profiles, where the peak power  Fig. 6. The original data contains power consumption measurements in ½-hour resolution, in [MW], for the entire island of Lombok from January 1st, 2012, until and including February 1st, 2022. The data originates from the Indonesian state-owned electricity utility company, PLN-Nusa Tenggara Barat (NTB). The average daily electricity load profile for the Lombok electricity system from 2015 to 2021 is visualized in Fig. 7.

Results
The results section is divided into four parts:
• Clustering of the entire dataset of 2015-2021
• Electricity load clustering for years without significant earthquake impact
• Electricity load clustering for years with significant earthquake impact
• Comparison of electrical fault-impacted load profiles with natural disaster-impacted load profiles

2015-2021 electricity load clustering
Based on the results of applying the inertia/elbow method to the entire dataset of 2015-2021, the optimal number of clusters was seven for this time range (as shown in Fig. 8). The load profile clustering for the entire 2015-2021 dataset is shown in Fig. 9. Furthermore, the dataframe that stores the cluster number, color designation and number of samples (in the column "cluster") is shown in Fig. 10. The seasonal distribution analysis for each of the identified clusters indicated that each cluster contains approximately one year of electricity load profiles and that the power consumption has increased over the years. For instance, the yearly distribution of the daily electricity profiles in cluster 2 (lowest consumption cluster) and cluster 5 (highest consumption cluster) is shown in Figs. 11 and 12, respectively. These figures reveal that 86% of the daily load profiles in cluster 2 originate from 2015, whereas almost 80% of cluster 5 consists of 2021 and 2022 data. Furthermore, cluster 3 (lime) has only 100 samples, which are illustrated in Fig. 13. This cluster contains the Ramadan periods for four consecutive years from 2018 until and including 2021.
Years with normal and disturbance-impacted clusters are identified based on the seasonal distribution analyses of the individual yearly clustering for 2015 to 2021. All clusters in 2019 and 2021 are labelled as normal, whereas 2015 to 2018 and 2020 each have at least one cluster labelled as disturbance-impacted.
Seven clusters were determined to be the optimal number to apply for the 2019 clustering. Figure 15 shows the daily load profiles of the 2019 clustering, and the dataframe for the seven clusters is shown in Fig. 16. Furthermore, Fig. 15 shows that the clusters differ significantly from each other during the midday hours, when there is a large variation in electricity consumption. However, during the evening ramp at 7-7.30 pm, all clusters except one seem to follow the same tendency, with a very steep increase in electricity consumption. This could reflect that the electricity consumption behavior at 7-7.30 pm does not depend on the season.
On average, the highest-consumption cluster in the 2019 dataset is cluster 0, which contains 48 samples. Figure 17 shows its monthly distribution; 83.3% of the samples are weekdays and 16.7% are weekends, as shown in Fig. 18. Electricity consumption increases significantly during October, November, and December. Therefore, the highest consumption of the year occurs primarily on weekdays from October to December.
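The weekday/weekend and monthly breakdown of a cluster can be sketched as follows. The dates are hypothetical and only chosen so that the arithmetic is easy to follow; the function names are illustrative and do not come from the paper.

```python
from collections import Counter
from datetime import date

def weekday_share_pct(days):
    """Percentage of dates falling on Monday to Friday."""
    weekdays = sum(1 for d in days if d.weekday() < 5)
    return 100.0 * weekdays / len(days)

def monthly_counts(days):
    """Number of daily profiles per calendar month."""
    return Counter(d.month for d in days)

# Hypothetical dates assigned to one cluster (not the paper's data).
cluster_days = [
    date(2019, 10, 1), date(2019, 10, 2), date(2019, 10, 5),
    date(2019, 11, 4), date(2019, 11, 5), date(2019, 12, 2),
]

share = weekday_share_pct(cluster_days)
months = monthly_counts(cluster_days)
print(f"weekdays: {share:.1f}%, weekends: {100.0 - share:.1f}%")
print(dict(months))
```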
Even though no disturbance-impacted clusters are identified in the 2019 dataset, a 5.6 Richter Scale earthquake struck the island of Lombok on March 17, 2019 (as shown in Fig. 19). The figure shows how the earthquake affects the load, with a significant reduction at 3 pm. However, the recovery phase starts quickly, and steady-state operation is rapidly restored. This emphasizes that even when no disturbance-related clusters have been identified, abnormal daily profiles can still exist within the clusters.
The earthquake events are listed in Table 1. Figure 20 shows the electricity consumption and the earthquake events between 2015 and 2021, and Fig. 21 shows the metadata for the earthquake events.
Among the four earthquakes, no decrease in electricity consumption is observed for the earthquake on July 29, 2018. This might be because the epicenter was located in the ocean, off the northern coast of Lombok. The earthquakes on August 5 and 19, 2018, had a significant impact on the electricity consumption profile, with decreases of 78.3% and 98.9% compared to the normal periods. The average restoration rates were determined to be 4.6 and 5.6 [MW/½ hour] for the two earthquakes, respectively. Furthermore, the recovery process lasted multiple hours. For the 5.6 Richter Scale earthquake on March 17, 2019, the impact on the load profile was small.

Load profile for 2018 and the earthquake impacts
Figure 22 illustrates the consecutive ½-hourly load profile for 2018 and the dramatic load reductions that occurred in August. The seven clusters of the daily load profiles for 2018 are shown in Fig. 23, and the dataframe for the seven clusters is shown in Fig. 24. The figures show multiple daily profiles with very low magnitudes, especially in cluster 6 (lime) and cluster 2 (red). The load profiles of cluster 2, shown in Fig. 25, are far from the average daily load profile. The monthly distribution of the daily load profiles in cluster 2, shown in Fig. 26, clearly indicates that severe electricity consumption reductions occurred in August 2018. Figure 27 shows that three earthquakes struck during July/August 2018. A 7.0 Richter Scale earthquake hit the northern part of Lombok at 19:46 (local time) on 5 August 2018, and the electricity load was reduced by 78% (from 213.7 [MW] pre-earthquake to 37.5 [MW] upon the earthquake strike).
Figure 28 shows a close-up of this electricity load reduction and the profile of the three following days. After the earthquake strike, it took almost 50 days (until September 25, at 7 PM) to recover to the pre-earthquake consumption level. Figure 29 portrays the load profile for the 10 days before and 10 days after the earthquake strike on August 5, 2018. The electricity system does not fully recover within this 10-day range.
Table 2 shows three identified daily load profiles with significant power drops that are independent of ND events. Figure 30 shows the daily electricity profiles for these three days. All three electrical fault events occurred independently of any natural disaster event, and in each of them the electricity consumption decreased by approximately 100%. The average restoration rates of the electrical fault events are very similar to the restoration rates observed during natural disaster strikes, as shown in Table 1. However, the duration of the recovery phase varied considerably. The electrical fault on 15 October 2020 differed significantly from the other two electrical faults, as it resulted in a 5-h blackout. The power consumption of 166.53 [MW] was reached 11 h later, which emphasises that the electricity system was quickly brought back to steady-state operation following the blackout. Figure 31 compares the restoration time of the 5-h blackout event on October 15, 2020 (as illustrated in Fig. 30) with that of the earthquake strike on August 5, 2018 (as illustrated in Fig. 28). The post-event duration of the graphs in the figure indicates the time that passes before the pre-event power consumption value is reached again. For the power plant failure on October 15, 2020, which led to the 5-h blackout, the pre-event power consumption value was reached within 11 h. In contrast, for the 7.0 Richter Scale earthquake strike on August 5, 2018, the pre-event power consumption value was reached 50 days later.
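The three quantities compared in this section (load reduction, recovery duration, and average restoration rate) can be sketched as below. The paper does not give explicit formulas, so the definitions here are assumptions: the reduction is measured from the last pre-event sample to the lowest post-event sample, recovery is the first sample back at the pre-event load, and the rate is the load regained per ½-hour step between trough and recovery. Both load series are hypothetical.

```python
def load_reduction_pct(pre_event_mw, trough_mw):
    """Percentage drop from the pre-event load to the lowest post-event load."""
    return 100.0 * (pre_event_mw - trough_mw) / pre_event_mw

def recovery_index(load_mw, event_idx):
    """Index of the first sample from event_idx onwards that is back at the
    pre-event load, or None if the series ends before recovery."""
    pre_event = load_mw[event_idx - 1]
    for i in range(event_idx, len(load_mw)):
        if load_mw[i] >= pre_event:
            return i
    return None

def summarize_event(load_mw, event_idx, step_hours=0.5):
    """Reduction, recovery duration and average restoration rate for one event.
    event_idx is the first impacted sample; event_idx - 1 is the pre-event sample."""
    pre_event = load_mw[event_idx - 1]
    if load_mw[event_idx] >= pre_event:
        raise ValueError("no load drop at event_idx")
    rec = recovery_index(load_mw, event_idx)
    window = load_mw[event_idx:rec] if rec is not None else load_mw[event_idx:]
    trough = min(window)
    trough_idx = event_idx + window.index(trough)
    summary = {"reduction_pct": load_reduction_pct(pre_event, trough),
               "recovery_hours": None, "rate_mw_per_step": None}
    if rec is not None:
        summary["recovery_hours"] = (rec - event_idx) * step_hours
        summary["rate_mw_per_step"] = (load_mw[rec] - trough) / (rec - trough_idx)
    return summary

# Hypothetical half-hourly loads in MW (not the Lombok measurements).
fault_like = [160.0, 0.0, 40.0, 80.0, 120.0, 160.0]   # recovers within the window
quake_like = [200.0, 50.0, 60.0, 70.0, 80.0]          # does not recover in the window

fault_summary = summarize_event(fault_like, event_idx=1)
quake_summary = summarize_event(quake_like, event_idx=1)
print(fault_summary)
print(quake_summary)
```

Returning None for an unrecovered series mirrors the situation in Fig. 29, where the observation window ends before the pre-event load is reached again.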

Discussion
The clustering results for the entire 2015-2021 electricity load reveal that natural disaster-impacted electricity load profiles cannot be identified at this scale. Because 7 clusters are combined with a very long time range, the k means clustering algorithm does not manage to separate normal from abnormal daily load profiles (or to resolve them with a higher granularity). However, the algorithm does group the full dataset into approximately yearly clusters, so that the majority of the identified clusters consist of daily load profiles from a specific year. Furthermore, the load profile for the Ramadan periods can be detected.
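How strongly a cluster is tied to a single year can be checked with a short tally. This is a sketch with hypothetical labels and years; the function name is illustrative and not taken from the paper.

```python
from collections import Counter, defaultdict

def dominant_year_share(labels, years):
    """For each cluster label, return (most common year, its share in percent)."""
    by_cluster = defaultdict(Counter)
    for label, year in zip(labels, years):
        by_cluster[label][year] += 1
    result = {}
    for label, counts in by_cluster.items():
        year, n = counts.most_common(1)[0]
        result[label] = (year, 100.0 * n / sum(counts.values()))
    return result

# Hypothetical cluster labels and calendar years of ten daily profiles.
labels = [2, 2, 2, 2, 2, 5, 5, 5, 5, 5]
years = [2015, 2015, 2015, 2015, 2016, 2021, 2021, 2021, 2020, 2021]

shares = dominant_year_share(labels, years)
print(shares)
```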
The year-by-year clustering proved able to detect and separate normal load profiles from abnormal load profiles: outage/disturbance-related clusters are identified in the majority of the year-by-year clustering results. For instance, the 2018 load clustering identifies a cluster with 16 samples (daily load profiles) as the post-natural disaster days and separates the post-natural disaster days and electrical fault days into different clusters. However, this is not always the case. For instance, no natural disaster-impacted clusters are identified by the 2019 clustering, although a 5.6 Richter Scale earthquake occurred on March 17, 2019. Overall, this verifies to a great extent that the k means clustering algorithm is able to identify and group natural disaster-impacted load profiles, and it could potentially be adopted for a country-wide implementation. Furthermore, when data on natural disaster frequency are not available, the method proposed in this paper, the k-means clustering algorithm, can identify disturbance-impacted clusters, which can act as a foundation for a more thorough natural disaster association analysis. Both natural disasters and electrical faults have a significant short-term impact (hours to days) on the electricity system. Therefore, it is difficult to differentiate between natural disaster-induced and electrical fault-induced load profile reductions without prior knowledge of natural disasters. However, the individual yearly clustering can identify the post-natural disaster clusters and electrical fault-impacted clusters because the long-term effects (days to weeks) of electrical faults on the load reductions are significantly smaller than those of natural disasters. Similarly, the report on power sector resilience to natural hazards in the USA (Nicolas et al. 2019) states that natural hazards/disasters in the USA cause a power loss of 2.5 days on average, whereas non-natural disruptions (electrical faults) last only approximately one day.
Based on discussions with local Indonesian experts, the earthquakes in 2018 and 2019 did not cause the infrastructure of the electricity system to collapse, but the distribution power poles (power lines that are attached to tree poles in the air) were damaged, which disconnected several electricity consumption areas, and it took 50 days for the electricity system to fully recover. Furthermore, the 5-h electrical fault that occurred on 15 October 2020 was due to a large power plant failure that caused a cascading disconnection of the remaining power plants from the transmission system. It took 11 h to return to steady-state operation, and the average restoration rate was determined to be 132 [MW/½hour]. This is a feasible restoration rate for the local TSO, PLN-NTB, since the electricity generation is primarily composed of fast-ramping combined-cycle gas power plants, which can reach a 132 [MW] power output increase within 5 min (Analysis et al. 2021). However, when natural disasters such as earthquakes occur, PLN-NTB has to repair the destroyed distribution poles, which can take several days to weeks.

Fig. 31
Comparison of power consumption developments following disruption events. The green dashed line indicates the unfolding of the disruption events. The starting point of the graphs is 5 days prior to the exact disruption events

Conclusion
This paper applies the k means clustering algorithm to investigate the normal/abnormal electricity load profiles and to identify natural disaster- and electrical fault-impacted electricity load profiles for the Lombok electricity system, Indonesia, with ½-hourly electricity load data from 2015 until 2021. Seven clusters are identified, and each cluster represents approximately one year of the electricity load profile. The results show that electricity consumption has increased over the years, which matches the installed production capacity of Lombok (Embassy of Denmark (Jakarta), Danish Energy Agency, KPMG 2019). This paper also clusters the daily load profiles of each year between 2015 and 2022. Based on the investigation of the registered natural disasters and the clustering results, the load patterns impacted by natural disasters and electrical faults are analysed and compared. The results show that the short-term impacts of earthquakes and electrical faults on the electricity system do not differ greatly, whereas their long-term effects differ significantly.
The majority of the literature, e.g., (Jeong et al. 2021; Amri et al. 2016; Richard et al. 2017), has investigated residential load profiles with weekday/weekend and/or daily to monthly variations based on the k-means clustering algorithm. In contrast, this paper identifies disturbance-induced electricity load patterns, especially natural disaster-impacted load profiles. Furthermore, this paper investigates the impacts of natural disasters and electrical faults on the performance of the electricity system, and how natural disaster-induced and electrical fault-induced load reductions differ from each other from the short- and long-term perspectives. Specifically, the pre-, during-, and post-natural disaster load patterns are portrayed. This paper distinguishes the electricity consumption reduction and restoration rates of low-impact natural disasters, high-impact natural disasters, and electrical faults. The results can help electricity system operators to better understand the load patterns, predict the impact of ND strikes on the electricity system and conduct better long-term energy management strategies. However, only one case study is applied in the paper, and the electricity system in the case study is primarily composed of fossil-based, dispatchable power plants. Thus, this paper does not demonstrate how the climate resilience of the electricity system would be affected by natural disasters if there were a higher penetration of renewable energy resources, which could potentially be more exposed to earthquakes.
A case study with larger shares of renewable energy resources in the energy and capacity mix is recommended to investigate the impacts of natural disasters on a renewable-based electricity system. In a renewable-based electricity system with high shares of, e.g., wind turbines and solar photovoltaic (PV) plants, there might be severe direct impacts on production facility components. In particular, PV plants whose panels are poorly attached to the support structure are highly exposed to natural disasters and disruptive environmental changes.