Residential electricity current and appliance dataset for AC-event detection from Indian dwellings
Energy Informatics volume 5, Article number: 38 (2022)
Air Conditioners (ACs) have become a major contributor to residential electricity consumption in India. Non-intrusive Load Monitoring (NILM) can be used to understand residential AC use and its contribution to electricity consumption. NILM techniques use ground truth information along with meter readings to train disaggregation algorithms. Datasets are available for disaggregation, but none covers a hot tropical country like India, especially for AC event detection. Our dataset’s primary objective is to help train NILM algorithms for AC event detection and compressor operations. The dataset comprises home-level electrical current consumption and manually tagged AC ground truth (ON/OFF status) data at 1-min intervals, indoor temperature and relative humidity readings at 5-min intervals, and dwelling, AC and household characteristics. The data was collected from 11 homes located in Hyderabad, India, a composite climate zone, for 19 summer days in May 2019. The dataset consists of 1.6 million data points and 450 AC cycles with each cycle having a runtime of more than 60 min (> 2000 compressor ON/OFF cycles). Public availability of such a dataset will allow researchers to develop, train and test NILM algorithms that recognize AC and identify compressor operations.
Buildings account for about 50% of global electricity consumption, of which space cooling takes up 20% (Liu et al. 2021; Pandey et al. 2021; Hu et al. 2019). The residential sector in India is a major consumer of energy (Qarnain et al. 2021) and its consumption is expected to increase fourfold by 2030 (Gupta et al. 2021). As per the 2019–2020 statistics published by the Central Electricity Authority (CEA), the Indian residential sector is responsible for 24% of total energy consumption (Central Electricity Authority 2020). It is predicted that the global electricity consumption of residential Air Conditioners in 2050 will be more than triple that of 2016 (International Energy Agency 2018). Ownership of ACs in the Indian residential sector could increase further due to demographic shifts towards cities, rising standards of living, and declining prices of Air Conditioner units (Hu et al. 2019). The ‘Indian energy security scenario 2047’ suggests that the number of residential AC units will increase from 21.8 million in 2017 to approximately 68.9 million in 2027 and to 1046 million units by 2047 (International Energy Agency 2018; Debnath et al. 2020). Studies suggest that by 2037, AC consumption in India would increase by 4.3 times compared with 2017–18 (Government of India 2019). Occupant behaviour also has a significant effect on the overall energy consumption of ACs (ANNEX 2019; Brounen et al. 2012). This has led to major research interest in understanding the impact of AC usage on economics, environment, social development, and sustainability (Xu et al. 2018; Yang and Cao 2018).
AC monitoring helps in determining how much AC consumption contributes to the overall household electricity consumption. It also helps in identifying and analyzing AC usage patterns that can help to identify energy savings potential (Ali et al. 2021; Garg et al. 2021). However, individual AC monitoring for every home requires additional investment to build hardware for sub-metering and reliable communication channels that collect data from multiple sub-meters. NILM attempts to solve this problem by utilizing data from existing loggers installed at the meter level and applying disaggregation algorithms that will classify the individual appliances based on their load signatures. Researchers require access to datasets recorded in the field to develop, train and test these disaggregation algorithms. Since it is not feasible for every researcher to record their own dataset, the creation of open-access datasets promotes NILM research (Kelly and Knottenbelt 2015).
REDD (Kolter and Johnson 2011) was the first publicly available dataset for research in energy disaggregation. The dataset contains power consumption data (voltage and current) from 6 homes for several weeks. Many datasets on NILM have since been released. BLUED (Blued 2011), Smart* (Barker et al. 2012), PLAID (Gao et al. 2014), and Dataport (Parson et al. 2015) are datasets of USA households capturing current and voltage data at the home and appliance level. Similar datasets were created from UK homes, such as UK-DALE (Kelly and Knottenbelt 2015) and REFIT (Firth et al. 2017). UK-DALE collected data from 5 homes at 16 kHz frequency for a period of 2.5 years and REFIT was prepared by collecting data from 20 homes at an 8-s interval for a 2-year period. AMPds (Makonin 2016), AMPds2 (Makonin et al. 2016), and RAE (Makonin et al. 2018) are publicly available datasets from homes in Canada. Other open energy disaggregation datasets include Tracebase (Reinhardt et al. 2012) from Germany, GREEND (Monacchi et al. 2014) from Italy, ECO (Beckel et al. 2014) from Switzerland, DRED (Uttama Nambi et al. 2015) from Holland, and ENERTALK (Shin et al. 2019) from Korea. The availability of diverse datasets helps researchers understand electricity and appliance usage patterns for different countries. The usage patterns vary across countries due to differences in occupant comfort, lifestyle, and outdoor weather conditions. For a tropical country like India, the energy datasets I-BLEND (Rashid et al. 2019), COMBED (Batra et al. 2014a) and iAWE (Batra et al. 2014b) are available. I-BLEND contains 52 months of data from 7 buildings at a 1-min interval. Similarly, COMBED captured data from 6 commercial buildings for 1 month at a sampling interval of 30 s. The dataset contains energy utilized by transformers, chillers, UPS and lifts. iAWE recorded 6-s appliance data and 1 Hz aggregated data from one home for 73 days.
Of the available building and residential datasets, 15 claim to include disaggregated AC load, of which only one (iAWE) is from India. That dataset has ground truth for 10 appliances collected from a single home. Hence there is a need for a larger dataset with diverse usage and AC types (split and inverter).
Our dataset contains monitored data from 11 homes in Hyderabad, India, a city with a composite climate, for a 19-day period. For each home there is a record of phase-wise electrical current at 1-min intervals. The dataset also contains indoor temperature and humidity of the room with the AC at 5-min intervals, thereby generating 1.6 million data points and 450 AC cycles (> 2000 compressor ON/OFF cycles). The dataset has 7 parameters: timestamp, phase-wise current in mA (3 phases), AC status, temperature, and humidity. This dataset can be used to train NILM algorithms for AC. Additionally, household survey and outdoor weather data are also part of the dataset.
Table 1 summarizes the household information. The income group is categorized into Low-Income (LIG), Middle-Income (MIG), and High-income (HIG) groups based on the annual household income. The categorization is done according to PMAY (Pradhan Mantri Awas Yojna) Scheme (Ministry of Housing and Urban Affairs, Government of India 2019). Figure 1 shows the schematic overview of the data acquisition system. A mobile application was also developed to connect the mobile to Garud and EnviLog loggers. The application collects data from the loggers and uploads it to the database server using Wi-Fi or a cellular network. A web application has been designed to view the data uploaded to the database server. The web application also facilitates report generation providing details of per house energy consumption on a daily, weekly, and monthly basis.
A detailed monitoring framework was developed for data collection that included low-cost current loggers (Garud) and temperature-humidity loggers (EnviLog). Garud, shown in Fig. 2a, is a current consumption logger that recorded electric current at 1-min intervals (Tejaswini et al. 2019). It consists of three CT clamps that can be installed on a main circuit board with a 1-phase or 3-phase connection. Garud devices were installed by trained electricians at the main circuit board, whereas EnviLog, shown in Fig. 2b, was installed by the researcher in the room with the AC.
Garud is a battery-powered logger designed at our research lab using low-cost BLE (Bluetooth Low Energy) that operates in the 2.4 GHz ISM band using GFSK (Gaussian Frequency Shift Keying). BLE is supported by mobile phones and tablets, making it an ideal solution for interfacing the logger to an Android application and capturing data at different intervals based on user’s choice. An on-board memory chip of 4 MB capacity was added to hold data for up to 12 months.
Internal circuitry of Garud is shown in Fig. 3 and it consists of the following subsystems:
Current transformer (XH-SCT-T10) is a 50 A to 0.33 V equivalent voltage output current transformer rated for input current up to 100 A.
Analog to Digital Converter (MCP3919) from Microchip converts the analog voltage from the current transformers to digital signals and can simultaneously sample the 3-phase current. Its low power consumption (< 6 mA at 3.3 V) makes it an appropriate component for a low-power analog data acquisition device.
Microcontroller with Bluetooth Low Energy Subsystem (NRF52832) from Nordic Semiconductor is a multi-protocol System on Chip (SoC) that supports Bluetooth 5 and Bluetooth mesh. It can achieve very low power consumption as it has an on-chip adaptive power management system.
Flash Memory (W25Q32JVSSIQ) from Winbond Electronics is interfaced with the microcontroller to log data. It has 4 MB storage with 66 MB/s continuous data transfer and over 20-year data retention.
Real-time Clock (MCP7940N) tracks time using internal counters for hours, minutes, seconds, days, months, years, and day of the week.
Boost Converter (MAX17220–MAX17225) is a family of ultra-low quiescent current boost (step-up) DC-DC converters with a 225 mA/0.5 A/1 A peak inductor current. It keeps the device running from two AA batteries at supply voltages as low as 0.8 V.
Garud is connected to the DB box/main meter using three CT clamps as shown in Fig. 4. The number of CTs connected depends on whether the homes run on 3-phase electricity or single phase.
EnviLog is a custom-built relative humidity and temperature logger that uses Bluetooth for communication and data transfer. It uses lithium coin batteries and can store 12,000 temperature and humidity records. The specifications of Garud and EnviLog are presented in Table 2.
Application for data collection
The RESIDE-AC dataset is available as well-structured and easy-to-use CSV (Comma Separated Values) files. The dataset contains 19 summer days of data, from 10/05/2021 IST to 28/05/2021 IST, for 8 single-AC homes and 3 homes with more than one AC. Garud and Envilog captured datetime values in IST (Indian Standard Time), which can be converted to UTC by subtracting 5 h 30 min. Each home has 27,360 Garud records and 5472 Envilog records. However, Garud and Envilog did not start logging at the same time, so their timestamps will not match exactly. The dataset is available in 3 directories, namely Garud, Envilog, and Household Information. The ‘Garud’ and ‘Envilog’ folders each include 11 files, one for each home. The naming convention for these files is <Gxx> and <Exx>, where ‘xx’ is a number between 01 and 11. ‘xx’ indicates the house ID that is used as a key to identify each home uniquely. Household characteristics, building details and AC information are present as separate CSV files named ‘HomeDetails’, ‘BuildingDetails’ and ‘ACDetails’ in the ‘Household information’ folder. Data related to daily outdoor temperature and humidity during the specified period is present in a separate file, ‘weatherdata.csv’. The weather data was collected from the official government website (Open Data Telangana 2017). The file contains minimum and maximum temperature in °C and minimum and maximum relative humidity in %.
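The record counts and the IST-to-UTC conversion described above can be checked with a short sketch. The function names and the timestamp format string are assumptions made for illustration; the actual datetime layout in the CSV files should be confirmed before use.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+05:30 offset for Indian Standard Time (IST has no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), name="IST")


def expected_records(days: int, interval_min: int) -> int:
    """Samples in `days` full days at one sample every `interval_min` minutes."""
    return days * 24 * 60 // interval_min


def ist_to_utc(stamp: str, fmt: str = "%d/%m/%Y %H:%M") -> datetime:
    """Parse a naive IST timestamp and return a timezone-aware UTC datetime.

    The default format string is an assumption, not taken from the dataset.
    """
    local = datetime.strptime(stamp, fmt).replace(tzinfo=IST)
    return local.astimezone(timezone.utc)


garud_per_home = expected_records(19, 1)    # 1-min current records
envilog_per_home = expected_records(19, 5)  # 5-min temperature/humidity records
print(garud_per_home, envilog_per_home)     # 27360 5472
print(ist_to_utc("10/05/2021 00:00").isoformat())  # 2021-05-09T18:30:00+00:00
```

The counts agree with the 27,360 Garud and 5472 Envilog records per home stated above.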
Each Garud file consists of 5 fields: “datetime”, 3-phase current: “R”, “Y”, “B”, and “AC status”, captured at 1-min intervals. The “datetime” column contains the IST time when the data was recorded. The 3-phase current columns represent the phase current in milli-Amperes. For a single-phase house, the Phase-R column is filled with current values and the remaining 2 columns hold the value 0. For a three-phase house, Phase-R, Y and B are all filled with current values. The AC status field contains the value 0 indicating AC OFF and 1 indicating AC ON. Envilog sensors are placed in the bedroom with the most frequently used AC. Each Envilog file consists of 3 fields: “datetime”, “temperature” and “relative humidity”, captured at 5-min intervals. The datetime column contains the IST time of the record. The temperature and humidity fields represent the indoor room parameters in °C and %RH.
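Because the two loggers sample at different intervals and start at different times, each 1-min Garud row has to be paired with a nearby 5-min Envilog reading before joint analysis. The following is a minimal sketch using only the Python standard library. The column names follow the field descriptions above, but the exact header spelling, the timestamp format, the sample rows, and the helper names `read_rows` and `attach_indoor` are assumptions for illustration.

```python
import csv
import io
from bisect import bisect_right
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M"  # assumed timestamp layout; check the real files

# Made-up rows standing in for one Garud file and one Envilog file.
GARUD_SAMPLE = """datetime,R,Y,B,AC status
10/05/2021 00:00,1200,0,0,0
10/05/2021 00:01,6800,0,0,1
10/05/2021 00:02,6750,0,0,1
10/05/2021 00:06,6900,0,0,1
"""

ENVILOG_SAMPLE = """datetime,temperature,relative humidity
09/05/2021 23:58,30.5,48
10/05/2021 00:03,29.8,47
"""


def read_rows(text):
    """Parse CSV text into dicts with the datetime column converted."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["datetime"] = datetime.strptime(row["datetime"], FMT)
    return rows


def attach_indoor(garud, envilog, tolerance=timedelta(minutes=5)):
    """Attach to each Garud row the latest Envilog reading taken at or
    before it, provided that reading is no older than `tolerance`.
    Both inputs must be sorted by time."""
    env_times = [e["datetime"] for e in envilog]
    merged = []
    for g in garud:
        i = bisect_right(env_times, g["datetime"]) - 1
        fresh = i >= 0 and g["datetime"] - env_times[i] <= tolerance
        match = envilog[i] if fresh else None
        merged.append({
            "datetime": g["datetime"],
            "phase_mA": tuple(float(g[p]) for p in ("R", "Y", "B")),
            "ac_on": g["AC status"] == "1",
            "temperature": float(match["temperature"]) if match else None,
            "humidity": float(match["relative humidity"]) if match else None,
        })
    return merged


merged = attach_indoor(read_rows(GARUD_SAMPLE), read_rows(ENVILOG_SAMPLE))
for row in merged:
    print(row["datetime"], row["phase_mA"], row["ac_on"], row["temperature"])
```

A backward-looking match is used here so that a current sample is never paired with a temperature reading from its future; a nearest-neighbour match would be an equally reasonable choice.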
The monitored dwellings were either stand-alone homes or flats located in a low-rise apartment building (< 6 storeys). All the dwellings were constructed using a Reinforced Cement Concrete (RCC) frame structure with burnt clay bricks as infill. The windows were single glazed with fixed external shading. Additional details related to the number of rooms, area of the dwellings and age of the dwelling are present in the ‘buildingdetails’ file. The ‘DwellingDetails’ file consists of the following fields: House ID (key), number of occupants, income group, ownership, annual electricity consumption calculated from electricity bills, phase connection type (single-phase/three-phase) and appliance list. The appliance columns indicate the number of appliances owned. The appliances owned were: washing machine, refrigerator, microwave/oven, television, water pump (Indian residences use water pumps to move water from an underground storage tank to roof-top tanks for a gravity-based water supply system), electric geyser (an electrical resistance heating appliance that heats water for domestic use in winter), desert cooler (a device, driven by an electric fan, that cools air in an entire room through evaporation of water using wet grass) and electric inverter with battery backup (a backup device that supplies AC power to appliances during power outages).
In Indian homes, the majority of Air Conditioners are installed as individual units in bedrooms and living rooms and operate using a thermostat, in contrast to the centralized AC systems that are common in the West. The average size of the bedrooms whose AC was monitored is 20 sq m. The ‘ACDetails’ file contains additional information such as AC tonnage, type of AC, age of the AC and BEE star rating.
AC ON/OFF tag values were assigned manually to each record by observing the increase in current values. We confirmed that the load was the AC by verifying a drop in room temperature 5 min after the increase in current consumption. Similarly, when there was a decrease in AC current consumption followed by a gradual increase in temperature over a continuous 15-min period, we tagged the activity as AC OFF. In the case of a three-phase connection, the phase on which the primary AC runs was selected for tagging. In the case of a single-phase connection, the default phase was selected. Figure 6 shows the graph of AC consumption along with tags for a 1-h period.
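The tagging in this dataset was done by hand. Purely as an illustration of the two confirmation rules described above, they could be approximated programmatically as in the sketch below. This is not the authors' procedure: the function name `tag_ac_status`, the 3000 mA step threshold, and the synthetic trace are all hypothetical, and the temperature series is assumed to have been resampled to one value per minute.

```python
def tag_ac_status(current_mA, temp_c, step_mA=3000, on_lag=5, off_window=15):
    """Return a 0/1 tag per minute from per-minute current and temperature.

    step_mA is an illustrative threshold, not a value from the paper.
    on_lag and off_window mirror the 5-min and 15-min checks in the text.
    """
    n = len(current_mA)
    tags = [0] * n
    on = False
    for t in range(1, n):
        delta = current_mA[t] - current_mA[t - 1]
        if not on and delta >= step_mA:
            # Candidate switch-on: confirm with a cooler room on_lag minutes later.
            later = t + on_lag
            if later < n and temp_c[later] < temp_c[t]:
                on = True
        elif on and delta <= -step_mA:
            # Candidate switch-off: require a steady temperature rise over the window.
            end = t + off_window
            if end < n:
                window = temp_c[t:end + 1]
                rising = all(b >= a for a, b in zip(window, window[1:]))
                if rising and window[-1] > window[0]:
                    on = False
        tags[t] = int(on)
    return tags


# Synthetic trace: 10 min idle, 20 min with the AC running, 20 min idle again.
current = [1000] * 10 + [6000] * 20 + [1000] * 20
temp = ([30.0] * 10
        + [30.0 - 0.1 * k for k in range(1, 21)]   # room cools while AC runs
        + [28.0 + 0.1 * k for k in range(1, 21)])  # room warms after switch-off
tags = tag_ac_status(current, temp)
print(sum(tags))  # 20 minutes tagged ON
```

The 15-min rising-temperature check is what separates a true switch-off from a short compressor OFF phase inside an AC session, where the current also drops but the room temperature recovers only briefly. Edges too close to the end of the series are left unconfirmed in this sketch.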
To validate that the logging system was running as expected, several test routines were carried out. One test checked that the electricity consumption values (mA) captured by Garud were similar to the values observed in the monthly electricity bills. Though our dataset covers 19 days, one year of current data was captured and compared to the monthly electricity bills. Before deployment, the devices were checked for precision by testing against accurate meters. Yokogawa (WT330) was used as the laboratory reference and Wattnode (WNC-3Y-480-MB) as the field reference meter. An experiment was conducted to measure Garud’s precision by comparing its readings with Yokogawa (WT330). The experiment was conducted by varying the load connected across Garud and the two standard meters. The current was varied from 100 mA to 12 A. Yokogawa (WT330) was further used for calibrating Garud, and the results are displayed as a scatter plot in Fig. 7.
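One common way to turn paired logger and reference readings into a calibration is an ordinary least-squares line, with the mean absolute percentage error as a summary of the remaining disagreement. The sketch below assumes that approach; the paper does not state which calibration method was used, and the readings are synthetic values spanning the 100 mA to 12 A test range, not measurements from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mape(reference, measured):
    """Mean absolute percentage error of `measured` against `reference`."""
    errors = [abs(m - r) / r for r, m in zip(reference, measured)]
    return 100 * sum(errors) / len(errors)


# Synthetic example: a logger that reads 2% high with a 5 mA offset.
reference_mA = [100, 500, 1000, 4000, 8000, 12000]
logger_mA = [1.02 * r + 5 for r in reference_mA]

# Map logger readings back onto the reference scale.
slope, intercept = fit_line(logger_mA, reference_mA)
corrected_mA = [slope * v + intercept for v in logger_mA]

print(round(mape(reference_mA, logger_mA), 2))     # error before calibration
print(round(mape(reference_mA, corrected_mA), 6))  # error after calibration
```

With real measurements the corrected error would not fall to zero, and the residual scatter around the fitted line is what a plot such as Fig. 7 would show.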
A similar accuracy test was performed on Envilog by comparing its readings with iButton and HOBO UX100-003. For this test, the sensors were placed inside a box and the temperature was varied by changing the set point of the air-conditioner. The temperature was varied from 19 to 30 °C. Since Envilog did not use a real-time clock, its timestamps may have drifted by a couple of minutes during the study. The temperature and humidity plots in Fig. 8 show the results of this experiment.
To verify data correctness, we visualized the average AC usage hours for all 11 homes. Figure 9 shows the fraction of time the AC is used in each 1-h interval of the day, averaged across the 11 homes. AC usage is high during the early morning and late night hours. The figure also shows that the most common AC usage window in all houses is during sleep hours, generally between 10:00 pm and 07:00 am, with an average runtime of 6.2 h. There is also minor AC usage during the afternoon hours between 02:00 pm and 06:00 pm.
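A usage profile of this kind can be derived directly from the 1-min AC status column. The sketch below computes, for each hour of the day, the fraction of samples tagged ON. The function name `hourly_on_fraction` and the synthetic one-day schedule are assumptions for illustration; real input would be the (datetime, AC status) pairs read from a Garud file.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_on_fraction(records):
    """Map hour of day (0-23) to the fraction of samples in that hour with AC ON.

    `records` is an iterable of (datetime, status) pairs with status 0 or 1.
    """
    on = defaultdict(int)
    total = defaultdict(int)
    for stamp, status in records:
        total[stamp.hour] += 1
        on[stamp.hour] += status
    return {hour: on[hour] / total[hour] for hour in sorted(total)}


# Synthetic day at 1-min resolution: AC ON from 22:00 until 07:00.
start = datetime(2021, 5, 10)
records = []
for minute in range(24 * 60):
    stamp = start + timedelta(minutes=minute)
    status = 1 if stamp.hour >= 22 or stamp.hour < 7 else 0
    records.append((stamp, status))

fractions = hourly_on_fraction(records)
runtime_h = sum(status for _, status in records) / 60
print(fractions[23], fractions[12], runtime_h)  # 1.0 0.0 9.0
```

Passing several days or several homes through the same function pools their samples, so each hourly value becomes the share of all samples in that hour with the AC ON.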
India being a hot tropical country, AC is a major load contributor. The number of residential AC units in India is predicted to increase from 21.8 million in 2017 to 1046 million units by 2047. In contrast to the centralized ACs that are common in the West, India has different types of ACs (split AC and window AC) installed in individual rooms. Our dataset contains monitored data from 11 homes in Hyderabad, India, a city with a composite climate, for a 19-day period. The dataset comprises 450 AC cycles and more than 2000 compressor ON/OFF cycles. Availability of such a dataset will help researchers train and test NILM algorithms to recognize AC events.
Availability of data and materials
The dataset is available on figshare with https://doi.org/10.6084/m9.figshare.16869439.
NILM: Non-Intrusive Load Monitoring
CEA: Central Electricity Authority
CSV: Comma Separated Values
IST: Indian Standard Time
Ali SB, Hasanuzzaman M, Rahim NA, Mamun MA, Obaidellah UH (2021) Analysis of energy consumption and potential energy savings of an institutional building in Malaysia. Alex Eng J 60(1):805–820
ANNEX E (2019) Definition and simulation of occupant behavior in buildings. Tech. rep., IAE. http://www.annex66.org/. Accessed 10 June 2022
Barker S, Mishra A, Irwin D, Cecchet E, Shenoy P, Albrecht J (2012) Smart*: an open data set and tools for enabling research in sustainable homes. SustKDD 111(112):108
Batra N, Parson O, Berges M, Singh A, Rogers A (2014a) A comparison of non-intrusive load monitoring methods for commercial and residential buildings. arXiv preprint. arXiv:1408.6595
Batra N, Singh A, Singh P, Dutta H, Sarangan V, Srivastava M (2014b) Data driven energy efficiency in buildings. arXiv preprint. arXiv:1404.7227
Beckel C, Kleiminger W, Cicchetti R, Staake T, Santini S (2014) The ECO data set and the performance of non-intrusive load monitoring algorithms. In: Proceedings of the 1st ACM conference on embedded systems for energy-efficient buildings. pp 80–89
Filip A (2011) Blued: a fully labeled public dataset for event-based nonintrusive load monitoring research. In: 2nd workshop on data mining applications in sustainability (SustKDD), vol 2012
Brounen D, Kok N, Quigley JM (2012) Residential energy use and conservation: economics and demographics. Eur Econ Rev 56(5):931–945
Central Electricity Authority (2020) Growth of electricity sector in India. https://cea.nic.in/wp-content/uploads/pdm/2020/12/growth_2020.pdf. Accessed 10 June 2022
Debnath KB, Jenkins DP, Patidar S, Peacock AD (2020) Understanding residential occupant cooling behaviour through electricity consumption in warm-humid climate. Buildings 10(4):78
Firth S, Kane T, Dimitriou V, Hassan T, Fouchal F, Coleman M, Webb L (2017) REFIT smart home dataset
Gao J, Giri S, Kara EC, Bergés M (2014) Plaid: a public dataset of high-resolution electrical appliance measurements for load identification research: demo abstract. In: Proceedings of the 1st ACM conference on embedded systems for energy-efficient buildings. pp 198–199
Garg A, Maheshwari J, Mukherjee D (2021) Transitions towards energy-efficient appliances in urban households of Gujarat state, India. Int J Sustain Energy 40(7):638–653
Government of India (2019) India cooling action plan; Government of India, New Delhi, India. http://ozonecell.nic.in/wp-content/uploads/2019/03/INDIACOOLING-ACTION-PLAN-e-circulation-version080319.pdf. Accessed 10 June 2022
Gupta R, Antony A, Garg V, Mathur J (2021) Investigating the relationship between residential AC, indoor temperature and relative humidity in Indian dwellings. J Phys Conf Ser 2069(1):012103
Hu S, Yan D, Qian M (2019) Using bottom-up model to analyze cooling energy consumption in China’s urban residential building. Energy Build 1(202):109352
International Energy Agency (2018) The future of cooling. https://www.iea.org/reports/the-future-of-cooling. Accessed 10 June 2022
Kelly J, Knottenbelt W (2015) The UK-DALE dataset, domestic appliance-level electricity demand and whole-house demand from five UK homes. Sci Data 2(1):1–4
Kolter JZ, Johnson MJ (2011) REDD: a public data set for energy disaggregation research. In: Workshop on data mining applications in sustainability (SIGKDD), San Diego, CA, vol 25, pp 59–62
Liu H, Sun H, Mo H, Liu J (2021) Analysis and modeling of air conditioner usage behavior in residential buildings using monitoring data during hot and humid season. Energy Build 1(250):111297
Makonin S (2016) Ampds2: the almanac of minutely power dataset (version 2). Harvard Dataverse. V2
Makonin S, Ellert B, Bajić IV, Popowich F (2016) Electricity, water, and natural gas consumption of a residential house in Canada from 2012 to 2014. Sci Data 3(1):1–2
Makonin S, Wang ZJ, Tumpach C (2018) RAE: the rainforest automation energy dataset for smart grid meter data analysis. Data 3(1):8
Ministry of Housing and Urban Affairs, Government of India (2019) https://pmay-urban.gov.in/. Accessed 10 June 2022
Monacchi A, Egarter D, Elmenreich W, D'Alessandro S, Tonello AM (2014) GREEND: an energy consumption dataset of households in Italy and Austria. In: 2014 IEEE international conference on smart grid communications (SmartGridComm). IEEE, pp 511–516
Open Data Telangana (2017) https://data.telangana.gov.in/. Accessed 10 June 2022
Pandey B, Bohara B, Pungaliya R, Patwardhan SC, Banerjee R (2021) A thermal comfort-driven model predictive controller for residential split air conditioner. J Build Eng 1(42):102513
Parson O, Fisher G, Hersey A, Batra N, Kelly J, Singh A, Knottenbelt W, Rogers A (2015) Dataport and NILMTK: a building data set designed for non-intrusive load monitoring. In: 2015 IEEE global conference on signal and information processing (globalsip). IEEE, pp 210–214
Qarnain SS, Muthuvel S, Bathrinath S (2021) Modelling of driving factors for energy efficiency in buildings using Best Worst Method. Mater Today Proc 1(39):137–141
Rashid H, Singh P, Singh A (2019) I-BLEND, a campus-scale commercial and residential buildings electrical energy dataset. Sci Data 6(1):1–2
Reinhardt A, Baumann P, Burgstahler D, Hollick M, Chonov H, Werner M, Steinmetz R (2012) On the accuracy of appliance identification based on distributed load metering data. In: 2012 sustainable internet and ICT for sustainability (SustainIT). IEEE, pp 1–9
Shin C, Lee E, Han J, Yim J, Rhee W, Lee H (2019) The ENERTALK dataset, 15 Hz electricity consumption data from 22 houses in Korea. Sci Data 6(1):1–3
Tejaswini D, Garg V, Hussain AM, Mathur J (2019) Development of open-source low-cost building monitoring sensors using IoT standards. Air Cond Refrig J ISHRAE. 74–86
Uttama Nambi AS, Reyes Lua A, Prasad VR (2015) Loced: location-aware energy disaggregation framework. In: Proceedings of the 2nd ACM international conference on embedded systems for energy-efficient built environments. pp 45–54
Xu X, González JE, Shen S, Miao S, Dou J (2018) Impacts of urbanization and air pollution on building energy demands—Beijing case study. Appl Energy 1(225):98–109
Yang W, Cao X (2018) Examining the effects of the neighborhood built environment on CO2 emissions from different residential trip purposes: a case study in Guangzhou, China. Cities 1(81):24–34
We thank the homeowners who voluntarily participated in the study.
About this supplement
This article has been published as part of Energy Informatics Volume 5 Supplement 4, 2022: Proceedings of the Energy Informatics.Academy Conference 2022 (EI.A 2022). The full contents of the supplement are available online at https://energyinformatics.springeropen.com/articles/supplements/volume-5-supplement-4.
The Department of Science and Technology, India (DST) and the Engineering and Physical Sciences Research Council, UK (EPSRC) have provided joint funding for this work under the India-UK partnership grant no. EP/R008434/1 for Residential Building Energy Demand in India (RESIDE). We also thank IHub-Data, IIIT Hyderabad for granting a research fellowship to the primary author (Dharani Tejaswini).
Ethics approval and consent to participate
IRB, Ethics Approval Committee of IIIT-Hyderabad has approved the study.
Consent for publication
All authors have given their consent for publication.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Tejaswini, D., Ramapragada, P., Gundepudi, S. et al. Residential electricity current and appliance dataset for AC-event detection from Indian dwellings. Energy Inform 5 (Suppl 4), 38 (2022). https://doi.org/10.1186/s42162-022-00225-4