Prior knowledge from EWH thermal dynamics can be employed in PINN loss functions and in designing PINN architecture. Two separate temperature prediction models with one-minute prediction intervals for ON- and OFF-events of an EWH are developed. This section describes data preparation, setting up physics-based loss functions, PINN architecture, and recursive training steps for the proposed approach.

### Data preparation

The training data set used in this work contains minute-wise continuous power consumption data. The number of ON/OFF events and their duration can be inferred from power consumption data. From the data set, ON and OFF events are separated for training ON- and OFF-dynamics models respectively. Average water rates are estimated for randomly selected events and used for training.
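The event separation described above can be sketched as follows. This is a minimal illustration, not the procedure used in this work: the function name `split_events`, the 50 W ON threshold, and the sample readings are all assumptions made for the example.

```python
from itertools import groupby


def split_events(power_w, on_threshold_w=50.0):
    """Group minute-wise power readings into consecutive ON/OFF events.

    Returns a list of (state, start_index, duration_minutes) tuples.
    The threshold is a hypothetical value, not taken from the data set.
    """
    events = []
    index = 0
    for is_on, run in groupby(power_w, key=lambda p: p > on_threshold_w):
        duration = sum(1 for _ in run)  # one reading per minute
        events.append(("ON" if is_on else "OFF", index, duration))
        index += duration
    return events


# Hypothetical minute-wise readings in watts.
power = [0, 0, 0, 4500, 4500, 0, 0, 0, 0, 4500]
events = split_events(power)
on_events = [e for e in events if e[0] == "ON"]    # for the ON-dynamics model
off_events = [e for e in events if e[0] == "OFF"]  # for the OFF-dynamics model
```

Each tuple carries the event duration, which is the quantity later needed to fix \(t_{end}\) for an event.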

### Loss functions

The custom loss functions developed to train the neural network (NN) are described here. Because temperature data are unavailable, the loss functions must represent the physics of the EWH well enough for the NN to learn from them. Several such physics-based loss functions are developed and listed below.

#### Temperature range

In a thermostat-controlled EWH, under normal operation, the water temperature will not exceed \(T_{hi}\) and will not drop below \(T_{lo}\). This knowledge is modeled as a loss function \(L_1\) and is shown in Eq. (5).

$$\begin{aligned} \begin{aligned} L_1=&\,\,{\left\{ \begin{array}{ll}(T_{prd}(t)-\frac{(T_{hi}+T_{lo})}{2}) &{} T_{prd}(t)>T_{hi} \\ &{} or\; T_{prd}(t)< T_{lo}\\ 0 &{} T_{lo}< T_{prd}(t)< T_{hi}.\end{array}\right. } \end{aligned} \end{aligned}$$

(5)

Loss function L\(_{1}\) penalizes the network during training when the predicted temperature is not between \(T_{hi}\) and \(T_{lo}\).
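A minimal sketch of the temperature-range penalty in Eq. (5) is given below. Two details are assumptions of this sketch rather than statements from the text: the absolute value is taken so that the penalty is non-negative on both sides of the band, and the boundary values \(T_{lo}\) and \(T_{hi}\) themselves are treated as in range. The set-points 45 and 55 are illustrative only.

```python
def range_loss(t_prd, t_lo, t_hi):
    """Penalty in the spirit of Eq. (5).

    Zero inside the thermostat band; otherwise the distance of the
    prediction from the band midpoint.
    Assumptions: abs() is applied, and the band edges count as in range.
    """
    if t_lo <= t_prd <= t_hi:
        return 0.0
    return abs(t_prd - (t_hi + t_lo) / 2.0)


# Hypothetical set-points, in degrees Celsius.
T_LO, T_HI = 45.0, 55.0
inside = range_loss(50.0, T_LO, T_HI)  # within the band
above = range_loss(57.0, T_LO, T_HI)   # above T_hi
below = range_loss(43.0, T_LO, T_HI)   # below T_lo
```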

#### Temperature change

When the EWH is OFF, the water temperature continues to decrease until it reaches \(T_{lo}\), at which point the EWH turns on. In the OFF-dynamics model, therefore, the previous prediction minus the current prediction, \(T_{prd}(t-1)-T_{prd}(t)\), should be positive for consecutive time steps of an event. For the OFF-dynamics model, this information is modeled as a loss function \(L^{off}_{2}\), as shown in Eq. (6).

$$\begin{aligned} \begin{aligned} L^{off}_2={\left\{ \begin{array}{ll}(T_{prd}(t)-T_{prd}(t-1)) &{} T_{prd}(t)\ge T_{prd}(t-1)\\ 0 &{} otherwise. \end{array}\right. } \end{aligned} \end{aligned}$$

(6)

When the EWH is ON, the water temperature increases when the water rate is sufficiently low and decreases or stays the same when the water rate is sufficiently high. Since the estimated average water rate is used in this model, the DEM predicts that the water temperature does not decrease at any instant while the EWH is ON. For the ON-dynamics model, this information is modeled as loss function \(L^{on}_{2}\), as shown in Eq. (7).

$$\begin{aligned} \begin{aligned} L^{on}_2={\left\{ \begin{array}{ll}(T_{prd}(t-1)-T_{prd}(t)) &{} T_{prd}(t-1)\ge T_{prd}(t)\\ 0 &{} otherwise. \end{array}\right. } \end{aligned} \end{aligned}$$

(7)
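The two temperature-change penalties in Eqs. (6) and (7) reduce to a one-sided difference, which the following sketch makes explicit. The function names and the sample trajectory are illustrative, and summing the per-step penalties over an event is an assumption of this sketch.

```python
def off_change_loss(t_prev, t_curr):
    """Eq. (6): penalize a temperature rise during an OFF event."""
    return max(0.0, t_curr - t_prev)


def on_change_loss(t_prev, t_curr):
    """Eq. (7): penalize a temperature drop during an ON event."""
    return max(0.0, t_prev - t_curr)


def event_change_loss(trajectory, step_loss):
    """Sum a per-step penalty over consecutive predictions of one event.

    Summation over the event is an assumption of this sketch.
    """
    return sum(step_loss(a, b) for a, b in zip(trajectory, trajectory[1:]))


# Hypothetical OFF-event predictions with one non-physical rise (50.0 -> 50.5).
off_trajectory = [51.0, 50.0, 50.5, 49.0]
off_penalty = event_change_loss(off_trajectory, off_change_loss)
```

Only the step from 50.0 to 50.5 contributes to `off_penalty`; the decreasing steps are left unpenalized.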

#### Total temperature change per event

The water temperatures at the start (T\(_\mathrm{{start}}\)) and end (T\(_\mathrm{{end}}\)) of an event are known. For OFF events, T\(_\mathrm{{start}}\) and T\(_\mathrm{{end}}\) are equal to the thermostat control set-points \(T_{hi}\) and \(T_{lo}\), respectively. For ON events, they are equal to \(T_{lo}\) and \(T_{hi}\). The end time t\(_\mathrm{{end}}\) can be inferred from the power consumption data. This information is modeled as loss function L\(_{3}\), as shown in Eq. (10).

$$\begin{aligned} T_{spd}= & {} \mid (T_{hi}-T_{lo})\mid. \end{aligned}$$

(8)

$$\begin{aligned} T^{prd}_{spd}= & {} \mid (T(t_{start})-T_{prd}(t_{end}))\mid \nonumber \\ t_{start}= & {} 0\quad \quad T^{on}(t_{start})=T_{lo}\quad \quad T^{off}(t_{start})=T_{hi}. \end{aligned}$$

(9)

$$\begin{aligned} L_3= & {} {\left\{ \begin{array}{ll}\mid (T_{spd}-T^{prd}_{spd})\mid &{} T^{prd}_{spd}\ne T_{spd}\\ 0 &{} otherwise. \end{array}\right. } \end{aligned}$$

(10)

\(T_{spd}\) and \(T^{prd}_{spd}\) represent the actual and predicted set-point differences, respectively. The \(L_{3}\) loss function ensures that the predicted temperature at \(t_{end}\) is equal to the set-point temperature for a particular event.
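Eqs. (8) to (10) can be traced in a few lines, as in the sketch below. The function name and the numerical set-points are illustrative assumptions.

```python
def span_loss(t_start, t_prd_end, t_lo, t_hi):
    """Per-event loss following Eqs. (8)-(10)."""
    t_spd = abs(t_hi - t_lo)              # Eq. (8): actual set-point difference
    t_spd_prd = abs(t_start - t_prd_end)  # Eq. (9): predicted difference
    return abs(t_spd - t_spd_prd)         # Eq. (10): zero when they agree


# Hypothetical set-points, in degrees Celsius.
T_LO, T_HI = 45.0, 55.0
# OFF event that starts at T_hi but is predicted to end 2 degrees short of T_lo.
off_short = span_loss(T_HI, 47.0, T_LO, T_HI)
# ON event that starts at T_lo and is predicted to end exactly at T_hi.
on_exact = span_loss(T_LO, 55.0, T_LO, T_HI)
```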

#### Relevant-ODE

The L\(_{1}\)–L\(_{3}\) loss functions do not reflect the influence of the water temperature *T*(*t*) on its rate of change \({\dot{T}}(t)\). Temperature change is continuous in the real world, so \({\dot{T}}(t)\) also changes continuously. Predictions with a constant \({\dot{T}}(t)\) can nevertheless satisfy the L\(_{1}\)–L\(_{3}\) losses. For a constant water rate, the magnitude of \({\dot{T}}(t)\) decreases as *T*(*t*) moves closer to \(T_{hi}\) when the EWH is ON and as *T*(*t*) moves closer to \(T_{lo}\) when the EWH is OFF.

In a supervised PINN model, the relevant ODE is generally used to form an additional loss function that incorporates prior scientific knowledge. The ODE used in this work is given in Eq. (2). The goal is to minimize the difference between the gradient of the network’s output with respect to its inputs and the gradient calculated using the DEM. This ensures that the PINN solutions are consistent with the known physics. The modeled loss function L\(_{4}\) is shown in Eq. (11).

$$\begin{aligned} L_4=\left(\frac{dT_{DEM}(x_i)}{dt}-\frac{dNN(x_i)}{dt}\right), \end{aligned}$$

(11)

where \(\frac{dT_{DEM}(x_i)}{dt}\) is calculated using Eq. (2). The L\(_{4}\) loss function ensures that, for a constant water rate, the magnitude of \({\dot{T}}(t)\) decreases as *T*(*t*) moves closer to \(T_{hi}\) when the EWH is ON and as *T*(*t*) moves closer to \(T_{lo}\) when the EWH is OFF.
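The residual in Eq. (11) can be illustrated as follows. This is only a sketch under several assumptions: `toy_rate` is a placeholder first-order cooling law and is not Eq. (2); `toy_model` stands in for the trained network; and the derivative with respect to the time-step input is approximated by a central finite difference, whereas an automatic-differentiation framework would normally supply it.

```python
def ode_residual(model, dem_rate, t_prev, water_rate, dt=60.0, eps=1e-3):
    """Eq. (11): DEM rate of change minus the network's rate of change.

    `model(t_prev, water_rate, dt)` returns the predicted temperature.
    Its derivative with respect to the time-step input `dt` is
    approximated by a central finite difference (assumption of this sketch).
    """
    d_nn = (model(t_prev, water_rate, dt + eps)
            - model(t_prev, water_rate, dt - eps)) / (2.0 * eps)
    return dem_rate(t_prev, water_rate) - d_nn


def toy_rate(temp, water_rate):
    """Placeholder rate law (NOT Eq. (2)): cooling toward 20 degrees."""
    return -1e-4 * (temp - 20.0)


def toy_model(t_prev, water_rate, dt):
    """Stand-in for the network: one explicit Euler step of toy_rate."""
    return t_prev + dt * toy_rate(t_prev, water_rate)


# A model consistent with the rate law gives a residual close to zero.
residual = ode_residual(toy_model, toy_rate, 50.0, 0.0)
```

A stand-in model that ignores the time-step input would instead leave the full DEM rate as the residual, which is the behavior the L\(_{4}\) term is meant to penalize.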

All loss values are scaled and combined together for training the PINN.

### Network architecture

A simple feed-forward artificial neural network is used to develop the proposed architecture. A priori knowledge from the single-zone model is used to define the network architecture. More precisely, the initial layers are structured to mimic the differential equation defined in Eq. (2). This ensures that the physics of the single-zone model is embedded in the NN architecture and that the inputs are processed according to it before being fed to the neural layers. The schematic representation of the proposed architecture for the ON- and OFF-dynamics models is shown in Fig. 2. The number of layers and neurons in the figure does not reflect the actual model. NN parameters such as the number of layers, neurons, and activation functions were selected heuristically. The inputs to the models are the previous time-step temperature, the average water rate, and the other relevant inputs to Eq. (2), which are assumed constant. The output is the current water temperature.

There is also another constant input representing the time step (60 s) of the prediction. This input is constant for all predictions and yet relevant to training the network: without it, the network has no time reference. The time input is also required to calculate the L\(_{4}\) loss function, which requires the gradient of the NN with respect to the input time step.

The proposed architecture mimicking the DEM requires previous step temperature as an input. However, the goal is to learn the underlying EWH thermal dynamics in the absence of temperature data. A recursive training approach is selected to train the network.

### Recursive training

In time series data, the current value is highly correlated with the immediate past values. Hence, the previous step temperature is highly relevant for a current temperature prediction model. Recursive training is an iterative training approach in which the next input is created from the output for the previous training input. In this case, the previous step temperature input is created from the PINN output for the previous input. The recursive training procedure is outlined below.

First, the network weights are initialized randomly. For each event, an initial value for the previous step temperature is required and is available: the initial temperature of an event is equal to \(T_{hi}\) for an OFF event or \(T_{lo}\) for an ON event, and this value acts as the initial previous step temperature. The next input is created from the output predicted by the PINN. This process repeats until the end of the event. Since the event duration is known, once an event ends, the input is reset to \(T_{hi}\) or \(T_{lo}\) for the next event and the steps are repeated. Once all events are completed, the loss value is calculated for the epoch and the weights are updated. The same procedure is repeated for the remaining epochs.
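The forward pass of the recursive procedure described above can be sketched as follows. The sketch covers only the recursive input creation and the per-epoch loss accumulation; the weight update is omitted because it depends on the training framework. The function names, the `model(t_prev, water_rate, dt)` signature, the toy model, and the set-points are all assumptions.

```python
def rollout_event(model, event_type, duration_min, water_rate, t_lo, t_hi, dt=60.0):
    """Predict one event recursively, feeding each output back as the next input."""
    # Known initial temperature: T_hi for an OFF event, T_lo for an ON event.
    t_prev = t_hi if event_type == "OFF" else t_lo
    trajectory = [t_prev]
    for _ in range(duration_min):
        t_prev = model(t_prev, water_rate, dt)  # output becomes the next input
        trajectory.append(t_prev)
    return trajectory


def epoch_loss(model, events, loss_fn, t_lo, t_hi):
    """Accumulate the loss over all events; weights would be updated afterwards."""
    total = 0.0
    for event_type, duration_min, water_rate in events:
        # The input is reset to the set-point at the start of every event.
        trajectory = rollout_event(model, event_type, duration_min,
                                   water_rate, t_lo, t_hi)
        total += loss_fn(trajectory)
    return total


def toy_off_model(t_prev, water_rate, dt):
    """Stand-in for an untrained OFF-dynamics network: cools 1 degree per step."""
    return t_prev - 1.0


trajectory = rollout_event(toy_off_model, "OFF", 3, 0.0, 45.0, 55.0)
# Example loss: distance of the final prediction from T_lo for each OFF event.
loss = epoch_loss(toy_off_model, [("OFF", 3, 0.0), ("OFF", 10, 0.0)],
                  lambda traj: abs(traj[-1] - 45.0), 45.0, 55.0)
```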

The flow chart summarizing the entire model development is shown in Fig. 3. Once the NN has been trained to minimize the total custom loss, the ON- and OFF-dynamics models can be used recursively to predict the water temperature, and the power consumption of future events can be inferred from \(T_{prd}(t)\). The schematic representation of the procedure for using the trained model is shown in Fig. 4. The model requires an initial point to start the prediction. If the initial point is the beginning of an ON event, the ON-dynamics model is used until \(T_{prd}(t)\) reaches \(T_{hi}\), after which prediction switches to the OFF-dynamics model. This alternation continues until the end of the simulation period.
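The switching logic for using the two trained models can be sketched as below. The toy ON and OFF models, the set-points, and the 4500 W rated power are hypothetical; switching back to the ON-dynamics model at \(T_{lo}\) follows the thermostat behavior described for OFF events.

```python
def simulate(on_model, off_model, t_init, heater_on, n_steps,
             t_lo, t_hi, water_rate=0.0, dt=60.0):
    """Alternate between ON- and OFF-dynamics models over a simulation period."""
    temps, states = [], []
    t = t_init
    for _ in range(n_steps):
        model = on_model if heater_on else off_model
        t = model(t, water_rate, dt)  # recursive one-minute prediction
        temps.append(t)
        states.append(heater_on)
        if heater_on and t >= t_hi:
            heater_on = False         # upper set-point reached: switch to OFF model
        elif not heater_on and t <= t_lo:
            heater_on = True          # lower set-point reached: switch to ON model
    return temps, states


def toy_on(t_prev, water_rate, dt):
    """Stand-in ON-dynamics model: heats 2 degrees per step."""
    return t_prev + 2.0


def toy_off(t_prev, water_rate, dt):
    """Stand-in OFF-dynamics model: cools 1 degree per step."""
    return t_prev - 1.0


temps, states = simulate(toy_on, toy_off, 45.0, True, 7, t_lo=45.0, t_hi=49.0)
RATED_POWER_W = 4500.0  # hypothetical rated element power
power = [RATED_POWER_W if on else 0.0 for on in states]  # inferred consumption
```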