Accurate estimation of energy needs and consumption is one of the most important foundations of economic planning worldwide. It is also important to mitigate the adverse effects of CO2 emissions from conventional energy sources (e.g., climate change) by using renewable energy, as recommended by the European Commission. This study therefore forecasts the residential energy consumption of the household sector in the countries of the Euro area. Annual time series data from 1990 to 2015 were analyzed with Auto Regressive Integrated Moving Average (ARIMA) models, which were chosen because they can provide accurate results and can handle both stationary and non-stationary data. The analysis shows that ARIMA (0,1,1) is the most accurate model for this prediction, with an RMSE of 0.097, compared with RMSE values of 0.1068149 and 0.0975575 for the ARIMA (1,1,0) and ARIMA (1,1,2) models, respectively. The predicted household energy consumption in the Euro area is 186244 thousand toe (tonnes of oil equivalent), which indicates a drop in energy consumption in the Euro area, probably due to the increase in energy efficiency, especially in recent years.
1. Introduction
Since the beginning of the 20th century, global energy demand has increased constantly, driven by the growth of technologies for various applications [1,2,3] and by social change. Since 1970, the growth in energy consumption has accelerated: consumption doubled and, rather than following the predicted path, had by 2013 reached 2.3 times its earlier level, for a total consumption of about 9300 Mtoe, which is equal to roughly 390 EJ [4]. Because energy consumption affects economic growth, an accurate estimate of future energy demand is essential for planning the energy supply reliably [5]. Between 1996 and 2006, the residential segment accounted for approximately 30% of average global energy consumption. In Spain, for instance, whose energy consumption behaves similarly to that of its neighboring countries, households accounted for 17% of all final energy and 25% of electricity in 2010. Because residential buildings account for such a large share of energy consumption, forecasting tools applied to this segment must be reliable enough to support the objectives of energy policy [6]. Hence, this study aims to predict energy consumption in the Euro area household sector using an auto regressive integrated moving average (ARIMA) model, for use in future energy supply planning. For this purpose, the annual energy consumption of the residential segment in the countries of the Euro area was used as the input to the ARIMA model. After a review of prior art, the methodology is described in detail, the predictions are presented, and the results are discussed.
2. Prior art
Energy supply is among the most challenging issues at the global scale. The depletion of fossil fuels, on the one hand, and rising concern about the environmental impacts of carbon dioxide and other greenhouse gas emissions, on the other, have pushed decision makers to replace current energy sources with sustainable and renewable ones. In this regard, the authorities have established various roadmaps to secure the future energy supply, and predictions of energy consumption can help make these plans more realistic. Various types of mathematical models can be applied to forecast energy consumption [7,8], including moving averages, multiple regression models, exponential smoothing and neural networks. ARIMA has gained popularity in energy consumption forecasting because it is adaptable and because it facilitates the search for the most suitable model at each stage, whether identification, estimation or diagnostic checking [9]. The method has been widely applied to energy consumption in various countries [8,10,11,12]. Nichiforov et al. [13] reported that the ARIMA model is more accurate than the non-linear autoregressive neural network (NAR) model in forecasting energy consumption. Sen et al. [14] also indicated that ARIMA is an efficient method for forecasting greenhouse gas emissions as well as energy consumption in an Indian pig iron manufacturing industry. However, reports on the application of ARIMA to the prediction of energy consumption in the household sector are rare in the literature.
Because the Euro area is among the major players in global energy policy (together with US, EU, and China) and in climate change mitigation [15], it is highly important to have a realistic prediction of the amount of energy needed by its various sectors.
3. Materials and methods
3.1. Materials
In this study, the logarithm of the annual household energy consumption (lhec) of the countries belonging to the Euro area from 1990 to 2015 was used. The data were extracted from PORDATA [16]. The ARIMA analysis was performed with Microsoft Excel 2016 and R (version 3.4.2).
3.2. ARIMA model(s)
In ARIMA analysis, the underlying process is identified from the annual time series data in order to obtain a model that represents the mechanism generating the data as closely as possible. Following the Box-Jenkins method, the ARIMA approach consists of three main stages, identification, estimation and diagnostic checking, as shown in Figure 1.
In the first step, before identifying the ARIMA order that best suits the data, it is necessary to check whether the data are stationary [17,18,19]. In addition to inspecting the time plot of the data, the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test was performed to assess stationarity. In the KPSS test for the logarithm of household energy consumption, the null and alternative hypotheses are as follows:
H0: lhec is a stationary process
H1: lhec is not a stationary process
Because H0 states that the data are stationary, the series is treated as stationary when the calculated P-value is greater than 0.05. To select the ARIMA orders, the autocorrelation function (ACF) and partial autocorrelation function (PACF) were then plotted: the ACF graph indicates the order of the MA process and the PACF graph indicates the order of the AR process.
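The identification step can be illustrated with a minimal sketch of the sample ACF and PACF. It is written in plain Python rather than the R environment used in the study; the function names and the example series are hypothetical, and the PACF is obtained with the Durbin-Levinson recursion, which is one common way to compute it and not necessarily what the authors' software does.

```python
import math

def acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag of a series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def pacf(x, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf(x, max_lag)
    out = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        out.append(phi_kk)
    return out

def white_noise_band(n):
    """Approximate 95% bound for the autocorrelations of white noise."""
    return 1.96 / math.sqrt(n)

# Hypothetical differenced series, for illustration only (not the study data).
example = [0.02, -0.03, 0.01, -0.02, 0.04, -0.05, 0.02, -0.01, 0.03, -0.04]
band = white_noise_band(len(example))
spikes = [k for k, v in enumerate(acf(example, 4)) if k > 0 and abs(v) > band]
```

A lag listed in `spikes` corresponds to an ACF bar outside the approximate 95% band; a single such spike at lag 1, with the remaining lags inside the band, is the pattern usually read as an MA(1) signature.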
In the second step, a model with a given number of parameters was estimated with the ARIMA technique and the significance of its parameters was tested, that is, whether each parameter is zero (null hypothesis, H0) or different from zero (alternative hypothesis, Ha). Significance was assessed with the P-value. The software default of 0.05 was used as the significance level, which corresponds to a 95% confidence level; when the P-value was below this level, H0 was rejected. Finally, the results of the various ARIMA models were compared using the Bayesian information criterion (BIC), and the models with the lowest BIC values were considered the most suitable for this study.
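The estimation-stage checks can be sketched as follows. The sketch assumes that parameter significance is judged with a two-sided normal (z) test on the ratio of an estimate to its standard error, which is a common large-sample approximation and not necessarily the exact test reported by the authors' software. The function names are hypothetical; the BIC values are the ones reported for the three retained models in the Discussion.

```python
import math

def two_sided_p_value(estimate, std_error):
    """P-value of H0: parameter = 0, using a normal approximation."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

def is_significant(estimate, std_error, alpha=0.05):
    """Reject H0 when the P-value is below the significance level."""
    return two_sided_p_value(estimate, std_error) < alpha

def select_by_bic(bic_by_model):
    """Return the name of the model with the lowest BIC."""
    return min(bic_by_model, key=bic_by_model.get)

# BIC values as reported in the Discussion for the three retained models.
reported_bic = {
    "ARIMA(0,1,1)": -6.059352,
    "ARIMA(1,1,0)": -6.001923,
    "ARIMA(1,1,2)": -5.909778,
}
best = select_by_bic(reported_bic)  # "ARIMA(0,1,1)"
```

In this scheme a model is dropped when any of its parameters fails `is_significant`, and the BIC comparison is applied only to the models that remain.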
The final phase of the Box-Jenkins method, before forecasting, is to evaluate whether the residuals of the identified model are normally distributed white noise and whether the estimated parameters are uncorrelated with each other. In this communication, the Kolmogorov-Smirnov and Box-Pierce tests were applied to assess the normality and the non-correlation of the residuals, respectively. The null hypothesis of the Kolmogorov-Smirnov test is that the residuals are normally distributed, and that of the Box-Pierce test is that they are uncorrelated. Consequently, at the 95% confidence level, the calculated P-value should be greater than 0.05 for residuals that are normally distributed and uncorrelated. When the model contains more than one parameter, the correlation between two estimated parameters $\hat{\theta}_i$ and $\hat{\theta}_j$ can be calculated from the variance-covariance matrix with Eq 1:
\[ \rho_{ij} = \frac{\mathrm{cov}(\hat{\theta}_i, \hat{\theta}_j)}{\sqrt{\mathrm{var}(\hat{\theta}_i)\,\mathrm{var}(\hat{\theta}_j)}} \tag{1} \]
Because the Box-Jenkins approach is iterative, whenever new information emerges from the diagnostic stage it is possible to return to the first phase and identify a new class of models.
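The parameter-correlation check based on the variance-covariance matrix can be sketched in a few lines. The matrix below is a hypothetical example, not the matrix reported in this study, and the function names are illustrative; the 0.5 threshold follows the 50% limit used in the Results.

```python
import math

def cov_to_corr(cov):
    """Convert a variance-covariance matrix into a correlation matrix."""
    size = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(size)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(size)]
            for i in range(size)]

def max_abs_off_diagonal(corr):
    """Largest absolute correlation between two different parameters."""
    size = len(corr)
    return max((abs(corr[i][j]) for i in range(size)
                for j in range(size) if i != j), default=0.0)

def parameters_uncorrelated(cov, threshold=0.5):
    """True when every pairwise correlation is below the threshold."""
    return max_abs_off_diagonal(cov_to_corr(cov)) < threshold

# Hypothetical 2x2 variance-covariance matrix of two estimated parameters.
example_cov = [[0.04, 0.006],
               [0.006, 0.09]]
example_corr = cov_to_corr(example_cov)  # off-diagonal entries close to 0.1
```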
3.3. Forecasting methodology
To evaluate the prediction accuracy of the models, the data were divided into two periods: 1990 to 2010, referred to as the 'training data', and 2011 to 2015, referred to as the 'testing data'. The models were fitted to the training data and used to predict the logarithm of residential energy consumption for 2011 to 2015, and the predictions were then compared with the testing data. The forecasting error is the difference between the actual values (the testing data) and the values obtained from the model, as defined in Eq 2:
\[ e_t = y_t - \hat{y}_t \tag{2} \]
In Eq 2, $e_t$ is the forecasting error at time t, $y_t$ is the actual value at time t, and $\hat{y}_t$ is the forecast value at time t. The accuracy of the suitable models was measured with the Mean Absolute Error (MAE) and the Root Mean-Square Error (RMSE) [20] (Eqs 3 and 4), where n is the number of forecast periods:
\[ \mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} \lvert e_t \rvert \tag{3} \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^{2}} \tag{4} \]
The RMSE and MAE are not very informative on their own, but they allow the ARIMA models to be compared with one another. In general, the smaller the values, the better the model fits.
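The hold-out evaluation can be sketched as follows, assuming that the MAE and RMSE take their standard forms (mean of absolute errors, and square root of the mean squared error) over the forecast periods. The function names are hypothetical, and the series used here is a placeholder rather than the PORDATA series.

```python
import math

def split_by_year(years, values, last_training_year=2010):
    """Split a yearly series into training and testing parts."""
    train = [v for y, v in zip(years, values) if y <= last_training_year]
    test = [v for y, v in zip(years, values) if y > last_training_year]
    return train, test

def forecast_errors(actual, predicted):
    """Forecast error per period: actual minus predicted."""
    return [y - y_hat for y, y_hat in zip(actual, predicted)]

def mae(actual, predicted):
    """Mean absolute error over the forecast periods."""
    errors = forecast_errors(actual, predicted)
    return sum(abs(e) for e in errors) / len(errors)

def rmse(actual, predicted):
    """Root mean-square error over the forecast periods."""
    errors = forecast_errors(actual, predicted)
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Placeholder yearly series (NOT the study data): 26 values for 1990-2015.
years = list(range(1990, 2016))
values = [12.0 + 0.01 * i for i in range(len(years))]
train, test = split_by_year(years, values)  # 21 training and 5 testing values
```

Both measures are computed on the testing period only, so a model that fits the training data closely but forecasts poorly is penalized.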
4. Results
As the time plot in Figure 2 shows, the time series is clearly non-stationary because its mean changes over time. To eliminate the trend, the series was first-differenced; the result is shown in Figure 3.
Because it is difficult to judge from the time plot whether the first-differenced household energy consumption is stationary, the KPSS test was applied. The test gave a P-value of 0.1, which is greater than 0.05, so the null hypothesis was not rejected and the first-differenced data were concluded to follow a stationary process (Table 1).
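A minimal sketch of first differencing followed by a level-stationarity KPSS check is given below. It is an illustrative plain-Python implementation, not the routine used in the study: the function names are hypothetical, the lag truncation `lags=0` is an arbitrary default chosen for brevity, and the decision is made by comparing the statistic with the 5% critical value of 0.463 tabulated by Kwiatkowski et al. rather than by computing a P-value.

```python
def difference(series):
    """First difference: y_t - y_(t-1)."""
    return [b - a for a, b in zip(series, series[1:])]

def kpss_level_statistic(series, lags=0):
    """KPSS statistic for the null hypothesis of level stationarity."""
    n = len(series)
    mean = sum(series) / n
    resid = [v - mean for v in series]
    # Partial sums of the demeaned series.
    partial, running = [], 0.0
    for r in resid:
        running += r
        partial.append(running)
    eta = sum(s * s for s in partial) / n ** 2
    # Long-run variance with Bartlett weights (Newey-West form).
    long_run_var = sum(r * r for r in resid) / n
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)
        gamma = sum(resid[t] * resid[t - lag] for t in range(lag, n)) / n
        long_run_var += 2.0 * weight * gamma
    return eta / long_run_var

KPSS_LEVEL_CRITICAL_5PCT = 0.463  # tabulated 5% critical value

def looks_stationary(series, lags=0):
    """Do not reject stationarity when the statistic is below the 5% value."""
    return kpss_level_statistic(series, lags) < KPSS_LEVEL_CRITICAL_5PCT

# Hypothetical trending series (not the study data).
levels = [t + (0.5 if t % 2 else -0.5) for t in range(20)]
before = looks_stationary(levels)              # trend in levels is rejected
after = looks_stationary(difference(levels))   # differenced series passes
```

A small statistic means stationarity is not rejected, which corresponds to a large P-value in the test output.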
Next, to identify the AR and MA orders of the ARIMA model, the ACF and PACF were plotted (Figures 4 and 5).
Examination of the ACF and PACF graphs showed that the single negative spike at lag 1 in the ACF is an MA (1) signature, which makes ARIMA (0, 1, 1) a candidate model. Seven other models are also presented in Table 2 for comparison.
The ARIMA (1, 1, 1) and ARIMA (2, 1, 1) were eliminated outright because none of their estimated parameters is significant at the 95% confidence level. Likewise, the ARIMA (2, 1, 1), ARIMA (0, 1, 2) and ARIMA (2, 1, 0) were discarded because each contains an insignificant parameter. The remaining ARIMA (0, 1, 1), ARIMA (1, 1, 0) and ARIMA (1, 1, 2), which have approximately equal BIC values, were retained for further analysis of their diagnostics and forecasting accuracy. The residuals of these models were then analyzed to confirm that they are normally distributed and uncorrelated. The Quantile-Quantile (Q-Q) plots of the residuals are shown in Figure 6, and the results of the Box-Pierce test for non-correlation and the Kolmogorov-Smirnov test for normality are presented in Tables 3 and 4, respectively. Since the Box-Pierce P-values of all models are greater than 0.05, it can be concluded that the residuals are uncorrelated; likewise, the Kolmogorov-Smirnov results indicate that the residuals of all three models are normally distributed.
Furthermore, because ARIMA (1, 1, 2) has more than one parameter, its parameters must be checked for correlation. The correlations calculated with Eq 1 and the corresponding variance-covariance matrix are presented in Tables 5 and 6. Table 6 shows that the estimated parameters are uncorrelated, since all of their correlation values are below 50%.
According to the diagnostic checking, all three models, ARIMA (0, 1, 1), ARIMA (1, 1, 0) and ARIMA (1, 1, 2), are properly specified and consistent with the process underlying the residential energy consumption time series of the Euro area countries. All three models can therefore be used for forecasting. The forecasts of the logarithm of residential energy consumption until 2020 obtained with ARIMA (0, 1, 1), ARIMA (1, 1, 0) and ARIMA (1, 1, 2) are shown in Figure 7.
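To make the forecasting step concrete, the following sketch shows how point forecasts arise from an ARIMA (0, 1, 1) model once its MA coefficient is known. It assumes a model without drift, y_t = y_(t-1) + e_t + theta * e_(t-1), and starts the residual recursion from e_0 = 0; both are simplifying assumptions made for illustration, and the function name, series and theta value are hypothetical rather than taken from the fitted model.

```python
def arima011_forecast(y, theta, horizon):
    """Point forecasts from ARIMA(0,1,1) without drift, for a given theta."""
    e_prev = 0.0  # assumed starting residual
    for t in range(1, len(y)):
        one_step = y[t - 1] + theta * e_prev  # forecast of y[t] made at t-1
        e_prev = y[t] - one_step              # residual observed at t
    next_value = y[-1] + theta * e_prev
    # Beyond one step the future residuals have expectation zero,
    # so the point forecast stays at the same level.
    return [next_value] * horizon

# Hypothetical log-consumption series and MA coefficient.
series = [12.10, 12.14, 12.11, 12.16, 12.12]
forecast = arima011_forecast(series, theta=-0.5, horizon=5)
```

With theta = 0 the model reduces to a random walk and the forecast equals the last observation; a negative theta pulls the forecast back toward earlier levels, in the same way as simple exponential smoothing with smoothing constant 1 + theta.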
To evaluate the accuracy of the forecasting models, the logarithm of household energy consumption for 2011 to 2015 was estimated from the data for 1990 to 2010 with the ARIMA (0, 1, 1), ARIMA (1, 1, 0) and ARIMA (1, 1, 2) models. The estimated values from the different models are presented in Table 7 together with the actual values (the 'Test Set').
The MAE and RMSE of the three models were then computed according to Eqs 3 and 4 and are displayed in Table 8.
5. Discussion
According to the BIC values (−6.059352, −6.001923 and −5.909778 for ARIMA (0, 1, 1), ARIMA (1, 1, 0) and ARIMA (1, 1, 2), respectively), ARIMA (0, 1, 1) has the smallest value, although it is very close to the others. In addition, the diagnostic checks for all three models indicate that their residuals are normally distributed and uncorrelated. All three models can therefore be applied to forecast energy consumption by the residential sector in the Euro area. In terms of accuracy, however, ARIMA (0, 1, 1) is the most suitable. The predicted household energy consumption in the Euro area is 186244 thousand toe (tonnes of oil equivalent), which is less than the amount consumed at the beginning of the 2000s. The predictions (Figure 7a) thus point to a drop in energy consumption in the Euro area, which can be attributed to the increase in energy efficiency, especially in recent years. Various reports in the literature emphasize that energy efficiency has improved considerably in recent years through innovative and sustainable technologies [21,22,23,24,25]. This has also been emphasized by European Energy Agency [26].
Predicting household energy consumption in the Euro area is highly important from both environmental and economic perspectives. Because of the large amount of CO2 released into the atmosphere since the previous century, the earth is undergoing undeniable changes in the form of rising temperatures [27]. This concern has attracted a great deal of attention from the parties involved worldwide. One effective way to mitigate such changes is to decrease the amount of CO2 released; one proposed alternative is 'zero CO2 emission', under which all sectors must find alternatives that produce little or no CO2 from their activities [28]. In line with this, the European Commission has set goals and regulations to pave the way toward this objective, requiring all countries of the Euro area to agree on specific terms and to implement them. Although the regulation is the same for all members of the Euro area, each country is allowed to choose its own path to comply with it [29]. One of the goals at the center of attention is the use of renewable energy sources by 2050 to provide the energy required by all sectors, because energy production from fossil fuels generates a large amount of CO2 [30]. Some developed countries, Germany for instance, have already taken measures in this direction. Germany is planning to close its nuclear power facilities and replace them with renewable sources by 2020 [31]. From another point of view, this change can be considered risky, as a cheap and highly efficient energy source will be replaced with an expensive one that does not provide the same efficiency [32].
Thus, a model must be applied to forecast the amount of energy required so that the necessary resources can be calculated. Forecasting energy consumption enables well-targeted investment in energy sources, as early and precautionary action saves money more effectively. According to EU [33], the share of energy in household expenditure is rising to 16% by 2030 and 15% by 2050, reflecting its importance for the welfare of communities. Predictions of energy consumption can considerably aid the making of sustainable plans to develop more efficient and economical sources of energy for future needs.
6. Conclusion
This study sought to determine the most suitable and accurate ARIMA model for predicting residential energy consumption in the countries of the Euro area. For this purpose, the Box-Jenkins approach of identification, estimation and diagnostic checking was applied, which yielded the ARIMA (0, 1, 1), ARIMA (1, 1, 0) and ARIMA (1, 1, 2) models. The forecasting accuracy of each model was then assessed with the MAE and RMSE. The analysis of the results shows that the ARIMA (0, 1, 1) model, with an RMSE of 0.09725904, is the most accurate and suitable model for predicting residential energy consumption in the countries of the Euro area. The results indicate that energy consumption in the Euro area will decrease compared with the beginning of the 2000s. This may be because energy efficiency has improved considerably through the use of novel and advanced technologies.
7. Research implication
The results of this study will help decision makers form a realistic estimate of the future energy needs of the Euro area household sector, so that current energy sources can be replaced with sustainable and renewable ones. This work can also inform the future roadmaps prepared by the authorities to secure the energy supply of the Euro area.
Conflict of interest
The authors declare there are no conflicts of interest in this paper.