Efficient Building Load Forecasting

Iván Fernández, Cruz E. Borges and Yoseba K. Penya
University of Deusto, Bilbao, Basque Country
{ivan.fernandez,cruz.borges,yoseba.penya}@.deusto.es

Abstract: The arrival of the smart grid paradigm has brought a number of novel initiatives that aim to increase the energy efficiency of buildings, such as smart metering or demand side management. All of them demand accurate load estimation. Short-term load forecasting in buildings presents additional requirements, among others the need for prediction models with simple or non-existent parametrisation processes. We extend a previous work that evaluated a number of algorithms to this end. We present several improvements, including a variable learning data window and diverse weighting combinations for the learning data, that further improve our results. Finally, we have tested all the algorithms and modalities with four different datasets to show how the results hold up.

I. INTRODUCTION

A hundred years after their initial development, deployment and spread, power networks now face a new revolution. With the peak of petrol production in sight, the continuous growth of demand world-wide and the need to reduce polluting emissions, efficiency at all levels is the key to the optimal, sustainable use of resources [1]. The emerging smart grid paradigm (a term coined in contrast to the old, allegedly non-intelligent transmission and distribution networks) refers to the future electrical grid in which Information and Communication Technologies (ICT) will play a key role in achieving a higher degree of efficiency. Smart metering is one of the basic pillars of this vision: modern meters will enable a number of innovative services such as remote metering, bi-directional communication, electricity price updating or load forecasting.

Specifically, predicting future consumption over a time span ranging from minutes to several days is crucial for several smart grid applications. For instance, Demand Side Management (DSM) is the common banner for techniques and models that encourage consumers to modify the amount and pattern of their energy demand. It therefore requires load forecasting in order to know beforehand the load to be smoothed. Moreover, it is well known that publishing the actual energy consumption of public buildings and households reduces the load (usually the released information is enriched with equivalent CO2 emissions and consumption cost); showing the future consumption reinforces this effect. Finally, accurately modelling the typical load curve of a building and its future form(s) (for instance in the case of a school, a university or a company's facilities) helps to negotiate a better price with the energy retailer, since the retailer will face smaller deviations in its client portfolio's consumption (and, therefore, pay fewer penalties).

Historically, there has been intense activity around short-term load forecasting (STLF). The proposed solutions can be divided into two main groups, depending on the strategy followed. On the one hand, statistical methods try to estimate a regression function that fits the points registered in the historical load data (i.e. the consumption record). They are very effective at approximating linear curves but, since load forecasting is non-linear, statistical methods yield poorer results than their counterparts [2], [3]. On the other hand, Artificial Intelligence has produced many techniques, methods and models that deal with risk and uncertainty (the main aspects behind forecasting and prediction). The most popular, owing to their performance, are Support Vector Machines (SVM) and Neural Networks (see Section II for a more detailed description). Still, despite their accuracy, artificial intelligence methods present a number of drawbacks, such as difficult parametrisation, non-obvious selection of variables and over-fitting, and they require much historical data to learn the patterns inherent in it [3].

We focus here on building STLF (meaning non-residential buildings). This special branch presents different features. For instance, in normal country-wide STLF the non-linearity of the load is smoothed, since expected consumption that does not take place is compensated by unexpected consumption that does. Moreover, the consumption curve of a building tends to be stationary, seasonal and regular, coinciding with the times the building is used. Hence, there is no consumption at night (or it is negligible) and there is a notable gap between idle and active times. Further, many of these buildings are not yet fully automated: the HVAC is either manually controlled or switched on and off remotely. In either case it does not adapt to sudden weather changes, and this influence is already contained in the consumption data.
Another critical aspect is that there is usually scarce (if any) historical data on hourly load, and the load profile is sure to vary and evolve over time (just think of the equipment an office had 10 years ago compared to today's fully equipped, on-line ones). Finally, the method or algorithm chosen for this purpose must be simple to tailor to every single case (e.g. a school cannot be expected to employ a neural network specialist to control and periodically adapt the NN that predicts its load profile). In summary, an STLF method for non-residential buildings should satisfy the following premises:

• Be easily adaptable and not require any tedious trial-and-error customisation process.
• Work with scarce and evolving historical data.
• Be as accurate as possible.

There has been remarkable research on building STLF, especially backed by forecasting competitions such as [4] or [5]. Unfortunately, all the solutions developed have only been tested with the proposed datasets and are therefore liable to be over-fitted to them. In a previous work [6], we presented a number of well-known Artificial Intelligence (AI) and statistical methods for building STLF and evaluated them with two datasets to test their performance under diverse conditions. We showed that under these premises an autoregressive model outperformed more sophisticated methods such as Neural Networks, ARIMA or Bayesian Networks.

Against this background, we advance the state of the art in three main ways. First, we use a variable learning data window to assess the number of previous days of data that optimises the prediction. Second, we use different weight combinations to highlight the importance of those days. Third, we introduce a post-processing algorithm that studies the typical historical error and deliberately adds an error term to the prediction output in order to correct that historical error.

The remainder of the paper is organised as follows. Section II analyses related work in STLF and building STLF. Section III presents the algorithms to be tested and details the source and features of the datasets used. Section IV details the tests and discusses the results obtained. Finally, Section V concludes and outlines avenues of future work.

II. RELATED WORK

As already mentioned, there exists a very large literature on short-term load forecasting (see [2], [7], [8], [9], [3] for comprehensive surveys on STLF) but, comparatively, little on the same topic applied to buildings. In both scenarios, research presents two main branches. The first one comprises different types of statistical methods. These include univariate time series, in which the load is modelled according to historical data (e.g. multiplicative autoregressive models [10], dynamic linear [11] or non-linear [12] models, threshold autoregressive models [13], Kalman filtering [14], [15] and Gaussian Process priors [16]), and causal models, in which the load is modelled as a function of one or more exogenous factors (e.g. weather). In this latter group we can place ARMA models (also known as Box-Jenkins [17], [18]), ARMAX models [19], [20], optimization techniques [21], non-parametric regression [22], structural models [23] and diverse curve-fitting procedures [24]. In spite of the large number of alternatives, linear regressions [25], [26] have been the most popular choice and, more precisely, ARIMA has been the technique showing the most promising results [18].

Nevertheless, all these methods are basically suited to linear prediction, and the load series they try to approximate is known to be a non-linear function of an exogenous variable [27]; this is the main reason why they have failed to top artificial intelligence methods, as shown for instance in the comparisons presented in [2], [28], [29], [30], [31], [3]. To overcome this drawback, [32] proposed a mixed model combining an ARIMA model to deal with the linear part and a NN to deal with the non-linear one. Another main drawback of statistical methods in STLF is that they model the whole series without taking into account the type of day [33], [9], [34].

Consequently, the bulk of recent STLF research has concentrated on the second group, using several artificial intelligence methods to deal with the non-linearity of the historical load data. The techniques addressed include fuzzy logic [35], [36], [37], expert systems [38], [39], evolutionary algorithms [40], support-vector machines [41], [42], [31], [43], [44] and, especially, all kinds of neural networks [45], [12], [46]. The most promising techniques are NN and SVM, but both have to deal with a number of problems. First, both NN and SVM require much more historical data than any of the statistical methods [3]. The data set may also pose a problem to NNs: they fail when it presents random correlations between the inputs and the output, because conventional NNs will not set the coefficients for those junk inputs to zero. Irrelevant variables may thus blur the accuracy of the prediction (for instance, [47] uses Bayesian methods to alleviate this problem). Moreover, just selecting the most suitable set of variables for a NN demands a dissector's skill with Occam's razor, and so does parametrising SVMs: both rely on a tedious trial-and-error process to tune them properly (see, for instance, the discussions on clustering-based parameter setting in SVM [44] and on simulated annealing SVM [31], [43] for the same purpose). Therefore, tailoring one of these models to a certain scenario, as we wish to do, is an arduous task that only experts can be expected to complete. Finally, well-known issues that arise in load forecasting, such as over-fitting and data ageing [48], [49], remain open.
Yet, as discussed before, STLF in buildings addresses a different problem domain, and there have been a number of interesting initiatives dealing with the special features of this scenario. For instance, [33] tried to model the hourly energy use in commercial buildings with Fourier series. Their model performed worse on weekends because it used the whole data series for the modelling. [50] corrected this drawback by distinguishing day types but, still, they focused on hourly modelling rather than on load forecasting itself. Regarding artificial intelligence methods, [51] used an SVM to predict the load of a building complex, and [47] proposed a NN tuned by Automatic Relevance Determination in order to optimise the selected inputs. Moreover, [52] put forward a NN in which the input variables were selected by a version of Wald's test. In the same spirit, [53] used temperature data in a feedback NN, with a remarkable MAPE of 1.945% (for instance, [54] included inputs about orientation, insulation thickness and transparency ratio without improving that result). As mentioned above, NNs require much historical data (which, in our case, may not be available) and, further, a complicated configuration process that makes them hard to adapt to single small scenarios (say, the buildings of a school). Finally, all artificial intelligence methods spend most of their effort on modelling the non-linearity. As we have shown, in our problem scenario this can easily be avoided by using the work calendar. A similar concept has been applied in [55] (one model for each type of day) and [56] (one model for each hour of the day) to STLF, and in [50] to building STLF.

III. ADVANCED STLF IN NON-RESIDENTIAL BUILDINGS

In a previous work [6], we analysed the nature of our problem domain based on the validation of three hypotheses, all of them related to the non-linearity of the load data. The first hypothesis claimed that weather variables do not influence load consumption. The second held that the work calendar is a more effective and accurate classifier than any of the clustering techniques usually applied to this task. The third maintained that the work calendar provides enough information to resolve the non-linearity. We empirically validated all three hypotheses by means of statistical hypothesis tests.

The methodology presented in [6] comprises two steps. First, we classify the day whose load we want to predict according to its type in the work calendar; then, we fit the load curve of that day with a separate model for each hour. To this end, we use three types of day: weekdays, Saturdays and Sundays. In terms of regression, a widespread error is to predict the load across different types of days in chronological order (see for instance [57]): it is not accurate to compare the load on weekends to that of weekdays. Hence, predicting the load for 11 am on a Monday implies selecting data from previous days of a similar type, i.e. Friday, Thursday and Wednesday if the learning window is 3 days.

Here we present a number of new concepts that extend and improve our work in STLF [6]. First, we add the concept of a variable learning window for data collection and, second, we apply a coefficients model. In the experiments, we test learning windows of q ∈ {1, 3, 5, 10, 30, 50}, where q is the number of days the training comprises.

A. Models used for forecasting

1) Time Series model: We have chosen an Autoregressive Model (commonly used for modelling univariate time series) for every hour and day type:

$$ s_t^{h,d} = \sum_{i=1}^{q} \varphi_i^{h,d} \, s_{t-i}^{h,d}, $$

where $\varphi_i^{h,d}$ are the model parameters. In the adjusting step, we use the last q values of the same day type (e.g. with q = 3, for a Tuesday: the previous Monday, Friday and Thursday) and not the last q chronological values (e.g. for a Tuesday: the previous Monday, Sunday and Saturday). Moreover, we assign weights (model coefficients) to the days of the learning window, in order to give higher priority to the latest data over the oldest values, by polynomial or exponential methods. Polynomial methods produce the following parameters:

$$ \varphi_i = \frac{(q-i)^l}{\sum_{j=0}^{q} (q-j)^l}, $$

whereas the exponential method produces:

$$ \varphi_i = \frac{2^{(q-i)}}{\sum_{j=0}^{q} 2^{(q-j)}}, $$

where q is the size of the learning window and, in the polynomial case, l ∈ Z. We have used different values for the parameter l, namely l ∈ {0, 1, 2, 3, 4, 5, exp}. Note that l = 0 corresponds to the mean of the previous values and exp denotes the exponential method.

2) Polynomial model: The second model consists of a univariate polynomial that tries to (roughly) capture the load curve. It is defined as follows:

$$ l_d(x) = \sum_{i=0}^{d} \alpha_i x^i. $$

Here d is the degree of the polynomial. It is fitted to every single day and hour using the least squares technique. We have tested several degrees, namely d ∈ {4, 5, 6, 7, 8}.

3) Neural network: NNs are non-linear circuits whose perceptron structure (i.e. a network of simple information processors) adapts according to the external or internal information that flows through the network during the learning phase. Their output is a linear or non-linear function of the inputs and, therefore, they have been widely used for predicting non-linear data (as in STLF [2], [3], [53]). We have performed the tests using a single hidden layer composed of {10, 30, 50, 100} neurons with tanh activation function.

4) Support Vector Machines: An SVM constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression or other tasks. SVMs have been used for load forecasting in buildings [51]. In this case we have used a ν-SVR with a Radial Basis Function kernel and parameters ν = 0.9, ε = 10^{-2}, C ∈ {1, 10, 100} and γ ∈ {1, 10^{-1}, 10^{-2}}.

5) Fix error algorithm: Finally, we have tested a post-process on the algorithm output that simply adds a Gaussian random value with the same mean and standard deviation as the error measured in the learning window for this particular method, hour and day type. The aim of this strategy is to correct the typical historical error of the output.

IV. DISCUSSION

This section describes the results of the experiments we carried out to select the most suitable model for STLF in non-residential buildings.
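As a rough illustration of the weighted autoregressive model of Section III-A, the following Python sketch computes the polynomial and exponential coefficients and a one-step forecast for a single hour and day type. The function names `ar_weights` and `ar_forecast` are hypothetical. Because the printed index ranges are ambiguous, the sketch assumes that the lag index runs over i = 0, ..., q − 1 (i = 0 being the most recent same-type day) and that the weights are normalised over that same range, so that l = 0 reproduces the plain mean as stated in the text. It is not the Java implementation used in the experiments.

```python
from typing import List, Sequence, Union

Degree = Union[int, str]  # an integer l for the polynomial case, or "exp"


def ar_weights(q: int, l: Degree) -> List[float]:
    """Coefficients for the q most recent same-type days (index 0 = most recent).

    Assumption: lags are indexed 0..q-1 and normalised over that range.
    """
    if q < 1:
        raise ValueError("q must be at least 1")
    if l == "exp":
        raw = [2.0 ** (q - i) for i in range(q)]
    else:
        raw = [float(q - i) ** l for i in range(q)]
    total = sum(raw)
    return [w / total for w in raw]


def ar_forecast(history: Sequence[float], q: int, l: Degree) -> float:
    """Forecast the load at one hour from previous same-type days.

    `history` holds the loads at that hour on earlier days of the same
    day type, ordered from oldest to newest.
    """
    if len(history) < q:
        raise ValueError("history must contain at least q values")
    recent_first = list(history)[::-1][:q]
    weights = ar_weights(q, l)
    return sum(w * s for w, s in zip(weights, recent_first))
```

Under these assumptions, `ar_weights(3, "exp")` yields weights proportional to 8, 4 and 2, so the most recent day dominates, while `ar_weights(q, 0)` assigns 1/q to every day.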

Dataset     Week    Saturdays   Sundays   Total
donosti1    3.89    6.47        5.52      5.06
donosti2    7.16    8.66        9.52      8.05
ashrae      4.24    5.67        6.70      4.96
eunite      9.22    NA          8.68      9.20

TABLE I
MINIMUM EXPECTED MAPE FOR EVERY DATASET (%).

Fig. 1. Flowchart of our methodology.

A. Experimental Design

1) Algorithm Description: To perform the day-ahead prediction experiments, we followed the methodology described before. First, we selected the type of the day according to the work calendar. Then we fed the model with the previous q days (the learning window) of the same type and fitted the model to that data using the different methods (AR model, neural network, polynomial or support vector machine). Finally, we issued a prediction for the next 6 days, iterating this procedure over the whole dataset (see Fig. 1). We then compared the predicted result with the real consumption value and computed the Mean Absolute Percentage Error (MAPE). We selected this error to measure the performance of the models because it is unit-free; that is, it allows comparisons between forecasting errors expressed in different measurement units. Moreover, it is the error measure most widely used in forecasting [53]. It is calculated as follows:

$$ \mathrm{MAPE} := \frac{1}{\mathit{days}} \sum_{j=1}^{\mathit{days}} \left( \frac{1}{24} \sum_{i=1}^{24} \frac{|r_i^j - p_i^j|}{r_i^j} \right) \times 100, $$

where $p_i^j$ is the predicted load for hour i of day j, $r_i^j$ is the actual one, and days is the number of days in that particular dataset.

2) Datasets: We have recorded the energy consumption of the University of Deusto in Donostia-San Sebastián (Basque Country). We downloaded this data directly from the meter, which is installed directly at the transformer as mandated by Spanish law (54-1997), using the IEC 60870-5-102 standard protocol [58]. This building has a special feature: its heating system is not regulated according to the weather. From autumn to spring, it is manually turned on every day at approximately the same time and it works until the campus closes at night; therefore, meteorological conditions have no influence on electricity consumption at all (the season, on the contrary, does), or this influence is somehow dissolved (i.e. already represented) in the data. Furthermore, a new building was added to the campus in July 2009 (our first records date from March 2009). This makes forecasting more difficult because of the noise it introduces; nevertheless, the tested algorithms also demonstrate their ability to adapt to evolving data, as we will see.
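The MAPE defined in the Algorithm Description above can be sketched in Python as follows. The function name `mape` and the nested-list layout (one inner sequence of hourly values per day) are assumptions made for illustration; the sketch averages over however many hours each day contains, which coincides with the formula when every day has 24 values.

```python
from typing import Sequence


def mape(actual: Sequence[Sequence[float]],
         predicted: Sequence[Sequence[float]]) -> float:
    """Mean Absolute Percentage Error (%) over days of hourly loads.

    actual[j][i] and predicted[j][i] are the real and forecast loads for
    hour i of day j. A zero actual load raises ZeroDivisionError, since
    the percentage error is undefined in that case.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("need the same non-zero number of days in both inputs")
    daily_errors = []
    for r_day, p_day in zip(actual, predicted):
        if not r_day or len(r_day) != len(p_day):
            raise ValueError("each day needs matching, non-empty hourly values")
        # Inner average: mean relative error over the hours of one day.
        hourly = [abs(r - p) / r for r, p in zip(r_day, p_day)]
        daily_errors.append(sum(hourly) / len(hourly))
    # Outer average over days, expressed as a percentage.
    return 100.0 * sum(daily_errors) / len(daily_errors)
```

For instance, a single day with actual loads 100 and 200 and predictions 110 and 180 has a relative error of 0.1 at both hours, so the function returns a MAPE of 10%.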

We have therefore split the full dataset into two datasets (donosti1 and donosti2) because of the marked increase in consumed energy after the construction of the new building. Fig. 2 and Fig. 3 show the average daily load curve for donosti1 and donosti2. As shown in these figures, the curve presents quite a regular profile on working days, with consumption from 7am to 10am (opening hours are from 8am to 9pm). On Saturdays it shows a peak at noon, and on Sundays it is almost flat. The first dataset (donosti1), with a length of 6 months (March to September 2009), is more regular and homogeneous. The second (donosti2), with a length of 12 months (September 2009 to September 2010), shows quite an irregular profile with frequent noisy values due to the construction of the new buildings.

Moreover, we ran our experiments with other datasets; specifically, we used the data provided in the Eunite [5] and Ashrae [4] competitions. The Ashrae competition dataset (ashrae) comes from an unknown building and has a length of only 6 months (September 1989 to February 1990). It presents a profile similar to that of the University of Deusto datasets (donosti1 and donosti2). In contrast, the Eunite competition dataset (eunite) has a length of 24 months (January 1997 to December 1998) and covers the whole Eastern Slovakian electricity demand. It presents only two patterns, one for weekdays and one for holidays. We extracted the work calendar from the load data. Fig. 4 and Fig. 5 portray their profiles.

Note that the results obtained in both competitions are not comparable to ours. The Eunite competition asks participants to forecast the maximum daily electrical load over one month, while the Ashrae competition asks them to predict certain gaps in the dataset and uses a custom error function. Finally, we have estimated the best MAPE that can be expected, as summarised in Table I (see the appendix for a detailed explanation of the procedure).

B. Experimental Results

We carried out the experiments on a Core 2 Duo E7400 CPU with 2GB of RAM running an up-to-date Gentoo Linux. We used a home-made Java program that links to the following libraries to implement the different models: libsvm-3.1 (SVM model), neuroph-2.5b (NN model) and commons-math-2.1 (polynomial model). Table III shows the best MAPE results (using the parameters from Table II) for day-ahead forecasting for all models and datasets. The AR model performs best, with SVM very close behind; the main difference between the two lies in speed and tuning effort.

Fig. 2. Average daily load for dataset donosti1 (one panel per day of the week, Monday to Sunday).

Fig. 3. Average daily load for dataset donosti2 (one panel per day of the week, Monday to Sunday).

Fig. 4. Average daily load for dataset ashrae (one panel per day of the week, Monday to Sunday).

Fig. 5. Average daily load for dataset eunite (one panel per day of the week, Monday to Sunday).

Models                 Donosti1   Donosti2   Ashrae   Eunite
AR (l)                 exp        exp        exp      1
SVM (C, γ)             10, 1      10, 1      10, 1    10, 1
Polynomial (d)         8          6          6        8
Neural Net (hidden)    10         10         10       10

TABLE II
BEST PARAMETERS FOR DAY-AHEAD FORECASTING.

Models       Donosti1   Donosti2   Ashrae   Eunite
AR           7.34       13.78      5.74     6.69
SVM          7.92       14.25      5.88     7.34
Polynomial   11.91      19.78      6.94     7.36
Neural Net   13.46      17.64      6.63     7.78

TABLE III
MAPE RESULTS IN DAY-AHEAD FORECASTING (%).

The AR model is 200 times faster than SVM and has far fewer parameters to tune. Table IV shows the best MAPE results for day-ahead forecasting for all models and datasets using the fix error algorithm. Comparing it with Table III, in some cases this strategy gives better results and in others worse.

Models       Donosti1   Donosti2   Ashrae   Eunite
AR           7.51       13.76      5.86     6.66
SVM          8.53       14.67      6.06     7.45
Polynomial   11.19      18.02      6.58     7.30
Neural Net   12.23      16.51      6.67     7.94

TABLE IV
MAPE RESULTS IN DAY-AHEAD FORECASTING USING THE FIX POST PROCESS (%).

Moreover, we tested all algorithms when predicting more than a day ahead in order to evaluate their degradation. Tables V to VIII show the results of the models for load forecasting up to 6 days ahead. The forecasts do not degrade much even as we widen the prediction horizon.

Models   1-day   2-days   3-days   4-days   5-days   6-days
Poly     11.91   12.66    13.41    14.18    14.89    15.75
AR       7.35    8.29     8.99     9.59     10.25    10.97
NN       13.46   14.38    15.12    15.74    16.42    17.11
SVM      7.92    8.95     9.70     10.17    10.80    11.58

TABLE V
MAPE RESULTS donosti1 FORECASTING (%).

Models   1-day   2-days   3-days   4-days   5-days   6-days
Poly     19.73   20.38    20.92    21.53    22.23    23.01
AR       13.87   14.74    15.34    15.95    16.51    17.31
NN       17.64   18.69    19.46    20.11    20.32    20.78
SVM      14.25   15.35    16.02    16.64    17.14    17.78

TABLE VI
MAPE RESULTS donosti2 FORECASTING (%).

To our knowledge, the best record in short-term load forecasting on the ashrae dataset is a MAPE of 1.53% [59], obtained with a neural network. After the competition, [53] achieved 1.945% using a neural network, about four percentage points below the errors of our time series and support vector machine models (5.74% and 5.88% in Table III). Still, neural networks offer a worse trade-off between design difficulty, parametrisation, execution time and performance than time series. Moreover, the best result in the Eunite competition [41] achieves a MAPE of above 2% in predicting the maximum load of every day in a month using SVM. Note that the problem we try to solve is much more error-prone, as we are predicting the load of the entire day. Even under that condition, our time series model achieves a creditable MAPE of 6.69%. Furthermore, we presume that the exact winning parameter configurations of [59], [41] are dataset-dependent and, hence, a new dataset would require going through the tedious configuration process again. Neither of them validated their model by testing its performance with other datasets. Finally, one of the datasets we used for our tests was deliberately a very noisy one (donosti2): within only one year it includes the construction of a new building, almost one month of vacation and an increase in consumption of about 15% due to the new building. Therefore, the time series model is likely to perform better under different evolving conditions.

In addition, the best results are obtained when the learning window q is 5. Smaller and larger values worsen the results because, in the first case, the data is not properly smoothed and, in the second case, the data is very distant and might be unrelated. The results using exponential coefficients with a learning window greater than q = 5 are not relevant because the final coefficients are practically 0. In some preliminary tests we observed an improvement in the worst daily error when using the fix post-process but, as can be seen in Tables III and IV, this post-process has an impact on the MAPE. This issue is worth further research, as electric companies are more interested in a forecast that reduces the worst daily error than in the MAPE.
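The fix post-process discussed above can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the function name `fix_error` is hypothetical, and the sketch assumes that errors are recorded as actual minus predicted, so that adding a draw centred on their mean shifts the forecast towards the observed values.

```python
import random
import statistics
from typing import Optional, Sequence


def fix_error(prediction: float,
              past_errors: Sequence[float],
              rng: Optional[random.Random] = None) -> float:
    """Add Gaussian noise matching the historical error to a forecast.

    `past_errors` are the errors (assumed to be actual - predicted) seen
    in the learning window for the same method, hour and day type.
    """
    rng = rng or random.Random()
    mu = statistics.fmean(past_errors)       # typical historical bias
    sigma = statistics.pstdev(past_errors)   # spread of that bias
    return prediction + rng.gauss(mu, sigma)
```

Because the correction is random, two runs over the same data generally give different forecasts, which is consistent with the mixed effect on the MAPE seen when comparing Tables III and IV. Passing a seeded `random.Random` instance makes the sketch reproducible.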

Models   1-day   2-days   3-days   4-days   5-days   6-days
Poly     6.94    7.65     8.49     9.29     10.12    11.02
AR       5.73    6.50     7.24     7.88     8.62     9.37
NN       6.63    7.32     8.01     8.80     9.55     10.18
SVM      5.88    6.52     7.21     7.99     8.75     9.41

TABLE VII
MAPE RESULTS ashrae FORECASTING (%).

Models   1-day   2-days   3-days   4-days   5-days   6-days
Poly     7.42    7.69     7.77     7.72     7.87     8.32
AR       6.69    6.92     7.04     7.07     7.15     7.6
NN       7.78    8.48     8.70     8.49     7.61     6.85
SVM      7.34    8.08     8.34     8.15     7.25     6.45

TABLE VIII
MAPE RESULTS eunite FORECASTING (%).

V. CONCLUSION

Short-term load forecasting in buildings is a key aspect if we aim to increase the efficiency of energy consumption in buildings. This discipline presents special requirements compared to classical STLF, since meteorological variables have much less influence on the variations of the load and forecasting models should be simple and free of complicated configuration processes. In a previous work, we presented and evaluated a number of candidate algorithms for building STLF (namely an autoregressive model, a polynomial model, a neural network and a support-vector machine). Here we introduce three new concepts to further improve the performance of these models under the aforementioned requirements. First, we use variable data windows for the days taken into account in the calculations of the algorithms. Second, we have used several ways of weighting those days. Third, we have used random Gaussian noise with the same mean as the typical historical error to correct the prediction bias of the algorithms. We have thoroughly tested all algorithms with four different datasets and concluded that, as in our previous work, the AR model suffices and outperforms the rest, despite being computationally much simpler and faster and not requiring any trial-and-error set-up process or additional data (e.g. temperature). Further work will concentrate on the design of meta-algorithms that, by combining two or more of the presented algorithms, are able to top their predictions.

APPENDIX

In this section we present how we have computed the estimate of the minimum MAPE. Suppose that the load curve l(h) of a specific day type has the following expression:

$$ l(h) := f(h) + \xi_h, $$

where f is an unknown function and $\xi_h$ is a Gaussian random variable with mean 0 and variance $\sigma_h^2$. Any method that successfully forecasts the load curve l will have learned f(h). We may estimate the minimum expected MAPE for that case as follows:

$$ \mathit{min} := E\left[ \frac{100}{24} \sum_{h=1}^{24} \frac{|f(h) + \xi_h - f(h)|}{f(h) + \xi_h} \right] = \frac{100}{24} \sum_{h=1}^{24} E\left[ \frac{|\xi_h|}{f(h) + \xi_h} \right]. \qquad (1) $$

Up to this point we do not have any evidence on how to compute the exact value of this expectation. Note that this would be the best theoretical error we may achieve. Our next steps aim at giving a rough estimate of Equation (1). Suppose the following bound applies:

$$ f(h) + \xi_h < \max(l). \qquad (2) $$

Using the bound in Equation (2) in Equation (1) leads to:

$$ \mathit{min} \geq \frac{100}{24} \frac{1}{\max(l)} \sum_{h=1}^{24} E[|\xi_h|]. $$

As $E[|\xi_h|] = \sqrt{\frac{2}{\pi}}\, \sigma_h$ (see [60] for example), we have that:

$$ \mathit{min} \geq \frac{100}{24} \sqrt{\frac{2}{\pi}} \frac{1}{\max(l)} \sum_{h=1}^{24} \sigma_h. $$
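The final bound can be evaluated numerically with a short Python sketch. The function name `min_mape_bound` and the input layout are hypothetical, and the sketch makes two assumptions that are not part of the derivation itself: $\sigma_h$ is estimated by the standard deviation of the observed loads at hour h across days of one day type, and $\max(l)$ is taken as the largest observed load.

```python
import math
import statistics
from typing import Sequence


def min_mape_bound(days: Sequence[Sequence[float]]) -> float:
    """Lower bound (%) on the expected MAPE for one day type.

    `days` holds one sequence of hourly loads per day; every day must
    have the same number of hours (24 in the paper).
    """
    hours = len(days[0])
    if any(len(day) != hours for day in days):
        raise ValueError("all days must have the same number of hours")
    peak = max(max(day) for day in days)  # stands in for max(l)
    # Assumed estimator of sigma_h: spread of the load at hour h across days.
    sigmas = [statistics.pstdev([day[h] for day in days]) for h in range(hours)]
    return 100.0 / hours * math.sqrt(2.0 / math.pi) * sum(sigmas) / peak
```

A perfectly repeatable load profile gives a bound of zero, while noisier hours raise it in proportion to their standard deviation relative to the peak load.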

We may then estimate $\sigma_h$, for instance, by $\mathrm{Var}(l(h))$.

REFERENCES

[1] U.S. Department of Energy, “Grid 2030: A national vision for electricity's second 100 years,” 2003.
[2] H. Alfares and M. Nazeeruddin, “Electric load forecasting: literature survey and classification of methods,” International Journal of Systems Science, vol. 33, no. 1, pp. 23–34, 2002.
[3] V. Hinojosa and A. Hoese, “Short-term load forecasting using fuzzy inductive reasoning and evolutionary algorithms,” IEEE Transactions on Power Systems, vol. 25, no. 1, pp. 565–574, 2010.
[4] J. Kreider and J. Haberl, “Predicting hourly building energy usage: The great energy predictor shootout – overview and discussion of results,” ASHRAE Trans. 100, part, no. 2, p. 1104, 1994.
[5] E. C. Comite, “Electricity load forecast using intelligent adaptive technology,” http://neuron.tuke.sk/competition/instructions.php, 2001.
[6] Y. Penya, C. Borges, D. Agote, and I. Fernández, “Short-term load forecasting in air-conditioned non-residential buildings,” in Proceedings of the 20th IEEE International Symposium on Industrial Electronics (ISIE), 2011, in press.
[7] S. Tzafestas and E. Tzafestas, “Computational intelligence techniques for short-term electric load forecasting,” Journal of Intelligent and Robotic Systems, vol. 31, pp. 7–68, 2001.
[8] E. Feinberg and D. Genethliou, “Load forecasting,” in Applied Mathematics for Power Systems, Chapter 12, 2005, pp. 269–285.
[9] E. Kyriakides and M. Polycarpou, “Short term electric load forecasting: a tutorial,” Studies in Computational Intelligence (SCI), vol. 35, pp. 391–418, 2009.
[10] G. Mbamalu and M. El-Hawary, “Load forecasting via sub optimal seasonal auto regressive models and iteratively reweighted least squares estimation,” IEEE Transactions on Power Systems, vol. 8, no. 1, pp. 343–348, 1993.
[11] A. Douglas, A. Breipohl, A. Lee, and R. F.N. Adapa, “The impact of temperature forecast uncertainty on Bayesian load forecasting,” IEEE Transactions on Power Systems, vol. 13, no. 4, pp. 1507–1513, 1998.
[12] R. Sadownik and E. P. Barbosa, “Short-term forecasting of industrial electricity consumption in Brazil,” International Journal of Forecasting, vol. 18, no. 3, pp. 215–224, 1999.

[13] S. Huang, “Short-term load forecasting using threshold auto regressive models,” IEE Proceedings on Generation, Transmission and Distribution, vol. 144, no. 5, pp. 477–481, 1997.
[14] D. Infield and D. Hill, “Optimal smoothing for trend removal in short term electricity demand forecasting,” IEEE Transactions on Power Systems, vol. 13, no. 3, pp. 1115–1120, 1998.
[15] S. Sargunaraj, D. Sen-Gupta, and S. Devi, “Short-term load forecasting for demand side management,” IEE Proceedings on Generation, Transmission and Distribution, vol. 144, no. 1, pp. 68–74, 1997.
[16] D. Leith, M. Heidl, and R. Ringwood, “Gaussian process prior models for electrical load forecasting,” in Proceedings of the International Conference on Probabilistic Methods Applied to Power Systems, 2004, pp. 112–117.
[17] A. Douglas, A. Breipohl, A. Lee, and R. F.N. Adapa, “Practical experiences with modeling and forecast,” Time Series, 1979.
[18] M. Hagan and S. Behr, “The time series approach to short term load forecasting,” IEEE Transactions on Power Systems, vol. 2, no. 3, pp. 785–791, 1987.
[19] H. Yang and C. Huang, “A new short-term load forecasting approach using self-organizing fuzzy ARMAX models,” IEEE Transactions on Power Systems, vol. 13, no. 1, pp. 217–225, 1998.
[20] H. Yang, C. Huang, and C. Huang, “Identification of ARMAX model for short term load forecasting: An evolutionary programming approach,” IEEE Transactions on Power Systems, vol. 11, no. 1, pp. 403–408, 1996.
[21] Z. Yu, “A temperature match based optimization method for daily load prediction considering DLC effect,” IEEE Transactions on Power Systems, vol. 11, no. 2, pp. 728–733, 1996.
[22] W. Charytoniuk, M. Chen, and P. Van-Olinda, “Non parametric regression based short-term load forecasting,” IEEE Transactions on Power Systems, vol. 13, no. 3, pp. 725–730, 1998.
[23] A. Harvey and S. Koopman, “Forecasting hourly electricity demand using time-varying splines,” Journal of American Statistics Assoc., vol. 88, no. 424, pp. 1228–1236, 1993.
[24] J. Taylor and S. Majithia, “Using combined forecast switch changing weights for electricity demand profiling,” Journal Operations Research Society, vol. 51, no. 1, pp. 72–82, 2000.
[25] R. Engle, C. Mustafa, and J. Rice, “Modeling peak electricity demand,” International Journal of Forecasting, vol. 11, pp. 241–251, 1992.
[26] S. Soliman, S. Persaud, K. El-Nagar, and M. E. El-Hawary, “Application of least absolute value parameter estimation based on linear programming to short-term load forecasting,” Electrical Power and Energy Systems, vol. 19, no. 3, pp. 209–216, 1997.
[27] A. Jain and B. Satish, “Clustering based short term load forecasting using artificial neural network,” in Proceedings of the IEEE PES Power Systems Conference Exposition (PSCE), 2009, pp. 1–7.
[28] K. Liu, S. Subbarayan, R. Shoults, M. Manry, C. Kwan, F. Lewis, and J. Naccarino, “Comparison of very short-term load forecasting techniques,” IEEE Transactions on Power Systems, vol. 11, no. 2, pp. 877–882, 1996.
[29] H. Wu and C. Lu, “Comparison of very short-term load forecasting techniques,” IEE Proceedings on Generation, Transmission and Distribution, vol. 146, no. 5, pp. 477–482, 1999.
[30] D. Srinivasan, C. Chang, and A. Liew, “Demand forecasting using fuzzy neural computation, with special emphasis on weekend and public holiday forecasting,” IEEE Transactions on Power Systems, vol. 10, no. 4, pp. 343–348, 1992.
[31] P. Pai and W. Hong, “Support vector machines with simulated annealing algorithms in electricity load forecasting,” Energy Conversion and Management, vol. 46, no. 17, pp. 2669–2688, 2005.
[32] J. Lu, D. Niu, and Z. Jia, “A study of short-term load forecasting based on ARIMA-ANN,” in Proceedings of the International Conference on Machine Learning and Cybernetics, 2004, pp. 3183–3187.
[33] A. Dhar, T. A. Reddy, and D. E. Claridge, “Modeling hourly energy use in commercial buildings with Fourier series functional forms,” Journal of Solar Energy Engineering, vol. 120, no. 3, pp. 217–223, 1998.
[34] H. Hahn, S. Meyer-Nieberg, and S. Pickl, “Electric load forecasting methods: tools for decision making,” European Journal of Operational Research, vol. 199, pp. 902–907, 2009.
[35] H. Mori and H. Kobayashi, “Optimal fuzzy inference for short-term load forecasting,” IEEE Transactions on Power Systems, vol. 11, no. 1, pp. 390–396, 1996.
[36] Z. Yun, Z. Quan, S. Caixin, L. Shaolan, L. Yuming, and S. Yang, “RBF neural network and ANFIS-based short-term load forecasting approach in real-time price environment,” IEEE Transactions on Power Systems, vol. 23, no. 3, pp. 853–858, 2008.
[37] H. Mori and H. Kobayashi, “Application of a fuzzy neural network combined with a chaos genetic algorithm and simulated annealing to short-term load forecasting,” IEEE Transactions on Evolutionary Computing, vol. 10, no. 3, pp. 330–340, 2006.
[38] K. Ho, Y. Hsu, C. Chen, T. Lee, C. Liang, T. Lai, and K. Chen, “Short-term load forecasting of Taiwan power system using a knowledge-based expert system,” IEEE Transactions on Power Systems, vol. 5, no. 4, pp. 1214–1221, 1990.
[39] S. Rahman and O. Hazim, “A generalized knowledge-based short-term load-forecasting technique,” IEEE Transactions on Power Systems, vol. 8, no. 2, pp. 508–514, 1993.
[40] J. Sharp, “Comparative models for electrical load forecasting,” International Journal of Forecasting, vol. 2, no. 2, pp. 241–242, 1986.
[41] C. Bo-Juen, C. Ming-Wei, and L. Chih-Jen, “Load forecasting using support vector machines: a study on EUNITE competition 2001,” IEEE Transactions on Power Systems, vol. 19, no. 4, pp. 1821–1830, 2004.
[42] S. Fan and L. Chen, “Short-term load forecasting based on an adaptive hybrid method,” IEEE Transactions on Power Systems, vol. 21, no. 1, pp. 392–401, 2006.
[43] S. Lin, Z. Lee, S. Chen, and T. Tseng, “Parameter determination of support vector machine and feature selection using simulated annealing approach,” Applied Soft Computing, vol. 8, no. 4, pp. 1505–1512, 2008.
[44] A. Jain and B. Satish, “Clustering based short term load forecasting using support vector machines,” in Proceedings of the IEEE Bucharest PowerTech, 2009, pp. 1–8.
[45] S. E. Papadakis, J. B. Theocharis, S. J. Kiartzis, and A. G. Bakirtzis, “A novel approach to short-term load forecasting using fuzzy neural networks,” IEEE Transactions on Power Systems, vol. 13, no. 2, pp. 480–492, 1998.
[46] J. Park, Y. Park, and K. Lee, “Composite modeling for adaptive short-term load forecasting,” IEEE Transactions on Power Systems, vol. 6, no. 2, pp. 450–457, 1991.
[47] D. J. C. MacKay, “Bayesian non-linear modelling for the prediction competition,” in ASHRAE Transactions, V.100, Pt.2. Atlanta Georgia: ASHRAE, 1994, pp. 1053–1062.
[48] G. Webb and M. Kuzmycz, “Evaluation of data aging: A technique for discounting old data during student modeling,” in Proceedings of the Fourth International Conference on Intelligent Tutoring System. Ablex, 1998, pp. 384–393.

[49] J. Greer, J. Zapata-Rivera, C. Ong-scutchings, and J. Cooke, “Visualization of Bayesian learner models,” in Proceedings of the Open, Interactive, and other Overt Approaches to Learner Modelling Workshop at the 9th International Conference on Artificial Intelligence in Education, 1999, pp. 6–10.
[50] A. Dhar, T. A. Reddy, and D. E. Claridge, “Generalization of the Fourier series approach to model hourly energy use in commercial buildings,” Journal of Solar Energy Engineering, vol. 121, no. 1, pp. 54–62, 1999.
[51] B. Dong, C. Cao, and S. Lee, “Applying support vector machines to predict building energy consumption in tropical region,” Energy and Buildings, vol. 37, no. 5, pp. 545–553, 2005.
[52] R. H. Dodier and G. P. Henze, “Statistical analysis of neural networks as applied to building energy prediction,” Journal of Solar Energy Engineering, vol. 126, no. 1, pp. 592–600, 2004.
[53] P. González and J. Zamarreño, “Prediction of hourly energy consumption in buildings based on a feedback artificial neural network,” Energy and Buildings, vol. 37, no. 6, pp. 595–601, 2005.
[54] B. Ekici and U. Aksoy, “Prediction of building energy consumption by using artificial neural networks,” Advances in Engineering Software, vol. 40, no. 5, pp. 356–362, 2009.
[55] G. Darbellay and M. Slama, “Forecasting the short-term demand for electricity: do neural networks stand a better chance?” International Journal of Forecasting, vol. 16, pp. 71–83, 2000.
[56] L. Soares and L. Souza, “Forecasting electricity demand using generalized long memory,” International Journal of Forecasting, vol. 22, no. 1, pp. 17–28, 2006.
[57] C. García-Ascanio and C. Maté, “Electric power demand forecasting using interval time series: A comparison between VAR and iMLP,” Energy Policy, vol. 38, no. 2, pp. 715–725, February 2010.
[58] IEC60870: Telecontrol equipment and systems - Part 5: Transmission protocols - Section 102: Companion standard for the transmission of integrated totals in electric power systems, 1st ed., IEC, 1996.
[59] M. Daneshdoost, M. Lotfalian, G. Bumroonggit, and J. Ngoy, “Neural network with fuzzy set-based classification for short-term load forecasting,” IEEE Transactions on Power Systems, vol. 13, no. 4, pp. 1386–1391, 1998.
[60] Wikipedia, “Normal distribution — Wikipedia, the free encyclopedia,” 2011, [Online; accessed 19-April-2011]. [Online]. Available: http://en.wikipedia.org/w/index.php?title=Normal_distribution&oldid=424417039