University of Economics in Prague

Faculty of Finance and Accounting

Department of Banking and Insurance

MASTER’S THESIS

Nord Pool day-ahead electricity price modeling and forecasting

Author: Bc. Daniel Pilný

Supervisor: Ing. Milan Fičura, Ph.D.

Academic Year: 2018/2019

Declaration of Authorship

The author hereby declares that he compiled this thesis independently, using only the listed resources and literature, and the thesis has not been used to obtain a different or the same degree.

The author grants to the University of Economics in Prague permission to reproduce and to distribute copies of this thesis document in whole or in part.

Prague, August 19, 2019

Signature

Acknowledgments

This thesis is dedicated to my family and my closest friends, who continuously supported me throughout my studies and especially during the crucial time of writing this thesis. Special thanks to my supervisor Ing. Milan Fičura, Ph.D., for his patience and advice.

Abstract

The main focus of this thesis is a comparison of models for Nord Pool day-ahead electricity price prediction. The classical econometric models Reg-(S)ARIMA(-GARCH) and the machine learning models SVR and LSTM are used to model and forecast the electricity price in the bidding areas NO2 (Norway) and DK1 (Denmark). The thesis contains an analysis of the electricity market in both countries and, based on the local specifics, explores the potential contribution of exogenous variables such as air temperature, precipitation, wind speed, fossil fuel prices, hydro reservoir levels and electricity consumption. The models are developed on data from 2014 - 2016 and tested on data from 2017 - 2018. Forecast accuracy is evaluated by Root Mean Square Error (RMSE), and the statistical significance of the difference between two forecasts is determined with the Diebold-Mariano test. The models are compared to a Naive forecast and with each other. Even though electricity prices show very specific behaviour, the methods used are also relevant for the prediction of other financial time series.

JEL Classification C40, C51, C52, L94, Q47

Keywords electricity, time series, forecasting, machine learning, ARIMA, LSTM, SVR

Author's e-mail xpild00@vse.cz

Supervisor's e-mail milan.ficura@vse.cz

Contents

List of Tables
List of Figures

1 Introduction

2 Nordic electricity market
2.1 Nord Pool Power Exchange
2.2 Day-ahead market
2.3 Norwegian and Danish electricity markets
2.4 NASDAQ OMX Oslo ASA

3 Literature review

4 Models theory
4.1 Naive model
4.2 Autoregressive Integrated Moving Average models (ARIMA)
4.3 Support Vector Regression (SVR)
4.4 Long Short-Term Memory (LSTM)

5 Methodology
5.1 Hypotheses
5.2 Target definition
5.3 Stationarity tests
5.4 Seasonality and spikes identification
5.5 Raw prices vs. Logarithmic prices
5.6 Feature scaling (Normalization/Standardization)
5.7 Feature-selection & hyper-parameters tuning
5.8 Cross-validation
5.9 Diagnostic tests

6 Data
6.1 Day-ahead electricity price
6.2 Electricity consumption
6.3 Weather
6.4 Hydro reservoirs
6.5 Fossil fuels

7 Results
7.1 Samples
7.2 Seasonality
7.3 Stationarity
7.4 Exogenous variables selection
7.5 Final hyperparameters
7.6 Performance evaluation

8 Conclusion

Bibliography

A Autoregressive models specification and diagnostics

List of Tables

2.1 Nord Pool market structure (2018)
2.2 Norwegian electricity mix 2018
2.3 Danish electricity mix 2018
2.4 Production per capita in 2018
2.5 Overview of futures and system spot prices
5.1 Average of hourly prices
5.2 Overview of number of models per country
5.3 Cross-validation samples
6.1 Hydro reservoirs capacity
7.1 Periodogram
7.2 Average price by day of the week, train sample 2013 - 2014
7.3 Average price by month of the year, train sample 2013 - 2014
7.4 Percentiles analysis, train sample 2013 - 2014
7.5 Spikes analysis, train sample 2013 - 2014 (peak)
7.6 Weekly seasonality (NO2)
7.7 Weekly seasonality (DK1)
7.8 Annual seasonality (NO2)
7.9 Annual seasonality (DK1)
7.10 Fossil fuel prices stationarity
7.11 Weather variables stationarity
7.12 Elspot prices and consumption stationarity
7.13 Reg-SARIMA-GARCH hyperparameters
7.14 SVM hyperparameters
7.15 LSTM hyperparameters
7.16 Denmark (DK1): Test sample RMSE
7.17 Denmark (DK1): Model RMSE vs. Seasonal Naive RMSE
7.18 Denmark (DK1): Model RMSE vs. Regression RMSE
7.19 Denmark (DK1): Model A RMSE vs. Model B RMSE
7.20 Norway (NO2): Test sample RMSE
7.21 Norway (NO2): Model RMSE vs. Seasonal Naive RMSE
7.22 Norway (NO2): Model RMSE vs. Regression RMSE
7.23 Norway (NO2): Model A RMSE vs. Model B RMSE
A.1 Denmark (DK1): Reg-ARIMA
A.2 Norway (NO2): Reg-SARIMA

List of Figures

2.1 Nord Pool hourly day-ahead system price
2.2 Nord Pool bidding areas
2.3 Electricity supply and demand
2.4 Energy sources by year
2.5 Norwegian and Danish electricity production by area
4.1 The soft margin loss setting for a linear SVM
4.2 LSTM architecture
4.3 Sigmoid and Tanh
5.1 Target definition
5.2 Cross-validation samples graphics
5.3 First phase: training
5.4 Second phase: development
6.1 Day-ahead hourly electricity prices in NO2 and DK1
6.2 Electricity consumption
6.3 Air temperature
6.4 Wind speed
6.5 Hydro reservoir levels
6.6 Fossil fuels prices
7.1 Graphical overview of NO2 samples
7.2 Graphical overview of DK1 samples
7.3 Periodogram
7.4 Weekly seasonality ACF
A.1 Norway (NO2): Reg-SARIMA (off-peak 1, peak, off-peak 2, day)
A.2 Denmark (DK1): Reg-ARIMA (off-peak 1, peak, off-peak 2, day)
A.3 Denmark (DK1): Reg-ARIMA-GARCH (Day)
A.4 Denmark (DK1): Reg-ARIMA-GARCH (Off-peak 2)
A.5 Denmark (DK1): Reg-ARIMA-GARCH (Peak)

Chapter 1

Introduction

Deregulation of the Nordic energy sector in the 1990s transformed a monopolistic regime into a liberalized market. Since then, the electricity price has no longer been subject to state control; instead, it has been determined by supply and demand on an organized power market. At the time, the individual Nordic countries joined together to create a more efficient grid in which power would be shared between countries, providing a secure power supply to meet relatively inelastic power demand. Over the years, the grid expanded to the Baltic countries and further into Europe. Today, trade in Nordic electricity is concentrated at the Nord Pool power exchange.

With its unique features, electricity is unlike any other commodity. First of all, electricity is non-storable: it must be available on demand and used when it is generated. As such, the no-arbitrage condition fails in the electricity market.

Electricity prices exhibit seasonality at daily, weekly and annual levels, because demand depends on the weather and on the intensity of work both within the week (working days vs. weekends and holidays) and within the day (peak vs. off-peak hours). Prices are highly volatile, with abrupt spikes, positive skewness, excess kurtosis and conditional heteroscedasticity (Karakatsani & Bunn, 2008a). Negative prices also occur as a result of a high volume of inflexible power generation.

Free competition brought both benefits and drawbacks to energy market participants. Electricity price volatility can be up to two orders of magnitude higher than the volatility of any other commodity or financial asset (Weron, 2014), which exposes market participants to risks as severe as bankruptcy. Therefore, a reliable prediction of the electricity price is necessary for building effective strategies and managing risks. Many different methods have been used for this purpose over the years.

The objective of this thesis is to compare several models for Nord Pool day-ahead electricity price prediction, using electricity prices in the bidding areas NO2 (Norway) and DK1 (Denmark). The prices we work with are wholesale electricity prices; retail electricity prices are not the subject of this thesis. As the price in each geographical area has its specifics, especially due to a different electricity production mix, different methods have been observed to have different degrees of success across countries. We consider three modeling approaches to predict the price and compare their performance in terms of Root Mean Square Error (RMSE). The Autoregressive Integrated Moving Average (ARIMA) approach represents a classical parametric modeling method widely used in financial time series analysis. We also consider its extended form, Reg-SARIMA-GARCH. It is contrasted with two machine learning methods: Support Vector Regression (SVR) and Long Short-Term Memory (LSTM). The price of the previous day (Naive approach) and the price of the same weekday in the previous week (Seasonal Naive approach) are considered as benchmark forecasts, together with Multiple Linear Regression; all other methods should be superior to these forecasts.

Data preparation as well as Reg-SARIMA-GARCH and SVR estimation were performed in R; the LSTM was trained in Python, as neural networks are better implemented there.

The thesis is structured as follows. Chapter 2 describes the Nord Pool power exchange and explains the day-ahead electricity price formation and trading conventions. The chapter also contains an analysis and comparison of the electricity markets in Norway and Denmark, which are the primary focus of this thesis. Chapter 3 summarizes the most recent and most relevant research on day-ahead electricity price modeling and forecasting, especially with regard to Nord Pool data. Chapter 4 covers the selected modeling methods (Reg-SARIMA-GARCH, SVR, LSTM), explaining the underlying logic of each approach. Chapter 5 explains the methodology used for treating the data, splitting the samples and cross-validating the models. Chapter 6 contains a description and analysis of the data considered for modeling. Chapter 7 presents the results of the modeling itself, comparing the accuracy of the model forecasts based on the RMSE of the test sample. The Diebold-Mariano test is used to test whether the difference between the RMSE values is statistically significant. Chapter 8 is dedicated to the conclusion and final remarks that should help in further research on the topic.

Chapter 2

Nordic electricity market

2.1 Nord Pool Power Exchange

With 524 terawatt-hours (TWh) traded in 2018, Nord Pool is the largest marketplace for trading power in Europe. Over 380 companies from 20 different countries trade power at this exchange, posting more than 2 000 orders each day. The exchange runs two leading markets:

• Day-ahead market (Elspot),

• Intraday market (Elbas).

The day-ahead market (Elspot) is Nord Pool's most important market. In 2018, 516 TWh (98%) was traded in the day-ahead market, of which 396 TWh (76%) in the Nordic and Baltic market and 120 TWh (23%) in the UK market.

The intraday market (Elbas) forms only approximately 2% of the total volume traded at Nord Pool. It is used only to balance demand and supply between the announcement of the day-ahead prices and physical delivery.

Table 2.1: Nord Pool market structure (2018)

| Market | Volume (TWh) | Volume (rel., %) |
|---|---|---|
| Nordic and Baltic day-ahead market | 396 | 76 |
| UK day-ahead market | 120 | 23 |
| Nordic, Baltic and German intraday market | 8 | 2 |
| Total | 524 | 100 |

Source: Nord Pool; own processing

Countries that are part of the Nord Pool day-ahead market are integrated in a single power grid, where electricity is distributed so as to increase liquidity, efficiency and the social welfare of the participating parties. In theory, a single electricity price exists for all of the parties. This price is called the system price and it is used as the Nordic reference price for trading and clearing of most financial contracts. It assumes no congestion restrictions and infinite transmission capacities between different areas.

Figure 2.1: Nord Pool hourly day-ahead system price

[Figure: two panels, "SYS Elspot hourly" and "SYS Elspot daily"; x-axis: Date; y-axis: EUR/MWh]

Source: Nord Pool; own processing

However, due to existing transmission system constraints, electricity prices differ across the countries. Each country is divided into bidding areas by the local transmission system operator (TSO), and each of those areas has its own area price. The transmission constraints between all areas (in both directions) are updated daily by the relevant TSO. Currently, there are 5 bidding areas in Norway (NO1, NO2, NO3, NO4, NO5), 4 bidding areas in Sweden (SE1, SE2, SE3, SE4) and 2 bidding areas in Denmark (DK1, DK2). Finland (FI), Estonia (EE), Lithuania (LT) and Latvia (LV) each have only one bidding area (one electricity price for the entire country). Figure 2.2 provides an overview of the bidding areas' geographical location.

Figure 2.2: Nord Pool bidding areas

Source: Nord Pool

2.2 Day-ahead market

As already mentioned in Chapter 1, the electricity market is unique especially due to the non-storability of electric power. This feature directly influences the structure and trading conventions of the market, as both demand and supply need to be properly planned in advance. Buyers of electric power plan how much energy they need to cover the demand of the following day and how much they are willing to pay for each hour of the day. Sellers of electricity decide how much energy they are able to produce and at what cost in each hour of the day.

The day-ahead market is an auction-based exchange, where bids for physical delivery of power on the next day (T+1) are submitted to the Nord Pool exchange until 12:00 CET of the current day (T). Nord Pool's computer algorithm then computes electricity prices for each hour of the following day. The prices are published at 12:42 CET (T) or later and the contracts are physically delivered from 00:00 CET of the following day (T+1).

By European convention, the day-ahead electricity prices are regarded as spot electricity prices even though the intraday market is much closer to real-time trading. The standardized trading unit of electric power is the MWh and the official currency of the market is EUR.

The final day-ahead price of electricity reflects the cost of producing one MWh from the most expensive power source that needs to be employed to meet the demand. Renewable sources of energy such as hydro, wind and solar power are the cheapest to utilize, but wind and solar output can be quite unpredictable. Hydro power alone represents around half of the total power generation capacity in the Nordic market. When demand cannot be covered by renewable sources alone, other sources with higher production costs, such as nuclear power, CHP (combined heat and power) or coal condensing, are utilized (Knapik, 2017). Figure 2.3 illustrates the price formation according to the marginal cost of each power source.

Figure 2.3: Electricity supply and demand

Source: Knapik (2017).
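The price formation described above can be sketched as a simple merit-order dispatch for a single delivery hour. This is only an illustration: the `Source` class, the `clearing_price` function and all capacities and marginal costs are hypothetical and are not taken from Nord Pool data or from Nord Pool's actual price algorithm.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    capacity_mwh: float    # energy offered for the hour (hypothetical)
    marginal_cost: float   # EUR/MWh (hypothetical)


def clearing_price(sources, demand_mwh):
    """Dispatch sources from cheapest to most expensive until demand is met.

    Returns the marginal cost of the last source needed (the price) and the
    list of (name, dispatched MWh) pairs.
    """
    if demand_mwh <= 0:
        raise ValueError("demand must be positive")
    remaining = demand_mwh
    price = None
    dispatched = []
    for src in sorted(sources, key=lambda s: s.marginal_cost):
        used = min(src.capacity_mwh, remaining)
        dispatched.append((src.name, used))
        remaining -= used
        price = src.marginal_cost
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("demand exceeds total offered capacity")
    return price, dispatched


# Hypothetical supply stack for one delivery hour.
stack = [
    Source("wind", 300, 0.0),
    Source("hydro", 500, 5.0),
    Source("nuclear", 400, 12.0),
    Source("CHP", 300, 35.0),
    Source("coal condensing", 300, 50.0),
]

low_price, low_mix = clearing_price(stack, 700)      # hydro is the marginal source
high_price, high_mix = clearing_price(stack, 1400)   # CHP is the marginal source
print(low_price, high_price)
```

In this toy stack, raising demand from 700 to 1 400 MWh moves the marginal source from hydro to CHP, so the price jumps even though most of the energy still comes from cheap sources.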

Extreme prices are mostly caused by technical restrictions on electricity production and transmission. Oversupply from inflexible sources of energy (shutting down and restarting power generation is associated with high costs) leads to negative prices. On the other hand, unexpected outages or transmission grid problems cause price spikes.

2.3 Norwegian and Danish electricity markets

Even though each Nordic country has its own typical electricity production mix, the whole region is characterized by a large share of renewable energy sources in total power production. For the purpose of this thesis, two countries are considered that rely on renewables the most and that are completely different from each other in terms of the sources used for power generation: Norway and Denmark. Figure 2.4 illustrates the share of different energy sources in the total power generated in the two countries over the period 1990 - 2015.

Figure 2.4: Energy sources by year

[Figure: two panels, "Denmark" and "Norway"; x-axis: Year; y-axis: Fuel share of the total electricity production; legend: Coal, Wind, Gas, Oil, Biofuels, Waste, Hydro, Solar]

Source: International Energy Agency; own processing

Norway has historically relied almost entirely on hydro power. Its hydro reservoirs have a total capacity of 86.9 TWh (see Chapter 6). Inflow to the hydro reservoirs depends on melting snow and precipitation. During normal years, Norway has a surplus of energy that is exported to other countries. Norway optimizes its use of hydro power by importing electricity during off-peak hours at night, saving the water for the best-paid periods during the day. The imported power comes primarily from thermal power plants and wind farms on the Continent (Statnett).

In 2018, total power generation in Norway reached 145 686 GWh. Hydro power accounted for 95% of energy production; the rest was covered by wind power, fossil fuels and other marginal sources. Table 2.2 provides an overview of Norwegian power generation in 2018.

Table 2.2: Norwegian electricity mix 2018

| Fuel | Generation (GWh) | Generation (%) |
|---|---|---|
| Hydro | 138 040 | 95 |
| Wind | 3 384 | 2 |
| Fossil | 3 170 | 2 |
| Other | 1 092 | 1 |
| Total | 145 686 | 100 |

Source: ENTSO-E; own processing

In contrast to Norway, Denmark has considerably changed its power production mix over the years and has done so in a completely different direction. In 1990 it was almost entirely dependent on coal, while in 2018, 48% of total production was generated by wind power. Denmark has also produced most of the wind turbines installed around the world, which confirms its position as a wind power leader in the same way that Norway is considered a hydro power leader. While Norwegian fjords provide a great opportunity for harnessing water, the flat Danish landscape allows the country to take full advantage of the wind. Table 2.3 provides an overview of Danish power generation in 2018.

Table 2.3: Danish electricity mix 2018

| Fuel | Generation (GWh) | Generation (%) |
|---|---|---|
| Wind | 13 889 | 48 |
| Fossil | 9 130 | 32 |
| Bio | 3 667 | 13 |
| Waste | 1 267 | 4 |
| Solar | 959 | 3 |
| Hydro | 15 | 0 |
| Total | 28 927 | 100 |

Source: ENTSO-E; own processing

Table 2.4 compares Denmark and Norway in terms of electricity production per capita in 2018. Norwegian electricity production per capita was about 5.5 times higher than Danish production per capita. Norwegian energy consumption is among the highest in the world.

Table 2.4: Production per capita in 2018

| Country | Population | Production per capita (MWh) |
|---|---|---|
| Norway | 5 295 619 | 27.5 |
| Denmark | 5 781 190 | 5.0 |

Source: ENTSO-E; own processing
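The per-capita figures can be reproduced from the totals in Tables 2.2 and 2.3 and the populations in Table 2.4. The short check below uses only those numbers; the function and variable names are illustrative.

```python
# Total generation in 2018 (GWh), from Tables 2.2 and 2.3.
generation_gwh = {"Norway": 145_686, "Denmark": 28_927}
# Population, from Table 2.4.
population = {"Norway": 5_295_619, "Denmark": 5_781_190}


def per_capita_mwh(country):
    """Annual production per inhabitant in MWh (1 GWh = 1 000 MWh)."""
    return generation_gwh[country] * 1_000 / population[country]


norway = per_capita_mwh("Norway")
denmark = per_capita_mwh("Denmark")
ratio = norway / denmark

print(round(norway, 1), round(denmark, 1), round(ratio, 1))  # 27.5 5.0 5.5
```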

Because both Norway and Denmark are divided into smaller price areas (bidding areas) and the prices generally do not differ much within one country, only one area from each country is chosen for the modeling of day-ahead electricity prices. The bidding area with the highest production in each country is chosen. As illustrated by Figure 2.5, NO2 is the largest electricity producer in Norway and DK1 is the largest electricity producer in Denmark.

Figure 2.5: Norwegian and Danish electricity production by area

[Figure: two panels, "Norway" (bidding areas NO1, NO2, NO3, NO4, NO5) and "Denmark" (bidding areas DK1, DK2); x-axis: Date; y-axis: Share of total electricity production by area]

Source: Nord Pool; own processing

Notice that electricity production in the bidding areas NO1, NO2 and NO5 changed considerably during 2013. This was caused by a change in the definition of the bidding areas on 2 December 2013. The local transmission system operator (TSO) Statnett announced this change on the Nord Pool website. As the area NO2 was chosen for the day-ahead price modeling, only data since 2014 are considered for model development. Using earlier data could bias the models.

2.4 NASDAQ OMX Oslo ASA

Nord Pool ASA was acquired by NASDAQ OMX in 2010 and subsequently changed its name to NASDAQ OMX Oslo ASA, with the trade name NASDAQ OMX Commodities Europe. It is a commodity derivatives exchange (a single financial energy market for Norway, Denmark, Sweden and Finland) that holds a license under the Norwegian Exchange Act. It is authorized by the Norwegian Ministry of Finance and supervised by the Norwegian Financial Supervisory Authority (NFSA), Finanstilsynet. It provides trading in power, natural gas and carbon emission markets, fuel oil, seafood derivatives, iron ore and electricity certificates, as well as clearing services. Financial electricity contracts (such as futures) are used to guarantee prices and manage risk when trading power. NASDAQ OMX Commodities offers contracts of up to ten years' duration, with contracts for days, weeks, months, quarters and years.

Generally, the relationship between spot and futures prices is explained either by the storage cost theory or by the risk premium theory.

Storage cost theory:

$$F_{t,T} = S_t \, e^{(r_t + u_t - y_t) T} \tag{2.1}$$

Where:

F futures price
S spot price
r risk-free interest rate
u storage cost
y convenience yield
T time period

Risk premium theory:

$$F_{t,T} = E_t(S_{t+T}) \, e^{(r_t - i_t)} = E_t(S_{t+T}) \, e^{-p_t} \tag{2.2}$$

Where:

p_t risk premium (p_t = i_t - r_t)
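As a numerical illustration of the two theories, the sketch below evaluates a cost-of-carry futures price and backs out the risk premium implied by a futures price and an expected spot price. It assumes the usual cost-of-carry convention, in which storage cost raises and convenience yield lowers the futures price; the function names and all input values are hypothetical.

```python
import math


def storage_cost_futures(spot, r, u, y, T):
    """Cost-of-carry futures price: F = S * exp((r + u - y) * T).

    Assumed convention: storage cost u raises F, convenience yield y lowers it.
    """
    return spot * math.exp((r + u - y) * T)


def implied_risk_premium(futures, expected_spot):
    """Solve F = E[S] * exp(-p) for the risk premium p."""
    return math.log(expected_spot / futures)


# Hypothetical inputs: spot 50 EUR/MWh, one-week horizon, annualized rates.
spot = 50.0
one_week = 7 / 365
f_negative_yield = storage_cost_futures(spot, r=0.01, u=0.0, y=-0.05, T=one_week)
f_no_carry = storage_cost_futures(spot, r=0.0, u=0.0, y=0.0, T=one_week)

# A futures price below the expected spot price implies a positive premium.
premium = implied_risk_premium(futures=48.0, expected_spot=50.0)
print(f_negative_yield, f_no_carry, premium)
```

With a negative convenience yield the futures price lies above the spot price, and with no carry terms at all the two coincide.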

Botterud et al. (2010) analyzed the relationship between the Nord Pool spot and weekly futures prices in the period from 1996 to 2006 and concluded that futures prices tend to be higher than spot prices, with a negative average convenience yield and seasonal variability depending on demand and on water levels in the hydro reservoirs. Weron & Zator (2013) challenged the original paper by Botterud et al. (2010), pointing to weaknesses in the econometric setup and flaws in the argumentation (their study used a longer time series, 1998 - 2010). In particular, they disputed the direction of the relationship, arguing that the relation between the water level and the risk premium has the opposite sign to that claimed in Botterud et al. (2010). The convenience yield model does not have strong support either.

We do not consider futures as a market benchmark against which to evaluate our forecasts, for the following reasons:

1. The underlying price of the futures is the Nord Pool system price (SYS), which is a theoretical price calculated for the whole Nord Pool area. However, we predict prices for individual bidding areas, NO2 and DK1.

2. The previous day's closing price of the day futures is almost always the same as the spot price itself, because the derivatives exchange closes only after the auctioned prices for the next day are already known. Prices of the week futures are not suitable for comparison with our forecasts.

3. Futures prices are not available on weekends and public holidays, whereas the physical electricity market is active throughout the whole week.

Table 2.5 provides an overview of day futures prices, week futures prices and system spot prices. Note that while futures prices are reported in the table as the closing prices on a given day, the system spot prices are reported for the day in which the electricity is delivered (T+1).

Table 2.5: Overview of futures and system spot prices

| Day | Weekday | Futures (day) | Futures (week) | Spot (SYS) |
|---|---|---|---|---|
| 12/8/2018 | Sunday | - | - | 45.50 |
| 11/8/2018 | Saturday | - | - | 47.18 |
| 10/8/2018 | Friday | 47.18 | 49.30 | 48.22 |
| 9/8/2018 | Thursday | 48.22 | 48.85 | 50.53 |
| 8/8/2018 | Wednesday | 50.53 | 48.35 | 51.14 |
| 7/8/2018 | Tuesday | 51.14 | 51.00 | 53.52 |
| 6/8/2018 | Monday | 53.52 | 52.00 | 52.25 |
| 5/8/2018 | Sunday | - | - | 49.53 |
| 4/8/2018 | Saturday | - | - | 51.58 |
| 3/8/2018 | Friday | 51.58 | 51.90 | 53.52 |
| 2/8/2018 | Thursday | 53.52 | 51.48 | 53.29 |
| 1/8/2018 | Wednesday | 53.29 | 50.50 | 53.88 |
| 31/7/2018 | Tuesday | 53.88 | 52.20 | 53.32 |
| 30/7/2018 | Monday | 53.32 | 52.25 | 53.03 |

Source: Reuters & Nord Pool; own processing

Chapter 3

Literature review

Jonnson et al. (2010) found a strong impact of day-ahead wind power penetration forecasts on Western Denmark day-ahead electricity spot prices. The relationship was found not only to influence the level of the spot prices, but also to alter their distributional characteristics. According to the study, a predicted increase in wind power penetration causes Elspot prices to decrease on average and diminishes their intra-day variation. The authors used a non-parametric regression approach, which proved to be a good choice as the studied relationship was non-linear. The results are beneficial to any market having wind power as part of its power generation mix. According to the authors, the conclusions can help explain some of the regime-switching behaviour in the prices. They also support some previous work, such as Karakatsani & Bunn (2008b), which argued that aspects of plant dynamics should be considered when modelling short-term electricity spot prices.

Weron (2014) provides a comprehensive overview of the most important electricity price forecasting models and practices. It compares different methods, both statistical and computational, and underlines their strengths and weaknesses. It stresses the importance for the community of comparing different methods on the same datasets, with the same error evaluation procedures and with statistical testing of the significance of one model's outperformance of another.

A multi-layer LSTM approach is proposed by Jiang (2018) for the Australian and Singapore markets. Exogenous variables such as day of the week, hour of the day, weather conditions, oil prices and historical price and demand are used. The LSTM approach outperforms four other methods in terms of MAPE: an adaptive neuro-fuzzy inference system optimized with a particle swarm optimization algorithm (PSO-ANFIS), a conventional backpropagation-based multilayer feed-forward network (BP-ANN), an artificial neural network with the wavelet transformation (WT-ANN) and seasonal ARIMA (SARIMA).

Kuo & Huang (2018) propose a hybrid model consisting of two types of neural networks: a convolutional neural network (CNN) and Long Short-Term Memory (LSTM). In terms of MAE and RMSE, the model outperformed the separate neural network models and other machine learning models such as Support Vector Machines, Random Forest and Decision Trees.

Lago et al. (2018) show, using the case of Belgium, that market integration can play a large role in the prediction of electricity prices. Features from the connected Belgian and French markets are considered in a deep neural network. The study highlights the benefits of considering neighbouring markets in terms of regulation, grid stability and the economic profit of agents in the electricity market. Including market integration improved MAPE.

Loland & Dimakos (2010) investigate the Elspot spread between the NO1 (Oslo) price area and the system price, which widened during 2007. Different factors were examined, such as the deviation of the water reservoir from its normal level, Elspot flow and capacity, net capacity utilization, time and season. The considered models are compared in terms of the Bayesian information criterion (BIC). Other factors not used in the analysis were suggested, such as CO2 prices and snow reservoir levels. A generalized additive model framework was used to estimate the relative price difference between NO1 and the Nord Pool system price. The results show that the NO1 price is below the system price when water reservoir levels are high relative to normal levels and the export capacity of the area is limited.

Kristiansen (2012) predicted Nord Pool day-ahead prices in the period from January 2004 to May 2011. The study compared OLS regression with regularized (Ridge and LASSO) regressions and neural networks. The results were compared in terms of MAPE and R², with the neural network performing best.

A hybrid electricity price forecasting model is presented in Voronin & Partanem (2012). Finnish electricity spot prices (Elspot) in the period from January 2006 to December 2009 are used. The models are estimated in 2 layers, for normal and spiky behaviour. ARMA, GARCH and neural network models are applied. A Gaussian mixture model and a K-nearest neighbours model are used to estimate the probability of price spike occurrence.

Knapik (2017) focuses on Nord Pool hourly intraday prices (Elbas) in the period from September 2009 to December 2015. The study takes into account persistence in the series, seasonality and also external information such as hydro reservoir levels or electricity loads. An autoregressive ordered probit model, a Markov model and an autoregressive conditional multinomial (ACM) model are applied for the prediction of extreme price events. The ACM model was able to predict the price spikes most accurately.

Chapter 4

Models theory

4.1 Naive model

In time-series prediction, the simplest and the most cost-effective forecast ˆYt+1 that can be made is to use the last observed value Yt. Naive model is used as a benchmark model. Any other more complex and computationally more de- manding model should provide superior forecast that out-performs this bench- mark, otherwise there is no added value for the increased complexity.

t+1 =Yt (4.1)

As there is strong weekly seasonality in the electricity market, we also compute a seasonal naive forecast, which is more precise in this case. For example, the Monday electricity price is forecast better by the price on the previous Monday than by the price on the previous day (Sunday), because electricity demand on the two days is very different.

$$\hat{Y}_{t+1} = Y_{t-6} \tag{4.2}$$
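The two benchmark forecasts can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the price values are hypothetical and are not taken from the thesis data.

```python
def naive_forecast(prices):
    """Naive forecast (4.1): the next value equals the last observed value."""
    return prices[-1]


def seasonal_naive_forecast(prices, season_length=7):
    """Seasonal naive forecast (4.2): the next value equals the value
    observed one season (here 7 days) before the forecast day."""
    if len(prices) < season_length:
        raise ValueError("need at least one full season of history")
    return prices[-season_length]


# Hypothetical daily prices for two weeks, Monday to Sunday.
history = [30.0, 31.0, 32.0, 31.5, 30.5, 25.0, 24.0,
           33.0, 32.0, 31.0, 30.0, 29.0, 26.0, 23.0]

monday_naive = naive_forecast(history)              # uses last Sunday
monday_seasonal = seasonal_naive_forecast(history)  # uses last Monday
```

With a history ending on a Sunday, the naive forecast for Monday repeats the low Sunday price, while the seasonal naive forecast repeats the previous Monday.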

4.2 Autoregressive Integrated Moving Average models (ARIMA)

ARIMA models are the "gold standard" of financial time series forecasting. The development of these models is described by the Box-Jenkins method (Box & Jenkins, 1970), a well-known set of statistical techniques that provides guidance at every step: pre-processing the data (treating stationarity and seasonality), identification and selection of an appropriate order, parameter estimation and model diagnostics.

ARIMA(p,d,q) models combine an AR(p) and an MA(q) component, with possible integration of order d. They are also called short-memory models.

ARMA models can be described as

$$Y_t = c + \varepsilon_t + \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i} \tag{4.3}$$

If we assume differencing ($d > 0$) and the backward-shift operator is defined as

$$B^k Y_t = Y_{t-k} \tag{4.4}$$

then ARIMA can be expressed as

$$\phi_p(B)(1-B)^d X_t = \theta_q(B) a_t \tag{4.5}$$

The two most frequently used methods of parameter estimation are the method of least squares and maximum likelihood estimation (MLE).

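For initial values $X_1, \ldots, X_p$, with

$$a_t = \phi_p(B)\,\theta_q(B)^{-1} X_t = \pi(B) X_t \quad \text{and} \quad a_t \sim N(0, \sigma_a^2) \tag{4.6}$$

the parameters of the model are estimated by maximizing the log-likelihood function:

$$\ln L(\phi_1, \ldots, \phi_p, \theta_1, \ldots, \theta_q, \sigma_a^2) = -\frac{T-p}{2}\left(\ln 2\pi + \ln \sigma_a^2\right) - \frac{1}{2\sigma_a^2} \sum_{t=p+1}^{T} a_t^2 \tag{4.7}$$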

If exogenous variables are added to the model, it is called an ARIMAX model. Such models are also known as transfer function models in the engineering literature and as dynamic econometric/regression models in the economics literature (Nogales & Conejo, 2005; Pankratz, 1991):

$$Y_t = C + \nu(B) X_t + N_t \tag{4.8}$$


Where:

$Y_t$ Dependent variable
$X_t$ Independent variable
$C$ Constant
$N_t$ Stochastic disturbance
$\nu(B)X_t$ Transfer function
$B$ Back-shift operator

$$\nu(B) X_t = (\nu_0 + \nu_1 B + \nu_2 B^2 + \ldots) X_t \tag{4.9}$$

When using statistical packages to estimate the parameters, careful attention should be paid to which of the following two specifications is estimated.

Transfer function model:

$$Y_t = \frac{\beta(B)}{\vartheta(B)} X_t + \frac{\theta(B)}{\phi(B)} Z_t \tag{4.10}$$

Linear regression with ARMA errors (Reg-ARMA):

$$Y_t = \beta X_t + \frac{\theta(B)}{\phi(B)} Z_t \tag{4.11}$$

Two concepts are important in constructing a transfer function: prewhitening and the cross-correlation function (CCF). If the analysed time series are autocorrelated, it might be difficult to assess the cross-correlation correctly and the final model might be mis-specified. Prewhitening means that all time series are first transformed so that they resemble white noise. An ARMA model is first fitted to the input series to reduce its residuals to white noise. The response series is then filtered with the same model and the two residual series are cross-correlated. It is a commonly used correction in model identification diagnostics.

A possible extension of the ARIMA models is to include a seasonal component and obtain SARIMA (here for quarterly data):

$$(1 - \phi_1 B)(1 - \Phi_1 B^4)(1 - B)(1 - B^4) y_t = (1 + \theta_1 B)(1 + \Theta_1 B^4) e_t \tag{4.12}$$

ARIMA models describe the conditional mean. A Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model can be added to model the variance:

$$Y_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i} \alpha_i Y_{t-i}^2 + \sum_{j} \beta_j \sigma_{t-j}^2 \tag{4.13}$$

ARIMA-GARCH parameters should be estimated simultaneously, otherwise the estimated parameters will be biased. There are further variants of GARCH, such as EGARCH, TGARCH, IGARCH or GARCH-M.

4.3 Support Vector Regression (SVR)

Support Vector Machine (SVM) is used for classification problems, where the target is categorical. However, the same principles can be used for regression problems, where the target is a continuous variable. Unlike a simple regression, which minimizes the error, SVR tries to keep the error within a certain threshold. It is a non-parametric technique as it relies on kernel functions, which work well for non-linear problems.

Figure 4.1: The soft margin loss setting for a linear SVM

Source: Scholkopf & Smola (2002)

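Formulated as a convex optimization problem:

$$\min \frac{1}{2} \|w\|^2 \tag{4.14}$$

$$\begin{aligned} y_i - w x_i - b &\le \varepsilon + \xi_i \\ w x_i + b - y_i &\le \varepsilon + \hat{\xi}_i \end{aligned} \tag{4.15}$$

$$\frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} (\xi_i + \hat{\xi}_i) \tag{4.16}$$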

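Kernels are used to map lower-dimensional data into a higher-dimensional space. The most common kernels are:

$$\begin{aligned} \text{linear} \quad & k(u, v) = u'v \\ \text{polynomial} \quad & k(x_i, x_j) = (x_i \cdot x_j + 1)^d \\ \text{radial (RBF)} \quad & k(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2) \\ \text{sigmoid} \quad & k(x, y) = \tanh(\alpha x^T y + c) \end{aligned} \tag{4.17}$$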

We will use radial kernel as default.

There are three hyper-parameters to tune:

• Epsilon (ε): the thickness of the tube; if the difference between the prediction and the real value is not higher than epsilon, it is not considered an error

• Cost (C): penalty parameter

• Gamma (γ): kernel function parameter, accounts for smoothness of the decision boundary and controls the variance of the model

Cross validation and grid search are used for tuning the hyper-parameters.
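To make the role of ε and γ more concrete, the following Python sketch implements the RBF kernel from (4.17) and the ε-insensitive loss that underlies the tube described above. It is an illustration under assumptions, not the implementation used in the thesis: the function names and the numbers are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """Radial kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors inside the tube of half-width epsilon cost nothing;
    outside the tube the loss grows linearly."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Identical points have kernel value 1; distant points approach 0.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
far = rbf_kernel([0.0, 0.0], [3.0, 4.0], gamma=0.5)

# A miss of 0.05 is ignored for epsilon = 0.1, a miss of 0.5 is not.
inside_tube = epsilon_insensitive_loss(10.0, 10.05, epsilon=0.1)
outside_tube = epsilon_insensitive_loss(10.0, 10.5, epsilon=0.1)
```

A larger γ makes the kernel decay faster with distance, so each training point influences a smaller neighbourhood and the fitted function becomes more flexible.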

4.4 Long Short-Term Memory (LSTM)

LSTM belongs to a class of models called artificial neural networks (ANN). The basic principles of ANNs are inspired by the neural activity of a biological brain. These methods have a wide range of applications in classification, clustering and regression problems. However, ANNs are a black box: they might provide very accurate predictions, but the model cannot be interpreted in the way linear regression or ARIMA models can. LSTM is useful when autocorrelation is present and it can handle long-term dependencies.

LSTM is a special case of a recurrent neural network (RNN), which processes sequential data (such as time series). The RNN architecture suffers from a problem called the vanishing gradient. It is connected to training artificial neural networks with gradient-based learning methods and back-propagation. This issue arises especially when dealing with large time series data sets. The gradient gets smaller when going into deeper layers and at some point the weights do not change at all. This is remedied by the LSTM architecture.

$$\begin{aligned} i_t &= \sigma(w_i [h_{t-1}, x_t] + b_i) \\ f_t &= \sigma(w_f [h_{t-1}, x_t] + b_f) \\ o_t &= \sigma(w_o [h_{t-1}, x_t] + b_o) \end{aligned} \tag{4.18}$$

Where:

$i_t$ Input gate
$f_t$ Forget gate
$o_t$ Output gate
$\sigma$ Sigmoid function
$w_x$ Weight of a given gate (x) neuron
$h_{t-1}$ Output of the previous LSTM block
$x_t$ Input
$b_x$ Biases for the respective gates (x)

Figure 4.2: LSTM architecture

Source: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

3 layers: Input Layer, Hidden Layer, Output Layer

$$\begin{aligned} \tilde{c}_t &= \tanh(w_c [h_{t-1}, x_t] + b_c) \\ c_t &= f_t \cdot c_{t-1} + i_t \cdot \tilde{c}_t \\ h_t &= o_t \cdot \tanh(c_t) \end{aligned} \tag{4.19}$$

Where:

$c_t$ Cell state (memory) at time-stamp (t)
$\tilde{c}_t$ Candidate for the cell state at time-stamp (t)
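Equations (4.18) and (4.19) can be traced step by step for a cell with a single hidden unit and a single input. The sketch below is a simplified illustration: the dictionary layout of the weights and all numerical values are assumptions made for the example, and a real network would use vectors and matrices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w, b):
    """One forward step of a single-unit LSTM cell.

    w[g] is a pair (weight on h_prev, weight on x_t) for gate g,
    b[g] is the bias of gate g; g is one of "i", "f", "o", "c".
    """
    def pre_activation(g):
        return w[g][0] * h_prev + w[g][1] * x_t + b[g]

    i_t = sigmoid(pre_activation("i"))        # input gate
    f_t = sigmoid(pre_activation("f"))        # forget gate
    o_t = sigmoid(pre_activation("o"))        # output gate
    c_tilde = math.tanh(pre_activation("c"))  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde        # new cell state
    h_t = o_t * math.tanh(c_t)                # new output
    return h_t, c_t


# With all weights and biases at zero every gate equals 0.5 and the
# candidate is 0, so the cell simply halves its previous state.
zero_w = {g: (0.0, 0.0) for g in "ifoc"}
zero_b = {g: 0.0 for g in "ifoc"}
h_demo, c_demo = lstm_step(x_t=1.0, h_prev=0.0, c_prev=1.0, w=zero_w, b=zero_b)
```

The additive update of the cell state is what lets information, and gradients, pass across many time steps when the forget gate stays close to one.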

There are many different activation functions that transform the input of a neuron into its output. The neuron computes the weighted sum of its inputs, adds a bias, and the activation function decides whether the neuron is activated or not. An activation function should be monotonic, bounded and easily differentiable. Sigmoid and Tanh have all three properties and they are used as activation functions in LSTM cells.

$$\sigma(z) = \frac{1}{1 + e^{-z}} \tag{4.20}$$

$$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}} \tag{4.21}$$

Figure 4.3: Sigmoid and Tanh

[Plot with two panels, sigmoid and tanh; x axis from −6 to 6, y axis from −1.0 to 1.0.]

Another very popular activation function (due to its good experimental results) is the Rectified Linear Unit (ReLU). Its simplicity and low computational demands allow for fast convergence of the weights. However, in comparison to Sigmoid and Tanh, the function is not bounded and is therefore recommended rather for the Multilayer Perceptron (MLP) and Convolutional Neural Networks (CNN).

$$f(z) = \max(0, z) \tag{4.22}$$

Due to the properties of the activation functions, normalization of the data is necessary, otherwise it would be very difficult for the gradient-based algorithms to converge effectively to the correct weights.


Chapter 5

Methodology

5.1 Hypotheses

Two main questions are of interest in this thesis:

1. Does any of the proposed models outperform the (Seasonal) Naive forecast?

2. Which of the proposed models provides the best forecasts?

In this thesis, forecast (predictive) accuracy is measured in terms of Root Mean Square Error (RMSE), which is described in later sections. A simple difference in the accuracy of two forecasts is not sufficient evidence that one forecast is superior to the other, as it might be just a result of chance. Therefore, the statistical significance of the difference in predictive accuracy between any two forecasts is tested by the Diebold-Mariano test (Diebold & Mariano, 1995).

The null hypothesis of the Diebold-Mariano (DM) test is that there is no difference in the accuracy of two competing forecasts.

$$H_0: E(D_{12t}) = E\big(L(e_{1t}(F_{1t})) - L(e_{2t}(F_{2t}))\big) = 0 \tag{5.1}$$

An important property of the test is that it allows for non-Gaussian, non-zero-mean and serially or contemporaneously correlated forecast errors. The only requirement of the DM test is that the loss differential is covariance stationary.

$$DM_{12} = \frac{\bar{d}_{12}}{\hat{\sigma}_{\bar{d}_{12}}} \xrightarrow{d} N(0, 1) \tag{5.2}$$

It should also be emphasized that the DM test is meant to compare forecasts, not models (Diebold, 2015). In this regard, when comparing the forecasts we are interested in the prediction itself rather than in the data generating process.
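The DM statistic in (5.2) can be sketched as follows for one-step-ahead forecasts with a squared-error loss. Two simplifying assumptions are made here and are not taken from the thesis: the variance of the mean loss differential is estimated from the lag-0 sample variance only (which is the usual choice for a one-step horizon), and the function name and the error values are hypothetical.

```python
import math


def diebold_mariano(errors_1, errors_2):
    """DM statistic and two-sided p-value for squared-error loss.

    Assumes one-step-ahead forecasts, so no autocovariance terms are
    added to the variance of the mean loss differential.
    """
    d = [e1 ** 2 - e2 ** 2 for e1, e2 in zip(errors_1, errors_2)]
    n = len(d)
    d_bar = sum(d) / n
    variance = sum((x - d_bar) ** 2 for x in d) / (n - 1)
    dm = d_bar / math.sqrt(variance / n)
    # Two-sided p-value under the asymptotic N(0, 1) distribution.
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))
    return dm, p_value


# Hypothetical forecast errors of two competing forecasts.
errors_a = [2.0, -1.0, 1.5, -2.5, 1.0, -1.5]
errors_b = [1.0, -0.5, 1.0, -1.0, 0.5, -1.0]
dm_stat, dm_p = diebold_mariano(errors_a, errors_b)
```

A positive statistic means the first forecast has the larger average loss; the null hypothesis of equal accuracy is rejected when the p-value is below the chosen significance level.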

5.2 Target definition

The target definition follows the day-ahead market convention described in Chapter 2 and is illustrated graphically in Figure 5.1. If today is day T, the target is the electricity price for a given hour announced on day T at 12:42 CET (or later) with physical delivery on T+1 (tomorrow). If only lagged day-ahead electricity prices are used for the prediction, the window for prediction is between 12:42 CET on T-1 (yesterday) and 12:00 CET on T (blue line). However, when additional explanatory variables represented by publicly available information are considered in the model, the window for prediction is between 00:00 CET and 12:00 CET on T (green line), because only then is the most relevant data available (such as the average wind speed of the past day).

To extend this prediction window (the green line), different frequency of data or older data (which would probably have very low predictive power) could be considered.

Figure 5.1: Target definition

[Timeline over days T - 1, T and T + 1 with the times 12:00 and 12:42 marked.]

Source: Own processing.

As mentioned in Chapter 2, physical electricity is traded through an auction system that sets the electricity price for each individual hour of the following day, resulting in 24 different prices. Therefore, for each bidding area, 24 individual models could be built to forecast the price of each hour separately, with the possibility of combining the forecasts into a single day forecast while making use of all available price dynamics information (strong daily seasonality due to varying demand throughout the day). Based on electricity demand, each day can be divided into three time periods with similar price behaviour: off-peak 1, peak and off-peak 2. The price for each of these periods is calculated as the arithmetic average of the underlying hours.

Table 5.1: Average of hourly prices

Type         Hours (CET)      Description
Off-peak 1   00:00 - 08:00    Average of 8 hours.
Peak         08:00 - 20:00    Average of 12 hours.
Off-peak 2   20:00 - 00:00    Average of 4 hours.
Day          00:00 - 00:00    Average of 24 hours.
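The aggregation in Table 5.1 can be written as a short Python function. The sketch assumes a regular day with exactly 24 hourly prices ordered from hour 00 to hour 23 (days with a daylight saving time change would need separate handling); the function name and the prices are hypothetical.

```python
def period_averages(hourly_prices):
    """Average 24 hourly prices into the four targets of Table 5.1."""
    if len(hourly_prices) != 24:
        raise ValueError("expected exactly 24 hourly prices")
    # Half-open index ranges [start, end) over the hours of the day.
    periods = {
        "off-peak 1": (0, 8),
        "peak": (8, 20),
        "off-peak 2": (20, 24),
        "day": (0, 24),
    }
    return {
        name: sum(hourly_prices[start:end]) / (end - start)
        for name, (start, end) in periods.items()
    }


# Hypothetical day: cheap night, expensive daytime, moderate evening.
example_day = [10.0] * 8 + [20.0] * 12 + [14.0] * 4
averages = period_averages(example_day)
```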

Table 5.2 provides overview of all models to be evaluated.

Table 5.2: Overview of number of models per country

Period       Reg-ARIMA   SVR    LSTM
off-peak 1   M1,1        M2,1   M3,1
peak         M1,2        M2,2   M3,2
off-peak 2   M1,3        M2,3   M3,3
day          M1,4        M2,4   M3,4

5.3 Stationarity tests

Two types of stationarity are generally recognized: strong (strict) and weak.

The weak stationary process has the property that the mean, variance and autocovariance structure do not change over time:

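$$\begin{aligned} \mu_t &= E(X_t) \\ \sigma_t^2 &= D(X_t) = E(X_t - \mu_t)^2 \\ \gamma(t, t-k) &= E[(X_t - \mu_t)(X_{t-k} - \mu_{t-k})] \\ \rho(t, t-k) &= \frac{\gamma(t, t-k)}{\sigma_t \sigma_{t-k}} \end{aligned} \tag{5.3}$$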

Weak stationarity of the time series is required by most quantitative models in finance. However, the assumption is often violated due to the nature of financial time series, which usually contain a trend and high volatility that changes over time. Although stationarity is not strictly required for machine learning methods (such as SVR or LSTM), it can enhance the performance of the models and therefore we require stationarity of the time series for all modeling methods used in the thesis. We use two tests to determine whether a time series is stationary or not:

• Augmented Dickey – Fuller test (ADF),

• Kwiatkowski – Phillips – Schmidt – Shin test (KPSS).

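ADF is a unit root test with the null hypothesis of non-stationarity (a unit root is present in the time series):

$$\begin{aligned} X_t &= \phi_1 X_{t-1} + \sum_{i=1}^{p-1} \beta_i \Delta X_{t-i} + a_t \\ X_t &= \phi_0 + \phi_1 X_{t-1} + \sum_{i=1}^{p-1} \beta_i \Delta X_{t-i} + a_t \\ X_t &= \phi_0 + \gamma t + \phi_1 X_{t-1} + \sum_{i=1}^{p-1} \beta_i \Delta X_{t-i} + a_t \end{aligned} \tag{5.4}$$

$$H_0: \phi_1 = 1 \quad \text{versus} \quad H_1: |\phi_1| < 1$$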

while the null hypothesis of KPSS is stationarity. When ADF rejects its null hypothesis and KPSS does not reject its null hypothesis (both at the 0.05 significance level), the time series is considered stationary.

$$X_t = r_t + \gamma t + e_t \tag{5.5}$$

$$r_t = r_{t-1} + u_t, \qquad u_t \sim \text{iid}(0, \sigma_u^2) \tag{5.6}$$

If at least one of the tests indicates non-stationarity of the time series, differencing is used to obtain a stationary time series.
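The combined decision rule can be sketched in Python. The p-values are assumed to come from an ADF and a KPSS routine of a statistical package and are simply passed in here; the function names and the example numbers are hypothetical.

```python
def is_stationary(adf_p_value, kpss_p_value, alpha=0.05):
    """Stationary only if ADF rejects its null (unit root) and
    KPSS does not reject its null (stationarity)."""
    adf_rejects_unit_root = adf_p_value < alpha
    kpss_keeps_stationarity = kpss_p_value >= alpha
    return adf_rejects_unit_root and kpss_keeps_stationarity


def difference(series, lag=1):
    """Differencing: x[t] - x[t - lag]; the result is lag items shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical p-values: the tests disagree, so the series is differenced.
prices = [1.0, 3.0, 6.0, 10.0]
if not is_stationary(adf_p_value=0.20, kpss_p_value=0.10):
    prices = difference(prices)
```

After differencing, both tests would be run again on the transformed series.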

5.4 Seasonality and spikes identification

Apart from making a time series stationary by removing a trend, seasonal and cyclical components should be removed as well to make forecasting of the non-systematic part easier.

$$y_t = T_t + C_t + S_t + I_t \tag{5.7}$$

Where:

$y_t$ Dependent variable
$T_t$ Trend
$C_t$ Cycle
$S_t$ Season
$I_t$ Non-systematic part

It is extremely important for electricity price models to capture the seasonal component correctly. However, there are many competing approaches, each with its advantages and disadvantages. In general, electricity prices show seasonality at the annual, weekly and daily levels. Proper treatment of the seasonality is a modelling problem of its own, and failure to model seasonality correctly results in poor model estimates. Analysis and discussion of the different approaches to seasonality modeling is out of the scope of this thesis; see Janczura (2013) for an overview. For the sake of simplicity and comparability of the different models, we decided to use dummy variables, which is a common practice when removing seasonality from data. A dummy variable is made for each month and each day of the week. Public holidays are considered the eighth day of the week. Daily seasonality is not a concern as the electricity price is modeled for each hour of the day separately, as described in the Target definition section.
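The dummy-variable encoding described above can be sketched as follows. The function name and the holiday set are hypothetical, and the layout of the output vector (12 month dummies followed by 8 day-type dummies) is an assumption made for the example.

```python
from datetime import date


def seasonal_dummies(day, holidays):
    """Return 12 month dummies followed by 8 day-type dummies.

    Day types 0 to 6 are Monday to Sunday; type 7 is a public holiday,
    treated as the eighth day of the week.
    """
    month_dummies = [1 if day.month == m else 0 for m in range(1, 13)]
    day_type = 7 if day in holidays else day.weekday()
    day_dummies = [1 if day_type == k else 0 for k in range(8)]
    return month_dummies + day_dummies


# Hypothetical holiday calendar with a single entry.
holiday_set = {date(2018, 12, 25)}
ordinary_monday = seasonal_dummies(date(2018, 12, 24), holiday_set)
christmas_day = seasonal_dummies(date(2018, 12, 25), holiday_set)
```

When the dummies are used in a regression with an intercept, one dummy from each group has to be dropped to avoid perfect collinearity.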

As already mentioned in the previous sections, sudden and extreme spikes are characteristic of electricity prices. However, according to Janczura (2013), these spikes might unfavourably influence estimation of the seasonal pattern and therefore the extreme values should be identified and treated before the estimation.

Replacement of observed price spikes with the mean of the deseasonalized price series is like replacing the extraordinary conditions leading to a spike by the typical or "normal" conditions on that day of the week and time period of the year. The replacement of a particular spike may be also interpreted as a low marginal cost power plant replacing a very high marginal cost power plant on the marginal cost curve on that day. (Janczura 2013)


Nadaraya-Watson kernel regression is considered for modeling the annual seasonal pattern. It is a non-parametric technique to estimate the conditional expectation of a random variable.

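$$\hat{m}_h(x) = \frac{\sum_{i=1}^{n} Y_i K\left(\frac{x - X_i}{h}\right)}{\sum_{j=1}^{n} K\left(\frac{x - X_j}{h}\right)} \tag{5.8}$$

Where:

$\hat{m}_h$ Estimate of the unknown function
$K$ Kernel
$h$ Bandwidth

The smoothing parameter $h$ is called the bandwidth and controls the smoothing level of the estimate.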
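Equation (5.8) translates almost directly into code. The sketch below assumes a Gaussian kernel, which is one common choice and is not stated in the text above; the function names and the data points are hypothetical.

```python
import math


def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nadaraya_watson(x, xs, ys, h):
    """Kernel-weighted average of ys, with weights K((x - X_i) / h).

    For a query point far from all observations the weights can
    underflow to zero, which this minimal version does not handle.
    """
    weights = [gaussian_kernel((x - x_i) / h) for x_i in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


# Halfway between two observations both get the same weight,
# so the estimate is their plain average.
midpoint_estimate = nadaraya_watson(1.0, xs=[0.0, 2.0], ys=[1.0, 3.0], h=1.0)
```

A small bandwidth makes the estimate follow individual observations closely, while a large bandwidth pulls it towards the overall mean.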

Additional analysis of seasonality is performed by means of periodogram (spectral analysis):

$$\hat{P}(f) = \frac{1}{N} \left| \sum_{n=1}^{N} x[n] \exp(-j 2\pi f n) \right|^2, \qquad 0 \le f < 1 \tag{5.9}$$
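As an illustration of (5.9), the sketch below evaluates the periodogram of a synthetic series with a 7-day cycle and locates the dominant frequency. The series is artificial and the function name is hypothetical; for long series an FFT-based routine would be used instead of this direct sum.

```python
import cmath
import math


def periodogram(x, f):
    """Direct evaluation of (5.9) at frequency f (cycles per observation)."""
    n_obs = len(x)
    total = sum(
        x[n - 1] * cmath.exp(-2j * math.pi * f * n)
        for n in range(1, n_obs + 1)
    )
    return abs(total) ** 2 / n_obs


# Synthetic daily series with a pure weekly cycle over 10 weeks.
n_days = 70
signal = [math.cos(2.0 * math.pi * t / 7.0) for t in range(1, n_days + 1)]

# Search the Fourier frequencies k / N up to 0.5 for the largest peak.
frequencies = [k / n_days for k in range(1, n_days // 2 + 1)]
peak_frequency = max(frequencies, key=lambda f: periodogram(signal, f))
dominant_period = round(1.0 / peak_frequency)
```

The peak appears at the frequency of one cycle per seven observations, which is how weekly seasonality shows up in daily price data.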

5.5 Raw prices vs. Logarithmic prices

In finance, time series are mostly non-stationary and heteroscedastic. As statistical models usually require the time series to be stationary and homoscedastic, it is common practice among researchers to apply a logarithmic transformation to the prices and take the first difference. The resulting log returns have better statistical properties.

$$\text{logarithmic return} = \log\frac{P_t}{P_{t-1}} = \log(P_t) - \log(P_{t-1}) \tag{5.10}$$

However, as electricity prices can be negative (which is especially the case for Denmark, see Figure 6.1) and this phenomenon has its economic reasons, logarithmic transformation is not always an option (unless an appropriate scale is applied). Erni (2012) mentions the benefits of modelling raw electricity prices and lists other research papers that consider efforts to stabilize the variance of the original series counterproductive, as they conceal important statistical properties of the time series.

We decided to use raw prices in this thesis. We consider differencing when necessary to make a time series stationary, but no logarithmic transformation is applied.

5.6 Feature scaling (Normalization/Standardization)

Feature scaling is used in machine learning algorithms to ensure that each predictor has the same scale and that the weights of all predictors are proportional. Another reason is that gradient descent, the parameter optimizer, converges faster when the data is scaled.

There are two options: either to normalize the data using the minimum and maximum and obtain values in the range [0,1]:

$$X_{norm} = \frac{X - X_{min}}{X_{max} - X_{min}} \tag{5.11}$$

or to use z-standardization and obtain values with zero mean and standard deviation equal to one:

$$X_{std} = \frac{X - \mu}{\sigma} \tag{5.12}$$

where

$$\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \tag{5.13}$$

Due to the extreme values present in the electricity price time series, z-score standardization is more appropriate, as the normal values would otherwise get squeezed. Standardized values of the target and exogenous variables are used as input for both SVR and LSTM.

When performing standardization, a possible leak of information between samples has to be prevented. Data should always be standardized by the mean and standard deviation calculated from the train/development sample. This means that the validation sample is scaled by the mean and standard deviation of the train sample and the test sample is scaled by the mean and standard deviation of the development sample. If we used a scale computed from the validation/test sample, we would use information that was supposed to be hidden from the model. The same logic applies to the estimation of seasonality.
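The leak-free scaling described above amounts to fitting the scaler on one sample and applying it to another. A minimal Python sketch, with hypothetical function names and numbers:

```python
import statistics


def fit_scaler(sample):
    """Mean and standard deviation (N - 1 in the denominator, as in 5.13)."""
    return statistics.mean(sample), statistics.stdev(sample)


def standardize(values, mean, std):
    """Apply z-standardization (5.12) with externally supplied statistics."""
    return [(v - mean) / std for v in values]


# Hypothetical samples: the scaler is fitted on the train sample only.
train = [10.0, 20.0, 30.0]
validation = [40.0]

train_mean, train_std = fit_scaler(train)
train_scaled = standardize(train, train_mean, train_std)
validation_scaled = standardize(validation, train_mean, train_std)
```

The validation value is expressed in units of the train sample's standard deviation, so nothing from the validation period enters the scaling.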


5.7 Feature-selection & hyper-parameters tuning

Due to inherent differences between parametric models such as ARIMA and non-parametric models such as SVR and LSTM, model selection process was approached differently (based on best-practices for each group of models).

When developing parametric models, the goal is usually to find a good fit to the data at hand while striving for parsimony, i.e. using as few parameters as possible (only statistically significant parameters are kept). This is achieved by using all available data (the whole data-set without any further split) and in-sample model selection. These methods usually enable interpretation of the causal relationships.

To prevent over-fitting of a model selected on the basis of in-sample performance, two information criteria are computed: the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC).

$$BIC = -2 \ln(\hat{L}) + \ln(n) k \tag{5.14}$$

$$AIC = -2 \ln(\hat{L}) + 2k \tag{5.15}$$

Where:

$x$ Observed data
$n$ Number of observations
$k$ Number of parameters
$\hat{L}$ The maximized value of the model likelihood function

Both AIC and BIC penalize models with more parameters. We decided to use BIC as it gives larger penalty and results in less complex models. Based on experimentation with both metrics, AIC tends to overparametrize and leads to overfitting the models.
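The difference between the two penalties is easy to see numerically. In the sketch below the log-likelihood and the number of parameters are made-up values; only the sample size of 1096 corresponds to the development sample described in this chapter.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion (5.15)."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion (5.14)."""
    return -2.0 * log_likelihood + math.log(n) * k


# Hypothetical fitted model: log-likelihood -100 with 3 parameters.
example_aic = aic(-100.0, k=3)
example_bic = bic(-100.0, k=3, n=1096)
```

The BIC penalty per parameter, ln(n), exceeds the AIC penalty of 2 as soon as n is larger than about 7.4, so for daily data over several years BIC always favours the smaller model more strongly.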

The model selection process for non-parametric/machine learning methods is rather different. Part of the development sample is "cut off" to be used as a pseudo-out-of-sample (validation sample). As machine learning methods tend to have many hyperparameters that need to be tuned, the model is trained for different combinations of the hyperparameters and the combination that results in the lowest value of the loss function is chosen. Overfitting of the model is controlled by the validation sample in this case.

There are many different methods that can be used for feature selection, i.e. identification of the useful predictors. Some of the methods include:


• filter methods

• wrapper methods (forward selection, backward elimination, recursive feature elimination)

• embedded methods

Because we do not have many exogenous variables, we decided to use an exhaustive wrapper method, i.e. to test each combination of the variables for each model separately. The best combination of features is the one with the minimal RMSE on the validation sample. This approach cannot be recommended universally because, with a higher number of observations and predictors, the process becomes computationally very expensive.
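The exhaustive wrapper method can be sketched as a loop over all subsets of the candidate features. In this illustration `validation_rmse` is a hypothetical callback that would train a model on the train sample with the given features and return its RMSE on the validation sample; here it is replaced by a toy scoring function with made-up effects.

```python
from itertools import combinations


def exhaustive_selection(features, validation_rmse):
    """Try every subset of features and keep the one with the lowest RMSE."""
    best_subset = ()
    best_score = validation_rmse(best_subset)  # model without exogenous inputs
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            score = validation_rmse(subset)
            if score < best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


def toy_rmse(subset):
    """Stand-in for model training: two features help, one hurts."""
    score = 10.0
    score -= 2.0 if "wind" in subset else 0.0
    score -= 1.0 if "consumption" in subset else 0.0
    score += 0.5 if "precipitation" in subset else 0.0
    return score


candidates = ["wind", "consumption", "precipitation"]
chosen, chosen_rmse = exhaustive_selection(candidates, toy_rmse)
```

With m candidate features the loop evaluates 2^m subsets, which is why the approach is only practical for a small number of predictors.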

The Pearson correlation coefficient is a useful technique for initial data analysis and for determining whether there is a strong relationship between variables:

$$\rho_{X,Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y} \tag{5.16}$$

5.8 Cross-validation

Rolling cross-validation is chosen to tune the hyperparameters and test the performance of the models. First, the considered time period 2014 - 2018 is split into 3 different samples: train sample (40% of the observations), validation sample (20% of the observations) and test sample (40% of the observations). The train and validation samples together are called the development sample (60% of the observations). The ratio of the split was chosen so that each sample contains only full years, in case the performance of the models differs across the seasons of a year. Table 5.3 contains a numerical overview of the split and Figure 5.2 provides a graphical illustration. Note that the number of observations in each sample is the same for each bidding area and time period.


Table 5.3: Cross-validation samples

Sample       Year         Observations (abs)   Observations (%)
Train        2014, 2015   730                  40
Validation   2016         366                  20
Test         2017, 2018   730                  40

Figure 5.2: Cross-validation samples graphics

[Timeline of the years 2014 to 2018: the train sample and the validation sample together form the development sample, followed by the test sample.]

The train sample is used to train an initial model. The validation sample is used for hyper-parameter tuning and possibly for feature selection. Once the hyperparameters and features that optimize the performance of the model in terms of the loss function on the validation sample are found through a grid search, the whole development sample is used to train the model (re-calibrate the weights). ARIMA models are fitted on the whole development sample using standard in-sample model selection and therefore the validation sample is relevant only to the machine learning methods (SVR, LSTM), where hyper-parameters need to be set first.

Figure 5.3: First phase: training

[Diagram over the years 2014 to 2018 with blocks labelled Train, Data and Val.]

The test sample is used to evaluate the performance of the final models and to compare them with each other. The evaluation is performed on a rolling-window basis. As the focus of this thesis is one-step prediction, i.e. prediction of the day-ahead electricity price (see Section 5.2 for the definition of the target), the window is always rolled forward by one (test) observation and the error of the forecast is computed. The process repeats until the test observation is the last available observation in the test sample. In our case we use an expanding window: observations are added one by one to the original development dataset, which expands as a result (old observations will have no effect, we do not have to worry about inconsistency of forecasts). Sometimes a fixed window is preferred, especially when the parameters are re-estimated after each iteration. The process is illustrated in Figure 5.4.

Figure 5.4: Second phase: development

[Diagram over the years 2014 to 2018 with blocks labelled Development, Data and Test.]
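The expanding-window evaluation can be sketched as a simple loop. Here `forecast` is a hypothetical callback that receives all observations available so far and returns a one-step-ahead prediction; the naive forecast stands in for a fitted model, and the numbers are illustrative only.

```python
import math


def expanding_window_rmse(development, test, forecast):
    """One-step-ahead evaluation with an expanding window.

    After each forecast the realized test observation is appended to
    the history, so the next forecast can use it.
    """
    history = list(development)
    squared_errors = []
    for actual in test:
        prediction = forecast(history)
        squared_errors.append((actual - prediction) ** 2)
        history.append(actual)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def naive(history):
    return history[-1]


# Forecasts are 3.0 and 4.0, so the errors are 1.0 and 2.0.
window_rmse = expanding_window_rmse([1.0, 2.0, 3.0], [4.0, 6.0], naive)
```

In this sketch the model itself is not re-estimated inside the loop; only the data available for forecasting grows.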

There are many evaluation metrics that can be used to compare the predictive power of different models. These are the most popular (the smaller the value, the better):

• Mean Absolute Error (MAE)

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \tag{5.17}$$

• Mean Squared Error (MSE)

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \tag{5.18}$$

• Root Mean Square Error (RMSE)

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \tag{5.19}$$


Apart from the mentioned metrics, the coefficient of determination is also often computed as a goodness-of-fit indicator ($R^2 = 1$ means a perfect fit):

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{5.20}$$

For parametric models, the adjusted coefficient of determination is preferred as it penalizes the model for additional parameters ($R^2$ never decreases when a parameter is added):

$$\bar{R}^2 = 1 - (1 - R^2) \frac{n - 1}{n - (k + 1)} \tag{5.21}$$

RMSE is used in this thesis both as the evaluation metric and as the loss function that is minimized (for example, hyperparameters are chosen based on the minimal RMSE on the validation sample). In comparison to MAE, RMSE penalizes larger errors more (because the errors are squared).
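The metrics (5.17) to (5.20) are short enough to write out directly. The sketch below uses hypothetical function names and a made-up example with one large error to show how differently MAE and RMSE react to it.

```python
import math


def mae(y, y_hat):
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)


def mse(y, y_hat):
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    return math.sqrt(mse(y, y_hat))


def r_squared(y, y_hat):
    y_mean = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Three perfect predictions and one error of size 2.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
example_mae = mae(actual, predicted)
example_rmse = rmse(actual, predicted)
example_r2 = r_squared(actual, predicted)
```

The single error of 2 contributes 0.5 to MAE but drives RMSE up to 1.0, which illustrates the heavier penalty on large errors.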

5.9 Diagnostic tests

Performing diagnostic tests on the in-sample model residuals is part of the Box-Jenkins methodology. According to the Gauss-Markov theorem, if the errors are uncorrelated, have equal variance (homoskedasticity) and an expected value of zero, the Ordinary Least Squares (OLS) estimates of the regression coefficients are the best linear unbiased estimates (BLUE). The assumptions do not extend to SVR and LSTM and therefore the tests are performed only for the autoregressive models:

1) Autocorrelation test: Ljung-Box test (H0: no autocorrelation)

$$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n - k} \tag{5.22}$$

Where:

$n$ Sample size
$\hat{\rho}_k$ Sample autocorrelation at lag k
$h$ Number of tested lags

2) Heteroscedasticity test: Engle's ARCH test (H0: homoskedasticity)

$$a_t^2 = \alpha_0 + \alpha_1 a_{t-1}^2 + \cdots + \alpha_p a_{t-p}^2 + e_t \tag{5.23}$$

$$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n - k} \tag{5.24}$$

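3) Normality test: Jarque-Bera test (H0: normality)

$$JB = \frac{n - k + 1}{6} \left( S^2 + \frac{1}{4}(C - 3)^2 \right) \tag{5.25}$$

$$S = \frac{\hat{\mu}_3}{\hat{\sigma}^3} \tag{5.26}$$

$$C = \frac{\hat{\mu}_4}{\hat{\sigma}^4} \tag{5.27}$$

Where:

$\hat{\mu}_3$ Estimate of the 3rd central moment
$\hat{\mu}_4$ Estimate of the 4th central moment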

The assumptions are usually tested in the order presented above. The presence of autocorrelation in the in-sample ARIMA residuals is considered the most severe defect of the model, while the normality assumption might often be relaxed (especially for samples with a large number of observations).

We first perform the diagnostic tests for residuals of simple Reg-(S)ARIMA models and based on the results decide whether there is a need for adding the GARCH component as well. This decision is made based on the ARCH test.
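The Ljung-Box statistic from (5.22) can be computed directly from the residuals, and applying the same function to squared residuals gives the statistic in (5.24). The sketch below is illustrative only: the function names and the residual series are hypothetical, and the p-value (from a chi-squared distribution with h degrees of freedom) is left to a statistical package.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denominator = sum((v - mean) ** 2 for v in x)
    numerator = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return numerator / denominator


def ljung_box_q(residuals, h):
    """Ljung-Box Q statistic over lags 1 to h."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, h + 1)
    )


# Strongly autocorrelated toy residuals: the sign flips every step.
toy_residuals = [1.0, -1.0] * 10
q_levels = ljung_box_q(toy_residuals, h=1)

# The same statistic on squared residuals checks for ARCH effects.
toy_squares = [r ** 2 for r in [0.1, 0.2, 2.0, 2.5, 0.1, 0.3, 1.9, 2.2]]
q_squares = ljung_box_q(toy_squares, h=2)
```

For the alternating series the lag-1 autocorrelation is -0.95, so Q is far above 3.841, the 5% critical value of the chi-squared distribution with one degree of freedom, and the null hypothesis of no autocorrelation would be rejected.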
