Analyzing gold price during selected crises

In this section, I analyze the gold price during the selected crises listed in section 5.1. Each crisis is analyzed individually. I provide the Python code used for the analysis only for the first crisis, since the same code applies to every crisis that follows. However, if the procedure deviates for a later crisis, I provide code examples for that deviation.

Each crisis is analyzed in the following manner:

1. Assessing relationship between each independent variable and the dependent variable separately

This step ensures that each independent variable indeed has a linear relationship with the dependent variable and is correlated with it strongly enough to make it into the final model.

2. Assessing relationships between all independent variables

In this step, I look at the correlation matrix to determine whether any pairs of independent variables are highly correlated.

3. Calculating and evaluating the model

In this step, I will be calculating and evaluating the final model.

Figure 12: Constant maturity rates, 1-year in upper plot, 10-year in bottom plot

Source: https://fred.stlouisfed.org/series/DGS1 - (1-year), https://fred.stlouisfed.org/series/DGS10 - (10-year)

3.3.1 1st crisis

The first period to be analyzed starts in 1973. At the beginning of that year, major stock markets around the world crashed as a result of the collapse of the Bretton Woods system. Later that year, the crisis was compounded by the oil crisis, which resulted from the oil embargo proclaimed by OPEC. Overall, according to the US business cycle, the 70s recession lasted until March 1975.⁶

The period that I analyze therefore starts in January 1973 and ends in March 1975.

Selecting data according to analyzed period

Upon retrieval, the Python variables store the whole period for which the data was selected, that is, since the beginning of 1970. However, only the period between January 1973 and March 1975 is needed.

To obtain this period, it is first necessary to create two Python variables, start_date and end_date, which store the beginning and end of the desired period. After that, the .loc[] indexer is used on the dataset to select rows by label. In this case, the filtering is done by passing start_date and end_date as the bounds of the slice, as seen in Figure 13. Technically speaking, those two variables are not needed and it is possible to type the dates directly, but creating the variables is more convenient and prevents mistakes.

The same is repeated for every variable and we can proceed to the next step.
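The selection described above might look roughly like the following sketch. The series gold_full and its values are synthetic placeholders, and the exact date strings are assumptions; only start_date, end_date and .loc[] come from the text.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series standing in for the real data (hypothetical values).
index = pd.date_range("1970-01-01", "1979-12-01", freq="MS")
gold_full = pd.Series(np.linspace(35.0, 500.0, len(index)), index=index, name="Gold")

# Boundaries of the analyzed period, stored once and reused for every variable.
start_date = "1973-01-01"
end_date = "1975-03-31"

# Label-based slicing on a DatetimeIndex; both end points are inclusive.
gold = gold_full.loc[start_date:end_date]

print(gold.index.min().date(), gold.index.max().date(), len(gold))
```

Storing the boundaries in two variables means that the same slice can be applied to every other series without retyping the dates.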

1. Assessing relationship between each independent variable and dependent variable separately

Scatterplots

6 https://www.nber.org/research/data/us-business-cycle-expansions-and-contractions

Figure 13: Selecting data according to period

Source: Self-made

In this step, I try to determine whether each independent variable has a strong enough relationship with the dependent variable to even make it into the final model. Firstly, it is good to obtain scatterplots to examine the relationship visually.

Scatterplots of all the independent variables and dependent variable can be seen in Figure 14.
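A grid of this kind could be produced along the following lines. This is only a sketch: the data are random placeholders, and the 2x4 layout, figure size and use of matplotlib are assumptions rather than the settings behind Figure 14.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
names = ["SPX", "tenYear", "oneYear", "Dollar", "GDP", "WTI", "CPI"]

# Random placeholder data; the real series come from FRED and Stooq.
data = pd.DataFrame(rng.normal(size=(27, len(names))), columns=names)
data["Gold"] = rng.normal(size=27)

# One panel per independent variable, always with Gold on the vertical axis.
fig, axes = plt.subplots(2, 4, figsize=(16, 7))
for ax, name in zip(axes.ravel(), names):
    ax.scatter(data[name], data["Gold"], s=12)
    ax.set_xlabel(name)
    ax.set_ylabel("Gold")

# Seven variables on a 2x4 grid leave one unused panel, which is hidden.
axes.ravel()[-1].set_visible(False)
fig.tight_layout()
```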

Just from looking at them, a few independent variables are clearly highly correlated with the dependent variable. The most obvious one is the S&P 500 index. The variable tenYear, which is the ten-year Treasury constant maturity rate, also seems to share a high correlation with Gold. Dollar, GDP and WTI are also possible candidates, while CPI and oneYear do not appear highly correlated. It is already quite possible that these last two variables won’t make it into the final model; however, a few more steps need to be taken before reaching that conclusion.

After visually examining the relationships via scatterplots, the next step is obtaining the correlation coefficients between the dependent variable and the independent variables.

Figure 14: Gold to Independent variables scatterplots

Source: Self-Made, data taken from FRED and Stooq

Correlation table

The correlation table (Figure 15) should confirm the findings from the scatterplots.

It appears that the S&P 500 index is indeed highly (negatively) correlated with Gold. The same goes for the tenYear variable. A slightly lower, but still significant, correlation can also be observed between Gold and the independent variable GDP. The variables CPI and oneYear, which were suspected of low correlation with Gold, indeed show low correlation. In all of these cases, the correlation table confirms the observations from the scatterplots. However, while WTI was expected to be only moderately correlated with Gold, Figure 15 shows that it is in fact highly correlated.

The code I used to obtain the correlation coefficients can be seen in Figure 16. The method I used, .corr(), is called on a DataFrame object, so I first created a list of the individual datasets and then used the pd.concat() function to create the variables DataFrame. Calling .corr() on that object returns a correlation matrix, which I then altered to look like Figure 15. I also renamed the columns in line 4 for better clarity.
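The steps just described can be sketched as follows. The four series are synthetic stand-ins, and the way the matrix is reduced to a single column is an assumption about how a table like Figure 15 might be obtained; pd.concat() and .corr() are the calls named in the text.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
index = pd.date_range("1973-01-01", "1975-03-01", freq="MS")

# Synthetic placeholder series, one per variable (not the real data).
gold = pd.Series(rng.normal(size=len(index)), index=index)
spx = pd.Series(-gold + rng.normal(scale=0.3, size=len(index)), index=index)
wti = pd.Series(gold + rng.normal(scale=0.5, size=len(index)), index=index)
cpi = pd.Series(rng.normal(size=len(index)), index=index)

# Combine the individual datasets column-wise and give the columns clear names.
datasets = [gold, spx, wti, cpi]
variables = pd.concat(datasets, axis=1)
variables.columns = ["Gold", "SPX", "WTI", "CPI"]

# Full correlation matrix, then only the column for Gold without its self-correlation.
corr_matrix = variables.corr()
corr_table = corr_matrix["Gold"].drop("Gold").to_frame(name="Correlation with Gold")

print(corr_table.round(3))
```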

After these two “sub-steps”, I already have an idea of which variables will be left out of the final model; however, there is still one more thing to be done to confirm these findings.

Simple regressions for each pair

Figure 15: Correlation table, Independent variable to Dependent variables

Source: Self-made, data taken from FRED and Stooq

Figure 16: Python code for correlation

Source: Self-made

The next sub-step is to conduct a simple regression for each pair of independent and dependent variable. This is done to find out which independent variables make a good model on their own, since that usually leads to them making a good model when used together.

While it is possible to conduct the regression analysis on each variable separately, it is far easier to use a loop that extracts the desired values from each model and appends them to a list. This can be seen in Figure 18. First, the list info is created to store the values. Then I created a list of names so that the loop can append them as well for better clarity.

After that follows the actual loop, which iterates through the list X that contains the variables. For each variable, it first adds a constant column for the intercept, as the sm.OLS() function does not do that on its own, and without the intercept the R squared values would be highly inflated. Next, the fitted model is created by calling sm.OLS(), which takes the dependent variable Y and the independent variable X, and appending .fit(). The name variable is there so that the right name gets appended to the list. After that, we decide which values we wish to extract and append them, together with name, to the info list created earlier, to make clear which variable the values belong to. This creates a “list of lists”, so on the last line I converted it to a DataFrame.

Figure 17: Simple regressions values

Source: Self-made, data taken from FRED and Stooq

Figure 18: Simple regression loop

Source: Self-Made

Figure 17 shows the values for each simple regression on the dependent variable. As suspected, SPX, WTI and tenYear do indeed look like very good candidates for the final multiple regression model, since they explain a large share of the variance of the dependent variable at a significant level.

GDP appears to explain a lower amount of variance of the dependent variable, but it still seems significant enough, with a p-value of .0002. This means that it is still worth trying to put it into the model.

On the other hand, the variables CPI, oneYear and Dollar do not explain much of the variance in the dependent variable and do not appear to be significant. That means that these variables will most likely be excluded from the final model.

2. Assessing relationship between all independent variables

Next step in the process is to see what the relationships among the independent variables are.

I will again start with scatterplots and then calculate the correlation.

The figure showing the grid of scatterplots can be found in APPENDIX A, since there are 7 independent variables and the plot is quite large. Upon glancing over the scatterplots, the pairs SPX-GDP, GDP-tenYear, SPX-tenYear and oneYear-tenYear seem to be highly correlated. However, to get a precise answer, it is important to look at the correlation matrix, which can be found in Figure 19.

Upon looking at the correlation matrix, it is rather clear that there are several problematic pairs. A lot of them indeed share a high correlation, which could result in multicollinearity within the final model. Since I already determined the most likely non-redundant variables, I decided to produce a second correlation matrix, shown in Figure 20, which contains only those variables.

Figure 19: Correlation matrix for independent variables

Source: Self-made, data taken from FRED and Stooq

It is quite clear that this data set will most likely indeed have multicollinearity present. That would lead to unreliable coefficients and high p-values. For now, I will leave it at that, proceed with the final step and examine the model later to assess just how the multicollinearity affects it.
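Reading problematic pairs off a correlation matrix can also be done programmatically. The sketch below uses synthetic data, and the 0.8 cut-off is a hypothetical threshold chosen for illustration; the text does not state which value counts as a high correlation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 27
base = rng.normal(size=n)

# Synthetic placeholder predictors: SPX and tenYear share a common component.
predictors = pd.DataFrame({
    "SPX": base + rng.normal(scale=0.2, size=n),
    "tenYear": -base + rng.normal(scale=0.2, size=n),
    "WTI": rng.normal(size=n),
    "Dollar": rng.normal(size=n),
})

corr = predictors.corr()
threshold = 0.8  # hypothetical cut-off, not taken from the text

# Keep only the upper triangle (k=1 skips the diagonal) so each pair appears once.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack()

# Pairs whose absolute correlation exceeds the cut-off.
high_pairs = pairs[pairs.abs() > threshold]
print(high_pairs)
```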

3. Calculating and evaluating the model

The model with multiple independent variables is calculated the same way as the simple one; however, instead of passing just one variable as X, the list containing all desired variables is passed to the function.

Firstly, I decided to calculate the model that contains every independent variable.

Unfortunately, it is rather underperforming.

As shown in Figure 21, the model explains 95.4% of the variance in the dependent variable and the overall model is significant. However, the lower part of the figure reveals that the individual variables are nowhere near significant. This problem most likely comes from the presence of multicollinearity.

Figure 20: Correlation matrix containing non-redundant variables

Source: Self-made, data taken from FRED and Stooq

Figure 21: Full Model, 1st crisis

Source: Self-made, data taken from FRED and Stooq

So then I tried to calculate the model using the four variables that I determined to be the best in the simple regressions. However, this model not only explains less variance in the dependent variable (which is not surprising, as adding more variables never results in a lower R squared, meaning that the previous model cannot explain less variance), but it is also even less significant than the previous model.
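The remark about R squared can be checked numerically. The sketch below is a small illustration on random data, using plain least squares from NumPy instead of statsmodels; the helper r_squared and all values are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R squared of a least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(5)
n = 27

# Seven random placeholder predictors; only the first two actually drive y.
X_full = rng.normal(size=(n, 7))
y = X_full[:, 0] - 0.5 * X_full[:, 1] + rng.normal(size=n)

r2_subset = r_squared(X_full[:, :4], y)  # model with four of the variables
r2_full = r_squared(X_full, y)           # model with all seven variables

print(r2_subset, r2_full)
```

Because the four-variable model is nested in the seven-variable one, the larger model can always reproduce the smaller fit by setting the extra coefficients to zero, so its R squared cannot be lower. This is also why adjusted R squared is the fairer basis for comparing models of different size.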

After receiving very underperforming results from the previous models, I decided to try to obtain the best model by forward selection.

Luckily, I had more success with this method, as I obtained the model depicted in Figure 23. This model not only explains 94.7% of the variance in the dependent variable (with an adjusted R squared of 0.943) with high overall significance, but its coefficients are significant as well. For those reasons, I declare this the final model for this crisis. The model can be seen in (3.1).

$$\text{Price of gold} = 318.16 + 9.79\,(\text{price of WTI}) - 3.23\,(\text{strength of USD}) \tag{3.1}$$

With this, the analysis of the first crisis is done. The following crises will be covered in much less text, since I explained my steps in detail during the first analysis and the steps will be very similar for those that follow.

3.3.2 2nd crisis

The next crisis that I analyze is the early 80s recession together with the 1979 oil crisis, which was a key event leading to the recession, since the rather fast rise in oil prices pushed inflation in developed countries even higher than it was before.

The period that I analyze starts in January 1979 and ends in January 1983.

Figure 22: Model with four selected variables

Source: Self-made, data taken from FRED and Stooq

Figure 23: Model obtained by forward selection, 1st crisis

Source: Self-made, data taken from FRED and Stooq

In this case, I will not be adding any more variables.

1. Assessing relationship between each independent variable and dependent variable separately

Scatterplots

The scatterplots for pairs of independent variables and dependent variable can be seen below in Figure 24.

The scatterplots suggest a high correlation of the dependent variable with WTI and SPX. GDP also appears to be somewhat correlated; however, there seem to be some outliers that can affect the coefficient, so let's check that by calculating the correlation coefficients.

Correlation table

The correlation table of dependent variable and independent variables can be seen in Figure 25.

Figure 24: Independent variables versus dependent variables scatterplots, 2nd crisis

Source: Self-made, data taken from FRED and Stooq

Figure 25: Dependent variable and independent variables correlation table, 2nd crisis

Source: Self-made, data taken from FRED and Stooq

Upon looking at the correlation table, it is clear that the overall correlation of the independent variables with the dependent one is rather low. However, SPX and WTI are indeed at least somewhat strongly correlated with Gold. GDP, CPI, oneYear, tenYear and Dollar have very low correlation.

Simple regression for each pair

Running a simple regression for each pair indeed shows that, during this crisis, the independent variables are not that good at explaining the variance in Gold. The results can be seen in Figure 26.

Figure 26 shows not only that the individual variables do not do a good job of explaining the variance in Gold, but also that most of them do not appear significant. At the 5% level, only three of them are significant: WTI, SPX and Dollar.

2. Assessing relationship between all independent variables

The scatterplots of the pairs of independent variables can again be found in APPENDIX A, as the figure is too large to fit here.

After looking at the scatterplots, it seems that oneYear and tenYear are highly correlated, so it is practically impossible for both of them to be in the final model. The same goes for WTI and SPX and a few other pairs. However, let's see the correlation table before reaching any conclusion. The table can be seen below in Figure 27.

Figure 26: Simple regressions, 2nd crisis

Source: Self-made, data taken from FRED and Stooq

The correlation table confirms that oneYear and tenYear are highly correlated. There are more pairs with high correlation, but none is as high as this one. Let's move on to the final step now and see whether we will have to remove some highly correlated variables.

3. Calculating and evaluating the model

As in the first analysis, I first calculate the model using all of the variables. This model can be seen in Figure 28. It explains 83.7% of the variance in the dependent variable and is significant; however, most of the coefficients are not. For those reasons, I again decided to go for the model obtained with forward selection. This model can be seen in Figure 29.

Figure 27: Independent variables correlation table, 2nd crisis

Source: Self-made, data taken from FRED and Stooq

Figure 28: Model obtained by forward selection, 2nd crisis

Source: Self-made, data taken from FRED and Stooq

Figure 29: Full model, 2nd crisis

Source: Self-made, data taken from FRED and Stooq

Through forward selection, I finally obtained a model that is overall significant and also has significant coefficients. It explains 82.8% of the variance in the dependent variable.

Unfortunately, it doesn’t explain as much as the model obtained for the previous crisis, but R squared is still rather high, and the adjusted value is not much lower. The model can be seen in (3.2).

$$\text{Price of gold} = 2824.8 + 10.85\,(\text{price of WTI}) - 8.61\,(\text{strength of USD}) + 4.5\,(\text{SPX}) - 0.39\,(\text{real GDP}) \tag{3.2}$$

3.3.3 3rd crisis

The third crisis to be analyzed is the early 90s recession, which affected most Western countries. It is said to be the result of the 1990 oil price shock and the subsequent tightening of monetary policy due to fears of rising inflation.

It lasted approximately from the beginning of 1990 to the beginning of 1992, so I analyze the period from February 1990 to February 1992.

1. Assessing relationship between each independent variable and dependent variable separately

Scatterplots

As before, I start with scatterplots. The scatterplots in Figure 30 show that Gold might have some relationship with the variables SPX, oneYear, tenYear and WTI. However, to be certain, the correlation table needs to be examined.

Figure 30: Independent variables versus dependent variables scatterplots, 3rd crisis

Source: Self-made, data taken from FRED and Stooq

Correlation table

The correlation table in Figure 31 somewhat confirms the findings from the scatterplots. SPX is indeed highly correlated with Gold, while oneYear and tenYear show slightly lower correlation coefficients. The same goes for CPI; however, WTI does not seem correlated enough.

Simple regression for each pair

As suspected from the correlation table, SPX does a good job of explaining the variance in the dependent variable while also being significant. Both Treasury rates (tenYear and oneYear) explain less variance, but are still significant. GDP, CPI and WTI also appear to be significant; however, they explain a very low proportion of the variance in Gold. The worst variable in this case is Dollar, which explains almost no variance and is far from significant. The complete table of simple regressions can be found below in Figure 32.

2. Assessing relationship between all independent variables

The scatterplots for each pair of independent variables can be found in APPENDIX A. It appears that several independent variables are highly correlated, namely tenYear and oneYear, SPX and tenYear, GDP and SPX, and some others.

Figure 31: Dependent variable and independent variables correlation table, 3rd crisis

Source: Self-made, data taken from FRED and Stooq

Figure 32: Simple regressions, 3rd crisis

Source: Self-made, data taken from FRED and Stooq

The variables tenYear and oneYear are indeed highly correlated. The same goes for tenYear and SPX, and for GDP and SPX. The whole table can be seen in Figure 33.

As of right now, I suspect that the best model will be achieved by using variables SPX and CPI.

3. Calculating and evaluating the model

I will again start with the model that uses all of the variables.

This model can be seen in Figure 34. It shows that even when all variables are fed to the model, it does not explain a high proportion of the variance in Gold, and the only significant variable is SPX. It seems that the model suffers from a strong presence of multicollinearity.

Unfortunately, forward selection yielded no better result, with the model explaining only 63.6% of the variance in Gold and SPX again being the only significant variable.

Figure 33: Independent variables correlation table, 3rd crisis

Source: Self-made, data taken from FRED and Stooq

Figure 34: Full model, 3rd crisis

Source: Self-made, data taken from FRED and Stooq

For those reasons, it appears that the best overall model can be obtained by using only the SPX variable. This model can be seen in Figure 35.

However, it explains only 59.8% of the variance in the dependent variable. This probably means that more variables would need to be included on top of those that I selected.
