5. Practical section

5.2. Performance of the Cost Objective

5.2.4. Assumptions

A regression analysis rests on several assumptions that must be met. They matter because the conclusions drawn from the regression, and the interpretation of its results, are only reliable when the assumptions hold. Since this bachelor’s thesis uses a multiple regression analysis, four main assumptions are required. Following Newbold, Carlson, and Thorne (2012), the tests described in this section were performed to verify them.

Multicollinearity: This assumption requires that no multicollinearity exists among the independent variables. The concept means that one independent variable significantly predicts another independent variable. To test for this assumption, the variance inflation factor (VIF) is

Table 8. Variance inflation factor calculated for the year 2010

For the interpretation of the results, no multicollinearity is assumed when the VIF does not exceed 10 (The Pennsylvania State University, n.d.). As can be observed from the table, none of the values is higher than 10; therefore, the assumption is fulfilled. The same tests were performed for the years 2012 and 2014.

Variable                  VIF 2012   VIF 2014
Firm size                 4.834      4.455
Profit/loss before tax    2.854      3.145
Added Value               6.727      7.336
Economies of Scale        1.273      1.270
Skilled blue collars      1.905      1.914
Unskilled blue collars    2.104      2.137

Table 9. Variance inflation factor calculated for the years 2012 and 2014

Similarly to the results obtained for 2010, the VIF values for 2012 and 2014 are all lower than 10, so no multicollinearity is assumed. Therefore, the assumption is fulfilled.
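The VIF check can be illustrated with a minimal sketch. It is not the procedure used in this thesis. It assumes the standard identity that the VIF of each predictor equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The helper names (`pearson`, `invert`, `vif`) and the example columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity matrix on the right.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return one VIF per predictor; `columns` is a list of predictor columns."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictor columns, for illustration only (not the thesis data).
firm_size = [10, 12, 15, 11, 20, 18]
added_value = [5, 7, 8, 6, 12, 9]
scale = [1, 0, 1, 0, 0, 1]

for name, value in zip(["firm_size", "added_value", "scale"],
                       vif([firm_size, added_value, scale])):
    # Threshold of 10 as used in the text above.
    print(name, round(value, 3), "ok" if value <= 10 else "possible multicollinearity")
```

With only two predictors the result reduces to 1 / (1 - r^2), where r is their correlation, which gives a quick sanity check of the sketch.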

Homoscedasticity: This assumption states that the variance of the error term should be constant. Homoscedasticity is assumed for the statistical analysis; accordingly, the aim is to rule out the opposite, heteroscedasticity. The residuals should be randomly distributed, without trends or funnel shapes (Newbold et al., 2013). The following scatterplot of the standardized predicted values against the standardized residuals can be used to assess homoscedasticity visually. It was constructed using the dependent variable Labour Productivity for the year 2010, before its logarithmic transformation.

Figure 2. Scatterplot for regression standardized residual and regression standardized predicted value for the non-log-transformed dependent variable in the year 2010

It can be observed from the graph that the values form clusters. Therefore, homoscedasticity is not assumed in this case. For this reason, the dependent variable was transformed using the natural logarithm. The next chart was constructed using the standardized predicted values and the standardized residuals from the regression model estimating the log-transformed dependent variable for the same year.

Figure 3. Scatterplot for regression standardized residual and regression standardized predicted value after the log-transformation for the dependent variable in the year 2010

In this case, it can be observed that the values are more randomly distributed, with similar distances throughout the plot. Accordingly, homoscedasticity is assumed.

Similarly, the Labour Productivity variables for the years 2012 and 2014 were also log-transformed. The respective scatterplots are presented in Figures 4 and 5.

Figure 4. Scatterplot for regression standardized residual and regression standardized predicted value after the log-transformation for the dependent variable in the year 2012

Figure 5. Scatterplot for regression standardized residual and regression standardized predicted value after the log-transformation for the dependent variable in the year 2014

Hence, homoscedasticity is assumed for all years.
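The points behind such a residual scatterplot can be sketched as follows. This is an illustration only and makes several assumptions: a single predictor instead of the full model, plain z-score standardization of both the predicted values and the residuals (statistical packages may scale residuals by the standard error of the estimate instead), and hypothetical data and function names.

```python
import math
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def standardize(values):
    """Z-scores: subtract the mean and divide by the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def residual_plot_points(x, y):
    """Pairs (standardized predicted value, standardized residual) for a scatterplot."""
    intercept, slope = fit_line(x, y)
    predicted = [intercept + slope * a for a in x]
    residuals = [b - p for b, p in zip(y, predicted)]
    return list(zip(standardize(predicted), standardize(residuals)))


# Hypothetical data, for illustration only (not the thesis data).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 2.6, 4.4, 5.1, 9.0, 11.2, 19.5, 24.0]

raw_points = residual_plot_points(x, y)
# Same model with the natural logarithm of the dependent variable.
log_points = residual_plot_points(x, [math.log(v) for v in y])

for before, after in zip(raw_points, log_points):
    print(before, after)
```

Plotting `raw_points` and `log_points` side by side would give the kind of before-and-after comparison described above. Whether the pattern looks random remains a visual judgement.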

Autocorrelation: According to Newbold et al. (2012), autocorrelation means that the error terms of the observations are correlated. The authors further explain that the most common test for autocorrelation is the Durbin-Watson statistic. The test is interpreted as follows:

- Values between 0 and 2: indicate positive autocorrelation.

- A value of 2: indicates no autocorrelation.

- Values between 2 and 4: indicate negative autocorrelation (Newbold et al., 2012).
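The interpretation rules above follow from how the statistic is computed: the sum of squared differences between successive residuals, divided by the sum of squared residuals. A minimal sketch, with a hypothetical function name and made-up residuals:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residuals, for illustration only (not the thesis data).
alternating = [1, -1, 1, -1]   # sign flips every step: negative autocorrelation
constant = [1, 1, 1, 1]        # identical neighbours: positive autocorrelation

print(durbin_watson(alternating))  # 3.0, above 2
print(durbin_watson(constant))     # 0.0, below 2
```

Residuals with no systematic relation between neighbours give a value near 2, which is the benchmark used in the interpretation above.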

The following table summarizes the Durbin-Watson statistic before and after the log-transformation of Labour Productivity for each year.

Year   Durbin-Watson (no log-transformation)   Durbin-Watson (after log-transformation)
2010   2.082                                   2.004
2012   2.113                                   2.027
2014   2.041                                   2.000

Table 10. Durbin-Watson statistic before and after log-transformation for every year

As can be observed from the table above, the non-transformed variable showed only very low autocorrelation, with all values close to 2. Nevertheless, the log-transformation brought the statistic even closer to 2 in every year (for example, from 2.082 to 2.004 in 2010), that is, closer to zero autocorrelation. For this reason, using the log-transformed variable, it can be established that this assumption is met.

Normal distribution of the error terms: Newbold et al. (2012) state that this assumption can be assessed using a P-P plot or a frequency histogram of the standardized residuals. The figures were constructed with the values for the year 2010, before the dependent variable was log-transformed.

Figures 5 and 6. Histogram and P-P Plot before the log-transformation for the dependent variable in the year 2010

For a normal distribution, the histogram should show a bell-shaped figure. As can be observed from the histogram, the distribution is skewed to the right. Moreover, in the P-P plot, a normal distribution is indicated by the values closely following the straight line; here, however, the values diverge from the line. This reiterates why the logarithmic transformation is used. Below, the same plots are presented with the log-transformed Labour Productivity variable for the year 2010.

Figures 7 and 8. Histogram and P-P Plot after the log-transformation for the dependent variable in the year 2010

As a result of the logarithmic transformation, the histogram shows an approximately normal distribution and the values in the P-P plot closely follow the line.
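The coordinates of a normal P-P plot can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this thesis: residuals are standardized with plain z-scores, the empirical cumulative probability uses the plotting position (i + 0.5) / n (one of several conventions), and the function name and residuals are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def pp_plot_points(residuals):
    """Pairs (empirical, theoretical) of cumulative probabilities for a normal P-P plot."""
    m, s = mean(residuals), pstdev(residuals)
    z_sorted = sorted((e - m) / s for e in residuals)
    n = len(z_sorted)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    return [((i + 0.5) / n, standard_normal.cdf(z))
            for i, z in enumerate(z_sorted)]


# Hypothetical right-skewed residuals, for illustration only (not the thesis data).
skewed = [-0.9, -0.8, -0.7, -0.6, -0.4, -0.2, 0.1, 0.5, 1.2, 3.8]

points = pp_plot_points(skewed)
# The largest vertical distance from the diagonal gives a rough numeric summary.
max_gap = max(abs(empirical - theoretical) for empirical, theoretical in points)
print(round(max_gap, 3))
```

Points that lie close to the diagonal, where the empirical probability equals the theoretical one, correspond to the "values closely follow the line" reading described above.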

In conclusion, the dependent variable was log-transformed in order to fulfil the assumptions required for the multiple regression analysis. Accordingly, the models using the log-transformed Labour Productivity were chosen to perform the regression analysis.