4. Case study

4.4 Data visualizations

4.4.1 Cases and deaths

The code for the following visualizations can be found in the Jupyter Notebook titled Visualizations of the data (cases and deaths). They are split into those relating to the world and those relating to Europe. The ones we found most interesting are shown below; the rest can be seen in the same Notebook.
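As a rough illustration of how a ranking of this kind can be prepared before plotting, the following sketch selects the countries with the highest value of one metric and keeps a second variable for comparison. The column names (`location`, `total_cases`, `life_expectancy`) and the helper `top_countries` are assumptions made for this example, and the values are placeholders rather than real figures; the code in the Notebook may differ.

```python
import pandas as pd


def top_countries(df, metric, extra, n=10):
    """Return the n rows with the largest `metric`, keeping `extra` for comparison."""
    return (
        df[["location", metric, extra]]
        .dropna(subset=[metric])      # countries without the metric cannot be ranked
        .nlargest(n, metric)          # sort descending by the metric and keep n rows
        .reset_index(drop=True)
    )


# Placeholder values for illustration only, not real figures.
sample = pd.DataFrame({
    "location": ["A", "B", "C", "D"],
    "total_cases": [500, 1200, None, 800],
    "life_expectancy": [78.1, 69.4, 81.0, 75.3],
})

top = top_countries(sample, "total_cases", "life_expectancy", n=3)
print(top)
```

The resulting frame can then be passed to a plotting call such as `top.plot.bar(x="location", y="total_cases")`, with the second variable shown on the bars or on a secondary axis.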

In the figure below we look at the countries with the highest absolute number of cases in relation to the life expectancy in the given country. No pattern emerges, but the United States, India and Brazil are among the ten most populous countries in the world, which makes their high absolute case counts understandable.

Figure 4.6: World: countries with most cases (total cases per country). Date: 07.10.2020.

Figure 4.7: World: countries with most cases (total cases per million). Date: 07.10.2020.

The figure above shows that all countries with the highest number of cases per million at the time have a low number of hospital beds per thousand, with the exception of the Czech Republic.

This has been one of the biggest issues during the pandemic. Scientists have been warning about the dangers of a potential pandemic, and one of the recommendations has always been to improve the structure of medical institutions by increasing the number of hospital beds, the stock of material with a long shelf life and the number of medical personnel. Had such a threat been taken more seriously in advance, there could have been less stress about where to place all of the patients. Tracking the number of beds and their capacity can also help countries sign agreements to help each other by transferring patients and providing medical help in their hospitals.

The figure below shows that all of the countries with the most total deaths per million have a low GDP per capita, with the exception of the United States and San Marino. Low GDP per capita is typical of poorer, less developed countries, which also tend to have inadequate infrastructure. This may indicate that these countries lacked a well-funded healthcare system, which left them unable to hospitalize everyone who needed it. One possible explanation for San Marino is that it is a small enclave of about 30,000 citizens in a severely hit part of Italy; its older population could have contributed slightly as well.

Figure 4.8: World: countries with most deaths (total deaths per million). Date: 07.10.2020.

One important thing to note for the figure below is that the Vatican has 0 for the human_development_index variable due to the specificity of the country, but it is widely accepted that if the human development index were calculated for the Vatican, it would be close to 1. This would mean that all of the countries with the most total cases per million have a very high human development index. This could also indicate that there is more movement in these countries and therefore an easier spread of the virus.

The three following figures were added for better understanding and easier comparison with the figures created for Europe.

Figure 4.9: World - Countries with most deaths - Total deaths per million per country. Date: 07.10.2020.

Figure 4.10: World - Countries with most cases - Total cases per country. Date: 07.10.2020.

Figure 4.11: World - Countries with most cases - Total cases per million per country. Date: 07.10.2020.

Figure 4.12: Europe: countries with most cases (total cases per million). Date: 07.10.2020.

Figure 4.13: Europe: countries with most deaths (total deaths per million). Date: 07.10.2020.

4.4.2 Testing

Testing is very important when fighting a pandemic. First of all, we want to make sure our tests do not produce many false negatives, since this would mean that infectious people could go on with their lives and infect others. Second of all, there has to be a proper testing strategy that catches as many positive cases as possible without overloading the system through a lack of manpower or materials.

The case fatality rate (CFR) was explained in the chapter Comparison with other pandemics. Interpreting it correctly is another reason why it is important to know not only the total testing numbers but also the testing strategies of different countries. An example of how CFR can be misleading without sufficient testing data follows. At the time of writing, San Marino has the most total deaths per million and a CFR of 1.8%, whereas Bosnia and Herzegovina is currently 8th for the most total deaths per million but has a CFR of 3.8%.

This is directly impacted by the number of tests. San Marino has performed 1.5 million tests per 1 million residents and recorded 132k cases per million, which is roughly 11 tests per case. Bosnia and Herzegovina has performed 234k tests per 1 million residents and recorded 49k cases per million, which is fewer than 5 tests per case.

In the figures below we see that the countries with the most total tests per million are predominantly countries with a higher human development index. This is not surprising, as those are the countries that typically try to provide the healthiest and longest lives to their citizens.

Figure 4.14: Europe: countries with most tests (total tests per country). Date: 07.10.2020.

Figure 4.15: Europe: countries with most tests (total tests per million). Date: 07.10.2020.

Figure 4.16: Europe: countries with the highest positive rate of tests. Date: 07.10.2020.

The figure above could be very important. These countries have the highest positive rates, meaning the highest percentage of positive results among the people they test for COVID-19. This could mean that they are tracing contacts well and targeting the people who need to be tested, thereby enabling faster containment of the spread. A bigger picture is necessary, however, since a high positive rate can also mean that a country tests only when it is nearly certain that someone is positive and is therefore not testing enough, which allows the virus to spread faster.

5. Modelling

We will perform a cluster analysis for both the European countries and the entire world. The process and the code are very similar in both cases, so we explain them in more detail only the first time the code is shown; later it is reused with minor modifications, and any bigger modifications are shown as well. Our data is already cleaned and prepared, but in order to use the k-means algorithm we have to remove the categorical variables. To create a clustering model we have to pick the number of clusters, for which we use the elbow method. To get more insights from the data, we look at different granularity levels and experiment with different numbers of clusters. We visualize the clusters using principal component analysis and show the countries on a world map.
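The steps outlined above can be sketched as follows. This is a minimal illustration on synthetic data, not the code used in the thesis: it assumes scikit-learn and pandas, the column names (`indicator_1` to `indicator_4`, `continent`) are hypothetical, and the feature standardization step is an added assumption that is common for distance-based methods such as k-means.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the country indicators: three well-separated groups
# of 20 "countries" with 4 numeric features each (not real data).
rng = np.random.default_rng(0)
values = np.vstack([
    rng.normal(loc=center, scale=0.5, size=(20, 4))
    for center in (0.0, 5.0, 10.0)
])
features = pd.DataFrame(values, columns=[f"indicator_{i}" for i in range(1, 5)])
features["continent"] = "placeholder"  # hypothetical categorical column

# k-means needs numeric input, so the categorical columns are dropped.
numeric = features.select_dtypes(include="number")

# Assumption: put all features on a common scale before computing distances.
X_scaled = StandardScaler().fit_transform(numeric)

# Elbow method: record the within-cluster sum of squares for several k
# and look for the point where the decrease levels off.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X_scaled)
    inertias[k] = model.inertia_

# Fit the final model with the chosen k (3 for this synthetic data).
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Project to two principal components so the clusters can be drawn in 2D.
coords = PCA(n_components=2).fit_transform(X_scaled)
```

Plotting `inertias` against k gives the elbow curve, and a scatter plot of `coords` colored by `labels` gives the two-dimensional view of the clusters.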