- I chose to examine what relationships, if any, existed in the following:
- 1) Examine any overall trends in countries and medals won for both Summer and Winter Olympic Games.
- 2) Determine whether countries with favourable geography and climate for a seasonal event tend to win that event.
- 3) Determine whether any trends emerged over time in medals won and in participating countries.
- The dataset is publicly available and consists of two separate .csv files, one for Olympic event participants and one for medals won, covering 1900 to 2016.
- Initial data set examination was performed with Pandas and Excel to look for general features of interest & potential problems with the data.
- Significant data cleaning and formatting was required to prepare the data for further evaluation, examples including but not limited to:
- The main tools used in exploring the data set were SQL and Pandas, with some "on-the-fly" visualizations created using Matplotlib, Pandas, Seaborn and Excel.
- I created the ERD for the data sets using MySQL, but performed the queries using PostgreSQL in pgAdmin.
- Some exploratory analyses were inconclusive and thus excluded from the final results (e.g. regression analysis using Scikit-Learn & Seaborn).
- These limitations were due primarily to the dataset itself, and I omitted inconsequential or trivial analysis results (e.g. athletes' ages).
- Individually, the former West and East Germany each won a large number of events, and this was reflected in Germany's overall standing as a leading medal winner.
- To examine the overall medals won by Germany, I also combined the medal counts for modern Germany with those of the former East and West Germany.
- Overall, a small number of the same countries (teams) consistently won the majority of medals.
- The countries that consistently won the most medals were the USA, Great Britain, the former USSR and Germany.
- Notably, combining the medals won by the former East and West Germany made it clear that Germany was one of the leaders in medals won.
- As suspected, countries whose geography and climate naturally support certain events (e.g. Winter Sports) ranked higher in those events.
- One outlier among the countries with the most medals won was Canada (Ice Hockey). This, however, also seemed to support the hypothesis of geographic / climate tendencies in seasonal event performance, as Canada was a consistent leader in this event.
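
The step of combining East, West and modern Germany before ranking countries can be sketched in Pandas. This is a minimal illustration under stated assumptions, not the code used in the project: the column names (`Team`, `Medal`), the team labels, the helper `medals_by_team` and the toy rows are all hypothetical, and it assumes non-medalist rows have an empty `Medal` value.

```python
import pandas as pd

# Hypothetical team labels; the real dataset may spell these differently.
GERMANY_ALIASES = {"East Germany": "Germany", "West Germany": "Germany"}


def medals_by_team(medals: pd.DataFrame, merge_germany: bool = True) -> pd.Series:
    """Count medals per team, optionally folding East/West Germany into Germany."""
    # Assumption: rows without a medal have a missing value in "Medal".
    won = medals.dropna(subset=["Medal"])
    teams = won["Team"]
    if merge_germany:
        teams = teams.replace(GERMANY_ALIASES)
    # value_counts returns counts per team, largest first.
    return teams.value_counts()


# Toy rows for illustration only (not real results).
sample = pd.DataFrame(
    {
        "Team": ["USA", "USA", "USA", "East Germany", "West Germany",
                 "Germany", "Canada", "Canada"],
        "Medal": ["Gold", "Silver", "Gold", "Gold", "Bronze",
                  "Silver", "Gold", None],
    }
)

separate = medals_by_team(sample, merge_germany=False)
combined = medals_by_team(sample)
print(combined)
```

Comparing `separate` with `combined` shows how the three German teams, each small on its own in this toy data, add up to a single larger total once merged.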