BI-Dev-test

This is for prospective BI Developers interviewing at Holman.

Directions:

Fork this repo to your personal GitHub account.
Complete the data exercises below in whatever way you see fit. Your project must be shareable via a link before your second-round interview. Here are some potential implementation options:

Run the ingest.py script in the ./tables directory to produce usable parquet files to run analytics on.
Create a new Python file or notebook, and read the two parquet files into dataframes to begin working.

Transform one or both of the olympics/countries tables to facilitate a join across these tables.

Normalize the data by adding a foreign key (or keys) to one or both tables. Each key value should represent exactly one country, so that joins do not fan out into Cartesian products. (We are aware that no true key exists; an artificial key or keys will need to be produced.)
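One possible way to produce such an artificial key is sketched below. It is only an illustration under stated assumptions: the column names `team` and `country`, the `ALIASES` map, and the toy rows are hypothetical and are not taken from the actual parquet files. The idea is to reduce each country name to a canonical spelling, assign one integer per canonical name, and let pandas verify that the join is many-to-one.

```python
import unicodedata

import pandas as pd

# Hypothetical alias map: extend it after inspecting the names that fail to match.
ALIASES = {"usa": "united states", "united states of america": "united states"}


def normalize_country(name: str) -> str:
    """Lowercase, strip accents and punctuation so spelling variants compare equal."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(cleaned.split())


def add_country_key(olympics, countries, olympics_col="team", countries_col="country"):
    """Return copies of both frames with a shared integer `country_key` column."""

    def canonical(series: pd.Series) -> pd.Series:
        normalized = series.astype(str).map(normalize_country)
        return normalized.map(lambda n: ALIASES.get(n, n))

    oly, cty = olympics.copy(), countries.copy()
    oly["_canon"] = canonical(oly[olympics_col])
    cty["_canon"] = canonical(cty[countries_col])

    # The countries table must hold exactly one row per country, or joins fan out.
    if cty["_canon"].duplicated().any():
        raise ValueError("countries table has duplicate canonical names")

    names = sorted(set(oly["_canon"]) | set(cty["_canon"]))
    key_map = {name: key for key, name in enumerate(names, start=1)}
    oly["country_key"] = oly["_canon"].map(key_map)
    cty["country_key"] = cty["_canon"].map(key_map)
    return oly.drop(columns="_canon"), cty.drop(columns="_canon")


# Toy rows for illustration only (not real data).
olympics = pd.DataFrame({"team": ["USA", "Côte d'Ivoire", "United States"], "gold": [1, 0, 2]})
countries = pd.DataFrame({"country": ["United States ", "Cote d'Ivoire"], "gdp_toy": [10.0, 5.0]})

olympics_keyed, countries_keyed = add_country_key(olympics, countries)

# validate="many_to_one" makes pandas raise if the key is not unique on the right side.
joined = olympics_keyed.merge(countries_keyed, on="country_key", how="left", validate="many_to_one")
print(joined)
```

The `validate` argument is the guard against accidental Cartesian joins: the merge fails loudly instead of silently multiplying rows.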
Transform the two data objects via denormalization as you see fit to answer the questions below.
- Which country has won the most silver medals across all years of data?
- Which year did that country win the most total medals?
- Is there a correlation between Population Density and winning medals?
- Is there a correlation between GDP and winning gold medals?
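The four questions above could be approached along the following lines. This is a minimal sketch, not a reference solution: it assumes a medals table with one row per country and year and the columns `country_key`, `country`, `year`, `gold`, `silver`, `bronze`, plus a countries table keyed by `country_key`. All of these names, the `gdp_toy` metric, and the toy rows are hypothetical. The correlation is Pearson's, which only captures linear association, and countries with no medals are counted as zero.

```python
import pandas as pd


def most_silver(medals: pd.DataFrame) -> str:
    """Country with the highest silver total across all years."""
    return medals.groupby("country")["silver"].sum().idxmax()


def best_year(medals: pd.DataFrame, country: str) -> int:
    """Year in which `country` won the most medals of any colour."""
    rows = medals[medals["country"] == country]
    totals = rows.groupby("year")[["gold", "silver", "bronze"]].sum().sum(axis=1)
    return int(totals.idxmax())


def medal_correlation(medals, countries, metric, medal_cols):
    """Pearson correlation between a country-level metric and summed medal counts."""
    per_country = medals.groupby("country_key")[medal_cols].sum().sum(axis=1).rename("medals")
    joined = countries.set_index("country_key")[[metric]].join(per_country, how="left")
    joined["medals"] = joined["medals"].fillna(0)  # countries without medals count as 0
    return joined[metric].corr(joined["medals"])


# Toy rows for illustration only (not real data).
medals = pd.DataFrame({
    "country_key": [1, 1, 2, 2, 3],
    "country": ["A", "A", "B", "B", "C"],
    "year": [2000, 2004, 2000, 2004, 2000],
    "gold": [1, 1, 0, 1, 0],
    "silver": [3, 1, 1, 1, 0],
    "bronze": [0, 0, 2, 0, 1],
})
countries = pd.DataFrame({"country_key": [1, 2, 3], "gdp_toy": [300.0, 200.0, 100.0]})

top = most_silver(medals)
print(top, best_year(medals, top))
print(medal_correlation(medals, countries, "gdp_toy", ["gold"]))
```

For the population density question, the same `medal_correlation` helper could be called with a density column and all three medal columns.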
Separate the Olympics and Countries data into a star schema, as you would in a gold layer before bringing it into a Power BI semantic model.
Pick two more tables from ./datasets/countries_health and include them in the star schema, noting which columns would be used for relationships.
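As a rough illustration of the star schema step, the sketch below splits keyed data into a country dimension, a year dimension, and a medals fact table. It assumes both inputs already carry a `country_key` column; the other column names and the toy rows are hypothetical. Tables from ./datasets/countries_health are not modeled here because their columns are not described in this README; they would presumably relate to the country dimension through the same key.

```python
import pandas as pd


def build_star(olympics: pd.DataFrame, countries: pd.DataFrame) -> dict:
    """Split keyed data into dim_country, dim_year and fact_medals."""
    dim_country = countries.drop_duplicates("country_key").reset_index(drop=True)

    # One row per Olympic year, with its own surrogate key.
    dim_year = pd.DataFrame({"year": sorted(olympics["year"].unique())})
    dim_year["year_key"] = range(1, len(dim_year) + 1)

    # The fact table keeps only foreign keys and additive measures.
    fact = olympics.merge(dim_year, on="year", how="left", validate="many_to_one")
    fact_medals = fact[["country_key", "year_key", "gold", "silver", "bronze"]]

    # Referential integrity: every fact row must point at a known country.
    orphans = ~fact_medals["country_key"].isin(dim_country["country_key"])
    if orphans.any():
        raise ValueError(f"{int(orphans.sum())} fact rows have no matching country")

    return {"dim_country": dim_country, "dim_year": dim_year, "fact_medals": fact_medals}


# Toy rows for illustration only (not real data).
olympics = pd.DataFrame({
    "country_key": [1, 1, 2],
    "year": [2000, 2004, 2000],
    "gold": [1, 2, 0],
    "silver": [0, 1, 1],
    "bronze": [1, 0, 0],
})
countries = pd.DataFrame({"country_key": [1, 2], "country": ["A", "B"]})

star = build_star(olympics, countries)
for name, table in star.items():
    print(name, table.shape)
```

In a Power BI model, `country_key` and `year_key` would then be the columns used for the one-to-many relationships from each dimension to the fact table.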
