68 changes: 68 additions & 0 deletions 02_activities/assignments/a3_appendix_visualisation1_code.ipynb

Large diffs are not rendered by default.

40 changes: 40 additions & 0 deletions 02_activities/assignments/a3_visualisation1.md
@@ -0,0 +1,40 @@
Visualisation 1

> What software did you use to create your data visualisation?
For this visualisation, I used Python, specifically the pandas, seaborn, and matplotlib libraries. I chose Python because it lets me clean and analyze the data in the same place where I build the chart, which made the whole process easier to manage. The tools worked well together: pandas helped me organize the data, and seaborn and matplotlib helped me turn that cleaned data into a clear, readable bar chart showing total funding by organization.
> Who is your intended audience?
The main audience for this visualisation is anyone who wants to understand how funding for the Ontario Bridge Training Program was distributed across different organizations. This includes policy makers, nonprofit leaders, researchers, and even students like myself who are trying to learn about public funding patterns. I also considered the general public, since people may be interested in how government money is being allocated to support newcomer programs.
> What information or message are you trying to convey with your visualisation?
The goal of this visualisation is to show which organizations received the most money from the program during the 2012–2013 period. By laying out the organizations from highest funding to lowest, I wanted to make it easy to compare how much different groups received. The visualisation highlights the large differences in funding levels, which may indicate which organizations were prioritized or had larger program needs.
> What aspects of design did you consider when making your visualisation? How did you apply them? With what elements of your plots?
When designing the chart, I focused on keeping it simple, readable, and honest about what the data shows. I chose a horizontal bar chart because many organization names are long, and horizontal labels are easier to read. I sorted the bars so the biggest amounts appear at the top, which makes the ranking very easy to see. I used a blue color palette that is calm and consistent, and I formatted the x‑axis in dollars so the viewer knows right away that the values represent money. I avoided extra decorations or unnecessary labels, because I wanted the focus to stay on the funding amounts.
> How did you ensure that your data visualisations are reproducible? If the tool you used to make your data visualisation is not reproducible, how will this impact your data visualisation?
I made the visualisation reproducible by writing the entire process as a Python script that cleans the data, calculates the totals, and produces the chart automatically. This means anyone with the dataset and the script can run it again and get the same result. I also saved a separate summary file with the total funding amounts so others can verify the numbers. If someone were trying to recreate this without Python, it might take longer, but all the steps are clearly shown in the code.
> How did you ensure that your data visualisation is accessible?
To make the visualisation more accessible, I used readable fonts, a high‑contrast color palette, and a layout that works well for people using screen magnifiers. The horizontal layout also helps with readability because the labels don’t overlap. I included a descriptive title so people immediately know what the chart represents. The chart can also be paired with alt‑text, and I provided a summary CSV, which allows screen‑reader users to access the data behind the chart.
> Who are the individuals and communities who might be impacted by your visualisation?
The communities most impacted by this visualisation are immigrants and internationally trained professionals who rely on Bridge Training Program services. The organizations themselves may also be affected, because the chart shows how much funding each one received compared to others. Policy makers and government administrators might use information like this to make decisions about future funding. Because of this, it’s important that the visualisation be accurate and fair, so it doesn’t create misleading impressions about any organization.
> How did you choose which features of your chosen dataset to include or exclude from your visualisation?
I included only the parts of the dataset that were directly related to my goal: the Organization Name and the Grant Amount. I decided to leave out things like addresses, phone numbers, and websites because they didn’t help answer the question of “who received the most funding.” Keeping only the necessary columns made the chart cleaner and more focused. I also didn’t include extra program details because they didn’t affect the total funding, which is what I wanted to visualise.
> What ‘underwater labour’ contributed to your final data visualisation product?
A lot of the work behind the scenes involved cleaning and preparing the data so the final chart would be accurate. This included fixing the “Grant Amount” column, which originally had dollar signs and commas, and converting it into a real number that Python could calculate with. I also had to deal with encoding issues while loading the file, remove missing values, and test different chart sizes so the labels wouldn’t overlap.
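
The cleaning steps described above could be sketched roughly as follows. This is a minimal illustration on made-up rows, not the code used for the assignment: the helper names `read_csv_with_fallback` and `clean_grant_amount`, the list of encodings to try, the sample values, and the choice to drop rows with missing amounts are all assumptions.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in turn; latin-1 accepts any byte sequence, so it goes last."""
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc)
        except UnicodeDecodeError as err:
            last_error = err
    raise last_error


def clean_grant_amount(df, column="Grant Amount"):
    """Return a copy with `column` converted from strings like "$1,250,000" to floats."""
    out = df.copy()
    # Strip dollar signs, commas, and whitespace before converting.
    stripped = out[column].astype(str).str.replace(r"[\$,\s]", "", regex=True)
    # errors="coerce" turns anything unparseable (including missing values) into NaN.
    out[column] = pd.to_numeric(stripped, errors="coerce")
    # Drop rows where the amount is missing or could not be parsed.
    return out.dropna(subset=[column]).reset_index(drop=True)


# Hypothetical sample rows, invented for illustration only.
sample = pd.DataFrame({
    "Organization Name": ["Org A", "Org B", "Org C"],
    "Grant Amount": ["$1,250,000", None, "$300,500.50"],
})
cleaned = clean_grant_amount(sample)
```

Coercing unparseable values to NaN and then dropping them keeps the conversion from failing on a single bad row, at the cost of silently excluding that row from the totals, so it is worth checking how many rows were dropped.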
Binary file added 02_activities/assignments/a3_visualisation1.png
22 changes: 22 additions & 0 deletions 02_activities/assignments/a3_visualisation1_code.md
@@ -0,0 +1,22 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = pd.read_csv("mci_ontario_bridge_training_program.csv")

# Clean grant amount column
df['Grant Amount'] = df['Grant Amount'].replace(r'[\$,]', '', regex=True).astype(float)

# Aggregate total funding by organization
org_funding = df.groupby('Organization Name')['Grant Amount'].sum().sort_values(ascending=False)

# Plot
plt.figure(figsize=(12, 18))
sns.barplot(x=org_funding.values, y=org_funding.index, palette="Blues_r")

plt.title("Total Grant Amount by Organization (Ontario Bridge Training Program 2012–2013)", fontsize=16)
plt.xlabel("Grant Amount (CAD)", fontsize=14)
plt.ylabel("Organization", fontsize=14)
plt.tight_layout()
plt.show()
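
The write-up mentions a dollar-formatted x-axis and a saved summary file. The sketch below shows one way those two steps could look. It runs on invented totals rather than the real dataset, and the output filename `funding_summary.csv` and the formatter name `dollar_formatter` are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter

# Hypothetical totals, invented for illustration only.
org_funding = pd.Series(
    {"Org A": 1250000.0, "Org B": 300500.0},
    name="Grant Amount",
).sort_values(ascending=False)
org_funding.index.name = "Organization Name"

fig, ax = plt.subplots(figsize=(8, 3))
ax.barh(org_funding.index, org_funding.values)
ax.invert_yaxis()  # put the largest total at the top

# Render tick values as whole dollars with thousands separators, e.g. $1,250,000.
dollar_formatter = FuncFormatter(lambda value, _pos: f"${value:,.0f}")
ax.xaxis.set_major_formatter(dollar_formatter)
ax.set_xlabel("Grant Amount (CAD)")
fig.tight_layout()

# Save the aggregated totals so the plotted numbers can be checked independently.
org_funding.to_csv("funding_summary.csv")
```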