20 changes: 20 additions & 0 deletions 02_activities/assignments/a3_visualisation1.md
@@ -0,0 +1,20 @@
Visualisation 1

> What software did you use to create your data visualization?

For this visualization, I used Python with the pandas, seaborn, and matplotlib libraries. I chose Python because it lets me clean and analyze the data in the same place where I build the chart, which made the whole process easier to manage. pandas organized the data, and seaborn and matplotlib turned the cleaned data into a readable bar chart of total funding by organization.

> Who is your intended audience?

The main audience is anyone who wants to understand how funding for the Ontario Bridge Training Program was distributed across organizations: policy makers, nonprofit leaders, researchers, and students like me who are learning about public funding patterns. I also considered the general public, since people may be interested in how government money is allocated to support newcomer programs.

> What information or message are you trying to convey with your visualization?

The goal is to show which organizations received the most money from the program during the 2012–2013 period. Ordering the organizations from highest to lowest funding makes it easy to compare how much each received. The chart highlights the large differences in funding levels, which may indicate which organizations were prioritized or had larger program needs.

> What aspects of design did you consider when making your visualization? How did you apply them? With what elements of your plots?

I focused on keeping the chart simple, readable, and honest about what the data shows. I chose a horizontal bar chart because many organization names are long, and horizontal labels are easier to read. I sorted the bars so the largest amounts appear at the top, which makes the ranking easy to see. I used a calm, consistent blue palette and formatted the x-axis in dollars so the viewer knows right away that the values represent money. I avoided decorations and unnecessary labels so the focus stays on the funding amounts.
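
The dollar-formatted axis and top-down ordering described above can be sketched roughly as follows. This is only an illustration: the organization names and totals are made-up placeholders rather than values from the dataset, and `StrMethodFormatter` is one of several ways matplotlib offers to format tick labels.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Placeholder data (hypothetical), already sorted from highest to lowest
organizations = ["Org A", "Org B", "Org C"]
totals = [1_250_000, 800_000, 300_500]

fig, ax = plt.subplots(figsize=(8, 3))
ax.barh(organizations, totals, color="steelblue")
ax.invert_yaxis()  # put the first (largest) bar at the top

# Show tick labels as whole dollars with thousands separators
dollar_formatter = StrMethodFormatter("${x:,.0f}")
ax.xaxis.set_major_formatter(dollar_formatter)
ax.set_xlabel("Grant Amount (CAD)")
fig.tight_layout()
```

Calling `fig.savefig(...)` afterwards would write the chart to an image file.
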
> How did you ensure that your data visualizations are reproducible? If the tool you used to make your data visualization is not reproducible, how will this impact your data visualization?

I made the visualization reproducible by writing the entire process as a Python script that cleans the data, calculates the totals, and produces the chart automatically. Anyone with the dataset and the script can run it again and get the same result. I also saved a separate summary file with the total funding amounts so others can verify the numbers. Recreating the chart without Python might take longer, but every step is shown in the code.
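
A minimal sketch of how such a summary file could be produced is shown here. The helper name `summarise_funding`, the sample rows, and the output filename `funding_summary.csv` are assumptions for illustration; only the column names come from the dataset.

```python
import tempfile
from pathlib import Path

import pandas as pd


def summarise_funding(df: pd.DataFrame) -> pd.DataFrame:
    """Return total grant amount per organization, largest first."""
    return (
        df.groupby("Organization Name", as_index=False)["Grant Amount"]
        .sum()
        .sort_values("Grant Amount", ascending=False, ignore_index=True)
    )


# Placeholder rows (hypothetical), standing in for the cleaned dataset
sample = pd.DataFrame({
    "Organization Name": ["Org A", "Org B", "Org A"],
    "Grant Amount": [100.0, 250.0, 200.0],
})
summary = summarise_funding(sample)

# Hypothetical output location; written to the temp directory to avoid clutter
summary_path = Path(tempfile.gettempdir()) / "funding_summary.csv"
summary.to_csv(summary_path, index=False)

# Reading the file back lets anyone check the totals without rerunning the chart
reloaded = pd.read_csv(summary_path)
```
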
> How did you ensure that your data visualization is accessible?

To make the visualization more accessible, I used readable fonts, a high-contrast color palette, and a layout that works well for people using screen magnifiers. The horizontal layout also helps readability because the labels don’t overlap. I included a descriptive title so people immediately know what the chart represents. The chart can also be paired with alt text, and I provided a summary CSV so screen-reader users can access the data behind the chart.

> Who are the individuals and communities who might be impacted by your visualization?

The communities most impacted by this visualization are immigrants and internationally trained professionals who rely on Bridge Training Program services. The organizations themselves may also be affected, because the chart shows how much funding each one received compared to others. Policy makers and government administrators might use information like this to make decisions about future funding. For these reasons, the visualization needs to be accurate and fair so it doesn’t create a misleading impression of any organization.

> How did you choose which features of your chosen dataset to include or exclude from your visualization?

I included only the columns directly related to my goal: Organization Name and Grant Amount. I left out addresses, phone numbers, and websites because they didn’t help answer the question of “who received the most funding.” Keeping only the necessary columns made the chart cleaner and more focused. I also left out extra program details because they didn’t affect the total funding, which is what I wanted to visualize.

> What ‘underwater labour’ contributed to your final data visualization product?

Much of the behind-the-scenes work involved cleaning and preparing the data so the final chart would be accurate. This included fixing the “Grant Amount” column, which originally contained dollar signs and commas, and converting it into a numeric type that Python could calculate with. I also had to deal with encoding issues while loading the file, remove missing values, and test different chart sizes so the labels wouldn’t overlap.
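
The currency cleanup and missing-value handling described above might look roughly like the sketch below. The helper name `clean_grant_amount` and the sample rows are hypothetical, and coercing unparseable entries to missing values is one possible choice rather than the only one.

```python
import pandas as pd


def clean_grant_amount(series: pd.Series) -> pd.Series:
    """Strip dollar signs, commas, and whitespace, then convert to numbers."""
    stripped = series.astype(str).str.replace(r"[\$,\s]", "", regex=True)
    # Entries that still cannot be parsed become NaN instead of raising an error
    return pd.to_numeric(stripped, errors="coerce")


# Placeholder rows (hypothetical) showing the kinds of values being cleaned
raw = pd.DataFrame({
    "Organization Name": ["Org A", "Org B", "Org C"],
    "Grant Amount": ["$1,250,000", "$300,500.50", None],
})
raw["Grant Amount"] = clean_grant_amount(raw["Grant Amount"])

# Drop rows where no usable amount remains
cleaned = raw.dropna(subset=["Grant Amount"])
```
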
22 changes: 22 additions & 0 deletions 02_activities/assignments/a3_visualisation1_code.md
@@ -0,0 +1,22 @@
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load dataset
df = pd.read_csv("mci_ontario_bridge_training_program.csv")

# Clean grant amount column
# A raw string avoids Python's invalid-escape warning for "\$" in the regex
df['Grant Amount'] = df['Grant Amount'].replace(r'[\$,]', '', regex=True).astype(float)

# Aggregate total funding by organization
org_funding = df.groupby('Organization Name')['Grant Amount'].sum().sort_values(ascending=False)

# Plot
plt.figure(figsize=(12, 18))
sns.barplot(x=org_funding.values, y=org_funding.index, palette="Blues_r")

plt.title("Total Grant Amount by Organization (Ontario Bridge Training Program 2012–2013)", fontsize=16)
plt.xlabel("Grant Amount (CAD)", fontsize=14)
plt.ylabel("Organization", fontsize=14)
plt.tight_layout()
plt.show()
20 changes: 20 additions & 0 deletions 02_activities/assignments/a3_visualisation2.md
@@ -0,0 +1,20 @@
Visualisation 2

> What software did you use to create your data visualization?

For this visualization, I used Microsoft Excel. I cleaned the data by removing commas and symbols from the Grant Amount column so the numbers would calculate properly, then used a PivotTable to total the funding for each city. I built a bar chart directly in Excel and formatted it to be easy to read. I exported the final chart as a PNG and saved the cleaned data and city totals in an Excel workbook so everything is organized and easy to reuse.

> Who is your intended audience?

My intended audience includes program administrators and government staff who need a quick understanding of how funding was distributed across cities in Ontario. It is also meant for organizations that want to compare the funding in their city with others. Students or members of the public who want a general idea of where most of the money went can also read this chart easily.

> What information or message are you trying to convey with your visualization?

The purpose of this visualization is to show which cities received the most grant funding from the Bridge Training Program in 2012–2013. The chart makes it clear that Toronto received the highest amount of funding, followed by Ottawa and North York. Sorting the bars from highest to lowest makes it easy to compare funding levels across cities.

> What aspects of design did you consider when making your visualization? How did you apply them? With what elements of your plots?

I wanted the chart to be simple and easy to read. I chose a horizontal bar chart because it works well when city names are different lengths. I sorted the cities from the highest total grant amount to the lowest so the message is clear right away. I used a single dark-blue color for the bars so the chart looks neat and not overwhelming. I also formatted the numbers as currency and made sure the title and labels were easy to understand. I removed extra elements, such as unnecessary legends, to keep the focus on the data.

> How did you ensure that your data visualizations are reproducible? If the tool you used to make your data visualization is not reproducible, how will this impact your data visualization?

The visualization is reproducible because everything was done in Excel in a clear, step-by-step way. The Excel file contains both the cleaned dataset and the PivotTable that shows the total funding by city. Anyone who opens the file can see exactly how the numbers were calculated and can recreate the chart using the same steps. Even someone without much Excel experience could follow the process, because all the data cleaning and formulas are visible in the workbook.
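
One way to cross-check the PivotTable totals outside Excel is a short pandas sketch like the one below. This is not how the chart was made: the column name `City` and the sample rows are assumptions for illustration, and `pivot_table` simply mirrors the sum-by-city step.

```python
import pandas as pd

# Placeholder rows (hypothetical), standing in for the cleaned workbook data
sample = pd.DataFrame({
    "City": ["Toronto", "Ottawa", "Toronto", "North York"],
    "Grant Amount": [500.0, 300.0, 250.0, 200.0],
})

# Sum grant amounts per city, then order from highest to lowest total
city_totals = (
    sample.pivot_table(index="City", values="Grant Amount", aggfunc="sum")
    .sort_values("Grant Amount", ascending=False)
)
```

If the totals match the PivotTable, the Excel steps were followed consistently.
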
> How did you ensure that your data visualization is accessible?

I made the visualization more accessible by using clear labels, a readable font size, and high-contrast colors. The horizontal layout keeps the city names readable without overlapping. Alt text can also be added to the chart in Excel, which is useful for people who use screen readers. In addition to the chart, the data is available as a table in the same Excel file, which makes it easier for people who rely on assistive technology to access the information.

> Who are the individuals and communities who might be impacted by your visualization?

This visualization affects communities that rely on newcomer settlement and training programs, because it shows how much funding their city received. It may also affect the organizations that deliver these services, since their city’s funding level is shown in comparison to others. Policymakers might use this kind of chart to decide where to invest more funding in the future. Because it deals with resource distribution, it could shape people’s opinions about fairness or priorities, so accuracy is important.

> How did you choose which features of your chosen dataset to include or exclude from your visualization?

I included only the city and the total grant amount because these were the most relevant for showing where funding was concentrated. I left organization names and addresses out of the chart because they would make it too crowded and weren’t necessary to understand the city-level distribution.

> What ‘underwater labour’ contributed to your final data visualization product?

A lot of behind-the-scenes work went into preparing the data before making the chart. I had to clean the Grant Amount column by removing dollar signs and commas so Excel could calculate properly. I also checked for missing or incorrect entries and made sure the PivotTable was set up correctly to total the amounts by city. I experimented with different chart formats in Excel before choosing the clearest one.
28 changes: 17 additions & 11 deletions 02_activities/assignments/assignment_2.md
@@ -9,26 +9,32 @@
- You can find data visualizations at https://public.tableau.com/app/discover or https://datavizproject.com/, or anywhere else you like!
- For each visualization (good and bad):
- Explain (with reference to material covered up to date, along with readings and other scholarly sources, as needed) why you classified that visualization the way you did.
- How could this data visualization have been improved?

- Good Example
```
A good example of data visualization is the Stock Market Trends chart on Tableau Public. It tracks daily price movements and trading volume for a company’s stock, and it works well because it is clear, accurate, and easy to understand. The data is tied directly to actual market performance, and the transparent axes and scales make it trustworthy. Line charts and volume bars are familiar to most viewers, which keeps cognitive load low and lets people see volatility and trends quickly (Sweller, 1994). The layout is clean, the colors are simple, and interactive features such as hovering for details make the visualization engaging without being overwhelming. Because financial data is inherently temporal, a time-series line chart is an appropriate choice that aligns with best practices for showing change over time. This reflects Tufte’s (1997) principle that good design should focus on clarity and integrity rather than unnecessary decoration.

The visualization could be improved by adding annotations that highlight major events, so viewers can connect data trends to real-world causes. Accessibility could also be enhanced by using color palettes like Viridis, which are designed to be distinguishable for people with colorblindness (Lundgard & Satyanarayan, 2022). Finally, smoothing or aggregating the trading volume bars would reduce visual noise and make the chart easier to interpret. Overall, this visualization demonstrates how effective design can support decision-making and communication in finance.

References
Lundgard, A., & Satyanarayan, A. (2022). Accessible Visualization Practices. IEEE Computer Graphics and Applications.
Sweller, J. (1994). Cognitive Load Theory, Learning Difficulty, and Instructional Design. Learning and Instruction, 4(4), 295–312.
Tufte, E. R. (1997). Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press.
```
- How could this data visualization have been improved?
- Bad Example
```
A 3D bar chart from the Data Viz Project is a clear example of a bad visualization. The main problem is that the 3D effect distorts the data and makes values hard to compare accurately. Bars at the back look smaller, or are hidden, even when they represent larger numbers, which misleads the viewer. People have to work harder to interpret the chart, which increases what Sweller (1994) calls extraneous cognitive load. The design also adds clutter that distracts from the data instead of explaining it, violating Tufte’s principle of avoiding “chartjunk.” While the data itself may be correct, the way it is shown undermines substantive quality by exaggerating or minimizing differences. As Cairo (2016) and Few (2009) argue, 3D charts often obscure rather than clarify information. A simple 2D bar chart would have been much clearer and more effective.

To improve this visualization, the chart should be flattened into 2D so comparisons are accurate. Adding clear numeric labels would reduce reliance on visual estimation, and using consistent colors with good contrast would improve accessibility. Including proper axis titles, legends, and source citations would also reinforce transparency. By removing the 3D effects and focusing on clarity, the visualization could shift from being misleading to serving as a reliable tool for communication.

References
Cairo, A. (2016). The Truthful Art: Data, Charts, and Maps for Communication. New Riders.
Few, S. (2009). Now You See It: Simple Visualization Techniques for Quantitative Analysis. Analytics Press.
Sweller, J. (1994). Cognitive Load Theory, Learning Difficulty, and Instructional Design. Learning and Instruction, 4(4), 295–312.
Tufte, E. R. (1997). Visual Explanations: Images and Quantities, Evidence and Narrative. Graphics Press.
```
- Word count should not exceed (as a maximum) 500 words for each visualization (i.e. 300 words for your good example and 500 for your bad example)
6 changes: 4 additions & 2 deletions 02_activities/assignments/assignment_3.md
@@ -4,7 +4,9 @@

### Requirements:
- We will finish this class by giving you the chance to use what you have learned in a practical context, by creating data visualizations from raw data.
- Choose a dataset of interest from the [City of Toronto’s Open Data Portal](https://www.toronto.ca/city-government/data-research-maps/open-data/) or [Ontario’s Open Data Catalogue](https://data.ontario.ca/).
- Using Python and one other data visualization software (Excel or free alternative, Tableau Public, any other tool you prefer), create two distinct visualizations from your dataset of choice.
- For each visualization, describe and justify:
> What software did you use to create your data visualization?
@@ -27,7 +29,7 @@

- This assignment is intentionally open-ended - you are free to create static or dynamic data visualizations, maps, or whatever form of data visualization you think best communicates your information to your audience of choice!
- Total word count should not exceed **(as a maximum) 1000 words**

### Why am I doing this assignment?:
- This ongoing assignment ensures active participation in the course, and assesses the learning outcomes:
* Create and customize data visualizations from start to finish in Python