21 changes: 16 additions & 5 deletions 02_activities/assignments/assignment_2.md
@@ -10,18 +10,29 @@
- For each visualization (good and bad):
- Explain (with reference to material covered up to date, along with readings and other scholarly sources, as needed) why you classified that visualization the way you did.
```
Your answer...


Example of bad visualization: https://transformative-mobility.org/multimedia/street-traffic-and-social-interaction/
While the overall message of this visualization is relatively straightforward, the specific details it aims to convey are ambiguous. It is unclear what the data represent, particularly what the blocks and the connecting lines signify. Although the lines appear to suggest social interactions, it is not evident whether the blocks represent individual people, buildings, or another unit of analysis, nor whether their shapes carry any specific meaning. The number of blocks also seems to differ between the groups, with more blocks in the lower-traffic region, which may in turn explain the greater number of connecting lines. This ambiguity is compounded by the absence of legends, labels, or axes to guide interpretation.

Additionally, the visualization provides no quantitative values, scale, or axis, making it difficult to compare differences between traffic conditions. The decorative elements, such as road textures and architectural shapes, add visual complexity without contributing meaningful context, ultimately distracting from the underlying data.

Example of good visualization: https://public.tableau.com/app/profile/whitney6892/viz/bostonrunners/Dashboard1

This visualization of the Boston Marathon presents finish times from the 2023 race as a histogram that effectively communicates the key features of the data. The x-axis represents completion time in evenly spaced intervals, while the y-axis indicates the number of participants within each time bin. The use of pictographs to represent individual runners adds a further layer of information by encoding gender, allowing viewers to observe patterns in finishing times across groups. For example, the visualization shows that the majority of runners finishing within the first three hours were male, while female runners make up a larger proportion of the overall field, and non-binary runners represent a very small fraction of participants.

The histogram format also makes it easy to identify the most common finishing times as well as the overall distribution of race completion times, including the median. The author further enhances the visualization by explicitly highlighting the average finishing time, which is not immediately apparent from the histogram alone and allows for comparison between the mean and median.


```
- How could this data visualization have been improved?
```
Your answer...
Bad visualization:
The same information could be communicated more effectively using simpler, more conventional graphics. For example, a bar chart showing the average number of friends or social acquaintances per person across different traffic conditions or geographic contexts would clearly convey the measured variables, grouping strategy, and magnitude of differences, improving both interpretability and analytical value. Furthermore, the authors should indicate the specific population sampled for each traffic group and the number of individuals surveyed to produce these averages. They could also address potential biases in the data, such as lower traffic being associated with differences in local population; for example, by indicating whether they compared communities with similar populations but different traffic conditions.


Good visualization:
Overall, this visualization requires minimal improvement, but it could benefit from a few refinements. For instance, clarifying whether finish times are rounded to the nearest minute would improve transparency. Additionally, given the large volume of data, adding gridlines could help viewers more easily estimate the number of participants within each time bin.




@@ -51,7 +62,7 @@
🚨 **Please review our [Assignment Submission Guide](https://github.com/UofT-DSI/onboarding/blob/main/onboarding_documents/submissions.md)** 🚨 for detailed instructions on how to format, branch, and submit your work. Following these guidelines is crucial for your submissions to be evaluated correctly.

### Submission Parameters:
-* Submission Due Date: `23:59 - 10/26/2025`
+* Submission Due Date: `23:59 - 01/26/2026`
* The branch name for your repo should be: `assignment-2`
* What to submit for this assignment:
* This markdown file (assignment_2.md) should be populated and should be the only change in your pull request.