26 changes: 26 additions & 0 deletions README.md
@@ -0,0 +1,26 @@
# Assignment 2
## Exploring the Trends in COVID-19 Cases and COVID-19-Related Deaths in 2020

#### Data Selection
For this assignment, the dataset used was obtained from [Kaggle](https://www.kaggle.com/sudalairajkumar/covid19-in-usa?select=us_states_covid19_daily.csv "Dataset").
This dataset had a usability score of 9.7.
The data describe the number of COVID-19 cases and related deaths across the United States, by state and county, in 2020.

#### Goals
The aim of this interactive application is to analyze trends in COVID-19 cases and related deaths across the United States.
Understanding regional trends can provide better insight into how COVID-19 case counts change over time.
A regional analysis can also help in making future predictions. For instance, if the number of COVID-19 cases in **state A** increased from the beginning of 2020 to the end of 2020, while the number of cases in **state B** increased for only a few months and then plateaued, this gives us insight into how well each state had the situation under control.
Moreover, identifying states that are handling the COVID-19 situation better, and that have successfully reduced or prevented increases in the number of cases, can encourage other states to adopt the policies that helped improve public health.

To help users identify trends in COVID-19 cases and related deaths, this application supports analysis of both. In each case, users can view the changes in average monthly cases/deaths across all states through an animated map. This animation highlights regions with dramatic changes. To observe those changes in detail for a particular state, users can click on the state.
This generates a follow-up graph for that state, allowing users to visualize the trends in the absolute number of COVID-19 cases/deaths. Users can also view the relative changes in the number of COVID-19 cases/deaths by switching to the relative plot.
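
The monthly averages behind the animated map can be illustrated with a minimal sketch. It assumes each record carries a date string in `YYYY-MM-DD` form, a state name, and a count; the rows and the helper name `monthly_average` below are made up for illustration and are not taken from the dataset or from `src.py`, which performs the equivalent grouping with pandas.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows in the shape (date, state, cases); not real data.
rows = [
    ("2020-03-01", "State A", 2),
    ("2020-03-15", "State A", 4),
    ("2020-04-01", "State A", 10),
    ("2020-03-20", "State B", 6),
]


def monthly_average(rows):
    """Average the counts for every (state, month) pair."""
    grouped = defaultdict(list)
    for date, state, value in rows:
        month = date[:7]  # "2020-03-15" -> "2020-03"
        grouped[(state, month)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


averages = monthly_average(rows)
```

Each key of `averages` is a `(state, month)` pair, which corresponds to one state in one frame of the animation.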

#### Rationale Behind Decisions
The first aim was to allow users to analyze both COVID-19 cases and related deaths. Since the two can be visualized separately, I added the option to switch between them using a dropdown menu. This reduced the clutter on the page.

The next aim was to visualize the trends across all the states in the US. Maps are a natural way to present geographic information, so I colored each state on the map according to its number of COVID-19 cases/deaths. However, the time component still had to be integrated. Placing maps for different months side by side made it difficult to see the changes, so I decided to animate the map to make it easier to follow the changes over the months.

Once a pattern is visible on the map, the next step is to let users visualize the trend for a specific state. Since there were too many states, a dropdown menu seemed infeasible. Instead, the map was designed to be interactive, so that selecting a state on the map generates plots for that state and allows users to explore the trend further. For this visualization, I added two types of plots: absolute and relative trends. Absolute trends make it easier to see the number of cases/deaths, while relative trends make it easier to analyze the percentage change in cases/deaths.
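
The relative trend in `src.py` divides each month-to-month difference by the sum of the two neighboring values (scaled by 200), which amounts to a percentage change measured against their midpoint rather than against the earlier month alone. A minimal sketch of that calculation follows; the function name `symmetric_percent_change` and the sample values are hypothetical, and unlike `src.py` the sketch does not truncate the result to an integer.

```python
def symmetric_percent_change(values, eps=1e-7):
    """Percent change between consecutive values, relative to their midpoint.

    eps keeps the denominator non-zero when both values are 0.
    """
    return [
        (curr - prev) * 200 / (prev + curr + eps)
        for prev, curr in zip(values, values[1:])
    ]


# Made-up monthly averages for one state; not real data.
monthly_avg = [10.0, 30.0, 30.0]
changes = symmetric_percent_change(monthly_avg)
```

For non-negative inputs this measure stays between -200% and 200%, so a jump from a near-zero month does not dominate the plot the way an ordinary percentage increase would.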

#### Development Process
I did the project by myself. It took me an hour to select the dataset. Once I had that, creating the map plots was challenging: it took me 5 hours to install the relevant libraries and get a baseline map working. Next, making that map interactive took me over 3 hours. Once I had the pictorial components ready, it took me two hours to set up the pipeline, clean and process the data, and write the code. Most of the time was spent on learning new features of the code and implementing them. Debugging took a long time too. Altogether, I spent over 12 hours on the project.
16 changes: 16 additions & 0 deletions month_labels.py
@@ -0,0 +1,16 @@

month_labels = dict()
month_labels['January'] = '2020-01'
month_labels['February'] = '2020-02'
month_labels['March'] = '2020-03'
month_labels['April'] = '2020-04'
month_labels['May'] = '2020-05'
month_labels['June'] = '2020-06'
month_labels['July'] = '2020-07'
month_labels['August'] = '2020-08'
month_labels['September'] = '2020-09'
month_labels['October'] = '2020-10'
month_labels['November'] = '2020-11'
month_labels['December'] = '2020-12'

labels_to_months = dict(map(reversed, month_labels.items()))
8 changes: 8 additions & 0 deletions requirements.txt
@@ -0,0 +1,8 @@
geopandas==0.10.2
shapely==1.7.1
pyshp==2.1.3
plotly==5.1.0
streamlit==1.0.0
numpy==1.21.2
pandas==1.3.3
streamlit-plotly-events==0.0.6
97 changes: 97 additions & 0 deletions src.py
@@ -0,0 +1,97 @@
#!/usr/bin/env python
# coding: utf-8

import geopandas as gp
import shapely
import shapefile
import plotly
import streamlit as st
from us_state_abbr import abbrev_to_us_state, us_state_to_abbrev

import numpy as np
import pandas as pd
import plotly.express as px
from plotly.figure_factory._county_choropleth import create_choropleth
from streamlit_plotly_events import plotly_events
from month_labels import month_labels, labels_to_months


data_file = "us_counties_covid19_daily.csv"
df = pd.read_csv(data_file)
df['month'] = df.date.str.extract(r'^([0-9]+-[0-9]+)-')

st.sidebar.title("Exploring 2020 COVID-19 Trends Across the U.S.")

category = st.sidebar.selectbox('Select Category', ('Cases', 'Deaths'))
category = category.lower()

title = ""
desc = ""
if category == 'cases':
    title = "COVID-19 Cases"
    desc = "Average Number of COVID-19 Cases per Month"
elif category == 'deaths':
    title = "COVID-19 related Deaths"
    desc = "Average Number of COVID-19 Related Deaths per Month"
st.title(title + " Across the U.S.")

all_plot_state_data = df[['state', 'month', category]].dropna().groupby(['state', 'month']).mean().reset_index().sort_values('month')
all_plot_state_data["Month "] = [' ' + labels_to_months[i] for i in all_plot_state_data.month]
all_states = [us_state_to_abbrev[i] for i in all_plot_state_data.state]

fig = px.choropleth(all_plot_state_data,
                    locations=all_states,
                    locationmode="USA-states",
                    color=all_plot_state_data[category],
                    animation_frame="Month ",
                    scope="usa")

fig.layout.template = None
st.plotly_chart(fig)
st.caption(desc + "\n across the United States in 2020.")


st.title("Average Number of Daily " + title)
annual_data = df[['state', category]].dropna().groupby(['state']).mean().reset_index()
all_states = [us_state_to_abbrev[i] for i in annual_data.state]

fig = px.choropleth(annual_data,
                    locations=all_states,
                    locationmode="USA-states",
                    color=annual_data[category],
                    color_continuous_scale="Emrld",
                    scope="usa")

fig.layout.template = None
selected_points = plotly_events(fig)
st.caption("Average Number of Daily " + title + " Across the United States in 2020. Select a State on the Map above to Analyze the Trends in that State.")

# Fall back to the first state until the user selects one on the map.
num = 0
if len(selected_points) > 0:
    num = selected_points[0]["pointNumber"]
state = list(annual_data['state'])[num]

st.title(title + " Across " + state)

method = st.radio("Plot Style", ("Absolute", "Relative"))
st.write('<style>div.row-widget.stRadio > div{flex-direction:row;}</style>', unsafe_allow_html=True)

state_df = all_plot_state_data[all_plot_state_data['state'] == state]

if method == "Absolute":
    state_trend_fig = px.line(x=state_df['Month '], y=state_df[category], labels=dict(y="Average Number of " + title, x="Months (in 2020)"))
    st.plotly_chart(state_trend_fig)
    st.caption(desc + "\n Across " + state + " in 2020.")

if method == "Relative":
    months = list(state_df['Month '])[1:]
    vals = list(state_df[category])
    # Change between consecutive months, as a percentage of their midpoint.
    percent_change = [int((vals[i] - vals[i-1])*200/(vals[i-1] + vals[i] + 0.0000001)) for i in range(1, len(vals))]
    state_trend_fig = px.line(x=months, y=percent_change, labels=dict(y="Percentage Increase in " + title + " (%)", x="Months (in 2020)"))
    st.plotly_chart(state_trend_fig)
    st.caption("Percentage Change in " + desc + "\n Across " + state + " in 2020.")




