---
title: "Data Storytelling 1 - Starting Out"
output: html_notebook
---
# Learning Objectives
- Learn about the `tv_shows` dataset
- Review `ggplot2` syntax
- Look at a simple visualization that we're going to improve
# RStudio Review (1 min)
- Script view
- File Viewer
# Let's look at the basic dataset first (3 min)
We're going to look at a dataset about TV shows that was collected from IMDb (the Internet Movie Database).
Run the code chunk below to load the data and view it as a table.
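```{r}
library(tidyverse)  # loads ggplot2, dplyr, and related packages
library(here)       # builds file paths relative to the project root
# Read the TV show data into a data frame
tv_shows <- read.csv(here("data/tv_shows.csv"))
# Show the data as an interactive table (requires the DT package)
DT::datatable(tv_shows)
```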
# Part 1: Your Basic Plot (5 min)
We're going to start with a basic visualization built with `ggplot2`. A ggplot has the following recipe:
```
A dataset +
A mapping of visual aesthetics to variables +
A geom(etry)
```
Remember, we use `+` to add elements to a ggplot.
```{r}
library(ggplot2)

# A dataset
ggplot(tv_shows) +
  # A mapping of visual aesthetics to variables
  aes(x = seasonNumber,
      y = av_rating,
      group = title,
      color = title) +
  # A geometry
  geom_line()
```
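If you want to try the recipe without loading the CSV, here is a minimal sketch on a tiny made-up data frame. The name `toy_shows` and its shows and ratings are invented for illustration; only the column names (`title`, `seasonNumber`, `av_rating`) are taken from the plot above.

```r
library(ggplot2)

# Hypothetical toy data: the shows and ratings are invented,
# only the column names mirror the ones used with tv_shows.
toy_shows <- data.frame(
  title = rep(c("Show A", "Show B"), each = 3),
  seasonNumber = rep(1:3, times = 2),
  av_rating = c(7.0, 7.5, 8.1, 8.2, 7.9, 7.4)
)

# Dataset + mapping of aesthetics to variables + geometry
toy_plot <- ggplot(toy_shows) +
  aes(x = seasonNumber,
      y = av_rating,
      group = title,
      color = title) +
  geom_line()

toy_plot
```

Each show gets its own line because `group = title` tells `geom_line()` which points belong together, and `color = title` gives each line its own color.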
# Discussion (5 min)
This visualization might be fine for exploring the data, but it is too cluttered to explain our findings to someone else.
Before you move on, discuss on Slack:
- What does this graph communicate?
- How could it be better?
- Are there any shows that stand out?
# What do we want to show?
We're going to highlight the shows that are `risers` in the data, and make the point that shows whose ratings increase overall tend to end early.
# Assigning our plot so we can reuse it (3 min)
Let's save our plot as an R object (a ggplot object). We'll use `<-` (the left arrow) to assign it to a variable called `my_plot`:
```{r}
my_plot <-
  ggplot(tv_shows) +
  aes(x = seasonNumber,
      y = av_rating,
      group = title,
      color = title) +
  geom_line()
```
Now, when we want to modify our plot, we can start from `my_plot` and add commands to it with `+`. More on this in the next notebook. I like to think of it as a "paper doll" approach: we are dressing our plot in different clothes.
```{r}
my_plot
```
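As a rough preview of the paper doll idea, the sketch below adds labels and a theme to a saved plot. It uses a made-up data frame (`toy_shows`, with invented shows and ratings) so it runs on its own, and the label text and the choice of `theme_minimal()` are illustrative assumptions, not necessarily what the next notebook does.

```r
library(ggplot2)

# Hypothetical toy data standing in for tv_shows (values are invented)
toy_shows <- data.frame(
  title = rep(c("Show A", "Show B"), each = 3),
  seasonNumber = rep(1:3, times = 2),
  av_rating = c(7.0, 7.5, 8.1, 8.2, 7.9, 7.4)
)

# Save the basic plot, as we did with my_plot
my_toy_plot <- ggplot(toy_shows) +
  aes(x = seasonNumber,
      y = av_rating,
      group = title,
      color = title) +
  geom_line()

# "Dress" the saved plot: + returns a new plot and leaves my_toy_plot unchanged
dressed_plot <- my_toy_plot +
  labs(title = "Average rating by season",
       x = "Season",
       y = "Average rating") +
  theme_minimal()

dressed_plot
```

Because the original object is untouched, you can try several "outfits" on the same base plot and keep only the ones you like.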
# What's next?
Open up `02-decluttering.Rmd`.