---
title: "Structure"
quote:
text: "Another response of the wizards, when faced with a new and unique situation, was to look through their libraries to see if it had ever happened before. This was…a good survival trait. It meant that in times of danger you spent the day sitting very quietly in a building with very thick walls."
author: "Terry Pratchett"
---
# What problems are we trying to solve?
- Decomposition
  - "What is a 'project'?"
- Findability
  - "Where is everything?"
- Evolution
  - "Oh great—another dataset…"
---
# When to create a project
- A dataset used by several groups in several ways
  - Has its own data collection and tidying scripts
- A one-of-a-kind analysis for an NGO
  - Data subsets, Jupyter notebooks, generated PDFs
- A software package
  - With a tutorial, documentation, and sample data
---
# Four approaches
- One repository per publication
  - Only if datasets and tools are their own projects
- One per tool
  - Only if you are comfortable creating packages
- One per team
  - But teams change over time
- One per regular meeting
---
# Standard files
## First 3 of 5
- Put these in the project's root directory
  - With or without a `.md` suffix
- `README`: brief description of project
- `CITATION`: how to cite this project
- `CONTRIBUTING`
  - How to set up for development
  - What goes where
  - Governance
---
# Standard files
## #4: License
- `LICENSE(.md)`: who can do what
- Will discuss in detail in [Sharing](./06-sharing.html)
- *Do not write your own license*
---
# Standard files
## #5: Code of Conduct
- `CONDUCT(.md)`: how participants are expected to behave
  - Signals that you want everyone to feel welcome
  - Prevents people saying "but you didn't tell me"
- Spell out the complaint and enforcement process
  - Rules mean nothing if no one knows how to apply them
- Use the Contributor Covenant
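
A quick way to see whether a project has all five standard files is a small check like the sketch below. The base names come from the slides; the function name `missing_standard_files` and its return value are hypothetical, and the check assumes each file sits in the root directory with either no suffix or a `.md` suffix.

```python
from pathlib import Path

# Base names taken from the slides; everything else here is illustrative.
STANDARD_FILES = ["README", "CITATION", "CONTRIBUTING", "LICENSE", "CONDUCT"]


def missing_standard_files(root):
    """Return the standard files absent from root, accepting an optional .md suffix."""
    root = Path(root)
    missing = []
    for name in STANDARD_FILES:
        candidates = [root / name, root / f"{name}.md"]
        if not any(path.is_file() for path in candidates):
            missing.append(name)
    return missing
```

Calling `missing_standard_files(".")` from the project root gives the project lead a list of what still needs to be written.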
---
<h1 class="project-lead">As project lead</h1>
- Choose a license
  - If your institution hasn't mandated one
- Review `README` and `CONTRIBUTING` quarterly
  - If your project has a calendar, add an entry
- Add to `CITATION` after each publication
- Define enforcement for `CONDUCT`
---
# Noble's Rules
- Choose filenames for easy wildcard matching
- Tab completion means you don't have to type them all
{% include fig img="noble-high-level.png" alt="Noble's Rules (high level)" %}
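
To illustrate what "easy wildcard matching" buys you, here is a minimal sketch using Python's standard `fnmatch` module. The filenames are hypothetical: a shared prefix and a fixed-width date are assumed so that one pattern selects a whole group.

```python
import fnmatch

# Hypothetical filenames: a shared prefix and a fixed-width date keep patterns simple.
names = [
    "survey-2021-03-01.csv",
    "survey-2021-04-01.csv",
    "survey-2022-03-01.csv",
    "notes.txt",
]

# One pattern per question: "everything from 2021" and "every March 1st file".
all_2021 = fnmatch.filter(names, "survey-2021-*.csv")
all_march = fnmatch.filter(names, "survey-*-03-01.csv")
```

The same patterns work unchanged in the shell, which is the point of choosing names this way.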
---
# Noble's Rules
## One sub-directory for each report
- Rename `ajcs-yyyy` as needed when the publication year is known
{% include fig img="noble-per-report.png" alt="Noble's Rules (per report)" %}
---
# Noble's Rules
## All runnable code together
- `bin` is a Unix convention ("binary" meant "compiled program")
{% include fig img="noble-bin-directory.png" alt="Noble's Rules (bin directory)" %}
---
# Noble's Rules
## Source for compiled programs
- `Makefile` puts compiled programs into `../bin`
{% include fig img="noble-src-directory.png" alt="Noble's Rules (src directory)" %}
---
# Noble's Rules
## Raw data
- Don't modify raw data
{% include fig img="noble-raw-data.png" alt="Noble's Rules (raw data)" %}
---
# Noble's Rules
## Generated data sets
- Only save if regenerating is expensive
{% include fig img="noble-generated-data.png" alt="Noble's Rules (generated data)" %}
---
# Noble's Rules
## Web site
- Contents may be generated from multiple sources
{% include fig img="noble-web-site.png" alt="Noble's Rules (web site)" %}
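
The layout described in the preceding slides could be created with a sketch like the one below. Only `bin` and `src` are named in the slides; `data`, `results`, and `doc` are assumed names for the raw data, generated data, and report or website directories, and `make_skeleton` is a hypothetical helper.

```python
from pathlib import Path

# Only `bin` and `src` are named in the slides; the other directory names are assumptions.
LAYOUT = ["bin", "src", "data", "results", "doc"]


def make_skeleton(root):
    """Create an empty project skeleton under root and return the created paths."""
    root = Path(root)
    created = []
    for name in LAYOUT:
        path = root / name
        # exist_ok lets the function be re-run safely on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```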
---
<h1 class="project-lead">As project lead</h1>
- Decide which results need to be in version control and which can be regenerated
- Add `.gitignore` to `results` to ignore certain files
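
One way to record that decision is to write the ignore rules into the `results` directory itself, as in this sketch. The patterns and the helper name `write_results_gitignore` are hypothetical; substitute whatever your project can cheaply regenerate.

```python
from pathlib import Path

# Hypothetical patterns: ignore outputs that are cheap to regenerate.
IGNORE_PATTERNS = ["*.tmp", "cache/"]


def write_results_gitignore(results_dir, patterns=IGNORE_PATTERNS):
    """Write a .gitignore into results_dir and return its path."""
    results_dir = Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    target = results_dir / ".gitignore"
    # One pattern per line, with a trailing newline.
    target.write_text("\n".join(patterns) + "\n")
    return target
```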
---
# Static site generators
- Separate content from presentation
- Source files are Markdown, notebooks, and code
- Extract specially-formatted comments to create docs
- Regenerate consistent web pages
- GitHub Pages (uses Jekyll)
- Blogdown (R Markdown, uses Hugo)
- Sphinx (Python)
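
To make "extract specially-formatted comments to create docs" concrete, here is a minimal sketch that pulls Python docstrings out of source code and emits Markdown. It uses only the standard `ast` module and is not how Jekyll, Hugo, or Sphinx work internally; the function name `docstrings_to_markdown` and the output format are assumptions.

```python
import ast


def docstrings_to_markdown(source):
    """Return a Markdown fragment listing each documented function in source."""
    tree = ast.parse(source)
    sections = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            doc = ast.get_docstring(node)
            if doc:  # skip undocumented functions
                sections.append(f"## {node.name}")
                sections.append(doc)
    return "\n\n".join(sections)
```

Because the documentation lives next to the code and the page is regenerated from it, the two are less likely to drift apart.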
---
<h1 class="project-lead">As project lead</h1>
- Choose a theme for your website
  - Don't create one
- Fill in the first few pages
- Set a calendar entry to check it quarterly
  - You are using the project's calendar, right?
---
<h1 class="exercise">How is your project currently organized?</h1>
1. What files go where?
1. Where and how are your datasets documented?
1. Who chose the project's license?
1. How is your project's website maintained?
---
<h1 class="exercise">How do notebooks change things?</h1>
Noble's Rules were written before computational notebooks became widespread.
1. Does it make sense to put notebooks in the project's root directory rather than in sub-directories?
1. Where should saved figures go?