
Commit 5a37e45

committed
sharing and zenodo episode
1 parent cc8b564 commit 5a37e45

9 files changed

+295
-8
lines changed

content/img/ai/record-player.png

342 KB

content/img/ai/turntable.png

542 KB

content/img/license-models.png

95 KB
2.21 MB

content/img/turing-way/README.txt

+4
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
Obtained from https://zenodo.org/record/3332808.
2+
3+
When using any of the images, please credit it with
4+
"This image was created by Scriberia for The Turing Way community and is used under a CC-BY licence."

content/index.md

+4-2
Original file line numberDiff line numberDiff line change
@@ -83,7 +83,8 @@ them to own projects**.
8383
- {ref}`refactoring-concepts` (15 min)
8484

8585
- 15:00-16:30 - How to release and publish your code
86-
- {ref}`licensing-publishing` (45 min)
86+
- {ref}`software-licensing` (30 min)
87+
- {ref}`publishing` (15 min)
8788
- {ref}`packaging` (45 min)
8889

8990
- 16:45-18:00 - **Debriefing and Q&A**
@@ -120,7 +121,8 @@ testing
120121
reusable
121122
refactoring-demo
122123
refactoring-concepts
123-
licensing-publishing
124+
software-licensing
125+
publishing
124126
packaging
125127
profiling
126128
```

content/licensing-publishing.md

-6
This file was deleted.

content/publishing.md

+134
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,134 @@
1+
(publishing)=
2+
3+
# How to publish your code
4+
5+
:::{objectives}
6+
- Make our code citable and persistent.
7+
- Make our Notebook reusable and persistent.
8+
:::
9+
10+
11+
## Is putting software on GitHub/GitLab/... publishing?
12+
13+
```{figure} img/turing-way/8-fair-principles.jpg
14+
:alt: FAIR principles
15+
:width: 70%
16+
17+
FAIR principles. (c) [Scriberia](http://www.scriberia.co.uk) for [The Turing Way](https://the-turing-way.netlify.com), CC-BY.
18+
```
19+
20+
Is it enough to make the code public for the code to remain **findable and accessible**?
21+
- No: nothing prevents me from deleting my GitHub repository or
22+
rewriting the Git history, and we have no guarantee that GitHub will still be around in 10 years.
23+
- **Make your code citable and persistent**:
24+
Get a persistent identifier (PID) such as a DOI in addition to sharing the
25+
code publicly, by using [Zenodo](https://zenodo.org) or
26+
similar services.
27+
28+
29+
## How to make your software citable
30+
31+
```{discussion} Discussion (Citation-1): Explain how you currently cite software
32+
- Do you cite software that you use? How?
33+
- If I wanted to cite your code/scripts, what would I need to do?
34+
```
35+
36+
**Checklist for making a release of your software citable**:
37+
38+
- Assigned an appropriate license
39+
- Described the software using an appropriate metadata format
40+
- Clear version number
41+
- Authors credited
42+
- Procured a persistent identifier
43+
- Added a recommended citation to the software documentation
44+
45+
This checklist is adapted from: N. P. Chue Hong, A. Allen, A. Gonzalez-Beltran,
46+
et al., Software Citation Checklist for Developers (Version 0.9.0). Zenodo.
47+
2019b. ([DOI](https://doi.org/10.5281/zenodo.3482769))
48+
49+
**Our practical recommendations**:
50+
- Add a file called [CITATION.cff](https://citation-file-format.github.io/) ([example](https://github.com/bast/runtest/blob/main/CITATION.cff)).
51+
- Get a [digital object identifier
52+
(DOI)](https://en.wikipedia.org/wiki/Digital_object_identifier) for your code
53+
on [Zenodo](https://zenodo.org/) ([example](https://zenodo.org/record/8003695)).
54+
- Make it as easy as possible: clearly say what you want cited.
55+
56+
This is an example of a simple `CITATION.cff` file:
57+
```yaml
58+
cff-version: 1.2.0
59+
message: "If you use this software, please cite it as below."
60+
authors:
61+
  - family-names: Doe
62+
    given-names: Jane
63+
    orcid: https://orcid.org/1234-5678-9101-1121
64+
title: "My Research Software"
65+
version: 2.0.4
66+
doi: 10.5281/zenodo.1234
67+
date-released: 2021-08-11
68+
```
69+
70+
More about `CITATION.cff` files:
71+
- [GitHub now supports CITATION.cff files](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files)
72+
- [Web form to create, edit, and validate CITATION.cff files](https://citation-file-format.github.io/cff-initializer-javascript/)
73+
- [Video: "How to create a CITATION.cff using cffinit"](https://www.youtube.com/watch?v=zcgLIT5Qd4M)
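
For projects that already keep release metadata in a script, a `CITATION.cff` file can also be written programmatically. The following is a minimal Python sketch using only the standard library. The names `Author`, `quote`, and `render_citation_cff` are hypothetical helpers introduced for illustration, and the sketch covers only the fields shown in the example above, not the full Citation File Format schema. It quotes `version` so that a value such as `1.0` is not read as a number.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Author:
    """One entry of the CFF `authors` list (hypothetical helper type)."""
    family_names: str
    given_names: str
    orcid: str = ""  # optional, written as a full ORCID URL


def quote(value: str) -> str:
    """Return a double-quoted YAML scalar with backslashes and quotes escaped."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render_citation_cff(title, version, doi, released, authors):
    """Render a minimal CITATION.cff (cff-version 1.2.0) as a string."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        "authors:",
    ]
    for author in authors:
        lines.append(f"  - family-names: {quote(author.family_names)}")
        lines.append(f"    given-names: {quote(author.given_names)}")
        if author.orcid:
            lines.append(f"    orcid: {quote(author.orcid)}")
    lines += [
        f"title: {quote(title)}",
        f"version: {quote(version)}",
        f"doi: {doi}",
        f"date-released: {released.isoformat()}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values taken from the example file above.
    print(render_citation_cff(
        title="My Research Software",
        version="2.0.4",
        doi="10.5281/zenodo.1234",
        released=date(2021, 8, 11),
        authors=[Author("Doe", "Jane")],
    ))
```

The generated text should still be checked with the web form linked above, since this sketch does not validate its input.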
74+
75+
76+
## Papers with focus on scientific software
77+
78+
Where can I publish papers which are primarily focused on my scientific
79+
software? A great list/summary is provided in this blog post: ["In which
80+
journals should I publish my software?" (Neil P. Chue
81+
Hong)](https://www.software.ac.uk/top-tip/which-journals-should-i-publish-my-software)
82+
83+
84+
## How to cite software
85+
86+
```{admonition} Great resources
87+
- A. M. Smith, D. S. Katz, K. E. Niemeyer, and FORCE11 Software Citation
88+
Working Group, "Software citation principles," PeerJ Comput. Sci., vol. 2,
89+
no. e86, 2016 ([DOI](https://doi.org/10.7717/peerj-cs.86))
90+
- D. S. Katz, N. P. Chue Hong, T. Clark, et al., Recognizing the value of
91+
software: a software citation guide [version 2; peer review: 2 approved].
92+
F1000Research 2021, 9:1257 ([DOI](https://doi.org/10.12688/f1000research.26932.2))
93+
- N. P. Chue Hong, A. Allen, A. Gonzalez-Beltran, et al., Software Citation
94+
Checklist for Authors (Version 0.9.0). Zenodo. 2019a. ([DOI](https://doi.org/10.5281/zenodo.3479199))
95+
- N. P. Chue Hong, A. Allen, A. Gonzalez-Beltran, et al., Software Citation
96+
Checklist for Developers (Version 0.9.0). Zenodo. 2019b. ([DOI](https://doi.org/10.5281/zenodo.3482769))
97+
```
98+
99+
A software citation should ensure that the following information
100+
is provided as part of the reference (from [Katz, Chue Hong, Clark,
101+
2021](https://doi.org/10.12688/f1000research.26932.2) which also contains
102+
software citation examples):
103+
- Creator
104+
- Title
105+
- Publication venue
106+
- Date
107+
- Identifier
108+
- Version
109+
- Type
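
To make the seven elements concrete, here is a small Python sketch that assembles them into one reference string. The class `SoftwareReference` and the order and punctuation used in `format()` are assumptions chosen for illustration; the guide cited above lists which information to include, and the exact layout normally follows the style of the journal you are writing for.

```python
from dataclasses import dataclass


@dataclass
class SoftwareReference:
    """The seven recommended elements of a software citation."""
    creator: str
    title: str
    venue: str        # publication venue, e.g. the archive hosting the release
    date: str
    identifier: str   # e.g. a DOI written as a URL
    version: str
    software_type: str = "Computer software"

    def format(self) -> str:
        """Return one possible layout of the reference (illustrative only)."""
        return (
            f"{self.creator} ({self.date}). {self.title} "
            f"(Version {self.version}) [{self.software_type}]. "
            f"{self.venue}. {self.identifier}"
        )


if __name__ == "__main__":
    # Placeholder values reused from the CITATION.cff example in this episode.
    ref = SoftwareReference(
        creator="Doe, J.",
        title="My Research Software",
        venue="Zenodo",
        date="2021",
        identifier="https://doi.org/10.5281/zenodo.1234",
        version="2.0.4",
    )
    print(ref.format())
```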
110+
111+
112+
113+
## Exercise/demo
114+
115+
:::{exercise}
116+
- We will add a `CITATION.cff` file to our example repository.
117+
- We will get a DOI using the [Zenodo sandbox](https://sandbox.zenodo.org):
118+
- We will log into the [Zenodo sandbox](https://sandbox.zenodo.org) using
119+
GitHub.
120+
- We will follow [these steps](https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content)
121+
and finally create a GitHub release and get a DOI.
122+
- We will use the [Binder badge on our example repository](https://github.com/coderefinery/imgfilters) to run the Notebook
123+
in the cloud and discuss how we could make it persistent and citable.
124+
:::
125+
126+
:::{discussion}
127+
- Why did we use the Zenodo sandbox and not the "real" Zenodo for our exercise?
128+
:::
129+
130+
131+
## More resources
132+
133+
- [Social coding lesson material](https://coderefinery.github.io/social-coding/software-citation/)
134+
- [Sharing Jupyter Notebooks](https://coderefinery.github.io/jupyter/sharing/)

content/software-licensing.md

+153
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,153 @@
1+
(software-licensing)=
2+
3+
# Choosing a software license
4+
5+
:::{objectives}
6+
- Know what derivative work is and whether we can share it.
7+
- Get familiar with terminology around licensing.
8+
- Add a license to our example project.
9+
:::
10+
11+
12+
## Copyright and derivative work: Sampling/remixing
13+
14+
:::{figure} img/ai/record-player.png
15+
:alt: Generated image of a monk operating a record player
16+
:width: 50%
17+
:::
18+
[Midjourney, CC-BY-NC 4.0]
19+
20+
:::{figure} img/ai/turntable.png
21+
:alt: Generated image of a monk operating two record players
22+
:width: 50%
23+
:::
24+
[Midjourney, CC-BY-NC 4.0]
25+
26+
- Copyright controls whether and how we can distribute
27+
the original work or the **derivative work**.
28+
- In the **context of software** it is more about
29+
being able to change and **distribute changes**.
30+
- Changing and distributing software is similar to changing and distributing
31+
music
32+
- You can do almost anything if you don't distribute it
33+
34+
**Often we don't have the choice**:
35+
- We are expected to publish software
36+
- Sharing can be good insurance against being locked out
37+
38+
**Can we distribute our changes** to the research community or our future selves?
39+
40+
41+
## Why software licenses matter
42+
43+
You find some great code that you want to reuse for your own publication.
44+
45+
- This is good for the original author - you will cite them. Maybe other people who cite you will cite them.
46+
47+
- You modify and remix the code.
48+
49+
- Two years later ... ⌛
50+
51+
- Time to publish: You realize **the original work has no license** 😱
52+
53+
**Now we have a problem**:
54+
- 😬 "Best" case: You manage to publish the paper without the software/data.
55+
Others cannot build on your software and data.
56+
- 😱 Worst case: You cannot publish it at all.
57+
The journal requires that papers come with data and software so that they are reproducible.
58+
59+
60+
## Taxonomy of software licenses
61+
62+
:::{figure} img/license-models.png
63+
:alt: "European Union Public Licence (EUPL): guidelines July 2021"
64+
65+
European Commission, Directorate-General for Informatics, Schmitz, P., European Union Public Licence (EUPL): guidelines July 2021, Publications Office, 2021, <https://data.europa.eu/doi/10.2799/77160>
66+
:::
67+
68+
Comments:
69+
- Arrows represent compatibility (A -> B: B can reuse A)
70+
- Proprietary/custom: Derivative work typically not possible (no arrow goes from proprietary to open)
71+
- Permissive: Derivative work does not have to be shared
72+
- Copyleft/reciprocal: Derivative work must be made available under the same license terms
73+
- NC (non-commercial) and ND (no derivatives) exist for data licenses but not really for software licenses
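
The arrows described in the comments above can be read as a small directed graph. The following Python sketch is a deliberate simplification: the three category names and the `CAN_FLOW_INTO` table are assumptions made for illustration, not a transcription of the figure, and real compatibility depends on the specific licenses involved (two copyleft licenses can be incompatible with each other).

```python
# Edge "A -> B" means: a work under B can reuse code under A.
# Simplified, assumed categories; not a complete model of the figure.
CAN_FLOW_INTO = {
    "permissive": {"permissive", "copyleft", "proprietary"},
    "copyleft": {"copyleft"},   # derivative work keeps the same terms
    "proprietary": set(),       # no arrow leaves proprietary code
}


def can_reuse(source: str, target: str) -> bool:
    """Return True if a work in category `target` can reuse `source` code."""
    return target in CAN_FLOW_INTO[source]


def categories_for_combined_work(sources):
    """Return the categories a work combining all `sources` could use."""
    allowed = set(CAN_FLOW_INTO)  # start from every known category
    for source in sources:
        allowed &= CAN_FLOW_INTO[source]
    return allowed


if __name__ == "__main__":
    print(can_reuse("permissive", "copyleft"))
    print(can_reuse("copyleft", "permissive"))
    print(categories_for_combined_work(["permissive", "copyleft"]))
```

Intersecting the allowed targets mirrors the rule of thumb used later in the exercises: once copyleft code is part of the mix, the combined work ends up in the copyleft category.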
74+
75+
**Great resource for comparing software licenses**: [Joinup Licensing Assistant](https://joinup.ec.europa.eu/collection/eupl/solution/joinup-licensing-assistant/jla-find-and-compare-software-licenses)
76+
- Provides comments on licenses
77+
- Easy to compare licenses ([example](https://joinup.ec.europa.eu/licence/compare/BSD-3-Clause;Apache-2.0))
78+
- [Joinup Licensing Assistant - Compatibility Checker](https://joinup.ec.europa.eu/collection/eupl/solution/joinup-licensing-assistant/jla-compatibility-checker)
79+
- Not biased by some company agenda
80+
81+
82+
## Exercise/demo
83+
84+
:::{exercise}
85+
- Let us choose a license for our example project.
86+
- We will add a LICENSE to the repository.
87+
:::
88+
89+
:::{discussion}
90+
- What if my code uses libraries like `scikit-image`, `scikit-learn`, `numpy`,
91+
`matplotlib`, etc.? Do we need to look at their licenses? In other words,
92+
**is our project derivative work** of something else?
93+
:::
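
One way to start answering this question is to look at what license each installed dependency declares. Below is a minimal sketch using `importlib.metadata` from the Python standard library. The helper name `declared_license` is hypothetical, and the declared metadata can be incomplete or contain the full license text, so treat the output as a starting point and not as an answer to whether our project is a derivative work.

```python
from importlib import metadata


def declared_license(package: str):
    """Return the license fields a package declares, or None if not installed."""
    try:
        meta = metadata.metadata(package)
    except metadata.PackageNotFoundError:
        return None
    # Trove classifiers such as "License :: OSI Approved :: BSD License"
    classifiers = [
        classifier.split(" :: ")[-1]
        for classifier in (meta.get_all("Classifier") or [])
        if classifier.startswith("License ::")
    ]
    return {
        "license_expression": meta.get("License-Expression"),
        "license": meta.get("License"),
        "classifiers": classifiers,
    }


if __name__ == "__main__":
    # Libraries named in the discussion above; any of them may be missing
    # from the current environment, in which case None is printed.
    for name in ["scikit-image", "scikit-learn", "numpy", "matplotlib"]:
        print(name, declared_license(name))
```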
94+
95+
96+
## More resources
97+
98+
- Presentation slides "Practical software licensing" (R. Bast): <https://doi.org/10.5281/zenodo.11554001>
99+
- [Social coding lesson material](https://coderefinery.github.io/social-coding/)
100+
- [UiT research software licensing guide (draft)](https://research-software.uit.no/blog/2023-software-licensing-guide/)
101+
- [Research institution policies to support research software (compiled by the Research Software Alliance)](https://www.researchsoft.org/software-policies/)
102+
- More [reading material](https://coderefinery.github.io/social-coding/software-licensing/#great-resources)
103+
104+
105+
## More exercises
106+
107+
:::{exercise} Exercise: What constitutes derivative work?
108+
109+
Which of these are derivative works? Also reflect/discuss how this affects the
110+
choice of license.
111+
- A. Download some code from a website and add on to it
112+
- B. Download some code and use one of the functions in your code
113+
- C. Changing code you got from somewhere
114+
- D. Extending code you got from somewhere
115+
- E. Completely rewriting code you got from somewhere
116+
- F. Rewriting code to a different programming language
117+
- G. Linking to libraries (static or dynamic), plug-ins, and drivers
118+
- H. Clean room design (somebody explains the code to you but you have never seen it)
119+
- I. You read a paper, understand the algorithm, and write your own code
120+
121+
```{solution}
122+
- Derivative work: A-F
123+
- Not derivative work: G-I
124+
- E and F: This depends on how you do it, see "clean room design".
125+
```
126+
:::
127+
128+
:::{exercise} Exercise: Licensing situations
129+
130+
Consider some common licensing situations. If you are part of an exercise
131+
group, discuss these with others:
132+
1. What is the StackOverflow license for code you copy and paste?
133+
2. A journal requests that you release your software during publication. You have
134+
copied a portion of the code from another package, but you have forgotten which one.
135+
Can you satisfy the journal's request?
136+
3. You want to fix a bug in a project someone else has released, but there is no license. What risks are there?
137+
4. How would you ask someone to add a license?
138+
5. You incorporate MIT, GPL, and BSD3 licensed code into your project. What possible licenses can you pick for your project?
139+
6. You do the same as above but add in another license that looks strong copyleft. What possible licenses can you use now?
140+
7. Do licenses apply if you don't distribute your code? Why or why not?
141+
8. Which licenses are most/least attractive for companies with proprietary software?
142+
143+
```{solution}
144+
1. As indicated [here](https://stackoverflow.com/help/licensing), all publicly accessible user contributions are licensed under the [Creative Commons Attribution-ShareAlike](https://creativecommons.org/licenses/by-sa/4.0/) license. See the Stack Overflow [Terms of service](https://stackoverflow.com/legal/terms-of-service/public#licensing) for more detailed information.
145+
2. "Standard" licensing rules apply. So in this case, you would need to remove the portion of code you have copied from another package before being able to release your software.
146+
3. By default you are not authorized to use the content of a repository when there is no license, and derivative work is also not possible by default. Other risks: it may not be clear whether you can use and distribute (publish) the bugfixed code. For the repository owners it may not be clear whether they can use and distribute the bugfixed code. However, the authors may simply have forgotten to add a license, so we suggest that you contact the authors (e.g. open an issue) and ask whether they are willing to add a license.
147+
4. As mentioned in 3., the easiest way is to open an issue and explain why you would like to use this software (or update it).
148+
5. Combining software with different licenses can be tricky and it is important to understand the compatibilities (or lack of compatibilities) of the various licenses. The GPL is the most protective of the three (BSD and MIT are quite permissive), so for the resulting combined software you could use a GPL license. However, re-licensing may not be necessary.
149+
6. Derivative work would need to be shared under this strong copyleft license (e.g. AGPL or GPL), unless the components are only plugins or libraries.
150+
7. If you keep your code to yourself, you may think you do not need a license. However, remember that in most companies/universities, your employer "owns" your work and when you leave you may not be allowed to "distribute your code to your future self". So it is always best to add a license!
151+
8. The least attractive licenses for companies with proprietary software are licenses where you would need to keep an open license when creating derivative work, for instance GPL and AGPL. The most attractive licenses are permissive licenses where they can reuse, modify, and relicense with few conditions, for instance MIT, BSD, and the Apache License.
152+
```
153+
:::
