How do we provide information to users of our research
-software?
+
How do we provide information to users of our
+research software?
Why is documenting code useful for researchers?
What does well-documented software look like?
@@ -415,7 +415,8 @@
Questions
Objectives
Understand the basic purpose of this course
-
Learn the motivation for learning to document software
+
Learn the motivation for learning to document
+software
Be introduced to good software documentation practices
@@ -468,23 +469,23 @@
Challenge
Advantages of good documentation
-
There are many advantages to writing guidance to go along with your
-research software. Software documentation helps yourself and others to
-use it successfully in the future and read your code ensuring that its
-value is sustained.
-
Research outputs often depend upon the code used to generate them.
-Clarity and confidence are essential in using code to perform
-calculations, simulations, or data analysis. All kinds of research
-processes and analysis pipelines can be made more
+
There are many advantages to writing guidance to go
+along with your research software. Software documentation helps you
+and others to read your code and use it successfully in the future,
+ensuring that its value is sustained.
+
Research outputs often depend upon the code used to
+generate them. Clarity and confidence are essential in using code to
+perform calculations, simulations, or data analysis. All kinds of
+research processes and analysis pipelines can be made more
reproducible by providing clear context and
instructions for using them.
-
There are many advantages to making your code more readable.
-Well-documented software is easier to maintain and has greater
-sustainability, which means it can continue to be used and modified for
-a longer period of time, despite changes in technology. If software is
-more reusable then it encourages others to use it for their research,
-increasing the number of citations of that software and its overall
-research impact.
+
There are many advantages to making your code more
+readable. Well-documented software is easier to maintain and has greater
+sustainability, which means it can continue to be used
+and modified for a longer period of time, despite changes in technology.
+If software is more reusable then it encourages others to use it for
+their research, increasing the number of citations of that software and
+its overall research impact.
@@ -505,39 +506,43 @@
Challenge
In the long run, it can help you to develop your own software
engineering practice by getting into the habit of reflecting on what the
-purpose of the software is and to articulate what each component or
-module is for.
+purpose of the software is and to
+articulate what each component or module is for.
Writing a useful software package that is well-documented and can be
reused in the future means that your code could take on a life of its
own, with benefits that extend beyond yourself to your collaborators and
other researchers in the future. High-quality documentation is a key
-part of ensuring a healthy software lifecycle. It can make the different
-between accidentally creating an abandoned piece of “gradware” (a slang
-term for mysterious code that a former student wrote and nobody else can
-use) and a successful long-term software project with lasting
-impact.
+part of ensuring a healthy software lifecycle. It can
+make the difference between accidentally creating an abandoned piece of
+“gradware” (a slang term for mysterious code that a former student wrote
+and nobody else can use) and a successful long-term software project
+with lasting impact.
When should I write documentation?
Now! Start writing and sharing documentation for your research code
-from the beginning of your project. It should be a consideration in your
-software management plan, which is a concept discussed in the
-Module 1a on Software Lifecycle Planning. It’s never too late to start
-documentaing an old code project.
+from the beginning of your project. It doesn’t have to
+be perfect straight away, but a first draft is more useful than nothing.
+It should be a consideration in your software management plan,
+which is a concept discussed in Module 1a on Software Lifecycle
+Planning. Also, it’s never too late to start documenting an old code
+project.
This might include design notes, diagrams, or the various kinds of
software documentation we’ll discuss in this module. The best practice
for modern, collaborative research involving digital methods and tools
-is to document your processes early and often. Not only will
-writing notes about your code help other people to read and use that
-code, it will clarify your thought process as you design your system,
-focussing your work on the important parts of the task at hand.
-
Keep in touch with other developers and users of the research code
-and make a note of their feedback. Common questions and problems are a
-sign that there are issues that must be covered more clearly and in
-greater depth in the software documentation. Incorporate this feedback
-into the software documentation using the whichever method is most
-appropriate, following the guidance in this module.
+is to document your processes early and often. Not only
+will writing notes about your code help other people to read and use
+that code, it will clarify your thought process as you design your
+system, focussing your work on the important parts of the task at
+hand.
+
Keep in touch with other developers and users of the
+research code and make a note of their feedback. Common questions and
+problems are a sign that there are issues that must be covered more
+clearly and in greater depth in the software documentation.
+Incorporate this feedback into the software
+documentation using whichever method is most appropriate, following
+the guidance in this module.
Examples
@@ -563,7 +568,7 @@
-
+
This is some research code that is contained in a Python
function.
@@ -574,7 +579,7 @@
PYTHON
return x * weird_num - angle**3/ (weird_num *2)
-
+
This is some research code that is contained in an R function.
R
@@ -610,7 +615,7 @@
Challenge
out more about weird_num? This is effectively a “magic”
number that is arbitrarily stated but unexplained.
The logic of the calculation is also… rather cryptic.
-
Maybe the code works, maybe it doesn’t but it could be made clearer
+
Maybe the code works, maybe it doesn’t; but it could be made clearer
and easier to maintain and modify in the future.
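As a minimal sketch of how a “magic” number can be made clearer, compare the two functions below. This is a hypothetical example written for illustration (the names `f`, `DEGREES_TO_RADIANS`, and `convert_degrees_to_radians` are assumptions and are not part of the lesson code). Both functions do roughly the same job, but only the second one tells the reader what the number means and what the function is for.

```python
import math


# Unclear: what is 0.0174533, and what does f do?
def f(x):
    return x * 0.0174533


# Clearer: the number has a name, and the function says what it does.
DEGREES_TO_RADIANS = math.pi / 180


def convert_degrees_to_radians(angle_degrees):
    """Convert an angle from degrees to radians."""
    return angle_degrees * DEGREES_TO_RADIANS


print(convert_degrees_to_radians(180))
```

Giving the constant a descriptive name means a future maintainer can check where the value comes from instead of guessing.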
@@ -631,14 +636,9 @@
R
-
-
+
This is a function written in the Python programming language that
calculates a mathematical result, the details of which aren’t relevant.
This code has plenty of documentation to help us read and understand
@@ -674,7 +674,7 @@
PYTHON
return sin_value
-
+
This is a function written in the R programming language that
calculates a mathematical result, the details of which aren’t relevant.
This code has plenty of documentation to help us read and understand it.
@@ -710,34 +710,7 @@
R
return(sin_value)
}
-
-
-
-
-
-
Read and evaluate this code.
-
-
Can you tell what the purpose of the function is?
-
What is the meaning of the variables?
-
Which code would you prefer to use?
-
-
-
Read and evaluate this code.
-
-
Can you tell what the purpose of the function is?
-
What is the meaning of the variables?
-
Which code would you prefer to use?
-
-
-
Challenge
-
-
Read and evaluate this code.
-
-
Can you tell what the purpose of the function is?
-
What is the meaning of the variables?
-
Which code would you prefer to use?
-
@@ -745,7 +718,7 @@
Challenge
-
Challenge
+
Discussion
Read and evaluate this code.
@@ -756,9 +729,6 @@
Challenge
-
-
-
This time, the function name is a verb that describes what the code
will attempt to do. The description of the function is also written out
clearly in a note for the user. There are comment lines (starting with
@@ -767,8 +737,8 @@
Challenge
intuitive to read. An existing library is used to calculate the
factorial, which means we can look up the usage for the
factorial() function elsewhere.
-
This approach means that our code is much easier to interpret,
-maintain, and make changes to in the future.
+
This approach means that our code is much easier to
+interpret, maintain, and make changes to in the future.
Of course, there may be some syntax in this example that is
unfamiliar to you—but don’t worry, we’ll learn the basics in this
course!
@@ -783,23 +753,23 @@
NumPy user guide
NumPy is a mathematical package for the Python programming language
that’s used for linear algebra. The NumPy User
-Guide is a thorough website that organised into sections that cover
-the different aspects of using that package. It includes a beginner’s
-guide, tutorials for different use-cases, and in-depth write-ups of
-technical details of certain aspects of the code. Some of the content is
-written for a target audience with no assumed knowledge, while other
-parts are written as a reference for people with some background in
-mathematics and computer programming.
+Guide is a thorough website that is organised into
+sections that cover the different aspects of using that package.
+
It includes a beginner’s guide, tutorials for different use-cases,
+and in-depth write-ups of technical details of certain aspects of the
+code. Some of the content is written for a target audience with no
+assumed knowledge, while other parts are written as a reference for
+people with some background in mathematics and computer programming.
ggplot2 documentation site
ggplot2 is a package for the R statistical language that generates
data visualisations and graphics. The ggplot2
-documentation has a simple, accessible layout and walks a new user
-through installing and getting up-and-running with the tool. The page
-provides a “cheat sheet” which is a reference guide that lists
-commonly-used commands in an attractice two-page layout. The
+documentation has a simple, accessible layout and walks a
+new user through installing and getting up-and-running with the
+tool. The page provides a “cheat sheet” which is a reference guide that
+lists commonly-used commands in an attractive two-page layout. The
documentation site is moderate in scope and links to several external
resources, such as online courses hosted elsewhere.
The source code is neatly organised into R code files
@@ -840,7 +810,7 @@
How do we introduce our software to new researchers
and developers?
-
What is a README file?
-
How do I write a README for my research code?
-
What are the contents of a good README file?
+
How do I structure the basic notes for my research code?
+
What are the contents of good documentation?
@@ -868,9 +837,10 @@
Questions
Objectives
-
Explain why and how to write a README file for research
-software
-
Learn how to structure a README file into sections
+
Explain why and how to write a README file for
+research software
+
Learn how to structure documentation into
+sections
Understand the important components of a good README
@@ -881,28 +851,29 @@
Objectives
What is a README file?
-
A README file is the first thing a user sees when they find your
-software. It should give them an approachable overview of the package,
-define what’s possible to achieve with this code, and get them started
-on the right track to use the software effectively for their
-research.
-
A README contains a brief introduction to the code and shows them how
-to get started using it. For larger packages, the README forms a concise
-beginner guide and might link to a more detailed user guide that is
-located elsewhere.
-
The audience for a README file is the end user. It’s
-important to consider the person will read your documentation,
-and to see things from their point of view. It may be someone who is
-unfamiliar with certain technical terms, or a researcher will less
-experience of advanced computing. A good approach is to imagine writing
-a manual for a new user who has never seen this software before.
+
A README file is the first thing a user sees when
+they find your software. It should give them an approachable overview of
+the package, define what’s possible to achieve with this code, and get
+them started on the right track to use the software effectively for
+their research.
+
A README contains a brief introduction to the code
+and shows them how to get started using it. For larger packages, the
+README forms a concise beginner guide and might link to a more detailed
+user guide that is located elsewhere.
+
The audience for a README file is the end
+user, such as a researcher. It’s important to consider the
+person who will read your documentation, and to see things from their
+point of view. It may be someone who is unfamiliar with certain
+technical terms, or a researcher with less experience of advanced
+computing. A suitable approach is to imagine writing a manual for a new
+user who has never seen this software before.
How to write a README
-
To start writing a README file, the simplest way is to just create an
-empty text file called README.txt and start writing. This
-file should be located in the directory (or folder) that contains your
-software project.
+
To start writing a README file, the simplest way is to create
+an empty text file called README.txt and start
+writing. This file should be located in the directory (or folder) that
+contains your software project.
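If you prefer to script this step, the same result can be sketched in Python using the standard `pathlib` module. This is only an illustration under some assumptions: the helper name `create_readme` is hypothetical, and the example works inside a temporary directory so that it doesn’t touch your real files. The project name `oddsong` is the one used in the exercise in this lesson.

```python
import tempfile
from pathlib import Path


def create_readme(project_dir):
    """Create the project directory (if needed) and an empty README.txt in it."""
    project_dir = Path(project_dir)
    project_dir.mkdir(parents=True, exist_ok=True)
    readme_path = project_dir / "README.txt"
    # touch() creates an empty file and leaves an existing file unchanged
    readme_path.touch(exist_ok=True)
    return readme_path


# Demonstrate in a temporary directory that is deleted afterwards
with tempfile.TemporaryDirectory() as temporary_dir:
    readme = create_readme(Path(temporary_dir) / "oddsong")
    print(readme.name, readme.exists())
```

In a real project you would pass the path to your own software directory instead of a temporary one.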
@@ -920,7 +891,7 @@
Challenge
-
+
Follow these general steps to create a README file. The specific
details for each operating system are detailed below.
Use the File
Manager to create a new directory called oddsong.
Inside that folder, create a new text file called
@@ -977,7 +948,7 @@
BASH
nano oddsong/README.txt
-
+
Use the Finder
file manager to create a new directory called oddsong.
Inside that folder, create a new text file called
@@ -1015,10 +986,10 @@
BASH
-
It can be useful to signpost to other, related useful software tools
-by providing links and explaining how other software is related or
-different to this project when it comes to addressing these kinds of
-research problems.
+
It can be useful to signpost to related useful
+methods and software tools by providing links and explaining how other
+software is related or different to this project when it comes to
+addressing these kinds of research problems.
@@ -1027,30 +998,44 @@
BASH
Walk a mile in the user’s shoes
Put yourself in the position of a researcher who has encountered your
-software for the first time. Consider, if you had to start from square
+software for the first time. Imagine that you had to start from square
one: how would you like the code to be introduced to you?
-
Remember: things that are obvious to you and your colleagues may not
-be clear to others. What assumed knowledge must you
-explicitly explain to get them up to speed?
-
For research code, it’s often important to explain the context in
-which the software was written and the theory behind it. For example,
-many researchers write analysis packages or workflows that are based on
-previously-published research, statistical methods, or theoretical
-models for which citations can be provided. By including references to
-research papers we better help the users to understand the methods that
-are implemented by our software, which enables its users to properly
-cite their sources and increases the users’ confidence that you have
-applied those methods correctly.
+
+
+
+
+
+
Discussion
+
+
Consider your field of research and the technologies you commonly
+use.
+
+
What things are obvious to you that may not be clear to others?
+
What assumed knowledge must you explain to new
+colleagues to get them up to speed?
+
+
+
+
+
For research code, it’s often important to explain the
+context in which the software was written and the theory behind
+it. For example, many researchers write analysis packages or workflows
+that are based on previously-published research, statistical methods, or
+theoretical models for which citations can be provided.
+By including references to research papers we better help the users to
+understand the methods that are implemented by our software, which
+enables its users to properly cite their sources and increases the
+users’ confidence that you have applied those methods correctly.
Installation instructions
Provide instructions for installing your research software. These
steps should be laid out in simple, clear language and organised in a
step-by-step manner.
-
+
@@ -1071,29 +1056,36 @@
Discussion
-
For most research code, the user will need to install the programming
-language onto their computer, such as R or Python, so it’s useful to
-link to the download pages and provide a link to the package manager
-tools that are commonly used in those ecosystems. This might also
-include listing any prerequisites such as hardware or software that must
-be installed first, such as device drivers.
+
+
Installing prerequisites
+
+
Most research code has several dependencies, such as
+libraries. The user will need to install the programming language onto
+their computer, such as R or Python, so it’s useful to link to the
+download pages and provide a link to the package manager tools that are
+commonly used in those ecosystems. This might also include listing any
+prerequisites such as hardware or software that must be installed first,
+such as device drivers.
Consider how the installation method might differ for users of other
common operating systems, such as Windows, Linux, and macOS.
+
User guide
-
All software should include some short guidance on how to use it and
-what the main options and features are. This might be a “quick start”
-guide with simple examples of common use-cases, or a walkthrough that
-uses a sample data set. Explain how the software can be configured or
-customised, including examples of commonly-used options. If the software
-integrates with other tools or uses specific file formats for its input
-and output, it’s useful to explain this here too. It’s a good idea to
-include links to further documentation if available.
-
Many users will benefit from a frequently asked questions (FAQ) or
-troubleshooting notes, which describes common error messages, explains
-why they occur, and the steps to resolve them.
+
All software should include some short guidance on how to use
+it and what the main options and features are. This might be a
+“quick start” guide with simple examples of common use-cases, or a
+walkthrough that uses a sample data set.
+
Explain how the software can be configured or
+customised, including examples of commonly-used options. If the
+software integrates with other tools or uses specific file formats for
+its input and output, it’s useful to explain this here too. It’s a good
+idea to include links to further documentation if available.
+
Many users will benefit from a frequently asked
+questions (FAQ) section or troubleshooting notes that describe
+common error messages, explain why they occur, and give the steps to
+resolve them.
@@ -1106,8 +1098,8 @@
Writing style
technical terms and acronyms should be explained. However, don’t
reinvent the wheel by defining all the terms used; instead, link to a
reliable external source or journal article.
-
For more information about good writing style, please refer to these
-style
+
Diagrams can be particularly useful to explain complex concepts and
workflows. Screenshots may also provide a visual demonstration of how
the software will work.
-
+
@@ -1135,7 +1127,8 @@
Discussion
Not all READMEs must follow this structure. Always adapt the format
-of your documentation to suit the specific needs of your audience.
+of your documentation to suit the specific needs of
+your audience.
Accessibility
@@ -1167,7 +1160,7 @@
Discussion
-
+
For more information on this topic, please see the following
resources:
@@ -1187,22 +1180,24 @@
Text formatting
-
Most people prefer to use a file format that allows you to create
-headers to organise the content into sections or chapters, which is much
-clearer for the reader.
+
Most people prefer to use a file format that allows you to format
+text and create headers to organise the content into sections or
+chapters, which makes the content more comprehensible for the
+reader.
In this case, a Markdown
document may be used. Markdown is a simple markup language that
-allows you to format your text using symbols to represent headers, bold
-text, bullet lists, etc. that are displayed to the user in an appealing
-way. The Markdown syntax will be converted into appealing visual styles
-that make your documentation more aesthetically pleasing and easier to
-read.
+is converted into semantic labelling (such as emphasis and
+structure) and visual styles that make your documentation more
+aesthetically pleasing and easier to navigate. It allows you to format
+your text using symbols to represent headers, bold text, bullet lists,
+etc. that are displayed to the user on their screen or other device,
+depending upon accessibility requirements.
-
+
A markup
language is a system of special characters that are used to
@@ -1237,7 +1232,7 @@
Challenge
-
+
Follow these steps to rename README.txt to
README.md.
An example README file in Markdown format is shown below, in a file
-called README.md where .md is the file extension for
+called README.md where “.md” is the file extension for
Markdown files.
Section headers
-
You can separate your document into hierarchical sections with
-headings using the # symbol. This makes your README easier
-to navigate. For example:
+
You can separate your document into hierarchical
+sections with headings using the # symbol. This
+makes your README easier to navigate. For example:
MARKDOWN
# Birdsong identification tool
-This user guide provides instructions on how to use this birdsong identifier. The software is designed to assist users in identifying bird species based on their vocalisations.
-
-# Installation
+This user guide provides instructions on how to use this birdsong
+identifier. The software is designed to assist users in
+identifying bird species based on their vocalisations.
-To install this software...
+# Installation
-# Usage
+To install this software, follow the steps below...
-To use this package...
-
-
The “#” symbol means that line will be converted into a header and
-displayed to the reader in a large, bold font. This makes it easier for
-the reader to find the part of your text they’re looking for, just like
-having chapters in a book.
+# Usage
+
+To use this package, start by configuring...
+
+
The hash # symbol means that line will be converted into
+a header and displayed to the reader in a large, bold font. This makes
+it easier for the reader to find the part of your text they’re looking
+for, just like having chapters in a book.
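To see how header lines give a document a structure that tools can use, here is a minimal Python sketch that lists the headers in a piece of Markdown text. It is a simplified illustration, not a full Markdown parser, and the function name `list_headers` is an assumption made for this example. It treats any line that starts with one to six `#` symbols followed by a space as a header, and it skips lines inside code fences.

```python
FENCE = "`" * 3  # three backticks mark the start or end of a code fence


def list_headers(markdown_text):
    """Return (level, title) pairs for each header line in Markdown text."""
    headers = []
    inside_code_fence = False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            inside_code_fence = not inside_code_fence
            continue
        if inside_code_fence or not stripped.startswith("#"):
            continue
        level = len(stripped) - len(stripped.lstrip("#"))
        # A header needs one to six hashes followed by a space
        if level <= 6 and stripped[level:level + 1] == " ":
            headers.append((level, stripped[level:].strip()))
    return headers


readme_text = """# Birdsong identification tool

This user guide provides instructions on how to use this birdsong identifier.

# Installation

To install this software...

# Usage

To use this package...
"""

# Print an indented outline, one line per header
for level, title in list_headers(readme_text):
    print("  " * (level - 1) + "- " + title)
```

An outline like this is the kind of information a website can use to build a navigation menu from your README.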
@@ -1344,7 +1341,7 @@
Challenge
-
+
We can create the commonly-used headers used in READMEs by using the
Markdown syntax shown below
@@ -1401,10 +1398,11 @@
MARKDOWN
-
If your code is published on GitHub, the home page of your code
-repository will display the README.md file, including a navigation menu
-that is automatically created to easily select the section of the
-document to view.
+
If your code is published on GitHub,
+the home page of your code repository will display
+the README file, including a table
+of contents that is automatically created to easily select the
+section of the document to view.
“This repository has a README file with
chapters, making navigation easier.”
> The eastern towhee (Pipilo erythrophthalmus) is a large New World
+> sparrow. The taxonomy of the towhees has been under debate in
+> recent decades, and formerly this bird and the spotted towhee
+> were considered a single species, the rufous-sided towhee.
This will be rendered with the following appearance:
-
This text is part of a blockquote.
+
The eastern towhee (Pipilo erythrophthalmus) is a large New World
+sparrow. The taxonomy of the towhees has been under debate in recent
+decades, and formerly this bird and the spotted towhee were considered a
+single species, the rufous-sided towhee.
+
(This text was retrieved from the Wikipedia page on the Eastern
+towhee.)
Code blocks
If you’d like to present the user with examples of source code, use
-code fences to display the code in a special text box with syntax
-highlighting. For example:
-
-
pi=3.14
-
+code
+fences to display the code in a special text box with syntax
+highlighting. To do this, wrap the code in three backticks
+`. For example:
+
Classes use capitalised words, where each word in a phrase starts
+with an upper-case letter and there are no spaces between them.
PYTHON
class Bird:
    pass
-
Variables use lower case with underscores
PYTHON
-
bird_name ="Blue jay"
+
class ConservationStatus:
+    """
+    IUCN Red List of Threatened Species
+    """
+    EX = "Extinct"
+    EW = "Extinct in the wild"
+    CR = "Critically Endangered"
+    EN = "Endangered"
+    LC = "Least Concern"
+
+
Variables use lower case with underscores
+
+
PYTHON
+
+
bird_name = "Blue jay"
Constants are named using upper case with underscores
Constants are named using upper case with underscores
-
+
R
@@ -2750,12 +2793,12 @@
Python comments
start with a hash character (#) and are ignored when the
code runs.
-
+
PYTHON
-
# Add three to my age
-age =21
-age +=3
+
# Add three to my age
+age = 21
+age += 3
There’s more information about Python
operators such as += in the documentation for that
@@ -2765,7 +2808,7 @@
PYTHON
Python comments
start with a hash character (#) and are ignored when the
code runs.
-
+
R
@@ -2831,10 +2874,10 @@
data we expect each variable to contain by using the syntax below. This
colon means that the age variable should contain a value
with the integer type, int.
-
operation is also expected to be an integer, so the return type is
labelled with the arrow syntax on the first line of the function
declaration as -> int.
-
+
PYTHON
-
def add(a: int, b: int) ->int:
-"""Add two numbers"""
-return a + b
+
def add(a: int, b: int) -> int:
+    """Add two numbers"""
+    return a + b
@@ -2903,7 +2946,7 @@
PYTHON
types of variable arguments, but this can be done with the roxygen2 library. The code block
below shows a docstring (which we covered earlier in the course) that
labels the types of the inputs and output of the function.
-
+
R
@@ -2930,22 +2973,22 @@
R
Type hints quiz
What do you expect to happen when the following code runs?
-
+
PYTHON
-
add(42, 1)
+
add(42, 1)
What about this code?
-
+
PYTHON
-
add(42.5, 1e5)
+
add(42.5, 1e5)
Will an error occur when we use strings as the input arguments?
-
+
PYTHON
-
add('cheese', 'cake')
+
add('cheese', 'cake')
@@ -2955,14 +2998,14 @@
PYTHON
-
+
None of these code examples will cause an error because type hints
-are just passive labels that document our code. They don’t enforce any
-type checking or rules that are asserted when the code is executed. This
-means that, while type hints are very useful for static
-analysis of code, where we learn something about a piece of
-software without running it.
+are just passive labels that document our code. They
+don’t enforce any type checking or rules that are asserted when the code
+is executed. This means that type hints don’t change how the code runs,
+but they are very useful for static analysis of code, where we learn
+something about a piece of software without running it.
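To check this for yourself, here is a minimal sketch that reuses the `add` function from this section and calls it with the three pairs of arguments from the quiz. The `results` list is just a name chosen for this illustration. The hints are stored on the function as labels (in its `__annotations__` attribute), but Python doesn’t check them when the function is called.

```python
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b


# None of these calls raises an error, even though only the first
# one matches the type hints.
results = [
    add(42, 1),             # two integers, as the hints describe
    add(42.5, 1e5),         # two floating-point numbers
    add("cheese", "cake"),  # two strings are joined together
]
print(results)

# The hints are available as passive labels on the function
print(add.__annotations__)
```

The string example works because the `+` operator joins strings together, so the function runs happily even though the result isn’t an integer.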
How do I present comprehensive information to users
-of my research software?
-
How do I generate a website containing a user guide to my code?
-
What should a good documentation website contain?
-
How do I publish my software documentation on the internet?
+
How do I introduce new contributors to my research
+software project?
+
What is the best way to communicate processes such
+as bug reporting?
+
Where should I write up the design and structure of
+the system?
@@ -3041,42 +3085,49 @@
Questions
Objectives
-
Learn about documentation websites for software
-packages.
-
Gain basic familiarity with some common website generation
-tools.
-
Understand the basics of structuring a documentation website.
-
Be able to set up a static site deployment workflow.
+
Learn to write a contribution guide for research
+code
+
Learn about software coding standards
+
+
Implement ways to facilitate communication between
+researchers that are engaged in the project
+
Provide a high-level understanding of an existing
+codebase
+
-
Documentation websites
+
Collaborative research software development
-
A documentation website is a user guide and reference manual for a
-library of research code. Up to now, we’ve looked at ways to put helpful
-notes in our code, but now we’ll learn how to write a longer, more
-complete guide to the research tools you create.
-
A documentation site bring all your user guidance into one place.
-This kind of resource may be prepared for research software and will
-usually contain an introduction, installation instructions, a user
-guide, troubleshooting tips, and an in-depth reference section.
-
To get an idea of this, here are some links documentation websites
-for widely-used data analysis and research software packages:
-
-
-pandas is a data
-processing library for the Python programming language.
-
-ggplot2 is a
-plotting package for the R statistical language.
-
-scikit-learn
-is a machine learning library for the Python programming language.
-
+
Often, in today’s research environment, much analytics software is
+written in a collaborative manner, involving multiple
+specialists from within a team, or from multiple institutions. For the
+long-term health of a software package, it’s important to encourage
+potential contributors to get in touch and feel welcome to take part.
+Useful research software can take on a life of its own.
+
+
+
+
+
+
Research software project management
+
+
For more information on planning the development of research software
+and project governance, see Module 1a.
+
+
+
+
Research software is often published under an open source licence, which
+means that all the code is publicly available and may be used and
+modified by anyone, within certain conditions (see Module 1b to learn
+more about software licensing).
+
There’s a lot more to creating and managing a sustainable community
+around a research software project, but having a central piece of
+documentation for contributors is a great start!
@@ -3084,53 +3135,43 @@
Objectives
Discussion
-
Evaluate these documentation sites.
+
Consider these questions amongst the group:
-
What do you like about them?
-
How approachable are they as a new user?
-
What do you find difficult to understand in this material?
+
How can we effectively foster a collaborative environment for
+research software development?
+
How can barriers to participation be removed for a diverse range of
+individuals and institutions?
+
What strategies can be implemented to ensure that all contributors
+feel valued and included?
-
Why create a website?
+
Contribution guides
-
There are many advantages to building a documentation site to provide
-a information-rich resource for researchers who use your code at
-institutions all around the world.
-
-
Advantages
-
-
These sites can work as hubs for collaboration, sharing the latest
-updates, and encouraging people to take up your system and get involved
-in improving it. The effort of setting one up will be rewarded in the
-long run because you will have created a valuable asset that will foster
-collaboration and knowledge sharing in your research community.
-
A key foundation stone of modern digital research practices is the
-ability to replicate results by reproducing analysis workflows. Clear,
-thorough documentation of the research code ensures that researchers can
-repeat processes and verify results and other people’s outputs.
-
Documentation sites are really useful for introducing new users to
-your software. It makes it much easier and faster for new users to get
-started using your software to boost their research. It’s one of the
-most effective ways to create a user base that has a sophisticated
-understanding of the research code, which is essential for them to adapt
-it to the complex problems that often raise in research contexts.
-
They’re also a valuable resource for your existing user base,
-enabling them to look up reference material or search the manual to find
-new capabilities they weren’t aware of before. This will increase the
-potential for your software to accellerate the productivity of other
-research teams and boost scientific progress.
-
-
-
When to use one
-
-
Although the advantages are numerous, not all software packages
-require a comprehensive documentation website. However, for any code
-project that is growing in the number of collaborators, users, and
-technical complexity, consider coordinating the team to write one as
-soon as possible to help the project grow in a healthy manner.
+
Contribution guidelines help users understand how they can
+help to improve the software, whether that’s by
+submitting bug reports, suggesting new features, or writing better code
+and documentation. All of these aspects are vital to produce reusable
+research software.
+
Potential collaborators should be able to easily find out how to take
+part and contribute. Developers should be encouraged to use
+appropriate communication channels to ask questions and
+inform others of proposed software changes. The contact details for the
+project administrator or committee should be available and they should
+be welcoming and responsive to any queries.
+
It’s important to explain how the project is managed so the process
+for evaluating new features and getting them implemented is clear, such
+as the code review and approval process. For many projects, a
+ticket system may be used to raise issues and suggest
+new features. Software developers often propose new code by creating a
+branch on the version control system (such as Git) and requesting for
+those changes to be merged into the main codebase.
+
Contribution guides will save you time in the long run, because they
+provide an on-ramp for people to get involved, prevent them from
+getting confused, and reduce the number of incorrectly submitted bug
+reports or requests for change.
Documentation pages contain comphrehensive information about a
-particular piece of research software. Think of it like a user manual
-for your car or an instruction guide for building a piece of
-furniature.
-
-
Research context
-
-
For research software, it may be important to explain the theoretical
-background or statistical methods that are used and explain the
-domain-specific assumptions that were made when the code was designed
-and written. It’s good practice to provide a concise summary of the
-relevent concepts and link to external sources such as papers, books,
-and other websites for users to take a deeper dive into the principles
-and algorithms used.
-
-
Installation instructions
+
How to write contributor guidance
-
This section provides a detailed walkthrough of the steps required to
-install the package onto their computer, with details that are specific
-to their operating system.
+
The standard practice for authoring a contribution guide for a
+software project is to create a file called CONTRIBUTING.md
+in the root folder of your project. This is a Markdown file that
+introduces new people to the project. It lets people know the ways they
+can take part in the research software project and what to do to get
+involved.
+
The specific contents of this file depend upon the kind of research
+project, but some useful information to provide typically includes:
+
+
An introduction to the organisation and structure of the
+code, possibly including diagrams.
+
Instructions for raising issues, suggesting new
+features, and proposing code changes.
+
Links to additional documentation that’s hosted
+elsewhere, such as a code of conduct or discussion forum.
+
A walkthrough of setting up a development environment, such as
+guidance on installing developer tools or other prerequisites.
+
+
On code repository hosting platforms such as GitHub, a link to the
+contribution guide will be shown automatically to people opening issues
+or pull requests once this CONTRIBUTING.md Markdown file exists.
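As a starting point, the items listed above can be turned into a skeleton file. The following is a minimal sketch in Python, not a standard tool: the function names `contributing_skeleton` and `write_skeleton`, and the section headings, are assumptions made for illustration and should be adapted to your own project.

```python
from pathlib import Path

# Hypothetical section headings, based on the list of useful contents above.
SECTIONS = [
    ("Project overview", "How the code is organised, with diagrams if helpful."),
    ("Raising issues", "How to report bugs, suggest features, and propose code changes."),
    ("Further documentation", "Links to the code of conduct, discussion forum, and so on."),
    ("Development environment", "How to install developer tools and other prerequisites."),
]


def contributing_skeleton(project_name):
    """Return the Markdown text of an empty contribution guide."""
    lines = [f"# Contributing to {project_name}", ""]
    for heading, hint in SECTIONS:
        # Each hint is an HTML comment, so it is hidden when the Markdown is rendered.
        lines += [f"## {heading}", "", f"<!-- {hint} -->", ""]
    return "\n".join(lines)


def write_skeleton(project_name, folder="."):
    """Write CONTRIBUTING.md into the given folder, refusing to overwrite an existing file."""
    path = Path(folder) / "CONTRIBUTING.md"
    if path.exists():
        raise FileExistsError(f"{path} already exists")
    path.write_text(contributing_skeleton(project_name), encoding="utf-8")
    return path
```

Calling `write_skeleton("Birdsong Identifier")` from the root folder of the project would create the file, ready for each section to be filled in by hand.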
+
+
+
-
-
Tutorials
-
-
It can be very useful to include an in-depth “Getting Started” guide
-that provides step-by-step instructions to introduce a new user to your
-software package. It might guide the user through each aspect of the
-tool’s functionality and features so they’re able to become familiar
-with it in a more approachable way.
-
A series of code examples to demonstrate how to use the software in
-different contexts can be very useful for users to get off the ground in
-implementing common research workflows to achieve their specific
-goals.
+
+
Challenge
+
+
Create a new file called CONTRIBUTING.md and populate it
+with a few sentences.
+
+
What are the most important things for a new contributor to
+know?
+
What should a user do if they encounter a bug?
+
What are the common questions that a new developer might have when
+they work on this research software?
+
-
-
User reference
-
-
If you have written functions that are intended to be use in other
-reseachers’ code, then an on-depth explaination of these procedures is
-essential reference material. In the world of software engineering,
-these detailed appendices are called
-API
-references, which list each function and describe how the arguments may
-be used to control how the code works. This content may be automatically
-generated from the documentation strings.
-
-
Troubleshooting
-
-
As issues come up with your research code, and are eventually
-resolved and clarified, make a note of the causes of these troubles and
-make them available to the entire user base in your documentation site.
-This will help users to identify and fix common misunderstandings and
-technical problems they may run into when utilising your code.
-
This prevents a situation where potential solutions to common issues
-do exist, but are scattered around the internet are the exclusive
-knowledge of a few individuals and are hard to find.
-
-
FAQs
-
-
An appendix containing frequently asked questions (FAQs) is very
-useful to save yourself time in responding to common queries from the
-users of your code.
-
Writing style
-
-
-
As we discussed in the episode on READMEs,
-it’s important to strive to use everyday, jargon-free language. It helps
-to set an approachable tone that encourages others to use the software
-and get involved with the project. This will en sure that the code is
-accesible to the widest possible layers of the research community and
-foster collaboration.
-
Always consider the target audience of your documentation, because
-your user base may be unaware of some of the unstated assumptions and
-technical backgroud knowledge that you take for granted.
-
Tools
+
Software project governance
-
There are various tools available to build documentation sites for
-your research software.
+
Project governance defines the scope and aims of a
+research software engineering project, and determines how decisions will
+be made and carried out. It sets out the processes and
+responsibilities that collaborators must understand to take
+part. This is something that should be considered when preparing a
+software management plan, as discussed in Module 1a of this course. This
+is important to make sure that questions of who does what, and how, are
+stated clearly so that everyone can understand and collaborate
+effectively to produce excellent research software. It’s worthwhile to
+think about this early on in a project to avoid potential pitfalls later
+on!
-
GitHub Wiki
+
Code of conduct
-
If you are publishing your code on GitHub, which is a web service
-that hosts costs repositories, then one of the easiest ways to create a
-documentation site is to use the wiki feature on that platform. This is
-a great way to write detailed, structured documents containing long-form
-content that describes aspects of your software. What’s more, it’s
-available alongside your code so your documentation and software are
-located in one place.
To create a wiki, which is a simple, easy-to-edit web site, go to the
-main page of your code repository on GitHub and click on the Wiki button
-on the top menu. For a detailed walkthrough of this process, please read
-adding
-or editing wiki pages on the GitHub documentation.
-
+
A code of conduct provides guidelines for the expected
+behaviour of people who are involved in the project. You may
+want to provide some general tips to create a productive community of
+researchers around the software, such as creating positive interactions
+between contributors, treating others with respect and dignity, and
+handling differences of opinion through agreed processes.
+
This has the following advantages:
+
+
Fosters a healthy, collaborative working
+environment where people feel respected, included, and can
+freely share ideas.
+
+Managing expectations and creating clear rules will
+reduce the amount of time wasted due to misunderstanding and
+conflicts.
+
+Builds a community: an ethically run and
+transparent project will encourage contributors to share the values of
+the project and remain engaged.
+
+
For many working in a research context, there are additional
+considerations to ensure that institutional policies, ethics, and data
+protection regulations are carefully observed. These protocols are
+outside the scope of this document, but these factors should be clearly
+communicated to all contributors.
Many open-source research software projects adopt the Contributor Covenant,
+which is a template charter that may be customised to suit the needs of
+your collaborators.
+
Developer notes
+
+
+
People who are contributing code to the project will need the
+following information:
+
+
Which version control system is being used. Typically, this will be
+git or similar tools, as discussed in Module 2 of this
+course.
+
How to add automatic tests and whether a testing framework is in
+place.
+
How the code is organised and how the package is structured.
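To illustrate the point about automatic tests, developer notes can include a short example test for new contributors to copy. The sketch below uses Python’s built-in `unittest` module. The function `normalise_frequencies` is hypothetical and stands in for a function from your own project.

```python
import unittest


def normalise_frequencies(values):
    """Scale a list of non-negative numbers so that they sum to 1.

    Hypothetical example function, standing in for real project code.
    """
    total = sum(values)
    if total == 0:
        raise ValueError("values must not sum to zero")
    return [value / total for value in values]


class TestNormaliseFrequencies(unittest.TestCase):
    """Each test method checks one expected behaviour of the function."""

    def test_result_sums_to_one(self):
        result = normalise_frequencies([1, 1, 2])
        self.assertAlmostEqual(sum(result), 1.0)

    def test_zero_total_is_rejected(self):
        with self.assertRaises(ValueError):
            normalise_frequencies([0, 0])
```

If this were saved in a file whose name starts with `test_`, running `python -m unittest` from the project folder should discover and run it. A project that uses a different testing framework would describe its own command here instead.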
+
-
Documentation sites for R packages
+
Technical documentation
-
It’s also possible to generate a documentation site to accompany R
-packages that you create. For more information about this, please refer
-to the book R Packages by Hadley Wickham, which has a chapter
-on documentation
-websites.
+
System documentation is important for new contributors to
+familiarise themselves with the codebase and as a
+reference for existing engineers. There should be a concise description
+of how the system works from a more technical perspective, with the
+intended audience being software developers, rather than the research
+users.
+
An architecture diagram is an efficient way to
+provide a “map” to help developers to understand and navigate a complex
+system.
-
Sphinx
+
Coding conventions
-
Sphinx is a tool for
-building documentation websites that is commonly used amongst developers
-of Python packages, although it’s also compatible with other programming
-languages. It doesn’t currently support packages written using the R
+
Many projects follow a set of programming standards to manage
+code quality. A coding style guide will help to ensure
+consistency across all the code written as part of a collaborative
+project, which helps others to read and interpret the code, making it
+easier to maintain in the long run. The code style rules should cover
+things like the way to describe functions, how to indent code, and
+naming conventions for variables.
+
This might include guidance and advice, or more strict rules as
+standards that are checked by a code linter. A code
+linter is an analysis tool that inspects code and checks for common
+errors and problems, producing a report for the developer to read and
+act upon. Common coding style standards include the PEP 8 style guide for the
+Python programming language and the tidyverse style guide in the R
statistical language.
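To make the idea of a code linter more concrete, here is a toy sketch that checks a single rule: function names should be written in lower case with underscores, as PEP 8 recommends. The function `check_function_names` is an assumption made for illustration only. Real projects would normally rely on an established linter rather than writing their own.

```python
import ast
import re

# Lower-case letters, digits and underscores, optionally with leading underscores.
SNAKE_CASE = re.compile(r"^_*[a-z][a-z0-9_]*$")


def check_function_names(source):
    """Return a list of warnings for function names that are not snake_case."""
    warnings = []
    tree = ast.parse(source)  # Parse the code without running it
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not SNAKE_CASE.match(node.name):
                warnings.append(
                    f"line {node.lineno}: function '{node.name}' is not snake_case"
                )
    return warnings


example = '''
def loadData(path):
    return path

def clean_data(rows):
    return rows
'''

# Print a short report, one line per problem found.
for warning in check_function_names(example):
    print(warning)
```

The report flags `loadData` but not `clean_data`. A full linter applies many rules like this one across a whole codebase and is often run automatically before changes are merged.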
-
Sphinx is a documentation generator tool takes plain text files that
-use a markup syntax (such as reStructuredText or Markdown) for
-formatting the content of your documentation site and transforms them
-into various output formats, ready to be published on the internet. It
-has a number of useful features, but in this module we’ll learn the
-basics to document our research code.
Why are coding conventions important for collaborative research
+projects?
+
How can we establish and enforce coding style guidelines that
+promote consistency and readability?
+
-
-
Getting started
-
-
Let’s use Sphinx to create a documentation site for our Python
-code.
-
-
Installing Sphinx
-
-
Navigate to the root folder of your code project. Create a virtual
-environment using venv which is a
-separate area in which to install the Sphinx package. This command will
-create a virtual environment in a directory called
-.venv/
-
-
-
-
-
BASH
-
-
python-m venv .venv
+
+
+
-
-
-
BASH
-
-
python3-m venv .venv
-
+
+
Key Points
+
+
+
+Encourage collaboration: There are many
+ways to contribute to a research software project, including
+bug reports, feature suggestions, design discussions, documentation, and
+software engineering.
+
+Clear processes: Explain the process for making
+changes and having them included in the code.
+
+Bug reports: Create simple ways for users to report
+issues and have these problems resolved in a timely manner.
+
+Communication: Create appropriate communication
+channels so that design discussions and proposed changes may be worked
+through transparently.
+
-
-
-
BASH
-
-
python-m venv .venv
+
Further resources
+
+
+
To find out more about creating healthy communities of developers to
+collaborate on research software engineering projects, please visit the
+following resources:
How do I present comprehensive information to users
+of my research software?
+
How do I generate a website containing a user guide to my code?
+
What should a good documentation website contain?
+
How do I publish my software documentation on the internet?
+
-
This will create a subdirectory that contains the packages we’ll need
-to complete the exercises in this section.
-
Run the activation script to enable the virtual environment. The
-specific command needed to activate the virtual environment depends on
-the operating system you are using.
-
-
-
-
-
BASH
-
-
.venv\Scripts\activate
+
+
+
+
Objectives
+
+
Learn about documentation websites for software
+packages.
+
Gain basic familiarity with some common website generation
+tools.
+
Understand the basics of structuring a documentation website.
+
Be able to set up a static site deployment workflow.
+
-
-
-
BASH
-
-
source .venv/bin/activate
-
-
-
BASH
-
-
source .venv/bin/activate
+
Documentation websites
+
+
+
A documentation website is a user guide and reference manual for a
+library of research code. Up to now, we’ve looked at ways to put helpful
+notes in our code, but now we’ll learn how to write a longer, more
+complete guide to the research tools you create.
+
A documentation site brings all your user guidance into one place.
+This kind of resource may be prepared for research software and will
+usually contain an introduction, installation instructions, a user
+guide, troubleshooting tips, and an in-depth reference section.
+
To get an idea of this, here are some links to documentation websites
+for widely-used data analysis and research software packages:
+
+
+pandas is a data
+processing library for the Python programming language.
+
+ggplot2 is a
+plotting package for the R statistical language.
+
+scikit-learn
+is a machine learning library for the Python programming language.
+
+
+
+
+
+
Discussion
+
+
Evaluate these documentation sites.
+
+
What do you like about them?
+
How approachable are they as a new user?
+
What do you find difficult to understand in this material?
Sphinx includes a command to set up a new project called sphinx-quickstart.
-Navigate to your project’s root folder and run the following
-command.
-
-
BASH
+
Why create a website?
+
+
+
There are many advantages to building a documentation site to provide
+an information-rich resource for researchers who use your code at
+institutions all around the world.
+
+
Advantages
-
sphinx-quickstart docs --no-sep--ext-autodoc
+
These sites can work as hubs for collaboration,
+sharing the latest updates, and encouraging people to take up your
+system and get involved in improving it. The effort of setting one up
+will be rewarded in the long run because you will have created a
+valuable asset that will foster collaboration and knowledge sharing in
+your research community.
+
A key foundation stone of modern digital research practices is the
+ability to replicate results by reproducing analytic
+workflows. Clear, thorough documentation of the research code ensures
+that researchers can repeat processes and verify results and other
+people’s outputs.
+
Documentation sites are really useful for introducing new
+users to your software. They make it much easier and faster for
+new users to get started using your software to boost their research.
+They’re one of the most effective ways to create a user base that has a
+sophisticated understanding of the research code, which is essential for
+them to adapt it to the complex problems that often arise in research
+contexts.
+
They’re also a valuable resource for your existing user base,
+enabling them to look up reference material or search the manual to find
+new capabilities they weren’t aware of before. This will increase the
+potential for your software to boost the productivity of other
+research teams.
-
This will initialise the configuration files for a new Sphinx site in
-a subdirectory called docs/ and prompt you to enter the
-following options:
-
-
Project name: Birdsong Identifier
-
Author name(s): Bill Oddie
-
Project release []: 1.0
-
-
+
+
When to use one
+
+
Although the advantages are numerous, not all software packages
+require a comprehensive documentation website. However, for any code
+project that is growing in the number of collaborators, users, and
+technical complexity, consider coordinating the team to write one as
+soon as possible to help the project continue its healthy growth.
+
-
+
-
-
Sphinx options
+
+
Discussion
-
To find out more about the Sphinx configuration files, please read
-their guide to defining
-document structure on the Sphinx documentation.
-
+
When is it appropriate to establish a documentation website? Consider
+the following factors:
+
+
How many resources will it take to write and maintain?
+
How many end-users need the information?
+
Is there a simpler format that can convey the same information?
+
-
-
Building the site
-
-
In this context, building means taking our collection of
-Sphinx files and converting them into the source code files that define
-a website. Sphinx will create HyperText Markup Language (HTML)
-files, which is the markup language for pages that display in a web
-browser commonly used on the internet.
-
To build our site, we run the sphinx-build
-command using the -M option to select
-HTML syntax as the output
-format.
-
-
BASH
+
Contents
+
+
+
Documentation pages contain comprehensive
+information about a particular piece of research software.
+Think of them like a user manual for your car or an instruction guide for
+building a piece of furniture.
+
+
Research context
-
sphinx-build-M html docs docs/_build
-
-
Sphinx will load our files from the docs/ directory and
-output the built HTML files in the docs/_build
-directory.
-
The file docs/_build/html/index.html contains the home
-page of your new documentation site! Open that file to view your
-handiwork.
-
The Sphinx homepage for our documentation
-site
-
+
For research software, it may be important to explain the
+theoretical background or statistical methods that are
+used and explain the domain-specific assumptions that were made when the
+code was designed and written. It’s good practice to provide a concise
+summary of the relevant concepts and link to external sources such as
+papers, books, and other websites for users to take a deeper dive into
+the principles and algorithms used.
-
-
Autodoc
-
-
It can be useful to automatically populate our documentation sites by
-converting our documentation strings into
-formatted text. We can achieve this using the autodoc
-plugin for Sphinx.
-
-
Configuring Autodoc
-
-
Let’s set up the options for autodoc. (If you struggle
-with these steps, please refer to the template
-project.)
-
Add the following lines to docs/conf.py which
-
-
PYTHON
+
+
Installation instructions
-
# Our Python code may be imported from the parent directory
-import os
-import sys
-sys.path.insert(0, os.path.abspath('..'))
+
This section provides a detailed walkthrough of the steps required to
+install the package on the user’s computer, with details that are specific
+to their operating system.
-
This ensures that Sphinx can access our Python code by pointing at
-the root directory of our project. The .. syntax means “one
-folder up”, which means autodoc will search in the root
-directory for code to import.
-
-
-
-
-
-
The Python code uses sys.path,
-a list of locations to search for code. By modifying the Python
-module search path, we allow autodoc to locate and
-import our code modules from a specific directory that is not in the
-default search path.
-
This is often necessary when working with project structures that
-involve multiple directories, helping the interpreter to find code that
-isn’t installed in the standard library location.
+
+
Tutorials
+
+
It can be very useful to include an in-depth “Getting Started” guide
+that provides step-by-step instructions to introduce a new user to your
+software package. It might guide the user through each aspect of the
+tool’s functionality and features so they’re able to become familiar
+with it in a more approachable way.
+
A series of code examples to demonstrate how to use the software in
+different contexts can be very useful for users to get off the ground in
+implementing common research workflows to achieve their specific
+goals.
+
+
User reference
+
+
If you have written functions that are intended to be used in other
+researchers’ code, then an in-depth explanation of these procedures is
+essential reference material. In the world of software engineering,
+these detailed appendices are called
+API
+references, which list each function and describe how the arguments may
+be used to control how the code works. This content may be automatically
+generated from the documentation strings.
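As an illustration, here is what a documentation string might look like for a single function. This is a sketch only: the function `count_syllables` is hypothetical, and the docstring layout follows the NumPy convention, which is one of several common styles.

```python
def count_syllables(notes, min_duration=0.1):
    """Count the syllables in a recorded song.

    Hypothetical example function, used to show a documentation string.

    Parameters
    ----------
    notes : list of float
        The duration of each detected note, in seconds.
    min_duration : float, optional
        Notes shorter than this value (in seconds) are ignored.

    Returns
    -------
    int
        The number of notes that are at least ``min_duration`` long.
    """
    return sum(1 for duration in notes if duration >= min_duration)


# The docstring is stored on the function itself, which is what
# documentation tools read when they build an API reference.
print(count_syllables.__doc__)
```

Each argument is named and described, so a tool that reads the docstring has everything it needs to produce an entry in the reference section.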
+
+
Troubleshooting
+
+
As issues come up with your research code, and are eventually
+resolved and clarified, make a note of the causes of these troubles and
+make them available to the entire user base in your documentation site.
+This will help users to identify and fix common misunderstandings and
+technical problems they may run into when utilising your code.
+
This prevents a situation where potential solutions to common issues
+do exist, but are scattered around the internet or are the exclusive
+knowledge of a few individuals and are hard to find.
+
+
FAQs
+
+
An appendix containing frequently asked questions (FAQs) is very
+useful to save yourself time in responding to common queries from the
+users of your code.
-
Next, edit docs/index.rst and add the following lines to
-instruct Sphinx to automatically generation documentation for our Python
-module.
-
-
RST
+
Writing style
+
+
+
As we discussed in the episode on READMEs,
+it’s important to strive to use everyday, jargon-free language. It helps
+to set an approachable tone that encourages others to use the software
+and get involved with the project. This will ensure that the code is
+accessible to the widest possible layers of the research community and
+foster collaboration.
+
Always consider the target audience of your documentation, because
+your user base may be unaware of some of the unstated assumptions and
+technical background knowledge that you take for granted.
+
Tools
+
+
+
There are various tools available to build documentation sites for
+your research software.
+
+
GitHub Wiki
-
.. automodule:: oddsong.song
-:members:
+
If you are publishing your code on GitHub, which is a web service
+that hosts code repositories, then one of the easiest ways to create a
+documentation site is to use the wiki feature on that platform. This is
+a great way to write detailed, structured documents containing long-form
+content that describes aspects of your software. What’s more, it’s
+available alongside your code so your documentation and software are
+located in one place.
To create a wiki, which is a simple, easy-to-edit web site, go to the
+main page of your code repository on GitHub and click on the Wiki button
+on the top menu. For a detailed walkthrough of this process, please read
+adding
+or editing wiki pages on the GitHub documentation.
-.. indicates a directive within a
-reST document that is used to
-configure Sphinx.
-
-automodule:: indicates a specific directive to use
-autodoc to automatically generate documentation for a
-module.
-
-oddsong.song is the path to our Python module, for
-which documentation will be created.
-
-:members: is an optional argument for the automodule
-directive that instructs Sphinx to include documentation for all members
-(functions, classes, variables) defined within the specified
-module.
Now, when we build our site, Sphinx will scan the contents of the
-oddsong Python module and automatically generate a useful
-reference guide to our functions.
-
-
BASH
+
+
+
Documentation sites for R packages
-
sphinx-build-M html docs docs/_build
+
It’s also possible to generate a documentation site to accompany R
+packages that you create. For more information about this, please refer
+to the book R Packages by Hadley Wickham, which has a chapter
+on documentation
+websites.
-
The result looks something like this:
-
Python documentation string rendered as
-HTML
-
+
+
Sphinx
+
+
Sphinx is a tool for
+building documentation websites that is commonly used amongst developers
+of Python packages, although it’s also compatible with other programming
+languages. It doesn’t currently support packages written using the R
+statistical language.
+
Sphinx is a documentation generator that takes plain text files that
+use a markup syntax (such as reStructuredText or Markdown) for
+formatting the content of your documentation site and transforms them
+into various output formats, ready to be published on the internet. It
+has a number of useful features, but in this module we’ll learn the
+basics to document our research code.
+
-
+
-
-
Automatically generate content
+
+
Callout
-
Try using autodoc to analyise your own code and build a
-documentation site by following the steps above.
-
After the sphinx-build command has completed
-successfully, browse the contents of the docs/_build/html
-folder and discuss what you find.
Let’s use Sphinx to create a documentation site for our Python
+code.
+
+
Installing Sphinx
+
+
Navigate to the root folder of your code project. Create a virtual
+environment using venv which is a
+separate area in which to install the Sphinx package. This command will
+create a virtual environment in a directory called
+.venv/
+
+
+
+
+
BASH
+
+
python -m venv .venv
+
+
+
BASH
+
+
python3 -m venv .venv
-
Publishing
-
-
-
Now that you’ve started writing your documentation website, there are
-various ways to upload it to the internet so that others can read
-it.
-
There are several hosting services that can be used to publish your
-documentation site, such as GitHub
-Pages and Read the
-Docs.
-
The detailed of setting up the deployment of your site to these
-platforms is beyond the scope of this course.
-
-
-
+
+
+
BASH
+
+
python -m venv .venv
-
-
Key Points
-
-
-
Structured documentation websites are very useful for users to learn
-to use all kinds of digital systems, ensuring its successful adoption by
-the wider research community.
How do I introduce new contributors to my research
-software project?
-
What is the best way to communicate processes such
-as bug reporting?
-
Where should I write up the design and structure of
-the system?
-
+
This will create a subdirectory that contains the packages we’ll need
+to complete the exercises in this section.
+
Run the activation script to enable the virtual environment. The
+specific command needed to activate the virtual environment depends on
+the operating system you are using.
+
+
+
+
+
BASH
+
+
.venv\Scripts\activate
-
-
-
-
Objectives
-
-
Learn to write a contribution guide for research
-code
-
Learn about software coding standards
-
-
Implement ways to facilitate communication between
-researchers that are engaged in the project
-
Provide a high-level understanding of an existing
-codebase
-
-
+
+
+
BASH
+
+
source .venv/bin/activate
+
+
+
BASH
+
+
source .venv/bin/activate
-
Collaborative research software development
-
-
-
Often, in today’s research environment, much analytics software is
-written in a collaborative manner, involving multiple
-specialists from within a team, or from multiple institutions. For the
-long-term health of a software package, it’s important to encourage
-potential contributors to get in touch and feel welcome to take part.
-Useful research software can take on a life of its own.
-
-
-
-
-
Research software project management
-
-
For more information on planning the development of research software
-and project governance, see Module 1a.
Sphinx includes a command called sphinx-quickstart to set up a new project.
+Navigate to your project’s root folder and run the following
+command.
+
+
BASH
+
+
sphinx-quickstart docs --no-sep --ext-autodoc
-
It’s often published using an open source licence, which means that
-all the code is publicly available and may be used and modified by
-anyone, within certain conditions (see module 1b to learn more about
-software licensing.)
-
There’s a lot more creating and managing a sustainable community
-aorund a research software project, but having a central piece of
-documentation for contributors is a great start!
-
+
This will initialise the configuration files for a new Sphinx site in
+a subdirectory called docs/ and prompt you to enter the
+following options:
+
+
Project name: Birdsong Identifier
+
Author name(s): Bill Oddie
+
Project release []: 1.0
+
+
-
+
-
-
Discussion
+
+
Sphinx options
-
Consider these questions amongst the group:
-
-
How can we effectively foster a collaborative environment for
-research software development?
-
How can barriers to participation be removed for a diverse range of
-individuals and institutions?
-
What strategies can be implemented to ensure that all contributors
-feel valued and included?
-
+
To find out more about the Sphinx configuration files, please read
+their guide to defining
+document structure on the Sphinx documentation.
-
Contribution guides
-
-
-
Contribution guidelines help users and understand how they can help
-to improve the software, whether that’s by submitting bug reports,
-suggesting new features, or writing better code and documentation. All
-of these aspects are vital to produce reusable research software.
-
Potential collaborators should be able to easily find out how to take
-part and contribute. Developers should be encouraged to use appropriate
-communication channels to ask questions and inform others of proposed
-software changes. The contact details for the project administrator or
-committee should be available and they should be welcome and responsive
-to any queries.
-
It’s important to explain how the project is managed so the process
-for evaluating new features and getting them implemented is clear, such
-as the code review and approval process. For many projects, a ticket
-system may be used to raise issues and suggest new features. Software
-developers often propose new code by creating a branch on the version
-control system (such as Git) and requesting for those changes to be
-merged into the main codebase.
-
Contribution guides will save you time in the long run, because it
-provides an on-ramp for people to get involved, prevents them from
-getting confused, and reduces the amount of incorrectly-submitted bug
-reports or requests for change, etc.
-
-
-
-
-
Discussion
-
-
Discuss these issues amongst the group:
-
-
What essential components should be included in a comprehensive
-documentation for research software contributors?
-
How can we make onboarding new contributors a smooth and welcoming
-process, ensuring they have the necessary information and support to be
-successful?
-
How can we balance the need for clear guidelines with the desire to
-encourage creativity and innovation?
-
+
+
Building the site
+
+
In this context, building means taking our collection of
+Sphinx files and converting them into the source code files that define
+a website. Sphinx will create HyperText Markup Language (HTML) files.
+HTML is the markup language used for pages that are displayed in a web
+browser.
+
To build our site, we run the sphinx-build
+command using the -M option to select
+HTML syntax as the output
+format.
+
+
BASH
+
+
sphinx-build -M html docs docs/_build
+
Sphinx will load our files from the docs/ directory and
+output the built HTML files in the docs/_build
+directory.
+
The file docs/_build/html/index.html contains the home
+page of your new documentation site! Open that file to view your
+handiwork.
+
The Sphinx homepage for our documentation
+site
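
If you would rather start the build from a Python script than type the command each time, the step above can be sketched as follows. This is only an illustration: the helper names build_command, home_page and build_site are made up for this example, and it assumes the sphinx-build command is installed and that the docs/ layout matches the one used in this episode.

```python
import subprocess
from pathlib import Path


def build_command(source_dir="docs", build_dir="docs/_build", builder="html"):
    """Return the sphinx-build command line as a list of arguments."""
    # Mirrors: sphinx-build -M html docs docs/_build
    return ["sphinx-build", "-M", builder, source_dir, build_dir]


def home_page(build_dir="docs/_build", builder="html"):
    """Return the expected location of the built home page."""
    # With the -M option, output is written to a sub-folder named after the builder
    return Path(build_dir) / builder / "index.html"


def build_site():
    """Run the build and return the path of the home page (hypothetical helper)."""
    # check=True raises an error if sphinx-build reports a failure
    subprocess.run(build_command(), check=True)
    return home_page()
```

Calling build_site() from the root directory of the project should have the same effect as running the command by hand, and the returned path is the file to open in a web browser.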
+
-
-
How to write contributor guidance
+
+
Autodoc
+
+
It can be useful to automatically populate our documentation sites by
+converting our documentation strings into
+formatted text. We can achieve this using the autodoc
+plugin for Sphinx.
+
+
Configuring Autodoc
+
+
Let’s set up the options for autodoc. (If you struggle
+with these steps, please refer to the template
+project.)
+
Add the following lines to docs/conf.py:
+
+
PYTHON
-
The standard practice for authoring a contribution guide for a
-software project is to create a file called CONTRIBUTING.md
-in the root folder of your project. This is a Markdown file that
-introduces new people to the project. It lets people know the ways they
-can take part in the research software project and what to do to get
-involved.
-
The specific contents of this file depend upon the kind of research
-project, but some useful information to provide typically includes:
-
-
An introduction to the organisation and structure of the code,
-possibly including diagrams.
-
Instructions to raising issues, suggesting new features, and
-proposing code changes.
-
Links to additional documentation that’s hosted elsewhere, such as a
-code of conduct or discussion forum.
-
A walkthrough to setting up a development environment, such as
-guidance on installing developer tools or other prerequisites.
-
-
On code hosting platforms such as GitHub, the contribution guide will
-be created automatically using this CONTRIBUTING.md
-Markdown file.
-
-
-
+
# Our Python code may be imported from the parent directory
+import os
+import sys
+sys.path.insert(0, os.path.abspath('..'))
-
-
Challenge
-
-
Create a new file called CONTRIBUTING.md and populate it
-with a few sentences.
-
-
What are the most important things for a new contributor to
-know?
-
What should a user do if they encounter a bug?
-
What are the common questions that a new developer might have when
-they work on this research software?
-
+
This ensures that Sphinx can access our Python code by pointing at
+the root directory of our project. The .. syntax means “one
+folder up”, which means autodoc will search in the root
+directory for code to import.
+
+
+
+
+
+
The Python code uses sys.path,
+a list of locations to search for code. By modifying the Python
+module search path, we allow autodoc to locate and
+import our code modules from a specific directory that is not in the
+default search path.
+
This is often necessary when working with project structures that
+involve multiple directories, helping the interpreter to find code that
+isn’t installed in the standard library location.
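
To see the effect of changing the search path, here is a small self-contained experiment that can be run outside of Sphinx. It is a sketch for illustration only: the module name hypothetical_song and its greet function are invented for this example and are not part of the lesson project.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway directory containing a tiny module.
# "hypothetical_song" is an illustrative name, not part of the lesson project.
project_root = Path(tempfile.mkdtemp())
(project_root / "hypothetical_song.py").write_text(
    'def greet():\n    """Return a greeting."""\n    return "tweet"\n'
)

# Without this line the import below would fail with ModuleNotFoundError,
# because the temporary directory is not on the default search path.
sys.path.insert(0, str(project_root))

# Make sure Python notices the file we have just created
importlib.invalidate_caches()

song = importlib.import_module("hypothetical_song")
print(song.greet())
```

Inserting at position 0 puts the new directory at the front of the list, so it is searched before any other location. The lines added to docs/conf.py work in the same way, with the parent directory of docs/ taking the place of the temporary folder.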
-
Software project governance
-
-
-
Project governance defines the scope and aims of a research software
-engineering project, and determines how decisions will be made and
-carried out. It sets out the processes and responsibilities that
-collaborators must understand to take part. This is something that
-should be considered when preparing a software management plan, as
-discussed in Module 1a of this course. This is important to make sure
-that questions of who does what, and how, are stated clearly so that
-everyone can understand and collaborate effectively to produce excellent
-research software. It’s worthwhile to think about this early on in a
-project to avoid potential pitfalls later on!
-
-
Code of conduct
+
Next, edit docs/index.rst and add the following lines to
+instruct Sphinx to automatically generate documentation for our Python
+module.
+
+
RST
+
.. automodule:: oddsong.song
+   :members:
-
A code of conduct provides guidelines for the expected behaviour of
-people who are involved in the project. You may want to provide some
-general tips to create a productive community of researchers around the
-software, such as creating positive interactions between contributors,
-treat others with respect and dignity, and recommendations for processes
-for handling differences of opinion.
Fosters a healthy, collaborative working
-environment where people feel respected, included, and can
-freely share ideas.
-Managing expectations and creating clear rules will
-reduce the amount of time wasted due to misunderstanding and
-conflicts.
+.. indicates a directive within a
+reST document that is used to
+configure Sphinx.
-Build a communinity: an ethically-run and
-transparent project will encourage contributors to share the values of
-the project and remain engaged.
+automodule:: indicates a specific directive to use
+autodoc to automatically generate documentation for a
+module.
+
+oddsong.song is the path to our Python module, for
+which documentation will be created.
+
+:members: is an optional argument for the automodule
+directive that instructs Sphinx to include documentation for the
+documented public members (functions, classes, variables) defined
+within the specified module.
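
As an illustration of the kind of code that autodoc works with, here is a sketch of a function with a documentation string. The function identify_song, its parameters and its placeholder body are assumptions made up for this example, and the real contents of the oddsong.song module may look quite different.

```python
import inspect


def identify_song(recording_path, threshold=0.5):
    """Identify the bird species heard in a recording.

    :param recording_path: Location of the audio file to analyse.
    :param threshold: Minimum confidence (0 to 1) needed to report a match.
    :returns: The name of the best-matching species, or None.
    """
    # Placeholder body: a real implementation would analyse the audio here
    return None


# Show the documentation string with its indentation tidied up.
# This is the text that a documentation tool has to work with.
print(inspect.getdoc(identify_song))
```

The :param: and :returns: fields are reStructuredText field lists, which Sphinx can render as a formatted parameter list. A function without a documentation string gives the tool nothing to display, which is a good reason to write one for every public function.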
-
For many working in a research context, there are additional
-considerations to ensure that institutional policies, ethics, and data
-protection regulations are carefully observed. These protocols are
-outside the scope of this document, but these factors should be clearly
-communicated to all contributors.
-
-
Many open-source research software projects adopt the Contributor Covenant,
-which can also be customised to suit the needs of your
-collaborators.
For people who are contributing code to the project, they’ll need the
-following information:
-
-
Which version control system is being used. Typically, this will be
-git or similar tools, as discussed in Module 2 of this
-course.
-
How to add automatic tests and whether a testing framework is in
-place.
-
Describe the code organisation and package structure.
-
-
-
Technical documentation
-
-
System documentation is important for new contributors to familiarise
-themselves with the codebase and as a reference for existing engineers.
-There should be a concise description of how the system works from a
-more technical perspective, with the intended audience being software
-developers, rather than the research users.
-
An architecture diagram is an efficient way to provide a “map” to
-help developers to understand and navigate a complex system.
-
-
Coding conventions
+
+
Now, when we build our site, Sphinx will scan the contents of the
+oddsong Python module and automatically generate a useful
+reference guide to our functions.
+
+
BASH
-
Many projects following programming standards to manage code quality.
-A coding style guide will help to ensure consistency across all the code
-written as part of a collaborative project, which helps others to read
-and interpret the code, making it easier to maintain in the long run.
-The code style rules should cover things like the way to describe
-functions, how to indent code, and naming conventions for variables.
-
This might include guidance and advice, or more strict rules as
-standards that are checked by a code linter. A code linter is an
-analysis tool that inspects code and checks for common errors and
-problems, producing a report for the developer to read and act upon.
-Common coding style standards include the PEP 8 style guide for the
-Python programming language and the tidyverse style guide in the R
-statistical language.
-
+
sphinx-build -M html docs docs/_build
+
+
The result looks something like this:
+
Python documentation string rendered as
+HTML
+
-
-
Discussion
+
+
Automatically generate content
-
Discuss these issues as a group:
-
-
Why are coding conventions important for collaborative research
-projects?
-
How can we establish and enforce coding style guidelines that
-promote consistency and readability?
-
+
Try using autodoc to analyse your own code and build a
+documentation site by following the steps above.
+
After the sphinx-build command has completed
+successfully, browse the contents of the docs/_build/html
+folder and discuss what you find.
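
If you prefer to inspect the build output from Python instead of a file browser, a minimal sketch might look like this. The function name list_built_pages is hypothetical, and the default path assumes the same docs/_build location used earlier in this episode.

```python
from pathlib import Path


def list_built_pages(html_dir="docs/_build/html"):
    """Return the HTML pages found under html_dir, sorted by relative path."""
    root = Path(html_dir)
    # rglob searches the folder and all of its sub-folders
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.html"))
```

Running print(list_built_pages()) from the root directory of your project after a successful build should list index.html along with the other pages that Sphinx generated.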
+
+
+
+
Publishing
+
+
+
Now that you’ve started writing your documentation website, there are
+various ways to upload it to the internet so that others can read
+it.
+
There are several hosting services that can be used to publish your
+documentation site, such as GitHub
+Pages and Read the
+Docs.
+
The details of setting up the deployment of your site to these
+platforms are beyond the scope of this course.
@@ -3947,42 +4010,38 @@
Discussion
Key Points
-
-Encourage collaboration: There are many
-ways to contribute to a research software project, including
-bug reoprts, feature suggests, design discussions, documentation, and
-software engineering.
-
-Clear processes: Explain the process for making
-changes and having them included into the code
-
-Bug reports: Create simple ways for users to report
-issues and have these problems resolved in a timely manner.
-
-Communication: Create appropriate communication
-channels so that design discussions and proposed changes may be worked
-through transparently.
+
Structured documentation websites help users learn to use all kinds of
+digital systems, which supports the adoption of those systems by the
+wider research community.
There are several libraries that may be used to generate
+documentation sites.
+
Documentation websites may be deployed to a hosting platform.
-
Further resources
-
To find out more about creating healthy communities of developers to
-collaborate on research software engineering projects, please visit the
-following resources:
+
Please review the following material which provides more information
+about some of the topics covered in this episode.