The rules for good code snippets

Code snippets are essential; each snippet is most of the time designed to focus on a particular feature/characteristic of a programming language, a library, a framework or a behavior. They help us to quickly test, experiment and remember how bigger projects or at least some parts of them are working. They can even be used as quick proofs of concept.

Learning or remembering is much quicker when you have at your disposal your own set of snippets. This set of snippets is a knowledge asset, we must become aware of its value for us but also from a project team or company point of view. Any newcomer on a project will have a faster learning curve if he/she has access to a set of snippets that capture the essence of the project.

Nowadays, with all the progress made on Artificial Intelligence, this asset is even becoming more and more important as it can be used for model training or fine-tuning purposes!

With some simple rules and a little discipline, you can make your snippets quite easy to manage and share. Let's review those rules and how I'm following them with dedicated software in a second step.

When writing source code snippets, try to take into account the maximum number of those rules to implement the best possible snippets.

One purpose rule

A good code snippet has one purpose, such as learning something, writing a quick proof of concept, automating a task, keeping the overall design principles of a project, testing some programming language feature, ...

The underlying goal is to speed up the reminding process. If you come back several months, years on a given snippet, you'll be able to quickly understand what it does and how it works.

A NIX regex snippet: The purpose is to quickly illustrate how regex works in the NIX language

let
  results = builtins.split "<([^>]+)>" "John Doe <[email protected]>";
in with builtins; {
  first_matching_1 = (head (head (tail results))) ;
  first_matching_2 = (elemAt results 1) ;
}

One file rule

A good code snippet is just one file. A single file with a meaningful directory tree and naming helps a lot to keep the complexity under control as well as it enhances the developer experience once the snippet has been shared. The directory tree can be used to group snippets by topic, by language, by technology stack, ... while the filename synthesizes their purpose.

A bash snippet stored in a file named bash-date-iso8601-formatted.sh, just from the filename we know it is about how to generate iso8601 dates from the bash shell:

date -Is   # not GMT
date -u +"%Y-%m-%dT%H:%M:%SZ"
date -u +"%Y-%m-%dT%H:%M:%S.%3NZ"

Easy-to-run rule

A good code snippet is easy to run, typically from the command line as a script or within a Read-Eval-Print-Loop (REPL) session. It also means you will have to provide all the necessary information to run it. When available the REPL is great to interactively design or test your snippet.

A Scala snippet executed from a REPL session which illustrates how to use the java data faker library:

$ scala-cli --dep com.github.javafaker:javafaker:1.0.2
Welcome to Scala 3.6.4 (23.0.1, Java OpenJDK 64-Bit Server VM).
Type in expressions for evaluation. Or try :help.

scala> val faker = new com.github.javafaker.Faker()
     |
     | val appName = faker.app.name
     | val author = faker.name.name
     | val email = faker.internet.emailAddress
     | val company = faker.company.name
     | val companyURL = faker.company.url
     | val country = faker.country.name
     |
     | println(s"""'$appName' an application By $author / $email""")
     | println(s"""'$company' / $country/ $companyURL""")
'Namfix' an application By Tamala Jacobi / [email protected]
'Hansen-Yost' / Turkmenistan/ www.macejkovicblickandblanda.co

The same snippet made executable from the command line:

$ ./data-fake.sc
'Gembucket' an application By Thad Deckow II / [email protected]
'Goodwin, Cremin and Koss' / Israel/ www.watersgerholdandnicolas.com

$ cat data-fake.sc
#!/usr/bin/env scala-cli
//> using scala "3.6.4"
//> using dep "com.github.javafaker:javafaker:1.0.2"
val faker = new com.github.javafaker.Faker()
val appName = faker.app.name
val author = faker.name.name
val email = faker.internet.emailAddress
val company = faker.company.name
val companyURL = faker.company.url
val country = faker.country.name
println(s"""'$appName' an application By $author / $email""")
println(s"""'$company' / $country/ $companyURL""")

The previously shown snippet shows that there is some contextual information provided to make it easy to run :

required execution runtime
- scala-cli is the runtime to use
needed software dependencies
- scala 3.6.4 is used (and automatically downloaded)
- com.github.javafaker:javafaker:1.0.2 is resolved and downloaded (automatically from maven central)

As just shown, the best developer experience is achieved when you use a technology stack which is able to resolve and load dependencies at runtime. Then you will have just one file for everything. This is the case at least for :

Scala : scala-cli with the //> special comment syntax
Rust : rust-script with the // cargo-deps : or //! special comment syntaxes
Java : jbang with the //DEP special comment syntax

You can also use a shebang to make your snippet directly executable while giving to the reader the information about how it is executed.

Self-describing rule

A good code snippet describes by itself what it does. When it makes sense, design your snippet to make its execution self-describing. This is quite simple if you use a unit test framework as most of them can generate a report on standard output.

A Scala snippet execution showing the behavior of the upickle library:

Run starting. Expected test count is: 6
JsonUpickleCookBook:
upickle
- should deserialize json array strings (36 milliseconds)
- should deserialize scala case classes ? (7 milliseconds)
- should deserialize scala case classes with optional field? default value is supported (3 milliseconds)
- should deserialize scala case classes with optional field? (3 milliseconds)
- should deserialize scala list (0 milliseconds)
- should provide an AST (13 milliseconds)
Run completed in 194 milliseconds.
Total number of tests run: 6
Suites: completed 1, aborted 0
Tests: succeeded 6, failed 0, canceled 0, ignored 0, pending 0
All tests passed.

So when executed, the output describes exactly what it does, and even more, as this is unit tests based, it will fail until you gave the right implementation! Or until you've understood what's going on ;)

Security and confidentiality rule

A good code snippet takes care of security and confidentiality. It should not contain any sensitive information, and should not be able to access any sensitive information.

Some good practices related to security and confidentiality :

No secrets (tokens, passwords, passphrases, cryptographic keys, ...)
- Use environment variables to inject secrets if needed
No sensible data (IP addresses, usernames, internal URL, ...)
- Use environment variables to inject sensible data
No sensible code in snippets you plan to share
- Stay as generic as possible
- Consider shared snippets as learning materials or basic tools
Always keep in mind trademarks, copyrights, licenses
- Really take care with copy/paste content from anywhere
- Avoid the use of internal AI assistant if you plan to share the snippet

Scala snippet which uses an environment variable to inject a secret :

//...
envOrNone("RANDOM_ORG_API_KEY") match {
  case None      => println("Signup to random.org, create an api key or reuse an existing one - https://api.random.org/dashboard")
  case Some(key) =>
    println(generate(key, 5, 1, 49).sorted.mkString(","))
    println(generate(key, 1, 1, 10).sorted.mkString(","))
}
//...

Manageable and shareable rule

A good code snippet comes with meta-data typically used to identify it uniquely, to describe how to share it, to add license information, description keywords, ...

Why? Because once you or your team have written hundreds of snippets, you'll quickly need to industrialize their life cycle. You'll need to find solutions for :

how to search through your snippets?
how to share them? Publicly or internally?
how to check their execution status?
how to manage change history?
how to let people react over them?
...

In relation to the one-file rule the right approach is to add meta-data as comments directly in the header of the snippet file. It enables resilience and simplicity.

Snippet meta-data header example, coming from my chosen approach :

// summary : take a look into models zoo
// keywords : djl, machine-learning, tutorial, detection, ai, zoo, @testable
// publish : gist
// authors : David Crosson
// license : Apache License Version 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
// id : f6b25f1c-2d29-4cdb-a030-f7dbb3b94e96
// created-on : 2022-03-09T18:41:39+01:00
// managed-by : https://github.com/dacr/code-examples-manager
// run-with : scala-cli $file

Once you've achieved this, you will discover that you are currently building a knowledge base, which can be automatically processed by dedicated software.

Let's take a look at my experience returns and the dedicated software I'm using to achieve all of this.

Experience returns

I wrote my first "good" snippet back in 2018 (Fully asynchronous http client call with JSON response using akka-http and now 7 years later, I've a data set made of almost 1300 snippets (almost 1000 are publishable).

All my personal snippets are stored in a private GIT repository
All my team snippets are stored in a shared GIT repository
Almost everything is managed by a dedicated software called code-examples-manager :
- publishes/synchronizes on GitHub gists and Gitlab snippets
- generates and synchronizes a summary of the published snippets knowledge base
- executes all executable snippets to check their states and if some maintenance is needed
- check global snippets coherency and report various statistics
A tool coded as a snippet (cem-tool.sc) is used to feed an opensearch search engine
- Dashboard with various statistics,
- Snippets execution trends, ...
Stable set of meta-data keywords, see code-examples-manager for more information
Such a discipline over the years unlocks me various opportunities
- Elasticsearch/opensearch training with a part based on this dataset :)
- Various AI-related experiments again with this dataset :)
I've started with what I initially called code examples, now it can be anything, so I use the 'snippet' terminology
- snippets as documentation,
- snippets as configuration files,
- snippets as dedicated dataset (csv content for example),
- ...
My state of mind
- I'm happy to share my snippets
- I use them every day as they are fully integrated in my daily work
- Time to add more AI use cases with them

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
assets		assets
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

The rules for good code snippets

Simple rules for good code snippets

One purpose rule

One file rule

Easy-to-run rule

Self-describing rule

Security and confidentiality rule

Manageable and shareable rule

Experience returns

About

Releases

Packages

License

dacr/the-rules-for-good-code-examples

Folders and files

Latest commit

History

Repository files navigation

The rules for good code snippets

Simple rules for good code snippets

One purpose rule

One file rule

Easy-to-run rule

Self-describing rule

Security and confidentiality rule

Manageable and shareable rule

Experience returns

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Packages