Merged

29 commits
9d6c826
Update docs and tests with strict syntax
bentsherman Sep 25, 2024
189b560
Remove some references to Groovy
bentsherman Sep 26, 2024
24bb403
Rename "implicit workflow" -> "entry workflow"
bentsherman Sep 26, 2024
5c0ec1d
Initial syntax reference page
bentsherman Sep 28, 2024
39ee1c9
Revert non-docs changes
bentsherman Sep 28, 2024
00ba5d1
Cleanup
bentsherman Sep 28, 2024
69772bf
Add section on script definitions
bentsherman Sep 28, 2024
24e65e0
Update config syntax docs
bentsherman Sep 28, 2024
e9faf58
Apply suggestions from code review
bentsherman Oct 1, 2024
23e05b7
Update docs/module.md
bentsherman Oct 1, 2024
1a173a0
Apply suggestions from code review
bentsherman Oct 1, 2024
74480fc
Add remaining sections to syntax page
bentsherman Sep 30, 2024
4aee05a
Apply suggestions from code review
bentsherman Oct 1, 2024
e073700
minor edits
bentsherman Oct 1, 2024
6a66f71
Apply suggestions from review
bentsherman Oct 3, 2024
0d91594
Apply suggestions from code review
bentsherman Oct 3, 2024
3d29edb
Update description of expressions
bentsherman Oct 8, 2024
71236c9
Suggestions for syntax page
christopher-hakkaart Oct 9, 2024
505c98e
Revert "Suggestions for syntax page"
christopher-hakkaart Oct 9, 2024
0645746
Apply suggestions from review
bentsherman Oct 17, 2024
9444ed1
Apply suggestions from review
bentsherman Oct 18, 2024
110ede7
Apply suggestions from review
bentsherman Oct 18, 2024
48ac8b5
Update docs/dsl1.md
bentsherman Oct 21, 2024
30ea996
Revert "Update docs/dsl1.md"
bentsherman Oct 21, 2024
93154ee
Update docs/config.md
bentsherman Oct 21, 2024
04a311d
Remove redundant subscriber code snippet
bentsherman Oct 21, 2024
63633d1
Add note about implicit closure parameter
bentsherman Oct 23, 2024
17b7ab5
Merge branch 'master' into docs-strict-syntax
pditommaso Oct 23, 2024
3f6f377
Update docs [ci skip]
pditommaso Oct 23, 2024
2 changes: 1 addition & 1 deletion docs/channel.md
@@ -52,7 +52,7 @@ process foo {

workflow {
result = foo(1)
result.view { "Result: ${it}" }
result.view { file -> "Result: ${file}" }
}
```

10 changes: 9 additions & 1 deletion docs/config.md
@@ -48,7 +48,15 @@ The same mechanism allows you to access environment variables defined in the hos

### Comments

Configuration files use the same conventions for comments used by the Groovy or Java programming languages. Thus, use `//` to comment a single line, or `/*` .. `*/` to comment a block on multiple lines.
You can use `//` to comment a single line, or `/* ... */` to comment a block on multiple lines:

```groovy
// single line comment

/*
* multi-line comment
*/
```

### Includes

4 changes: 3 additions & 1 deletion docs/developer/plugins.md
@@ -7,6 +7,8 @@ This page describes how to create, test, and publish third-party plugins.

The best way to get started with your own plugin is to refer to the [nf-hello](https://github.com/nextflow-io/nf-hello) repository. This repository provides a minimal plugin implementation with several examples of different extension points and instructions for building, testing, and publishing.

Plugins can be written in Java or Groovy.

The minimal dependencies are as follows:

```groovy
@@ -151,7 +153,7 @@ Refer to the source code of Nextflow's built-in executors to see how to implemen
:::{versionadded} 22.09.0-edge
:::

Plugins can define custom Groovy functions, which can then be included into Nextflow pipelines.
Plugins can define custom functions, which can then be included into Nextflow pipelines.

To implement a custom function, create a class in your plugin that extends the `PluginExtensionPoint` class, and implement your function with the `Function` annotation:
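The accompanying example is collapsed in this view; the following is a minimal sketch of such an extension, modeled on the nf-hello pattern (the class name, function name, and plugin id are illustrative):

```groovy
import nextflow.Session
import nextflow.plugin.extension.Function
import nextflow.plugin.extension.PluginExtensionPoint

class ExampleExtension extends PluginExtensionPoint {

    @Override
    protected void init(Session session) {
        // no session-level setup required for this sketch
    }

    // usable in a pipeline via: include { sayHello } from 'plugin/nf-example'
    @Function
    String sayHello(String name) {
        return "Hello, ${name}!"
    }
}
```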

2 changes: 1 addition & 1 deletion docs/dsl1.md
@@ -88,7 +88,7 @@ In DSL1, the entire Nextflow pipeline must be defined in a single file (e.g. `ma
DSL2 introduces the concept of "module scripts" (or "modules" for short), which are Nextflow scripts that can be "included" by other scripts. While modules are not essential to migrating to DSL2, nor are they mandatory in DSL2 by any means, modules can help you organize a large pipeline into multiple smaller files, and take advantage of modules created by others. Check out the {ref}`module-page` to get started.

:::{note}
With DSL2, the Groovy shell used by Nextflow also imposes a 64KB size limit on pipeline scripts, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
With DSL2, Nextflow scripts cannot exceed 64KB in size, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
:::

## Deprecations
1 change: 1 addition & 0 deletions docs/index.md
@@ -111,6 +111,7 @@ fusion
:caption: Reference
:maxdepth: 1

reference/syntax
reference/cli
reference/config
reference/env-vars
32 changes: 17 additions & 15 deletions docs/module.md
@@ -2,15 +2,15 @@

# Modules

In Nextflow, a **module** is a script that may contain functions, processes, and workflows (collectively referred to as *components*). A module can be included in other modules or pipeline scripts and even shared across workflows.
Nextflow scripts can include **definitions** (workflows, processes, and functions) from other scripts. When a script is included in this way, it is referred to as a **module**. Modules can be included by other modules or pipeline scripts and can even be shared across workflows.

:::{note}
Modules were introduced in DSL2. If you are still using DSL1, see the {ref}`dsl1-page` page to learn how to migrate your Nextflow pipelines to DSL2.
:::

## Module inclusion

A component defined in a module script can be imported into another Nextflow script using the `include` keyword.
Any definition in a module can be included into another Nextflow script using the `include` keyword.

For example:

@@ -23,7 +23,7 @@ workflow {
}
```

The above snippet imports a process named `foo`, defined in the module script, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.
The above snippet imports a process named `foo`, defined in the module, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.

Nextflow implicitly looks for the script file `./some/module.nf`, resolving the path against the *including* script location.

@@ -57,7 +57,7 @@ Module directories allow the use of module-scoped binary scripts. See [Module

## Multiple inclusions

A Nextflow script can include any number of modules, and an `include` statement can import any number of components from a module. Multiple components can be included from the same module by using the syntax shown below:
A Nextflow script can include any number of modules, and an `include` statement can import any number of definitions from a module. Multiple definitions can be included from the same module by using the syntax shown below:

```groovy
include { foo; bar } from './some/module'
@@ -73,7 +73,7 @@ workflow {

## Module aliases

When including a module component, it's possible to specify an *alias* with the `as` keyword. Aliasing allows you to avoid module name clashes, by assigning them different names in the including context. For example:
When including a definition from a module, you can specify an *alias* with the `as` keyword. Aliasing allows you to avoid name clashes by assigning included definitions different names in the including context. For example:

```groovy
include { foo } from './some/module'
@@ -85,7 +85,7 @@ workflow {
}
```

You can even include the same component multiple times under different names:
You can even include the same definition multiple times under different names:

```groovy
include { foo; foo as bar } from './some/module'
@@ -96,13 +96,15 @@ workflow {
}
```

(module-params)=

## Module parameters

:::{deprecated} 24.07.0-edge
As a best practice, parameters should be used in the entry workflow and passed to functions / processes / workflows as explicit inputs.
As a best practice, parameters should be used in the entry workflow and passed to workflows, processes, and functions as explicit inputs.
:::
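A minimal sketch of that recommended pattern (the parameter name, module path, and process name are illustrative):

```groovy
params.greeting = 'Hello'

include { sayHello } from './modules/greet'

workflow {
    // read the parameter in the entry workflow and pass it down explicitly
    sayHello(params.greeting)
}
```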

A module script can define parameters using the same syntax as a Nextflow workflow script:
A module can define parameters using the same syntax as a Nextflow workflow script:

```groovy
params.foo = 'Hello'
@@ -182,9 +184,9 @@ Ciao world!

## Module templates

The module script can be defined in an external {ref}`template <process-template>` file. The template file can be placed in the `templates` directory where the module script is located.
Process script {ref}`templates <process-template>` can be included alongside a module in the `templates` directory.

For example, suppose we have a project L with a module script that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:
For example, suppose we have a project L with a module that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:

```
Project L
@@ -208,15 +210,15 @@ Pipeline B
└── main.nf
```
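In such a module, a process refers to its template by file name, which Nextflow resolves against the `templates` directory next to the module script. A minimal sketch (the input name and template file name are illustrative):

```groovy
process P1 {
    input:
    val sample_id

    script:
    // resolved as templates/p1_task.sh relative to this module
    template 'p1_task.sh'
}
```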

With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module script.
Because the template files can be kept inside project L, projects A and B can use the modules defined in L without any changes. A future project C would do the same, simply cloning L (if not available on the system) and including its module.

Besides promoting the sharing of modules across pipelines, there are several advantages to keeping module templates under the script path:

1. module components are *self-contained*,
2. module components can be tested independently from the pipeline(s) that import them,
3. it is possible to create libraries of module components.
1. modules are *self-contained*,
2. modules can be tested independently from the pipeline(s) that import them,
3. it is possible to create libraries of modules.

Ultimately, having multiple template locations allows a more structured organization within the same project. If a project has several module components, and all of them use templates, the project could group module scripts and their templates as needed. For example:
Ultimately, having multiple template locations allows a more structured organization within the same project. If a project has several modules, and all of them use templates, the project could group module scripts and their templates as needed. For example:

```
baseDir
6 changes: 2 additions & 4 deletions docs/overview.md
@@ -95,11 +95,9 @@ Read the {ref}`executor-page` to learn more about the Nextflow executors.

## Scripting language

Nextflow is designed to have a minimal learning curve, without having to pick up a new programming language. In most cases, users can utilise their current skills to develop Nextflow workflows. However, it also provides a powerful scripting DSL.
Nextflow is a workflow language, based on [Java](https://en.wikipedia.org/wiki/Java_(programming_language)) and [Groovy](https://groovy-lang.org/), which is designed to make it as simple as possible to write scalable and reproducible pipelines. In most cases, users can leverage their existing programming skills to develop Nextflow pipelines, without the steep learning curve that usually comes with a new programming language.

Nextflow scripting is an extension of the [Groovy programming language](<http://en.wikipedia.org/wiki/Groovy_(programming_language)>), which in turn is a super-set of the Java programming language. Groovy can be considered as Python for Java in that it simplifies the writing of code and is more approachable.

Read the {ref}`script-page` section to learn about the Nextflow scripting language.
Read the {ref}`script-page` page to learn about the Nextflow scripting language.

## Configuration options

86 changes: 29 additions & 57 deletions docs/process.md
@@ -2,39 +2,23 @@

# Processes

In Nextflow, a **process** is the basic processing primitive to execute a user script.
In Nextflow, a **process** is a function that is specialized for executing scripts in a scalable and portable manner.

The process definition starts with the keyword `process`, followed by process name and finally the process body delimited by curly braces. The process body must contain a string which represents the command or, more generally, a script that is executed by it. A basic process looks like the following example:
Here is an example process definition:

```groovy
process sayHello {
output:
path 'hello.txt'

script:
"""
echo 'Hello world!' > file
echo 'Hello world!' > hello.txt
"""
}
```

A process may contain any of the following definition blocks: directives, inputs, outputs, when clause, and the process script. The syntax is defined as follows:

```
process < name > {

[ directives ]

input:
< process inputs >

output:
< process outputs >

when:
< condition >

[script|shell|exec]:
< user script to be executed >

}
```
Refer to {ref}`syntax-process` in the syntax reference for a full description of the process syntax.

(process-script)=

@@ -139,9 +123,9 @@ Since the actual location of the interpreter binary file can differ across platf

### Conditional scripts

So far, our `script` block has always been a simple string expression, but in reality, the `script` block is just Groovy code that returns a string. This means that you can write arbitrary Groovy code to determine the script to execute, as long as the final statement is a string (remember that the `return` keyword is optional in Groovy).
So far, the `script` block has just been a string, but in reality, the `script` block is like a function that returns a string. This means that you can write arbitrary code to determine the script, as long as the final statement is a string (remember that the `return` keyword is optional).

For example, you can use flow control statements (`if`, `switch`, etc) to execute a different script based on the process inputs. The only difference here is that you must explicitly declare the `script` guard, whereas before it was not required. Here is an example:
For example, you can use if-else statements to produce a different script based on the task inputs. The only difference here is that you must explicitly declare the `script` guard, whereas before it was not required. Here is an example:

```groovy
mode = 'tcoffee'
@@ -171,7 +155,7 @@ process align {
}
```

In the above example, the process will execute one of the script fragments depending on the value of the `mode` parameter. By default it will execute the `tcoffee` command, but changing the `mode` variable will cause a different branch to be executed.
In the above example, the process will execute one of several scripts depending on the value of the `mode` parameter. By default it will execute the `tcoffee` command.

(process-template)=

@@ -250,7 +234,7 @@ In the above example, `$USER` is treated as a Bash variable, while `!{str}` is t

### Native execution

Nextflow processes can also execute native Groovy code as the task itself, using the `exec` block. Whereas the `script` block defines a script to be executed, the `exec` block defines Groovy code to be executed directly.
Whereas the `script` block defines a script that is executed as a separate job, the `exec` block simply executes the code that it is given, without launching a job.

For example:

@@ -276,6 +260,8 @@ Hello Mr. a
Hello Mr. c
```

A native process is very similar to a {ref}`function <syntax-function>`, but provides additional capabilities such as parallelism, caching, and progress logging.
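A minimal sketch of a native process consistent with the output shown above (the process name is illustrative):

```groovy
process greet {
    input:
    val x

    exec:
    // runs in the Nextflow runtime itself; no command script is launched
    println "Hello Mr. $x"
}

workflow {
    Channel.of('a', 'b', 'c') | greet
}
```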

(process-stub)=

## Stub
@@ -492,7 +478,7 @@ In this case, `x.name` returns the file name with the parent directory (e.g. `my

### Multiple input files

A `path` input can also accept a collection of files instead of a single value. In this case, the input variable will be a Groovy list, and you can use it as such.
A `path` input can also accept a collection of files instead of a single value. In this case, the input variable will be a list, and you can use it as such.

When the input has a fixed file name and a collection of files is received by the process, the file name will be appended with a numerical suffix representing its ordinal position in the list. For example:
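The original example is collapsed in this view; a minimal sketch of the behavior (the process, file names, and glob are illustrative):

```groovy
process concatFasta {
    input:
    path 'seq'   // a fixed name receiving a collection is staged as seq1, seq2, seq3, ...

    output:
    path 'all.fa'

    script:
    """
    cat seq* > all.fa
    """
}

workflow {
    fastas = Channel.fromPath('data/*.fa').collect()
    concatFasta(fastas)
}
```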

@@ -584,7 +570,7 @@ The `env` qualifier allows you to define an environment variable in the process 
```groovy
process printEnv {
input:
env HELLO
env 'HELLO'

'''
echo $HELLO world!
@@ -619,7 +605,7 @@ process printAll {

workflow {
Channel.of('hello', 'hola', 'bonjour', 'ciao')
| map { it + '\n' }
| map { v -> v + '\n' }
| printAll
}
```
@@ -841,7 +827,7 @@ workflow {
methods = ['prot', 'dna', 'rna']

receiver = foo(methods)
receiver.view { "Received: $it" }
receiver.view { method -> "Received: $method" }
}
```

@@ -868,9 +854,9 @@ workflow {
ch_dummy = Channel.fromPath('*').first()
(ch_var, ch_str, ch_exp) = foo(ch_dummy)

ch_var.view { "ch_var: $it" }
ch_str.view { "ch_str: $it" }
ch_exp.view { "ch_exp: $it" }
ch_var.view { var -> "ch_var: $var" }
ch_str.view { str -> "ch_str: $str" }
ch_exp.view { exp -> "ch_exp: $exp" }
}
```

@@ -890,7 +876,7 @@ process randomNum {

workflow {
numbers = randomNum()
numbers.view { "Received: ${it.text}" }
numbers.view { file -> "Received: ${file.text}" }
}
```

@@ -931,7 +917,7 @@ process splitLetters {
workflow {
splitLetters
| flatten
| view { "File: ${it.name} => ${it.text}" }
| view { chunk -> "File: ${chunk.name} => ${chunk.text}" }
}
```

@@ -1132,8 +1118,14 @@ In this example, the process is normally expected to produce an `output.txt` fil
While this option can be used with any process output, it cannot be applied to individual elements of a [tuple](#output-tuples-tuple) output. The entire tuple must be optional or not optional.
:::

(process-when)=

## When

:::{deprecated} 24.10.0
Use conditional logic (e.g. `if` statement, {ref}`operator-filter` operator) in the calling workflow instead.
:::
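A minimal sketch of that approach, filtering in the workflow rather than in the process (the glob, condition, and `analyze` process are illustrative):

```groovy
workflow {
    fastas = Channel.fromPath('data/*.fa')

    // express the condition in the workflow instead of a `when:` block
    fastas
        | filter { fasta -> fasta.name =~ /^BB11.*/ }
        | analyze
}
```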

The `when` block allows you to define a condition that must be satisfied in order to execute the process. The condition can be any expression that returns a boolean value.

It can be useful to enable/disable the process execution depending on the state of various inputs and parameters. For example:
@@ -1154,32 +1146,12 @@ process find {
}
```

:::{tip}
As a best practice, it is better to define such control flow logic in the workflow block, i.e. with an `if` statement or with channel operators, to make the process more portable.
:::

(process-directives)=

## Directives

Directives are optional settings that affect the execution of the current process.

They must be entered at the top of the process body, before any other declaration blocks (`input`, `output`, etc), and have the following syntax:

```groovy
// directive with simple value
name value

// directive with list value
name arg1, arg2, arg3

// directive with map value
name key1: val1, key2: val2

// directive with value and options
name arg, opt1: val1, opt2: val2
```

By default, directives are evaluated when the process is defined. However, if the value is a dynamic string or closure, it will be evaluated separately for each task, which allows task-specific variables like `task` and `val` inputs to be used.
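For instance, a sketch mixing static and dynamic directives (the process name, tool, and resource values are illustrative):

```groovy
process alignSeqs {
    // static value, evaluated once when the process is defined
    cpus 4

    // dynamic values, re-evaluated for each task
    tag "${fasta.baseName}"
    memory { 2.GB * task.attempt }

    errorStrategy 'retry'
    maxRetries 3

    input:
    path fasta

    script:
    """
    t_coffee -in $fasta -n_core $task.cpus
    """
}
```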

Some directives are only supported by specific executors. Refer to the {ref}`executor-page` page for more information about each executor.