Merged

29 commits
9d6c826
Update docs and tests with strict syntax
bentsherman Sep 25, 2024
189b560
Remove some references to Groovy
bentsherman Sep 26, 2024
24bb403
Rename "implicit workflow" -> "entry workflow"
bentsherman Sep 26, 2024
5c0ec1d
Initial syntax reference page
bentsherman Sep 28, 2024
39ee1c9
Revert non-docs changes
bentsherman Sep 28, 2024
00ba5d1
Cleanup
bentsherman Sep 28, 2024
69772bf
Add section on script definitions
bentsherman Sep 28, 2024
24e65e0
Update config syntax docs
bentsherman Sep 28, 2024
e9faf58
Apply suggestions from code review
bentsherman Oct 1, 2024
23e05b7
Update docs/module.md
bentsherman Oct 1, 2024
1a173a0
Apply suggestions from code review
bentsherman Oct 1, 2024
74480fc
Add remaining sections to syntax page
bentsherman Sep 30, 2024
4aee05a
Apply suggestions from code review
bentsherman Oct 1, 2024
e073700
minor edits
bentsherman Oct 1, 2024
6a66f71
Apply suggestions from review
bentsherman Oct 3, 2024
0d91594
Apply suggestions from code review
bentsherman Oct 3, 2024
3d29edb
Update description of expressions
bentsherman Oct 8, 2024
71236c9
Suggestions for syntax page
christopher-hakkaart Oct 9, 2024
505c98e
Revert "Suggestions for syntax page"
christopher-hakkaart Oct 9, 2024
0645746
Apply suggestions from review
bentsherman Oct 17, 2024
9444ed1
Apply suggestions from review
bentsherman Oct 18, 2024
110ede7
Apply suggestions from review
bentsherman Oct 18, 2024
48ac8b5
Update docs/dsl1.md
bentsherman Oct 21, 2024
30ea996
Revert "Update docs/dsl1.md"
bentsherman Oct 21, 2024
93154ee
Update docs/config.md
bentsherman Oct 21, 2024
04a311d
Remove redundant subscriber code snippet
bentsherman Oct 21, 2024
63633d1
Add note about implicit closure parameter
bentsherman Oct 23, 2024
17b7ab5
Merge branch 'master' into docs-strict-syntax
pditommaso Oct 23, 2024
3f6f377
Update docs [ci skip]
pditommaso Oct 23, 2024
2 changes: 1 addition & 1 deletion docs/channel.md
@@ -52,7 +52,7 @@ process foo {

workflow {
result = foo(1)
result.view { "Result: ${it}" }
result.view { file -> "Result: ${file}" }
}
```

10 changes: 9 additions & 1 deletion docs/config.md
@@ -48,7 +48,15 @@ The same mechanism allows you to access environment variables defined in the hos

### Comments

Configuration files use the same conventions for comments used by the Groovy or Java programming languages. Thus, use `//` to comment a single line, or `/*` .. `*/` to comment a block on multiple lines.
You can use `//` to comment a single line, or `/* ... */` to comment a block on multiple lines:

```groovy
// single line comment

/*
* multi-line comment
*/
```

### Includes

4 changes: 3 additions & 1 deletion docs/developer/plugins.md
@@ -7,6 +7,8 @@ This page describes how to create, test, and publish third-party plugins.

The best way to get started with your own plugin is to refer to the [nf-hello](https://github.com/nextflow-io/nf-hello) repository. This repository provides a minimal plugin implementation with several examples of different extension points and instructions for building, testing, and publishing.

Plugins can be written in Java or Groovy.

The minimal dependencies are as follows:

```groovy
@@ -151,7 +153,7 @@ Refer to the source code of Nextflow's built-in executors to see how to implemen
:::{versionadded} 22.09.0-edge
:::

Plugins can define custom Groovy functions, which can then be included into Nextflow pipelines.
Plugins can define custom functions, which can then be included into Nextflow pipelines.

To implement a custom function, create a class in your plugin that extends the `PluginExtensionPoint` class, and implement your function with the `Function` annotation:
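The accompanying example is collapsed in this view; the following is a minimal sketch of such an extension, modeled on the nf-hello pattern (the class name, function name, and plugin id are illustrative):

```groovy
import nextflow.Session
import nextflow.plugin.extension.Function
import nextflow.plugin.extension.PluginExtensionPoint

class ExampleExtension extends PluginExtensionPoint {

    @Override
    protected void init(Session session) {
        // no session-level setup required for this sketch
    }

    // usable in a pipeline via: include { sayHello } from 'plugin/nf-example'
    @Function
    String sayHello(String name) {
        return "Hello, ${name}!"
    }
}
```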

2 changes: 1 addition & 1 deletion docs/dsl1.md
@@ -88,7 +88,7 @@ In DSL1, the entire Nextflow pipeline must be defined in a single file (e.g. `ma
DSL2 introduces the concept of "module scripts" (or "modules" for short), which are Nextflow scripts that can be "included" by other scripts. While modules are not essential to migrating to DSL2, nor are they mandatory in DSL2 by any means, modules can help you organize a large pipeline into multiple smaller files, and take advantage of modules created by others. Check out the {ref}`module-page` to get started.

:::{note}
With DSL2, the Groovy shell used by Nextflow also imposes a 64KB size limit on pipeline scripts, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
With DSL2, Nextflow scripts cannot exceed 64KB in size, so if your DSL1 script is very large, you may need to split your script into modules anyway to avoid this limit.
:::

## Deprecations
1 change: 1 addition & 0 deletions docs/index.md
@@ -111,6 +111,7 @@ fusion
:caption: Reference
:maxdepth: 1

reference/syntax
reference/cli
reference/config
reference/env-vars
32 changes: 17 additions & 15 deletions docs/module.md
@@ -2,15 +2,15 @@

# Modules

In Nextflow, a **module** is a script that may contain functions, processes, and workflows (collectively referred to as *components*). A module can be included in other modules or pipeline scripts and even shared across workflows.
Nextflow scripts can include **definitions** (workflows, processes, and functions) from other scripts. When a script is included in this way, it is referred to as a **module**. Modules can be included by other modules or pipeline scripts and can even be shared across workflows.

:::{note}
Modules were introduced in DSL2. If you are still using DSL1, see the {ref}`dsl1-page` page to learn how to migrate your Nextflow pipelines to DSL2.
:::

## Module inclusion

A component defined in a module script can be imported into another Nextflow script using the `include` keyword.
Any definition in a module can be included into another Nextflow script using the `include` keyword.

For example:

@@ -23,7 +23,7 @@ workflow {
}
```

The above snippet imports a process named `foo`, defined in the module script, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.
The above snippet imports a process named `foo`, defined in the module, into the main execution context. This way, `foo` can be invoked in the `workflow` scope.

Nextflow implicitly looks for the script file `./some/module.nf`, resolving the path against the *including* script location.

@@ -57,7 +57,7 @@ Module directories allow the use of module-scoped binary scripts. See [Module

## Multiple inclusions

A Nextflow script can include any number of modules, and an `include` statement can import any number of components from a module. Multiple components can be included from the same module by using the syntax shown below:
A Nextflow script can include any number of modules, and an `include` statement can import any number of definitions from a module. Multiple definitions can be included from the same module by using the syntax shown below:

```groovy
include { foo; bar } from './some/module'
@@ -73,7 +73,7 @@ workflow {

## Module aliases

When including a module component, it's possible to specify an *alias* with the `as` keyword. Aliasing allows you to avoid module name clashes, by assigning them different names in the including context. For example:
When including a definition from a module, you can specify an *alias* with the `as` keyword. Aliasing allows you to avoid name clashes by assigning included definitions different names in the including context. For example:

```groovy
include { foo } from './some/module'
@@ -85,7 +85,7 @@ workflow {
}
```

You can even include the same component multiple times under different names:
You can even include the same definition multiple times under different names:

```groovy
include { foo; foo as bar } from './some/module'
@@ -96,13 +96,15 @@ workflow {
}
```

(module-params)=

## Module parameters

:::{deprecated} 24.07.0-edge
As a best practice, parameters should be used in the entry workflow and passed to functions / processes / workflows as explicit inputs.
As a best practice, parameters should be used in the entry workflow and passed to workflows, processes, and functions as explicit inputs.
:::
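A minimal sketch of that recommended pattern (the parameter name, module path, and process name are illustrative):

```groovy
params.greeting = 'Hello'

include { sayHello } from './modules/greet'

workflow {
    // read the parameter in the entry workflow and pass it down explicitly
    sayHello(params.greeting)
}
```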

A module script can define parameters using the same syntax as a Nextflow workflow script:
A module can define parameters using the same syntax as a Nextflow workflow script:

```groovy
params.foo = 'Hello'
@@ -182,9 +184,9 @@ Ciao world!

## Module templates

The module script can be defined in an external {ref}`template <process-template>` file. The template file can be placed in the `templates` directory where the module script is located.
Process script {ref}`templates <process-template>` can be included alongside a module in the `templates` directory.

For example, suppose we have a project L with a module script that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:
For example, suppose we have a project L with a module that defines two processes, P1 and P2, both of which use templates. The template files can be made available in the local `templates` directory:

```
Project L
@@ -208,15 +210,15 @@ Pipeline B
└── main.nf
```
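In such a module, a process refers to its template by file name, which Nextflow resolves against the `templates` directory next to the module script. A minimal sketch (the input name and template file name are illustrative):

```groovy
process P1 {
    input:
    val sample_id

    script:
    // resolved as templates/p1_task.sh relative to this module
    template 'p1_task.sh'
}
```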

With the possibility to keep the template files inside the project L, A and B can use the modules defined in L without any changes. A future project C would do the same, just cloning L (if not available on the system) and including its module script.
Because the template files can be kept inside project L, projects A and B can use the modules defined in L without any changes. A future project C would do the same, simply cloning L (if not available on the system) and including its module.

Besides promoting the sharing of modules across pipelines, there are several advantages to keeping module templates under the script path:

1. module components are *self-contained*,
2. module components can be tested independently from the pipeline(s) that import them,
3. it is possible to create libraries of module components.
1. modules are *self-contained*,
2. modules can be tested independently from the pipeline(s) that import them,
3. it is possible to create libraries of modules.

Ultimately, having multiple template locations allows a more structured organization within the same project. If a project has several module components, and all of them use templates, the project could group module scripts and their templates as needed. For example:
Ultimately, having multiple template locations allows a more structured organization within the same project. If a project has several modules, and all of them use templates, the project could group module scripts and their templates as needed. For example:

```
baseDir
6 changes: 2 additions & 4 deletions docs/overview.md
@@ -95,11 +95,9 @@ Read the {ref}`executor-page` to learn more about the Nextflow executors.

## Scripting language

Nextflow is designed to have a minimal learning curve, without having to pick up a new programming language. In most cases, users can utilise their current skills to develop Nextflow workflows. However, it also provides a powerful scripting DSL.
Nextflow is a workflow language, based on [Java](https://en.wikipedia.org/wiki/Java_(programming_language)) and [Groovy](https://groovy-lang.org/), which is designed to make it as simple as possible to write scalable and reproducible pipelines. In most cases, users can leverage their existing programming skills to develop Nextflow pipelines, without the steep learning curve that usually comes with a new programming language.

Nextflow scripting is an extension of the [Groovy programming language](<http://en.wikipedia.org/wiki/Groovy_(programming_language)>), which in turn is a super-set of the Java programming language. Groovy can be considered as Python for Java in that it simplifies the writing of code and is more approachable.

Read the {ref}`script-page` section to learn about the Nextflow scripting language.
Read the {ref}`script-page` page to learn about the Nextflow scripting language.

## Configuration options

86 changes: 29 additions & 57 deletions docs/process.md
@@ -2,39 +2,23 @@

# Processes

In Nextflow, a **process** is the basic processing primitive to execute a user script.
In Nextflow, a **process** is a function that is specialized for executing scripts in a scalable and portable manner.

The process definition starts with the keyword `process`, followed by process name and finally the process body delimited by curly braces. The process body must contain a string which represents the command or, more generally, a script that is executed by it. A basic process looks like the following example:
Here is an example process definition:

```groovy
process sayHello {
output:
path 'hello.txt'

script:
"""
echo 'Hello world!' > file
echo 'Hello world!' > hello.txt
"""
}
```

A process may contain any of the following definition blocks: directives, inputs, outputs, when clause, and the process script. The syntax is defined as follows:

```
process < name > {

[ directives ]

input:
< process inputs >

output:
< process outputs >

when:
< condition >

[script|shell|exec]:
< user script to be executed >

}
```
Refer to {ref}`syntax-process` in the syntax reference for a full description of the process syntax.

(process-script)=

@@ -139,9 +123,9 @@ Since the actual location of the interpreter binary file can differ across platf

### Conditional scripts

So far, our `script` block has always been a simple string expression, but in reality, the `script` block is just Groovy code that returns a string. This means that you can write arbitrary Groovy code to determine the script to execute, as long as the final statement is a string (remember that the `return` keyword is optional in Groovy).
So far, the `script` block has just been a string, but in reality, the `script` block is like a function that returns a string. This means that you can write arbitrary code to determine the script, as long as the final statement is a string (remember that the `return` keyword is optional).

For example, you can use flow control statements (`if`, `switch`, etc) to execute a different script based on the process inputs. The only difference here is that you must explicitly declare the `script` guard, whereas before it was not required. Here is an example:
For example, you can use if-else statements to produce a different script based on the task inputs. The only difference here is that you must explicitly declare the `script` guard, whereas before it was not required. Here is an example:

```groovy
mode = 'tcoffee'
@@ -171,7 +155,7 @@ process align {
}
```

In the above example, the process will execute one of the script fragments depending on the value of the `mode` parameter. By default it will execute the `tcoffee` command, but changing the `mode` variable will cause a different branch to be executed.
In the above example, the process will execute one of several scripts depending on the value of the `mode` parameter. By default it will execute the `tcoffee` command.

(process-template)=

@@ -250,7 +234,7 @@ In the above example, `$USER` is treated as a Bash variable, while `!{str}` is t

### Native execution

Nextflow processes can also execute native Groovy code as the task itself, using the `exec` block. Whereas the `script` block defines a script to be executed, the `exec` block defines Groovy code to be executed directly.
Whereas the `script` block defines a script that is executed as a separate job, the `exec` block simply executes the code that it is given, without launching a job.

For example:

@@ -276,6 +260,8 @@ Hello Mr. a
Hello Mr. c
```

A native process is very similar to a {ref}`function <syntax-function>`, but provides additional capabilities such as parallelism, caching, and progress logging.
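A minimal sketch of a native process consistent with the output shown above (the process name is illustrative):

```groovy
process greet {
    input:
    val x

    exec:
    // runs in the Nextflow runtime itself; no command script is launched
    println "Hello Mr. $x"
}

workflow {
    Channel.of('a', 'b', 'c') | greet
}
```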

(process-stub)=

## Stub
@@ -492,7 +478,7 @@ In this case, `x.name` returns the file name with the parent directory (e.g. `my

### Multiple input files

A `path` input can also accept a collection of files instead of a single value. In this case, the input variable will be a Groovy list, and you can use it as such.
A `path` input can also accept a collection of files instead of a single value. In this case, the input variable will be a list, and you can use it as such.

When the input has a fixed file name and a collection of files is received by the process, the file name will be appended with a numerical suffix representing its ordinal position in the list. For example:
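The original example is collapsed in this view; a minimal sketch of the behavior (the process, file names, and glob are illustrative):

```groovy
process concatFasta {
    input:
    path 'seq'   // a fixed name receiving a collection is staged as seq1, seq2, seq3, ...

    output:
    path 'all.fa'

    script:
    """
    cat seq* > all.fa
    """
}

workflow {
    fastas = Channel.fromPath('data/*.fa').collect()
    concatFasta(fastas)
}
```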

@@ -584,7 +570,7 @@ The `env` qualifier allows you to define an environment variable in the process 
```groovy
process printEnv {
input:
env HELLO
env 'HELLO'

'''
echo $HELLO world!
@@ -619,7 +605,7 @@ process printAll {

workflow {
Channel.of('hello', 'hola', 'bonjour', 'ciao')
| map { it + '\n' }
| map { v -> v + '\n' }
| printAll
}
```
@@ -841,7 +827,7 @@ workflow {
methods = ['prot', 'dna', 'rna']

receiver = foo(methods)
receiver.view { "Received: $it" }
receiver.view { method -> "Received: $method" }
}
```

@@ -868,9 +854,9 @@ workflow {
ch_dummy = Channel.fromPath('*').first()
(ch_var, ch_str, ch_exp) = foo(ch_dummy)

ch_var.view { "ch_var: $it" }
ch_str.view { "ch_str: $it" }
ch_exp.view { "ch_exp: $it" }
ch_var.view { var -> "ch_var: $var" }
ch_str.view { str -> "ch_str: $str" }
ch_exp.view { exp -> "ch_exp: $exp" }
}
```

@@ -890,7 +876,7 @@ process randomNum {

workflow {
numbers = randomNum()
numbers.view { "Received: ${it.text}" }
numbers.view { file -> "Received: ${file.text}" }
}
```

@@ -931,7 +917,7 @@ process splitLetters {
workflow {
splitLetters
| flatten
| view { "File: ${it.name} => ${it.text}" }
| view { chunk -> "File: ${chunk.name} => ${chunk.text}" }
}
```

@@ -1132,8 +1118,14 @@ In this example, the process is normally expected to produce an `output.txt` fil
While this option can be used with any process output, it cannot be applied to individual elements of a [tuple](#output-tuples-tuple) output. The entire tuple must be optional or not optional.
:::

(process-when)=

## When

:::{deprecated} 24.10.0
Use conditional logic (e.g. `if` statement, {ref}`operator-filter` operator) in the calling workflow instead.
:::
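A minimal sketch of that approach, filtering in the workflow rather than in the process (the glob, condition, and `analyze` process are illustrative):

```groovy
workflow {
    fastas = Channel.fromPath('data/*.fa')

    // express the condition in the workflow instead of a `when:` block
    fastas
        | filter { fasta -> fasta.name =~ /^BB11.*/ }
        | analyze
}
```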

The `when` block allows you to define a condition that must be satisfied in order to execute the process. The condition can be any expression that returns a boolean value.

It can be useful to enable/disable the process execution depending on the state of various inputs and parameters. For example:
@@ -1154,32 +1146,12 @@ process find {
}
```

:::{tip}
As a best practice, it is better to define such control flow logic in the workflow block, i.e. with an `if` statement or with channel operators, to make the process more portable.
:::

(process-directives)=

## Directives

Directives are optional settings that affect the execution of the current process.

They must be entered at the top of the process body, before any other declaration blocks (`input`, `output`, etc), and have the following syntax:

```groovy
// directive with simple value
name value

// directive with list value
name arg1, arg2, arg3

// directive with map value
name key1: val1, key2: val2

// directive with value and options
name arg, opt1: val1, opt2: val2
```

By default, directives are evaluated when the process is defined. However, if the value is a dynamic string or closure, it will be evaluated separately for each task, which allows task-specific variables like `task` and `val` inputs to be used.
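For instance, a sketch mixing static and dynamic directives (the process name, tool, and resource values are illustrative):

```groovy
process alignSeqs {
    // static value, evaluated once when the process is defined
    cpus 4

    // dynamic values, re-evaluated for each task
    tag "${fasta.baseName}"
    memory { 2.GB * task.attempt }

    errorStrategy 'retry'
    maxRetries 3

    input:
    path fasta

    script:
    """
    t_coffee -in $fasta -n_core $task.cpus
    """
}
```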

Some directives are only supported by specific executors. Refer to the {ref}`executor-page` page for more information about each executor.