Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
c6d4950
Add arity option for path inputs and outputs
bentsherman Mar 1, 2023
9a16f01
Change default arity to depend on file pattern
bentsherman Mar 2, 2023
4738a97
Change arity "single"-ness to include optional single (`0..1`)
bentsherman Mar 2, 2023
058474f
Remove default arity, use original behavior if arity not specified
bentsherman Mar 8, 2023
2a015c7
Add ArityParam unit test
bentsherman Mar 8, 2023
4e17c99
Add tests for ArityParam (currently failing)
bentsherman Mar 9, 2023
80f944b
Merge branch 'master' into 2425-process-input-output-arity
pditommaso Mar 28, 2023
8c5e93c
Add arity to params unit tests
bentsherman Mar 30, 2023
f59fc96
Merge branch 'master' into 2425-process-input-output-arity
pditommaso Apr 9, 2023
04fbb86
Fix liftbot warning
bentsherman Apr 21, 2023
edc788d
Merge branch 'master' into 2425-process-input-output-arity
bentsherman Apr 21, 2023
7ec397f
Add support for AWS SSE env variables
pditommaso May 24, 2023
cefc7cc
Merge branch 'master' into 2425-process-input-output-arity
bentsherman May 26, 2023
49356e3
Merge branch 'master' into 2425-process-input-output-arity
bentsherman Jun 13, 2023
cf7b677
Merge branch 'master' into 2425-process-input-output-arity
pditommaso Jul 5, 2023
b2e6f78
Merge branch 'master' into 2425-process-input-output-arity
pditommaso Aug 15, 2023
bbdca3e
Update docs [ci skip]
pditommaso Aug 15, 2023
88da8bb
Add nullable inputs/outputs and other improvements
bentsherman Aug 15, 2023
356a70d
Add e2e test
bentsherman Aug 16, 2023
b63ec86
Add NullPath
bentsherman Aug 16, 2023
76de197
Fail if nullable input receives a list
bentsherman Aug 16, 2023
4be1236
Add `arity: true` to infer arity from file pattern
bentsherman Aug 16, 2023
b6a5fb3
Fix failing tests
bentsherman Aug 16, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 55 additions & 12 deletions docs/process.md
Original file line number Diff line number Diff line change
Expand Up @@ -471,22 +471,46 @@ workflow {
}
```

The `stageAs` option allows you to control how the file should be named in the task work directory. You can provide a specific name or a pattern as described in the [Multiple input files](#multiple-input-files) section:
Available options:

```groovy
process foo {
`arity`
: :::{versionadded} 23.09.0-edge
:::
: Specify the number of expected files. Can be a number or a range:

```groovy
input:
path x, stageAs: 'data.txt'
path('one.txt', arity: '1') // exactly one file is expected
path('pair_*.txt', arity: '2') // exactly two files are expected
path('many_*.txt', arity: '1..*') // one or more files are expected
path('optional.txt', arity: '0..1') // zero or one file is expected
```

"""
your_command --in data.txt
"""
}
When a task is created, Nextflow will check whether the received files for each path input match the declared arity, and fail if they do not.

workflow {
foo('/some/data/file.txt')
}
```
An input is *nullable* if the arity is exactly `0..1`. Nullable inputs can accept "null" files from nullable `path` outputs.

You can also set `arity: true` to infer the arity from the file pattern: `1..*` if the pattern is a glob, `1` otherwise.

`stageAs`
: Specify how the file should be named in the task work directory:

```groovy
process foo {
input:
path x, stageAs: 'data.txt'

"""
your_command --in data.txt
"""
}

workflow {
foo('/some/data/file.txt')
}
```

Can be a name or a pattern as described in the [Multiple input files](#multiple-input-files) section.

### Multiple input files

Expand Down Expand Up @@ -912,6 +936,25 @@ In the above example, the `randomNum` process creates a file named `result.txt`

Available options:

`arity`
: :::{versionadded} 23.09.0-edge
:::
: Specify the number of expected files. Can be a number or a range:

```groovy
output:
path('one.txt', arity: '1') // exactly one file is expected
path('pair_*.txt', arity: '2') // exactly two files are expected
path('many_*.txt', arity: '1..*') // one or more files are expected
path('optional.txt', arity: '0..1') // zero or one file is expected
```

When a task completes, Nextflow will check whether the produced files for each path output match the declared arity, and fail if they do not. If the arity is *single* (i.e. either `1` or `0..1`), a single file will be emitted. Otherwise, a list will always be emitted, even if only one file is produced.

An output is *nullable* if the arity is exactly `0..1`. Whereas optional outputs emit nothing if the output file does not exist, nullable outputs emit a "null" file that can only be accepted by a nullable `path` input.

You can also set `arity: true` to infer the arity from the file pattern: `1..*` if the pattern is a glob, `1` otherwise.

`followLinks`
: When `true` target files are return in place of any matching symlink (default: `true`)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,7 @@ import nextflow.util.CacheHelper
import nextflow.util.Escape
import nextflow.util.LockManager
import nextflow.util.LoggerHelper
import nextflow.util.NullPath
import nextflow.util.TestOnly
import org.codehaus.groovy.control.CompilerConfiguration
import org.codehaus.groovy.control.customizers.ASTTransformationCustomizer
Expand Down Expand Up @@ -1567,6 +1568,10 @@ class TaskProcessor {
def path = param.glob ? splitter.strip(filePattern) : filePattern
def file = workDir.resolve(path)
def exists = param.followLinks ? file.exists() : file.exists(LinkOption.NOFOLLOW_LINKS)
if( !exists && param.isNullable() ) {
file = new NullPath(path)
exists = true
}
if( exists )
result = [file]
else
Expand All @@ -1576,15 +1581,18 @@ class TaskProcessor {
if( result )
allFiles.addAll(result)

else if( !param.optional ) {
else if( !param.optional && (!param.arity || param.arity.min > 0) ) {
def msg = "Missing output file(s) `$filePattern` expected by process `${safeTaskName(task)}`"
if( inputsRemovedFlag )
msg += " (note: input files are not included in the default matching set)"
throw new MissingFileException(msg)
}
}

task.setOutput( param, allFiles.size()==1 ? allFiles[0] : allFiles )
if( !param.isValidArity(allFiles) )
throw new IllegalArgumentException("Incorrect number of output files for process `${safeTaskName(task)}` -- expected ${param.arity}, found ${allFiles.size()}")

task.setOutput( param, allFiles.size()==1 && param.isSingle() ? allFiles[0] : allFiles )

}

Expand Down Expand Up @@ -1782,7 +1790,10 @@ class TaskProcessor {
return new FileHolder(source, result)
}

protected Path normalizeToPath( obj ) {
protected Path normalizeToPath( obj, boolean nullable=false ) {
if( obj instanceof NullPath && !nullable )
throw new ProcessUnrecoverableException("Path value cannot be null")

if( obj instanceof Path )
return obj

Expand All @@ -1805,7 +1816,7 @@ class TaskProcessor {
throw new ProcessUnrecoverableException("Not a valid path value: '$str'")
}

protected List<FileHolder> normalizeInputToFiles( Object obj, int count, boolean coerceToPath, FilePorter.Batch batch ) {
protected List<FileHolder> normalizeInputToFiles( Object obj, int count, boolean coerceToPath, boolean nullable, FilePorter.Batch batch ) {

Collection allItems = obj instanceof Collection ? obj : [obj]
def len = allItems.size()
Expand All @@ -1815,7 +1826,7 @@ class TaskProcessor {
for( def item : allItems ) {

if( item instanceof Path || coerceToPath ) {
def path = normalizeToPath(item)
def path = normalizeToPath(item, nullable)
def target = executor.isForeignFile(path) ? batch.addToForeign(path) : path
def holder = new FileHolder(target)
files << holder
Expand All @@ -1828,10 +1839,10 @@ class TaskProcessor {
return files
}

protected singleItemOrList( List<FileHolder> items, ScriptType type ) {
protected singleItemOrList( List<FileHolder> items, boolean single, ScriptType type ) {
assert items != null

if( items.size() == 1 ) {
if( items.size() == 1 && single ) {
return makePath(items[0],type)
}

Expand Down Expand Up @@ -2028,10 +2039,17 @@ class TaskProcessor {
for( Map.Entry<FileInParam,?> entry : secondPass.entrySet() ) {
final param = entry.getKey()
final val = entry.getValue()
final fileParam = param as FileInParam
final normalized = normalizeInputToFiles(val, count, fileParam.isPathQualifier(), batch)
final resolved = expandWildcards( fileParam.getFilePattern(ctx), normalized )
ctx.put( param.name, singleItemOrList(resolved, task.type) )
final normalized = normalizeInputToFiles(val, count, param.isPathQualifier(), param.isNullable(), batch)
final resolved = expandWildcards( param.getFilePattern(ctx), normalized )

if( !param.isValidArity(resolved) ) {
final msg = param.isNullable()
? "expected a nullable file (0..1) but a list was provided"
: "expected ${param.arity}, found ${resolved.size()}"
throw new IllegalArgumentException("Incorrect number of input files for process `${safeTaskName(task)}` -- ${msg}")
}

ctx.put( param.name, singleItemOrList(resolved, param.isSingle(), task.type) )
count += resolved.size()
for( FileHolder item : resolved ) {
Integer num = allNames.getOrCreate(item.stageName, 0) +1
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
/*
* Copyright 2023, Seqera Labs
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package nextflow.script.params

import groovy.transform.CompileStatic
import groovy.transform.EqualsAndHashCode

/**
* Implements an arity option for process inputs and outputs.
*
* @author Ben Sherman <[email protected]>
*/
@CompileStatic
trait ArityParam {

Range arity

Range getArity() { arity }

def setArity(String value) {
if( value.isInteger() ) {
def n = value.toInteger()
this.arity = new Range(n, n)
return this
}

def tokens = value.tokenize('..')
if( tokens.size() == 2 ) {
def min = tokens[0]
def max = tokens[1]
if( min.isInteger() && (max == '*' || max.isInteger()) ) {
this.arity = new Range(
min.toInteger(),
max == '*' ? Integer.MAX_VALUE : max.toInteger()
)
return this
}
}

throw new IllegalArgumentException("Path arity should be a number (e.g. '1') or a range (e.g. '1..*')")
}

/**
* Determine whether a null file is allowed.
*/
boolean isNullable() {
return arity && arity.min == 0 && arity.max == 1
}

/**
* Determine whether a single output file should be unwrapped.
*/
boolean isSingle() {
return !arity || arity.max == 1
}

/**
* Determine whether a collection of files has valid arity.
*
* If the param is nullable, there should be exactly one file (either
* a real file or a null file)
*
* @param files
*/
boolean isValidArity(Collection files) {
if( !arity )
return true

return isNullable()
? files.size() == 1
: arity.contains(files.size())
}

@EqualsAndHashCode
static class Range {
int min
int max

Range(int min, int max) {
this.min = min
this.max = max
}

boolean contains(int value) {
min <= value && value <= max
}

@Override
String toString() {
min == max
? min.toString()
: "${min}..${max == Integer.MAX_VALUE ? '*' : max}".toString()
}
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ import nextflow.script.TokenVar
*/
@Slf4j
@InheritConstructors
class FileInParam extends BaseInParam implements PathQualifier {
class FileInParam extends BaseInParam implements ArityParam, PathQualifier {

protected filePattern

Expand Down Expand Up @@ -153,4 +153,14 @@ class FileInParam extends BaseInParam implements PathQualifier {
return this
}

def setArity(boolean value) {
if( !value )
return

final str = filePattern?.contains('*') || filePattern?.contains('?')
? '1..*'
: '1'
setArity(str)
}

}
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ import nextflow.util.BlankSeparatedList
*/
@Slf4j
@InheritConstructors
class FileOutParam extends BaseOutParam implements OutParam, OptionalParam, PathQualifier {
class FileOutParam extends BaseOutParam implements OutParam, ArityParam, OptionalParam, PathQualifier {

/**
* ONLY FOR TESTING DO NOT USE
Expand Down Expand Up @@ -276,4 +276,14 @@ class FileOutParam extends BaseOutParam implements OutParam, OptionalParam, Path
(FileOutParam)super.setOptions(opts)
}

def setArity(boolean value) {
if( !value )
return

final str = filePattern?.contains('*') || filePattern?.contains('?')
? '1..*'
: '1'
setArity(str)
}

}
41 changes: 41 additions & 0 deletions modules/nextflow/src/main/groovy/nextflow/util/NullPath.groovy
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
/*
* Copyright 2013-2023, Seqera Labs
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package nextflow.util

import java.nio.file.Path
import java.nio.file.Paths

import groovy.transform.EqualsAndHashCode
import groovy.transform.PackageScope
import groovy.util.logging.Slf4j

/**
*
* @author Ben Sherman <[email protected]>
*/
@EqualsAndHashCode
@Slf4j
class NullPath implements Path {

@PackageScope
@Delegate
Path delegate

NullPath(String path) {
delegate = Paths.get(path)
}
}
Loading