Releases: broadinstitute/cromwell
50 Release Notes
Changes and Warnings
Metadata Archival Config Change
Note: Unless you have already opted-in to GCS-archival of metadata during its development, this change will not affect you.
Cromwell's metadata archival configuration has changed in a backwards-incompatible way to increase consistency;
please see the updated documentation for details.
49 Release Notes
Changes and Warnings
Job store database refactoring
The primary keys of Cromwell's job store tables have been refactored to use a BIGINT datatype in place of the
previous INT datatype. Cromwell will not be usable while the Liquibase migration for this refactor is running.
In the Google Cloud SQL with SSD environment this migration runs at a rate of approximately 40,000 JOB_STORE_SIMPLETON_ENTRY
rows per second. In deployments with millions or billions of JOB_STORE_SIMPLETON_ENTRY rows the migration may require
a significant amount of downtime, so please plan accordingly. The following SQL can be used to estimate the number of
rows in this table:
SELECT table_rows FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'cromwell' AND table_name = 'JOB_STORE_SIMPLETON_ENTRY';
Execution Directory Layout (cache copies)
When an attempt to copy a cache result is made, you'll now see a cacheCopy directory in the call root directory.
This prevents cache-copied files from clashing with the files staged to the same directory for execution attempt 1 if the cache copy fails (see also: Bug Fixes).
The directory layout used to be:
[...]/callRoot/
- script [from the cache copy attempt, or for execution attempt 1 if the cache copy fails]
- stdout [from the cache copy attempt, or for execution attempt 1 if the cache copy fails]
- output.file [from the cache copy attempt, or for execution attempt 1 if the cache copy fails]
- attempt-2/ [if attempt 1 fails]
- script
- stdout
- output.file
but is now:
[...]/callRoot/
- cacheCopy/
- script
- stdout
- output.file
- script [for attempt 1 if the cache copy fails]
- stdout [for attempt 1 if the cache copy fails]
- output.file [for attempt 1 if the cache copy fails]
- attempt-2/ [if attempt 1 fails]
- script
- stdout
- output.file
New Functionality
Disable call-caching for tasks
It is now possible to indicate in a workflow that a task should not be call-cached. See details
here.
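As a minimal sketch of what opting a task out of call caching can look like, assuming the volatile meta flag described in the linked documentation:

```wdl
task generate_random {
  command {
    echo $RANDOM
  }
  meta {
    # Marking the task volatile tells Cromwell never to call-cache its results
    volatile: true
  }
  output {
    String random = read_string(stdout())
  }
}
```

This is useful for tasks whose outputs are intentionally nondeterministic, where returning a cached result would be incorrect.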
Delete Intermediate Outputs on PapiV2
- Experimental: When the new workflow option delete_intermediate_output_files is submitted with the workflow,
intermediate File objects will be deleted when the workflow completes. See the Google Pipelines API Workflow Options
documentation for more information.
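As a sketch, a workflow-options file enabling this experimental behavior could look like the following (the option name is taken from the text above; the boolean value shown is an assumption):

```json
{
  "delete_intermediate_output_files": true
}
```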
Metadata Archival Support
Cromwell 49 now offers the option to archive metadata to GCS and remove the equivalent metadata from relational
database storage. Please see
the documentation for more details.
Adding support for Google Cloud Life Sciences v2beta
Cromwell now supports running workflows using Google Cloud Life Sciences v2beta API in addition to Google Cloud Genomics v2alpha1.
More information about migrating from v2alpha1 to the new API is available here.
- Note: Google Cloud Life Sciences is the new name for newer versions of Google Cloud Genomics.
- Note: Support for Google Cloud Genomics v2alpha1 will be removed in a future version of Cromwell. Advance notice will be provided.
New Docs
Installation methods
Links to the conda package and docker container are now available in
the install documentation.
Bug Fixes
- Fix a bug where zip files with directories could not be imported. For example, a zip with a.wdl and b.wdl
could be imported, but one with sub_workflows/a.wdl and imports/b.wdl could not.
- Fix a bug which sometimes allowed execution scripts copied by a failed cache-copy to be run instead
of the attempt-1 script for a live job execution.
48 Release Notes
Womtool Graph for WDL 1.0
The womtool graph
command now supports WDL 1.0 workflows.
- Note: Generated graphs - including in WDL draft 2 - may look slightly different than they did in version 47.
Documentation
- Documented the use of an HSQLDB file-based database so users can try call-caching without needing a database server.
Please check out the database documentation.
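A file-based HSQLDB configuration might be sketched as follows (the URL and timeout values are illustrative assumptions; see the linked database documentation for the recommended settings):

```hocon
database {
  profile = "slick.jdbc.HsqldbProfile$"
  db {
    driver = "org.hsqldb.jdbcDriver"
    # Stores the database in local files under cromwell-db/
    url = "jdbc:hsqldb:file:cromwell-db/cromwell-db"
    connectionTimeout = 120000
  }
}
```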
47 Release Notes
Retry with more memory on Papiv2 (#5180)
Cromwell now allows user-defined retries. With the memory-retry config you can specify an array of strings; when
Cromwell encounters one of them in a task's stderr file, the task is retried with the memory multiplier specified
in the config. More information here.
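A hypothetical sketch of such a config stanza, matching the description above (the key names are illustrative assumptions, not authoritative; consult the linked documentation for the real ones):

```hocon
# One of the listed strings appearing in a task's stderr triggers a retry
# with the task's memory multiplied by the configured factor.
memory-retry {
  error-keys = ["OutOfMemory", "Killed"]
  multiplier = 1.1
}
```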
GCS Parallel Composite Upload Support
Cromwell 47 now supports GCS parallel composite uploads, which can greatly improve delocalization performance.
This feature is turned off by default; it can be turned on either with a backend-level configuration setting or
on a per-workflow basis with workflow options. More details here.
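One way to enable the feature per workflow is via a workflow-options file; the option name and threshold value below are assumptions based on the feature's documentation:

```json
{
  "parallel_composite_upload_threshold": "150M"
}
```

Files larger than the threshold would be uploaded as parallel composite objects; smaller files are uploaded normally.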
Papi V2 Localization Using GCR (#5200)
The Docker image for the Google Cloud SDK was previously only published on Docker
Hub. Now that the image is publicly hosted in
GCR, Papi V2 jobs will localize inputs and delocalize outputs using
the GCR image.
46.1 Release Notes
Retry with more memory on Papiv2 (#5180)
Cromwell now allows user-defined retries. With the memory-retry config you can specify an array of strings; when
Cromwell encounters one of them in a task's stderr file, the task is retried with the memory multiplier specified
in the config. More information here.
46 Release Notes
Nvidia GPU Driver Update
The default driver for Nvidia GPUs on Google Cloud has been updated from 390 to 418.87.00. A user may override this
option at any time by providing the nvidiaDriverVersion runtime attribute. See the Runtime Attribute description for
GPUs for detailed information.
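For example, a task could pin the driver version via the runtime attribute; the nvidiaDriverVersion key comes from the text above, while the surrounding GPU attributes and values are illustrative assumptions:

```wdl
runtime {
  gpuType: "nvidia-tesla-k80"        # illustrative GPU type
  gpuCount: 1
  nvidiaDriverVersion: "418.87.00"   # overrides the default driver
  docker: "ubuntu:18.04"
}
```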
Enhanced "error code 10" handling in PAPIv2
On Google Pipelines API v2, a worker VM that is preempted may emit a generic error message like
PAPI error code 10. The assigned worker has failed to complete the operation
instead of a preemption-specific message like
PAPI error code 14. Task was preempted for the 2nd time.
Cromwell 44 introduced special handling that detects both preemption indicators and re-runs the job consistent with the preemptible
setting.
Cromwell 46 enhances this handling in response to user reports of possible continued issues.
45.1 Release Notes
45 Release Notes
Improved input and output transfer performance on PAPI v2
Cromwell now requires only a single PAPI "action" each for the entire localization or delocalization process, rather than two per file or directory.
This greatly increases execution speed for jobs with large numbers of input or output files.
In testing, total execution time for a call with 800 inputs improved from more than 70 minutes to less than 20 minutes.
List dependencies flag in Womtool Command Line (#5098)
Womtool now outputs the list of files referenced in import statements when the -l flag is passed to the validate
command. More info here.
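An illustrative invocation (the jar and workflow file names are placeholders):

```shell
# Validate a workflow and list the files its import statements reference
java -jar womtool.jar validate -l myWorkflow.wdl
```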
BCS backend new Features support
New docker registry
Alibaba Cloud Container Registry is now supported for the docker
runtime attribute, and the previous dockerTag
runtime attribute continues to be available for Alibaba Cloud OSS Registry.
Call caching
Cromwell now supports call caching when using the BCS backend.
Workflow output glob
Globs can be used to define outputs for the BCS backend.
NAS mount
Alibaba Cloud NAS is now supported for the mounts
runtime attribute.
44 Release Notes
Improved PAPI v2 Preemptible VM Support
In some cases PAPI v2 will report the preemption of a VM in a way that differs from PAPI v1. This novel means of reporting
preemption was not recognized by Cromwell's PAPI v2 backend and would result in preemptions being miscategorized as call failures.
Cromwell's PAPI v2 backend will now handle this type of preemption.
43 Release Notes
Virtual Private Cloud with Subnetworks
Cromwell now allows PAPIv2 jobs to run on a specific subnetwork inside a private network by adding the subnetwork key
subnetwork-label-key inside the virtual-private-cloud stanza of the backend configuration. More info here.
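A sketch of such a backend stanza follows; the subnetwork-label-key name comes from the text above, while the network-label-key and auth entries and all values are assumptions based on the linked VPC documentation:

```hocon
virtual-private-cloud {
  # Project label keys whose values name the network and subnetwork to use
  network-label-key = "my-private-network"
  subnetwork-label-key = "my-private-subnetwork"
  auth = "service-account"
}
```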
Call caching database refactoring
Cromwell's CALL_CACHING_HASH_ENTRY primary key has been refactored to use a BIGINT datatype in place of the previous
INT datatype. Cromwell will not be usable while the Liquibase migration for this refactor is running.
In the Google Cloud SQL with SSD environment this migration runs at a rate of approximately 100,000 CALL_CACHING_HASH_ENTRY
rows per second. In deployments with millions or billions of CALL_CACHING_HASH_ENTRY rows the migration may require
a significant amount of downtime, so please plan accordingly. The following SQL can be used to estimate the number of
rows in this table:
SELECT MAX(CALL_CACHING_HASH_ENTRY_ID) FROM CALL_CACHING_HASH_ENTRY;
Stackdriver Instrumentation
Cromwell now supports sending metrics to Google's Stackdriver API. Learn how to configure it here.
BigQuery in PAPI
Cromwell now allows a user to specify BigQuery jobs when using the PAPIv2 backend.
Configuration Changes
StatsD Instrumentation
There is a small change in StatsD's configuration path. Originally, the path to the config was
services.Instrumentation.config.statsd; it has now been updated to services.Instrumentation.config. More info on its
configuration can be found here.
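Under the new path, a StatsD instrumentation stanza might look like the sketch below; the config path is taken from the text above, while the class name and individual keys are assumptions to be checked against the linked documentation:

```hocon
services {
  Instrumentation {
    class = "cromwell.services.instrumentation.impl.statsd.StatsDInstrumentationServiceActor"
    # Note: settings now live directly under "config", not "config.statsd"
    config {
      hostname = "localhost"
      port = 8125
      prefix = ""
      flush-rate = 1 second
    }
  }
}
```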
cached-copy
A new experimental feature, the cached-copy localization strategy, is available for the shared filesystem.
More information can be found in the documentation on localization.
Yaml node limits
Yaml parsing now checks for cycles, and limits the maximum number of parsed nodes to a configurable value. It also
limits the nesting depth of sequences and mappings. See the documentation on configuring
YAML for more information.
API Changes
Workflow Metadata
- It is now possible to use includeKey and excludeKey at the same time. If so, a metadata key must match the
includeKey and not match the excludeKey to be included.
- It is now possible to use "calls" as one of your excludeKeys, to request that only workflow metadata gets returned.
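For instance, the second case might look like the following request (host, port, and workflow id are placeholders):

```shell
# Exclude everything under "calls" so only workflow-level metadata is returned
curl "http://localhost:8000/api/workflows/v1/WORKFLOW_ID/metadata?excludeKey=calls"
```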
PostgreSQL support
Cromwell now supports PostgreSQL (version 9.6 or higher, with the Large Object
extension installed) as a database backend.
See here for
instructions for configuring the database connection.
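A PostgreSQL connection could be sketched as below; the profile and driver class names are assumptions based on the linked instructions, and the URL and credentials are placeholders:

```hocon
database {
  profile = "slick.jdbc.PostgresProfile$"
  db {
    driver = "org.postgresql.Driver"
    url = "jdbc:postgresql://localhost:5432/cromwell"
    user = "cromwell"
    password = "CHANGE_ME"
    connectionTimeout = 5000
  }
}
```

Remember that the Large Object extension must be installed in the target database, per the note above.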