Add GATK/Picard tasks for generating WGS qc metrics #110

Open
6 tasks done
mark-welsh opened this issue Sep 12, 2019 · 1 comment

mark-welsh commented Sep 12, 2019

For UI issues

  • Tool: Picard
  • Version: 2.19.0

Description

WGS pipelines are long-running, so results can vary considerably across runs. Picard provides many commands for generating QC metrics and MD5 checksums, which help maintain data integrity and traceability.

What I Did

Added task wrappers for:

  • CalculateReadGroupChecksum
  • CollectMultipleMetrics
  • CollectQualityYieldMetrics
  • CollectVariantCallingMetrics
  • CollectWgsMetrics
  • ValidateSamFile
@mark-welsh mark-welsh self-assigned this Sep 12, 2019
@mark-welsh mark-welsh changed the title Add Picard tasks for generating WGS qc metrics Add GATK/Picard tasks for generating WGS qc metrics Sep 13, 2019
@mark-welsh

Also going to add GATK ValidateVariants as part of this.

mark-welsh added a commit that referenced this issue Sep 13, 2019
this tool is a little weird... it doesn't output any files for WDL to
check as a "success" condition. instead, it either exits with $? == 0 or
it displays an error

to signal success back to the Cromwell engine, I echo "true" if the
preceding GATK command finishes, which is read as a boolean from stdout
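
The success flag described in this commit message can be sketched in plain shell. This is only an illustration under assumptions, not the code from the commit: `signal_success` is a hypothetical helper name, and sending the tool's own stdout to stderr is an extra precaution so that stdout carries nothing but the flag. In a WDL task the same pattern would sit in the command block, with the output declared via `read_boolean(stdout())`.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): run a command and print "true"
# on stdout only if that command exits with status 0.
signal_success() {
    # Redirect the wrapped command's stdout to stderr so that the only thing
    # left on stdout is the flag itself.
    "$@" 1>&2 && echo "true"
}

# Example with a stand-in command; a real task would pass the GATK call here.
signal_success true
```

If the wrapped command fails, nothing is printed and the helper returns the failing status, so the engine sees the failure rather than a false "success".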
mark-welsh added a commit that referenced this issue Sep 13, 2019
and changes previous name to this for cohesion (and no invalids on my
PRs 😃)
mark-welsh added a commit that referenced this issue Sep 19, 2019
this tool is a little weird... it doesn't output any files for WDL to
check as a "success" condition. instead, it either exits with $? == 0 or
it displays an error

to signal success back to the Cromwell engine, I echo "true" if the
preceding GATK command finishes, which is read as a boolean from stdout
mark-welsh added a commit that referenced this issue Sep 19, 2019
and changes previous name to this for cohesion (and no invalids on my
PRs 😃)
mark-welsh added a commit that referenced this issue Sep 19, 2019
also enabled verbosity by default
mark-welsh added a commit that referenced this issue Sep 19, 2019
(d3b doesn't have it at all)
mark-welsh added a commit that referenced this issue Sep 20, 2019
if a BAM fails Validation (i.e. rc file has code 1), we still want to
finish the workflow
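
One shell-level way to let a workflow continue after a failed validation is to record the validator's exit code and then exit cleanly. This is a sketch of that alternative, not necessarily what the commit does: `run_and_record` and the `validation.rc` file name are hypothetical. Cromwell also documents a `continueOnReturnCode` runtime attribute, which may be the more direct route.

```shell
#!/bin/sh
# Hypothetical wrapper (not from the commit): run a command, write its exit
# code to a file, and always return 0 so later steps still run.
run_and_record() {
    rc_path=$1
    shift
    status=0
    # "|| status=$?" captures a failure without aborting under "set -e".
    "$@" || status=$?
    echo "$status" > "$rc_path"
    return 0
}

# Example with a stand-in for a validator that reports errors via exit code 1.
run_and_record validation.rc false
cat validation.rc
```

Downstream steps can then read the recorded code to decide how to report the BAM, while the task itself is not marked as failed.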
mark-welsh added a commit that referenced this issue Sep 20, 2019
command only takes one interval list, not multiple. it has been changed
to be optional since the interval for WGS is, well, the whole genome...
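
An optional single interval list can be handled by emitting the argument only when a path is supplied. The sketch below assumes the command in question is Picard CollectWgsMetrics and that its `INTERVALS` argument is the one meant; the commit message does not say which tool it touches, and `intervals_arg` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not from the commit): print an INTERVALS=... argument
# when a path is given, and print nothing when it is omitted.
intervals_arg() {
    if [ -n "${1:-}" ]; then
        printf 'INTERVALS=%s' "$1"
    fi
}

# With a path, the argument is emitted; without one, the output is empty and
# the tool would fall back to the whole genome.
intervals_arg wgs.interval_list
echo
intervals_arg
```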
mark-welsh added a commit that referenced this issue Sep 23, 2019
mark-welsh added a commit that referenced this issue Nov 8, 2019
seeing an error with large BAMs (>100 GB) running out of memory.
hopefully this fixes that issue