add long_phase supplementary alignment tag to extended args #163

Open
AmberVerhasselt wants to merge 7 commits into IntGenomicsLab:dev from AmberVerhasselt:dev

Conversation

@AmberVerhasselt

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the pipeline conventions in the contribution docs?
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copilot AI review requested due to automatic review settings April 29, 2026 11:34
Contributor

Copilot AI left a comment

Pull request overview

Adds a new pipeline parameter to control whether Longphase haplotag includes supplementary alignments, wiring it through config/schema/docs and into the module args assembly.

Changes:

  • Add params.longphase_tag_supplementary (default false) and pass --tagSupplementary to LONGPHASE_HAPLOTAG when enabled.
  • Extend nextflow_schema.json and docs/usage.md to expose/document the new option.
  • Update documented default for --severus_minsupport to 3 (matching config/schema).
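
The conditional wiring in the first bullet can be sketched as follows. This is a minimal Python illustration, not the pipeline's actual Groovy config: `haplotag_args` and `base_args` are hypothetical names, and only the `--tagSupplementary` flag and the `false` default come from the summary above.

```python
def haplotag_args(tag_supplementary: bool = False, base_args: tuple = ()) -> str:
    """Assemble the extra-argument string for a Longphase haplotag call.

    Hypothetical helper: mirrors the idea of appending a flag only when
    the corresponding pipeline parameter is enabled.
    """
    args = list(base_args)
    if tag_supplementary:
        # Flag name taken from the PR summary; appended only when enabled.
        args.append("--tagSupplementary")
    return " ".join(args)


print(repr(haplotag_args()))      # '' (parameter defaults to false)
print(repr(haplotag_args(True)))  # '--tagSupplementary'
```

Keeping the flag out of the argument string by default means existing runs behave as before unless the parameter is set explicitly.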

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| nextflow_schema.json | Adds Longphase option metadata, but currently introduces schema-structure and default-value issues. |
| nextflow.config | Defines new longphase_tag_supplementary param default. |
| docs/usage.md | Documents the new Longphase parameter and updates Severus default shown to users. |
| conf/modules.config | Conditionally appends --tagSupplementary to Longphase haplotag args. |

Comment thread nextflow_schema.json
Comment on lines 75 to 79
  "enum": ["deepvariant", "clair"]
  },
- "minItems": 1
+ "minItems": 1,
+ "default": "['deepvariant', 'clair']"
  },

Copilot AI Apr 29, 2026

germline_var_keep is typed as an array, but the schema default is currently a string whose single-quoted contents are not a JSON array. This will break downstream tooling that reads defaults from the schema. Set default to a JSON array (e.g., ["deepvariant","clair"]) instead of a quoted string.

Copilot uses AI. Check for mistakes.
Comment thread nextflow_schema.json
  },
- "minItems": 1
+ "minItems": 1,
+ "default": "['deepsomatic', 'clair']"

Copilot AI Apr 29, 2026

somatic_var_keep is typed as an array, but the schema default is currently a string whose single-quoted contents are not a JSON array. This will confuse validation/docs generation that expects an array default. Change default to a JSON array (e.g., ["deepsomatic","clair"]).

Suggested change
- "default": "['deepsomatic', 'clair']"
+ "default": ["deepsomatic", "clair"]

Comment thread nextflow_schema.json
Comment on lines +116 to +120
"longphase_options": {
"title": "Longphase options",
"type": "object",
"description": "Options for Longphase phasing of small variants",
"properties": {

Copilot AI Apr 29, 2026

longphase_options is added under $defs, but it isn't referenced in the top-level allOf list. As a result, longphase_tag_supplementary defined there won't be part of the effective schema grouping/metadata. Add a $ref to #/$defs/longphase_options in allOf (and avoid duplicating the param elsewhere) so it shows up correctly in the schema-driven UI/docs.
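
One way to spot a group that was defined but never wired in is to compare the keys under `$defs` with the `$ref` targets in `allOf`. A minimal Python sketch, assuming the schema is already loaded as a dict; `unreferenced_defs` is a hypothetical helper, and the toy schema keeps only the two group names mentioned in this review.

```python
def unreferenced_defs(schema: dict) -> list:
    """Return $defs group names that no top-level allOf entry points at."""
    referenced = {
        entry["$ref"].rsplit("/", 1)[-1]  # "#/$defs/name" -> "name"
        for entry in schema.get("allOf", [])
        if "$ref" in entry
    }
    return sorted(set(schema.get("$defs", {})) - referenced)


schema = {
    "$defs": {
        "skip_options": {"type": "object", "properties": {}},
        "longphase_options": {
            "type": "object",
            "properties": {
                "longphase_tag_supplementary": {"type": "boolean", "default": False}
            },
        },
    },
    "allOf": [{"$ref": "#/$defs/skip_options"}],
}

print(unreferenced_defs(schema))  # ['longphase_options']

# Adding the missing reference clears the report.
schema["allOf"].append({"$ref": "#/$defs/longphase_options"})
print(unreferenced_defs(schema))  # []
```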

Comment thread nextflow_schema.json
Comment on lines +576 to +585
"properties": {
"generate_gvcf": {
"type": "boolean"
},
"longphase_tag_supplementary": {
"type": "boolean"
},
"skip_modkit": {
"type": "boolean"
}

Copilot AI Apr 29, 2026

The new top-level properties block defines generate_gvcf, longphase_tag_supplementary, and skip_modkit with only type, dropping their descriptions/defaults and bypassing the existing $defs grouping pattern used elsewhere in this schema. Instead, define these parameters in the appropriate $defs sections (e.g., generate_gvcf in small-variant options, skip_modkit in skip options, longphase_tag_supplementary in longphase options) and reference those groups from allOf, removing this partial top-level properties block.
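
A quick check for such bare root entries could look like the sketch below. It is a Python illustration under stated assumptions: `bare_root_properties` is a hypothetical helper, and treating `description` and `default` as the expected metadata keys is a choice made here, not a rule taken from the nf-core tooling.

```python
def bare_root_properties(schema: dict, expected=("description", "default")) -> dict:
    """Map each top-level property to the metadata keys it is missing."""
    report = {}
    for name, spec in schema.get("properties", {}).items():
        missing = [key for key in expected if key not in spec]
        if missing:
            report[name] = missing
    return report


# The top-level block from the diff, reduced to a Python dict.
schema = {
    "properties": {
        "generate_gvcf": {"type": "boolean"},
        "longphase_tag_supplementary": {"type": "boolean"},
        "skip_modkit": {"type": "boolean"},
    }
}

for name, missing in bare_root_properties(schema).items():
    print(f"{name}: missing {', '.join(missing)}")
```

An empty report after the parameters move into their `$defs` groups would indicate that no partial root definitions remain.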

Copilot AI review requested due to automatic review settings April 29, 2026 13:34
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


Comment thread nextflow_schema.json
Comment on lines 70 to 89
  "germline_var_keep": {
  "type": "array",
  "description": "List of germline variant callers to use. Must include at least one of [deepvariant, clair].",
  "items": {
  "type": "string",
  "enum": ["deepvariant", "clair"]
  },
- "minItems": 1
+ "minItems": 1,
+ "default": "['deepvariant', 'clair']"
  },
  "somatic_var_keep": {
  "type": "array",
  "description": "List of somatic variant callers to use. Must include at least one of [deepsomatic, clair].",
  "items": {
  "type": "string",
  "enum": ["deepsomatic", "clair"]
  },
- "minItems": 1
+ "minItems": 1,
+ "default": "['deepsomatic', 'clair']"
  },

Copilot AI Apr 29, 2026

germline_var_keep and somatic_var_keep are typed as arrays, but their default values are currently JSON strings (e.g. "['deepvariant', 'clair']"). In JSON Schema, default should match the declared type; this will break schema validation / generated docs. Set these defaults to actual JSON arrays (e.g. ["deepvariant", "clair"]).

Comment thread nextflow_schema.json
Comment on lines +579 to +586
"properties": {
"generate_gvcf": {
"type": "boolean"
},
"skip_modkit": {
"type": "boolean"
}
}

Copilot AI Apr 29, 2026

The new top-level properties entries for generate_gvcf and skip_modkit are incomplete (no description/default) and also bypass the existing parameter grouping via $defs/allOf (e.g. skip_modkit should live under #/$defs/skip_options). This will make schema-generated docs/UI inconsistent with docs/usage.md and other params. Consider moving these definitions into the appropriate $defs sections and adding description (and default if the schema is intended to carry defaults).

Comment thread nextflow_schema.json
Comment on lines +116 to +127
"longphase_options": {
"title": "Longphase options",
"type": "object",
"description": "Options for Longphase phasing of small variants",
"properties": {
"longphase_tag_supplementary": {
"type": "boolean",
"description": "Whether to include supplementary alignments in Longphase haplotype tagging.",
"default": false
}
}
},

Copilot AI Apr 29, 2026

PR title/description indicate this is only about adding a Longphase supplementary-alignment haplotag option, but this PR also changes schema defaults for *_var_keep, adds new schema properties for generate_gvcf / skip_modkit, and updates the documented default for severus_minsupport. Please update the PR description (or split into separate PRs) so reviewers understand the full scope and rationale for these additional changes.

Copilot AI review requested due to automatic review settings April 29, 2026 14:06
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


Comment thread nextflow_schema.json
Comment on lines 70 to 79
  "germline_var_keep": {
- "type": "array",
+ "type": "string",
  "description": "List of germline variant callers to use. Must include at least one of [deepvariant, clair].",
  "items": {
  "type": "string",
  "enum": ["deepvariant", "clair"]
  },
- "minItems": 1
+ "minItems": 1,
+ "default": "['deepvariant', 'clair']"
  },

Copilot AI Apr 29, 2026

germline_var_keep is used throughout the pipeline as a Groovy List (e.g., .contains(), .size(), and comparisons to ['clair'] / ['deepvariant']). Changing the schema type to string (and setting a string default like "['deepvariant', 'clair']") will encourage invalid user input and can break runtime logic. It also leaves the schema inconsistent: items and minItems constrain only arrays, so on a string-typed property they have no effect. Please revert this property to type: "array" with items.type: "string", keep minItems, and use a real JSON array for default (e.g. ["deepvariant","clair"]).
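
The runtime risk can be shown with a small analogy. The sketch below uses Python rather than the pipeline's Groovy, so it illustrates the failure mode instead of reproducing the workflow code: Python's `in` and `len` stand in for Groovy's `.contains()` and `.size()`, which likewise do a substring check and a character count when called on a String.

```python
as_list = ["deepvariant", "clair"]
as_string = "['deepvariant', 'clair']"  # the stringified default from the diff

# Membership: exact element match on a list, substring match on a string.
print("clair" in as_list, "clair" in as_string)  # True True
print("deep" in as_list, "deep" in as_string)    # False True

# Size: number of callers versus number of characters.
print(len(as_list), len(as_string))              # 2 24

# Equality with a single-caller list never holds for a string.
print(["clair"] == ["clair"], "['clair']" == ["clair"])  # True False
```

The second line is the dangerous case: a partial name passes a membership test on the string form, so a typo would not be rejected.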

Comment thread nextflow_schema.json
Comment on lines 80 to +88
  "somatic_var_keep": {
- "type": "array",
+ "type": "string",
  "description": "List of somatic variant callers to use. Must include at least one of [deepsomatic, clair].",
  "items": {
  "type": "string",
  "enum": ["deepsomatic", "clair"]
  },
- "minItems": 1
+ "minItems": 1,
+ "default": "['deepsomatic', 'clair']"

Copilot AI Apr 29, 2026

Same issue for somatic_var_keep: the workflow expects a List (uses .contains() / .size() patterns), but the schema now declares it as string with items/minItems and a stringified list default. This makes the schema inconsistent (items/minItems are ignored for string values) and can lead to runtime misbehavior if users supply a string. Please keep it as type: "array" with enum-constrained items, and set default to a JSON array like ["deepsomatic","clair"].

Comment thread nextflow_schema.json
Comment on lines 319 to 323
  "skip_qc": {
  "type": "boolean",
- "default": false,
  "description": "Skips all QC steps"
  },
  "skip_cramino": {

Copilot AI Apr 29, 2026

A number of boolean parameters had their default: false removed from the schema (e.g. all skip_* flags, use_gpu, plus others elsewhere like igenomes_ignore, download_vep_cache, ascat_pdf_plots, and generic help/version flags). Even though JSON Schema default is informational, nf-core tooling/docs and UI generation typically rely on it to display correct defaults and keep the schema aligned with nextflow.config. Please restore the default values (at least for the params that are explicitly defaulted in nextflow.config).
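
Drift between the two files can be reported with a small comparison. A minimal Python sketch, assuming the config defaults have already been extracted into a plain dict (parsing nextflow.config itself is out of scope here); `default_drift` is a hypothetical helper and the two sample parameters are taken from this review.

```python
def default_drift(schema_props: dict, config_defaults: dict) -> dict:
    """Describe every config default the schema fails to mirror."""
    drift = {}
    for name, expected in config_defaults.items():
        spec = schema_props.get(name)
        if spec is None:
            drift[name] = "missing from schema"
        elif "default" not in spec:
            drift[name] = "no default in schema"
        elif spec["default"] != expected:
            drift[name] = f"schema={spec['default']!r}, config={expected!r}"
    return drift


schema_props = {
    # Default removed, as in the diff under review.
    "skip_qc": {"type": "boolean", "description": "Skips all QC steps"},
    "longphase_tag_supplementary": {"type": "boolean", "default": False},
}
# Hand-written stand-in for values read from nextflow.config.
config_defaults = {"skip_qc": False, "longphase_tag_supplementary": False}

print(default_drift(schema_props, config_defaults))
# {'skip_qc': 'no default in schema'}
```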

Copilot AI review requested due to automatic review settings April 30, 2026 05:32
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


Comment thread nextflow_schema.json
Comment on lines +580 to +582
"generate_gvcf": {
"type": "boolean"
},

Copilot AI Apr 30, 2026

Top-level properties.generate_gvcf is missing a description/help_text/default and is not grouped with the other small variant calling options in $defs. This makes the schema inconsistent and may cause nf-core schema docs / UI grouping to omit or misplace the parameter metadata.

Suggested change
- "generate_gvcf": {
- "type": "boolean"
- },

Comment thread nextflow_schema.json
Comment on lines +582 to +584
},
"skip_modkit": {
"type": "boolean"

Copilot AI Apr 30, 2026

Top-level properties.skip_modkit is defined without description/help_text/default and outside the existing skip_options group. To keep schema validation + rendered docs consistent, this should be defined alongside the other skip parameters (with matching metadata) rather than as a bare root property.

Suggested change
- },
- "skip_modkit": {
- "type": "boolean"

Comment thread nextflow_schema.json
Comment on lines +71 to 76
"type": ["string", "array"],
"description": "List of germline variant callers to use. Must include at least one of [deepvariant, clair].",
"items": {
"type": "string",
"enum": ["deepvariant", "clair"]
},

Copilot AI Apr 30, 2026

germline_var_keep now allows type: "string", but the allowed values are only constrained via items.enum (which applies only when the value is an array). As a result, any arbitrary string would validate successfully and bypass the intended [deepvariant, clair] restriction. Consider using oneOf/anyOf to separately validate the string form (e.g., enum/pattern) vs the array form, or keep this as an array-only parameter.
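
The intent behind the oneOf/anyOf suggestion, validating each accepted form separately, can be sketched in plain Python. Everything here is illustrative: `validate_var_keep` is a hypothetical helper, and reading the string form as a comma-separated list is an assumption, since the PR does not say what string syntax it means to accept.

```python
ALLOWED_GERMLINE = {"deepvariant", "clair"}


def validate_var_keep(value, allowed):
    """Accept a list of callers or a comma-separated string; reject the rest."""
    if isinstance(value, str):
        # Assumed string syntax: "deepvariant,clair".
        callers = [part.strip() for part in value.split(",")]
    elif isinstance(value, list):
        callers = value
    else:
        raise TypeError(f"expected str or list, got {type(value).__name__}")
    if not callers:
        raise ValueError("at least one caller is required")
    # Both forms go through the same allow-list check.
    unknown = [c for c in callers if not isinstance(c, str) or c not in allowed]
    if unknown:
        raise ValueError(f"unknown caller(s): {unknown}")
    return callers


print(validate_var_keep(["clair"], ALLOWED_GERMLINE))             # ['clair']
print(validate_var_keep("deepvariant, clair", ALLOWED_GERMLINE))  # ['deepvariant', 'clair']

try:
    validate_var_keep("anything", ALLOWED_GERMLINE)
except ValueError as err:
    print(err)  # unknown caller(s): ['anything']
```

The point is that the allow-list applies to both forms, which is what the current schema fails to guarantee for strings.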

AmberVerhasselt marked this pull request as ready for review May 8, 2026 06:21