@bebatut
Member

@bebatut bebatut commented Sep 22, 2025

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

@bebatut bebatut force-pushed the detect_circular_contigs branch from 70659b1 to ca3d37a Compare September 22, 2025 14:03
@bebatut bebatut force-pushed the detect_circular_contigs branch from ca3d37a to 7e0e0e1 Compare September 22, 2025 14:12
@bebatut
Member Author

bebatut commented Sep 22, 2025

I do not understand why it fails at the "Combine chunked test results" step: https://github.com/galaxyproject/tools-iuc/actions/runs/17918115645?pr=7296
Tests are green with Planemo

Contributor

Can you publish this Python script separately (separate Git repository plus Bioconda recipe) or reuse an existing implementation? I think we should limit publishing software via IUC (except maybe for trivial cases) and restrict it to Galaxy tools. IUC already seems busy with the tool wrappers, and I'm afraid of the additional workload caused by scripts.

description: Detect circular sequences (e.g. circular contigs) in a FASTA file by k-mer matching
long_description: |
Detect circular sequences (e.g. circular contigs) by looking for exact identical k-mer at the two
ends on a cadre sequence of the sequences provided in the FASTA file. In order to be able
Contributor

cadre?

Args:
seq (str): sequence to format.
"""
return textwrap.wrap(seq, width=60, break_on_hyphens=False)
Contributor

This will probably cause problems with headers longer than 60 characters.

Use Biopython for I/O?
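
A minimal sketch of the concern, assuming the helper is only meant to wrap the sequence itself: keeping the header line out of the wrapping step means that a long header is never split. The names `wrap_sequence` and `format_record` are hypothetical and are not taken from the PR. Biopython's `SeqIO.write` with the `"fasta"` format would cover the same need without hand-written code.

```python
def wrap_sequence(seq, width=60):
    """Split a sequence into fixed-width chunks (hypothetical helper)."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def format_record(header, seq, width=60):
    """Build one FASTA record: the header is kept as-is, only the sequence is wrapped."""
    return "\n".join([f">{header}"] + wrap_sequence(seq, width)) + "\n"


# A header longer than 60 characters stays on a single line.
long_header = "contig_1 " + "x" * 100
record = format_record(long_header, "ACGT" * 40)
lines = record.splitlines()
```

Slicing is used here instead of `textwrap.wrap` because a nucleotide sequence has no whitespace to break on, so a fixed-width split is all that is needed.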

return textwrap.wrap(seq, width=60, break_on_hyphens=False)


def one_line_fasta(input_fp, output_fp):
Contributor

Biopython instead of this workaround?

@@ -0,0 +1,58 @@
<tool id="detect_circular_sequences" name="Detect circular sequences" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@">
<description>(e.g. circular contigs) in a FASTA file by k-mer matching</description>
Contributor

I would suggest avoiding the term k-mer. The word has too many implications.

The tool just checks for exact sequence identity between the two ends of a single sequence.

Returns:
: True if circular, False otherwise
"""
try:
Contributor

This try block only suppresses the traceback, which could be useful for debugging.
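
The body of the `try` is not shown in this hunk, so the following is only a generic sketch of the point being made. It contrasts a handler that swallows the error with one that adds context but chains the original exception, so the traceback survives. All function names are hypothetical.

```python
def ends_match(seq, k):
    """Hypothetical check: is the first k-length prefix equal to the last k-length suffix?"""
    if k <= 0 or k > len(seq):
        raise ValueError(f"k={k} is invalid for a sequence of length {len(seq)}")
    return seq[:k] == seq[-k:]


def check_swallowing(seq, k):
    """Pattern criticised in the review: the error and its traceback are lost."""
    try:
        return ends_match(seq, k)
    except Exception:
        return False


def check_reraising(seq, k):
    """Alternative: add context while chaining the original exception."""
    try:
        return ends_match(seq, k)
    except ValueError as err:
        raise RuntimeError("circularity check failed") from err
```

With the first variant, an invalid `k` is indistinguishable from a non-circular sequence. With the second, the `RuntimeError` carries the original `ValueError` as its `__cause__`. Dropping the `try` entirely would also keep the full traceback.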

begin ():
end ():
"""
pattern = re.compile(re.escape(begin))
Contributor

Why use a regexp if you can use a simple string search?
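
A sketch of the plain string search, assuming `begin` is a literal fragment to be located inside `end` (the argument descriptions are empty in the diff, so this reading is an assumption). Because the pattern is built with `re.escape`, it can only match literal text, which `str.find` does directly. `find_all` is a hypothetical name.

```python
def find_all(begin, end):
    """Return the start position of every occurrence of begin in end, overlaps included."""
    if not begin:
        return []  # guard: searching for an empty string is not meaningful here
    positions = []
    start = end.find(begin)
    while start != -1:
        positions.append(start)
        # Advance by one character so overlapping occurrences are also reported.
        start = end.find(begin, start + 1)
    return positions
```

One behavioural difference to keep in mind: `re.finditer` reports non-overlapping matches only, while this loop also reports overlapping ones. If only the presence of the fragment matters, `begin in end` is enough.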

4 participants