[WIP] Export CWL-abstract workflow representation #9407
base: dev
I appreciate the effort here @ieguinoa, awesome start. I think I'm going to strongly advocate for us pushing this down a layer into gxformat2 though. There are some pros and cons of doing that - the serious cons include not having Galaxy runtime knowledge of valid tool state and connections. Certain things will require us to ensure that workflow exports carry the information needed to process these formats correctly - that is potentially a lot of extra work, but I'm more than happy to help with that process and improve the workflow exports to capture all the metadata we need. Those are the cons - the pros are numerous though, I think. I've opened the start of a PR to do this at that level to highlight some of those advantages. That PR is here: https://github.com/galaxyproject/gxformat2/pull/38/files. The advantages of that PR over this approach are that:
Hi @jmchilton, thanks for the response.
Ahhh - I had seen that repository also but I assumed this was newer because Björn pinged me on this yesterday. Sorry - I should have started from there - I do think the approach I outlined in gxformat2 is more promising because it doesn't have two separate blocks for format2 and native and handles type conversions and subworkflows a bit better - but your script has more help and does a better job with the I/O information that is contained in the native format but dropped as unneeded in gxformat2. I think the format2 schema (since it is based on CWL's) has room for all that extra input/output annotation information - it just wasn't strictly needed so it gets dropped in a naive conversion - I've long wanted to do a variant of the conversion that preserves more of that information - I'll see if I can do that. I'll try to find some time to bring more of that other galaxy2cwl goodness into the gxformat2 script. If there is anything else I can do to convince you to hack on the gxformat2 version, add test cases for missing features and get y'all to use it for workflowhub - please let me know. I do have a quick question - are the inputs and outputs on the operation used by workflowhub (I assume just cwlviewer?) or is specifying the in/out on the workflow steps sufficient for your purposes? I handled that information and it seems to validate with cwltool, though the documents seem a bit off as a result.
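To make the question about step-level `in`/`out` versus operation-level `inputs`/`outputs` concrete, here is a rough sketch. It is not taken from either script: the step content, the port names and the helper `interface_mismatches` are all hypothetical, and the step is written as a plain Python dict standing in for the exported CWL.

```python
# Hypothetical abstract CWL step, written as a plain Python dict.
step = {
    # Step-level wiring: step input name -> workflow-level source.
    "in": {"input1": "fastq_input"},
    # Step-level outputs exposed to the rest of the workflow.
    "out": ["out_file1"],
    "run": {
        "class": "Operation",
        # Operation-level interface: port name -> type.
        "inputs": {"input1": "File"},
        "outputs": {"out_file1": "File"},
    },
}


def interface_mismatches(step):
    """Return port names declared on only one side (step vs. embedded Operation)."""
    operation = step.get("run", {})
    step_in = set(step.get("in", {}))
    step_out = set(step.get("out", []))
    op_in = set(operation.get("inputs", {}))
    op_out = set(operation.get("outputs", {}))
    return {
        "inputs": sorted(step_in ^ op_in),
        "outputs": sorted(step_out ^ op_out),
    }


print(interface_mismatches(step))
```

If a consumer such as cwlviewer only reads the step-level `in`/`out`, the `run` block could stay minimal; a check like this would only matter when both sides are exported.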
No problem, I actually forgot to post a link to that galaxy2cwl repo here. No need to convince me to hack on the gxformat2 ;) The galaxy2cwl was a bit of a quick fix since we rushed the launch of workflowhub and wanted to get the cwlviewer part going, as it is definitely useful for users to see a diagram of the workflow. Should I add the native-formatted workflow examples to the gxformat2 repo? I will look for the use cases that had the most differences between the .ga and format2. About the workflowhub aim: using cwl-abstract representations is mainly a means to get a standardized metadata file with the workflow representation/interface. The goal is to actually get something similar for other WfMSs like Nextflow so that most workflows submitted to workflowhub have a similar cwl-abstract representation. This can later be used for higher-level processing (search by workflow patterns, input and output types, EDAM ontologies, etc.)
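As an illustration of the kind of higher-level processing meant here, the sketch below indexes workflows by the types of their inputs. Everything in it is an assumption made for the example: the workflow names, the `index_by_input_type` helper, and the simplification that each input type is a plain string (real CWL types can also be lists or nested records).

```python
from collections import defaultdict


def index_by_input_type(workflows):
    """Map each input type to the sorted names of workflows that accept it."""
    index = defaultdict(set)
    for name, doc in workflows.items():
        for spec in doc.get("inputs", {}).values():
            # An input may be a bare type string or a dict with a "type" key.
            cwl_type = spec["type"] if isinstance(spec, dict) else spec
            index[cwl_type].add(name)
    return {cwl_type: sorted(names) for cwl_type, names in index.items()}


# Hypothetical cwl-abstract documents, one exported from each WfMS.
workflows = {
    "galaxy_variant_calling": {
        "class": "Workflow",
        "inputs": {"reads": "File", "reference": {"type": "File"}},
    },
    "nextflow_rnaseq": {
        "class": "Workflow",
        "inputs": {"reads": "File", "threads": "int"},
    },
}

print(index_by_input_type(workflows))
```

The same idea would extend to output types or EDAM annotations once those are present in the exported documents.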
Since you're on board with using it - I merged my PR and did a release that adds the experimental feature (https://pypi.org/project/gxformat2/) - it should make it easier to PR that repo and play with the functionality. I'll keep you updated on any more progress I can make this week.
Add whatever - I'm fine with just developing test cases around the native format - every new test case helps and the implementation will translate the information to Format 2 anyway, so we get free testing of the conversion, the spec and so on when we write test cases around native format workflows.
I did another release of gxformat2 with another pass at that abstract CWL export - this one includes Marius' test COVID workflow as an example and it validates (https://github.com/galaxyproject/gxformat2/pull/43/files#diff-8fd380a30112029c42a33c720693aa77R73). That PR maybe gives some indication of how one can add test workflows - if you run that test case the outputs are in
Thanks for the heads-up! planemo and Galaxy already depend on gxformat2 and use it for format conversions (e.g. the already existing
Any updates on the plan for this code?
# Pack workflow data into a dictionary and return
data = {}
data['class'] = 'Workflow'
data['cwlVersion'] = "v1.2.0-dev1"
- data['cwlVersion'] = "v1.2.0-dev1"
+ data['cwlVersion'] = "v1.2"
This is a first step towards exporting a Galaxy workflow as an RO-Crate object.
Besides the required metadata, the workflow RO-Crate profile recommends accompanying the native workflow definition with an abstract CWL description.
This PR is heavily dependent on the implementation of the Abstract Operation in CWL (common-workflow-language/cwl-v1.2#3).
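For readers unfamiliar with the target format, the following is a minimal sketch of what such an abstract CWL description might look like when packed into a dictionary. It is not the code from this PR: the function name `pack_abstract_workflow`, the step `qc` and all port names are made up for illustration, and the `cwlVersion` default follows the `v1.2` value suggested in the review.

```python
import json


def pack_abstract_workflow(inputs, outputs, steps, cwl_version="v1.2"):
    """Pack workflow data into a dictionary and return it."""
    data = {}
    data["class"] = "Workflow"
    data["cwlVersion"] = cwl_version
    data["inputs"] = inputs
    data["outputs"] = outputs
    data["steps"] = steps
    return data


# Hypothetical one-step workflow; the tool is described only by its interface.
doc = pack_abstract_workflow(
    inputs={"fastq_input": "File"},
    outputs={"report": {"type": "File", "outputSource": "qc/report"}},
    steps={
        "qc": {
            "in": {"reads": "fastq_input"},
            "out": ["report"],
            # An abstract Operation declares ports but no command to execute.
            "run": {
                "class": "Operation",
                "inputs": {"reads": "File"},
                "outputs": {"report": "File"},
            },
        }
    },
)

# JSON is used here only to keep the sketch dependency-free.
print(json.dumps(doc, indent=2))
```

Each step's `run` block uses `class: Operation` instead of a concrete `CommandLineTool`, which is what makes the description abstract: it records the interface and the connections between steps without saying how any step is executed.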