Compleasm: use symlink instead of copying busco data #6679
base: main
cp -r '${busco_database.fields.path}/lineages/${lineage_dataset}/' 'galaxy_db/' &&
mkdir -p 'galaxy_db/' &&
ln -s '${busco_database.fields.path}/lineages/${lineage_dataset}/' 'galaxy_db/${lineage_dataset}' &&
touch 'galaxy_db/${lineage_dataset}.done' &&
Does this empty file also exist in the DB folder? If so, I would prefer a symlink.
Either way, a short explanatory comment would be great.
It's just an empty marker file specific to compleasm; it tells compleasm that the lineage is already present and should not be re-downloaded. I'll add a comment.
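For anyone following the thread, here is a minimal sketch of the symlink-plus-marker pattern from the diff above. It runs in a temporary directory, and `db_path`, `lineage`, and `placeholder.txt` are hypothetical stand-ins for the Galaxy template variables and the real lineage contents. That compleasm skips the download when the `.done` file exists is taken from the comment above, not checked here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the Galaxy template variables in the diff.
work="$(mktemp -d)"
db_path="$work/busco_downloads"   # stands in for ${busco_database.fields.path}
lineage="example_odb10"           # stands in for ${lineage_dataset}

# Fake lineage directory so the sketch is self-contained.
mkdir -p "$db_path/lineages/$lineage"
echo "placeholder" > "$db_path/lineages/$lineage/placeholder.txt"

cd "$work"
mkdir -p galaxy_db

# Link the lineage directory instead of copying it.
ln -s "$db_path/lineages/$lineage" "galaxy_db/$lineage"

# Empty marker file next to the link (the "already downloaded" flag
# described in the comment above).
touch "galaxy_db/$lineage.done"

ls -l galaxy_db
```

The link is created in constant time regardless of lineage size, whereas `cp -r` scales with the amount of data, which is the motivation given in the PR description.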
@@ -2,5 +2,5 @@
# - value
# - name
# - version
# - /path/to/data
eukaryota_odb10 eukaryota 5.4.6 ${__HERE__}/test-db/busco_downloads
May I ask why you updated the test (data)?
To keep the test-data dir as small as possible: entomoplasmatales_odb10 is a much smaller lineage than eukaryota_odb10.
True, but changing it will increase the size of the repo. The "problem" with git
repos is that everything that is in it will be there forever.
But changing it might still be a good idea if the runtime is reduced significantly.
FOR CONTRIBUTOR:
I looked at the compleasm code and found this way to avoid copying the busco dataset, which can take time and make compleasm jobs longer than busco jobs.