Migrate Asklepian to use Datapipe products rather than its own MSA #61
Comments
This should be reasonably straightforward as the naive MSA is right in the scope of Asklepian and I think only the "best ref" genomes are used in the MSA too.
@rmcolq suggested that we think about using the […]. Asklepian is informally considered a "must run" service, so if we use the datapipe MSA product we'd need to guarantee it can run at least every day to ensure the Asklepian tables reach PHE in time for analysis. The informal expectation is that the tables should be ingested PHE-side by early morning, which requires CLIMB-COVID to emit Asklepian products by the end of the working day. RC has confirmed they are willing to handle this. Datapipe also has a deduplication step that picks a best reference using a very similar algorithm to
Plan is to migrate to using the
SamStudio8/asklepian@43ffb67 uses the MSA rather than the best_ref.fasta to generate the genomes table. As they use the exact same headers, they are interchangeable.
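For anyone wanting to sanity-check the "same headers" claim on their own files, here is a minimal sketch in plain Python. It is not taken from the Asklepian or Datapipe code; the function names and the toy records are hypothetical, and it assumes a header is everything after `>` on the record's first line.

```python
import io


def fasta_headers(lines):
    """Return the header of each FASTA record (text after '>'), in file order."""
    return [line[1:].strip() for line in lines if line.startswith(">")]


def compare_headers(lines_a, lines_b):
    """Return (only_in_a, only_in_b) as sorted lists of headers."""
    a = set(fasta_headers(lines_a))
    b = set(fasta_headers(lines_b))
    return sorted(a - b), sorted(b - a)


# Toy stand-ins for the two inputs (hypothetical records, not real data).
best_ref = io.StringIO(">sample1\nACGT\n>sample2\nACGA\n")
msa = io.StringIO(">sample1\nAC-GT\n>sample2\nAC-GA\n")

only_best_ref, only_msa = compare_headers(best_ref, msa)
print("only in best_ref:", only_best_ref)
print("only in MSA:", only_msa)
```

With real files, the `io.StringIO` objects would be replaced by open file handles. Two empty lists mean the header sets match, which is the condition under which swapping one input for the other should leave the set of rows in the genomes table unchanged.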
As we already discussed it here, I've updated this issue to track a longer-term change for Asklepian to use Datapipe products.
Bumping to backlog #62
Please can we have the genome table sequence column changed to the aligned sequence instead so we can use them in tree building?
Thanks!