Create build_tarballs.jl for norMD #11075
Open
diegozea wants to merge 13 commits into JuliaPackaging:master from diegozea:patch-3
Commits
51d688c Create build_tarballs.jl for norMD (diegozea)
976dbd3 Update build_tarballs.jl (diegozea)
89bdaee Use bindir (diegozea)
1bf930a Add QUA License and use bindir (diegozea)
8f0670f Use exeext for Windows (diegozea)
49346e6 Avoid building for Windows (diegozea)
cf3411e norMD to AQUAnorMD (diegozea)
e76ec98 Update build_tarballs.jl (diegozea)
c13894a Apply suggestions from code review (diegozea)
af00b8c Merge branch 'JuliaPackaging:master' into patch-3 (diegozea)
f8425ef Go back to the list of supported platforms. (diegozea)
1cd3034 Apply suggestions from code review (diegozea)
f6c2687 Apply suggestions from code review (diegozea)
@@ -0,0 +1,49 @@

```julia
# Note that this script can accept some limited command-line arguments, run
# `julia build_tarballs.jl --help` to see a usage message.
using BinaryBuilder, Pkg

name = "norMD"
version = v"1.3.0"

# This script installs norMD (normalized Mean Distance) version 1.3, provided by the AQUA suite.
# norMD is a statistical metric used to assess the quality of multiple sequence alignments (MSAs).
# The `normd` program calculates the overall norMD score for an entire multiple sequence alignment.
#
# Usage example:
# - `normd aln_file`: Calculates the norMD score for the specified alignment file.
#
# If you use this tool, please cite the following references:
# - Thompson, J. D., Plewniak, F., Ripp, R., Thierry, J. C., & Poch, O. (2001). Towards a reliable objective function for multiple sequence alignments. Journal of molecular biology, 314(4), 937-951.
# - Muller, J., Creevey, C. J., Thompson, J. D., Arendt, D., & Bork, P. (2010). AQUA: automated quality improvement for multiple sequence alignments. Bioinformatics, 26(2), 263-265.

# Collection of sources required to complete build
sources = [
    ArchiveSource("https://www.bork.embl.de/Docu/AQUA/latest/norMD1_3.tar.gz", "24ba32425640ae6288d59ca2bf5820dd85616132fe6a05337d849035184c660d"),
    FileSource("https://www.bork.embl.de/Docu/AQUA/latest/License.txt", "ddb9db7630752f8fdc6898f7c99a99eaeeac5213627ecb093df9c82f56175dc7")
]

# Bash recipe for building across all platforms
script = raw"""
cd $WORKSPACE/srcdir/normd_noexpat/
sed -i '/#include "score.h"/a#include <string.h>' init.c
make -j${nproc} CFLAGS="-c -O2 -std=c99 -Wno-implicit-function-declaration"
install -Dvm 755 normd "${bindir}/normd${exeext}"
"""
# NOTE: Only the normd executable is installed.
# The programs normd_subaln, normd_range, normd_sw, normd_aln, and normd_aln1 are built but not installed.

# These are the platforms we will build for by default, unless further
# platforms are passed in on the command line
platforms = supported_platforms()

# The products that we will ensure are always built
products = [
    ExecutableProduct("normd", :normd)
]

# Dependencies that must be installed before this package can be built
dependencies = Dependency[
]

# Build the tarballs, and possibly a `build.jl` as well.
build_tarballs(ARGS, name, version, sources, script, platforms, products, dependencies; julia_compat="1.6")
```
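The two less obvious steps of the recipe, the `sed` patch and the `install` call, can be tried outside BinaryBuilder. The sketch below is only an illustration under stated assumptions: the scratch directory, the stand-in `init.c`, and the fake `normd` file are hypothetical and do not reproduce the upstream sources. It also assumes GNU `sed` and GNU coreutils (`install -D`, `stat -c`), which the one-line `a` form and the `-D` flag in the recipe rely on.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area standing in for $WORKSPACE/srcdir and ${bindir}.
work="$(mktemp -d)"
bindir="$work/destdir/bin"
exeext=""   # assumption: empty on non-Windows targets

# Stand-in for upstream init.c (the real file is not reproduced here).
printf '%s\n' '#include <stdio.h>' '#include "score.h"' 'int main(void) { return 0; }' > "$work/init.c"

# Same sed expression as the recipe: append a line after the matching line.
sed -i '/#include "score.h"/a#include <string.h>' "$work/init.c"
patched_line="$(sed -n '3p' "$work/init.c")"

# Fake file standing in for the compiled normd binary.
printf '#!/bin/sh\necho normd\n' > "$work/normd"

# -D creates missing parent directories, -m 755 sets the mode, -v reports each step.
install -Dvm 755 "$work/normd" "${bindir}/normd${exeext}"
installed_mode="$(stat -c '%a' "${bindir}/normd${exeext}")"

echo "line 3 after patch: $patched_line"
echo "installed mode: $installed_mode"
```

The appended `#include <string.h>` lands directly after the `score.h` include, and `install -D` means the recipe does not need a separate `mkdir -p "${bindir}"`.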
I would suggest using a less concise name, as the tool is for a specific field (bioinformatics, MSA), e.g.,
Even adding MSA to disambiguate among all the "Aqua"'s out there (Aqua.jl, Aqua macOS GUI, JetBrains Aqua, ... https://github.com/search?q=aqua&type=repositories&s=stars&o=desc ).
Hi! Thanks, I understand :) Can it be `NormalizedMeanDistance`?
From my perspective, with a background in Computer Science, and not in Bioinformatics, I would be more comfortable with a name completely without field-specific abbreviations, i.e., I would really be most comfortable with a name like
I'm OK with AquaMSA as the prefix if that is preferred, but I think there should be some prefix to disambiguate among other interpretations of "NorMD" or "NormalizedMeanDistance".
I do not think NormalizedMeanDistance only makes sense for MSA (with sequence referring to DNA/RNA sequences, I presume...). E.g., in Computer Vision, it is common to look at the IoU (intersection over union) distance between object detections, and then compute `mAP` (mean average precision) based on an IoU threshold. I would not expect everyone in the whole world (Julia world) to understand, or even accept, computer vision-specific terms as universal terms.
I believe the package name should ideally match the tool's name, following the usual [convention](https://docs.binarybuilder.org/stable/building/#Name) for JLL packages.
After reconsidering, since the upstream `AQUA` project provides the `norMD` tool, I think it would make more sense to name the package `AQUAnorMD`. This keeps the name closely tied to the upstream project while being simpler and improving discoverability.
What do you think?
In the last commit, I changed the name from `norMD` to `AQUAnorMD` to better align with the upstream project and improve clarity. Should the file also be renamed to `AQUAnorMD` and moved into the `A/` folder to reflect this new naming?
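The question about moving the recipe into `A/` follows from the repository layout, where recipes are grouped into directories named after the uppercase first letter of the package name. A minimal sketch of that mapping, assuming the `<letter>/<name>/build_tarballs.jl` layout; `recipe_path` is a hypothetical helper for illustration and not part of any Yggdrasil tooling:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: map a package name to its recipe location,
# assuming the <uppercase first letter>/<name>/build_tarballs.jl layout.
recipe_path() {
    first="$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')"
    printf '%s/%s/build_tarballs.jl\n' "$first" "$1"
}

old_path="$(recipe_path norMD)"
new_path="$(recipe_path AQUAnorMD)"

echo "$old_path"
echo "$new_path"
```

Under this assumption a rename is never just a change to `name` in the script: the directory, and possibly its parent letter folder, changes with it.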
By "not necessarily" I mean that they're specifically not checked for naming rules.
Thanks for the feedback and the thoughtful discussion.
I'd like to explain why I still think `norMD` is the most appropriate name for the JLL package:
1. There is no other standalone software named `norMD`. So, currently there are no naming clashes, and such conflicts are unlikely in the future, especially since the casing (`norMD`) isn't standard in Julia.
2. I really like the idea of JLL packages being named after their actual software or suite. It helps a lot with discoverability. For example, that's how I found useful tools like `MAFFT_jll`. I wouldn't have found them easily if the name had been something like `MultipleAlignmentUsingFastFourierTransform_jll`.
3. While I've taken the implementation from AQUA because it's the only publicly accessible one (available via `wget`), the `norMD` tool was published independently before AQUA. In fact, to get other versions, you'd need to contact the paper authors directly. So AQUA is just bundling `norMD`, not the origin of it. Future versions of the tool could appear outside AQUA too. So, I would prefer not to tie the package name to AQUA.
Let me know what you think. I'm open to discussing further, but I believe keeping the `norMD` name fits the conventions and benefits users the most.
Regarding 1, when I go looking for “norMD”, I find (in the following order):
(no trace of the aqua/msa normd)
Regarding 2: In contrast, all of the first 10 Google hits for mafft seem to be for bioinformatics multiple alignment FFT of some DNA/RNA sequences.
Regarding 3: It makes sense not to link the package to AQUA if they are merely bundling it.
Note that the mentioned `normD` and `normd` do not clash because of the different casing ;) Also, `normD` is an R subroutine, so it would not be a JLL package.
I would say that casing is too brittle to rely on for disambiguation.
And yes, I don't see the R package command becoming a JLL; it was more to show that there were (more common?) interpretations of what "normd" might be.
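One way to see why casing is a fragile discriminator: under any case-insensitive comparison (many search tools, some file systems), the three spellings from this thread collapse into one string. A small sketch, where `fold` is a hypothetical helper used only for illustration:

```shell
#!/bin/sh
set -eu

# Hypothetical helper: fold a name to lowercase, as a case-insensitive
# comparison effectively does.
fold() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# The three spellings mentioned in the discussion.
names="norMD normD normd"

# Count distinct names with exact matching, then after case folding.
distinct_exact="$(printf '%s\n' $names | LC_ALL=C sort -u | wc -l | tr -d ' ')"
distinct_folded="$(for n in $names; do fold "$n"; echo; done | LC_ALL=C sort -u | wc -l | tr -d ' ')"

echo "distinct (case-sensitive):   $distinct_exact"
echo "distinct (case-insensitive): $distinct_folded"
```

Exact matching keeps the names apart, while folding leaves a single name, which is the situation a user typing the name from memory is likely to be in.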