MSA Alignment Comparison Tool

A command-line tool that automates the validation and comparison of multiple sequence alignments (MSAs). It calculates accuracy metrics between reference alignments and realignments using vectorized NumPy operations.

What the Tool Does

For each pair of alignments, the script:

Reads the FASTA files and normalizes the sequences.
Converts each alignment into a numeric “coordinate matrix,” where gaps are 0 and nucleotides are assigned their 1‑based ungapped position.
Compares the two matrices to count how many positions differ.
Calculates an overall structural accuracy score.
Writes all results to a clean CSV file.
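
The coordinate-matrix idea in the steps above can be sketched as follows. This is a minimal illustration, not the code in msa_scorer.py: the function name coordinate_matrix, the gap characters "-" and ".", and the toy sequences are all assumptions made for the example. It also assumes that every sequence in an alignment has the same length and that both alignments have the same shape.

```python
import numpy as np

def coordinate_matrix(sequences, gap_chars="-."):
    """Return a matrix with 0 at gaps and the 1-based ungapped position elsewhere.

    Hypothetical helper for illustration; assumes equal-length aligned sequences.
    """
    chars = np.array([list(s) for s in sequences])   # shape: (n_sequences, alignment_length)
    is_residue = ~np.isin(chars, list(gap_chars))    # True wherever there is a non-gap character
    # The running count of residues gives the ungapped position;
    # multiplying by the mask resets gap cells to 0.
    return np.cumsum(is_residue, axis=1) * is_residue

# Toy alignments (made up for this sketch)
ref = ["AC-GT", "A--GT"]
real = ["ACG-T", "A--GT"]

ref_m = coordinate_matrix(ref)
real_m = coordinate_matrix(real)

# Count cells where the two coordinate matrices disagree
differences = int(np.count_nonzero(ref_m != real_m))
total = ref_m.size
accuracy = 100.0 * (1 - differences / total)

print(ref_m.tolist())   # [[1, 2, 0, 3, 4], [1, 0, 0, 2, 3]]
print(real_m.tolist())  # [[1, 2, 3, 0, 4], [1, 0, 0, 2, 3]]
print(differences, total)  # 2 10
```

In this toy case the first sequence places its third residue in a different column, so two cells differ out of ten. How the actual script handles alignments of different lengths or sequence counts is not covered by this sketch.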

Installation

Requires Python 3.10+.

Install dependencies:

pip install biopython numpy natsort

Usage Examples

usage: msa_scorer.py [-h] -r REF -a REAL [-o OUTPUT] [-e EXTENSIONS [EXTENSIONS ...]] [-v] [--strict]

options:
  -h, --help            show this help message and exit
  -r REF, --ref REF     Reference alignment directory
  -a REAL, --real REAL  Realignment directory
  -o OUTPUT, --output OUTPUT
                        Output CSV path (default: results.csv)
  -v, --verbose         Enable debug logging
  --strict              Exit pipeline on first error

Compare two directories of alignments:

python msa_scorer.py --ref ref_dir --real real_dir

Save results under a custom filename:

python msa_scorer.py -r ref -a real -o comparison_results.csv

Enable debugging output:

python msa_scorer.py -r ref -a real -v

Stop immediately on error:

python msa_scorer.py -r ref -a real --strict

Specify your own set of acceptable file extensions:

python msa_scorer.py -r ref -a real -e .fa .fasta .aln

Output Format

The script writes a CSV with one row per paired comparison. Each row includes:

simulation_id
reference_file
realignment_file
total_differences
total_positions
accuracy_percent
sequences_count
alignment_length

Example:

0, sample1.fasta, sample1.fa, 42, 10500, 99.60, 12, 880
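
The example row is consistent with accuracy_percent being 100 × (1 − total_differences / total_positions), although the README does not state the formula, so treat that as an inference. The sketch below parses the example row with Python's csv module and recomputes the value. The column order follows the list above; whether the real output file includes a header row or spaces after commas is not confirmed here.

```python
import csv
import io

# Column names as listed in this README; the order is assumed to match the file
COLUMNS = [
    "simulation_id", "reference_file", "realignment_file",
    "total_differences", "total_positions", "accuracy_percent",
    "sequences_count", "alignment_length",
]

row_text = "0, sample1.fasta, sample1.fa, 42, 10500, 99.60, 12, 880"

# skipinitialspace drops the blank after each comma in the example row
reader = csv.DictReader(io.StringIO(row_text), fieldnames=COLUMNS, skipinitialspace=True)
row = next(reader)

# Inferred formula, not confirmed by the source
recomputed = 100.0 * (1 - int(row["total_differences"]) / int(row["total_positions"]))

print(f"{recomputed:.2f}", row["accuracy_percent"])  # 99.60 99.60
```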
