fix: alignment span slicing drops characters when gaps present #409

Merged: thatbudakguy merged 3 commits into main from alignment-fix on May 30, 2026

@thatbudakguy
Member

The span slicing in SmithWatermanAligner used len(cu)/len(cv), which
counts gap characters, so the span extended to tokens beyond the
aligned region. This led to mismatched character display and phoneme
transcription.

Changes:

  • align.py: count only non-gap elements when slicing spans
  • align.py: skip gap chars in edge trimming to prevent over-trimming
  • console.py: use alignment length for _mark_span loop range
  • console.py: add bounds check before accessing span/other pointers
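
To illustrate the first and last items, here is a minimal sketch of the idea, not the code from this PR. The gap marker `GAP`, the helper names, and the sample data are all hypothetical. The sketch assumes an aligned sequence is a list in which gaps appear as a marker element, and that the span is sliced out of the original token list starting at a known offset.

```python
GAP = "-"  # hypothetical gap marker; the real aligner may use something else


def span_counting_gaps(tokens, start, aligned):
    """Old behaviour (sketch): the slice length includes gap elements."""
    return tokens[start:start + len(aligned)]


def span_skipping_gaps(tokens, start, aligned):
    """Fixed behaviour (sketch): only non-gap elements consume tokens."""
    consumed = sum(1 for item in aligned if item != GAP)
    return tokens[start:start + consumed]


def mark_columns(span, aligned):
    """Walk the alignment columns, advancing the span pointer only on non-gaps.

    The loop runs over the alignment length, and the pointer is checked
    against the span length before each access.
    """
    marked = []
    ptr = 0
    for col in range(len(aligned)):
        if aligned[col] == GAP:
            marked.append(None)  # a gap column has no token in this span
            continue
        if ptr >= len(span):  # bounds check before touching the span
            break
        marked.append(span[ptr])
        ptr += 1
    return marked


tokens = list("xxabcdyy")
aligned = ["a", "b", GAP, "c", "d"]  # four tokens aligned, one gap
start = 2

buggy = span_counting_gaps(tokens, start, aligned)
fixed = span_skipping_gaps(tokens, start, aligned)
columns = mark_columns(fixed, aligned)

print(buggy)    # ['a', 'b', 'c', 'd', 'y']
print(fixed)    # ['a', 'b', 'c', 'd']
print(columns)  # ['a', 'b', None, 'c', 'd']
```

With one gap in the alignment, the length-based slice picks up one extra token (`y`) past the aligned region, while the gap-aware slice stops at the last aligned token.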

GDRom and others added 3 commits May 21, 2026 12:18
@thatbudakguy thatbudakguy merged commit 2d6dc52 into main May 30, 2026
3 checks passed
@thatbudakguy thatbudakguy deleted the alignment-fix branch May 30, 2026 19:06
2 participants