This guide walks through publishing biPangolin to PyPI with bundled probe weights and auto-downloaded Pangolin weights.
Create a public repo, e.g. github.com/wilkino/bipangolin. Push this whole
package directory there.
Vendor Pangolin's model.py into the package:

cd src/bipangolin/
curl -O https://raw.githubusercontent.com/tkzeng/Pangolin/main/pangolin/model.py
# Verify the file looks right
head -20 model.py

Add this header to the top of model.py:

# Vendored from https://github.com/tkzeng/Pangolin (GPL-3.0 License)
# Copyright (c) Tony Zeng, Yang I. Li, et al.

This means users no longer need pip install git+https://github.com/tkzeng/Pangolin.git,
which removes the most fragile dependency.
From the repository root, copy your 24 trained probes into src/bipangolin/data/probes/:

mkdir -p src/bipangolin/data/probes
cp /path/to/bipangolin_probes/*.pt src/bipangolin/data/probes/
ls src/bipangolin/data/probes/ | wc -l # should be 24

Total size should be ~1.3 MB, well within PyPI's per-file limit. The probes
live inside the bipangolin package tree, so Hatch includes them in the wheel
with the package files.
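
The count and size check can also be scripted so it runs before every build. This is a minimal sketch, assuming the probes are the only .pt files in the directory; the function name check_probes and the 5 MB ceiling are illustrative choices, not part of biPangolin.

```python
from pathlib import Path


def check_probes(probe_dir, expected_count=24, max_total_bytes=5 * 1024 * 1024):
    """Return (count, total_bytes) for the .pt files in probe_dir.

    Raises ValueError if the count is wrong or the total size exceeds the
    ceiling. The ceiling is an illustrative default, not a PyPI limit.
    """
    files = sorted(Path(probe_dir).glob("*.pt"))
    total = sum(f.stat().st_size for f in files)
    if len(files) != expected_count:
        raise ValueError(f"expected {expected_count} probe files, found {len(files)}")
    if total > max_total_bytes:
        raise ValueError(f"probes total {total} bytes, above {max_total_bytes}")
    return len(files), total
```

Calling check_probes("src/bipangolin/data/probes") from the repository root returns the file count and total size in bytes, or raises if a probe is missing.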
Package the Pangolin weights as a tarball for a GitHub release:

cd /path/to/Pangolin/pangolin/models/
cp /path/to/Pangolin/LICENSE PANGOLIN_LICENSE
tar czf pangolin_models_v24.tar.gz final.[1-3].[0246].3.v2 final.[1-3].[1357].3.v2 PANGOLIN_LICENSE
ls -lh pangolin_models_v24.tar.gz

Include all 24 Pangolin v2 files: 12 P-tuned files plus 12 PSI-tuned files.
The default P-only workflow uses the even-indexed files; --psi / --psi-only
also need the odd-indexed PSI-tuned files.
The PANGOLIN_LICENSE file keeps the standalone weight archive clear about its
upstream GPL-3.0 provenance. Also mention Pangolin and GPL-3.0 in the GitHub
release notes for the asset.
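
A shell glob expands only to files that exist, so a missing weight file would not cause tar to fail. The sketch below derives the 24 expected names from the glob pattern used for the tarball (final.[1-3].[0-7].3.v2) and compares them with the archive members. The helper names are hypothetical.

```python
import tarfile


def expected_weight_names():
    """Names matched by final.[1-3].[0-7].3.v2: 3 models x 8 indices = 24."""
    return {f"final.{j}.{i}.3.v2" for j in range(1, 4) for i in range(8)}


def check_archive(path, license_name="PANGOLIN_LICENSE"):
    """Return (missing, unexpected) member names for the weights tarball."""
    with tarfile.open(path, "r:gz") as tar:
        members = {m.name for m in tar.getmembers() if m.isfile()}
    wanted = expected_weight_names() | {license_name}
    return sorted(wanted - members), sorted(members - wanted)
```

Two empty lists from check_archive("pangolin_models_v24.tar.gz") mean the archive holds exactly the 24 weight files plus the license.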
Get the SHA-256:

sha256sum pangolin_models_v24.tar.gz

Update src/bipangolin/_weights.py:
- Replace USERNAME in PANGOLIN_WEIGHTS_URL with your GitHub username
- Replace REPLACE_WITH_ACTUAL_SHA256_BEFORE_PUBLISHING with the real hash
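
Where sha256sum is not available, the same digest can be computed with Python's hashlib. The sketch also shows the kind of comparison a downloader could run against the hash stored in _weights.py. This guide does not show how biPangolin actually verifies the download, so treat verify as a hypothetical helper.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so a large tarball never has to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Raise ValueError if the file's digest differs from expected_hex."""
    actual = sha256_of(path)
    if actual != expected_hex.lower():
        raise ValueError(f"SHA-256 mismatch: expected {expected_hex}, got {actual}")
```

sha256_of("pangolin_models_v24.tar.gz") should print the same hex string as sha256sum.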
Tag the release and push the tag:

git tag v0.5.0
git push origin v0.5.0

Then on github.com:
- Go to your repo → Releases → Draft a new release
- Tag: v0.5.0, title: "biPangolin v0.5.0"
- Upload pangolin_models_v24.tar.gz as a release asset
- Publish release
The download URL will be:
https://github.com/USERNAME/bipangolin/releases/download/v0.5.0/pangolin_models_v24.tar.gz
This must match what's in _weights.py.
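
A URL that points at the wrong tag fails only when a user first downloads the weights, so it can be worth checking before publishing. The sketch below builds the release URL from its parts and extracts the tag from an existing URL for comparison. Both function names are hypothetical, and the default repository name is taken from the example URL.

```python
from urllib.parse import urlparse


def release_asset_url(username, tag, asset="pangolin_models_v24.tar.gz", repo="bipangolin"):
    """Build the GitHub release download URL for an asset."""
    return f"https://github.com/{username}/{repo}/releases/download/{tag}/{asset}"


def tag_in_url(url):
    """Extract the release tag from a GitHub release download URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path: <user>/<repo>/releases/download/<tag>/<asset>
    if len(parts) != 6 or parts[2:4] != ["releases", "download"]:
        raise ValueError(f"not a release download URL: {url}")
    return parts[4]
```

Comparing tag_in_url(PANGOLIN_WEIGHTS_URL) with the tag you pushed catches a stale version in the URL.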
Test the editable install locally:

pip install -e .
# Should work (uses bundled probes + downloads Pangolin weights)
bipangolin selftest
# Or in Python
python -c "from bipangolin import selftest; selftest()"

Expected output:
biPangolin: 12 model+probe pairs ready on cuda
donor peak: pos= 69 (expected 69) P=0.998
acceptor peak: pos= 163 (expected 163) P=0.997
Build the source distribution and the wheel:

pip install build twine
python -m build
ls dist/ # should see bipangolin-0.5.0-py3-none-any.whl and bipangolin-0.5.0.tar.gz
# Inspect the wheel contents to confirm probes are included
unzip -l dist/bipangolin-0.5.0-py3-none-any.whl | grep probes
# Should list all 24 .pt files
# Check metadata
twine check dist/*

Then test the wheel in a clean virtual environment:

python -m venv /tmp/test_install
source /tmp/test_install/bin/activate
pip install dist/bipangolin-0.5.0-py3-none-any.whl
bipangolin selftest
deactivate

If selftest passes, you're ready to publish.
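
The unzip and grep check on the wheel can also be done with Python's zipfile module, since a wheel is a zip archive. This is a minimal sketch: the function name is hypothetical, and the prefix assumes the probes sit at bipangolin/data/probes/ inside the wheel, mirroring the source tree.

```python
import zipfile


def probes_in_wheel(wheel_path, prefix="bipangolin/data/probes/"):
    """List the .pt files a wheel ships under the probe directory."""
    with zipfile.ZipFile(wheel_path) as whl:
        return sorted(
            name for name in whl.namelist()
            if name.startswith(prefix) and name.endswith(".pt")
        )
```

A result of length 24 confirms that the build included every probe.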
First-time setup: register at pypi.org and create an API token.
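Test upload to TestPyPI first. The --extra-index-url flag lets pip fetch
dependencies that are published only on the real PyPI:

twine upload --repository testpypi dist/*
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ bipangolin

Then real PyPI:

twine upload dist/*

Done. Anyone can now pip install bipangolin.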
For later releases, repeat a shorter version of the steps above:

# 1. Bump version in pyproject.toml AND src/bipangolin/__init__.py
# 2. If you retrained probes, replace src/bipangolin/data/probes/*.pt
# 3. If you changed Pangolin weights, build new tarball + new GitHub release
# with new tag, update _weights.py URL + SHA
# 4. Build and upload
git tag v0.5.0 # use the new version number
git push origin v0.5.0
rm -rf dist/
python -m build
twine upload dist/*

Probe weights bundled, Pangolin weights downloaded. Probes are tiny (~1.3 MB, 24 files), Pangolin weights are large (~50 MB). PyPI allows wheels up to 100 MB, but shipping large model weights there is bad practice: installs get slower, users who never run inference waste bandwidth, and PyPI storage isn't designed for binary blobs. GitHub Releases is the right place.
Cache directory respects platform conventions. Linux uses XDG
($XDG_CACHE_HOME/bipangolin or ~/.cache/bipangolin), macOS uses
~/Library/Caches/bipangolin, Windows uses %LOCALAPPDATA%\bipangolin\Cache.
Override with BIPANGOLIN_CACHE env var if needed.
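
The lookup order described here can be sketched as a small function. This illustrates the stated rules and is not biPangolin's actual implementation; the function name, its test-friendly parameters, and the Windows fallback used when LOCALAPPDATA is unset are assumptions.

```python
import os
import sys
from pathlib import Path


def cache_dir(platform=sys.platform, env=None, home=None):
    """Resolve the cache directory following the rules described above.

    platform, env, and home are parameters only to make the function testable.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # An explicit override wins on every platform.
    override = env.get("BIPANGOLIN_CACHE")
    if override:
        return Path(override)
    if platform == "darwin":
        return home / "Library" / "Caches" / "bipangolin"
    if platform.startswith("win"):
        base = env.get("LOCALAPPDATA")
        # Assumed fallback when LOCALAPPDATA is unset.
        base = Path(base) if base else home / "AppData" / "Local"
        return base / "bipangolin" / "Cache"
    # Linux and other POSIX systems: XDG, then ~/.cache.
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg) if xdg else home / ".cache"
    return base / "bipangolin"
```

Passing the environment in as a dictionary keeps the function free of global state, so each platform branch can be exercised on a single machine.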
Vendoring Pangolin's model.py is fine because:
- It's GPL-3.0-licensed (verify by checking their LICENSE file).
- It's small (~200 lines), unlikely to change frequently.
- It removes a fragile git+https://... dependency from your install.
- You include the original copyright notice and keep biPangolin's package
license metadata aligned with the repository's top-level GPL-3.0 LICENSE file.