feat: support backbone-only CIF output and parallel-seed dump_dir #8
Open
Xinping-Liu wants to merge 1 commit into OTeam-AI4S:main from
Two improvements to enable backbone-only inference and parallel seed runs:
1. scripts/inference.py: decouple data output dir from Hydra working dir
   - Read hydra_output_dir before converting configs to ConfigDict
   - Add exp.dump_dir override: if set, use it as dump_dir; otherwise
     fall back to hydra_output_dir (original behavior unchanged)
   - This allows parallel seed workers to each use a distinct
     hydra.run.dir while writing outputs to a shared directory
2. src/utils/inference/dumper.py: save backbone CIF when use_invfold=false
   - Move save_structure_cif() call outside the "if variant is not None"
     block so backbone CIFs are written regardless of whether inverse
     folding produced sequence variants
   - When use_invfold=false, variant=None and the CIF contains only
     backbone coordinates (no sequence annotation), which is the
     expected and useful output for downstream backbone-only pipelines
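The control-flow change in item 2 can be sketched roughly as follows. This is a minimal illustration, not the actual code in dumper.py: the `save_structure_cif` signature, the `dump_sample` helper, and the FASTA side output are all hypothetical stand-ins chosen to make the example self-contained.

```python
import tempfile
from pathlib import Path
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def save_structure_cif(path: Path, coords: Sequence[Coord],
                       sequence: Optional[str] = None) -> None:
    """Stand-in writer (hypothetical signature): emits a placeholder file."""
    label = sequence if sequence is not None else "<backbone only>"
    path.write_text(f"# placeholder CIF\n# sequence: {label}\n# atoms: {len(coords)}\n")


def dump_sample(out_dir: Path, name: str, coords: Sequence[Coord],
                variant: Optional[str]) -> Path:
    """Write outputs for one sample; variant is None when inverse folding is skipped."""
    out_dir.mkdir(parents=True, exist_ok=True)
    cif_path = out_dir / f"{name}.cif"
    if variant is not None:
        # Sequence-dependent output stays inside the guard (hypothetical FASTA file).
        (out_dir / f"{name}.fasta").write_text(f">{name}\n{variant}\n")
    # Outside the guard: the CIF is written whether or not a variant exists.
    save_structure_cif(cif_path, coords, sequence=variant)
    return cif_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = dump_sample(Path(tmp), "sample_0", [(0.0, 0.0, 0.0)], variant=None)
        print(path.read_text())
```

With `variant=None` the sketch still produces a CIF file and no sequence file. With a variant, both files are written, which matches the pre-change behavior for runs that use inverse folding.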
Summary
Two small fixes to support backbone-only inference pipelines and parallel seed execution:
- scripts/inference.py: decouple the data output directory from the Hydra working directory by introducing an optional exp.dump_dir config override. When running multiple seeds in parallel, each worker needs its own hydra.run.dir to avoid config file conflicts, while all workers should write outputs to a shared directory. Without this change, outputs from different workers overwrite each other. Original behavior (no dump_dir set) is unchanged.
- src/utils/inference/dumper.py: save backbone CIF files even when use_invfold=false. Previously, save_structure_cif() was inside the "if variant is not None" block, so no CIF was written when inverse folding was skipped. Moving it outside the block ensures backbone structures are always saved, enabling downstream pipelines (e.g. motif scaffolding evaluation) that only need backbone coordinates.
Compatibility
Both changes are fully backward-compatible. Existing usage without the new parameters is unaffected.
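The exp.dump_dir fallback behind this compatibility claim can be sketched as below. This is an assumption-laden illustration rather than the code in scripts/inference.py: `resolve_dump_dir` is a hypothetical helper name, the config is modeled as a plain mapping, and treating an empty string as "not set" is a choice made only for this sketch. How the script obtains hydra_output_dir is not shown in the PR description, so it is passed in as a plain string here.

```python
from typing import Any, Mapping, Optional


def resolve_dump_dir(exp_cfg: Mapping[str, Any], hydra_output_dir: str) -> str:
    """Return exp.dump_dir when it is set, otherwise the Hydra output directory."""
    override: Optional[str] = exp_cfg.get("dump_dir")
    # None or an empty string counts as "not set" (assumption for this sketch).
    return str(override) if override else hydra_output_dir


if __name__ == "__main__":
    # No override: falls back to the Hydra output directory, as before the change.
    print(resolve_dump_dir({}, "outputs/run_a"))
    # Override set: every worker can point at the same shared directory.
    print(resolve_dump_dir({"dump_dir": "outputs/shared"}, "outputs/run_a"))
```

Because the fallback returns the Hydra output directory whenever the override is absent, configs that never mention exp.dump_dir resolve to the same path as before.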
Test
Validated on MotifBench motif scaffolding tasks with use_invfold=false and parallel seed execution across multiple GPUs.
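A parallel-seed launch of this kind might assemble one command line per seed, as in the sketch below. Only `hydra.run.dir` (standard Hydra override syntax) and `exp.dump_dir` (introduced by this PR) come from the description above. The `build_seed_commands` helper and the `exp.seed` key are hypothetical, and the sketch assumes that output file names already differ per seed so that workers sharing a directory do not collide.

```python
from typing import Iterable, List


def build_seed_commands(seeds: Iterable[int], shared_dump_dir: str,
                        run_root: str) -> List[List[str]]:
    """Build one argv list per seed: distinct Hydra run dir, shared dump dir."""
    commands: List[List[str]] = []
    for seed in seeds:
        commands.append([
            "python", "scripts/inference.py",
            f"hydra.run.dir={run_root}/seed_{seed}",  # per-worker Hydra working dir
            f"exp.dump_dir={shared_dump_dir}",        # shared output dir from this PR
            f"exp.seed={seed}",                       # hypothetical config key
        ])
    return commands


if __name__ == "__main__":
    for argv in build_seed_commands([0, 1], "outputs/shared", "outputs/runs"):
        print(" ".join(argv))
```

The sketch only constructs the argument lists. Dispatching them to individual GPUs is left to whatever launcher the pipeline already uses.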