fix: bug in num_filters_passed counting #203

Open
chAwater wants to merge 2 commits into HannesStark:main from chAwater:fix-filter-count
Conversation

@chAwater
Contributor

fix bug in num_filters_passed

-            self.df["num_filters_passed"] += self.df[filter_cols].all(axis=1) # cumulative all-pass check
+            self.df["num_filters_passed"] += self.df[filter_col].astype(int)  # only cumulative this filter
             self.df["pass_filters"] = self.df[filter_cols].all(axis=1) 
  • Verify the bug from the data (run in the final_ranked_designs directory):
import pandas as pd
df = pd.read_csv("./all_designs_metrics.csv")
filter_cols = [c for c in df.columns if c.startswith("pass_") and c.endswith("_filter")]
expected = df[filter_cols].sum(axis=1).astype(int)

# stored counts (affected by the bug)
print(df["num_filters_passed"].astype(int).value_counts().sort_index())
# expected counts, recomputed from the per-filter columns
print(expected.value_counts().sort_index())

mismatch = (df["num_filters_passed"].astype(int) != expected).sum()
print(f"Mismatched rows: {mismatch} / {len(df)}")
  • Luckily, the budget is typically small enough that final_designs_metrics_{n}.csv contains only designs that pass all filters. For those designs the old and new counts agree, so the final selection is usually unaffected (in my experience).
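
To illustrate why the old line undercounts, here is a minimal sketch on toy data. It assumes the surrounding code loops over the filters and appends each `filter_col` to `filter_cols` before updating the count; that loop, the `count_filters` helper, and the column names are hypothetical and not copied from the repository.

```python
import pandas as pd

# Hypothetical toy data: three per-filter pass/fail columns for four designs.
toy = pd.DataFrame({
    "pass_a_filter": [True, False, True, False],
    "pass_b_filter": [True, True, False, False],
    "pass_c_filter": [True, True, True, False],
})


def count_filters(df, all_filter_cols, per_filter_only):
    """Accumulate num_filters_passed the old way or the fixed way.

    Assumed loop shape: filters are applied one at a time and filter_cols
    grows as each filter is processed.
    """
    out = df.copy()
    out["num_filters_passed"] = 0
    filter_cols = []
    for filter_col in all_filter_cols:
        filter_cols.append(filter_col)
        if per_filter_only:
            # fixed: add 1 only if this particular filter passed
            out["num_filters_passed"] += out[filter_col].astype(int)
        else:
            # old: add 1 only if every filter seen so far passed
            out["num_filters_passed"] += out[filter_cols].all(axis=1)
    return out["num_filters_passed"]


cols = list(toy.columns)
old = count_filters(toy, cols, per_filter_only=False)
new = count_filters(toy, cols, per_filter_only=True)
expected = toy[cols].sum(axis=1)

print(pd.DataFrame({"old": old, "new": new, "expected": expected}))
```

Under this assumption the old count equals the number of leading filters passed before the first failure, so a design that fails only the first filter gets 0 instead of 2. Designs that pass every filter get the same count either way.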

remove duplicate ligand_iptm
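
As background for this commit: if `eval_keys_confidence` is written as a dict literal (an assumption based on the linked issue), Python accepts a repeated key without any warning and keeps only the last value. The sketch below uses a made-up stand-in dictionary; the `protein_iptm` key, the values, and the `duplicate_dict_keys` helper are hypothetical and only serve to show how such duplicates could be detected.

```python
import ast

# Hypothetical stand-in for the real dictionary; keys and values are illustrative only.
source = '''
eval_keys_confidence = {
    "ligand_iptm": 1,
    "protein_iptm": 2,
    "ligand_iptm": 3,
}
'''


def duplicate_dict_keys(code):
    """Return constant keys that appear more than once in any dict literal."""
    dupes = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Dict):
            seen = set()
            for key in node.keys:
                # Skip non-constant keys and ** unpacking (key is None).
                if isinstance(key, ast.Constant):
                    if key.value in seen:
                        dupes.append(key.value)
                    seen.add(key.value)
    return dupes


namespace = {}
exec(source, namespace)  # the literal evaluates without error
evaluated = namespace["eval_keys_confidence"]

print(duplicate_dict_keys(source))
print(evaluated)
```

The evaluated dictionary ends up with two entries, and the later value for the repeated key silently replaces the earlier one.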

Development

Successfully merging this pull request may close these issues.

The dictionary eval_keys_confidence in src/boltzgen/data/const.py contains two entries for ligand_iptm.
suspected bug in "num_filters_passed"