Refactor summary_lines to ChainPairScoreResults dataclass#1

Merged: y1zhou merged 4 commits into main from copilot/refactor-summary-lines-dataclass on Dec 2, 2025

Copilot AI commented Dec 2, 2025

Introduces a ChainPairScoreResults dataclass for chain-pair summary scores, matching the existing PerResScoreResults pattern. Replaces list[str] with structured data and strict formatting methods.

Changes

  • New ChainPairScoreResults dataclass with 24 fields for chain-pair metrics (Chn1/Chn2, PAE/Dist cutoffs, ipSAE variants, pDockQ/pDockQ2, LIS, n0/d0 values, etc.)
  • Formatting methods: to_formatted_line(), header_line(), csv_header_line() for consistent output
  • Renamed ScoreResults.summary_lines: list[str] → chain_pair_scores: list[ChainPairScoreResults]
  • Renamed ScoreResults.by_res_data → by_res_scores for consistency
  • Updated aggregate_byres_scores() to return list[ChainPairScoreResults]
  • Updated write_outputs() and main() to use dataclass methods
from dataclasses import dataclass

@dataclass
class ChainPairScoreResults:
    Chn1: str
    Chn2: str
    PAE: int
    Dist: int
    Type: str  # "asym" or "max"
    ipSAE: float
    # ... 18 more fields

    def to_formatted_line(self) -> str:
        """Fixed-width string matching original output format."""
        ...

    @staticmethod
    def header_line() -> str:
        ...

Output files verified identical to existing examples for AF2 and AF3 model types.
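
To illustrate the fixed-width and CSV formatting methods described above, here is a minimal sketch on a reduced stand-in class. The class name ChainPairScoreSketch, the column widths, and the use of a classmethod for csv_header_line() are assumptions made for illustration; the actual widths and the remaining 18 fields live in ipsae.py and are not reproduced here.

```python
from dataclasses import dataclass, fields


@dataclass
class ChainPairScoreSketch:
    """Hypothetical reduced stand-in for ChainPairScoreResults (6 of 24 fields)."""

    Chn1: str
    Chn2: str
    PAE: int
    Dist: int
    Type: str  # "asym" or "max"
    ipSAE: float

    def to_formatted_line(self) -> str:
        # Column widths are illustrative, not the ones used in ipsae.py.
        return (
            f"{self.Chn1:<5}{self.Chn2:<5}{self.PAE:>4d}{self.Dist:>5d}"
            f"  {self.Type:<5}{self.ipSAE:>9.6f}"
        )

    @staticmethod
    def header_line() -> str:
        # Same widths as to_formatted_line() so the columns line up.
        return f"{'Chn1':<5}{'Chn2':<5}{'PAE':>4}{'Dist':>5}  {'Type':<5}{'ipSAE':>9}"

    @classmethod
    def csv_header_line(cls) -> str:
        # Derive the CSV header from the field names to avoid drift.
        return ",".join(f.name for f in fields(cls))


row = ChainPairScoreSketch("A", "B", 10, 10, "asym", 0.123456)
header = ChainPairScoreSketch.header_line()
line = row.to_formatted_line()
csv_header = ChainPairScoreSketch.csv_header_line()
```

Keeping the header and row widths side by side in one class is what makes strict, byte-identical output practical to check against existing example files.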

Original prompt

Similar to PerResScoreResults in ipsae.py, refactor summary_lines to use its own dataclass. Be very strict about the line formatting. Modify corresponding functions that use the results to write to stdout or text files.


Copilot AI self-assigned this Dec 2, 2025
Copilot AI and others added 2 commits December 2, 2025 10:23
Co-authored-by: y1zhou <17245097+y1zhou@users.noreply.github.com>
Co-authored-by: y1zhou <17245097+y1zhou@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor summary_lines to use its own dataclass" to "Refactor summary_lines to SummaryResult dataclass" on Dec 2, 2025
Copilot AI requested a review from y1zhou December 2, 2025 10:28
Owner

y1zhou commented Dec 2, 2025

@copilot Change the class name to ChainPairScoreResults. For ScoreResults, change the corresponding field name to chain_pair_scores. Also change the by_res_data to by_res_scores to be consistent.

Co-authored-by: y1zhou <17245097+y1zhou@users.noreply.github.com>
Author

Copilot AI commented Dec 2, 2025

@copilot Change the class name to ChainPairScoreResults. For ScoreResults, change the corresponding field name to chain_pair_scores. Also change the by_res_data to by_res_scores to be consis...

Done in 1e11f5a. Renamed:

  • SummaryResult → ChainPairScoreResults
  • summary_data → chain_pair_scores
  • by_res_data → by_res_scores

Copilot AI changed the title from "Refactor summary_lines to SummaryResult dataclass" to "Refactor summary_lines to ChainPairScoreResults dataclass" on Dec 2, 2025
Owner

y1zhou commented Dec 2, 2025

@copilot What are some potential ways of refactoring the pymol_script strings as well? They are not as structured.

Author

Copilot AI commented Dec 2, 2025

@copilot What are some potential ways of refactoring the pymol_script strings as well? They are not as structured.

The pymol_script contains two types of lines:

  1. Comment lines - Already derived from ChainPairScoreResults.to_formatted_line() with a # prefix
  2. Alias commands - e.g., alias color_A_B, color gray80, all; color magenta, chain A and resi 3-119; ...

Potential refactoring approaches:

Option 1: PyMOLCommand dataclass hierarchy

@dataclass
class PyMOLAliasCommand:
    name: str  # e.g., "color_A_B"
    chain1: str
    chain2: str
    color1: str
    color2: str
    resi_ranges1: str
    resi_ranges2: str
    
    def to_line(self) -> str: ...
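
As a rough illustration of Option 1, to_line() could assemble the alias from its parts as sketched below. The layout after the first chain's color command is an assumption (the example alias above is truncated), and the second color "marine" and range "5-80" are made-up sample values.

```python
from dataclasses import dataclass


@dataclass
class PyMOLAliasCommand:
    name: str  # e.g., "color_A_B"
    chain1: str
    chain2: str
    color1: str
    color2: str
    resi_ranges1: str  # e.g., "3-119"
    resi_ranges2: str

    def to_line(self) -> str:
        # Assumed layout: grey out everything, then color each chain's residues.
        parts = [
            "color gray80, all",
            f"color {self.color1}, chain {self.chain1} and resi {self.resi_ranges1}",
            f"color {self.color2}, chain {self.chain2} and resi {self.resi_ranges2}",
        ]
        return f"alias {self.name}, " + "; ".join(parts)


# Sample values for chain B are hypothetical.
cmd = PyMOLAliasCommand("color_A_B", "A", "B", "magenta", "marine", "3-119", "5-80")
alias_line = cmd.to_line()
```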

Option 2: Extend ChainPairScoreResults
Add a to_pymol_alias() method to ChainPairScoreResults that generates the alias command (would need residue range info passed in or stored).
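
A sketch of what Option 2 might look like, assuming the residue ranges are passed in as arguments rather than stored. ChainPairScoreSketch is a hypothetical stand-in with the score fields omitted, and the default colors and the second chain's command are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ChainPairScoreSketch:
    """Hypothetical stand-in for ChainPairScoreResults; score fields omitted."""

    Chn1: str
    Chn2: str

    def to_pymol_alias(
        self,
        resi1: str,
        resi2: str,
        color1: str = "magenta",
        color2: str = "marine",
    ) -> str:
        # Residue ranges are passed in because the dataclass does not store them.
        return (
            f"alias color_{self.Chn1}_{self.Chn2}, color gray80, all; "
            f"color {color1}, chain {self.Chn1} and resi {resi1}; "
            f"color {color2}, chain {self.Chn2} and resi {resi2}"
        )


pair = ChainPairScoreSketch("A", "B")
alias = pair.to_pymol_alias("3-119", "5-80")
```

The trade-off is that the score dataclass then knows about PyMOL syntax, which couples two otherwise separate concerns.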

Option 3: Separate PyMOLScript dataclass

@dataclass
class PyMOLScript:
    header: str
    chain_pair_comments: list[str]  # from ChainPairScoreResults
    alias_commands: list[PyMOLAliasCommand]
    
    def to_lines(self) -> list[str]: ...
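
One possible shape for to_lines() in Option 3 is sketched below. The ordering (header, then commented score lines, then aliases) and the "# " prefix with a trailing space are assumptions, and AliasCommandStub is a hypothetical placeholder for PyMOLAliasCommand so the example runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class AliasCommandStub:
    """Hypothetical stand-in for PyMOLAliasCommand holding a prebuilt line."""

    text: str

    def to_line(self) -> str:
        return self.text


@dataclass
class PyMOLScript:
    header: str
    chain_pair_comments: list[str] = field(default_factory=list)
    alias_commands: list[AliasCommandStub] = field(default_factory=list)

    def to_lines(self) -> list[str]:
        # Assumed order: header, "# "-prefixed score lines, then alias commands.
        lines = [self.header]
        lines += [f"# {comment}" for comment in self.chain_pair_comments]
        lines += [cmd.to_line() for cmd in self.alias_commands]
        return lines


script = PyMOLScript(
    header="# sample header",
    chain_pair_comments=["A B 10 10 asym 0.5"],
    alias_commands=[AliasCommandStub("alias color_A_B, color gray80, all")],
)
script_lines = script.to_lines()
```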

Option 1 is cleanest for the alias commands. Option 3 provides the best structure for the full script. The challenge is that alias commands need residue range data that isn't currently in ChainPairScoreResults.

y1zhou marked this pull request as ready for review December 2, 2025 15:23
y1zhou merged commit 443b43c into main Dec 2, 2025
1 check passed
y1zhou deleted the copilot/refactor-summary-lines-dataclass branch December 2, 2025 15:24