Clarification on Phillips 2021 dataset processing

Hi, thank you for putting together FLAb. It is a great resource. While working with the Phillips et al. 2021
  binding affinity data (`phillips2021binding_*.csv`), I had a few questions about how the data was processed
  and wanted to check my understanding.

  1. **Genotype filtering**

  The four Phillips CSVs appear to contain only genotypes where the first position is '1' (i.e., the first
  mutation is the somatic allele). For example, `phillips2021binding_cr9114_h3_kd.csv` has 32,768 rows, all
  starting with '1', which is exactly 2^15 (half of the full 2^16 = 65,536 combinatorial library). The same
  pattern holds for the other three files.
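
A check along these lines should surface the pattern. It is only a sketch: it assumes each genotype is available as a fixed-length string of '0'/'1' characters (16 sites for CR9114), and the function name `summarize_first_position` and the toy 3-site library are hypothetical illustrations, not anything taken from FLAb or the paper's repository.

```python
from collections import Counter
from itertools import product


def summarize_first_position(genotypes, n_sites=16):
    """Count genotypes by first character and compare against the library size.

    Assumes each genotype is a string of '0'/'1' characters of length n_sites
    (an assumption about the encoding, not a documented FLAb format).
    """
    counts = Counter(g[0] for g in genotypes)
    unique = set(genotypes)
    return {
        "n_rows": len(genotypes),
        "first_position_counts": dict(counts),
        "full_library_size": 2 ** n_sites,
        # True when every genotype starts with '1' and the unique genotypes
        # number exactly half of the full combinatorial library.
        "is_first_site_half": (
            set(counts) == {"1"} and len(unique) == 2 ** (n_sites - 1)
        ),
    }


# Toy example: a 3-site library restricted to genotypes starting with '1'.
toy = ["1" + "".join(bits) for bits in product("01", repeat=2)]
summary = summarize_first_position(toy, n_sites=3)
print(summary)
```

On the real files, the genotype strings would first need to be read from (or reconstructed from) the CSV; how that is done depends on the column layout, which I have not assumed here.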

  Was this filtering intentional? The original data in the paper's [repository](https://github.com/klawrence26/bnab-landscapes)
  contains the full genotype space, including the germline sequence (all-0 genotype). I could not find
  this filtering documented in the README or metadata files, so I wanted to confirm.

  2. **Flu B antigen**

  Phillips et al. measured CR9114 binding against three antigens: H1, H3, and Flu B. The Flu B data does not
  appear to be included in FLAb. Was this excluded deliberately (perhaps because only 198/65,536 variants show
  measurable binding)?

  3. **Metadata labels**

  In `flab_metadata.csv`, the Phillips entries list the assay as `SPR Kd` and the units as `-log( Kd [nM]) Fab`.
  My reading of the paper is that the measurement method is Tite-Seq (flow cytometry + deep sequencing)
  rather than SPR, and that the antibody format is scFv (single-chain variable fragment on yeast display)
  rather than Fab. Could you confirm whether these labels are correct or whether they should be updated?

  Thanks again for maintaining this resource. Happy to discuss further.
