Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 

🎢 Getting Started

Prerequisites

Python environment

We suggest both teams to install the provided environments. We recommend using micromamba, though you may also replace mamba with conda.

micromamba create --file environment.yaml --name <env_name>

To be able to run CTGAN with Differential Privacy we utilize smartnoise-synth package which requires a specific version of opacus. Thus, we provide a specific environment to be able to reproduce this experiment.

micromamba create --file dpctgan_environment.yaml --name <env_name>

Visit the Blue Team Home Page for detailed instructions.

Visit the Red Team Home Page for detailed instructions.

📝 CITATIONS

Please make sure to cite the following papers if any of the baseline methods and evaluation metrics are mentioned/utilized in your CAMDA extended abstracts.

Competition related

  1. CAMDA 2025 Health Privacy Challenge

Dataset sources

  1. Genomic Data Commons (GDC), https://gdc.cancer.gov/, https://portal.gdc.cancer.gov/, accessed on Nov 1, 2024

Dataset preprocessing

  1. Chen, Dingfan, Marie Oestreich, Tejumade Afonja, Raouf Kerkouche, Matthias Becker, and Mario Fritz. "Towards biologically plausible and private gene expression data generation." arXiv preprint. (2024)

  2. Love, Michael I., Wolfgang Huber, and Simon Anders. "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2." Genome biology. (2014)

  3. Colaprico, Antonio, Tiago C. Silva, Catharina Olsen, Luciano Garofano, Claudia Cava, Davide Garolini, Thais S. Sabedot et al. "TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data." Nucleic acids research. (2016)

  4. Subramanian, A., Narayan, R., Corsello, S.M., Peck, D.D., Natoli, T.E., Lu, X., Gould, J., Davis, J.F., Tubelli, A.A., Asiedu, J.K. and Lahr, D.L. "A next generation connectivity map: L1000 platform and the first 1,000,000 profiles." Cell. (2017)

  5. Landmark genes, https://clue.io/command?q=/gene-space%20lm, accessed on Nov 1, 2024

Generative models and evaluations

  1. Sohn, Kihyuk, Honglak Lee, and Xinchen Yan. "Learning structured output representation using deep conditional generative models." Advances in neural information processing systems. (2015)

  2. Xu, Lei, Maria Skoularidou, Alfredo Cuesta-Infante, and Kalyan Veeramachaneni. "Modeling tabular data using conditional gan." Advances in neural information processing systems. (2019)

  3. Holsten, L., Dahm, K., Oestreich, M., Becker, M., & Ulas, T. "hCoCena: A toolbox for network-based co-expression analysis and horizontal integration of transcriptomic datasets. STAR protocols." (2024)

  4. Lun ATL, McCarthy DJ, Marioni JC. “A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor.” F1000Res. (2016)

Membership inference attack models

  1. Van Breugel, B., Sun, H., Qian, Z., & van der Schaar, M. "Membership inference attacks against synthetic data through overfitting detection." arXiv preprint. (2023)

  2. Chen, D, Yu, N., Zhang, Y., and Fritz, M. "Gan-leaks: A taxonomy of membership inference attacks against generative models." In Proceedings of the 2020 ACM SIGSAC conference on computer and communications security (2020)

  3. Hilprecht, B., Härterich, M., & Bernau, D. "Monte carlo and reconstruction membership inference attacks against generative models." Proceedings on Privacy Enhancing Technologies. (2019)

  4. Hayes, J., Melis, L., Danezis, G. & De Cristofaro, E. "Logan: Membership inference attacks against generative models." arXiv preprint. (2019)