diff --git a/.github/workflows/sync_mendeley_bib.yml b/.github/workflows/sync_mendeley_bib.yml new file mode 100644 index 0000000..bfc008d --- /dev/null +++ b/.github/workflows/sync_mendeley_bib.yml @@ -0,0 +1,20 @@ +name: Sync Mendeley Bib (Manual Only) + +on: + workflow_dispatch: + +jobs: + sync-bib: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + # TODO: replace this with your real mechanism for updating the bib file. + # For now this action is just a placeholder that can be expanded later. + - name: Show current bib + run: | + echo "Current BibTeX file contents:" + cat literature/mendeley-library.bib || echo "No bib file yet." + + - name: No-op + run: echo "Update the bib file locally from Mendeley, then commit and push." diff --git a/literature/literature-map.md b/literature/literature-map.md index cd174bd..05512a5 100644 --- a/literature/literature-map.md +++ b/literature/literature-map.md @@ -1,3 +1,86 @@ -# Literature Map +# Literature Map for CT Phenotyping PhD -(Full detailed map from previous message inserted here.) +This document is a living overview of the field. Update it weekly. + +## 1. COPD Clinical Background + +- Key papers: + - [Hogg 2004] – small airway obstruction. + - [GOLD report] – definitions and staging. +- Main ideas: +- Open questions: + +## 2. CT-Derived Imaging Biomarkers (Kirby Cluster) + +- Key papers: + - [Kirby 2015, 2016, 2017, 2018, 2020, ...] +- Biomarkers: + - PRM + - Airway wall thickness + - Emphysema burden + - Small airway disease metrics +- How these are computed: +- Clinical relevance: +- Gaps / limitations: + +## 3. Radiomics and Shape/Texture (Ward Cluster) + +- Key papers: + - [Ward 2015, 2017, Paris/Ward 2020, Goddard/Ward 2021, Sorensen/Ward 2019] +- Feature families (texture, shape, intensity distributions): +- Reproducibility lessons: +- How they compare with deep features: + +## 4. 
Phenotyping and Clustering + +- COPD subtypes (Castaldi, Estépar, Kirby, Ward, others): +- Cluster definitions and what they represent biologically: +- Methods used (k-means, mixture models, hierarchical, etc.): +- How your project could extend or refine this work: + +## 5. CT Preprocessing, Segmentation, and Harmonization + +- What steps your pipeline performs (link to code in `src/preprocess/`): +- Mapping papers to steps: + - Segmentation: + - HU normalization: + - Resampling and voxel spacing: + - Scanner harmonization: +- Risks and common failure modes: + +## 6. Deep Learning for 3D CT and COPD + +- 3D CNN architectures: +- Self-supervised and contrastive learning approaches: +- COPD-specific deep learning studies: +- How these can plug into your distributed preprocessing pipeline: + +## 7. Distributed Computing and Large-Scale Pipelines + +- How Dask/Spark/MONAI/others handle large CT datasets: +- What your project currently does: +- What would be needed to scale to: + - multi-site cohorts, + - cloud environments, + - mixed datasets (COPDGene, NLST, LIDC-IDRI). + +## 8. Reproducibility, Bias, and Evaluation + +- Sources of non-reproducibility in imaging studies: +- Dataset bias and cohort differences: +- Evaluation metrics and uncertainty: +- How you will design experiments to be reproducible: + +## 9. Research Gaps and Potential Dissertation Aims + +Update this section every 1–3 months. + +- Gap 1: +- Gap 2: +- Gap 3: + +Possible aims: + +1. Aim 1: +2. Aim 2: +3. Aim 3: diff --git a/literature/mendeley-library.bib b/literature/mendeley-library.bib index 3d0e5d6..050fd27 100644 --- a/literature/mendeley-library.bib +++ b/literature/mendeley-library.bib @@ -1 +1,12 @@ -% Last-Exported: 2025-11-27 +% This file is managed by Mendeley. +% Export from Mendeley as BibTeX and overwrite this file regularly. 
+% File → Export → BibTeX (.bib) + +% Example entry to test wiring; replace by exporting from Mendeley: + +@article{example2025test, + title = {Example Paper for Testing}, + author = {Doe, Jane}, + journal = {Test Journal}, + year = {2025} +} diff --git a/literature/reading-plan/12_month_reading_plan.md b/literature/reading-plan/12_month_reading_plan.md index e555e5c..42720d3 100644 --- a/literature/reading-plan/12_month_reading_plan.md +++ b/literature/reading-plan/12_month_reading_plan.md @@ -1,3 +1,209 @@ -# 12-Month Reading Plan +# 12-Month Reading Plan for CT Phenotyping PhD Preparation -(Full detailed plan from previous message inserted here.) +This plan is organized so that understanding compounds: +clinical foundations → CT physics → airway/parenchymal biomarkers → +deep learning → harmonization → high-scale ML → dissertation prep. + +## Month 1 — Clinical Foundations of COPD & CT Imaging + +Goal: understand the disease you are phenotyping. + +**Weeks 1–2: COPD Pathophysiology** + +- Hogg et al., 2004 — small airway obstruction in COPD. +- GOLD executive summary (latest version). +- Precision medicine approaches to COPD (review). + +**Weeks 3–4: CT as a clinical measurement tool** + +- Kirby et al., 2016 — CT-derived imaging biomarkers for COPD. +- CT densitometry overview papers. +- Galbán et al., 2012 — Parametric Response Mapping (PRM). + +--- + +## Month 2 — Quantitative CT Biomarkers (Kirby Core Papers) + +Goal: master functional small airway disease imaging and parenchymal phenotyping. + +**Weeks 5–6** + +- Kirby et al., 2015 — quantitative CT of airway disease. +- Kirby et al., 2017 — calibration / harmonization for phenotyping. +- Kirby et al., 2020 — PRM diagnostic performance in COPD. + +**Weeks 7–8** + +- Labaki et al., 2018 — PRM for emphysema vs air trapping. +- San José Estépar et al., 2015 — airway geometric phenotypes. +- Kirby harmonization papers (2018–2022). 
+ +--- + +## Month 3 — Airway & Parenchymal Morphology (Ward Cluster) + +Goal: understand shape, texture, and radiomic bases of COPD quantification. + +**Weeks 9–10** + +- Ward et al., 2015 — texture-based lung phenotyping. +- Ward et al., 2017 — quantitative imaging biomarkers for COPD. +- Sorensen & Ward, 2019 — reproducibility of radiomic features. + +**Weeks 11–12** + +- Paris & Ward, 2020 — airway tree modeling and structural signatures. +- Goddard & Ward, 2021 — emphysema structural subtypes. +- van Griethuysen et al., 2017 — PyRadiomics and reproducibility. + +--- + +## Month 4 — CT Preprocessing & Harmonization + +Goal: deeply understand what your preprocessing pipeline does and why. + +**Weeks 13–14: CT Physics & HU normalization** + +- Chen et al., 2020 — standardization of lung densitometry. +- Kirby 2018 harmonization paper(s). + +**Weeks 15–16: Lung segmentation and reconstruction effects** + +- Hofmanninger et al., 2020 — robust U-Net for lung segmentation. +- Maier-Hein et al., 2018 — pitfalls of deep learning in medical imaging. + +--- + +## Month 5 — Phenotyping Methods & Cluster Analysis + +Goal: learn unsupervised and supervised COPD phenotyping methods. + +**Weeks 17–18** + +- Castaldi et al., 2014 — cluster-based COPD subtypes. +- San José Estépar et al., 2015 — airway geometry and subtypes. + +**Weeks 19–20** + +- Revisit PRM and air trapping papers (Galbán, Labaki, Kirby). +- Rahaghi et al., 2019 — automated airway phenotyping. + +--- + +## Month 6 — Deep Learning for 3D CT + +Goal: understand foundations of 3D CNNs and self-supervised learning (SSL). + +**Weeks 21–22: 3D model foundations** + +- Çiçek et al., 2016 — 3D U-Net. +- Kamnitsas et al., 2017 — DeepMedic. + +**Weeks 23–24: Self-supervised and contrastive learning** + +- Zhou et al., 2021 — Models Genesis or equivalent 3D SSL work. +- He et al., 2020 — MoCo (core contrastive framework). +- Recent surveys on SSL in medical imaging. 
+ +--- + +## Month 7 — Deep Learning for COPD Phenotyping + +Goal: see modern end-to-end deep learning approaches to lung phenotyping. + +**Weeks 25–26** + +- Kurugol et al., 2021 — 3D CNNs for COPD progression. +- Azarang et al., 2021 — deep feature learning for COPD risk. +- Related SPIROMICS imaging papers. + +**Weeks 27–28** + +- Ward et al., 2021 — deep features vs classical radiomics. +- Airway-oriented CNN pipelines for asthma/COPD. + +--- + +## Month 8 — Large-Scale Pipelines & Distributed Computing + +Goal: link reading to the Dask-based preprocessing in this repo. + +**Weeks 29–30: Dask foundations** + +- Rocklin, 2015 — Dask: parallel computation in Python. +- Khan et al., 2020 — scalable medical imaging pipelines. + +**Weeks 31–32: Cloud-native imaging** + +- Zarr format overview for large 3D arrays. +- MONAI data loading and preprocessing patterns. +- Overview of cloud-native imaging frameworks (e.g., Clara, open-source alternatives). + +--- + +## Month 9 — Scanner Variability, Reproducibility & Bias + +Goal: prepare to answer committee-level questions on rigor and generalization. + +**Weeks 33–34** + +- Maier-Hein et al., 2018 — reproducibility crisis in medical imaging. +- Ward/Sorensen radiomics robustness papers. + +**Weeks 35–36** + +- Papers on dataset bias in medical imaging. +- Harmonization and domain adaptation methods for CT. + +--- + +## Month 10 — Evaluation Frameworks & Uncertainty + +Goal: understand evaluation beyond a single metric. + +**Weeks 37–38** + +- Kendall & Gal, 2017 — aleatoric and epistemic uncertainty. +- Sokol & Flach, 2020 — explainability in medical ML. + +**Weeks 39–40** + +- Topol, 2019 — high-performance medicine. +- Imaging-based prognosis prediction / risk models. + +--- + +## Month 11 — Datasets Deep Dive (NLST, COPDGene, LIDC) + +Goal: full command of the datasets the pipeline targets. + +**Weeks 41–42** + +- Regan et al., 2010 — COPDGene study design. +- Black-Shinn et al., 2019 — NLST dataset overview. 
+- Armato et al., 2011 — LIDC-IDRI. + +**Weeks 43–44** + +- Kirby and Ward papers using COPDGene/NLST. +- Review metadata schemas and harmonization strategies used. + +--- + +## Month 12 — Proposal, Synthesis & Research Direction + +Goal: integrate everything into PhD-ready research aims. + +**Weeks 45–48** + +- Re-read your “top 20” most central papers. +- Summarize key methods in a comparative table. +- Identify gaps in literature. +- Draft 2–3 possible dissertation aims. + +**Weeks 49–52** + +- Write a preliminary research proposal (4–8 pages). +- Extend `literature-map.md` with a narrative of the field. +- Prepare draft slides for a mock committee presentation. +- Use these materials to support meetings with potential supervisors. diff --git a/literature/summaries/TEMPLATE_paper_summary.md b/literature/summaries/TEMPLATE_paper_summary.md index e7c1e2c..e5c77b8 100644 --- a/literature/summaries/TEMPLATE_paper_summary.md +++ b/literature/summaries/TEMPLATE_paper_summary.md @@ -1,3 +1,49 @@ # Paper Summary Template -(Full template inserted here.) +Replace the title and citation fields and save as: +`LastName_Year_shortTopic.md` + +## Citation + +- Title: +- Authors: +- Journal / venue: +- Year: +- DOI or URL: +- Tags: `airway`, `parenchyma`, `PRM`, `radiomics`, `3D-CNN`, `ssl`, `COPDGene`, `NLST`, `LIDC`, `Kirby`, `Ward`, etc. + +## 1. Five-Sentence Summary + +1. Problem addressed: +2. Dataset(s) used: +3. Method (inputs → transformations → outputs): +4. Key results: +5. Limitations: + +## 2. Methods Details + +- Input data: +- Preprocessing: +- Model / algorithm: +- Evaluation setup and metrics: + +## 3. Relevance to COPD Phenotyping + +- Which phenotype(s) or biomarkers are being modeled? +- How does this relate to airway, parenchymal, or functional small airway disease? +- How does this relate to Kirby’s or Ward’s work? + +## 4. Relevance to This Repository + +- Which parts of the codebase does this influence? 
+ - `ingest/` + - `preprocess/` + - `features/` + - `train/` +- TODO: concrete ideas for new features, tests, or benchmarks. + +## 5. Ideas and Open Questions + +- Ideas for follow-up experiments: +- Questions about assumptions or limitations: +- How this could become part of a PhD aim: diff --git a/literature/tags/airway.md b/literature/tags/airway.md index f69283d..28d1526 100644 --- a/literature/tags/airway.md +++ b/literature/tags/airway.md @@ -1,3 +1,14 @@ # Airway-Focused Papers -(Full content here.) \ No newline at end of file +List and briefly summarize papers heavily focused on airway metrics. + +## Core Papers + +- Kirby et al., 2015 — Quantitative CT of airway disease. +- San José Estépar et al., 2015 — Airway geometric phenotypes. +- Paris & Ward, 2020 — Airway tree modeling and structural signatures. + +## Notes + +Use this file to collect high-level patterns about airway-based biomarkers +and methods (tree modeling, airway dimensions, wall thickness, etc.). diff --git a/literature/tags/harmonization.md b/literature/tags/harmonization.md index b58f60f..e1c3c3e 100644 --- a/literature/tags/harmonization.md +++ b/literature/tags/harmonization.md @@ -1,3 +1,14 @@ -# Harmonization Papers +# Harmonization-Focused Papers -(Full content here.) \ No newline at end of file +List and briefly summarize papers heavily focused on harmonization methods. + +## Core Papers + +- Kirby et al., 2017 — Calibration / harmonization for phenotyping. +- Chen et al., 2020 — Standardization of lung densitometry. +- Additional Kirby harmonization papers (2018–2022). + +## Notes + +Use this file to collect high-level patterns about harmonization +and methods (HU normalization, densitometry standardization, scanner harmonization, etc.). diff --git a/literature/tags/radiomics.md b/literature/tags/radiomics.md index 3ce20be..0355f25 100644 --- a/literature/tags/radiomics.md +++ b/literature/tags/radiomics.md @@ -1,3 +1,15 @@ -# Radiomics Papers +# Radiomics-Focused Papers -(Full content here.) 
\ No newline at end of file +List and briefly summarize papers heavily focused on radiomic features and methods. + +## Core Papers + +- Ward et al., 2015 — Texture-based lung phenotyping. +- Ward et al., 2017 — Quantitative imaging biomarkers for COPD. +- Sorensen & Ward, 2019 — Reproducibility of radiomic features. +- van Griethuysen et al., 2017 — PyRadiomics and reproducibility. + +## Notes + +Use this file to collect high-level patterns about radiomic features +and methods (texture, shape, intensity distributions, reproducibility, etc.). diff --git a/literature/tags/ssl.md b/literature/tags/ssl.md index 26cae0b..bce59f5 100644 --- a/literature/tags/ssl.md +++ b/literature/tags/ssl.md @@ -1,3 +1,15 @@ -# Self-Supervised Learning Papers +# SSL-Focused Papers -(Full content here.) \ No newline at end of file +List and briefly summarize papers heavily focused on self-supervised learning methods. + +## Core Papers + +- Azizi et al., 2021 — Big Self-Supervised Models Advance Medical Image Classification. +- Chen et al., 2020 — A Simple Framework for Contrastive Learning of Visual Representations (SimCLR). +- Tang et al., 2022 — Self-Supervised Pre-Training of Swin Transformers for 3D Medical Image Analysis. +- Zhou et al., 2021 — Models Genesis (generic autodidactic models for 3D medical image analysis). + +## Notes + +Use this file to collect high-level patterns about self-supervised learning +and methods (contrastive learning, Models Genesis-style 3D pretraining, etc.).