Merged
20 changes: 20 additions & 0 deletions .github/workflows/sync_mendeley_bib.yml
@@ -0,0 +1,20 @@
name: Sync Mendeley Bib (Manual Only)

on:
  workflow_dispatch:

jobs:
  sync-bib:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # TODO: replace this with your real mechanism for updating the bib file.
      # For now this action is just a placeholder that can be expanded later.
      - name: Show current bib
        run: |
          echo "Current BibTeX file contents:"
          cat literature/mendeley-library.bib || echo "No bib file yet."

      - name: No-op
        run: echo "Update the bib file locally from Mendeley, then commit and push."
87 changes: 85 additions & 2 deletions literature/literature-map.md
@@ -1,3 +1,86 @@
# Literature Map
# Literature Map for CT Phenotyping PhD

(Full detailed map from previous message inserted here.)
This document is a living overview of the field. Update it weekly.

## 1. COPD Clinical Background

- Key papers:
  - [Hogg 2004] – small airway obstruction.
  - [GOLD report] – definitions and staging.
- Main ideas:
- Open questions:

## 2. CT-Derived Imaging Biomarkers (Kirby Cluster)

- Key papers:
  - [Kirby 2015, 2016, 2017, 2018, 2020, ...]
- Biomarkers:
  - PRM
  - Airway wall thickness
  - Emphysema burden
  - Small airway disease metrics
- How these are computed:
- Clinical relevance:
- Gaps / limitations:
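
As a starting point for the "How these are computed" entry, here is a minimal sketch of PRM-style voxel classification. It assumes the inspiratory and expiratory scans are already registered and restricted to lung voxels, and it uses the commonly cited thresholds of -950 HU (inspiration) and -856 HU (expiration). The function names and the handling of boundary values are illustrative assumptions, not taken from any specific paper's code or from this repository.

```python
from collections import Counter

# Commonly cited PRM thresholds; treat both as configurable assumptions.
INSP_THRESHOLD_HU = -950  # inspiratory threshold (emphysema-like density below this)
EXP_THRESHOLD_HU = -856   # expiratory threshold (air trapping below this)

LABELS = ("normal", "fSAD", "emphysema", "unclassified")


def classify_voxel(insp_hu, exp_hu):
    """Label one registered voxel pair (inspiratory HU, expiratory HU)."""
    if insp_hu >= INSP_THRESHOLD_HU and exp_hu >= EXP_THRESHOLD_HU:
        return "normal"
    if insp_hu >= INSP_THRESHOLD_HU and exp_hu < EXP_THRESHOLD_HU:
        return "fSAD"  # functional small airway disease: traps air on expiration
    if insp_hu < INSP_THRESHOLD_HU and exp_hu < EXP_THRESHOLD_HU:
        return "emphysema"
    return "unclassified"  # low on inspiration but not on expiration


def prm_fractions(voxel_pairs):
    """Return the fraction of voxels in each PRM class."""
    counts = Counter(classify_voxel(insp, exp) for insp, exp in voxel_pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no voxels supplied")
    return {label: counts.get(label, 0) / total for label in LABELS}
```

A real pipeline would vectorize this over 3D arrays rather than loop over pairs; the pure-Python form is only meant to make the decision rule explicit while reading the PRM papers.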

## 3. Radiomics and Shape/Texture (Ward Cluster)

- Key papers:
  - [Ward 2015, 2017, Paris/Ward 2020, Goddard/Ward 2021, Sorensen/Ward 2019]
- Feature families (texture, shape, intensity distributions):
- Reproducibility lessons:
- How they compare with deep features:

## 4. Phenotyping and Clustering

- COPD subtypes (Castaldi, Estepar, Kirby, Ward, others):
- Cluster definitions and what they represent biologically:
- Methods used (k-means, mixture models, hierarchical, etc.):
- How your project could extend or refine this work:

## 5. CT Preprocessing, Segmentation, and Harmonization

- What steps your pipeline performs (link to code in `src/preprocess/`):
- Mapping papers to steps:
  - Segmentation:
  - HU normalization:
  - Resampling and voxel spacing:
  - Scanner harmonization:
- Risks and common failure modes:
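
To make two of the steps above concrete while mapping papers to them, the sketch below shows one plausible form of HU normalization (clip to a window, rescale to [0, 1]) and the arithmetic for the output shape when resampling to a target voxel spacing. The window of -1024 to 0 HU, the function names, and the rounding rule are assumptions for illustration; they are not taken from `src/preprocess/`.

```python
def normalize_hu(values, hu_min=-1024.0, hu_max=0.0):
    """Clip HU values to [hu_min, hu_max] and rescale linearly to [0, 1]."""
    if hu_max <= hu_min:
        raise ValueError("hu_max must be greater than hu_min")
    span = hu_max - hu_min
    return [(min(max(v, hu_min), hu_max) - hu_min) / span for v in values]


def resampled_shape(shape, spacing_mm, target_spacing_mm):
    """Voxel counts per axis after resampling to the target spacing.

    The physical extent (count * spacing) is kept approximately constant,
    so each axis is scaled by old_spacing / new_spacing.
    """
    return tuple(
        max(1, round(n * s / t))
        for n, s, t in zip(shape, spacing_mm, target_spacing_mm)
    )
```

For example, a volume of 120 slices at 2.5 mm with 512 x 512 in-plane voxels at 0.7 mm would map to roughly 300 x 358 x 358 voxels at 1 mm isotropic spacing. The choice of window and target spacing affects densitometry metrics directly, which is why these belong alongside the harmonization papers in this section.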

## 6. Deep Learning for 3D CT and COPD

- 3D CNN architectures:
- Self-supervised and contrastive learning approaches:
- COPD-specific deep learning studies:
- How these can plug into your distributed preprocessing pipeline:

## 7. Distributed Computing and Large-Scale Pipelines

- How Dask/Spark/Monai/others handle large CT datasets:
- What your project currently does:
- What would be needed to scale to:
  - multi-site cohorts,
  - cloud environments,
  - mixed datasets (COPDGene, NLST, LIDC-IDRI).

## 8. Reproducibility, Bias, and Evaluation

- Sources of non-reproducibility in imaging studies:
- Dataset bias and cohort differences:
- Evaluation metrics and uncertainty:
- How you will design experiments to be reproducible.

## 9. Research Gaps and Potential Dissertation Aims

Update this section every 1–3 months.

- Gap 1:
- Gap 2:
- Gap 3:

Possible aims:

1. Aim 1:
2. Aim 2:
3. Aim 3:
13 changes: 12 additions & 1 deletion literature/mendeley-library.bib
@@ -1 +1,12 @@
% Last-Exported: 2025-11-27
% This file is managed by Mendeley.
% Export from Mendeley as BibTeX and overwrite this file regularly.
% File → Export → BibTeX (.bib)

% Example entry to test wiring; replace by exporting from Mendeley:

@article{example2025test,
  title = {Example Paper for Testing},
  author = {Doe, Jane},
  journal = {Test Journal},
  year = {2025}
}
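
Because the export is manual, a small freshness check could help catch a stale file. The sketch below assumes the file carries a `% Last-Exported: YYYY-MM-DD` comment like the one shown in this diff (it returns `None` if the comment is absent) and lists entry keys with a simple regular expression. The function names and the 30-day limit are hypothetical, and the regex is not a full BibTeX parser.

```python
import re
from datetime import date

# Matches a header comment such as "% Last-Exported: 2025-11-27".
HEADER_RE = re.compile(
    r"^%\s*Last-Exported:\s*(\d{4}-\d{2}-\d{2})\s*$", re.MULTILINE
)
# Matches the start of an entry such as "@article{example2025test,".
ENTRY_RE = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def last_exported(bib_text):
    """Return the export date from the header comment, or None if missing."""
    match = HEADER_RE.search(bib_text)
    return date.fromisoformat(match.group(1)) if match else None


def entry_keys(bib_text):
    """Return the citation keys of all entries, in file order."""
    return [key for _entry_type, key in ENTRY_RE.findall(bib_text)]


def is_stale(bib_text, today, max_age_days=30):
    """True if the header is missing or older than max_age_days."""
    exported = last_exported(bib_text)
    return exported is None or (today - exported).days > max_age_days
```

A workflow step could read `literature/mendeley-library.bib`, call `is_stale(text, date.today())`, and print a reminder to re-export from Mendeley.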
210 changes: 208 additions & 2 deletions literature/reading-plan/12_month_reading_plan.md
@@ -1,3 +1,209 @@
# 12-Month Reading Plan
# 12-Month Reading Plan for CT Phenotyping PhD Preparation

(Full detailed plan from previous message inserted here.)
This plan is organized so that understanding compounds:
clinical foundations → CT physics → airway/parenchymal biomarkers →
deep learning → harmonization → high-scale ML → dissertation prep.

## Month 1 — Clinical Foundations of COPD & CT Imaging

Goal: understand the disease you are phenotyping.

**Weeks 1–2: COPD Pathophysiology**

- Hogg et al., 2004 — small airway obstruction in COPD.
- GOLD executive summary (latest version).
- Precision medicine approaches to COPD (review).

**Weeks 3–4: CT as a clinical measurement tool**

- Kirby et al., 2016 — CT-derived imaging biomarkers for COPD.
- CT densitometry overview papers.
- Galbán et al., 2012 — Parametric Response Mapping (PRM).

---

## Month 2 — Quantitative CT Biomarkers (Kirby Core Papers)

Goal: master functional small airway disease imaging and parenchymal phenotyping.

**Weeks 5–6**

- Kirby et al., 2015 — quantitative CT of airway disease.
- Kirby et al., 2017 — calibration / harmonization for phenotyping.
- Kirby et al., 2020 — PRM diagnostic performance in COPD.

**Weeks 7–8**

- Labaki et al., 2018 — PRM for emphysema vs air trapping.
- San José Estépar et al., 2015 — airway geometric phenotypes.
- Kirby harmonization papers (2018–2022).

---

## Month 3 — Airway & Parenchymal Morphology (Ward Cluster)

Goal: understand shape, texture, and radiomic bases of COPD quantification.

**Weeks 9–10**

- Ward et al., 2015 — texture-based lung phenotyping.
- Ward et al., 2017 — quantitative imaging biomarkers for COPD.
- Sorensen & Ward, 2019 — reproducibility of radiomic features.

**Weeks 11–12**

- Paris & Ward, 2020 — airway tree modeling and structural signatures.
- Goddard & Ward, 2021 — emphysema structural subtypes.
- van Griethuysen et al., 2017 — PyRadiomics and reproducibility.

---

## Month 4 — CT Preprocessing & Harmonization

Goal: deeply understand what your preprocessing pipeline does and why.

**Weeks 13–14: CT Physics & HU normalization**

- Chen et al., 2020 — standardization of lung densitometry.
- Kirby 2018 harmonization paper(s).

**Weeks 15–16: Lung segmentation and reconstruction effects**

- Hofmanninger et al., 2020 — robust U-Net for lung segmentation.
- Maier-Hein et al., 2018 — pitfalls of deep learning in medical imaging.

---

## Month 5 — Phenotyping Methods & Cluster Analysis

Goal: learn unsupervised and supervised COPD phenotyping methods.

**Weeks 17–18**

- Castaldi et al., 2014 — cluster-based COPD subtypes.
- San José Estépar et al., 2015 — airway geometry and subtypes.

**Weeks 19–20**

- Revisit PRM and air trapping papers (Galbán, Labaki, Kirby).
- Rahaghi et al., 2019 — automated airway phenotyping.

---

## Month 6 — Deep Learning for 3D CT

Goal: understand foundations of 3D CNNs and self-supervised learning (SSL).

**Weeks 21–22: 3D model foundations**

- Çiçek et al., 2016 — 3D U-Net.
- Kamnitsas et al., 2017 — DeepMedic.

**Weeks 23–24: Self-supervised and contrastive learning**

- Zhou et al., 2019 — Models Genesis, or equivalent 3D SSL work such as Tang et al., 2022.
- He et al., 2020 — MoCo (core contrastive framework).
- Recent surveys on SSL in medical imaging.

---

## Month 7 — Deep Learning for COPD Phenotyping

Goal: see modern end-to-end deep learning approaches to lung phenotyping.

**Weeks 25–26**

- Kurugol et al., 2021 — 3D CNNs for COPD progression.
- Azarang et al., 2021 — deep feature learning for COPD risk.
- Related SPIROMICS imaging papers.

**Weeks 27–28**

- Ward et al., 2021 — deep features vs classical radiomics.
- Airway-oriented CNN pipelines for asthma/COPD.

---

## Month 8 — Large-Scale Pipelines & Distributed Computing

Goal: link reading to the Dask-based preprocessing in this repo.

**Weeks 29–30: Dask foundations**

- Rocklin, 2015 — Dask: parallel computation in Python.
- Khan et al., 2020 — scalable medical imaging pipelines.

**Weeks 31–32: Cloud-native imaging**

- Zarr format overview for large 3D arrays.
- MONAI data loading and preprocessing patterns.
- Overview of cloud-native imaging frameworks (e.g., Clara, open-source alternatives).

---

## Month 9 — Scanner Variability, Reproducibility & Bias

Goal: prepare to answer committee-level questions on rigor and generalization.

**Weeks 33–34**

- Maier-Hein et al., 2018 — reproducibility crisis in medical imaging.
- Ward/Sorensen radiomics robustness papers.

**Weeks 35–36**

- Papers on dataset bias in medical imaging.
- Harmonization and domain adaptation methods for CT.

---

## Month 10 — Evaluation Frameworks & Uncertainty

Goal: understand evaluation beyond a single metric.

**Weeks 37–38**

- Kendall & Gal, 2017 — aleatoric and epistemic uncertainty.
- Sokol & Flach, 2020 — explainability in medical ML.

**Weeks 39–40**

- Topol, 2019 — high-performance medicine.
- Imaging-based prognosis prediction / risk models.

---

## Month 11 — Datasets Deep Dive (NLST, COPDGene, LIDC)

Goal: full command of the datasets the pipeline targets.

**Weeks 41–42**

- Regan et al., 2010 — COPDGene study design.
- Black-Shinn et al., 2019 — NLST dataset overview.
- Armato et al., 2011 — LIDC-IDRI.

**Weeks 43–44**

- Kirby and Ward papers using COPDGene/NLST.
- Review metadata schemas and harmonization strategies used.

---

## Month 12 — Proposal, Synthesis & Research Direction

Goal: integrate everything into PhD-ready research aims.

**Weeks 45–48**

- Re-read your “top 20” most central papers.
- Summarize key methods in a comparative table.
- Identify gaps in literature.
- Draft 2–3 possible dissertation aims.

**Weeks 49–52**

- Write a preliminary research proposal (4–8 pages).
- Extend `literature-map.md` with a narrative of the field.
- Prepare draft slides for a mock committee presentation.
- Use these materials to support meetings with potential supervisors.
48 changes: 47 additions & 1 deletion literature/summaries/TEMPLATE_paper_summary.md
@@ -1,3 +1,49 @@
# Paper Summary Template

(Full template inserted here.)
Replace the title and citation fields and save as:
`LastName_Year_shortTopic.md`
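
To keep summary filenames consistent, a helper along these lines could generate them from the citation fields. This is a sketch under assumptions: the function name is hypothetical, accented characters are reduced to ASCII, and the topic is rendered in lower camel case to match the `shortTopic` part of the pattern.

```python
import re
import unicodedata


def _ascii_words(text):
    """Split text into ASCII alphanumeric words, dropping accents."""
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.findall(r"[A-Za-z0-9]+", ascii_text)


def summary_filename(last_name, year, topic):
    """Build a filename of the form LastName_Year_shortTopic.md."""
    name = "".join(_ascii_words(last_name))
    words = _ascii_words(topic)
    if not name or not words:
        raise ValueError("last_name and topic must contain letters or digits")
    # Lower camel case: first word lowercased, later words capitalized.
    short_topic = words[0].lower() + "".join(
        w[:1].upper() + w[1:] for w in words[1:]
    )
    return f"{name}_{int(year)}_{short_topic}.md"
```

For instance, `summary_filename("Galbán", 2012, "parametric response mapping")` yields `Galban_2012_parametricResponseMapping.md`.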

## Citation

- Title:
- Authors:
- Journal / venue:
- Year:
- DOI or URL:
- Tags: `airway`, `parenchyma`, `PRM`, `radiomics`, `3D-CNN`, `ssl`, `COPDGene`, `NLST`, `LIDC`, `Kirby`, `Ward`, etc.

## 1. Five-Sentence Summary

1. Problem addressed:
2. Dataset(s) used:
3. Method (inputs → transformations → outputs):
4. Key results:
5. Limitations:

## 2. Methods Details

- Input data:
- Preprocessing:
- Model / algorithm:
- Evaluation setup and metrics:

## 3. Relevance to COPD Phenotyping

- Which phenotype(s) or biomarkers are being modeled?
- How does this relate to airway, parenchymal, or functional small airway disease?
- How does this relate to Kirby’s or Ward’s work?

## 4. Relevance to This Repository

- Which parts of the codebase does this influence?
  - `ingest/`
  - `preprocess/`
  - `features/`
  - `train/`
- TODO: concrete ideas for new features, tests, or benchmarks.

## 5. Ideas and Open Questions

- Ideas for follow-up experiments:
- Questions about assumptions or limitations:
- How this could become part of a PhD aim: