mwyczalkowski committed Nov 4, 2022
2 parents 5749042 + ebd1f38 commit 6a73a15
Showing 6 changed files with 29 additions and 5,130 deletions.
6 changes: 3 additions & 3 deletions DCC_Analysis_Summary/README.Y3.md
@@ -1,12 +1,12 @@

Last updated: 9/20/21
Last updated: 9/22/22

**BOLD** are new since 5/18/22

## Analysis Summaries - Case counts - Y3 only
Pipeline | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total
--- | --- | ----- | --- | --- | ----- | ---- | ---- | --- | --- | ---- | ---
Methylation Array | 115 | 260 | 8 | 205 | 111 | 202 | 229 | 164 | 19 | 249 | 1562
Methylation Array |**172**| 260 | 8 |**239**| 111 | 202 | 229 | 164 | 19 | 249 |**1653**
miRNA-Seq |**172**|82 | 3 |**249**| 22 | 104 | 72 | 164 | 0 | 35 | **903**
RNA-Seq Expression |**172**|261 | 8 |**243**| 111 | 202 | 229 | 186 | 19 | 252 |**1683**
RNA-Seq Fusion |**172**| 110 | 0 |**127**| 0 | 89 | 109 | 81 | 0 | 170 | **858**
@@ -23,7 +23,7 @@ WXS Germline |**160**| 259 | 0 |**230**| 111 | 199 |229|

Pipeline | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total
--- | --- | ----- | --- | --- | ----- | ---- | ---- | --- | --- | ---- | ---
Methylation Array | 115 | 260 | 8 | 205 | 111 | 202 | 229 | 164 | 19 | 249 | 1562
Methylation Array | 172 | 260 | 8 | 239 | 111 | 202 | 229 | 164 | 19 | 249 | 1653
miRNA-Seq | 172 | 261 | 8 | 249 | 111 | 202 | 229 | 164 | 19 | 250 | 1665
RNA-Seq Expression | 172 | 261 | 8 | 243 | 111 | 202 | 229 | 186 | 19 | 252 | 1683
RNA-Seq Fusion | 172 | 222 | 8 | 243 | 111 | 202 | 229 | 164 | 19 | 246 | 1616
97 changes: 26 additions & 71 deletions DCC_Analysis_Summary/README.md
@@ -19,40 +19,34 @@ DCC Analysis Summary files have the following initial columns:
10. md5sum
```

NEW: Column C3Y indicates "CPTAC3 Year" and takes values `Y1`, `Y2`, etc. It is used for administrative purposes.
Column C3Y indicates "CPTAC3 Year" and takes values `Y1`, `Y2`, etc. It is used for administrative purposes.

`DCC_path` is the path of each result at the [CPTAC DCC](https://cptac-data-portal.georgetown.edu/cptac/dataPrivate/list?scope=private).
Such results are also available relative to the following path on WUSTL RIS storage1:
```
/storage1/fs1/dinglab/Active/Projects/CPTAC3/Common/CPTAC3-DCC-Staging/DCC_STAGE_ROOT
```

Additional columns are specific to individual pipelines and will typically indicate the input data associated with this analysis.
Pipelines which generate multiple result files per case will have multiple entries in the analysis summary file.
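
Because a pipeline can contribute several rows per case, counting rows in a summary file would overstate the number of cases. The following is a minimal Python sketch of counting unique cases instead. It assumes the summary file is tab-separated with a header row and has columns named `case` and `disease`; both names are assumptions made for illustration, since only `md5sum`, `C3Y`, and `DCC_path` are named here. The `resolve_path` helper is likewise hypothetical and assumes a `DCC_path` value can simply be appended to the storage1 root.

```python
import csv
import io
import posixpath
from collections import defaultdict

# Root quoted in the README text above.
STAGE_ROOT = "/storage1/fs1/dinglab/Active/Projects/CPTAC3/Common/CPTAC3-DCC-Staging/DCC_STAGE_ROOT"


def unique_case_counts(handle, case_col="case", disease_col="disease"):
    """Count distinct cases per disease in a tab-separated summary file.

    The column names are hypothetical; adjust them to the real header.
    """
    cases = defaultdict(set)
    for row in csv.DictReader(handle, delimiter="\t"):
        cases[row[disease_col]].add(row[case_col])
    return {disease: len(ids) for disease, ids in cases.items()}


def resolve_path(dcc_path, root=STAGE_ROOT):
    """Join a DCC_path onto the storage root (assumes a simple append)."""
    return posixpath.join(root, dcc_path.lstrip("/"))


# Made-up rows: case C1 has two result files but is counted once.
example = io.StringIO(
    "case\tdisease\tDCC_path\n"
    "C1\tLUAD\tLUAD/a.txt\n"
    "C1\tLUAD\tLUAD/b.txt\n"
    "C2\tGBM\tGBM/c.txt\n"
)
counts = unique_case_counts(example)
print(counts)
```

Collecting case identifiers in a set per disease is what makes repeated rows for the same case collapse into one count.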

## Analysis Summaries

Counts of cases processed per disease and pipeline. Counts exclude results aligned to references other than GDC hg38.
Counts of unique cases processed per disease and pipeline. Last updated 9/21/22.

Pipeline (last updated 10/28/21)| AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total
Pipeline | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total
--- | --- | ----- | --- | --- | ----- | ---- | ---- | --- | --- | ---- | ---
Methylation Array | 43 | 260 | 8 | 205 | 111 | 202 | 229 | 164 | 19 | 249 | 1490
miRNA-Seq | 42 | 261 | 8 | 184 | 111 | 202 | 229 | 164 | 19 | 250 | 1480
RNA-Seq Expression | 42 | 261 | 8 | 189 | 111 | 202 | 229 | 186 | 19 | 252 | 1499
RNA-Seq Fusion | 11 | 222 | 8 | 188 | 111 | 202 | 229 | 164 | 19 | 246 | 1400
RNA-Seq Transcript + Splicing | 1 | 261 | 3 | 234 | 111 | 202 | 218 | 186 | 0 | 246 | 1462
WGS CNV Somatic | 0 | 258 | 0 | 187 | 111 | 202 | 229 | 166 | 0 | 243 | 1396
WGS SV | 0 | 258 | 0 | 187 | 111 | 201 | 219 | 166 | 0 | 243 | 1476
WXS MSI | 0 | 259 | 0 | 188 | 111 | 202 | 219 | 166 | 0 | 247 | 1392
WXS Somatic TD | 0 | 259 | 0 | 185 | 111 | 202 | 218 | 181 | 0 | 233 | 1401
WXS Somatic SW | 0 | 260 | 0 | 185 | 111 | 202 | 218 | 181 | 0 | 247 | 1404
WXS Germline | 0 | 259 | 0 | 186 | 111 | 199 | 219 | 166 | 0 | 233 | 1373


<!--
### QC Pipelines
QC Pipelines | AML | CCRCC | CM | GBM | HNSCC | LSCC | LUAD | PDA | SAR | UCEC | Total
--- | --- | ----- | --- | --- | ----- | ---- | ---- | --- | --- | ---- | ---
miRNA-Seq QC | 11 | 28 | 5 | 52 | 17 | 27 | 10 | 77 | 19 | 39 | 285
RNA-Seq QC | 11 | 112 | 8 | 119 | 111 | 113 | 53 | 83 | 19 | 77 | 706
WGS QC | 0 | 0 | 0 | 60 | 21 | 113 | 0 | 77 | 0 | 39 | 310
WXS QC | 5 | 87 | 4 | 118 | 111 | 113 | 101 | 44 | 9 | 51 | 643
-->
Methylation Array | 172 | 260 | 8 | 239 | 111 | 202 | 229 | 164 | 19 | 249 | 1653
miRNA-Seq | 172 | 261 | 8 | 249 | 111 | 202 | 229 | 164 | 19 | 250 | 1665
RNA-Seq Expression | 172 | 261 | 8 | 243 | 111 | 202 | 229 | 186 | 19 | 252 | 1683
RNA-Seq Fusion | 172 | 222 | 8 | 243 | 111 | 202 | 229 | 164 | 19 | 246 | 1616
RNA-Seq Transcript + Splicing | 172 | 261 | 3 | 244 | 111 | 202 | 218 | 186 | 0 | 246 | 1643
WGS CNV Somatic | 133 | 258 | 0 | 218 | 111 | 202 | 229 | 166 | 0 | 243 | 1560
WGS SV | 139 | 258 | 0 | 218 | 111 | 201 | 219 | 166 | 0 | 243 | 1646
WXS MSI | 160 | 259 | 0 | 228 | 111 | 202 | 219 | 166 | 0 | 247 | 1543
WXS Somatic TD | 160 | 259 | 0 | 225 | 111 | 202 | 218 | 181 | 0 | 233 | 1601
WXS Somatic SW | 160 | 260 | 0 | 225 | 111 | 202 | 218 | 181 | 0 | 247 | 1604
WXS Germline | 160 | 259 | 0 | 230 | 111 | 199 | 229 | 166 | 0 | 233 | 1587
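
Since the Total column is maintained by hand alongside the per-disease counts, a small consistency check can be useful when updating the table. The sketch below is one possible way to do it, not part of the repository: `check_row` is a hypothetical helper that parses a single pipe-delimited table row (ignoring `**bold**` markers) and compares the listed total against the sum of the per-disease cells.

```python
def check_row(line):
    """Return (pipeline, listed_total, computed_total) for one Markdown table row."""
    # Split on pipes, then drop surrounding whitespace and bold markers.
    cells = [c.strip().strip("*") for c in line.strip().strip("|").split("|")]
    name = cells[0]
    numbers = [int(c) for c in cells[1:]]
    # The last numeric cell is the Total column; the rest are per-disease counts.
    *per_disease, listed = numbers
    return name, listed, sum(per_disease)


row = "Methylation Array | 172 | 260 | 8 | 239 | 111 | 202 | 229 | 164 | 19 | 249 | 1653"
print(check_row(row))  # ('Methylation Array', 1653, 1653)
```

A mismatch between the two numbers does not by itself say which value is wrong; it only flags a row for review.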

## Year 1

@@ -71,8 +65,7 @@ pipeline details are included in documentation included with data files on DCC.

## Methylation_Array

Methylation array analysis has been performed for all cases available on DCC
through December 2019, details in
Details in
[`Methylation_Array.DCC_analysis_summary.dat`](Methylation_Array.DCC_analysis_summary.dat)

[CPTAC3 Methylation pipeline details](https://github.com/ding-lab/cptac_methylation)
@@ -86,26 +79,16 @@ Note that each sample has results for mature miRNA, precursor miRNA, and total m

[miRNA-Seq pipeline documentation](https://github.com/ding-lab/CPTAC_miRNA) and [processing description](processing_description/miRNA-Seq.processing_description.md).

### miRNA-Seq QC

Analysis details [`miRNA-Seq_QC.DCC_analysis_summary.dat`](miRNA-Seq_QC.DCC_analysis_summary.dat)

[QC pipeline documentation](https://github.com/ding-lab/CPTAC3_QC)

## RNA-Seq

### RNA-Seq Expression

965 cases (consisting of all Y1 and Y2 cases) have been analyzed.

Analysis details [`RNA-Seq_Expression.DCC_analysis_summary.dat`](RNA-Seq_Expression.DCC_analysis_summary.dat)

[CPTAC3 RNA-Seq Expression pipeline](https://github.com/ding-lab/cptac_rna_expression)

### RNA-Seq Fusion

1259 samples across 817 cases analyzed, including all of Y2 cases and LUAD from Y1.

Analysis details [`RNA-Seq_Fusion.DCC_analysis_summary.dat`](RNA-Seq_Fusion.DCC_analysis_summary.dat), and
[pipeline documentation on GitHub](https://github.com/cuidaniel/Fusion_hg38)

@@ -115,20 +98,10 @@ Analysis details [`RNA-Seq_Transcript.DCC_analysis_summary.dat`](RNA-Seq_Transcr

[Pipeline documentation on GitHub](https://github.com/ding-lab/CPTAC3_splicing)

### RNA-Seq QC

Analysis details [`RNA-Seq_QC.DCC_analysis_summary.dat`](RNA-Seq_QC.DCC_analysis_summary.dat)

[QC pipeline documentation](https://github.com/ding-lab/CPTAC3_QC)


## WGS

### WGS SV

Year 1 LUAD analyses are included.
Also in the DCC analysis summary file are 109 HNSCC cases aligned to a custom reference (`GRCh38_full_analysis_set_plus_decoy_hla`) for UMich group

Analysis details [`WGS_SV.DCC_analysis_summary.dat`](WGS_SV.DCC_analysis_summary.dat)

[CPTAC3 SomaticSV pipeline on GitHub](https://github.com/ding-lab/somatic_sv_workflow)
@@ -139,14 +112,6 @@ Analysis details [`WGS_CNV_Somatic.DCC_analysis_summary.dat`](WGS_CNV_Somatic.DC

[WGS CNV pipeline](https://github.com/ding-lab/BICSEQ2)

All Y1 analyses with pipeline version v2.0 have been added to the analysis summary.

### WGS QC

Analysis details [`WGS_QC.DCC_analysis_summary.dat`](WGS_QC.DCC_analysis_summary.dat)

[CPTAC3 QC pipeline](https://github.com/ding-lab/CPTAC3_QC)

## WXS

### WXS MSI
@@ -161,25 +126,15 @@ Analysis details [`WXS_Normal_Adjacent.DCC_analysis_summary.dat`](WXS_Normal_Adj

WXS Normal Adjacent analysis generated using [TinDaisy pipeline](https://github.com/ding-lab/TinDaisy)

### WXS QC

Analysis details [`WXS_QC.DCC_analysis_summary.dat`](WXS_QC.DCC_analysis_summary.dat)

[CPTAC3 QC pipeline](https://github.com/ding-lab/CPTAC3_QC)

### WXS Somatic
### WXS Somatic TD

WXS Somatic analysis is new for CPTAC3 Year 3. So far only 113 LSCC cases have been analyzed using
[TinDaisy variant caller v2.1](https://github.com/ding-lab/TinDaisy)
WXS Somatic analysis generated using [TinDaisy variant caller v2.1](https://github.com/ding-lab/TinDaisy)

Analysis details [`WXS_Somatic.DCC_analysis_summary.dat`](WXS_Somatic.DCC_analysis_summary.dat)
Analysis details [`WXS_Somatic_Variant_TD.DCC_analysis_summary.dat`](WXS_Somatic_Variant_TD.DCC_analysis_summary.dat)

### WXS Somatic SW

113 LSCC and 109 LUAD cases in [`WXS_Somatic_Variant_SW.DCC_analysis_summary.dat`](WXS_Somatic_Variant_SW.DCC_analysis_summary.dat).
[`WXS_Somatic_Variant_SW.DCC_analysis_summary.dat`](WXS_Somatic_Variant_SW.DCC_analysis_summary.dat).

This is an ad hoc upload of calls generated by [SomaticWrapper v1.5](https://github.com/ding-lab/somaticwrapper)
and is provided for backwards compatibility with prior SomaticWrapper calls.
`WXS_Somatic_Variant` pipeline above (based on TinDaisy) is expected to
ultimately replace these calls.
Generated by [SomaticWrapper](https://github.com/ding-lab/somaticwrapper).
