Add function to create total CNV column #1157

@sjspielman

If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.

Part of #1143

Describe the goals of the changes to the analysis module.

The following function is currently duplicated across multiple exploratory notebooks in the `infercnv-consensus-cell-type` module:

```{r}
prepare_cnv_df <- function(infercnv_tsv) {
  infercnv_df <- readr::read_tsv(infercnv_tsv)

  # keep only the per-cell metadata columns and derive the cell group
  # from the first "_"-delimited piece of the subcluster name
  infercnv_metadata_df <- infercnv_df |>
    dplyr::select(
      -dplyr::starts_with("has_"),
      -dplyr::starts_with("proportion_"),
      -dplyr::starts_with("top_")
    ) |>
    dplyr::mutate(cell_group = stringr::str_split_i(subcluster, "_", 1))

  infercnv_df |>
    tidyr::pivot_longer(
      dplyr::starts_with("has_cnv_"),
      names_to = "chr",
      values_to = "cnv"
    ) |>
    # sum the CNV values across all chromosomes for each cell
    dplyr::group_by(cell_id) |>
    dplyr::summarize(total_cnv_per_cell = sum(cnv)) |>
    dplyr::ungroup() |>
    # bring back metadata
    dplyr::inner_join(
      infercnv_metadata_df,
      by = "cell_id"
    )
}
```
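
To make the expected output concrete, here is a small sketch that runs the same steps on a toy table instead of a real inferCNV TSV. The data values and the specific chromosome column names (`has_cnv_chr1`, `has_cnv_chr2`, `proportion_cnv_chr1`) are invented for illustration; only `cell_id`, `subcluster`, and the column prefixes come from the function above.

```r
# Toy stand-in for the inferCNV results table; all values are made up
toy_df <- tibble::tibble(
  cell_id = c("cell1", "cell2", "cell3"),
  subcluster = c("tumor_s1", "tumor_s2", "normal_s1"),
  has_cnv_chr1 = c(1, 0, 0),
  has_cnv_chr2 = c(1, 1, 0),
  proportion_cnv_chr1 = c(0.4, 0, 0)
)

# Metadata only: drop the per-chromosome columns, add the cell group
metadata_df <- toy_df |>
  dplyr::select(
    -dplyr::starts_with("has_"),
    -dplyr::starts_with("proportion_"),
    -dplyr::starts_with("top_")
  ) |>
  dplyr::mutate(cell_group = stringr::str_split_i(subcluster, "_", 1))

# One row per cell with the total CNV count, rejoined with the metadata
total_cnv_df <- toy_df |>
  tidyr::pivot_longer(
    dplyr::starts_with("has_cnv_"),
    names_to = "chr",
    values_to = "cnv"
  ) |>
  dplyr::group_by(cell_id) |>
  dplyr::summarize(total_cnv_per_cell = sum(cnv)) |>
  dplyr::inner_join(metadata_df, by = "cell_id")

print(total_cnv_df)
```

With this toy input, the result has one row per cell and the columns `cell_id`, `total_cnv_per_cell`, `subcluster`, and `cell_group`; `cell1` has a total of 2, `cell2` a total of 1, and `cell3` a total of 0.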

When we return to working on this module for neuroblastoma samples, we should move this function into its own utils file to keep things modular since we anticipate using it more in the future.

What will your pull request contain?

New utils file with functions that are imported in relevant notebooks

Will you require additional software beyond what is already in the analysis module?

No

Will you require different computational resources beyond what the analysis module already uses?

No

If known, when do you expect to file the pull request?

No response