Releases: broadinstitute/gnomad_methods
v0.5.0
Released April 22nd, 2021
Fixed
- Fix for error in `generate_trio_stats_expr` that led to an incorrect untransmitted count. (#238)
- Fix for error in `compute_quantile_bin` that caused incorrect binning when a single score overlapped multiple bins (#238)
- Fixed `create_binned_ht` because it produced a "Cannot combine expressions from different source objects error" (#238)
- Fixed handling of missing entries (not within a ref block / alt site) when computing `coverage_stats` in `sparse_mt.py` (#242)
- Fix for error in `compute_stratified_sample_qc` where `gt_expr` caused error (#259)
- Fix for error in `default_lift_data` caused by missing `results` field in `new_locus` (#270)
- Fix to dbSNP b154 resource (resources.grch38.reference_data) import to allow for multiple rsIDs per variant (#345)
- Fix to `set_female_metrics_to_na` to correctly update chrY metrics to be missing (#347)
- Fixed available versions for gnomAD v2 `coverage` and `liftover` resources (#352)
- Removed side effect of accessing gnomAD v2 `coverage` and `liftover` exome resources that would edit available versions for other resources (#352)
- Use `overwrite` argument for importing a BlockMatrixResource (#342)
Changed
- Removed assumption of `snv` annotation from `compute_quantile_bin`. (#238)
- Modified `compute_binned_truth_sample_concordance` to handle additional binning for subsets of variants. (#240)
- Updated liftover functions to be more generic (#246)
- Changed quality histograms to label histograms calculated on raw and not adj data (#247)
- Updated some VCF export constants (#249)
- Changed default DP threshold to 5 for hemi genotype calls in `annotate_adj` and `get_adj_expr` (#252)
- Updated coverage resources to version 3.0.1 (#242)
- Update to `compute_last_ref_block_end`, removing assumption that sparse MatrixTables are keyed only by `locus` by default (#279)
- Update `generic_field_check` to have option to show percentage of sites that fail checks. (#284)
- Modified `vep_or_lookup_vep` to support the use of different VEP versions (#282)
- Modified `create_truth_sample_ht` to add adj annotation information in the returned Table if present in the supplied MatrixTables (#300)
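
To illustrate the hemi DP threshold change (#252), here is a minimal plain-Python sketch of an adj-style genotype filter. It is not the library's Hail implementation: the function name `passes_adj` and the GQ, diploid DP, and allele balance defaults are assumptions chosen for illustration; only the haploid DP default of 5 comes from the entry above.

```python
from typing import Optional


def passes_adj(
    gq: int,
    dp: int,
    ploidy: int,
    is_het: bool = False,
    alt_ab: Optional[float] = None,
    min_gq: int = 20,          # assumed default, not stated in this changelog
    min_dp: int = 10,          # assumed diploid default, not stated in this changelog
    min_haploid_dp: int = 5,   # hemi (haploid) threshold described in #252
    min_ab: float = 0.2,       # assumed default, not stated in this changelog
) -> bool:
    """Return True if a genotype call passes the sketched adj criteria."""
    # Haploid (hemizygous) calls use the lower depth threshold.
    dp_threshold = min_haploid_dp if ploidy == 1 else min_dp
    if gq < min_gq or dp < dp_threshold:
        return False
    # Heterozygous calls additionally need enough alternate-allele balance.
    if is_het and alt_ab is not None and alt_ab < min_ab:
        return False
    return True
```

Under these assumptions, a call with depth 7 passes when it is haploid but fails when it is diploid.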
Added
- Added constants and functions relevant to VCF export (#241)
- Add reference genome to call of `has_liftover` in `get_liftover_genome` (#259)
- Added fix for MQ calculation in `_get_info_agg_expr`, switched `RAW_MQ` and `MQ_DP` in calculation (#262)
- Add importable method for filtering clinvar to pathogenic sites (#257)
- Added common variant QC functions `get_rf_runs` and `get_run_data` to `random_forest.py` (#278)
- Add calculation for the strand odds ratio (SOR) to `get_site_info_expr` and `get_as_info_expr` (#281)
- Added VEPed context HT to resource files and included support for versioning (#282)
- Added code to generate summary statistics (total number of variants, number of LoF variants, LOFTEE summaries) (#285)
- Added additional counts to summary statistics (added autosome/sex chromosome counts, allele counts, counts for missense and synonymous variants) (#289)
- Added function, `default_generate_gene_lof_matrix`, to generate gene matrix (#290)
- Added function `default_generate_gene_lof_summary` to summarize gene matrix results (#292)
- Add resource for v3.1.1 release (#364)
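
For readers unfamiliar with the strand odds ratio added in #281, the following plain-Python sketch computes SOR from four strand counts using the GATK-style definition (a pseudocount of 1 on every cell, then a symmetrical odds ratio scaled by the ref and alt strand ratios). Whether the library's Hail expression matches this term for term is an assumption; the function name `strand_odds_ratio` is hypothetical.

```python
import math


def strand_odds_ratio(ref_fwd: int, ref_rev: int, alt_fwd: int, alt_rev: int) -> float:
    """GATK-style strand odds ratio from forward/reverse read counts."""
    # Add a pseudocount of 1 to each cell to avoid division by zero.
    rf, rr, af, ar = (count + 1 for count in (ref_fwd, ref_rev, alt_fwd, alt_rev))
    # Odds ratio of the 2x2 strand table, made symmetrical by adding its inverse.
    ratio = (rf * ar) / (rr * af)
    symmetrical_ratio = ratio + 1 / ratio
    # Strand balance within the ref reads and within the alt reads.
    ref_ratio = min(rf, rr) / max(rf, rr)
    alt_ratio = min(af, ar) / max(af, ar)
    return math.log(symmetrical_ratio) + math.log(ref_ratio) - math.log(alt_ratio)
```

With perfectly balanced counts the value is ln(2); it grows as the alternate reads concentrate on one strand.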
Removed
- Removed `rep_on_read`; this function is no longer necessary, as MatrixTables/Tables can be repartitioned on read with `_n_partitions` added by this hail update (#283)
- Removed `compute_quantile_bin` and added `compute_ranked_bin` as an alternative that provides more even binning. This is now used by `create_binned_ht` instead. (#288)
- Removed `prefix` parameter from `make_combo_header_text`, as this was only used to check if samples were from gnomAD (#348)
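
The idea behind rank-based binning (#288) can be sketched in plain Python: each variant gets a bin from its position in the sorted order rather than from score quantile boundaries, so bins stay evenly sized even when many variants share one score. This is an illustration of the general technique only, not the `compute_ranked_bin` implementation; the function name `ranked_bins`, the ascending sort, and the 1-based bin numbering are assumptions.

```python
from typing import List, Sequence


def ranked_bins(scores: Sequence[float], n_bins: int = 100) -> List[int]:
    """Assign each score a 1-based bin derived from its rank."""
    n = len(scores)
    # Indices sorted by score; Python's sort is stable, so ties keep input order.
    order = sorted(range(n), key=lambda i: scores[i])
    bins = [0] * n
    for rank, idx in enumerate(order):
        # Evenly spread ranks 0..n-1 over bins 1..n_bins.
        bins[idx] = rank * n_bins // n + 1
    return bins


# Four tied scores are spread across bins instead of collapsing into one.
example = ranked_bins([0.1, 0.5, 0.5, 0.5, 0.5, 0.9], n_bins=3)
```

In the example each of the three bins receives exactly two variants, whereas quantile boundaries on the same scores would leave the tied value straddling several bins.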
v0.4.0
Released July 9th, 2020
Note gnomAD resources have been moved to a requester pays bucket.
Dataproc clusters must be configured to allow reading from it.
- Added `VEP_CSQ_HEADER` to generate vep description necessary for VCF export. (#230)
- Modified variant QC pipeline functions `generate_trio_stats` and `generate_sib_stats` to add filter parameter for autosomes and bi-allelic sites (#223)
- `score_bin_agg` now requires additional annotations `ac` and `ac_qc_samples_unrelated_raw` and no longer needs `tdt` (#223)
- Changed `score_bin_agg` to use `ac_qc_samples_unrelated_raw` annotation instead of `unrelated_qc_callstats` (#223)
- Added singleton de novo counts to variant QC pipeline function `score_bin_agg` (#223)
- Modified `filter_mt_to_trios` to no longer filter to autosomes as this should be handled during the variant QC pipeline (#223)
- Updated `annotate_sex` to add globals to `sex_ht` (#227)
- Document `slack_notifications` function (#228)
- Added `median_impute_features` to variant QC random forest module (#224)
- Created `training.py` in variant QC and added `sample_training_examples` (#224)
- Added variant QC pipeline function `train_rf_model` (#224)
- Use local copy of VEP config instead of reading from bucket (#231)
- Updated gnomAD resources paths for hail tables to requester pays buckets (#233)
v0.3.0
v0.2.0
Released April 3rd, 2020
Added
- Function to subset a `MatrixTable` based on a list of samples (#196)
- Function to get file size and MD5 hash (#186)
- Developer documentation (#185)
- Include `RAW_MQ` and `AS_VQSLOD` metrics in `get_annotations_hists` (#181)
- Functions to compute coverage stats from sparse MT (#173)
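
As a rough picture of what a file size and MD5 helper (#186) does, here is a standard-library sketch for a local file. The library's own function may have a different name and signature and may target cloud storage paths; `file_size_and_md5` and its `chunk_size` parameter are hypothetical.

```python
import hashlib
import os
from typing import Tuple


def file_size_and_md5(path: str, chunk_size: int = 1 << 20) -> Tuple[int, str]:
    """Return (size in bytes, hex MD5 digest) for a local file."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return os.path.getsize(path), md5.hexdigest()
```

Reading in chunks keeps memory use constant regardless of file size, which matters for multi-gigabyte VCF or coverage files.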
Changed
- Repo restructured - imports may need to be updated (#207)
- Make some arguments to `get_qc_mt` optional (#200)
- Fetch VEP configuration from new Hail requester pays buckets (#197)
- Hail must be installed separately (#194)