Error: _Map_base::at #31

@nick-youngblut

During the Reading Gene Annotations step in the sc_long_pipeline() workflow, I'm getting a _Map_base::at error.

The input is the decompressed FASTQ output from BLAZE, which I ran stand-alone. The reference is GCF_000001405.40_GRCh38.p14 (files GCF_000001405.40_GRCh38.p14_genomic.fna.gz and GCF_000001405.40_GRCh38.p14_genomic.gtf.gz).

My config:

config_file = FLAMES::create_config(
  config_dir,
  type = "sc_3end",
  do_barcode_demultiplex = FALSE,
  threads = 12
)

My sc_long_pipeline() job:

sce = FLAMES::sc_long_pipeline(
  fastq = fastq_input,
  genome_fa = ref_fna_input,
  annotation = ref_annot_input,
  outdir = work_dir,
  config = config_file,
  minimap2 = minimap2_path,
  k8 = k8_path,
  expect_cell_number = 8000
)

My console output:

Skipping Demultiplexing step...
Please make sure the ` /home/rstudio/workspace//data/SspArc0008_10x_cDNA_longRead//blaze_output/matched_reads.fastq `` is the the demultiplexing output from previous FLAEMS call.
Running FLAMES pipeline...
#### Input parameters:
{
  "pipeline_parameters": {
    "seed": [2022],
    "threads": [12],
    "do_barcode_demultiplex": [false],
    "do_gene_quantification": [true],
    "do_genome_alignment": [true],
    "do_isoform_identification": [true],
    "bambu_isoform_identification": [false],
    "multithread_isoform_identification": [true],
    "do_read_realignment": [true],
    "do_transcript_quantification": [true]
  },
  "barcode_parameters": {
    "max_bc_editdistance": [2],
    "max_flank_editdistance": [8],
    "pattern": {
      "primer": ["CTACACGACGCTCTTCCGATCT"],
      "BC": ["NNNNNNNNNNNNNNNN"],
      "UMI": ["NNNNNNNNNNNN"],
      "polyT": ["TTTTTTTTT"]
    },
    "TSO_seq": ["CCCATGTACTCTGCGTTGATACCACTGCTT"],
    "TSO_prime": [3],
    "full_length_only": [false]
  },
  "isoform_parameters": {
    "generate_raw_isoform": [false],
    "max_dist": [10],
    "max_ts_dist": [100],
    "max_splice_match_dist": [10],
    "min_fl_exon_len": [40],
    "max_site_per_splice": [3],
    "min_sup_cnt": [5],
    "min_cnt_pct": [0.001],
    "min_sup_pct": [0.2],
    "bambu_trust_reference": [true],
    "strand_specific": [0],
    "remove_incomp_reads": [4],
    "downsample_ratio": [1]
  },
  "alignment_parameters": {
    "use_junctions": [true],
    "no_flank": [false]
  },
  "realign_parameters": {
    "use_annotation": [true]
  },
  "transcript_counting": {
    "min_tr_coverage": [0.4],
    "min_read_coverage": [0.4]
  }
} 
gene annotation: /home/rstudio/workspace//data/references/human/FLAMES/GCF_000001405.40_GRCh38.p14_genomic.gtf.gz 
genome fasta: /home/rstudio/workspace//data/references/human/FLAMES/GCF_000001405.40_GRCh38.p14_genomic.fna.gz 
input fastq: /home/rstudio/workspace//data/SspArc0008_10x_cDNA_longRead//blaze_output/matched_reads.fastq 
output directory: /home/rstudio/workspace//data/SspArc0008_10x_cDNA_longRead//flames 
minimap2 path: /home/rstudio/miniconda3/bin/minimap2 
k8 path: /home/rstudio/miniconda3/bin/k8 
#### Aligning reads to genome using minimap2
02:49:30 PM Fri Apr 12 2024 minimap2_align
[M::mm_idx_gen::52.772*2.11] collected minimizers
[M::mm_idx_gen::59.234*3.06] sorted minimizers
[M::main::59.234*3.06] loaded/built the index for 705 target sequence(s)
[M::mm_mapopt_update::60.414*3.02] mid_occ = 2177
[M::mm_idx_stat] kmer size: 14; skip: 5; is_hpc: 0; #seq: 705
[M::mm_idx_stat::61.052*3.00] distinct minimizers: 65648619 (22.00% are singletons); average occurrences: 16.258; average spacing: 3.090; total length: 3298430636
[M::worker_pipeline::305.468*9.91] mapped 482699 sequences
[M::worker_pipeline::539.678*10.81] mapped 485246 sequences
[M::worker_pipeline::772.824*11.17] mapped 494113 sequences
[M::worker_pipeline::1008.325*11.36] mapped 492944 sequences
[M::worker_pipeline::1250.776*11.48] mapped 487522 sequences
[M::worker_pipeline::1515.925*11.57] mapped 468694 sequences
[M::worker_pipeline::1778.447*11.64] mapped 468982 sequences
[M::worker_pipeline::2041.090*11.68] mapped 468921 sequences
[M::worker_pipeline::2305.636*11.72] mapped 468068 sequences
[M::worker_pipeline::2566.991*11.75] mapped 469711 sequences
[M::worker_pipeline::2814.316*11.77] mapped 476070 sequences
[M::worker_pipeline::3044.860*11.79] mapped 482292 sequences
[M::worker_pipeline::3277.376*11.80] mapped 482075 sequences
[M::worker_pipeline::3507.651*11.81] mapped 482633 sequences
[M::worker_pipeline::3738.925*11.82] mapped 482293 sequences
[M::worker_pipeline::3910.235*11.82] mapped 377902 sequences
[M::main] Version: 2.28-r1209
[M::main] CMD: /home/rstudio/miniconda3/bin/minimap2 -ax splice -t 12 -k14 --secondary=no --seed 2022 --junc-bed /home/rstudio/workspace//data/SspArc0008_10x_cDNA_longRead//flames/tmp_splice_anno.bed12 --junc-bonus 1 /home/rstudio/workspace//data/references/human/FLAMES/GCF_000001405.40_GRCh38.p14_genomic.fna.gz /home/rstudio/workspace//data/SspArc0008_10x_cDNA_longRead//blaze_output/matched_reads.fastq
[M::main] Real time: 3910.908 sec; CPU: 46208.963 sec; Peak RSS: 21.147 GB
[bam_sort_core] merging from 17 files and 1 in-memory blocks...
04:02:56 PM Fri Apr 12 2024 Start gene quantification and UMI deduplication
04:02:56 PM Fri Apr 12 2024 quantify genes 
Found genome alignment file(s): 	align2genome.bam
Connected to your session in progress, last started 2024-Apr-12 14:48:40 UTC (1 hour ago)
Assigning reads to genes...
Processed: 100%|██████████| 40375/40375 [02:41<00:00, 249.45gene_group/s]
Finalising the gene count matrix ...
Plotting the saturation curve ...
Generating deduplicated fastq file ...
Processed: 7572500.0Read [01:27, 87015.60Read/s] 
04:14:35 PM Fri Apr 12 2024 Gene quantification and UMI deduplication done!
04:14:35 PM Fri Apr 12 2024 Start isoform identificaiton
04:14:35 PM Fri Apr 12 2024 find_isoform
#### Reading Gene Annotations
Error: _Map_base::at

I'm using the patched version of FLAMES from #26.
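
For context, `_Map_base::at` is the message libstdc++ attaches to the `std::out_of_range` exception thrown by `std::unordered_map::at()` when the requested key is absent, so the annotation reader most likely looked up a key it had never stored. The log does not say which key. The base R sketch below is one possible way to narrow it down, on the unconfirmed assumption that the trigger is in the GTF itself: it lists the sequence names in the GTF and the FASTA so they can be compared, and counts records without a non-empty `gene_id` or `transcript_id`. The helper names `summarise_gtf` and `fasta_seqnames` are hypothetical and not part of FLAMES, and the toy files only stand in for the real reference.

```r
# Hypothetical helper (not a FLAMES function): summarise a GTF file.
summarise_gtf <- function(gtf_path) {
  con <- gzfile(gtf_path, "r")  # gzfile() also reads uncompressed text
  on.exit(close(con))
  lines <- readLines(con)
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  fields <- strsplit(lines, "\t", fixed = TRUE)
  ok <- lengths(fields) == 9L   # a GTF record has nine tab-separated columns
  fields <- fields[ok]
  seqname <- vapply(fields, `[`, character(1), 1L)
  feature <- vapply(fields, `[`, character(1), 3L)
  attrs <- vapply(fields, `[`, character(1), 9L)
  # Assumption: only transcript and exon records need a transcript_id.
  tx_level <- feature %in% c("transcript", "exon")
  list(
    seqnames = unique(seqname),
    malformed_lines = sum(!ok),
    missing_gene_id = sum(!grepl('gene_id "[^"]+"', attrs)),
    missing_transcript_id =
      sum(tx_level & !grepl('transcript_id "[^"]+"', attrs))
  )
}

# Hypothetical helper: collect FASTA sequence names, reading in chunks so a
# whole genome is never held in memory at once.
fasta_seqnames <- function(fasta_path, chunk = 100000L) {
  con <- gzfile(fasta_path, "r")
  on.exit(close(con))
  found <- character(0)
  repeat {
    lines <- readLines(con, n = chunk)
    if (length(lines) == 0L) break
    headers <- lines[startsWith(lines, ">")]
    # Sequence name: header text after ">" up to the first whitespace.
    found <- c(found, sub("^>([^[:space:]]+).*$", "\\1", headers))
  }
  found
}

# Toy inputs standing in for the real GTF and FASTA.
gtf_file <- tempfile(fileext = ".gtf")
writeLines(c(
  "#!genome-build toy",
  paste("chr1", "toy", "gene", "1", "100", ".", "+", ".",
        'gene_id "G1"; transcript_id "";', sep = "\t"),
  paste("chr1", "toy", "exon", "1", "50", ".", "+", ".",
        'gene_id "G1"; transcript_id "T1";', sep = "\t"),
  paste("chr2", "toy", "exon", "1", "50", ".", "+", ".",
        'gene_id "G2";', sep = "\t")
), gtf_file)
fa_file <- tempfile(fileext = ".fa")
writeLines(c(">chr1 toy contig", "ACGT"), fa_file)

gtf <- summarise_gtf(gtf_file)
fa <- fasta_seqnames(fa_file)

# Sequences annotated in the GTF but absent from the FASTA.
missing_contigs <- setdiff(gtf$seqnames, fa)
print(missing_contigs)
print(gtf[c("malformed_lines", "missing_gene_id", "missing_transcript_id")])
```

On the real data the same calls would take `ref_annot_input` and `ref_fna_input` in place of the toy files. Note that `summarise_gtf()` reads the whole GTF into memory, and a non-zero count or a non-empty `missing_contigs` only points at a candidate cause rather than confirming one.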

My sessionInfo:

R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] FLAMES_1.9.2

loaded via a namespace (and not attached):
  [1] BiocIO_1.12.0               bitops_1.0-7                filelock_1.0.2             
  [4] tibble_3.2.1                R.oo_1.25.0                 basilisk.utils_1.14.1      
  [7] bambu_3.4.0                 graph_1.80.0                XML_3.99-0.14              
 [10] rpart_4.1.19                lifecycle_1.0.3             edgeR_4.0.16               
 [13] doParallel_1.0.17           OrganismDbi_1.44.0          globals_0.16.2             
 [16] lattice_0.21-8              ensembldb_2.26.0            MultiAssayExperiment_1.28.0
 [19] backports_1.4.1             magrittr_2.0.3              limma_3.58.1               
 [22] Hmisc_5.1-1                 rmarkdown_2.25              yaml_2.3.7                 
 [25] metapod_1.10.1              reticulate_1.34.0           ggbio_1.50.0               
 [28] cowplot_1.1.1               DBI_1.1.3                   RColorBrewer_1.1-3         
 [31] abind_1.4-5                 zlibbioc_1.48.2             GenomicRanges_1.54.1       
 [34] purrr_1.0.2                 R.utils_2.12.3              AnnotationFilter_1.26.0    
 [37] biovizBase_1.50.0           BiocGenerics_0.48.1         RCurl_1.98-1.12            
 [40] nnet_7.3-19                 VariantAnnotation_1.48.1    rappdirs_0.3.3             
 [43] circlize_0.4.15             GenomeInfoDbData_1.2.11     IRanges_2.36.0             
 [46] S4Vectors_0.40.2            ggrepel_0.9.4               irlba_2.3.5.1              
 [49] listenv_0.9.0               dqrng_0.3.1                 parallelly_1.36.0          
 [52] DelayedMatrixStats_1.24.0   codetools_0.2-19            DropletUtils_1.22.0        
 [55] DelayedArray_0.28.0         scuttle_1.12.0              xml2_1.3.5                 
 [58] tidyselect_1.2.0            shape_1.4.6                 viridis_0.6.4              
 [61] ScaledMatrix_1.10.0         matrixStats_1.0.0           stats4_4.3.1               
 [64] BiocFileCache_2.10.2        base64enc_0.1-3             GenomicAlignments_1.38.2   
 [67] jsonlite_1.8.7              BiocNeighbors_1.20.2        GetoptLong_1.0.5           
 [70] Formula_1.2-5               scater_1.30.1               iterators_1.0.14           
 [73] foreach_1.5.2               tools_4.3.1                 progress_1.2.2             
 [76] Rcpp_1.0.11                 glue_1.7.0                  gridExtra_2.3              
 [79] SparseArray_1.2.4           xfun_0.40                   MatrixGenerics_1.14.0      
 [82] GenomeInfoDb_1.38.8         dplyr_1.1.4                 HDF5Array_1.30.1           
 [85] withr_2.5.1                 BiocManager_1.30.22         fastmap_1.1.1              
 [88] GGally_2.1.2                basilisk_1.14.3             bluster_1.12.0             
 [91] rhdf5filters_1.14.1         fansi_1.0.5                 rsvd_1.0.5                 
 [94] digest_0.6.33               R6_2.5.1                    colorspace_2.1-0           
 [97] dichromat_2.0-0.1           biomaRt_2.58.2              RSQLite_2.3.2              
[100] R.methodsS3_1.8.2           utf8_1.2.4                  tidyr_1.3.1                
[103] generics_0.1.3              data.table_1.14.8           rtracklayer_1.62.0         
[106] prettyunits_1.2.0           httr_1.4.7                  htmlwidgets_1.6.2          
[109] S4Arrays_1.2.1              pkgconfig_2.0.3             gtable_0.3.4               
[112] blob_1.2.4                  ComplexHeatmap_2.18.0       SingleCellExperiment_1.24.0
[115] XVector_0.42.0              htmltools_0.5.6.1           RBGL_1.78.0                
[118] ProtGenerics_1.34.0         clue_0.3-65                 scales_1.3.0               
[121] Biobase_2.62.0              png_0.1-8                   scran_1.30.2               
[124] knitr_1.44                  rstudioapi_0.15.0           reshape2_1.4.4             
[127] rjson_0.2.21                checkmate_2.3.0             curl_5.1.0                 
[130] cachem_1.0.8                rhdf5_2.46.1                GlobalOptions_0.1.2        
[133] stringr_1.5.1               vipor_0.4.5                 parallel_4.3.1             
[136] foreign_0.8-84              AnnotationDbi_1.64.1        restfulr_0.0.15            
[139] pillar_1.9.0                grid_4.3.1                  reshape_0.8.9              
[142] vctrs_0.6.4                 BiocSingular_1.18.0         dbplyr_2.4.0               
[145] beachmat_2.18.1             cluster_2.1.4               beeswarm_0.4.0             
[148] htmlTable_2.4.2             evaluate_0.22               GenomicFeatures_1.54.4     
[151] cli_3.6.1                   locfit_1.5-9.8              compiler_4.3.1             
[154] Rsamtools_2.18.0            rlang_1.1.1                 crayon_1.5.2               
[157] ggbeeswarm_0.7.2            plyr_1.8.9                  stringi_1.7.12             
[160] viridisLite_0.4.2           BiocParallel_1.36.0         munsell_0.5.0              
[163] Biostrings_2.70.3           lazyeval_0.2.2              Matrix_1.6-5               
[166] dir.expiry_1.10.0           BSgenome_1.70.2             hms_1.1.3                  
[169] sparseMatrixStats_1.14.0    bit64_4.0.5                 future_1.33.0              
[172] ggplot2_3.5.0               Rhdf5lib_1.24.2             KEGGREST_1.42.0            
[175] statmod_1.5.0               SummarizedExperiment_1.32.0 igraph_1.5.1               
[178] memoise_2.0.1               bit_4.0.5                   xgboost_1.7.5.1
