addGeneExpressionMatrix Error: invalid subscript #2219

danli349 opened this issue Oct 17, 2024 · 1 comment

danli349 commented Oct 17, 2024


How can I run addGeneExpressionMatrix with a SingleCellAssay object as the seRNA input?
I created the SingleCellAssay object with:

seRNA <- MAST::FromMatrix(
    as.matrix(csr_matrix),  # raw counts data, genes in the rows and cells in the columns
    class = "SingleCellAssay",
    check_sanity = FALSE
)
Loading required package: MAST

Loading required package: SingleCellExperiment

class: SingleCellAssay 
dim: 15515 3157 
assays(1): et
rownames(15515): CreERT2 mKATE_Bglobin ... CAAA01118383.1
rowData names(10): gene_id function ... std primerid
colnames(3157): TCEV_Multiome1#ATTCAACCATGATTGT-1
colData names(13): cells sample ... umap2 wellKey
mainExpName: NULL
projHeme3 <- subsetArchRProject(ArchRProj = projHeme2,
                                cells = colnames(seRNA),
                                outputDirectory = "ArchRSubset",
                                dropCells = TRUE,
                                force = FALSE)

GRanges object with 24368 ranges and 2 metadata columns:
          seqnames            ranges strand |     gene_id      symbol
             <Rle>         <IRanges>  <Rle> | <character> <character>
      [1]     chr1   3214482-3671498      - |      497097        Xkr4
      [2]     chr1   4290846-4409241      - |       19888         Rp1
      [3]     chr1   4490928-4497354      - |       20671       Sox17
      [4]     chr1   4773198-4785726      - |       27395      Mrpl15
      [5]     chr1   4807893-4846735      + |       18777      Lypla1
      ...      ...               ...    ... .         ...         ...
  [24364]     chrY 66739797-66742170      - |   100040786     Gm20852
  [24365]     chrY 78835721-78836719      - |   100039574     Gm20806
  [24366]     chrY 79148793-79149787      - |   100042428     Gm20917
  [24367]     chrY 85528523-85529519      - |   100040911     Gm20854
  [24368]     chrY 90784610-90816465      + |      170942       Erdr1
  seqinfo: 21 sequences from mm10 genome
DataFrame with 15515 rows and 5 columns
                       symbol        function   n_cells highly_variable
                  <character>        <factor> <integer>       <logical>
CreERT2               CreERT2 Gene Expression        85           FALSE
mKATE_Bglobin   mKATE_Bglobin Gene Expression      1898           FALSE
MYC_SV40pA         MYC_SV40pA Gene Expression      1117            TRUE
Xkr4                     Xkr4 Gene Expression        38           FALSE
Rp1                       Rp1 Gene Expression        32           FALSE
...                       ...             ...       ...             ...
Vamp7                   Vamp7 Gene Expression       158           FALSE
Tmlhe                   Tmlhe Gene Expression       238           FALSE
AC149090.1         AC149090.1 Gene Expression       785           FALSE
CAAA01118383.1 CAAA01118383.1 Gene Expression       791           FALSE
CAAA01147332.1 CAAA01147332.1 Gene Expression        34           FALSE
CreERT2        0.0532844
mKATE_Bglobin  1.4895195
MYC_SV40pA     0.9191971
Xkr4           0.0237930
Rp1            0.0211868
...                  ...
Vamp7          0.0806509
Tmlhe          0.1489885
AC149090.1     0.4397101
CAAA01118383.1 0.4414484
CAAA01147332.1 0.0184182
proj <- addGeneExpressionMatrix(input = projHeme3, 
                                seRNA = seRNA, 
                                force = TRUE,
                                strictMatch = TRUE)
ArchR logging to : ArchRLogs/ArchR-addGeneExpressionMatrix-3bc58765440-Date-2024-10-17_Time-09-55-16.395514.log
If there is an issue, please report to github with logFile!

Overlap w/ scATAC = 1

2024-10-17 09:55:17.974571 : 

Overlap Per Sample w/ scATAC : TCEV_Multiome1=687,TCEV_Multiome2=669,TCEV_Multiome3=1801

2024-10-17 09:55:17.990951 : 

Error: invalid subscript

1. seRNA[BiocGenerics::which(seqnames(seRNA) %bcin% seqnames(chromSizes))]
2. seRNA[BiocGenerics::which(seqnames(seRNA) %bcin% seqnames(chromSizes))]
3. int_elementMetadata(x)[ii, , drop = FALSE]
4. int_elementMetadata(x)[ii, , drop = FALSE]
5. extractROWS(x, i)
6. extractROWS(x, i)
7. normalizeSingleBracketSubscript(i, x, allow.NAs = TRUE, as.NSBS = TRUE)
8. NSBS(i, x, exact = exact, strict.upper.bound = !allow.append, 
 .     allow.NAs = allow.NAs)
9. NSBS(i, x, exact = exact, strict.upper.bound = !allow.append, 
 .     allow.NAs = allow.NAs)
10. .subscript_error("invalid subscript")
11. stop(wmsg(...), call. = FALSE)
12. .handleSimpleError(function (cnd) 
  . {
  .     watcher$capture_plot_and_output()
  .     cnd <- sanitize_call(cnd)
  .     watcher$push(cnd)
  .     switch(on_error, continue = invokeRestart("eval_continue"), 
  .         stop = invokeRestart("eval_stop"), error = invokeRestart("eval_error", 
  .             cnd))
  . }, "invalid subscript", base::quote(NULL))
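
Frame 1 of the traceback subsets `seRNA` by `seqnames(seRNA)`, which suggests that the function expects genomic coordinates on the rows of the input. The following is a minimal sketch, not a confirmed fix: it assumes the input should be a `RangedSummarizedExperiment` whose `rowRanges` hold gene coordinates. The toy count matrix, the cell names, and the `counts` assay name are hypothetical; the two gene coordinates are copied from the GRanges output above.

```r
library(SummarizedExperiment)
library(GenomicRanges)

# Toy stand-in for the raw count matrix (hypothetical values and cell names).
counts <- matrix(
  c(1L, 0L, 3L, 2L, 5L, 0L),
  nrow = 3,
  dimnames = list(c("Xkr4", "Rp1", "NotAnnotated"), c("cellA", "cellB"))
)

# Gene coordinates; in a real session these might come from the project's
# gene annotation instead of being typed by hand.
genes <- GRanges(
  seqnames = c("chr1", "chr1"),
  ranges = IRanges(start = c(3214482, 4290846), end = c(3671498, 4409241)),
  strand = c("-", "-"),
  symbol = c("Xkr4", "Rp1")
)

# Keep only the genes that have coordinates, in the same order as the counts.
keep <- intersect(rownames(counts), genes$symbol)
gene_ranges <- genes[match(keep, genes$symbol)]
names(gene_ranges) <- keep

# Supplying rowRanges makes the result a RangedSummarizedExperiment,
# so seqnames() is defined for it.
seRNA_ranged <- SummarizedExperiment(
  assays = list(counts = counts[keep, , drop = FALSE]),
  rowRanges = gene_ranges
)
seqnames(seRNA_ranged)
```

Genes without coordinates (here `NotAnnotated`, and transgenes such as `CreERT2` in the real data) are dropped by this approach, so whether that is acceptable depends on the analysis.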

The ArchRLogs/ArchR-addGeneExpressionMatrix-3bc58765440-Date-2024-10-17_Time-09-55-16.395514.log:

           ___      .______        ______  __    __  .______      
          /   \     |   _  \      /      ||  |  |  | |   _  \     
         /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
        /  /_\  \   |      /     |  |     |   __   | |      /     
       /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
      /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|
Logging With ArchR!

Start Time : 2024-10-17 09:55:16.436662

------- ArchR Info

ArchRThreads = 10
ArchRGenome = Mm10

------- System Info

Computer OS = unix
Total Cores = 12

------- Session Info

R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/ 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/;  LAPACK version 3.9.0

Random number generation:
 RNG:     L'Ecuyer-CMRG 
 Normal:  Inversion 
 Sample:  Rejection 
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
 [1] parallel  stats4    grid      stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] SeuratObject_4.1.3          Seurat_4.3.0               
 [3] MAST_1.30.0                 SingleCellExperiment_1.26.0
 [5] rhdf5_2.48.0                SummarizedExperiment_1.34.0
 [7] Biobase_2.64.0              MatrixGenerics_1.16.0      
 [9] Rcpp_1.0.13                 Matrix_1.7-0               
[11] GenomicRanges_1.56.1        GenomeInfoDb_1.40.1        
[13] IRanges_2.38.1              S4Vectors_0.42.1           
[15] BiocGenerics_0.50.0         matrixStats_1.4.1          
[17] data.table_1.16.0           stringr_1.5.1              
[19] plyr_1.8.9                  magrittr_2.0.3             
[21] ggplot2_3.5.1               gtable_0.3.5               
[23] gtools_3.9.5                gridExtra_2.3              
[25] ArchR_1.0.2                

loaded via a namespace (and not attached):
  [1] RcppAnnoy_0.0.22                   splines_4.4.1                     
  [3] later_1.3.2                        pbdZMQ_0.3-11                     
  [5] BiocIO_1.14.0                      bitops_1.0-8                      
  [7] tibble_3.2.1                       polyclip_1.10-7                   
  [9] XML_3.99-0.17                      lifecycle_1.0.4                   
 [11] globals_0.16.3                     lattice_0.22-6                    
 [13] MASS_7.3-61                        plotly_4.10.4                     
 [15] yaml_2.3.10                        httpuv_1.6.15                     
 [17] sctransform_0.4.1                  sp_2.1-4                          
 [19] spatstat.sparse_3.1-0              reticulate_1.38.0                 
 [21] cowplot_1.1.3                      pbapply_1.7-2                     
 [23] RColorBrewer_1.1-3                 abind_1.4-8                       
 [25] zlibbioc_1.50.0                    Rtsne_0.17                        
 [27] purrr_1.0.2                        RCurl_1.98-1.16                   
 [29] GenomeInfoDbData_1.2.12            ggrepel_0.9.6                     
 [31] irlba_2.3.5.1                      listenv_0.9.1                     
 [33] spatstat.utils_3.0-5               goftest_1.2-3                     
 [35] spatstat.random_3.3-1              fitdistrplus_1.2-1                
 [37] parallelly_1.38.0                  leiden_0.4.3.1                    
 [39] codetools_0.2-20                   DelayedArray_0.30.1               
 [41] tidyselect_1.2.1                   UCSC.utils_1.0.0                  
 [43] base64enc_0.1-3                    spatstat.explore_3.3-1            
 [45] GenomicAlignments_1.40.0           jsonlite_1.8.9                    
 [47] progressr_0.14.0                   ggridges_0.5.6                    
 [49] survival_3.7-0                     tools_4.4.1                       
 [51] ica_1.0-3                          glue_1.7.0                        
 [53] SparseArray_1.4.8                  IRdisplay_1.1                     
 [55] dplyr_1.1.4                        withr_3.0.1                       
 [57] fastmap_1.2.0                      rhdf5filters_1.16.0               
 [59] fansi_1.0.6                        digest_0.6.37                     
 [61] R6_2.5.1                           mime_0.12                         
 [63] colorspace_2.1-1                   scattermore_1.2                   
 [65] tensor_1.5                         spatstat.data_3.1-2               
 [67] utf8_1.2.4                         tidyr_1.3.1                       
 [69] generics_0.1.3                     rtracklayer_1.64.0                
 [71] httr_1.4.7                         htmlwidgets_1.6.4                 
 [73] S4Arrays_1.4.1                     uwot_0.2.2                        
 [75] pkgconfig_2.0.3                    lmtest_0.9-40                     
 [77] XVector_0.44.0                     htmltools_0.5.8.1                 
 [79] scales_1.3.0                       png_0.1-8                         
 [81] spatstat.univar_3.0-0              reshape2_1.4.4                    
 [83] rjson_0.2.23                       uuid_1.2-1                        
 [85] nlme_3.1-166                       curl_5.2.3                        
 [87] repr_1.1.7                         zoo_1.8-12                        
 [89] KernSmooth_2.23-24                 miniUI_0.1.1.1                    
 [91] restfulr_0.0.15                    pillar_1.9.0                      
 [93] vctrs_0.6.5                        RANN_2.6.1                        
 [95] promises_1.3.0                     xtable_1.8-4                      
 [97] cluster_2.1.6                      evaluate_1.0.0                    
 [99] cli_3.6.3                          compiler_4.4.1                    
[101] Rsamtools_2.20.0                   rlang_1.1.4                       
[103] crayon_1.5.3                       future.apply_1.11.2               
[105] stringi_1.8.4                      viridisLite_0.4.2                 
[107] deldir_2.0-4                       BiocParallel_1.38.0               
[109] munsell_0.5.1                      Biostrings_2.72.1                 
[111] lazyeval_0.2.2                     spatstat.geom_3.3-2               
[113] IRkernel_1.3.2                     BSgenome_1.72.0                   
[115] patchwork_1.2.0                    future_1.34.0                     
[117] Rhdf5lib_1.26.0                    shiny_1.9.1                       
[119] ROCR_1.0-11                        igraph_2.0.3                      
[121] BSgenome.Mmusculus.UCSC.mm10_1.4.3

------- Log Info

2024-10-17 09:55:16.571968 : addGeneExpressionMatrix Input-Parameters, Class = list

addGeneExpressionMatrix Input-Parameters$input: length = 1

addGeneExpressionMatrix Input-Parameters$seRNA: length = 15515
class: SingleCellAssay 
dim: 6 3157 
assays(1): et
rownames(6): CreERT2 mKATE_Bglobin ... Rp1 Sox17
rowData names(10): gene_id function ... std primerid
colnames(3157): TCEV_Multiome1#ATTCAACCATGATTGT-1
colData names(13): cells sample ... umap2 wellKey
mainExpName: NULL

addGeneExpressionMatrix Input-Parameters$chromSizes: length = 21
GRanges object with 6 ranges and 0 metadata columns:
      seqnames      ranges strand
         <Rle>   <IRanges>  <Rle>
  [1]     chr1 1-195471971      *
  [2]     chr2 1-182113224      *
  [3]     chr3 1-160039680      *
  [4]     chr4 1-156508116      *
  [5]     chr5 1-151834684      *
  [6]     chr6 1-149736546      *
  seqinfo: 21 sequences from an unspecified genome

addGeneExpressionMatrix Input-Parameters$excludeChr: length = 2
[1] "chrM" "chrY"

addGeneExpressionMatrix Input-Parameters$scaleTo: length = 1
[1] 10000

addGeneExpressionMatrix Input-Parameters$verbose: length = 1
[1] TRUE

addGeneExpressionMatrix Input-Parameters$threads: length = 1
[1] 10

addGeneExpressionMatrix Input-Parameters$parallelParam: length = 0

addGeneExpressionMatrix Input-Parameters$strictMatch: length = 1
[1] TRUE

addGeneExpressionMatrix Input-Parameters$force: length = 1
[1] TRUE

addGeneExpressionMatrix Input-Parameters$logFile: length = 1
[1] "ArchRLogs/ArchR-addGeneExpressionMatrix-3bc58765440-Date-2024-10-17_Time-09-55-16.395514.log"

2024-10-17 09:55:17.976565 : 

2024-10-17 09:55:17.992817 : 

rcorces commented Oct 17, 2024

Hi @danli349! Thanks for using ArchR! Lately, it has been very challenging for me to keep up with maintenance of this package and all of my other
responsibilities as a PI. I have not been responding to issue posts and I have not been pushing updates to the software. We are actively searching to hire
a computational biologist to continue to develop and maintain ArchR and related tools. If you know someone who might be a good fit, please let us know!
In the meantime, your issue will likely go without a reply. Most issues with ArchR right now relate to compatibility. Try reverting to R 4.1 and Bioconductor 3.15.
Newer versions of Seurat and Matrix are also causing issues. Sorry for not being able to provide active support for this package at this time.
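
To compare a session against the versions suggested above, something like the following could be run. It is only a sketch: the suggested values are copied from this comment, and the package list is an assumption about what is worth checking.

```r
# Versions suggested in the comment above (not independently verified).
suggested_r <- "4.1"
suggested_bioc <- "3.15"

# Build "major.minor" for the running R, e.g. "4.4" for R 4.4.1.
current_r <- paste(R.version$major, sub("\\..*$", "", R.version$minor), sep = ".")
message("R in this session: ", current_r, " (suggested: ", suggested_r, ")")

# BiocManager may not be installed, so guard the call.
if (requireNamespace("BiocManager", quietly = TRUE)) {
  message("Bioconductor: ", as.character(BiocManager::version()),
          " (suggested: ", suggested_bioc, ")")
}

# Report the installed versions of the packages named in the comment.
for (pkg in c("Seurat", "Matrix", "ArchR")) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, ": ", as.character(packageVersion(pkg)))
  }
}
```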

