How to control which SE assay to consider? #266

Closed
lgatto opened this issue Jan 26, 2020 · 3 comments

@lgatto
Collaborator

lgatto commented Jan 26, 2020

I am wondering whether the following scenario is relevant for your applications with MAE objects, and whether there are any best practices or experience on how to proceed.

The MAE below is composed of two SEs, the first one populated with 2 assays, the second one with 1:

library(MultiAssayExperiment)

m1 <- matrix(1:12, ncol = 3)
logm1 <- log2(m1)
m2 <- matrix(rnorm(12), ncol = 3)
colnames(m1) <- colnames(logm1) <- colnames(m2) <- LETTERS[1:3]

se1 <- SummarizedExperiment(list(m1 = m1, logm1 = logm1))
se2 <- SummarizedExperiment(list(m2 = m2))
mae <- MultiAssayExperiment(ExperimentList(list(se1 = se1, se2 = se2)))
> mae
A MultiAssayExperiment object of 2 listed
 experiments with user-defined names and respective classes. 
 Containing an ExperimentList class object of length 2: 
 [1] se1: SummarizedExperiment with 4 rows and 3 columns 
 [2] se2: SummarizedExperiment with 4 rows and 3 columns 
Features: 
 experiments() - obtain the ExperimentList instance 
 colData() - the primary/phenotype DataFrame 
 sampleMap() - the sample availability DataFrame 
 `$`, `[`, `[[` - extract colData columns, subset, or experiment 
 *Format() - convert into a long or wide DataFrame 
 assays() - convert ExperimentList to a SimpleList of matrices
> assays(se1)
List of length 2
names(2): m1 logm1
> assays(se2)
List of length 1
names(1): m2

How to control which SE assay to consider?

  • For example, longFormat(mae), which is equivalent to longFormat(mae, i = 1), takes the first assay in each SE:
> longFormat(mae)
DataFrame with 24 rows and 5 columns
          assay     primary     rowname     colname             value
    <character> <character> <character> <character>         <numeric>
1           se1           A           1           A                 1
2           se1           A           2           A                 2
3           se1           A           3           A                 3
4           se1           A           4           A                 4
5           se1           B           1           B                 5
...         ...         ...         ...         ...               ...
20          se2           B           4           B 0.604367483954895
21          se2           C           1           C 0.170694282195851
22          se2           C           2           C -1.53991262659098
23          se2           C           3           C -1.48481354143856
24          se2           C           4           C  1.04264526337826
  • This next one fails, as expected, given that se2 only has one assay:
> longFormat(mae, i = 2)
Error in value[[3L]](cond) : 
  'assay(<SummarizedExperiment>, i="numeric", ...)' invalid subscript 'i'
subscript is out of bounds
  • But what if I needed the second assay in se1 and the first one in se2?
> longFormat(mae, i = c(2, 1))
Error in value[[3L]](cond) : 
  'assay(<SummarizedExperiment>, i="numeric", ...)' invalid subscript 'i'
attempt to extract more than one element

The motivation behind this question is more general than longFormat above, but it illustrates my issue well.

@LiNk-NY
Collaborator

LiNk-NY commented Jan 28, 2020

Hi Laurent, @lgatto
That's a good question.

I think that the easiest and most general approach would be to split your assays
into individual experiments, one assay each. We've made that easier: the c() function
can add another experiment whose sample mapping is taken from an existing one (mapFrom):

suppressPackageStartupMessages({
    library(MultiAssayExperiment)
})

m1 <- matrix(1:12, ncol = 3)
logm1 <- log2(m1)
m2 <- matrix(rnorm(12), ncol = 3)
colnames(m1) <- colnames(logm1) <- colnames(m2) <- LETTERS[1:3]

se1 <- SummarizedExperiment(list(m1 = m1, logm1 = logm1))
se2 <- SummarizedExperiment(list(m2 = m2))
mae <- MultiAssayExperiment(ExperimentList(list(se1 = se1, se2 = se2)))

c(mae, logse = SummarizedExperiment(list(logm1 = logm1)), mapFrom = 1L)
#> Warning: Assuming column order in the data provided 
#>  matches the order in 'mapFrom' experiment(s) colnames
#> A MultiAssayExperiment object of 3 listed
#>  experiments with user-defined names and respective classes.
#>  Containing an ExperimentList class object of length 3:
#>  [1] se1: SummarizedExperiment with 4 rows and 3 columns
#>  [2] se2: SummarizedExperiment with 4 rows and 3 columns
#>  [3] logse: SummarizedExperiment with 4 rows and 3 columns
#> Features:
#>  experiments() - obtain the ExperimentList instance
#>  colData() - the primary/phenotype DFrame
#>  sampleMap() - the sample availability DFrame
#>  `$`, `[`, `[[` - extract colData columns, subset, or experiment
#>  *Format() - convert into a long or wide DFrame
#>  assays() - convert ExperimentList to a SimpleList of matrices

Created on 2020-01-28 by the reprex package (v0.3.0)

And then use the longFormat function...
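
To spell out that last step, here is a minimal sketch that assumes the same toy objects as above. The names `mae3`, `long_all` and `long_sel` are hypothetical, and the fixed seed is an addition that is not part of the original example. It calls longFormat() on the three-experiment object and keeps only the rows from the experiments of interest:

```r
suppressPackageStartupMessages({
    library(MultiAssayExperiment)
})

set.seed(1)  # assumption: fixed seed for reproducibility, not in the original post
m1 <- matrix(1:12, ncol = 3)
logm1 <- log2(m1)
m2 <- matrix(rnorm(12), ncol = 3)
colnames(m1) <- colnames(logm1) <- colnames(m2) <- LETTERS[1:3]

se1 <- SummarizedExperiment(list(m1 = m1, logm1 = logm1))
se2 <- SummarizedExperiment(list(m2 = m2))
mae <- MultiAssayExperiment(ExperimentList(list(se1 = se1, se2 = se2)))

# Add the log-transformed assay as its own experiment; the column-order
# warning shown above is expected here, so it is silenced.
mae3 <- suppressWarnings(
    c(mae, logse = SummarizedExperiment(list(logm1 = logm1)), mapFrom = 1L)
)

# Long format over all three experiments (first assay of each) ...
long_all <- longFormat(mae3)

# ... then keep only the rows for the log assay of se1 and the assay of se2.
long_sel <- long_all[long_all$assay %in% c("logse", "se2"), ]
long_sel
```

Filtering the long table by the `assay` column avoids touching the index `i` at all, at the cost of storing the log-transformed matrix twice in the object.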

Also it is quite possible to support a vector (and possibly even IntegerList) i
argument in the longFormat function that would work on SummarizedExperiment
and RaggedExperiment derivatives.

Best,
Marcel

@lgatto
Collaborator Author

lgatto commented Jan 28, 2020

From your reply, I conclude that the default/expected use case is to have one assay per MultiAssayExperiment assay. Thank you.
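
For illustration, the one-assay-per-experiment layout could also be reached by trimming se1 down to the assay of interest before building the MAE. This is only a sketch under the toy data from this thread; `se1_log`, `mae_sel` and `long_sel` are hypothetical names, and the fixed seed is an addition.

```r
suppressPackageStartupMessages({
    library(MultiAssayExperiment)
})

set.seed(1)  # assumption: fixed seed for reproducibility, not in the original post
m1 <- matrix(1:12, ncol = 3)
logm1 <- log2(m1)
m2 <- matrix(rnorm(12), ncol = 3)
colnames(m1) <- colnames(logm1) <- colnames(m2) <- LETTERS[1:3]

se1 <- SummarizedExperiment(list(m1 = m1, logm1 = logm1))
se2 <- SummarizedExperiment(list(m2 = m2))

# Keep only the 'logm1' assay in a copy of se1, so that the first (and only)
# assay of that experiment is the one of interest.
se1_log <- se1
assays(se1_log) <- assays(se1)["logm1"]

mae_sel <- MultiAssayExperiment(
    ExperimentList(list(se1 = se1_log, se2 = se2))
)

# The default i = 1 now picks logm1 from se1 and m2 from se2.
long_sel <- longFormat(mae_sel)
long_sel
```

Replacing the assays of a copy, rather than constructing a new SummarizedExperiment from the bare matrix, keeps any rowData and colData attached to the original experiment.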

@LiNk-NY
Collaborator

LiNk-NY commented Jan 28, 2020

Hi Laurent, @lgatto
This type of input should be supported as of version 1.13.7. See commit 9704f1a
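
As a hedged sketch of what that looks like in practice, the call from the original question can be retried on a sufficiently recent installation. This assumes that "this type of input" refers to a vector `i` with one index per experiment, and that the installed MultiAssayExperiment is at least version 1.13.7; the version guard, the seed and the name `long` are additions for illustration.

```r
suppressPackageStartupMessages({
    library(MultiAssayExperiment)
})

# Assumption: vector 'i' needs MultiAssayExperiment >= 1.13.7 (see above).
stopifnot(packageVersion("MultiAssayExperiment") >= "1.13.7")

set.seed(1)  # assumption: fixed seed for reproducibility, not in the original post
m1 <- matrix(1:12, ncol = 3)
logm1 <- log2(m1)
m2 <- matrix(rnorm(12), ncol = 3)
colnames(m1) <- colnames(logm1) <- colnames(m2) <- LETTERS[1:3]

se1 <- SummarizedExperiment(list(m1 = m1, logm1 = logm1))
se2 <- SummarizedExperiment(list(m2 = m2))
mae <- MultiAssayExperiment(ExperimentList(list(se1 = se1, se2 = se2)))

# One index per experiment: second assay of se1 (logm1), first assay of se2 (m2).
long <- longFormat(mae, i = c(2L, 1L))
long
```

The indices in `i` are matched to the experiments by position, so their order has to follow the order of `names(mae)`.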
