Gene-sets not loading from file #81

Closed
george-hall-ucl opened this issue Jun 30, 2023 · 7 comments
@george-hall-ucl
Contributor

Hi, thanks for the nice tool!

I'm running cellxgene-gateway v0.3.10 and cellxgene v1.1.2 locally on a MacBook Pro.

If I run "export GATEWAY_ENABLE_ANNOTATIONS=1" and create a new annotations file from filecrawl then it is saved to the csv file and is displayed in filecrawl as expected. However, when I try to reload the dataset by clicking on the annotation file's name, no gene sets are displayed. If I create a new gene set, then a new csv is created with its name appended to the existing file's name (e.g. if the first csv is called "test1-gene-sets-R64TJAID.csv" then this new one is "test1-gene-sets-R64TJAID-gene-sets-R64TJAID.csv"). I am terminating the app by CTRL-C'ing in the terminal.

Am I misunderstanding how this should work, or is this a bug?

Many thanks in advance!

@george-hall-ucl george-hall-ucl changed the title from "Annotations not loading from file" to "Gene-sets not loading from file" Jun 30, 2023
@george-hall-ucl
Contributor Author

I have done some more digging, and what I actually mean is that gene sets, rather than annotations, aren't loading from file.

The command run by cellxgene-gateway is:
cellxgene launch --annotations-file /path/to/test1-gene-sets-NS3OKLZ5.csv dataset.h5ad

Changing the command to:
cellxgene launch --gene-sets-file /path/to/test1-gene-sets-NS3OKLZ5.csv dataset.h5ad
loads the gene sets as desired.

So, I guess this is the command that I want cellxgene-gateway to execute. Is there any way to make it do this?

@alokito alokito self-assigned this Jul 4, 2023
@alokito
Member

alokito commented Jul 4, 2023

Hi @george-hall-ucl ! You are indeed misunderstanding how this currently works, and in particular the difference between annotations and gene sets. Annotations are like "cell sets" rather than gene sets. When annotations are enabled, you can click the "Create new category" button to add a new "Category" and then add "Labels" within the categories and assign cells to the labels. I have attached a screenshot of this process.

[Screenshot: categoryLabelAnnotation]

Assuming that you are naming your annotations "test1", the command that cellxgene gateway runs should be

cellxgene launch --annotations-file dataset_annotations/test1.csv dataset.h5ad

This made sense originally because there were no gene sets. I think the best way to enable what you want would be to add support for a new environment variable GATEWAY_ENABLE_GENE_SETS that will additionally set the --gene-sets-file parameter as follows:

Case 1: GATEWAY_ENABLE_GENE_SETS alone is set

cellxgene launch --gene-sets-file dataset_annotations/test1-gene-sets.csv  dataset.h5ad

Case 2: GATEWAY_ENABLE_GENE_SETS and GATEWAY_ENABLE_ANNOTATIONS are set

cellxgene launch --annotations-file dataset_annotations/test1.csv --gene-sets-file dataset_annotations/test1-gene-sets.csv dataset.h5ad

This should let annotations and gene sets play nicely and independently from each other. I'll test this when I get a chance and push it on a branch... let me know what you think. Are you savvy enough to be able to run the code from a branch? I could also try and figure out how to publish a "pre-release" version to PyPI.
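The two cases above could be sketched roughly as follows. This is only an illustration, not the gateway's actual code: the function name `build_launch_args`, the `annotations_dir` parameter, and the rule that `"1"` or `"true"` counts as enabled are all assumptions made for the example.

```python
import os


def build_launch_args(dataset, annotation_name, env=None,
                      annotations_dir="dataset_annotations"):
    """Return the argument list for `cellxgene launch` under the proposed cases.

    Hypothetical helper: names and the truthiness rule are assumptions.
    """
    env = os.environ if env is None else env

    def enabled(name):
        # Assumption: "1" or "true" (any case) switches a flag on.
        return env.get(name, "").lower() in {"1", "true"}

    args = ["cellxgene", "launch"]
    if enabled("GATEWAY_ENABLE_ANNOTATIONS"):
        args += ["--annotations-file",
                 f"{annotations_dir}/{annotation_name}.csv"]
    if enabled("GATEWAY_ENABLE_GENE_SETS"):
        args += ["--gene-sets-file",
                 f"{annotations_dir}/{annotation_name}-gene-sets.csv"]
    # The dataset path is passed once, after all options.
    args.append(dataset)
    return args
```

Keeping the two flags as independent checks is what lets Case 1 and Case 2 fall out of the same code path.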

@george-hall-ucl
Contributor Author

Hi @alokito!

Many thanks for your response. Sounds like a good solution to me. I will code it up today and send a pull request.

If anyone is reading this before this fix has been implemented and has the same problem, my current workaround is to set CELLXGENE_LOCATION to a script that adds --gene-sets-file (and the corresponding file) to the correct place in the call to cellxgene (see here). This fix will be much more stable, though!
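For readers who want a feel for what such a wrapper might look like, here is a minimal sketch. It is not the linked script: the `REAL_CELLXGENE` variable, the `add_gene_sets_arg` helper, and the `-gene-sets` file-name suffix are hypothetical, and it assumes the gateway passes `--annotations-file` and its path as two separate arguments.

```python
#!/usr/bin/env python3
import os
import sys
from pathlib import Path

# Hypothetical variable naming the real executable to forward to.
REAL_CELLXGENE = os.environ.get("REAL_CELLXGENE", "cellxgene")


def add_gene_sets_arg(argv):
    """Return a copy of argv with `--gene-sets-file <path>` inserted
    right after `--annotations-file <path>`, if not already present."""
    argv = list(argv)
    if "--gene-sets-file" in argv or "--annotations-file" not in argv:
        return argv
    i = argv.index("--annotations-file")
    if i + 1 >= len(argv):
        return argv  # flag given without a value; leave untouched
    annotations = Path(argv[i + 1])
    # Assumed naming scheme: "<name>-gene-sets.csv" next to the annotations file.
    gene_sets = annotations.with_name(
        f"{annotations.stem}-gene-sets{annotations.suffix}"
    )
    return argv[: i + 2] + ["--gene-sets-file", str(gene_sets)] + argv[i + 2:]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Replace this process with the real cellxgene, forwarding the edited args.
    os.execvp(REAL_CELLXGENE, [REAL_CELLXGENE] + add_gene_sets_arg(sys.argv[1:]))
```

With something like this saved as an executable file, CELLXGENE_LOCATION would point at the wrapper rather than at cellxgene itself.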

@george-hall-ucl
Contributor Author

@alokito I have now implemented a GATEWAY_ENABLE_GENE_SETS flag: please see my pull request.

As I explain in the PR, I have implemented it in a simple way that meets my needs, but it may need more consideration before actual release! Hopefully this is a useful start, at least.

alokito pushed a commit that referenced this issue Jul 6, 2023
This adds the flag `GATEWAY_ENABLE_GENE_SETS` to enable support for gene
sets.  To simplify implementation, activating this flag also activates
`GATEWAY_ENABLE_ANNOTATIONS`.  The gene sets are saved in a file that
has the same name as the annotations `csv` but with `_gene_sets`
appended to the file name (before the extension).  This file is hidden
in filecrawler, and the gene sets are loaded when the associated
annotations file is loaded.

If the annotations file is missing, then an Exception is raised.

I have updated one unit test to make it expect
`--disable-gene-sets-save` in the default case (i.e. if
`GATEWAY_ENABLE_ANNOTATIONS = 0`).  All unit tests pass.

I have updated the README to document `GATEWAY_ENABLE_GENE_SETS`.
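
The naming rule described in this commit message (append `_gene_sets` before the extension, and fail if the annotations file is missing) might look something like the sketch below. The function name `gene_sets_path_for` is hypothetical, and raising `FileNotFoundError` is an assumption; the commit message only says that an Exception is raised.

```python
from pathlib import Path


def gene_sets_path_for(annotations_file):
    """Derive the gene-sets csv path from an annotations csv path.

    Hypothetical helper illustrating the rule in the commit message.
    """
    annotations = Path(annotations_file)
    if not annotations.is_file():
        # Assumption: the specific exception type is not stated in the commit.
        raise FileNotFoundError(f"annotations file not found: {annotations}")
    # "test1.csv" becomes "test1_gene_sets.csv" in the same directory.
    return annotations.with_name(
        f"{annotations.stem}_gene_sets{annotations.suffix}"
    )
```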
@alokito
Member

alokito commented Jul 6, 2023

@george-hall-ucl After thinking about this some more, I'm not sure that there's a use case for setting GATEWAY_ENABLE_ANNOTATIONS without GATEWAY_ENABLE_GENE_SETS. I'm thinking it would be simpler to just have GATEWAY_ENABLE_ANNOTATIONS enable both. The fact that you opened this ticket is good evidence that the distinction between annotations and gene sets is confusing to people, and most likely a historical artifact. I just pushed a few commits and will make a PR... will hopefully have time to merge and cut a release this weekend.

@alokito alokito mentioned this issue Jul 6, 2023
alokito pushed a commit that referenced this issue Jul 6, 2023
@george-hall-ucl
Contributor Author

Yes, sounds sensible. Thank you for your help with this!

alokito pushed a commit that referenced this issue Jul 9, 2023
alokito pushed a commit that referenced this issue Jul 9, 2023
alokito pushed a commit that referenced this issue Jul 9, 2023
@alokito
Member

alokito commented Jul 9, 2023

This is deployed to PyPI; please open a new ticket if there are any remaining issues.


2 participants