[REQUEST]: allow non-conda installation of dependencies #208

Closed
dirkjanvw opened this issue Aug 13, 2024 · 10 comments

Comments

@dirkjanvw

Description

I have been trying to get phg to work on my system as a Singularity image, but I have run into an issue: the code assumes that the user always has a conda environment containing the dependencies. A Singularity image (or a Docker container, for that matter) can hold all dependencies itself, so no conda environment is needed. Would it be possible to add a --no-conda option, or something similar, that assumes all dependencies are available on the ${PATH}?

For testing, I have written this singularity .def file (for version 2.3):

Bootstrap: docker
From: mambaorg/micromamba:1.5.8

%post
   apt-get update && apt-get install -y wget

%post
   micromamba install -y -n base -c conda-forge -c bioconda -c tiledb python=3.8.15 tiledb-py=0.22.3 tiledbvcf-py=0.25.3 anchorwave=1.2.2 bcftools=1.16 samtools=1.16.1 agc=3.0 openjdk=17.0.10
   micromamba clean --all --yes

%post
   mkdir -p /opt
   cd /opt
   wget https://github.com/maize-genetics/phg_v2/releases/download/2.3.16.153/PHGv2-v2.3.tar
   tar xvf PHGv2-v2.3.tar
   rm PHGv2-v2.3.tar

%environment
   export PATH=/opt/phg/bin:$PATH
   export JAVA_OPTS="-Xmx50g"
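
The request boils down to "trust the ${PATH}". As a minimal sketch of what that assumption means in practice, the hypothetical helper below (missing_tools is not part of phg) reports which executables cannot be found. The tool names in the usage line are assumptions derived from the package names in the .def file above.

```shell
#!/bin/bash
# Hypothetical helper, not part of phg: print the name of every listed
# tool that cannot be resolved on the current PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" > /dev/null 2>&1 || echo "$tool"
    done
}
```

Inside the container this could be called as `missing_tools anchorwave bcftools samtools agc java`; empty output would mean every listed tool is reachable without a named conda environment.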

Alternatives

No response

Additional Context

No response

@lynnjo
Collaborator

lynnjo commented Aug 13, 2024

This is something we can consider. Is there a reason you prefer docker vs conda?

@dirkjanvw
Author

I prefer having everything in one single (Singularity) container so it's easy to incorporate in my pipelines. I prefer to run most tools in a Snakemake pipeline myself so it's reproducible later on. I think removing the need for a specific conda environment name could solve this :)

@dirkjanvw
Author

I tried to implement it myself in a PR, but I cannot get the tests to run successfully. They fail even without any changes: there seems to be an assumption that I have a full TileDB available at $HOME/temp/phgv2Tests/tempDir/testTileDBURI/, which I don't.

If you prefer to implement it yourself, no worries! Just thought I could give it a go!

@lynnjo
Collaborator

lynnjo commented Aug 13, 2024

We have created a card to consider this request. If implemented, it may not be via a parameter, but based on other internal changes to the code. One of our goals is to keep parameters to a minimum. We find an abundance of parameters results in an interface that is confusing to users. At the moment we have higher priorities so I cannot predict when we will address it.

In the meantime, one option for you is to take our phg_environment.yml file and create a "phgv2-conda" conda environment inside your docker. You would not need to run the environment, just create it. If you decide to try this, please let us know how it works. We appreciate your feedback!

The command to run inside your docker would be:
conda env create --solver=libmamba --file src/main/resources/phg_environment.yml

(replace "src/main/resources/phg_environment.yml" with the path to your copy of the phg_environment.yml file)

@dirkjanvw
Author

After playing around with your suggestion and some other ideas, I have created a working version. It basically creates a script called conda that checks whether phg is trying to run something in your default environment (phgv2-conda) and, if so, strips the conda run -n phgv2-conda prefix from the command and runs the rest with micromamba in the base environment.

The singularity .def file (works with singularity v3.9; haven't tested other versions):

Bootstrap: docker
From: mambaorg/micromamba:1.5.8

%post
   apt-get update && apt-get install -y wget

%post
   mkdir -p /opt
   cd /opt
   wget https://github.com/maize-genetics/phg_v2/releases/download/2.3.16.153/PHGv2-v2.3.tar
   tar xvf PHGv2-v2.3.tar
   rm PHGv2-v2.3.tar

%post
   micromamba install -y -n base -c conda-forge -c bioconda -c tiledb python=3.8.15 tiledb-py=0.22.3 tiledbvcf-py=0.25.3 anchorwave=1.2.2 bcftools=1.16 samtools=1.16.1 agc=3.0 openjdk=17.0.10
   micromamba clean --all --yes

%post
    cat << 'EOF' > /usr/local/bin/conda
#!/bin/bash
if [[ "$1" == "run" && ( "$2" == "-n" || "$2" == "--name" ) && "$3" == "phgv2-conda" ]]; then
    shift 3
    exec micromamba run --name base "$@"
else
    echo "conda is not installed; use micromamba instead" >&2
    exit 1
fi
EOF
    chmod +x /usr/local/bin/conda

%environment
    export PATH=/usr/local/bin:/opt/phg/bin:/opt/conda/bin:$PATH
    export JAVA_OPTS="-Xmx50g"

%runscript
    echo "Running: $*"
    exec "$@"

You may close this issue as this solves it for me and you indicated such a workaround is preferred for now.
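
To see what the wrapper does with its arguments without building the image, the matching logic can be mirrored in a small function. This is only an illustrative sketch: strip_conda_run is a hypothetical name, and it prints the remaining command instead of handing it to micromamba.

```shell
#!/bin/bash
# Hypothetical stand-in for the conda wrapper above: same argument test,
# but it prints the command that would be run rather than executing it.
strip_conda_run() {
    if [[ "$1" == "run" && ( "$2" == "-n" || "$2" == "--name" ) && "$3" == "phgv2-conda" ]]; then
        shift 3                 # drop: run -n phgv2-conda
        printf '%s\n' "$*"      # what micromamba would be asked to run
    else
        echo "unsupported conda call: $*" >&2
        return 1
    fi
}
```

For example, `strip_conda_run run -n phgv2-conda bcftools --version` prints `bcftools --version`. The test only matches when the environment name directly follows `run -n` or `run --name`, so any other form (for example `--name=phgv2-conda`) ends up in the error branch.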

@lynnjo
Collaborator

lynnjo commented Aug 14, 2024

I'm glad you found a solution that works for you. Keep in mind that the phgv2-required programs you load from conda must have version tags that match the release of phgv2 you are pulling. Otherwise there will be errors in execution.
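
One way to guard against such a mismatch is to compare the pinned versions against what is actually installed. The sketch below is a hypothetical helper (pin_mismatches is not part of phg or micromamba) and assumes a package listing on stdin whose first two whitespace-separated columns are name and version.

```shell
#!/bin/bash
# Hypothetical helper: for each "name=version" pin given as an argument,
# report when the listing on stdin shows a different version or none at all.
# Assumes the listing's first two columns are package name and version.
pin_mismatches() {
    local listing pin name want have
    listing="$(cat)"
    for pin in "$@"; do
        name="${pin%%=*}"
        want="${pin#*=}"
        have="$(awk -v n="$name" '$1 == n { print $2; exit }' <<< "$listing")"
        if [[ "$have" != "$want" ]]; then
            echo "$name: expected $want, found ${have:-nothing}"
        fi
    done
}
```

It could be fed from a package listing, for instance `micromamba list -n base | pin_mismatches bcftools=1.16 samtools=1.16.1`, assuming that output keeps name and version in its first two columns. No output means every pin matched.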

@dirkjanvw
Author

Yes, I will! That's also why I have the version of phgv2 hardcoded. But as far as I'm aware, the YAML file is not part of the GitHub release files? If it were, that would make it easier to adapt the definition file for another version, but for now I'll check the dependencies per version :)

@lynnjo
Collaborator

lynnjo commented Aug 14, 2024

Correct, the yml file is not part of the release as an individual file. To access the contents of it you would need to do this programmatically with a getResource("phg_environment.yml") command against the java class. If you think this would be useful, we could consider putting phg_environment.yml in the phg/resources/main folder with the application.conf file.

@dirkjanvw
Author

For me it is not needed since I know where to look, but should you decide to add a Docker and/or Singularity definition file to your repo, it would definitely make it more future-proof and foolproof, I think.

@lynnjo
Collaborator

lynnjo commented Aug 14, 2024

Closing this issue as the user has a workaround in place.

@lynnjo lynnjo closed this as completed Aug 14, 2024