3. installation guide

Rauf Salamzade edited this page Feb 23, 2025 · 17 revisions

Bioconda (recommended)

Note: on at least some setups, it is critical to specify the conda-forge channel before the bioconda channel so that channel priority is configured correctly and the installation succeeds.

Recommended: For a significantly faster installation, install mamba in your base conda environment and use it in place of conda in the commands below.
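As a minimal sketch of the mamba recommendation above, the helper below picks mamba when it is on the PATH and falls back to conda otherwise. The function name `pick_installer` is hypothetical and not part of lsaBGC-Pan; the commented commands reuse the channel order and package names from this guide.

```shell
# Hypothetical helper (not part of lsaBGC-Pan): prefer mamba if it is
# installed, otherwise fall back to conda.
pick_installer() {
    if command -v mamba >/dev/null 2>&1; then
        echo mamba
    else
        echo conda
    fi
}

# One-time step to add mamba to the base environment:
#   conda install -n base -c conda-forge mamba
# Then create the environment with whichever tool is available:
#   "$(pick_installer)" create -n lsabgc_env -c conda-forge -c bioconda lsabgc panaroo
echo "Installer to use: $(pick_installer)"
```

The create command is left as a comment so that the snippet can be pasted without changing anything on the system.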

# 1. install and activate lsabgc
# Note, panaroo is not a dependency of the lsabgc recipe by default,
# due to difficulties passing bioconda's checks and space limitations
# in Azure when including it, so it is requested explicitly here
conda create -n lsabgc_env -c conda-forge -c bioconda lsabgc panaroo
conda activate lsabgc_env

# 2. set up annotation databases. This can take 5 - 30
# minutes depending on whether you install the small or
# full version of the databases; the end product will be
# ~5 GB (small) or ~40 GB (full)!

# Recommended for laptop: download small/minimal database 
# (only PGAP HMMs & MIBiG proteins)
setup_annotation_dbs.py -ld

# Recommended for server: download full database (PGAP, 
# CARD, KOfam, ISfinder, MIBiG, etc.)
setup_annotation_dbs.py

Tip

When you create a conda environment using -n, the environment is typically stored in your home directory. Because the databases can be large (~40 GB), you might prefer to set up the conda environment elsewhere on your system where there is more space, using -p. For instance: conda create -p /path/to/drive_with_more_space/lsabgc_conda_env/ -c conda-forge -c bioconda lsabgc. Afterwards, you activate this environment by providing its path: conda activate /path/to/drive_with_more_space/lsabgc_conda_env/.
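To help decide between -n and -p, the following sketch checks how much space is free under the home directory. The helper name `free_gb` is hypothetical, and the 40 GB threshold is simply the full database size quoted in this guide; treat it as a rough check rather than a guarantee.

```shell
# Hypothetical helper: print the free space, in whole GB, on the
# filesystem that holds the given path (defaults to the current directory).
free_gb() {
    # -P forces one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the data line is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

required_gb=40              # full database size quoted in this guide
target="${HOME:-.}"         # where "conda create -n" typically stores environments
available_gb="$(free_gb "$target")"

if [ "$available_gb" -lt "$required_gb" ]; then
    echo "Only ${available_gb} GB free under ${target}; consider conda create -p on a larger drive."
else
    echo "${available_gb} GB free under ${target}; conda create -n should have room."
fi
```

The same function can be pointed at a candidate location, for example `free_gb /path/to/drive_with_more_space`.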

Caution

If you manually define a database directory for setup_annotation_dbs.py, make sure that the directory is dedicated to zol, because it will be deleted and recreated when you run the script. You don't have to worry about this with the Bioconda installation, where the default directory is located within the conda environment.
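Because the chosen directory is wiped, a small guard run beforehand can catch mistakes. The sketch below is an assumption-laden illustration: `is_safe_db_dir` is a hypothetical helper, not part of lsaBGC-Pan or zol, and it only checks that a path is either absent or an empty directory.

```shell
# Hypothetical guard: succeed only if the path does not exist yet or is
# an empty directory, so that nothing of value can be wiped.
is_safe_db_dir() {
    dir="$1"
    [ -n "$dir" ] || return 1
    if [ ! -e "$dir" ]; then
        return 0
    fi
    [ -d "$dir" ] && [ -z "$(ls -A "$dir")" ]
}

# Demonstration on a throwaway directory.
demo="$(mktemp -d)"
is_safe_db_dir "$demo" && echo "empty directory: safe"
touch "$demo/keep_me.txt"
is_safe_db_dir "$demo" || echo "directory with files: not safe"
rm -rf "$demo"
```

The guard is intentionally strict: it also flags a directory that already holds a previous database download, so a deliberate re-run would need to bypass it.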

Note

🍎 For Mac users with Apple Silicon chips, you might need to set CONDA_SUBDIR=osx-64 before conda create. The command would then be: CONDA_SUBDIR=osx-64 conda create -n lsabgc_env -c conda-forge -c bioconda lsabgc.
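One way to decide whether the prefix applies is to inspect `uname`. This is a sketch under the assumption that Apple Silicon machines report `Darwin` and `arm64`; the helper name `conda_subdir_prefix` is hypothetical.

```shell
# Hypothetical helper: print the environment prefix suggested for
# conda create. Arguments default to the current machine; pass them
# explicitly to see what another platform would get.
conda_subdir_prefix() {
    os="${1:-$(uname -s)}"
    arch="${2:-$(uname -m)}"
    if [ "$os" = "Darwin" ] && [ "$arch" = "arm64" ]; then
        echo "CONDA_SUBDIR=osx-64"
    fi
}

prefix="$(conda_subdir_prefix)"
echo "Suggested command: ${prefix:+$prefix }conda create -n lsabgc_env -c conda-forge -c bioconda lsabgc"
```

A terminal running under Rosetta may report `x86_64` instead of `arm64`, in which case the helper prints no prefix.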

Docker via user-friendly wrapper script

We provide a Docker image on Docker Hub containing lsaBGC-Pan together with the minimal annotation databases (MIBiG + PGAP). The image is about 13 GB.

We also provide a wrapper bash script that makes using lsaBGC-Pan via Docker super easy.

# get wrapper script from GitHub
wget https://raw.githubusercontent.com/Kalan-Lab/lsaBGC-Pan/main/docker/run_lsaBGC-Pan.sh

# change permissions to allow execution
chmod a+x ./run_lsaBGC-Pan.sh

# run script
./run_lsaBGC-Pan.sh

See info on how to run run_lsaBGC-Pan.sh below.
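The wrapper relies on a working Docker installation, so it can help to confirm that first. The following is a minimal sketch; `docker_state` is a hypothetical helper, and it assumes that `docker info` fails when the daemon is not reachable.

```shell
# Hypothetical helper: report whether Docker is usable from this shell.
docker_state() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "missing"
    elif ! docker info >/dev/null 2>&1; then
        echo "not-running"
    else
        echo "ready"
    fi
}

echo "Docker state: $(docker_state)"
```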

Conda Manual

# 1. get the latest release (at the time of writing this, it was v1.0.6)
# but there might be newer releases.
wget https://github.com/Kalan-Lab/lsaBGC-Pan/archive/refs/tags/v1.0.6.tar.gz
tar -zxvf v1.0.6.tar.gz 
cd lsaBGC-Pan-1.0.6/

# 2. create conda environment using yaml file and activate it!
conda env create -f lsaBGC_env.yml -n lsaBGC_env
conda activate lsaBGC_env

# 3. complete python installation with the following command:
pip install -e .

# 4. set up annotation databases. This can take 5 - 30
# minutes depending on whether you install the small or
# full version of the databases; the end product will be
# ~5 GB (small) or ~40 GB (full)!

# Recommended for laptop: download small/minimal database 
# (only PGAP HMMs & MIBiG proteins)
setup_annotation_dbs.py -ld

# Recommended for server: download full database (PGAP, 
# CARD, KOfam, ISfinder, MIBiG, etc.)
setup_annotation_dbs.py
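Since newer releases may exist, the download in step 1 can be parameterized by version. The sketch below assumes that later tags follow the same URL and folder naming as v1.0.6; `release_url` is a hypothetical helper.

```shell
# Hypothetical helper: build the release tarball URL for a given version,
# following the URL pattern used in step 1 of this guide.
release_url() {
    echo "https://github.com/Kalan-Lab/lsaBGC-Pan/archive/refs/tags/v${1}.tar.gz"
}

version="1.0.6"   # version named in this guide; newer releases may exist
echo "Download URL:  $(release_url "$version")"
echo "Source folder: lsaBGC-Pan-${version}/"

# To download and unpack for real:
#   wget "$(release_url "$version")"
#   tar -zxvf "v${version}.tar.gz"
#   cd "lsaBGC-Pan-${version}/"
```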

Testing Installation

Testing Bioconda installation

You can test that the installation worked by running the test dataset of 7 Cutibacterium acnes and Cutibacterium avidum genomes provided in this repo.

# get the input dataset
wget https://github.com/Kalan-Lab/lsaBGC-Pan/raw/main/test_case.tar.gz

# get the bash script to run the test
wget https://raw.githubusercontent.com/Kalan-Lab/lsaBGC-Pan/main/run_test.sh

# run the test!
bash run_test.sh
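If the test fails immediately, a common cause is that the conda environment is not activated. The sketch below checks that the commands named in this guide are on the PATH; `missing_tools` is a hypothetical helper, and the list of commands is an assumption about what the test needs.

```shell
# Hypothetical helper: print every listed command that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools conda setup_annotation_dbs.py wget)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed commands were found."
fi
```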

Testing Docker installation

You can test that the installation worked by running the test dataset of 7 Cutibacterium acnes and Cutibacterium avidum genomes provided in this repo.

# get the input dataset (removing any stale copy first)
rm -f test_case.tar.gz
wget https://github.com/Kalan-Lab/lsaBGC-Pan/raw/main/test_case.tar.gz

# uncompress test_case.tar.gz and change into the directory
rm -rf test_case/
tar -zxvf test_case.tar.gz
cd test_case/

# get the wrapper bash script for Docker based running of 
# lsaBGC-Pan from GitHub
wget https://raw.githubusercontent.com/Kalan-Lab/lsaBGC-Pan/main/docker/run_lsaBGC-Pan.sh

# change permissions for it to allow execution
chmod a+x ./run_lsaBGC-Pan.sh

# run test
./run_lsaBGC-Pan.sh -g input_genomes/ -o lsabgc_pan_results/ -nb -c 4