StrainGE is a set of tools to analyse the within-species strain diversity in bacterial populations. It consists of two main components: 1) StrainGST (Strain Genome Search Tool), a tool to find close reference genomes for strains present in a sample, and 2) StrainGR (Strain Genome Recovery), a tool to perform strain-aware variant calling at low coverage.
- Python >= 3.7
- NumPy
- SciPy
- matplotlib
- scikit-bio >= 0.5
- scikit-learn >= 0.24
- pysam
- h5py
- intervaltree
- bwa
- samtools
- mummer
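To see which of the Python dependencies listed above are already present, a small check along these lines may help. It is only a sketch, not part of StrainGE: the helper names `parse_version` and `check_requirements` are made up for illustration, the package names are assumed to be the usual PyPI distribution names, and `importlib.metadata` needs Python 3.8 or newer. The version comparison is deliberately simple and only looks at the leading numeric parts.

```python
from importlib import metadata  # standard library in Python 3.8+


def parse_version(text):
    """Turn '0.24.2' into (0, 24, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def check_requirements(requirements, get_version=metadata.version):
    """Return a dict mapping each package to 'ok', 'missing', or 'too old'."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum and parse_version(installed) < parse_version(minimum):
            report[name] = "too old"
        else:
            report[name] = "ok"
    return report


# Minimum versions copied from the list above; None means no stated minimum.
# Distribution names are assumed to be the usual PyPI names.
REQUIREMENTS = {
    "numpy": None, "scipy": None, "matplotlib": None,
    "scikit-bio": "0.5", "scikit-learn": "0.24",
    "pysam": None, "h5py": None, "intervaltree": None,
}

if __name__ == "__main__":
    for package, status in check_requirements(REQUIREMENTS).items():
        print(f"{package}: {status}")
```

The non-Python tools (bwa, samtools, mummer) are not covered by this check, since they are not Python packages.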
pip install strainge
Warning: NumPy must already be installed, otherwise the above command will fail.
You'll have to make sure all tools like bwa, samtools and mummer are installed as well.
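One way to confirm that the external tools are reachable is to look them up on the PATH. The following is a minimal sketch using the standard library's `shutil.which`; the function name `missing_tools` is hypothetical, and the executable names are assumed to match the tool names mentioned above (the MUMmer package also ships other executables, which are not checked here).

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` with no executable on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


# Assumption: executable names match the tool names in the text above.
EXTERNAL_TOOLS = ["bwa", "samtools", "mummer"]

if __name__ == "__main__":
    missing = missing_tools(EXTERNAL_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools found.")
```

This only reports whether an executable exists on the PATH; it does not check tool versions.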
- Install Anaconda or Miniconda (if not already present on your system)
- Create a new environment:
  conda create -n strainge python=3
- Activate the environment:
  source activate strainge
- Enable the bioconda and conda-forge channels:
  conda config --add channels bioconda
  conda config --add channels conda-forge
- Install StrainGE:
  conda install strainge
Optional tip: also consider installing mamba before installing StrainGE for much faster conda operations.
The documentation can be read on Read the Docs.
Dijk, Lucas R. van, Bruce J. Walker, Timothy J. Straub, Colin J. Worby, Alexandra Grote, Henry L. Schreiber, Christine Anyansi, et al. 2022. “StrainGE: A Toolkit to Track and Characterize Low-Abundance Strains in Complex Microbial Communities.” Genome Biology 23 (1): 74. https://doi.org/10.1186/s13059-022-02630-0.