GitHub repository for the Unified Variant Pipeline (UVP): https://github.com/CPTR-ReSeqTB/UVP
GitHub repository for BioCompute Objects: https://github.com/biocompute-objects/
The BioCompute Objects user guide introduces how to write a BCO for a pipeline or workflow, and is taken from the BioCompute Objects Specification Document.
The UVP BCO was made to standardize how we communicate the process for imputing drug resistance profiles using sequence-based technologies. The UVP, written in Python, incorporates a suite of current bioinformatics analysis tools. In broad outline, the pipeline implements four major steps:
- Input data validation & QC
- Sequence reads mapping & refinement
- Variant calling
- Functional annotation & lineage analysis
Note that unless you are viewing a release, this is a draft and subject to change.
All of the files referenced in the UVP BCO are available at the Unified Variant Pipeline BioCompute Object site.
The following are the tools used in UVP:
- BEDtools Version 2.17.0
- Bcftools Version 1.2
- BWA Version 0.7.12
- FastQC Version 0.11.5
- Fastqvalidator Version 1.0.5
- GATK Version 3.4.0
- Kraken Version 0.10.5
- Picard Version 1.134
- Prinseq-lite.pl Version 0.20.4
- Pigz Version 2.3.3
- Qualimap Version 2.1.1
- Samtools Version 1.2
- SnpEff Version 4.1
- Vcftools Version 0.1.126
The UVP requires at least 100 GB of RAM and up to 100 GB of storage space to run locally. Installing the UVP on your local machine is straightforward. Clone the entire repository, then download the specific version of each third-party tool listed above into the 'Local Directory Path'/uvp/bin folder. Next, edit the config.yml file in the 'Local Directory Path'/uvp/bin folder so that it points to the correct directory and file paths of all the scripts and tools listed therein.
You run the UVP from the command line by invoking the UVP module in the 'Local Directory Path'/uvp/scripts directory:
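Before a first run, it can help to confirm that the configured paths exist and that enough storage is free. The following is a minimal sketch, not part of the UVP itself. The layout of config.yml is not described here, so the `EXAMPLE_PATHS` mapping, its keys, and the `/opt/uvp/bin` paths are hypothetical placeholders, and the helper names `missing_paths` and `free_space_gb` are made up for illustration.

```python
import os
import shutil

# Hypothetical tool-name -> path mapping. The real keys and layout of
# config.yml are not shown in this README, so these entries are placeholders.
EXAMPLE_PATHS = {
    "bwa": "/opt/uvp/bin/bwa",
    "samtools": "/opt/uvp/bin/samtools",
}


def missing_paths(paths):
    """Return the sorted names whose configured path does not exist on disk."""
    return sorted(name for name, path in paths.items() if not os.path.exists(path))


def free_space_gb(directory):
    """Return free space in GB (10**9 bytes) on the volume holding `directory`."""
    return shutil.disk_usage(directory).free / 1e9


if __name__ == "__main__":
    for name in missing_paths(EXAMPLE_PATHS):
        print(f"missing: {name}")
    # 100 GB is the storage figure quoted in this README.
    if free_space_gb(".") < 100:
        print("warning: less than 100 GB of free storage")
```

Replace the placeholder mapping with the entries from your own config.yml. The check only confirms that each path exists; it does not verify tool versions.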
'Local Directory Path'/uvp/scripts/UVP -q 'input fastq' -r 'path to H37Rv reference genome fasta file' -n 'sample name' -q2 'paired fastq file' -a -v
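When processing many samples, the command above can be assembled programmatically. The sketch below is one possible wrapper, not something shipped with the UVP: the function names `build_uvp_command` and `run_uvp` and all file paths in the usage note are hypothetical. It uses only the flags shown in the command above, and passes `-a` and `-v` through unchanged because this README does not describe them.

```python
import subprocess


def build_uvp_command(uvp_path, fastq, reference, sample_name, paired_fastq):
    """Assemble the UVP argument list in the same flag order as the README command."""
    return [
        uvp_path,
        "-q", fastq,            # input fastq
        "-r", reference,        # H37Rv reference genome fasta file
        "-n", sample_name,      # sample name
        "-q2", paired_fastq,    # paired fastq file
        "-a", "-v",             # copied verbatim from the documented invocation
    ]


def run_uvp(cmd):
    """Run the assembled command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)
```

For example, `run_uvp(build_uvp_command("uvp/scripts/UVP", "sample_1.fastq", "H37Rv.fasta", "sample", "sample_2.fastq"))` would launch one run, assuming those placeholder files exist. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.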