aandradebio/V2IDA

Viral Vaccine genetIc Diversity Analyzer

This script provides an automated, user-friendly pipeline for variant calling and/or quasispecies reconstruction in viral vaccine samples. It was previously used to investigate the relationship among genetic diversity, vaccine stability, and possible reversion to virulence caused by SNPs and viral quasispecies in vaccine lots of the 17DD strain of the Yellow Fever vaccine.

List of Tools Used in this Pipeline

The user must download and install all of the requirements listed below.

By default, the tools are expected to be on your PATH. Alternatively, place the pre-compiled files in the same folder as the V2IDA script.

JDK 7

BWA-MEM v. 0.7

Samtools v. 1.6

Picard v. 2.21.9

GATK v. 4

QuasiRecomb v. 1.2
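
Before running the pipeline, it can help to confirm that the command-line tools are reachable. The sketch below is not part of V2IDA; `check_tools` is a hypothetical helper, and the executable names you pass to it (for example `java`, `bwa`, `samtools`, `gatk`) are assumptions that may differ on your system. Picard and QuasiRecomb are commonly distributed as .jar files, so they are not covered by a PATH lookup.

```shell
# Hypothetical helper, not part of V2IDA.
# Prints each command that cannot be found on PATH, one per line,
# and returns 1 if at least one is missing.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools java bwa samtools gatk` prints nothing and returns 0 when all four commands are found.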

Pipeline Overview

Usage

./v2ida.sh id ref pair-end initial final parts

id is the name of a .tab file (given without the extension) that lists one sample per line. Each line contains the sample name, the r1.fastq file, and the r2.fastq file;

ref is the reference genome used for alignment (FASTA format by default);

pair-end or single-end selects the sequencing mode of the raw data;

initial is the first nucleotide of the region used for quasispecies reconstruction;

final is the last nucleotide of the region used for quasispecies reconstruction;

parts is the number of parts into which the genome is divided for quasispecies reconstruction (e.g. 1 if you don't want to divide it).
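
A malformed sample sheet is an easy mistake to make, so a quick check before launching can save a failed run. The sketch below is not part of V2IDA; `validate_samples` is a hypothetical helper. It assumes the paired-end layout described above, with fields separated by tabs or spaces and no spaces inside file names.

```shell
# Hypothetical helper, not part of V2IDA.
# Assumed line layout: <sample name> <r1.fastq> <r2.fastq>
# Reports every non-empty line that does not have exactly three fields
# and returns 1 if any such line is found.
validate_samples() {
    local sheet=$1 lineno=0 name r1 r2 extra status=0
    # "|| [ -n "$name" ]" keeps a final line that lacks a trailing newline
    while read -r name r1 r2 extra || [ -n "$name" ]; do
        lineno=$((lineno + 1))
        [ -z "$name" ] && continue    # skip blank lines
        if [ -z "$r1" ] || [ -z "$r2" ] || [ -n "$extra" ]; then
            echo "line $lineno: expected 3 fields" >&2
            status=1
        fi
    done < "$sheet"
    return "$status"
}
```

Calling `validate_samples samples.tab` returns 0 when every line has the three expected fields.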

Example:

./v2ida.sh samples MN737509 pair-end 1 10862 5

In this example, V2IDA reads the sample names from the samples.tab file, uses the MN737509.fasta file as the reference genome, treats the raw data as paired-end, and divides the genome from nucleotide 1 to nucleotide 10,862 into 5 parts.
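
To make the effect of the initial, final and parts arguments concrete, the sketch below computes one possible division of a region into contiguous windows. How V2IDA itself places the boundaries is not documented here, so the equal-width scheme (with the last window absorbing the remainder) is an assumption, and `split_region` is a hypothetical helper.

```shell
# Hypothetical helper, not part of V2IDA.
# Prints "start end" for each of <parts> contiguous windows covering
# [initial, final]. Equal-width windows are an assumption made for
# illustration; the last window absorbs any remainder.
split_region() {
    local initial=$1 final=$2 parts=$3
    local length=$((final - initial + 1))
    local width=$((length / parts))
    local i=1 start end
    while [ "$i" -le "$parts" ]; do
        start=$((initial + (i - 1) * width))
        if [ "$i" -eq "$parts" ]; then
            end=$final
        else
            end=$((start + width - 1))
        fi
        echo "$start $end"
        i=$((i + 1))
    done
}
```

Under this scheme, `split_region 1 10862 5` yields four windows of 2,172 nucleotides (1 to 2172, 2173 to 4344, 4345 to 6516, 6517 to 8688) and a final window from 8689 to 10862.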

To customize the SNP hard-filtering criteria, we suggest reading GATK's Best Practices.
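
As a starting point for experimenting with filters outside the pipeline, the sketch below wraps GATK 4's VariantFiltration tool. It is not taken from the V2IDA script: `hard_filter_snps` and the `GATK` override variable are hypothetical, the file names are placeholders, and the thresholds are GATK's generic suggestions for SNPs, which may differ from the values V2IDA applies.

```shell
# Hypothetical helper, not part of V2IDA.
# Applies a subset of GATK's generic SNP hard-filter thresholds to a VCF.
# Set GATK to override the executable (defaults to "gatk" on PATH).
hard_filter_snps() {
    local ref=$1 in_vcf=$2 out_vcf=$3
    "${GATK:-gatk}" VariantFiltration \
        -R "$ref" \
        -V "$in_vcf" \
        -O "$out_vcf" \
        --filter-expression "QD < 2.0" --filter-name "QD2" \
        --filter-expression "FS > 60.0" --filter-name "FS60" \
        --filter-expression "MQ < 40.0" --filter-name "MQ40" \
        --filter-expression "SOR > 3.0" --filter-name "SOR3"
}
```

Each expression marks matching variants with the given filter name in the output VCF rather than deleting them, so thresholds can be adjusted and compared without losing records.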

When the analysis finishes, V2IDA writes several output files containing metrics. These files can be opened in any web browser or text editor.

Additionally, we suggest using complementary tools on the V2IDA output files to perform SNP effect prediction (e.g. SnpEff) and phylogenetic analysis of the reconstructed quasispecies (e.g. SeaView).

Credits

This pipeline was developed by Andrade, AAS (aandradebio@gmail.com) at the National Laboratory for Scientific Computing - Bioinformatics Laboratory (LABINFO), with contributions from Soares, AER, Almeida, LGP and Vasconcelos, ATR.

Andrade AAS, Soares AER, Paula de Almeida LG, Ciapina LP, Pestana CP, Aquino CL, Medeiros MA, Ribeiro de Vasconcelos AT. Testing the genomic stability of the Brazilian yellow fever vaccine strain using next-generation sequencing data. Interface Focus. 2021 Jun 11;11(4):20200063. doi: 10.1098/rsfs.2020.0063. PMID: 34123353; PMCID: PMC8193464.
