qhery was developed by the Q-PHIRE Genomics team at Forensic and Scientific Services, Queensland Health.
While in development, qhery can only be installed by cloning the git repository:
git clone https://github.com/mjsull/qhery.git
- Python >= 3.9.12
- bcftools >= 1.10.2
- curl >= 7.83.1
- wget >= 1.20.3
- ncbi-blast+ >= 2.9.0+ - This will generate a BLASTx alignment of the genome for visualization
- lofreq >= 2.1.5 - If provided with a BAM file, qhery uses lofreq to look for minor alleles in the alignment
- samtools >= 1.7 - samtools is used to determine the depth of sequence along the genome, and which resistance mutations cannot be reported on due to lack of coverage.
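Before running qhery it can be useful to confirm that the external tools listed above are on the PATH. The following is a minimal sketch, not part of qhery: the function name `missing_tools` is hypothetical, the executable names are assumptions (for example, `blastx` is assumed to be the ncbi-blast+ program used), and it only checks for presence, not for the minimum versions.

```python
import shutil

# Executable names assumed for the external tools listed above
# (blastx is assumed to be the ncbi-blast+ program qhery calls).
REQUIRED_TOOLS = ["bcftools", "curl", "wget", "blastx", "lofreq", "samtools"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH (versions are not checked)."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All external tools found.")
```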
qhery run --database_dir database_dir --vcf sample.vcf --pipeline_dir output_dir --lineage Omicron/BA.1 --sample_name mysample --rx_list Sotrovimab
Determines the amino acid changes caused by the mutations listed in sample.vcf and then compares them to a list of mutations that cause a reduction in Sotrovimab binding.
qhery run --database_dir database_dir --vcf sample.vcf --pipeline_dir output_dir --lineage Omicron/BA.1 --sample_name mysample --rx_list Sotrovimab Remdesivir --fasta sample.consensus.fasta --bam sample.primertrimmed.rg.sorted.bam
Determines the amino acid changes caused by the mutations listed in sample.vcf and additionally uses lofreq to find minor alleles in the BAM file. The resulting mutations are then compared to a list of mutations that cause a reduction in Sotrovimab binding or a reduction in Remdesivir efficacy.
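When qhery is called from a script, for example once per sample, the command line above can be assembled programmatically. This is a sketch under the assumption that the flags behave exactly as in the examples; `build_qhery_run` is a hypothetical helper, not part of qhery.

```python
import shlex


def build_qhery_run(database_dir, vcf, pipeline_dir, lineage, sample_name,
                    rx_list, fasta=None, bam=None):
    """Assemble the argument list for `qhery run` using the flags shown above."""
    cmd = [
        "qhery", "run",
        "--database_dir", database_dir,
        "--vcf", vcf,
        "--pipeline_dir", pipeline_dir,
        "--lineage", lineage,
        "--sample_name", sample_name,
        "--rx_list", *rx_list,
    ]
    # --fasta and --bam are only added when a file is supplied.
    if fasta is not None:
        cmd += ["--fasta", fasta]
    if bam is not None:
        cmd += ["--bam", bam]
    return cmd


cmd = build_qhery_run(
    "database_dir", "sample.vcf", "output_dir", "Omicron/BA.1", "mysample",
    ["Sotrovimab", "Remdesivir"],
    fasta="sample.consensus.fasta",
    bam="sample.primertrimmed.rg.sorted.bam",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the command.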
qhery list_rx --database_dir database_dir
List treatments for which resistance information exists.
qhery produces two tables.
<sample_name>.full.tsv
Contains all mutations detected in the sample and all mutations associated with the treatments listed by the user.
<sample_name>.final.tsv
Contains all mutations detected in the sample that are in genes associated with resistance to the listed treatments and are not lineage-defining mutations. It also contains resistance mutations that do not have enough read depth to be called as present or absent (default 20x read depth).
Both tables have the same format (described below in example output).
Finally, qhery also produces a BLASTx alignment of the query to mature proteins and a bammix plot of the epitopes of the treatments the user listed (if available).
Mutation | alt_names | in_sample | in_variant | covered | resistance_mutation | Remdesivir_average_fold_reduction | Remdesivir_fold_reductions | Remdesivir_in_epitope | Sotrovimab_average_fold_reduction | Sotrovimab_fold_reductions | Sotrovimab_in_epitope |
---|---|---|---|---|---|---|---|---|---|---|---|
E:T9I | - | True | True | True | False | 0 | - | False | 0 | - | False |
M:D3G | - | True | True | True | False | 0 | - | False | 0 | - | False |
N:ERS31-33∆ | - | True | True | False | False | 0 | - | False | 0 | - | False |
ORF3a:L52F | - | True | False | True | False | 0 | - | False | 0 | - | False |
RdRP:802D | - | False | False | True | True | 2.54 | =2.54 | False | 0 | - | False |
S:R214ins | S:R214R_EPE | True | True | True | True | 0 | - | False | 3.00 | =3.0 | False |
S:P337T | - | True | False | True | True | 0 | - | False | 8.00 | =5.4,=10.6 | True |
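The output tables can be filtered with a few lines of Python. The sketch below assumes the file is tab-separated, that the header names match the example table above, and that boolean columns are written as the literal strings `True` and `False`; the function name is hypothetical. It lists resistance mutations that are actually present in the sample.

```python
import csv
import io

# Assumed TSV content: three rows and the first six columns of the example table.
EXAMPLE_TSV = (
    "Mutation\talt_names\tin_sample\tin_variant\tcovered\tresistance_mutation\n"
    "E:T9I\t-\tTrue\tTrue\tTrue\tFalse\n"
    "RdRP:802D\t-\tFalse\tFalse\tTrue\tTrue\n"
    "S:P337T\t-\tTrue\tFalse\tTrue\tTrue\n"
)


def resistance_mutations_in_sample(handle):
    """Return mutations flagged as resistance mutations that are present in the sample."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["Mutation"]
        for row in reader
        if row["in_sample"] == "True" and row["resistance_mutation"] == "True"
    ]


hits = resistance_mutations_in_sample(io.StringIO(EXAMPLE_TSV))
print(hits)
```

For a real run, the `io.StringIO` object would be replaced by an open file handle on the `.final.tsv` or `.full.tsv` output.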
column | header | description |
---|---|---|
1 | mutation | The mutation name. Gene name comes before the colon, then reference amino acid, position and sample amino acid |
2 | alt_names | Discrepancy between database mutation name and csq mutation name |
3 | in_sample | Is mutation in the query |
4 | in_variant | Is mutation a lineage defining mutation |
5 | covered | Is the mutation covered by 20 or more reads |
6 | resistance_mutation | Is there evidence the mutation may confer some resistance to one of the treatments listed |
7 | rx1_average_fold_reduction | Average fold reduction of listed fold reductions |
8 | rx1_fold_reductions | Fold reductions listed in the database |
9 | rx1_in_epitope | Is the mutation in the epitope of this treatment (MABs only) |
10 | rx2_average_fold_reduction | The previous 3 columns repeat for each treatment provided by the user |
11 | rx2_fold_reductions | ... |
12 | rx2_in_epitope | ... |
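The relationship between the `fold_reductions` and `average_fold_reduction` columns can be illustrated with the values from the example output. This is a sketch of the arithmetic only, not qhery's own code; it assumes each entry is a number with an optional comparison prefix such as `=`, and that `-` means no fold reductions are listed.

```python
def parse_fold_reductions(field):
    """Parse a fold-reductions cell such as '=5.4,=10.6' into a list of floats."""
    if field.strip() in ("", "-"):
        return []
    # Assumption: each entry is a number with an optional comparison prefix.
    return [float(item.strip().lstrip("=<>~")) for item in field.split(",")]


def average_fold_reduction(field):
    """Mean of the listed fold reductions, or 0.0 when none are listed."""
    values = parse_fold_reductions(field)
    return round(sum(values) / len(values), 2) if values else 0.0


for cell in ("-", "=2.54", "=5.4,=10.6"):
    print(cell, average_fold_reduction(cell))
```

With the example rows, `=5.4,=10.6` averages to 8.0 and `=2.54` stays at 2.54, matching the values shown in the example table.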
- Need a phasing step between lofreq and bcftools csq (or to switch to a vcf caller that does phasing)
- Add allele frequency information to output table
-h, --help
Show this help message and exit.
List all drugs for which resistance information is available.
Takes no arguments
Determines mutations in samples and then checks against resistance data.
-n, --sample_name <sample_name>
Sample name; output files will be prefixed with this.
-v, --vcf <sample.vcf>
VCF file of variants called against the Wuhan-Hu-1 reference (MN908947.3).
-b, --bam <sample.sorted.bam>
Sorted BAM file of read alignments for the sample, mapped against the Wuhan-Hu-1 reference (MN908947.3).
-d, --database_dir <path/to/database_dir>
Directory containing the latest version of the Stanford resistance database. If the latest version is not in this directory, it will be downloaded to this location.
-p, --pipeline_dir <path/to/pipeline_dir>
All script output and intermediate files will be put here. The script will create the directory if it does not exist.
-l, --lineage <BA.1>
Lineage of the query (BA.1/BA.2/BA.3/Delta etc.).
-rx, --rx_list <Sotrovimab Remdesivir>
List of treatments to interrogate.
--fasta <sample.fasta>
FASTA file of the consensus sequence of the sample; only used to generate a BLASTx alignment for double-checking mutations.
Only list mutations and not resistance information.
-n, --sample_name <sample_name>
Sample name; output files will be prefixed with this.
-v, --vcf <sample.vcf>
VCF file of variants called against the Wuhan-Hu-1 reference (MN908947.3).
-b, --bam <sample.sorted.bam>
Sorted BAM file of read alignments for the sample, mapped against the Wuhan-Hu-1 reference (MN908947.3).
-d, --database_dir <path/to/database_dir>
Directory containing the latest version of the Stanford resistance database. If the latest version is not in this directory, it will be downloaded to this location.
-p, --pipeline_dir <path/to/pipeline_dir>
All script output and intermediate files will be put here. The script will create the directory if it does not exist.
-l, --lineage <BA.1>
Lineage of the query (BA.1/BA.2/BA.3/Delta etc.).
-k, --keep_lineage
Report lineage-defining mutations as well.
A flowchart of how qhery run works