You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I have a vcf file where both the genotype (GT) and the read depth (DP) is given for each sample. I would like to set all genotypes where the read depth is below x (DP<x) to missing (GT-->'.'). For that, I use the bcftools +setGT pluging. However, the filtering for genotypes with a low read depth does not do what it should do if I set the limit x too high.
Here is a minimal working example: bcftools view -H minimal_vcf_file_2_samples_gt_and_dp.vcf.gz
1 1890 . A G 0.2 . . GT:DP 1/1:12 0/0:102 0/1:10 1/1:20
2 1991 . G T 0.9 . . GT:DP 1/1:8 0/1:100 0/0:11 0/0:11
Filled 8 alleles
1 1890 . A G 0.2 . . GT:DP 1/1:12 0/0:102 ./.:10 1/1:20
2 1991 . G T 0.9 . . GT:DP ./.:8 0/1:100 ./.:11 ./.:11`
That is as expected: the genotypes with read depth below 12 are set to missing.
But here is the problem: bcftools +setGT minimal_vcf_file_2_samples_gt_and_dp.vcf.gz -- -t q -n . -e 'FMT/DP>=101' | bcftools view -H
Filled 6 alleles
1 1890 . A G 0.2 . . GT:DP ./.:12 0/0:102 ./.:10 ./.:20
2 1991 . G T 0.9 . . GT:DP 1/1:8 0/1:100 0/0:11 0/0:11
In the second row, no genotypes are filled.
But when I do: bcftools +setGT minimal_vcf_file_2_samples_gt_and_dp.vcf.gz -- -t q -n . -e 'FMT/DP>=100' | bcftools view -H
Filled 12 alleles
1 1890 . A G 0.2 . . GT:DP ./.:12 0/0:102 ./.:10 ./.:20
2 1991 . G T 0.9 . . GT:DP ./.:8 0/1:100 ./.:11 ./.:11
It behaves as expected. Could you help me figure out what is going on?
Maybe it ignores the command if for a specific variant/row no sample is above the threshold?
My version is bcftools --version
bcftools 1.12
Using htslib 1.12
I would appreciate your help, thank you.
UPDATE: I think the problem is that bcftools is not setting any genotypes for a row/variant if the set of excluded samples is empty. That is a problem since instead of setting genotypes for all samples, it does it for no samples. However, by replacing exclusion by an inclusion (-e to -i) and reversing the filtering condition (>= to <), the bug can be fixed: if the set of genotypes to be set is empty, not doing anything is exactly what you want the command to do:
I have a vcf file where both the genotype (GT) and the read depth (DP) is given for each sample. I would like to set all genotypes where the read depth is below x (DP<x) to missing (GT-->'.'). For that, I use the bcftools +setGT pluging. However, the filtering for genotypes with a low read depth does not do what it should do if I set the limit x too high.
Here is a minimal working example:
bcftools view -H minimal_vcf_file_2_samples_gt_and_dp.vcf.gz
bcftools +setGT minimal_vcf_file_2_samples_gt_and_dp.vcf.gz -- -t q -n . -e 'FMT/DP>=12' | bcftools view -H
That is as expected: the genotypes with read depth below 12 are set to missing.
But here is the problem:
bcftools +setGT minimal_vcf_file_2_samples_gt_and_dp.vcf.gz -- -t q -n . -e 'FMT/DP>=101' | bcftools view -H
In the second row, no genotypes are filled.
But when I do:
bcftools +setGT minimal_vcf_file_2_samples_gt_and_dp.vcf.gz -- -t q -n . -e 'FMT/DP>=100' | bcftools view -H
Filled 12 alleles
1 1890 . A G 0.2 . . GT:DP ./.:12 0/0:102 ./.:10 ./.:20
2 1991 . G T 0.9 . . GT:DP ./.:8 0/1:100 ./.:11 ./.:11
It behaves as expected. Could you help me figure out what is going on?
Maybe it ignores the command if for a specific variant/row no sample is above the threshold?
My version is
bcftools --version
I would appreciate your help, thank you.
UPDATE: I think the problem is that bcftools is not setting any genotypes for a row/variant if the set of excluded samples is empty. That is a problem since instead of setting genotypes for all samples, it does it for no samples. However, by replacing exclusion by an inclusion (
-e
to-i
) and reversing the filtering condition (>=
to<
), the bug can be fixed: if the set of genotypes to be set is empty, not doing anything is exactly what you want the command to do:bcftools +setGT minimal_vcf_file_2_samples_gt_and_dp.vcf.gz -- -t q -n . -i 'FMT/DP<101' | bcftools view -H
It might still be good to fix this bug though.
The text was updated successfully, but these errors were encountered: