enhancement: bcftools +contrast could work in a Map/Reduce framework if allowed to print NASSOC results even when the number of cases or controls is zero #1566

freeseek · 2021-08-24T16:07:34Z

I am trying to implement a Fisher's exact test analysis for rare variants in a large biboank. The data is split across batches but since Fisher's exact only needs counting to work, each batch could be analyzed independently and the output could be combined across batches without having to merge the batches and allowing the computation to be highly parallel.

I thought that I could use bcftools +contrast and bcftools merge as follows:

bcftools +contrast -a NASSOC -0 controls.txt -1 cases.txt --force-samples A.vcf.gz -o A.nassoc.vcf
bcftools +contrast -a NASSOC -0 controls.txt -1 cases.txt --force-samples B.vcf.gz -o B.nassoc.vcf
bcftools merge -i NASSOC:sum -m none --no-index A.nassoc.vcf B.nassoc.vcf -o nassoc.vcf

This was exciting as I could do this re-using existing infrastructure. However, there is only one minor drawback. Currently bcftools +contrast errors out when either the number of cases or controls is zero, which makes sense if you are interested in the p-value statistics from a single batch, but it makes less sense if you are using the counts in a larger multi-batch framework.

This drawback would be very easy to fix by changing the following three lines in contrast.c:

...
if ( !nlist ) error("None of the samples are present in the VCF: %s\n", str);
...
if ( !control_als )
...
if ( !has_gt )
...

To something like:

...
if ( !nlist ) fprintf(stderr,"Warning: None of the samples are present in the VCF: %s\n", str);
...
if ( !control_als && !(args->annots & PRINT_NASSOC) )
...
if ( !has_gt && !(args->annots & PRINT_NASSOC) )
...

(the warning could be kept as an error when the flag PRINT_NASSOC is off)

The text was updated successfully, but these errors were encountered:

pd3 · 2021-08-26T08:44:57Z

I added this, hope it helps

freeseek · 2021-08-26T14:04:31Z

Super! This is so helpful. Thank you @pd3

pd3 closed this as completed in cc737e8 Aug 26, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

enhancement: bcftools +contrast could work in a Map/Reduce framework if allowed to print NASSOC results even when the number of cases or controls is zero #1566

enhancement: bcftools +contrast could work in a Map/Reduce framework if allowed to print NASSOC results even when the number of cases or controls is zero #1566

freeseek commented Aug 24, 2021

pd3 commented Aug 26, 2021

freeseek commented Aug 26, 2021

enhancement: bcftools +contrast could work in a Map/Reduce framework if allowed to print NASSOC results even when the number of cases or controls is zero #1566

enhancement: bcftools +contrast could work in a Map/Reduce framework if allowed to print NASSOC results even when the number of cases or controls is zero #1566

Comments

freeseek commented Aug 24, 2021

pd3 commented Aug 26, 2021

freeseek commented Aug 26, 2021