
GATKv3 vs GATKv4 Genotype Concordance

tnguyensanger edited this page Jul 27, 2020 · 1 revision

In the Plasmodium falciparum data releases v6.0 and v7.0, we called genotypes with GATK v3.8 and GATK v4.1.4.0, respectively. We analyzed genotype concordance between the two releases, restricted to the sites and samples present in both: biallelic PASS SNPs with the same alleles in both releases, and samples that passed QC. We noticed two oddities:

  1. Some genotypes were called as missing by one version of GATK but as HomRef by the other. In 8% of these discordant calls, the version that left the genotype missing did so even though the allele depths strongly supported a HomRef call (Ref AD >= 10 and Alt AD = 0).

  2. Ref AD appears to be artificially capped at 30 for many genotypes in GATK v4. We saw this artifact in GATK v3.6 as well.
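
The two checks above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the `Call` record, the function names, and the toy data are all hypothetical, and it assumes the genotype and AD values have already been extracted from the two VCFs for the shared sites and samples. The thresholds (Ref AD >= 10, Alt AD = 0, cap of 30) are the ones quoted in the list.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Call(NamedTuple):
    """One genotype call for one site and sample (hypothetical record)."""
    gt: Optional[Tuple[int, int]]  # None stands for a missing genotype (./.)
    ref_ad: int                    # reads supporting the reference allele
    alt_ad: int                    # reads supporting the alternate allele


def count_missing_vs_homref(pairs: Iterable[Tuple[Call, Call]],
                            min_ref_ad: int = 10) -> Tuple[int, int]:
    """Count missing-vs-HomRef discordances between two call sets.

    Returns (discordant, well_supported), where well_supported counts the
    discordant pairs in which the missing call nevertheless has
    Ref AD >= min_ref_ad and Alt AD == 0.
    """
    discordant = 0
    well_supported = 0
    for call_a, call_b in pairs:
        # Check both directions: either version may be the one that is missing.
        for missing, other in ((call_a, call_b), (call_b, call_a)):
            if missing.gt is None and other.gt == (0, 0):
                discordant += 1
                if missing.ref_ad >= min_ref_ad and missing.alt_ad == 0:
                    well_supported += 1
    return discordant, well_supported


def fraction_at_cap(calls: Iterable[Call], cap: int = 30) -> float:
    """Fraction of calls whose Ref AD is exactly `cap`.

    An unusually large fraction at one value hints at an artificial cap.
    """
    ref_ads = [call.ref_ad for call in calls]
    if not ref_ads:
        return 0.0
    return sum(1 for depth in ref_ads if depth == cap) / len(ref_ads)


# Toy data, made up for illustration: (GATK v3 call, GATK v4 call) per pair.
toy_pairs = [
    (Call(None, 12, 0), Call((0, 0), 15, 0)),     # missing despite strong Ref support
    (Call((0, 0), 8, 0), Call(None, 3, 0)),       # missing with weak support
    (Call((0, 0), 30, 0), Call((0, 0), 30, 0)),   # concordant HomRef
    (Call(None, 0, 0), Call(None, 0, 0)),         # missing in both
]

discordant, well_supported = count_missing_vs_homref(toy_pairs)
v4_cap_fraction = fraction_at_cap(v4 for _, v4 in toy_pairs)
```

On real data, the ratio `well_supported / discordant` corresponds to the 8% figure in item 1, and a histogram of Ref AD (rather than the single fraction computed here) would show the spike at 30 described in item 2 more directly.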

The oddities are discussed in more detail in this presentation.
