IntelDeflater intermittently fails to properly compress outputs with GKL 0.8.8 #177

droazen · 2023-01-30T18:58:22Z

As reported by @kachulis in broadinstitute/gatk#8141, we are finding that the IntelDeflater in GKL version 0.8.8 seems to intermittently fail to properly compress outputs:

I run PrintReads over and over again on the same input data, not doing anything, just read in, write out, i.e. gatk PrintReads -I input.bam -O output.bam. Mostly, I just get an identical 9GB bam over and over again (as confirmed by md5). However, sometimes (~10% of the time, it seems), I get a MUCH larger “bam”, more like ~45GB. In runs where I get these larger output files, they are not always the same size: sometimes 45GB, sometimes 47GB (still always with the same input file, same command line, same WDL task, etc.). The runs that produce these larger bams also take much longer, with a slower “reads per minute” rate. They report exactly the same number of reads processed in the logs as the “normal” runs.

Looking inside the large output “bams” with gsutil cat, I see the header suddenly transitioning from compressed-looking gibberish to a plaintext header, and then after a bit back to compressed-looking gibberish again. Additionally, if I run these large bams through samtools view to get samtools to write them as a bam (i.e. samtools view big.bam -o samtools_out.bam), the resulting bam is much smaller, ~6GB. It kind of seems like sometimes gatk will just stop compressing the output, and then start back up again, seemingly randomly??
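One way to locate poorly compressed regions like the ones described in the report is to walk the BGZF blocks of the output and compare each block's on-disk size with the uncompressed size recorded in its trailer. The sketch below is an illustration only: it assumes the standard BGZF layout (a gzip member with a `BC` extra subfield holding the block size minus one), the helper names `make_bgzf_block` and `block_sizes` are hypothetical, and it works on bytes already in memory, so a multi-gigabyte file would need a streaming variant.

```python
import struct
import zlib


def make_bgzf_block(payload: bytes, level: int = 6) -> bytes:
    """Build one BGZF block so the scanner below has something to read."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)  # raw DEFLATE stream
    cdata = comp.compress(payload) + comp.flush()
    bsize = 12 + 6 + len(cdata) + 8 - 1  # total block length minus one
    header = struct.pack("<BBBBIBBH", 31, 139, 8, 4, 0, 0, 255, 6)
    extra = struct.pack("<BBHH", 66, 67, 2, bsize)  # 'B', 'C', SLEN=2, BSIZE
    trailer = struct.pack("<II", zlib.crc32(payload), len(payload))
    return header + extra + cdata + trailer


def block_sizes(data: bytes):
    """Yield (bytes_on_disk, uncompressed_bytes) for each BGZF block."""
    offset = 0
    while offset < len(data):
        id1, id2, cm, flg, _mtime, _xfl, _os, xlen = struct.unpack_from(
            "<BBBBIBBH", data, offset)
        if (id1, id2, cm) != (31, 139, 8) or not flg & 4:
            raise ValueError(f"not a BGZF block at offset {offset}")
        # Walk the gzip extra subfields looking for the 'BC' entry.
        pos, end, bsize = offset + 12, offset + 12 + xlen, None
        while pos < end:
            si1, si2, slen = struct.unpack_from("<BBH", data, pos)
            if (si1, si2, slen) == (66, 67, 2):
                bsize = struct.unpack_from("<H", data, pos + 4)[0]
            pos += 4 + slen
        if bsize is None:
            raise ValueError(f"no BC subfield at offset {offset}")
        block_len = bsize + 1
        # ISIZE (uncompressed length) is the last 4 bytes of the block.
        isize = struct.unpack_from("<I", data, offset + block_len - 4)[0]
        yield block_len, isize
        offset += block_len


# Demo: one normally compressed block followed by one stored (level 0) block.
payload = b"ACGT" * 4000
stream = make_bgzf_block(payload, level=6) + make_bgzf_block(payload, level=0)
for block_len, isize in block_sizes(stream):
    if isize:  # the empty end-of-file block has ISIZE 0
        print(f"{block_len} bytes on disk for {isize} bytes of data "
              f"(ratio {block_len / isize:.2f})")
```

Blocks whose ratio sits close to 1.0 are effectively uncompressed, which would be consistent with plaintext showing up in the middle of the file.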

This does not occur with GKL 0.8.6, and seems to have been introduced by the upgrade from GKL 0.8.6 to GKL 0.8.8 in https://github.com/broadinstitute/gatk/pull/7203/files

Additional data points:

Reproducible on very small files (at about the same rate of ~10%)
Appears to be related to the IntelDeflater: when running with the JDK deflater (--use-jdk-deflater), all 100/100 runs result in the same-sized bam
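The repeated-run check described above (same input, many outputs, compare md5) can be sketched as follows. This is a minimal illustration, not the script actually used for the report: `md5_of` and `summarize_runs` are hypothetical helper names, and the outputs are assumed to have been written to separate files beforehand.

```python
import hashlib
from collections import Counter
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large BAMs do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def summarize_runs(paths):
    """Return (Counter of digests, list of paths that differ from the majority)."""
    digests = {Path(p): md5_of(p) for p in paths}
    counts = Counter(digests.values())
    if not counts:
        return counts, []
    most_common_digest, _ = counts.most_common(1)[0]
    outliers = [p for p, d in digests.items() if d != most_common_digest]
    return counts, outliers
```

For example, `summarize_runs(sorted(Path("runs").glob("*/output.bam")))` would tally the outputs of a batch of runs, where the `runs/` directory layout is hypothetical. With the JDK deflater the expectation from the data point above is a single digest and no outliers.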

Any help would be much appreciated! This is actually a rather serious issue for us that might force us to temporarily revert back to the JDK deflater or the older GKL release if it looks like it might be difficult to diagnose / fix.

(CC @lbergelson)

kdhanala · 2023-01-30T20:41:48Z

Thank you for reaching out. I noticed that between GKL 0.8.6 and 0.8.8, ISA-L was upgraded from 2.21 to 2.30, but since the failure is intermittent we will first try to reproduce it on our end using a small BAM file (~9GB) like the one mentioned in the original issue. We will use a 7GB file from our old long reads list (PAE09121_dae79b.bam.raw) and run it ~100 times to check for any anomalies in output sizes.

mateuszsnowak · 2023-04-14T13:10:07Z

ISA-L added a new configuration field (isal_zstream->hist_bits) between versions 2.21 and 2.30, which wasn't initialized to its default value by the isal_deflate_stateless_init call. This was fixed by a recent commit in ISA-L (intel/isa-l@9f2b68f).

When there were multiple simultaneous allocations of memory using malloc, the OS kernel could provide a previously used memory page that contained a non-default value at the location corresponding to the hist_bits field. Values of hist_bits between 1 and 14 (especially lower ones) could reduce compression efficiency for that particular compression session. Other values (including 0) are replaced by ISA-L with 15, which is the default and most efficient setting.
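To get a feel for why a small history window hurts compression, the effect can be reproduced by analogy with Python's zlib module, whose `wbits` parameter plays a similar role (base-2 log of the history window). This is an analogy only: it uses zlib rather than ISA-L, and the test data (a random 4096-byte unit repeated 16 times) is constructed so that the repeats are only visible to a window larger than 4096 bytes.

```python
import random
import zlib


def compressed_size(data: bytes, wbits: int) -> int:
    """Compress with zlib at level 6 using a 2**wbits byte history window."""
    comp = zlib.compressobj(6, zlib.DEFLATED, wbits)
    return len(comp.compress(data) + comp.flush())


# Incompressible 4096-byte unit, repeated: matches exist only 4096 bytes back.
rng = random.Random(0)
unit = bytes(rng.getrandbits(8) for _ in range(4096))
data = unit * 16

for wbits in (9, 13, 15):
    print(f"wbits={wbits:2d}: {compressed_size(data, wbits)} bytes "
          f"from {len(data)} bytes of input")
```

With `wbits=9` the window is too short to reach the previous copy of the unit, so the output stays close to the input size, while the larger windows find the repeats and shrink the data considerably.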

Using calloc instead of malloc to allocate and fill with zeroes the isal_zstream struct also fixes this issue and prevents similar issues happening in the future.
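The failure mode and the fix can be modelled in a few lines using ctypes. This is a toy model under stated assumptions, not the GKL or ISA-L code: `ToyStream` is a hypothetical two-field stand-in for isal_zstream (the real struct has many more fields), `incomplete_init` stands in for an init routine that never writes hist_bits, and `malloc_like` / `calloc_like` only imitate recycled versus zero-filled memory.

```python
import ctypes


class ToyStream(ctypes.Structure):
    # Hypothetical stand-in; NOT the real isal_zstream layout.
    _fields_ = [("level", ctypes.c_uint32), ("hist_bits", ctypes.c_uint16)]


def incomplete_init(stream: ToyStream) -> None:
    """Mimic an init routine that sets older fields but never hist_bits."""
    stream.level = 0


def effective_hist_bits(raw: int) -> int:
    """Rule described above: 1..14 are honoured, anything else becomes 15."""
    return raw if 1 <= raw <= 14 else 15


def malloc_like(previous_owner: ToyStream) -> bytearray:
    """Recycled memory that still holds whatever the previous owner wrote."""
    return bytearray(bytes(previous_owner))


def calloc_like() -> bytearray:
    """Fresh memory guaranteed to be zero-filled."""
    return bytearray(ctypes.sizeof(ToyStream))


def open_stream(memory: bytearray) -> ToyStream:
    stream = ToyStream.from_buffer(memory)  # overlay the struct on the memory
    incomplete_init(stream)
    return stream


dirty = open_stream(malloc_like(ToyStream(level=9, hist_bits=3)))
clean = open_stream(calloc_like())
print(effective_hist_bits(dirty.hist_bits))  # prints 3: stale value survives
print(effective_hist_bits(clean.hist_bits))  # prints 15: zero maps to default
```

Zero-filling works here because 0 is one of the values that gets replaced by the default of 15, so any field an init routine forgets still ends up in a safe state.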

lbergelson · 2023-04-14T14:47:44Z

@mateuszsnowak Yay!

droazen changed the title ~~IntelDeflater intermittently fail to properly compress outputs with GKL 0.8.8~~ IntelDeflater intermittently fails to properly compress outputs with GKL 0.8.8 Jan 30, 2023

kdhanala assigned mateuszsnowak Apr 4, 2023

mateuszsnowak mentioned this issue Apr 13, 2023

Use calloc instead of malloc when initializing ISA-L #178

Merged

mateuszsnowak closed this as completed in #178 Apr 14, 2023
