Switch to GATK PairedEndAndSplitReadEvidenceCollection for PESR collection #34
This switches the pipeline to using the GATK tool for PESR collection.
The GATK tool produces results that differ from `svtk` only in (1) the sort order of the discordant reads file: read pairs are now sorted in sequence dictionary order, with a secondary sort on the position of the second read in the pair; and (2) changes in the split read file on HLA and other small alt contigs due to fixing #24.

I've tested several runs of the single sample pipeline with this change. The number of variants changes slightly from runs that used `svtk`, and some variants change their position, but the exact set of variants that change differs from run to run, so I'm chalking that up to non-deterministic behavior in downstream steps of the pipeline.
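
For anyone who wants to check the sort-order claim on their own outputs, here is a minimal Python sketch, not code from this PR. It assumes each discordant-pair record has already been parsed into a tuple of `(contig1, pos1, strand1, contig2, pos2, strand2, sample)`; that layout, the function names, and the choice to compare the second read by contig rank and then position are all illustrative assumptions.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

# Assumed record layout (hypothetical, for illustration only):
# (contig1, pos1, strand1, contig2, pos2, strand2, sample)
Record = Tuple[str, int, str, str, int, str, str]


def sort_discordant_pairs(records: Sequence[Record],
                          contig_order: Sequence[str]) -> List[Record]:
    """Sort pairs by sequence dictionary order of the first read,
    then by the position of the second read in the pair."""
    # Rank contigs by their index in the sequence dictionary rather than
    # lexicographically, so that e.g. chr2 sorts before chr10.
    rank: Dict[str, int] = {name: i for i, name in enumerate(contig_order)}
    return sorted(records,
                  key=lambda r: (rank[r[0]], r[1], rank[r[3]], r[4]))


def same_records_ignoring_order(a: Sequence[Record],
                                b: Sequence[Record]) -> bool:
    """True if both collections hold the same records with the same
    multiplicities, regardless of ordering."""
    return Counter(a) == Counter(b)
```

Comparing the two outputs with `same_records_ignoring_order` and then checking that the new output equals `sort_discordant_pairs` applied to itself would confirm that only the ordering differs, under the assumptions above.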