Releases: broadinstitute/gatk-sv
v1.0
Welcome to v1.0 of GATK-SV! With this release, we are deprecating MELT in favor of Scramble, adding downstream filtering and refinement steps to the core pipeline and Terra workspace, moving our documentation to our website, and getting ready to release our public Terra workspace for the joint calling mode of GATK-SV.
Updates include
Pipeline functionality
- Improve Scramble accuracy for BWA and Dragen 3.7.8
- Refine complex variants and translocations
- Filter wham-only DELs and scramble-only SVAs in CleanVcf
- Update GATK to 4.6.0.0 in sv_pipeline_docker
- Resolve incorrect mCNV genotype counts for hom-alt
- Updates to EvidenceQC for sample QC analysis
- Enhance EvidenceQC outputs to better align with sample QC
- Add Scramble metrics to the qc table generated in EvidenceQC
- Rename high & low overall outliers columns generated via EvidenceQC
- Leave empty genotypes in CleanVcf part 5
- Add check for duplicate records to MainVcfQc
- Update type-checking in RdTestV2 to be class-based
Documentation
- Move documentation to website
- Update docs on creating sample_set_set
- Update GatherSampleEvidence & TrainGCNV docs
- Update the landing page of the docs
- Add genotype filtering Terra workflow configs and documentation
- Include SVTK documentation in README
For developers
- Deprecate Google Cloud configuration JSONs
- Add the single-sample pipeline to the Dockstore automation
Full Changelog: v0.29-beta...v1.0
v0.29-beta
What's Changed
Critical
This release includes a critical bug fix to the SplitVariants task in GenotypeBatch. The affected workflow was GenotypeBatch, and the affected version was v0.28.5-beta. We recommend updating to v0.29-beta immediately. If you ran GenotypeBatch with v0.28.5-beta, please check whether any records were dropped. If in doubt, re-run with v0.29-beta. More details are in #712.
Pipeline functionality updates
- Integrate ReshardVcf into ResolveComplexVariants
- Remove CHR2 and END2 from INS in CleanVcf
- Fix --par arg to compute_AFs.py in ShardedAnnotateVcf
- Updates to allele frequency annotation fields
- Group MEIs with insertions in splitvariants.py
Performance improvements
- Reimplement ParseGenotypes in GenotypeComplexVariants
- Reduce memory usage in GenotypeSRPart1
- Set the default disk size in AnnotateIntervals as a function of input file size
- Make storing the VaPoR plots as a final output optional
- Make per-sample QC plots optional in MainVcfQc
Fixing bugs and small annoyances
- Skip subsampling if batch size is less than n_samples_subsample
- Add NonZeroReferenceLengthAlignmentReadFilter read filter to CollectSVEvidence
- Prevent sample ID mangling in WGD computation
- Update gatk docker with changes to handle CPX_TYPE for CTX
- Fix UnboundLocalError in EvidenceQc
- Update gnomad-v2 sample-level benchmarking data path
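The subsampling guard from the first item in this list can be illustrated with a minimal Python sketch. The function name `select_samples` and the `seed` argument are hypothetical; only the `n_samples_subsample` name comes from the notes above, and the pipeline's actual implementation may differ.

```python
import random


def select_samples(sample_ids, n_samples_subsample, seed=0):
    """Return a subsample of sample_ids, or all of them if the batch is small.

    Hypothetical helper: assumes sample IDs are unique within a batch.
    """
    # Skip subsampling when the batch has fewer samples than requested;
    # random.sample would raise ValueError in that case.
    if len(sample_ids) < n_samples_subsample:
        return list(sample_ids)
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    chosen = set(rng.sample(list(sample_ids), n_samples_subsample))
    # Preserve the original sample order in the output.
    return [s for s in sample_ids if s in chosen]
```

With this guard, a batch of 3 samples and `n_samples_subsample=5` is returned unchanged instead of raising an error.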
Documentation
- Terra dashboard updates
- Reorganize docs on running the pipeline
- Add CONTRIBUTING.md
- Update Docusaurus and its dependencies to v3.3.2
- Update docs on building and hosting Docker images
- Update docs on building inputs
- Add documentation for rename_samples in GatherBatchEvidence
CI/CD, workflow organization, and auxiliary scripts
- Deprecate single-batch Terra configs
- Add support for incomplete workflows in get_inputs_outputs.py
- Trigger WDL tests on changes to /inputs and fix Terra config tests
- Update deprecated Pandas append operation for monitoring log analysis
- Extend the list of supported syntax-highlighting languages
Full Changelog: v0.28.5-beta...v0.29-beta
v0.28.5-beta
Updates
GatherSampleEvidence
- Copy or move the files instead of creating symlinks in the LocalizeReads task for compatibility with CoA/TES
- Make LocalizeReads optional
- Make manta region bed index required
- Make MELT scripts independent of the execution path
Main pipeline
- Rewrite SplitVariants in TasksGenotypeBatch.wdl
- Disable MakeCohortVcfMetrics by default
- Find CN field for mCNVs in MainVcfQc without hard-coding order of format fields
- Update BEDTools version to 2.31.0
- Two small bugfixes to EvidenceQc
- Update gatk_docker with CPX annotation changes to SVAnnotate
- Reduce redundant docker image inputs: sv_pipeline_base_docker, sv_pipeline_hail_docker, sv_pipeline_updates_docker, and sv_pipeline_rdtest_docker are all now just sv_pipeline_docker
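As an illustration of looking up the CN field without hard-coding the order of FORMAT fields, here is a minimal Python sketch. The helper name `get_format_value` is hypothetical and is not taken from MainVcfQc; it only relies on the VCF convention that the colon-separated FORMAT column names the per-sample fields in order.

```python
def get_format_value(format_col, sample_col, key):
    """Look up a per-sample value by FORMAT key rather than by fixed position.

    Hypothetical helper for illustration; returns None when the key is absent
    or when trailing fields were dropped from the sample column.
    """
    keys = format_col.split(":")
    values = sample_col.split(":")
    if key not in keys:
        return None
    idx = keys.index(key)
    # VCF allows trailing fields to be omitted from a sample column.
    return values[idx] if idx < len(values) else None


# The CN value is found regardless of where CN sits in the FORMAT string.
cn_a = get_format_value("GT:CN:CNQ", "./.:3:99", "CN")
cn_b = get_format_value("GT:GQ:CNQ:CN", "./.:50:99:3", "CN")
```

Both calls return the string `"3"`, whereas an index-based lookup would break as soon as the FORMAT order changed.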
Downstream filtering
- Reduce SVConcordance memory usage
- Release AoU filtering model
- Make the ploidy table a workflow output instead of an intermediate
- Change Vapor bed preprocessing errors to warnings
- Genotype filtering training labels and cutoff optimization
Misc.
- Add ReshardVcf workflow
- Set up automatic updates to WDLs on Dockstore
- Trivial change to sv-pipeline-virtual-env
- Add a preview version of the de-novo pipeline
Full Changelog: v0.28.4-beta...v0.28.5-beta
v0.28.4-beta
Updates include
SV discovery algorithms:
- Update manta to 1.6.0
- Optimize Scramble
New functionality:
- Add PED file validation to GatherBatchEvidence and as standalone script
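The kind of checks a PED validator might perform can be sketched as follows. This is not the standalone script shipped with the pipeline: the function name `validate_ped_lines` and the specific rules (six whitespace-separated columns, unique sample IDs, sex coded as 0, 1, or 2) are assumptions based on the common PED layout of family ID, sample ID, father ID, mother ID, sex, and phenotype.

```python
def validate_ped_lines(lines):
    """Return a list of error messages for PED lines (empty list if valid).

    Hypothetical sketch; the rules here are assumptions, not the pipeline's.
    """
    errors = []
    seen_ids = set()
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        if len(fields) != 6:
            errors.append(
                f"line {line_no}: expected 6 columns, found {len(fields)}"
            )
            continue
        _family_id, sample_id, _father_id, _mother_id, sex, _phenotype = fields
        if sample_id in seen_ids:
            errors.append(f"line {line_no}: duplicate sample ID {sample_id}")
        seen_ids.add(sample_id)
        # Assumed coding: 1 = male, 2 = female, 0 = unknown.
        if sex not in {"0", "1", "2"}:
            errors.append(f"line {line_no}: unexpected sex code {sex}")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue in the file at once.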
Documentation and CI/CD:
- Add Scramble docker to the build script & update docs
- Clarify single-batch vs. multi-batch workflows in cohort mode Terra workspace on dashboard
Updates and bug fixes:
- Remove redundant GetPed task in GatherBatchEvidence.CNMOPS
- Make genome_tracks optional in FilterGenotypes
- Toggle Collect PESR/Counts in GatherSampleEvidence with booleans
- Fix allosomes_list bug
- Filter mCNVs under 5kbp
- Remove hgdp_1kgp_ped test batch input from MainVcfQc template
- Fix melt insert size input type in EvidenceQC
- Correct use of optimize_vcf_records_per_shard in FilterGenotypes
- Update JoinRawCalls formatter arg in template
- Check for empty scatter in GenotypeDepthPart2
- Make the Cromwell root config in the MELT workflow portable
Full Changelog: v0.28.3-beta...v0.28.4-beta
v0.28.3-beta
Updates include
- Remove CHR2 and END2 for CPX at the end of CleanVcf
- Add TSV rename map option to RenameVcfSamples
- Add required index files to hg38 resources
- Add Docker images build instructions to the docs
- Add PlotSVCountsPerSample subworkflow to the end of ClusterBatch and FilterBatchSites
Full Changelog: v0.28.2-beta...v0.28.3-beta
v0.28.2-beta
Updates include
- Add high-level description of docker images and their build process
- Remove max GQ filtering from per sample QC
- Update protein-coding GTF to MANE v1.2
- Reference panel resources and critical bug fixes
- Add FilterGenotypes workflow
- Remove use of the file command
- Move PESR genotyping filter statuses to info fields
- Fix bug in building the samtools-cloud docker
- Fix GitHub Actions out-of-disk error
- Sharded AnnotateVcf workflow
- Add Terra validation to GitHub Actions
- Resolve new linting errors
- Deduplicate records and CPX variant IDs in ResolveComplexVariants
Full Changelog: v0.28.1-beta...v0.28.2-beta
v0.28.1-beta
This release incorporates the updated docker images from v0.28-beta.
Updates
- Fix a bug committing changes to dockers_*.json after an image is updated
- Delete some commented lines in drop_empty_records.py to prompt docker build
Full Changelog: v0.28-beta...v0.28.1-beta
v0.28-beta
Note: v0.28-beta does not have updated docker images. The docker image updates are included in v0.28.1-beta, so use that version instead.
What's changed
Improvements to SR genotyping and INS breakpoints:
- Update SR counting and genotyping
Filtering workflows:
- Update ApplyManualVariantFilter json template
- Update VaPoR workflows
- SVConcordance workflows update
- Fix CTX END2 error in GATK formatting script
Other workflows:
- Fix non-deterministic errors in GetSampleIdsFromVcfTar
- Make the RunMELT task a little more robust
- Improve STR workflow
- Remove DeleteIntermediateFiles task
- Remove unused ped_file input from GenotypeBatch and RegenotypeCNVs
Docker build:
- Manually install MASS R package in sv-pipeline-virtual-env
- Build and publish images to multiple registries in one job, & update the dependencies of the action
Other scripts, docs, and JSONs:
- Update copy_outputs.py
- Generate Terra workspace tsv files from transposed tables
- Update README to link to SV callers used
- Remove PED file from workspace data TSV for cohort mode Terra workspace
Full Changelog: v0.27.3-beta...v0.28-beta
v0.27.3-beta
What's Changed
- Update README for docker build script
- Fix for tiny shard of IntegrateGQ in single sample pipeline
- Set minimum PE count of 1 during genotyping
Full Changelog: v0.27.2-beta...v0.27.3-beta
v0.27.2-beta
What's Changed
- Add missing space in functional annotation to enable optional settings
- Eliminate cram to bam conversion when possible
- Add ref panel inputs for MakeCohortVcf subworkflows
- Extend STR workflow to collect additional locus-level metrics
- Change ref allele to N if unsupported during vcf standardization
- Add sample renaming for SD files in GatherBatchEvidence
- Remove vcf header contig sorting in CleanVcf5
- Add support for building dockers for multiple registries
- Remove non-public images from the git-sha-based target determination in docker build script
Full Changelog: v0.27.1-beta...v0.27.2-beta