Releases: bioinform/somaticseq
Check for VCF sorting
- The program is now designed to crash if the VCF file(s) are not sorted according to the reference FASTA file.
- Output is identical to the previous version, as long as the VCF input files are sorted correctly.
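The sorting requirement above can be illustrated with a minimal sketch. This is not SomaticSeq's own check; the function names `contig_order_from_fai` and `assert_vcf_sorted` are hypothetical, and the sketch assumes the contig order is taken from the reference's `.fai` index (first column, in file order).

```python
def contig_order_from_fai(fai_lines):
    """Map contig name -> rank, following the line order of a .fai index."""
    names = [line.split("\t")[0] for line in fai_lines if line.strip()]
    return {name: rank for rank, name in enumerate(names)}


def assert_vcf_sorted(vcf_lines, contig_rank):
    """Raise ValueError at the first record that is out of reference order."""
    previous = None
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos = line.split("\t")[:2]
        if chrom not in contig_rank:
            raise ValueError(f"contig {chrom!r} is not in the reference index")
        key = (contig_rank[chrom], int(pos))
        if previous is not None and key < previous:
            raise ValueError(f"VCF not sorted at {chrom}:{pos}")
        previous = key
```

Failing fast like this mirrors the behavior described above: an unsorted input stops the run instead of silently producing misaligned results.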
Make compatible with .cram
- No guarantee that .cram files are compatible with the individual mutation callers.
- Also fixed a bug where variants called by Strelka only were not considered, though this would not change the results much as Strelka-only somatic calls are very rare.
minor changes and bug fixes
- Without --gatk $PATH/TO/GenomeAnalysisTK.jar in the SomaticSeq.Wrapper.sh script, it will use utilities/getUniqueVcfPositions.py and utilities/vcfsorter.pl (in lieu of GATK3 CombineVariants) to combine all the VCF files.
- Fixed bugs in the docker/singularities scripts where extra arguments for the callers were not correctly passed on to the callers.
- Otherwise does not change results from previous version.
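As a rough illustration of combining caller VCFs without GATK3 CombineVariants, the sketch below collects the union of variant sites across several VCFs and sorts them by contig order. It is an assumption about the general idea, not the actual logic of utilities/getUniqueVcfPositions.py or utilities/vcfsorter.pl; the names `unique_variant_sites` and `sort_sites` are hypothetical.

```python
def unique_variant_sites(vcf_streams):
    """Collect unique (chrom, pos, ref, alt) tuples from several VCF line iterables."""
    seen = set()
    for lines in vcf_streams:
        for line in lines:
            if line.startswith("#") or not line.strip():
                continue  # skip header and blank lines
            chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
            # Split multi-allelic records so each alternate allele is one site.
            for allele in alt.split(","):
                seen.add((chrom, int(pos), ref, allele))
    return seen


def sort_sites(sites, contig_order):
    """Sort sites by the given contig order, then position, REF, and ALT."""
    rank = {name: i for i, name in enumerate(contig_order)}
    return sorted(sites, key=lambda s: (rank[s[0]], s[1], s[2], s[3]))
```

A call such as `sort_sites(unique_variant_sites([caller_a_lines, caller_b_lines]), ["chr1", "chr2"])` would yield one entry per distinct site, regardless of how many callers reported it.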
minor improvements
- Added another feature: consistent/inconsistent calls for paired reads if the position is covered by both forward and reverse reads. However, they're excluded as training features in the SomaticSeq.Wrapper.sh script for the time being.
- Changed non-GCTA characters to N in the VarDict.vcf file to make it conform to VCF file specifications.
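The base-sanitizing step above might look something like the following sketch. It is not the project's code; `sanitize_bases`, `sanitize_allele`, and `sanitize_vcf_line` are hypothetical names, and the sketch assumes that symbolic alleles (`<DEL>`), breakends, `*`, and `.` should be left untouched.

```python
import re

# Any character that is not A, C, G, T, or N (either case) gets replaced.
_NON_ACGTN = re.compile(r"[^ACGTNacgtn]")


def sanitize_bases(sequence):
    """Replace IUPAC ambiguity codes and other non-ACGT characters with N."""
    return _NON_ACGTN.sub("N", sequence)


def sanitize_allele(allele):
    """Sanitize a plain-sequence allele; leave symbolic or special alleles alone."""
    if allele in (".", "*") or any(c in allele for c in "<>[]"):
        return allele
    return sanitize_bases(allele)


def sanitize_vcf_line(line):
    """Sanitize the REF and ALT columns of one VCF data line."""
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    fields[3] = sanitize_allele(fields[3])
    fields[4] = ",".join(sanitize_allele(a) for a in fields[4].split(","))
    return "\t".join(fields)
```

Note that `sanitize_vcf_line` drops the trailing newline, so the caller is expected to add it back when writing the file.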
maintenance release
- Optimized memory for singularity scripts
- Updated bamQC.py and added trimSoftClippedReads.py in utilities
- Added some dockered scripts at utilities/dockered_pipelines/QC
- No change to core SomaticSeq algorithm
Incorporated TNscope output
- Incorporated TNscope's output VCF into SomaticSeq, although it's not a part of the dockerized somatic mutation workflow.
Added singularity-compatible scripts
For the paired tumor-normal workflow and the BAM simulation workflow, singularity-compatible scripts are located at utilities/singularities, with the same commands as the dockerized workflows at utilities/dockered_pipelines.
minor updates for pipeline scripts
- Added options to pass extra parameters to the somatic mutation callers. Fixed a bug where the "two-pass" parameter was not passed on to Scalpel in the multiThreads scripts (although I have extensively tested the --two-pass parameter and found it to have ZERO effect).
- Ignore Strelka_QSS and Strelka_TQSS for indel training in the SomaticSeq.Wrapper.sh script.
few features and improvements
- Fixed a bug in "CD4" in the output VCF file, where alternate concordant reads were grabbed twice when it should have been alternate concordant and then alternate discordant reads.
- Added (limited) tumor-only support.
- Convert VarDict's "Complex" variants into SNVs when appropriate.
- Slightly modified r_scripts/ada_model_builder_ntChange.R script, i.e., the arguments succeeding the input TSV file are features to be ignored in training.
maintenance release
- Updated some docker job scripts.
- Added a script, utilities/reformat_VCF2SEQC2.py, that converts some items in the VCF's INFO field into the sample field, in anticipation of the need to merge multiple VCF files into a single multi-sample VCF.
- No change to somaticseq algorithm.