Releases: hasindu2008/f5c
f5c-v1.5
f5c-v1.4
Changes from v1.3 are:
- shift and scale added to sam output (sc and sh tags). See here
- methylation compare and plotting scripts added compare_methylation.py and plot_methylation.R
- update slow5lib to latest
- inbuilt rna004 support (added inbuilt rna004 9-mer model invoked via option --pore rna004, autodetected for S/BLOW5)
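As a rough illustration of how the new sc and sh tags could be read back from the SAM output, here is a minimal Python sketch. It relies only on the generic SAM optional-field layout (TAG:TYPE:VALUE after the 11 mandatory columns). The float type code and the example values are assumptions made for illustration and are not taken from f5c output.

```python
def parse_sam_tags(sam_line):
    """Return the optional fields of one SAM record as a {tag: value} dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[11:]:  # the first 11 SAM columns are mandatory
        tag, type_code, value = field.split(":", 2)
        if type_code == "i":
            value = int(value)
        elif type_code == "f":
            value = float(value)
        tags[tag] = value
    return tags


# Synthetic record: the tag names come from the release notes, while the
# "f" type code and the numbers are invented for this example.
record = "\t".join([
    "read1", "0", "chr1", "100", "60", "*", "*", "0", "0", "*", "*",
    "sc:f:1.05", "sh:f:-3.2",
])
tags = parse_sam_tags(record)
print(tags["sc"], tags["sh"])
```

Tags of other types (for example strings) are kept as raw text, so the same helper can be used to inspect any other optional fields in the record.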
f5c-v1.3
Changes/new features from f5c-v1.3-beta are:
- adding a draft 5-mer pore model for upcoming RNA004 chemistry. To use this model, first download the model file:
  wget https://raw.githubusercontent.com/hasindu2008/f5c/v1.3/test/rna004-models/rna004.nucleotide.5mer.model
  Then run eventalign with --kmer-model pointing to the path of the downloaded k-mer model, as follows:
  f5c eventalign --rna -b reads.bam -r reads.fastq -g transciptome.fa -o eventalign.tsv --kmer-model /path/to/rna004.nucleotide.5mer.model # --slow5 reads.blow5 if using S/BLOW5 input
  Thanks @GoekeLab for help with this model generation and testing.
- fixing a bug that sometimes truncated the last line of the SAM output (--sam in eventalign).
Compared to f5c-v1.2, this f5c-v1.3 version adds a new feature (introduced in f5c-v1.3-beta) to f5c eventalign: writing the output in PAF and SAM formats with signal-to-reference alignment information embedded as tags. This output is much more compact than the default TSV output, yet is sufficient for reconstructing the alignment. (Note: as of this version, signal samples corresponding to insertions are still collapsed to the previous base. This will be improved in future versions.) These SAM and PAF formats with signal alignment tags are the heart of the pileup view being implemented in Squigualiser.
- eventalign PAF output which is explained here (-c option)
- default eventalign SAM output changed (--sam option). The new SAM format is explained here. To revert to the old SAM format (same format as when Nanopolish --sam is provided), please provide --sam-out-version 1. The reason for the change in the SAM format is to support the upcoming pileup view in Squigualiser. See the screenshots below.
- -a shorthand option for --sam
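The output-format options above can be summarised with a small Python helper that assembles an eventalign command line. This is only a sketch: the flags are the ones named in these release notes, while the helper name, its parameters and the file names are hypothetical.

```python
import shlex


def eventalign_command(bam, reads, genome, out, fmt="tsv", old_sam=False, slow5=None):
    """Build an argument list for f5c eventalign (hypothetical helper)."""
    cmd = ["f5c", "eventalign", "-b", bam, "-r", reads, "-g", genome, "-o", out]
    if slow5:
        cmd += ["--slow5", slow5]  # S/BLOW5 input
    if fmt == "paf":
        cmd.append("-c")  # PAF output
    elif fmt == "sam":
        cmd.append("--sam")  # new SAM output with signal alignment tags
        if old_sam:
            cmd += ["--sam-out-version", "1"]  # revert to the old SAM format
    elif fmt != "tsv":
        raise ValueError(f"unknown output format: {fmt}")
    return cmd


# File names below are placeholders.
cmd = eventalign_command("reads.bam", "reads.fastq", "ref.fa", "eventalign.sam",
                         fmt="sam", old_sam=True, slow5="reads.blow5")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as-is; keeping it as a list avoids shell quoting problems with unusual file paths.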
f5c-v1.3-beta
This beta version adds a new feature to f5c eventalign to write the output in PAF and SAM formats with signal-to-reference alignment information embedded as tags. This output is much more compact than the default TSV output, yet is sufficient for reconstructing the alignment. These SAM and PAF formats with signal alignment tags are the heart of the pileup view being implemented in Squigualiser.
- eventalign PAF output which is explained here (-c option)
- default eventalign SAM output changed (--sam option). The new SAM format is explained here. To revert to the old SAM format (same format as when Nanopolish --sam is provided), please provide --sam-out-version 1. The reason for the change in the SAM format is to support the upcoming pileup view in Squigualiser. See the screenshots below.
- -a shorthand option for --sam
f5c-v1.2
No changes from v1.2-beta. Changes from v1.1 are as follows:
Major changes
- support for R10.4.1 flowcells (specify the --pore r10 option if using FAST5; autodetected if using S/BLOW5) for the eventalign, call-methylation and resquiggle modules. See below for a benchmark.
Minor improvements
- sprintf_append dynamic (Sasha Jenner) #120
- prints stats on skipped reads due to low mapq, secondary mappings, etc. and warns when most reads are skipped due to lower mapq #122
- warn when more than half the reads fail to align/qc/calibration
- methylation models are no longer loaded unnecessarily for eventalign and resquiggle
- f4c3aaf: minor summary change
- 527ffc3: fix rsq
R10.4.1 f5c methylation calling benchmark
Correlation between whole genome NA12878 f5c CpG methylation calls (inhouse generated R10.4.1 PromethION data) and publicly available NA12878 bisulphite data (no coverage filtering, that is, 1X also included):
Correlation between whole genome NA24385 f5c CpG methylation calls (inhouse generated R10.4.1 PromethION data) and publicly available NA24385 bisulphite data (no coverage filtering, that is, 1X also included):
The log-likelihood ratio for methylated vs unmethylated calls:
How R10.4.1 model training was done is detailed here. Anyone who has access to better training data may follow this tutorial to further improve the accuracy.
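To make the kind of comparison in this benchmark concrete, the following Python sketch correlates per-site methylation frequencies from two sources over the sites they share. It assumes a Pearson correlation and an in-memory dict keyed by (chromosome, position). Both are illustrative choices and not a description of how the benchmark above was produced, and all values are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def correlate_sites(calls_a, calls_b):
    """Correlate methylation frequencies over sites present in both dicts."""
    shared = sorted(set(calls_a) & set(calls_b))
    r = pearson([calls_a[s] for s in shared], [calls_b[s] for s in shared])
    return r, len(shared)


# Invented frequencies keyed by (chromosome, position).
nanopore = {("chr1", 100): 0.9, ("chr1", 200): 0.1, ("chr1", 300): 0.5, ("chr2", 50): 0.7}
bisulphite = {("chr1", 100): 0.8, ("chr1", 200): 0.2, ("chr1", 300): 0.5, ("chr1", 400): 0.3}
r, n_sites = correlate_sites(nanopore, bisulphite)
print(n_sites, round(r, 3))
```

No coverage filter is applied here, matching the "1X also included" setting described above; a filter could be added by dropping sites before calling correlate_sites.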
f5c-v1.2-beta
Major changes from 1.1 are:
- support for R10.4.1 flowcells (specify the --pore r10 option) for the eventalign, call-methylation and resquiggle modules
Minor improvements:
- sprintf_append dynamic (Sasha Jenner) #120
- prints stats on skipped reads due to low mapq, secondary mappings, etc. and warns when most reads are skipped due to lower mapq #122
- warn when more than half the reads fail to align/qc/calibration
- methylation models are no longer loaded unnecessarily for eventalign and resquiggle
- f4c3aaf: minor summary change (Hasindu Gamaarachchi)
- 527ffc3: fix rsq (Hasindu Gamaarachchi)
f5c-v1.1
- the experimental resquiggle module (contributed by @hiruna72) for signal to basecall alignment of DNA and RNA. Output format is documented here. Usage:
  Usage: f5c [OPTIONS] reads.fastq signals.blow5
  options:
     -t INT                     number of processing threads [8]
     -K INT                     batch size (max number of reads loaded at once) [512]
     -B FLOAT[K/M/G]            max number of bases loaded at once [5.0M]
     -h                         help
     -o FILE                    output to file [stdout]
     -x STR                     parameter profile to be used for better performance (always applied before other options)
                                e.g., laptop, desktop, hpc; see https://f5c.page.link/profiles for the full list
     -c                         print in paf format
     --verbose INT              verbosity level [0]
     --version                  print version
     --kmer-model FILE          custom nucleotide k-mer model file (format similar to test/r9-models/r9.4_450bps.nucleotide.6mer.template.model)
     --rna                      the dataset is direct RNA
- f5c can now be built without HDF5 support (./configure --disable-hdf5 && make) for anyone who wants to easily compile and work only with S/BLOW5 files
- new option --skip-slow-idx added to f5c index, which stops f5c from building the .idx for the slow5 file (useful when a slow5 index is already available)
- error messages improved for slow5 index
f5c-v1.0
bump version to v1.0 as the interface has been stable for an adequate time
Changes from v0.9:
- updates to readme and information messages
f5c-v0.9
- minor bug fixes that only affect usability
- update slow5lib to enjoy fast indexing
- added instructions for mac m1 compilation
f5c-v0.8
- update slow5lib to 0.3.0 and introduce support for zstd and svb compressed BLOW5
- fix a minor bug (#94)
- a new option --min-recalib-events that exposes the minimum number of events to recalibrate (still experimental)
- a new option --collapse-events that collapses events that stay on the same reference k-mer (see #95)
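The idea behind collapsing events can be sketched in Python as merging consecutive events that share the same reference position and k-mer. This illustrates the concept only: the record fields are hypothetical, and combining merged events with a length-weighted mean is an assumption, since these notes do not say how f5c computes the collapsed values.

```python
from itertools import groupby


def collapse_events(events):
    """Merge runs of consecutive events that stay on the same reference k-mer.

    Each event is a dict with the hypothetical keys ref_pos, kmer, mean and
    length. Merged events get the summed length and a length-weighted mean
    (an assumed rule, not necessarily what f5c does).
    """
    collapsed = []
    same_kmer = lambda e: (e["ref_pos"], e["kmer"])
    for (ref_pos, kmer), run in groupby(events, key=same_kmer):
        run = list(run)
        total_length = sum(e["length"] for e in run)
        weighted_mean = sum(e["mean"] * e["length"] for e in run) / total_length
        collapsed.append({"ref_pos": ref_pos, "kmer": kmer,
                          "mean": weighted_mean, "length": total_length})
    return collapsed


# Invented events: the first two stay on the same reference k-mer.
events = [
    {"ref_pos": 10, "kmer": "ACGTAC", "mean": 80.0, "length": 2},
    {"ref_pos": 10, "kmer": "ACGTAC", "mean": 90.0, "length": 2},
    {"ref_pos": 11, "kmer": "CGTACG", "mean": 100.0, "length": 5},
]
merged = collapse_events(events)
print(len(merged), merged[0]["mean"], merged[0]["length"])
```

Because groupby only merges adjacent items, events that return to an earlier k-mer after moving away are kept as separate records.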