Themisto-v3.1.0 (13 April 2023)
New features
- Themisto now prints estimated input and output rates during pseudoalignment to help estimate how long a run will take and how large the output will be.
- Added a new command line option --report-relevant-kmer-count, which reports for each read the number of k-mers relevant to the pseudoalignment. A k-mer is relevant if it is found in the index and has at least one color associated with it.
- Added a new command line option --relevant-kmers-fraction, which restricts the pseudoalignment algorithm so that it reports pseudoalignments only for reads whose fraction of relevant k-mers is at least a given threshold.
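The relevance rule behind the two new options can be illustrated with a small sketch. This is not Themisto's implementation: the dictionary-based `index`, the function names, and the choice of the total number of k-mers in the read as the denominator of the fraction are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Tuple


def count_relevant_kmers(
    read: str, k: int, index: Dict[str, FrozenSet[int]]
) -> Tuple[int, int]:
    """Return (relevant, total) k-mer counts for a read.

    `index` is a hypothetical stand-in for the real index: it maps a
    k-mer to the set of colors associated with it.
    """
    total = max(len(read) - k + 1, 0)
    relevant = 0
    for i in range(total):
        colors = index.get(read[i:i + k])
        # Relevant = present in the index AND has at least one color.
        if colors:
            relevant += 1
    return relevant, total


def passes_threshold(
    read: str, k: int, index: Dict[str, FrozenSet[int]], threshold: float
) -> bool:
    """True if the fraction of relevant k-mers is at least `threshold`.

    Assumption: the fraction is taken over all k-mers of the read.
    """
    relevant, total = count_relevant_kmers(read, k, index)
    if total == 0:
        return False  # read shorter than k: nothing to report
    return relevant / total >= threshold


# Toy index: "ACG" has one color, "CGT" is present but has no colors.
toy_index = {"ACG": frozenset({0}), "CGT": frozenset()}
counts = count_relevant_kmers("ACGTA", 3, toy_index)
```

In this toy example the read ACGTA has three 3-mers (ACG, CGT, GTA), of which only ACG is relevant, so a threshold of 0.3 would keep the read and a threshold of 0.5 would drop it.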
Performance
- Faster index construction by choosing the last k-mers of GGCAT colored unitigs as the key k-mers.
- Added parallelism for processing GGCAT unitigs.
- Some micro-optimizations in pseudoalignment.
- Fixed a bug that blew up the coloring index size by a factor of up to 64 if there was only one distinct color in the dataset.
Maintenance
- There have been reports of crashes due to unknown instructions in the precompiled binaries (#24 and #25). We now compile the release binaries with native instructions disabled in SBWT and Roaring, which should fix these issues.