
Themisto-v3.1.0 (13 April 2023)

@jnalanko jnalanko released this 13 Apr 09:41

New features

  • Themisto now prints estimated input and output rates during pseudoalignment to help estimate how long a run will take and how large the output will be.
  • Added a new command line option --report-relevant-kmer-count, which reports the number of relevant k-mers for each read. A k-mer is relevant if it is found in the index and has at least one color associated with it.
  • Added a new command line option --relevant-kmers-fraction, which restricts the pseudoalignment algorithm so that it reports pseudoalignments only for reads whose fraction of relevant k-mers is at least the given threshold.
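
The relevance rule and the fraction threshold described above can be sketched in a few lines of Python. This is only an illustration, not Themisto's implementation: the toy index (a dict from k-mer to a set of colors), the function names, and the choice of denominator (all k-mers in the read) are assumptions made for the example, and reverse complements are ignored.

```python
def relevant_kmer_count(read, k, index):
    """Count k-mers of `read` that are in `index` and have at least one color.

    `index` is a hypothetical stand-in for the real index: a dict mapping
    each k-mer string to the set of colors associated with it.
    """
    count = 0
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        colors = index.get(kmer)
        if colors:  # found in the index and the color set is non-empty
            count += 1
    return count


def passes_relevant_fraction(read, k, index, threshold):
    """Return True if the fraction of relevant k-mers is at least `threshold`.

    Assumption: the fraction is taken over all k-mers in the read.
    Reads shorter than k have no k-mers and are rejected here.
    """
    total = len(read) - k + 1
    if total <= 0:
        return False
    return relevant_kmer_count(read, k, index) / total >= threshold


# Toy data: "GTA" is in the index but has no colors, so it is not relevant.
toy_index = {"ACG": {0}, "CGT": {0, 1}, "GTA": set()}
toy_read = "ACGTA"  # k-mers for k=3: ACG, CGT, GTA
```

With this toy data, two of the three k-mers in the read are relevant, so the read would pass a threshold of 0.5 but not a threshold of 0.7.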

Performance

  • Faster index construction by choosing the last k-mers of GGCAT colored unitigs as key k-mers.
  • Added parallelism for processing GGCAT unitigs.
  • Some micro-optimizations in pseudoalignment.
  • Fixed a bug that inflated the coloring index size by a factor of up to 64 when the dataset contained only one distinct color.

Maintenance

There have been reports of crashes due to unknown instructions in the precompiled binaries (#24 and #25). We now compile the release binaries with native instructions disabled in SBWT and Roaring, which should fix these issues.