high sensitivity mapping by default

@ekg ekg released this 02 Sep 15:12
· 44 commits to main since this release
4521c10

Buildable source tarball: wfmash-v0.21.0.tar.gz

Previously, some default settings that could slightly improve runtime when aligning pangenomes hurt performance in comparative genomics contexts. Updates to mashmap3 and to alignment have made wfmash much more robust to more sensitive defaults.

In this release, we're setting a bunch of defaults which have become standard in testing:

  • Default minimum mapping identity reduced from 90% to 70%.
  • Set maximum mapping length to 50k by default (previously unlimited).
  • Changed block length default from 5x segment length to 3x segment length.
  • Set default chain gap to 30kb (previously 6x segment length, up to 30kb).
  • Reduced default segment length from 5k to 1k.
  • Changed default kmer size from 19 to 15.
  • Modified wflign to run on all fragments except very small ones (less than 1000 bp).
  • Changed filtering logic to use Euclidean distance as an absolute cutoff instead of axis-weighted Euclidean distance, while still ranking based on axis-weighted distance.
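
A minimal sketch of the filtering change described in the last item, under stated assumptions. The function name `pick_next`, the `(dq, dt)` query/target gap pairs, and the weights `w_query` and `w_target` are hypothetical; the actual weighting wfmash uses is not specified here, and this is not wfmash's code. It only illustrates the idea of a plain Euclidean cutoff combined with axis-weighted ranking:

```python
import math

def pick_next(candidates, max_dist, w_query=1.0, w_target=1.0):
    """Pick the best candidate from a list of (dq, dt) gap pairs.

    Hypothetical illustration:
      - a candidate is rejected if its plain Euclidean distance
        exceeds max_dist (the absolute cutoff);
      - surviving candidates are ranked by axis-weighted distance.
    Returns the chosen (dq, dt) pair, or None if nothing passes.
    """
    best = None
    best_score = math.inf
    for dq, dt in candidates:
        # Absolute cutoff: unweighted Euclidean distance.
        if math.hypot(dq, dt) > max_dist:
            continue
        # Ranking: axis-weighted Euclidean distance (weights are assumed).
        score = math.hypot(w_query * dq, w_target * dt)
        if score < best_score:
            best, best_score = (dq, dt), score
    return best
```

With this split, a candidate cannot pass the filter just because the weights shrink its distance along one axis: the cutoff always sees the unweighted distance, and the weights only affect the order of the candidates that remain.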

These changes should make wfmash more sensitive at the edges of its performance envelope, at minimal cost for easy, low-divergence pangenome alignment problems.
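
For reference, the default changes listed in this release can be written down as a small sketch. The class and field names below are hypothetical and are not wfmash's option names. The sketch reads "5k", "1k", "50k" and "30kb" as base pairs and treats "up to 30k" as a cap, both of which are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MappingDefaults:
    # Field names are illustrative only, not wfmash parameters.
    min_identity_pct: int
    segment_length: int                 # bp
    kmer_size: int
    block_length_factor: int            # block length = factor * segment length
    max_mapping_length: Optional[int]   # bp; None means unlimited
    chain_gap: int                      # bp

    @property
    def block_length(self) -> int:
        return self.block_length_factor * self.segment_length

def old_chain_gap(segment_length: int) -> int:
    # Previous rule as read here: 6x segment length, capped at 30 kb.
    return min(6 * segment_length, 30_000)

OLD = MappingDefaults(90, 5_000, 19, 5, None, old_chain_gap(5_000))
NEW = MappingDefaults(70, 1_000, 15, 3, 50_000, 30_000)

print(OLD.block_length, NEW.block_length)  # 25000 3000
```

Under this reading, the old rule applied to a 1k segment length would give a 6kb chain gap, whereas the new default stays at 30kb regardless of segment length.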