Releases: bbuchfink/diamond

DIAMOND v2.1.11

25 Jan 09:37
  • Improved the performance and sensitivity of the cluster, deepclust and linclust workflows.
  • The --faster mode will by default use a minimizer sketch of fixed size per sequence instead of window-based minimizers.
  • Added the option --sketch-size to enable seeding using a minimizer sketch of the given size per sequence.
  • Cascaded clustering and iterated search will by default use the --fast mode with linearization in the second round.
  • The --round-coverage parameter is now also applied to uni-directional coverage clustering.
  • Cluster output files will correctly contain carriage returns on Windows.
  • Fixed generation of the Docker container against the latest version of the NCBI toolkit.
  • Fixed a bug that caused target coordinates not to be reported correctly in the tabular format in frameshift alignment mode.
  • Added the options --ungapped-evalue and --ungapped-evalue-short to set e-value thresholds for the ungapped hit filter.
  • Linearization of search or clustering rounds is limited to seeds of weight >= 10.
  • Fixed an issue that could cause an array size overflow error when using very large .dmnd databases with taxonomic annotation.
  • Fixed a bug that caused query letters to be printed as ARND instead of ACGT in the view workflow.
  • Fixed a bug that caused paired-end input files to fail with an error message.
  • Fixed a bug that could produce clustering errors when clustering at sequence identities >= 50% and processing the database in multiple super blocks.
  • Fixed a bug that could cause a crash in global ranking mode.
  • Accession parsing rules applied to database sequence accessions for the purpose of matching them to accessions in the taxonomy mapping file are now by default also applied to the accessions in the mapping file (disable using --no-parse-seqids).
  • Fixed an issue that could cause increased memory use in the hash join stage.
  • Added support for FASTA headers containing multiple sequence IDs separated by blank spaces (so far only the \1 character was supported as a separator).
  • Fixed an issue that could cause hanging or crashes in the Computing alignments stage.
  • --linsearch can now be used in conjunction with --iterate.
  • Fixed a compiler error for GCC 4.8.5.
  • Fixed a compiler error on Solaris.
  • Fixed compiler errors on systems that do not support the sysinfo function.
  • Fixed a Bus error occurring on Sparc systems.
  • Compilation on Sparc systems can be performed without setting -DX86=OFF.
  • Fixed two issues that could cause increased memory use in the computing alignments stage.
  • Fixed a bug that caused superfluous quote characters in the JSON output format.
  • Linear search modes will by default use full-matrix extension.
  • Fixed an issue that could cause reduced performance in the masking sequences stage.
  • Fixed a bug that could cause a crash when using mutual coverage thresholds in blastx mode.
  • Fixed a bug that could cause a crash when the --include-lineage option was used.
  • When reading protein sequences that unexpectedly only contain DNA letters, an error message is only produced if the first 10 sequences in the input file all exhibit the problem.
  • Fixed a bug that caused setting --top 100 not to function correctly.
  • Fixed a bug that caused target coordinates not to be reported correctly in the output of the realign workflow.
  • Fixed a bug that prevented the --memory-limit/-M option from being used with the realign workflow.
  • Fixed an issue that could cause non-deterministic output in frameshift alignment mode.
  • Fixed a bug that could cause a crash when using the XML output format in the view workflow.
  • Fixed an issue that could cause non-deterministic output for identically-scoring HSPs in the same target.
  • Disabled the default use of increased coverage and identity cutoffs in earlier clustering rounds.
  • Optimized the performance of the extension stage when coverage or approximate identity filters are used.
  • Optimized the performance of the extension stage when not using output fields that require alignment traceback.
  • Fixed an issue that could cause an incorrect order of cascaded clustering rounds.
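
The difference between window-based minimizers and a fixed-size minimizer sketch (the --sketch-size entries above) can be illustrated with a minimal Python sketch. This is not DIAMOND's implementation: the hash function, the k-mer length `k`, the window length `w` and all function names are assumptions chosen for illustration only.

```python
import hashlib


def kmer_hashes(seq, k):
    """Return a list of (hash, position) pairs, one per k-mer of seq."""
    pairs = []
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest()
        pairs.append((int.from_bytes(digest, "big"), i))
    return pairs


def window_minimizers(seq, k, w):
    """Positions of k-mers that have the smallest hash in at least one
    window of w consecutive k-mers (illustrative, not DIAMOND's code)."""
    hashes = kmer_hashes(seq, k)
    picked = set()
    for start in range(len(hashes) - w + 1):
        picked.add(min(hashes[start:start + w])[1])
    return sorted(picked)


def fixed_size_sketch(seq, k, sketch_size):
    """Positions of the sketch_size k-mers with the smallest hashes over
    the whole sequence (illustrative, not DIAMOND's code)."""
    hashes = kmer_hashes(seq, k)
    return sorted(pos for _, pos in sorted(hashes)[:sketch_size])
```

In this toy version the number of window minimizers grows with sequence length (at least one per `w` consecutive k-mers), whereas the sketch never contains more than `sketch_size` seeds per sequence.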

DIAMOND v2.1.10

19 Oct 12:10
  • Fixed a bug that could cause a crash when using a bi-directional coverage cutoff in query-indexed mode.
  • Fixed a bug that caused the --include-lineage option to malfunction for targets with no taxonomic assignment available.

DIAMOND v2.1.9

31 Jan 14:20
  • Corrected the prefix of the query length field for the SAM format.
  • Added the size modifiers 'T', 'M' and 'K' for the --memory-limit/-M option.
  • Added the option --mutual-cover to cluster sequences by mutual coverage percentage of the cluster representative and member sequence.
  • Added the option --symmetric for computing greedy vertex cover with symmetric edges.
  • Fixed an issue that caused the --approx-id option and the approx_pident output field not to work correctly when using the --anchored-swipe option.
  • Added the option --no-reassign to prevent reassignment to closest representative for the greedy vertex cover and clustering workflows.
  • Added the option --connected-component-depth to activate clustering of connected components at a given maximum depth for the greedy vertex cover and the clustering workflows.
  • Fixed a compiler error for Clang v17.
  • Improved search performance when searching with mutual coverage threshold by filtering for sequence length ratio.
  • Added the sensitivity mode --shapes-30x10 with sensitivity approximately equivalent to --mid-sensitive.
  • Added the options --round-coverage and --round-approx-id to set per round cutoffs for cascaded clustering.
  • The CMake switch -DKEEP_TARGET_ID is now obsolete and the corresponding function is always available.
  • Added the option --include-lineage to the taxonomic classification format to include taxonomic lineage in the output.
  • Added native support for the ARM NEON instruction set (contributed by @althonos).
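
Why a sequence length ratio can act as a pre-filter for a mutual coverage threshold can be sketched as follows. This is a simplified illustration, not DIAMOND's code: it ignores gaps, in which case an alignment spans at most as many residues as the shorter sequence has, so the coverage of the longer sequence cannot exceed the ratio of the two lengths. With gapped alignments the check is only a heuristic. Function names and the percentage convention are assumptions.

```python
def mutual_coverage(aligned_a, len_a, aligned_b, len_b):
    """Mutual coverage in percent: the smaller of the two per-sequence
    coverages (aligned residues divided by sequence length)."""
    return min(100.0 * aligned_a / len_a, 100.0 * aligned_b / len_b)


def passes_length_ratio(len_a, len_b, min_cover_pct):
    """Cheap pre-filter (gaps ignored): the shorter sequence must be at
    least min_cover_pct percent as long as the longer one, otherwise the
    longer sequence cannot reach the coverage threshold."""
    shorter, longer = sorted((len_a, len_b))
    if longer == 0:
        return False
    return 100.0 * shorter / longer >= min_cover_pct
```

For example, with a threshold of 80, a pair of lengths 100 and 79 is rejected by `passes_length_ratio` before any alignment is computed, while a pair of lengths 100 and 80 is kept.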

DIAMOND v2.1.8

21 Jun 07:48
Compare
Choose a tag to compare
  • Fixed an issue that could cause reduced performance when running in query-indexed mode.
  • Added support for the JSON output format (option -f json-flat).
  • Added the option --sam-query-len to output query length in SAM format.

DIAMOND v2.1.7

31 May 12:25
  • Fixed a bug that caused taxonomy names not to be loaded correctly for the makedb workflow.
  • Fixed a bug that caused a crash when using the --target-indexed option.
  • Fixed an error when using the --tmpdir option for the makedb workflow.
  • Added a warning message when sequence accessions are shortened due to parsing rules for the makedb workflow.
  • Added the option --no-parse-seqids to disable parsing of sequence accessions.
  • Changed the command line help to print options separated by command.
  • Fixed an issue that prevented the --ignore-warnings option from being used with the makedb workflow.

DIAMOND v2.1.6

18 Mar 11:13
  • Fixed compatibility issues on older systems without support for AVX2.
  • Fixed linker errors when compiled with -DX86=OFF.
  • Fixed a compiler error on macOS systems.
  • Fixed a bug that could cause missing tags in the XML output format and unaligned queries not to be reported correctly.
  • Fixed a bug that caused the PAF output format not to work correctly.

DIAMOND v2.1.5

10 Mar 09:40
  • Disabled the use of frequency based seed masking when using the linear-time search feature with respect to the targets.
  • Fixed a bug that caused a "Database file is not a BLAST database" error message for the prepdb workflow.
  • Fixed a bug that caused a segmentation fault when using BLAST databases.
  • Added line numbers for error messages when reading taxonomy mapping files.
  • Fixed a bug that could cause a crash when using the greedy-vertex-cover workflow without the --out and --centroid-out options.
  • Fixed a bug that caused the greedy-vertex-cover workflow to only produce a trivial clustering.
  • Fixed a bug that caused the last codon of the -2 reading frame to be translated incorrectly.
  • Reduced the memory use of the clustering workflow.
  • Updated the bundled NCBI toolkit to the latest version.
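
For context on the reading-frame fix above, here is a minimal sketch of six-frame translation with the standard genetic code. It is not DIAMOND's implementation. It assumes the common convention that frame -2 means the reverse complement read from its second base, and it handles only uppercase A, C, G and T. All names are illustrative.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate_frame(dna, frame):
    """Translate dna in frame 1, 2, 3, -1, -2 or -3.

    Negative frames read the reverse complement. Only complete codons
    are translated, so up to two trailing bases are dropped.
    """
    strand = dna if frame > 0 else dna.translate(COMPLEMENT)[::-1]
    offset = abs(frame) - 1
    n_codons = (len(strand) - offset) // 3
    return "".join(
        CODON_TABLE[strand[offset + 3 * i: offset + 3 * i + 3]]
        for i in range(n_codons)
    )
```

The number of complete codons depends on both the sequence length and the frame offset, which is where an off-by-one at the end of a frame can arise.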

DIAMOND v2.1.4

27 Feb 13:23
  • Leading spaces are now trimmed and tabulator characters escaped as \t in sequence titles, and a warning message is produced.
  • Blank sequence titles are now replaced by N/A, and a warning message is produced.
  • Fixed a bug that could cause a Traceback error in certain cases.
  • Fixed a bug that caused the qlen and score output fields not to be reported correctly for the realign workflow.
  • Added an error message when using unsupported output fields for the realign workflow.
  • Fixed an issue that could cause a "Missing fields in input line" error when clustering.
  • Optimized the performance of the linclust workflow.
  • Reduced the memory use of the clustering workflow.
  • Fixed a bug that caused using standard input as the query not to work.
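
The sequence title handling described in the first two entries of this release can be sketched in Python as follows. This only mirrors the behaviour as stated in the notes. The function name, the order of the steps, the warning texts and the treatment of a title consisting only of spaces as blank are assumptions, not DIAMOND's actual code.

```python
def sanitize_title(title):
    """Return (cleaned_title, warnings) for a sequence title."""
    warnings = []
    cleaned = title.lstrip(" ")  # trim leading spaces
    if cleaned != title:
        warnings.append("leading spaces trimmed")
    if "\t" in cleaned:
        # Replace each tab character by the two characters backslash and t.
        cleaned = cleaned.replace("\t", "\\t")
        warnings.append("tab characters escaped")
    if cleaned == "":
        cleaned = "N/A"  # placeholder for blank titles
        warnings.append("blank title replaced by N/A")
    return cleaned, warnings
```

Escaping tabs keeps titles from breaking tab-separated output, and the placeholder ensures every record has a non-empty identifier.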

DIAMOND v2.1.3

22 Feb 11:33
  • Fixed compiler errors for GCC 4.8.
  • Fixed a GCC compiler error.
  • Fixed a segfault issue occurring when compiled using GCC 12 on ARM64 systems.
  • Fixed an issue that caused missing support for AVX2.

DIAMOND v2.1.2

20 Feb 10:58
  • The iterated search mode (option --iterate) now uses a linear-time feature as the first search round.
  • Added the linclust command to cluster using only a single linear-time search round.
  • Fixed compiler errors on macOS.
  • Fixed a bug that caused invalid alignment traceback output for the DAA view workflow.
  • Added the merge-daa workflow to merge DAA files.
  • Fixed an error when using the --max-target-seqs/-k option for the DAA view workflow.
  • Removed AVX2 support from the Windows release binary to ensure compatibility with older systems.
  • Permitted the --ignore-warnings option for the cluster and deepclust workflows.
  • Clustering workflows now use unlinked temporary files for database blocks.
  • Fixed a bug that could cause invalid results when using a clustering step with linearization as the final round in combination with database processing in multiple super blocks.
  • The --lin-stage1 option can now be used without compilation using the -DEXTRA=ON cmake option.
  • Added the option to specify the _lin suffix for sensitivity keywords for the --iterate option to activate the linear-time feature.
  • Added the option --linsearch to activate the linear-time feature for the search workflows.
  • Fixed a bug that caused the ppos and positive output fields not to work for the realign workflow.
  • Fixed an issue that caused motif masking not to work when compiled with link time optimization.