Releases: bbuchfink/diamond
DIAMOND v2.1.11
- Improved the performance and sensitivity of the `cluster`, `deepclust` and `linclust` workflows.
- The `--faster` mode will by default use a minimizer sketch of fixed size per sequence instead of window-based minimizers.
- Added the option `--sketch-size` to enable seeding using a minimizer sketch of the given size per sequence.
- Cascaded clustering and iterated search will by default use the `--fast` mode with linearization in the second round.
- The `--round-coverage` parameter is now also applied to uni-directional coverage clustering.
- Cluster output files will correctly contain carriage returns on Windows.
- Fixed generation of the Docker container against the latest version of the NCBI toolkit.
- Fixed a bug that caused target coordinates not to be reported correctly in the tabular format in frameshift alignment mode.
- Added the options `--ungapped-evalue` and `--ungapped-evalue-short` to set e-value thresholds for the ungapped hit filter.
- Linearization of search or clustering rounds is limited to seeds of weight >= 10.
- Fixed an issue that could cause an `array size overflow` error when using very large `.dmnd` databases with taxonomic annotation.
- Fixed a bug that caused query letters to be printed as `ARND` instead of `ACGT` in the `view` workflow.
- Fixed a bug that caused using paired end input files to malfunction with an error message.
- Fixed a bug that could produce clustering errors when clustering at sequence identities >= 50% and processing the database in multiple super blocks.
- Fixed a bug that could cause a crash in global ranking mode.
- Accession parsing rules applied to database sequence accessions for the purpose of matching them to accessions in the taxonomy mapping file are now by default also applied to the accessions in the mapping file (disable using `--no-parse-seqids`).
- Fixed an issue that could cause increased memory use in the hash join stage.
- Added support for FASTA headers containing multiple sequence IDs separated by blank spaces (so far only the `\1` character was supported as a separator).
- Fixed an issue that could cause hanging or crashes in the `Computing alignments` stage.
- `--linsearch` can now be used in conjunction with `--iterate`.
- Fixed a compiler error for GCC 4.8.5.
- Fixed a compiler error on Solaris.
- Fixed compiler errors on systems that do not support the sysinfo function.
- Fixed `Bus error` occurring on Sparc systems.
- Compilation on Sparc systems can be performed without setting `-DX86=OFF`.
- Fixed two issues that could cause increased memory use in the computing alignments stage.
- Fixed a bug that caused superfluous quote characters in the JSON output format.
- Linear search modes will by default use full-matrix extension.
- Fixed an issue that could cause reduced performance in the masking sequences stage.
- Fixed a bug that could cause a crash when using mutual coverage thresholds in blastx mode.
- Fixed a bug that could cause a crash when the `--include-lineage` option was used.
- When reading protein sequences that unexpectedly only contain DNA letters, an error message is only produced if the first 10 sequences in the input file all exhibit the problem.
- Fixed a bug that caused setting `--top 100` not to function correctly.
- Fixed a bug that caused target coordinates not to be reported correctly in the output of the `realign` workflow.
- Fixed a bug that did not permit using the `--memory-limit/-M` option for the `realign` workflow.
- Fixed an issue that could cause non-deterministic output in frameshift alignment mode.
- Fixed a bug that could cause a crash when using the XML output format in the `view` workflow.
- Fixed an issue that could cause non-deterministic output for identically-scoring HSPs in the same target.
- Disabled the default use of increased coverage and identity cutoffs in earlier clustering rounds.
- Optimized the performance of the extension stage when coverage or approximate identity filters are used.
- Optimized the performance of the extension stage when not using output fields that require alignment traceback.
- Fixed an issue that could cause an incorrect order of cascaded clustering rounds.
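The `--faster` and `--sketch-size` entries in this release contrast window-based minimizers with a fixed-size minimizer sketch per sequence. The following is a minimal illustrative sketch of the two general ideas, not DIAMOND's implementation: the k-mer length, the hash function, and all function names are assumptions chosen for the example.

```python
import hashlib

def kmer_hashes(seq, k):
    """Return a list of (hash, position) pairs, one per k-mer in seq."""
    out = []
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest()
        out.append((int.from_bytes(digest, "big"), i))
    return out

def window_minimizers(seq, k, w):
    """Window-based selection: keep the smallest-hash k-mer in every
    window of w consecutive k-mers. The number of selected positions
    grows with sequence length."""
    hashes = kmer_hashes(seq, k)
    picked = set()
    for start in range(len(hashes) - w + 1):
        picked.add(min(hashes[start:start + w])[1])
    return sorted(picked)

def fixed_size_sketch(seq, k, size):
    """Fixed-size sketch: keep the `size` smallest-hash k-mers of the
    whole sequence, so every sequence contributes at most `size` seeds."""
    hashes = kmer_hashes(seq, k)
    return sorted(pos for _, pos in sorted(hashes)[:size])

if __name__ == "__main__":
    seq = "MKVLAAGIVGLLLAQWERTYIPASDFGHKLCVNM"  # hypothetical protein sequence
    print("window minimizers:", window_minimizers(seq, k=5, w=6))
    print("fixed-size sketch:", fixed_size_sketch(seq, k=5, size=4))
```

In this toy version the window-based scheme guarantees at least one seed per window, whereas the fixed-size sketch caps the seed count per sequence regardless of its length.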
DIAMOND v2.1.10
- Fixed a bug that could cause a crash when using a bi-directional coverage cutoff in query-indexed mode.
- Fixed a bug that caused the `--include-lineage` option to malfunction for targets with no taxonomic assignment available.
DIAMOND v2.1.9
- Corrected the prefix of the query length field for the SAM format.
- Added the size modifiers 'T', 'M' and 'K' for the `--memory-limit`/`-M` option.
- Added the option `--mutual-cover` to cluster sequences by mutual coverage percentage of the cluster representative and member sequence.
- Added the option `--symmetric` for computing greedy vertex cover with symmetric edges.
- Fixed an issue that caused the `--approx-id` option and the `approx_pident` output field not to work correctly when using the `--anchored-swipe` option.
- Added the option `--no-reassign` to prevent reassignment to closest representative for the greedy vertex cover and clustering workflows.
- Added the option `--connected-component-depth` to activate clustering of connected components at a given maximum depth for the greedy vertex cover and the clustering workflows.
- Fixed a compiler error for Clang v17.
- Improved search performance when searching with mutual coverage threshold by filtering for sequence length ratio.
- Added the sensitivity mode `--shapes-30x10` with sensitivity approximately equivalent to `--mid-sensitive`.
- Added the options `--round-coverage` and `--round-approx-id` to set per round cutoffs for cascaded clustering.
- The CMake switch `-DKEEP_TARGET_ID` is now obsolete and the corresponding function is always available.
- Added the option `--include-lineage` to the taxonomic classification format to include taxonomic lineage in the output.
- Added native support for the ARM NEON instruction set (contributed by @althonos).
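The mutual coverage criterion and the sequence length ratio filter mentioned in this release can be illustrated with a small example. This is a sketch under stated assumptions, not DIAMOND's code: the function names, the 1-based inclusive coordinates, and the exact form of the prefilter are hypothetical. The prefilter relies on the observation that, if gaps are ignored, the aligned span has the same length on both sequences, so a pair whose shorter-to-longer length ratio is below the threshold cannot reach that coverage on the longer sequence.

```python
def coverage(aln_start, aln_end, seq_len):
    """Percentage of a sequence covered by an alignment
    (1-based, inclusive coordinates; an assumption of this example)."""
    return 100.0 * (aln_end - aln_start + 1) / seq_len

def passes_mutual_cover(q_range, q_len, t_range, t_len, threshold):
    """True if the alignment covers at least `threshold` percent
    of BOTH the query and the target sequence."""
    return (coverage(*q_range, q_len) >= threshold
            and coverage(*t_range, t_len) >= threshold)

def length_ratio_allows(q_len, t_len, threshold):
    """Cheap prefilter evaluated before any alignment is computed.
    Ignoring gaps, the aligned span is at most the shorter length,
    so coverage of the longer sequence is bounded by the length ratio."""
    return 100.0 * min(q_len, t_len) / max(q_len, t_len) >= threshold

if __name__ == "__main__":
    # Hypothetical pair: 100 vs 200 residues, 80% mutual coverage required.
    print(length_ratio_allows(100, 200, 80))                       # pair can be skipped early
    print(passes_mutual_cover((1, 90), 100, (5, 94), 100, 80))     # both sides covered
```

With gapped alignments the bound is only approximate, which is why the example presents it as a heuristic prefilter rather than an exact test.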
DIAMOND v2.1.8
- Fixed an issue that could cause reduced performance when running in query-indexed mode.
- Added support for the JSON output format (option `-f json-flat`).
- Added the option `--sam-query-len` to output query length in SAM format.
DIAMOND v2.1.7
- Fixed a bug that caused taxonomy names not to be loaded correctly for the `makedb` workflow.
- Fixed a bug that caused a crash when using the `--target-indexed` option.
- Fixed an error when using the `--tmpdir` option for the `makedb` workflow.
- Added a warning message when sequence accessions are shortened due to parsing rules for the `makedb` workflow.
- Added the option `--no-parse-seqids` to disable parsing of sequence accessions.
- Changed the command line help to print options separated by command.
- Fixed an issue that prevented the `--ignore-warnings` option from being used for the `makedb` workflow.
DIAMOND v2.1.6
- Fixed compatibility issues on older systems without support for AVX2.
- Fixed linker errors when compiled with `-DX86=OFF`.
- Fixed a compiler error on macOS systems.
- Fixed a bug that could cause missing tags in the XML output format and unaligned queries not to be reported correctly.
- Fixed a bug that caused the PAF output format not to work correctly.
DIAMOND v2.1.5
- Disabled the use of frequency based seed masking when using the linear-time search feature with respect to the targets.
- Fixed a bug that caused a `Database file is not a BLAST database` error message for the `prepdb` workflow.
- Fixed a bug that caused a segmentation fault when using BLAST databases.
- Added line numbers for error messages when reading taxonomy mapping files.
- Fixed a bug that could cause a crash when using the `greedy-vertex-cover` workflow without the `--out` and `--centroid-out` options.
- Fixed a bug that caused the `greedy-vertex-cover` workflow to only produce a trivial clustering.
- Fixed a bug that caused the last codon of the -2 reading frame to be translated incorrectly.
- Reduced the memory use of the clustering workflow.
- Updated the bundled NCBI toolkit to the latest version.
DIAMOND v2.1.4
- Leading spaces are now trimmed and tabulator characters escaped as `\t` in sequence titles, and a warning message is produced.
- Blank sequence titles are now replaced by `N/A`, and a warning message is produced.
- Fixed a bug that could cause a `Traceback error` in certain cases.
- Fixed a bug that caused the `qlen` and `score` output fields not to be reported correctly for the `realign` workflow.
- Added an error message when using unsupported output fields for the `realign` workflow.
- Fixed an issue that could cause a `Missing fields in input line` error when clustering.
- Optimized the performance of the `linclust` workflow.
- Reduced the memory use of the clustering workflow.
- Fixed a bug that caused using standard input as the query not to work.
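The sequence title handling described in this release (trimming leading spaces, escaping tabulator characters, replacing blank titles) can be sketched as follows. This is an illustrative approximation rather than DIAMOND's code: the function name, the order of the steps, the treatment of a space-only title as blank, and the warning text are assumptions.

```python
import warnings

def sanitize_title(title):
    """Normalize a sequence title so it is safe for tab-separated output."""
    cleaned = title.lstrip(" ")               # trim leading spaces
    cleaned = cleaned.replace("\t", "\\t")    # escape tabs as the two characters \t
    if cleaned == "":
        cleaned = "N/A"                       # placeholder for blank titles
    if cleaned != title:
        # Hypothetical warning text; the real message is not given in the notes.
        warnings.warn(f"sequence title was modified: {title!r} -> {cleaned!r}")
    return cleaned

if __name__ == "__main__":
    for raw in ["  sp|P12345\tsome protein", "", "plain_title"]:
        print(repr(sanitize_title(raw)))
```

Escaping tabs matters because the tabular output format uses the tab character as its field separator, so a raw tab inside a title would shift all following columns.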
DIAMOND v2.1.3
- Fixed compiler errors for GCC 4.8.
- Fixed a GCC compiler error.
- Fixed a segfault issue occurring when compiled using GCC 12 on ARM64 systems.
- Fixed an issue that caused missing support for AVX2.
DIAMOND v2.1.2
- The iterated search mode (option `--iterate`) now uses a linear-time feature as the first search round.
- Added the `linclust` command to cluster using only a single linear-time search round.
- Fixed compiler errors on macOS.
- Fixed a bug that caused invalid alignment traceback output for the DAA `view` workflow.
- Added the `merge-daa` workflow to merge DAA files.
- Fixed an error when using the `--max-target-seqs/-k` option for the DAA `view` workflow.
- Removed AVX2 support from the Windows release binary to ensure compatibility with older systems.
- Permitted the `--ignore-warnings` option for the `cluster` and `deepclust` workflows.
- Use unlinked temporary files for database blocks in clustering workflows.
- Fixed a bug that could cause invalid results when using a clustering step with linearization as the final round in combination with database processing in multiple super blocks.
- The `--lin-stage1` option can now be used without compilation using the `-DEXTRA=ON` cmake option.
- Added the option to specify the `_lin` suffix for sensitivity keywords for the `--iterate` option to activate linear-time feature.
- Added the option `--linsearch` to activate linear-time feature for the search workflows.
- Fixed a bug that caused the `ppos` and `positive` output fields not to work for the `realign` workflow.
- Fixed an issue that caused motif masking not to work when compiled with link time optimization.