Releases: rhysnewell/Lorikeet

v0.4.0

06 Oct 08:43

Version 0.4.0
A move from 0.3.x to 0.4.0 is not done lightly. Version 0.4.0 marks a major milestone in the development of Lorikeet, and with it come many feature updates: some polish the mechanics of previous releases, and others are brand new features that I hope users will find useful in understanding what Lorikeet is doing.

Major changes:
SNP calling: ✨
- Lorikeet now has an inbuilt SNP calling algorithm that is paired with freebayes to help extract SNPs for each input sample and to help with guided variant calling.

SPEED: 🏃 💨
One of the guiding principles I had in mind when developing Lorikeet was speed. Speed is a partial inspiration behind the name "Lorikeet". Lorikeets are strikingly fast birds that tend to fly in groups, much the same way that Lorikeet "flies" in parallel threads. This update reaches what I think is the optimal balance between speed and memory restrictions.
- You can now specify how many genomes to run in parallel.
- Contigs for each genome now run in parallel.
- Multiple iterators have been optimized to better utilize the capabilities of rayon.
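
The two levels of parallelism listed above (a bounded number of genomes at once, and contigs within each genome) can be sketched roughly as follows. This is an illustration only, not Lorikeet's code: Lorikeet uses rayon in Rust, whereas this sketch uses Python thread pools, and the names `process_contig`, `process_genome`, `run`, `parallel_genomes` and `contig_workers` are all hypothetical. Python threads show the structure here but do not give real CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def process_contig(genome, contig):
    # Hypothetical stand-in for per-contig work: report the contig length.
    return (genome, contig["name"], len(contig["seq"]))


def process_genome(genome, contigs, contig_workers):
    # Contigs belonging to one genome are handled by their own worker pool.
    with ThreadPoolExecutor(max_workers=contig_workers) as pool:
        # pool.map preserves the input order of the contigs.
        return list(pool.map(lambda c: process_contig(genome, c), contigs))


def run(genomes, parallel_genomes=2, contig_workers=2):
    # At most `parallel_genomes` genomes are in flight at any one time.
    with ThreadPoolExecutor(max_workers=parallel_genomes) as pool:
        futures = {
            name: pool.submit(process_genome, name, contigs, contig_workers)
            for name, contigs in genomes.items()
        }
        return {name: fut.result() for name, fut in futures.items()}
```

Capping the outer pool is what limits memory use in a design like this, since only a fixed number of genomes need to be held in memory at the same time.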

Progress: 🔢 👀
No longer will you be bombarded by a ridiculous amount of info messages that won't make much sense to anyone but me. Thanks to indicatif, Lorikeet now has a bunch of fancy progress bars with associated ETA timers which - albeit sometimes inaccurately - provide the user with a better understanding of what is happening under the hood for each sample and each reference in their current run.
Additionally, if a run crashes before completion for whatever reason, Lorikeet will now pick up from specific checkpoints and avoid rerunning entire analyses for specific genomes. This can be overridden with the --force flag.
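
The checkpoint behaviour described above might look something like the sketch below. It assumes a single JSON checkpoint file and a per-genome granularity, neither of which is confirmed by these notes; `run_with_checkpoints`, `analyse` and `checkpoint_path` are hypothetical names, and the `force` argument only mimics the effect of --force.

```python
import json
import os


def run_with_checkpoints(genomes, analyse, checkpoint_path, force=False):
    # Load results from a previous (possibly crashed) run unless forced.
    done = {}
    if not force and os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)

    for genome in genomes:
        if genome in done:
            # Already finished in an earlier run, so skip the work.
            continue
        done[genome] = analyse(genome)
        # Write to a temporary file, then rename, so that a crash during
        # the write cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump(done, fh)
        os.replace(tmp_path, checkpoint_path)
    return done
```

Saving after every genome means a crash costs at most the genome that was in progress.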

Outputs: :suspect: 👽
An additional file is now output for all major modes that helps tell the user how distant a specific reference might be between samples. The adjacency matrix tells the user how many variants are shared between samples for a specific reference. This will provide output similar to the trees that can be generated by taking the consensus genomes generated by polish and passing them to a tool like parsnp.
Speaking of polish, a bug has been fixed which prevented the VCF file from being output for any mode other than genotype.
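
As a rough illustration of the adjacency matrix described above, the sketch below counts shared variants for every pair of samples on one reference. The variant representation (contig, position, ref, alt) and the function name `shared_variant_matrix` are assumptions made for this example; the actual file format Lorikeet writes is not shown here.

```python
def shared_variant_matrix(variants_by_sample):
    # variants_by_sample maps a sample name to a set of variants, where
    # each variant is a hashable tuple such as (contig, pos, ref, alt).
    samples = sorted(variants_by_sample)
    matrix = [
        [len(variants_by_sample[a] & variants_by_sample[b]) for b in samples]
        for a in samples
    ]
    return samples, matrix


variants = {
    "s1": {("c1", 10, "A", "G"), ("c1", 50, "C", "T")},
    "s2": {("c1", 10, "A", "G")},
    "s3": {("c1", 10, "A", "G"), ("c1", 50, "C", "T"), ("c1", 99, "G", "A")},
}
samples, matrix = shared_variant_matrix(variants)
for name, row in zip(samples, matrix):
    print(name, row)
```

The diagonal holds each sample's own variant count, and the matrix is symmetric, so samples that share most of their variants appear "close" to one another.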

Genotyping: 🐀 🐁 🐩 🐕
The genotyping algorithm has seen a bunch of changes. Not all of them will be listed here as there are quite a lot.
- DBSCAN now updates parameters for each reference genome based on whether or not the supplied parameters generate clusters that make sense, i.e. not every variant can cluster by itself, and not all variants can be in the same cluster (usually).
- The read phasing linkage algorithm now happens after DBSCAN. So DBSCAN is seeding the linkage algorithm now. This will provide much the same results as before but at much faster speeds.
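
The idea of adjusting DBSCAN parameters until the clusters "make sense" can be sketched as below. This is a simplified illustration under stated assumptions, not Lorikeet's implementation: it clusters plain 1-D values, tunes only `eps`, and uses a multiplicative `step`, all of which are choices made for this example. The names `dbscan_1d` and `tune_eps` are hypothetical.

```python
def dbscan_1d(values, eps, min_pts):
    """Minimal DBSCAN on 1-D values; returns one label per value (-1 = noise)."""
    n = len(values)
    labels = [None] * n

    def neighbours(i):
        # A point counts as its own neighbour, as in standard DBSCAN.
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point, keep expanding
    return labels


def tune_eps(values, eps, min_pts, step=1.5, max_rounds=20):
    """Rescale eps until the clustering is neither all noise nor one big cluster."""
    for _ in range(max_rounds):
        labels = dbscan_1d(values, eps, min_pts)
        clusters = set(labels) - {-1}
        if not clusters:
            eps *= step  # every point is alone: loosen the radius
        elif len(clusters) == 1 and -1 not in labels:
            eps /= step  # everything merged into one cluster: tighten it
        else:
            break
    return eps, labels
```

For example, `tune_eps([0.10, 0.12, 0.11, 0.50, 0.52, 0.51], eps=0.001, min_pts=2)` starts with every value classed as noise and widens `eps` until two groups appear. The round limit guards against inputs for which no sensible clustering exists.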

In addition, there have been a BUNCH of bug fixes.

v0.3.7

02 Sep 03:38

Multiple bug fixes
- Multiple instances of index out of bound errors
- Identified cause of freebayes failure on large metagenomes

EM algorithm for strain coverage detection implemented and working.
Updated read phasing and clustering to prevent overly similar clusters from forming.

v0.3.6

23 Jul 03:19

New features:
Guided variant calling now working on some MNVs, INS and DEL events
Can now parse directory of genomes for easier use
Various bug fixes

v0.3.5

12 Jul 06:21

NEW RELEASE
Evolve outputs GFF with dNdS values per reference
Uses Prokka and Prodigal
Faster compute times
Updated help commands
Using Phi-D as proportionality metric

BUG FIXES
Fixed a contig ID bug that prevented contigs from being output into strain genotypes

v0.3.4

25 Jun 22:31
  • valid version string in the Cargo.toml (We didn't think we had the technology for this, but we did it)

v0.3.3a

25 Jun 00:57

Small update to variant calling. No longer filter out soft and hard clips using samclip.

v0.3.3

24 Jun 09:27

Updated to using Freebayes for SNP calling and SVIM for structural variant calling.
Added in guided variant calling algorithm to rescue low abundance variants.
Added in seeded fuzzy DBSCAN algorithm.
Updated some help messages, many flags still hidden for testing purposes.

0.3.2

25 May 06:23

Updated Lorikeet to use both short and long read variant callers: Snippy and SVIM
VCF files are now generated for each BAM, reads are used to phase variants between samples

0.2.9

19 Dec 02:43

Added experimental genotype method.
Updated help messages.
Included extra flags:
include-supplementary
include-secondary

v0.2.5

23 Oct 04:28
Pre-release

First release of Lorikeet with current implemented modes:
Polymorph - Variant calling pipeline
Summarize - Summarize contig statistics
Evolve - Calculates dN/dS values of genes present in reference based on read mappings

May contain bugs