Releases: phaverty/GenomicVectors.jl
v1.1.1
Back to Julia General registry
v1.0.0
GenomicVectors v1.0.0
Closed issues:
- should genome info be a named tuple rather than an ordered dict (#17)
- Using symbol for genome name causes some slowdowns (#19)
- @JuliaRegistrator register() (#20)
Merged pull requests:
Fix performance regressions from NamedTuple in GenomeInfo
The way I used NamedTuple in GenomeInfo caused type instability and performance regressions. I have switched to an NTuple of chromosome ends and an OrderedDict mapping chromosome names (symbols) to chromosome numbers. The NTuple N parameter is now part of the GenomeInfo, GenomicPositions and GenomicRanges type definitions, so we are type stable and fast again.
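As a rough illustration of why carrying N in the type helps, here is a minimal sketch using only Base Julia. It is not the package's actual definition: `SketchGenomeInfo`, `sketch_genome_info` and `chr_end` are hypothetical names, and a plain `Dict` stands in for the OrderedDict mentioned above. The point is only that a field declared as `NTuple{N,Int}`, with N as a type parameter of the struct, is concretely typed, so a lookup by chromosome name always returns an `Int`.

```julia
# Hypothetical stand-in for GenomeInfo; names and fields are illustrative only.
struct SketchGenomeInfo{N}
    name::Symbol
    chr_ends::NTuple{N,Int}       # cumulative chromosome ends; N is in the type
    chr_index::Dict{Symbol,Int}   # chromosome name => position in chr_ends
end

# Build the sketch from string names and chromosome lengths.
function sketch_genome_info(name, chrs, lengths)
    ends = Tuple(cumsum(lengths))
    index = Dict(Symbol(c) => i for (i, c) in enumerate(chrs))
    return SketchGenomeInfo{length(ends)}(Symbol(name), ends, index)
end

# Accepts a string or a symbol; the result is always an Int.
chr_end(info::SketchGenomeInfo, chr) = info.chr_ends[info.chr_index[Symbol(chr)]]

info = sketch_genome_info("toy", ["chr1", "chr2", "chrX"], [100, 200, 50])
println(typeof(info))            # the tuple length shows up as a type parameter
println(chr_end(info, "chr2"))
```

In this sketch the string-to-symbol conversion happens at the accessor boundary, so callers can keep passing strings.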
Also, I ran into dependency trouble and could not find a version of DataFrames that works with my other dependencies and Julia 0.7, so the minimum Julia version is now 1.0.
For julia 1.0
First version for Julia 0.7+
To avoid a tricky dependency, I have switched the data structure of GenomeInfo from AxisArray to NamedTuple. The user may see this in a few places, such as the accessors on GenomeInfo (chr_ends, chr_lengths and chr_offsets), which now return a NamedTuple rather than an array. I don't think users would need to work with the GenomeInfo directly, so I don't think this will be much of an issue.
More importantly, the genome name and chromosome names are now symbols. The user-facing functions that work with these still take strings. I hope this is also invisible to the user.
Having the chromosome names in the type of the chr_info part of GenomeInfo gives some opportunities for optimization. Probably it would be necessary to raise these to be part of the type of the GenomeInfo. I'm thinking of the same_genome function in particular.
Make GenomicRanges from BAM file
Adds the ability to make GenomeInfo and GenomicRanges objects from a BAM file.
v0.2.1, more range operations
- Added disjoin, gaps, coverage
- Added resize!
theRe and back again
Adds RCall.jl conversion functions to convert Julia GenomicRanges and GenomeInfos to R/Bioconductor GRanges and Seqinfo objects and vice versa.
Dropping julia 0.6
Version 0.1.1
Some of the "GenoPos Interface" methods (e.g. chromosomes, eachrange, etc.) were defined for AbstractVector rather than AbstractGenomicVector. Thanks to @tkelman for finding this mistake.
Version 0.1.0: Introducing `GenomicTable`
A GenomicTable is a DataTable with one of the concrete AbstractGenomicVector subtypes as a row index. It is similar, in spirit, to Bioconductor's GenomicRanges, but a GenomicTable isa DataTable, while Bioconductor's GenomicRanges is always a vector of ranges that may, or may not, also have an associated DataFrame of metadata. GenomicTable is experimental and is not yet fully-featured.
This version includes more range overlap features, including discrimination between exact and overlap matches using an exact argument to findoverlaps, in, etc.
There are a number of internal changes, including faster versions of functions such as genopos, chrpos and chromosomes, and a reorganization of the range search functions as methods on AbstractGenomicVector.
The names of range-related functions have changed. The each function for iterating over tuples of (genostart, genoend) pairs is now called eachrange, mirroring the change in RLEVectors. Range searching functions are built on the findoverlaps kernel function. Rather than using separate vocabularies for exact and overlap matches, the usual search functions (indexin, findin, in, etc.) gain an exact argument.
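To make the exact versus overlap distinction concrete, here is a small sketch in Base Julia. It does not use the package's findoverlaps; `sketch_overlaps` and `sketch_findoverlaps` are hypothetical helpers, ranges are plain (start, end) tuples assumed to be closed intervals, and the search is a naive all-pairs loop rather than whatever the real kernel does.

```julia
# Hypothetical helpers, not the package's API. Ranges are closed (start, end) tuples.
function sketch_overlaps(a::Tuple{Int,Int}, b::Tuple{Int,Int}; exact::Bool=false)
    if exact
        return a == b                      # same start and same end
    end
    return a[1] <= b[2] && b[1] <= a[2]    # share at least one position
end

# Return (query index, subject index) pairs for every match (naive all-pairs scan).
function sketch_findoverlaps(queries, subjects; exact::Bool=false)
    return [(i, j) for (i, q) in enumerate(queries)
                   for (j, s) in enumerate(subjects)
                   if sketch_overlaps(q, s; exact=exact)]
end

queries  = [(1, 10), (20, 30)]
subjects = [(5, 15), (20, 30), (40, 50)]

println(sketch_findoverlaps(queries, subjects))              # overlap matches
println(sketch_findoverlaps(queries, subjects; exact=true))  # exact matches only
```

With a single kernel that takes the flag, membership-style functions can be thin wrappers that pass exact through instead of needing separate names for each matching mode.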
This version is fully compatible with julia 0.6. (There are a few deprecation warnings about the new "where" syntax that will eventually be fixed by Compat.jl).