bd.sim
now allows for simulation up to a number of species as well as the original implementation simulating up to a certain time. To choose which realization of a certain number of species we choose, we use the method proposed by Stadler 2011 (seebd.sim
documentation for full reference), whereby we simulate to a much larger number of species, then go back to check which periods had the desired number, and choose one through weighted sampling using the amount of time spent in each as the weights. Finally, we sample a uniform distribution to decide the length of the final period between when the species number was achieve and the end of the simulation.
Added functions to simulate trait evolution and trait-dependent birth death models, in particular State Speciation and Extinction (SSE) models BiSSE, MuSSE, and QuaSSE. See documentation for full reference.
bd.sim.traits
simulates species diversification following a MuSSE model, where traits evolve from an Mk model and change speciation and/or extinction in a discrete fashion. Allows for the simulation of multiple traits, though currently the rates can only depend on one of them (each rate can depend on a different trait, though). Can simulate HiSSE as well, by setting thenHidden
parameter, representing the number of hidden states, to something higher than 1. Can set separate number of states (observed or hidden), initial trait values, and transition matrices for each trait. See?bd.sim.traits
for details. In the future, other trait-dependent diversification models will be implemented, including relaxing the assumption of discrete traits.sample.clade.traits
simulates state-dependent fossil sampling, following a similar algorithm asbd.sim.traits
. It allows for a hidden state model as well, but note that trait information (usually coming frombd.sim.traits
) needs to include all states as observed for that (see examples in?sample.clade.traits
and vignettes for details).draw.sim
can now color longevity segments and fossil occurrences based on the trait values of a given trait for each species through time. Customization options include which colors to use for each state, whether to plot fossils as true occurrence times or ranges (already present in 1.0, but now with a dedicated argument for such,fossilsFormat
), and where to place trait value legend. See?draw.sim
,?bd.sim.traits
,?sample.clade.traits
, and theoverview
vignette for examples.
make.phylo
now allows for an optionalfossils
input representing a fossil record, and will then add these fossils to the tree as sampled ancestors. These SAs are added as 0-length branches if thesaFormat
input is set to"branch"
, or as degree-2 nodes if it is set to"node"
. ThereturnTrueExt
input optionally drops the tip representing the true extinction time of a species, and make the last sampled fossil of that species the fossil tip, if set toFALSE
. Note that this last functionality required a limited version of thedrop.tip
function of theAPE
package to be copied. See?make.phylo
for reference and credits.
- The Per Capita method estimation was changed to have a more accurate estimate.
- A complex example was added to the end to respond to comments from a reviewer in the manuscript.
- Examples of new features for version 1.1 (see above) were added.
bin.occurrences
allows for post-hoc binning of fossil occurrences, so one can take a fossil record including only true times, i.e. an output ofsample.clade(..., returnTrue = TRUE)
, and bin it to produce the uncertainty in fossil ages ubiquitous in the true fossil record.
sample.clade.R
: added a small bit on help page to explain the complication of usingadFun
with extant species.make.phylo.R
: corrected bug in node labels.
This is the first release of paleobuddy
, an R package dedicated to
flexible simulations of diversification, fossil records, and
phylogenetic trees. Below we list current features, and above sections
will be filled as new features and fixes are implemented.
rexp.var
generalizesrexp
to take any function of time as a rate. Also allows for ashape
parameter, in which case it similarly generalizesrweibull
.bd.sim
simulates species diversification with high flexibility in allowed speciation and extinction rate scenarios. Produces asim
object.sample.clade
simulates fossil sampling. Similar flexibility tobd.sim
, though even more so in the case of age-dependent rates.make.phylo
creates a bifurcating phylogenetic tree as aphylo
object (see APE) from asim
object. Can take fossils to be added as length0
branches.draw.sim
plots asim
by drawing species durations and relationships, and optionally adding fossils as time points or ranges.
find.lineages
creates subsets of asim
defined as the clades descended from one or more species present in the simulation. Needed to e.g. generate phylogenetic trees from simulations with more than one starting species.make.rate
creates a purely time-dependent (or constant) rate based on optional inputs. Used internally to allow for users to define rate scenarios easily inbd.sim
.phylo.to.sim
creates asim
object from aphylo
object, provided the user makes choices to solve ambiguities on bifurcating phylogenetic trees.var.rate.div
calculates expected diversity for a given diversification rate and set of times. Useful for testingbd.sim
and planning rate scenarios.
sim
a class returned bybd.sim
and used as an input for many functions in the package. Formally, it is a named list of vectors recording speciation time, extinction time, status (extant or extinct), and parent information for each lineage in the simulation. It contains the following methods: **print
gives some quick details about number of extant and total species, and the first few members of each vector. **head
andtail
return thesim
object containing only a given number of species from the beginning and end of its vectors, respectively. **summary
gives quantitative details, e.g. quantiles of durations and speciation waiting times. **plot
plots lineages through time (LTT) plots for births, deaths, and diversity. **sim.counts
counts numbers of births, deaths, and diversity for some given timet
. **is.sim
checks the object is a validsim
object. Used internally for error checking.
temp
temperature data during the Cenozoic. Modified from RPANDA.co2
CO2 data during the Jurassic. Modified from RPANDA.
overview
gives a reasonably in depth look at the main features of the package, including examples of workflows using most available rate scenarios, and examples of applications.
The question of how to structure time came up a lot during development
of the package. Most of the literature in macroevolution and
paleontology considers absolute geological time, i.e. t = 0
at the
present and t = 5
five million years ago. It becomes challenging,
however, to visualize and program complex rates going backwards. As
such, the code is structured such that all functions are considered to
go from 0
to the maximum simulation time tMax
—i.e. the inverse of
absolute geological time. There is only one exception to this rule, the
adFun
parameter describing age-dependent distribution of fossil
occurrences in sample.clade
. In any case, all returned objects in the
package are set to follow absolute geological time, so as to conform to
the literature.
-
The
integrate
function occasionally fails when given functions that vary suddenly and rashly, usually happening in the case of environmentally-dependent rates. I have tracked this error down to numerical problems inintegrate
, and testing seems to indicate the error does not preventintegrate
from finding the correct result. As such this is not currently something I intend to fix, though if issues are found that indicate this could be apaleobuddy
problem, not anintegrate
problem, that could change. -
paleobuddy
is the first package to implement time-dependent parameters for Weibull-distributed waiting times. Since the authors currently are not aware of an analytical solution to important quantities in the BD process in this case, it is challenging to test exactly. Simulation tests indicate pretty strongly that the algorithm works, however, with one exception—in cases where shape is time-dependent and varies dramatically, especially when close to0
,rexp.var
seems to have a hard time finding the correct waiting time distribution. When maintained within levels generally accepted as sensible throughout the literature—around 0.8 to 3, say—, and even a reasonable amount outside of that range, tests indicate the algorithm functions as it should. Atestthat
routine will be implemented in the future to formalize these claims, and this issue is one I plan to work on soon, especially if users report it as more prevalent than I thought.