Input data

Analysis type

RAxML-NG supports several types of analysis, which can be selected by specifying a corresponding command:

Command	RAxML 8.x equivalent	Meaning
`--search`	`-f d`	Run topology search to find the best-scoring ML tree (default)
`--evaluate`	`-f e`	Optimize model parameters and/or branch lengths on a fixed tree topology
`--loglh`	`N/A`	Compute log-likelihood of a given tree without any optimization.
`--bootstrap`	`-b`	Run non-parametric bootstrap analysis (equivalent to 'slow' bootstrapping in RAxML). The number of bootstrap replicates and other parameters can be changed with the respective options.
`--all`	`-f a`*	Combined tree search and bootstrapping analysis; bootstrap support values will be plotted onto the best-scoring ML tree.
`--support`	`-f b`	Compute bipartition support for a given reference tree (e.g., best ML tree) using an existing set of replicate trees (e.g., bootstrap trees obtained with `--bootstrap` option above). Usage: `raxml-ng --support --tree bestML.tree --bs-trees bootstraps.tree`
`--bsconverge`	`-I`	A posteriori bootstrap convergence test. Usage: `raxml-ng --bsconverge --bs-trees bootstraps.tree --bs-cutoff 0.03`
`--check`	`-f c`	Check alignment file and remove any columns consisting entirely of gaps
`--parse`	`N/A`	Parse alignment, compress patterns and create binary MSA file
`--start`	`-y`	Generate parsimony/random starting trees and exit
`--terrace`	`N/A`	Check whether a tree lies on a phylogenetic terrace. Usage: `raxml-ng --terrace --tree best.tre --msa ali.fa --model partition.txt`

* Unlike `-f a` in RAxML 8.x, which uses rapid bootstrapping, `--all` performs the 'slow' (standard) bootstrapping procedure.
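
The commands in the table are often chained: a tree search, a bootstrap run, and then support mapping. The following is a minimal Python sketch that only assembles the command lines and prints them; nothing is executed. The helper name `support_workflow` is hypothetical, and the file names `bestML.tree` and `bootstraps.tree` are the placeholders used in the usage example above, not the names RAxML-NG gives its output files.

```python
import shlex

def support_workflow(msa, model, best_tree="bestML.tree", bs_trees="bootstraps.tree"):
    """Return three raxml-ng invocations as argv lists (nothing is run).

    best_tree and bs_trees are placeholder file names; point them at the
    files actually produced by the first two steps.
    """
    return [
        # 1. ML tree search
        ["raxml-ng", "--search", "--msa", msa, "--model", model],
        # 2. Non-parametric bootstrap replicates
        ["raxml-ng", "--bootstrap", "--msa", msa, "--model", model],
        # 3. Map bootstrap support onto the reference tree
        ["raxml-ng", "--support", "--tree", best_tree, "--bs-trees", bs_trees],
    ]

for argv in support_workflow("ali.fa", "GTR+G"):
    print(shlex.join(argv))
```

Building argv lists rather than shell strings avoids quoting problems if the lists are later passed to `subprocess.run`.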

Multiple sequence alignment

Option: --msa FILE (mandatory)

RAxML-NG supports alignments in FASTA, non-interleaved PHYLIP and CATG formats.

By default, RAxML-NG tries to detect the alignment format automatically from the file contents. This usually works, but you can also specify the format explicitly with the --msa-format option.

Evolutionary model

Option: --model STRING | FILE (mandatory)

The evolutionary model can be specified globally (i.e., for the whole alignment), or different models can be assigned to different subsets of alignment columns (a so-called partitioned analysis).

Single model

A global, per-alignment evolutionary model can be given as a string on the command line. The model specification always starts with a substitution matrix name, e.g., GTR for DNA data or LG for protein data. Several optional modifiers can be added, separated by + and in arbitrary order. This notation is inspired by, and mostly compatible with, the model specification in the IQ-Tree program (Nguyen et al. 2015).

NOTE: all per-state values (e.g. base frequencies) must be given in the order listed under "State encoding & order" below.

All substitution matrices and modifiers are summarized in the following table:

Modifier	Possible values
Substitution matrix	DNA data: `JC`, `K80`, `F81`, `HKY`, `TN93ef`, `TN93`, `K81`, `K81uf`, `TPM2`, `TPM2uf`, `TPM3`, `TPM3uf`, `TIM1`, `TIM1uf`, `TIM2`, `TIM2uf`, `TIM3`, `TIM3uf`, `TVMef`, `TVM`, `SYM`, `GTR`; Protein data: `Dayhoff`, `LG`, `DCMut`, `JTT`, `mtREV`, `WAG`, `RtREV`, `CpREV`, `VT`, `Blosum62`, `MtMam`, `MtArt`, `MtZoa`, `PMB`, `HIVb`, `HIVw`, `JTT-DCMut`, `FLU`, `StmtREV`, `LG4M` (implies `+G4`), `LG4X` (implies `+R4`), `PROTGTR`; Binary data (0/1): `BIN`; Morphological/multistate:* `MULTIx_MK`, `MULTIx_GTR` (where `x` = number of states, e.g. `MULTI8_MK` for an 8-state model with equal rates; see "State encoding & order"); Unphased diploid genotypes (10 states): `GTJC`, `GTHKY4`, `GTGTR4`, `GTGTR`; Fixed user-defined rates: e.g. `HKY{1.0/2.5}` or `GTR{0.5/2.0/1.0/1.2/0.1/1.0}`
Stationary frequencies	`+F` or `+FC` (empirical); `+FO` (ML estimate); `+FE` (equal); `+FU{f1/f2/../fn}` (user-defined: `f1 f2 ... fn`)
Proportion of invariant sites	`+I` or `+IO` (ML estimate); `+IC` (empirical); `+IU{p}` (user-defined: `p`)
Among-site rate heterogeneity model	`+G` (discrete GAMMA with 4 categories, mean category rates, ML estimate of alpha); `+GA` (as above, but with median category rates); `+Gn` (discrete GAMMA with `n` categories, ML estimate of alpha); `+Gn{a}` (discrete GAMMA with `n` categories and user-defined alpha `a`); `+Rn` (FreeRate with `n` categories, ML estimate of rates and weights); `+Rn{r1/r2/../rn}{w1/w2/../wn}` (FreeRate with `n` categories, user-defined rates `r1 r2 ... rn` and weights `w1 w2 ... wn`)
Ascertainment bias correction	`+ASC_LEWIS` (Lewis' method); `+ASC_FELS{w}` (Felsenstein's method with total number of invariable sites `w`); `+ASC_STAM{w1/w2/../wn}` (Stamatakis' method with per-state invariable site numbers `w1 w2 ... wn`)

* see libpll wiki for details & references
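
To make the notation concrete, here is a minimal Python sketch that splits a model string into its substitution matrix and modifiers and extracts any user-defined values given in curly brackets. It illustrates the syntax described above and is not RAxML-NG's own parser; the function names `split_model` and `user_values` are hypothetical, and the sketch assumes braces are balanced.

```python
import re

def split_model(spec):
    """Split e.g. 'GTR+G4+FU{...}' into (matrix, [modifiers]).

    Only '+' characters outside curly brackets act as separators.
    """
    parts, depth, current = [], 0, ""
    for ch in spec:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "+" and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return parts[0], parts[1:]

def user_values(token):
    """Return the '/'-separated numbers of every {...} group in a token."""
    groups = re.findall(r"\{([^{}]*)\}", token)
    return [[float(v) for v in g.split("/")] for g in groups]

matrix, modifiers = split_model("GTR+G4+FU{0.25/0.25/0.25/0.25}")
print(matrix, modifiers)
print(user_values("R2{0.5/1.5}{0.4/0.6}"))
```

Because modifiers may appear in arbitrary order, a consumer of this output should look modifiers up by prefix (`G`, `R`, `I`, `F`, `ASC_`) rather than by position.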

Multiple models

Multiple models can be defined in a RAxML-style partition file. Example:

JC+G, p1 = 1-100, 252-400
HKY+F, p2 = 101-180, 251
GTR+I, p3 = 181-250

Here, each line defines a partition and consists of three elements:

model specification (see above)
partition name
range of alignment columns
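
As an illustration of this three-part layout, the following Python sketch parses partition lines into a model, a name, and a list of inclusive column ranges. It is a simplified reading of the format shown above, not the parser RAxML-NG uses: the function name `parse_partition_line` is hypothetical, and only plain `start-end` ranges and single columns are handled.

```python
def parse_partition_line(line):
    """Parse 'MODEL, NAME = a-b, c' into (model, name, [(start, end), ...])."""
    model, rest = line.split(",", 1)      # model ends at the first comma
    name, ranges = rest.split("=", 1)     # name ends at the '=' sign
    cols = []
    for item in ranges.split(","):
        start, _, end = item.strip().partition("-")
        cols.append((int(start), int(end or start)))  # single column: start == end
    return model.strip(), name.strip(), cols

example = """JC+G, p1 = 1-100, 252-400
HKY+F, p2 = 101-180, 251
GTR+I, p3 = 181-250"""

partitions = [parse_partition_line(l) for l in example.splitlines()]
# Total number of alignment columns covered by all partitions
total = sum(end - start + 1 for _, _, cols in partitions for start, end in cols)
print(partitions)
print(total)
```

Summing the range lengths is a quick sanity check that the partitions cover the alignment without leaving out columns.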

NOTE: In RAxML, certain model modifiers were global (e.g., the GAMMA model of rate heterogeneity), so they were specified on the command line and not in the partition file. In RAxML-NG, this limitation was lifted, i.e. it is now possible to combine partitions with and without GAMMA, proportion of invariant sites etc. (as in the example above). However, this means that RAxML partition files might need to be adjusted for RAxML-NG (e.g., by adding `+G` for the partitions where the GAMMA model of rate heterogeneity should be used).

Branch length linkage

For a partitioned analysis, three branch length estimation modes are available:

Command	Meaning
`--brlen linked`	Branch lengths are identical for all partitions (default)
`--brlen scaled`	Joint branch length estimation with individual per-partition scalers (i.e., branch lengths are proportional)
`--brlen unlinked`	Branch lengths are estimated independently for each partition (cf. RAxML `-M` option)
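
The `scaled` mode can be pictured as one shared set of branch lengths multiplied by a single factor per partition. The Python sketch below shows only that proportionality; the numbers and the function name `partition_branch_lengths` are made up for illustration, and how RAxML-NG estimates or normalizes the scalers is not covered here.

```python
def partition_branch_lengths(shared, scalers):
    """Per-partition branch lengths under proportional ('scaled') linkage.

    shared  : branch lengths common to all partitions
    scalers : one multiplier per partition
    """
    return [[s * b for b in shared] for s in scalers]

shared = [0.1, 0.2, 0.05]    # hypothetical shared branch lengths
scalers = [1.0, 2.0, 0.5]    # hypothetical per-partition scalers

for i, lengths in enumerate(partition_branch_lengths(shared, scalers), start=1):
    print(f"partition {i}: {lengths}")
```

With all scalers equal to 1 this reduces to the `linked` case, whereas `unlinked` would replace the shared list with an independent list per partition.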

Starting tree(s)

Option: --tree rand{N} | pars{N} | FILE

RAxML-NG supports three types of starting trees:

rand(om): start from a random topology
pars(imony): start from a tree generated by the parsimony-based randomized stepwise addition algorithm
user-defined: load a custom starting tree from a NEWICK file

For random and parsimony starting trees, you can specify the number of trees to generate in curly brackets (e.g., pars{10} or rand{20}). In this case, RAxML-NG will perform multiple tree searches (one per starting tree) and pick the best-scoring topology as the final ML tree. You can also combine parsimony and random starting trees in one run, e.g. --tree pars{10},rand{10}.
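
The `--tree` syntax can be sketched as a small parser that turns a specification into a list of starting-tree requests, which also gives the number of tree searches to expect. This is an illustration of the notation only: the function name `parse_tree_spec` is hypothetical, and treating a bare `rand` or `pars` as a single tree is an assumption of this sketch.

```python
import re

def parse_tree_spec(spec):
    """Parse e.g. 'pars{10},rand{10}' into [('pars', 10), ('rand', 10)].

    Anything that is not rand/pars is treated as a NEWICK file path.
    """
    result = []
    for item in spec.split(","):
        item = item.strip()
        m = re.fullmatch(r"(rand|pars)(?:\{(\d+)\})?", item)
        if m:
            # Assumption: no count in curly brackets means one tree
            result.append((m.group(1), int(m.group(2) or 1)))
        else:
            result.append(("file", item))
    return result

spec = parse_tree_spec("pars{10},rand{10}")
n_searches = sum(n for kind, n in spec if kind != "file")
print(spec, n_searches)
```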

The default number of starting trees depends on the RAxML-NG version and command:

RAxML-NG v0.7.0b

Command	Default starting trees
`--search`	1 random
`--all`	10 random + 10 parsimony

RAxML-NG v0.7.0git >= 13.11.2018

Command	Default starting trees
`--search`	10 random + 10 parsimony
`--search1`	1 random
`--all`	10 random + 10 parsimony

Topological constraint

Option: --constraint-tree FILE

You can specify a constraint tree, e.g. to enforce monophyly of certain groups (equivalent to the -g option in RAxML 8.x). If the constraint tree is comprehensive (i.e., it includes all taxa found in the MSA), then RAxML-NG will simply resolve polytomies in the way that maximizes the likelihood. Conversely, if some taxa are missing from the constraint, they will be placed freely in the resulting ML tree.
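
Whether a constraint tree is comprehensive can be checked before a run by comparing its leaf labels with the taxon names in the alignment. The Python sketch below does this with a deliberately simple regular expression; it assumes unquoted NEWICK labels, and the function names `newick_leaves` and `free_taxa` as well as the taxon names are hypothetical.

```python
import re

def newick_leaves(newick):
    """Return the set of leaf labels in a NEWICK string (unquoted labels only).

    A leaf label is a name that directly follows '(' or ','.
    """
    return set(re.findall(r"[(,]\s*([^(),:;\s]+)", newick))

def free_taxa(msa_taxa, constraint_newick):
    """Taxa present in the MSA but absent from the constraint tree."""
    return set(msa_taxa) - newick_leaves(constraint_newick)

msa_taxa = ["A", "B", "C", "D", "E"]
constraint = "((A,B),(C,D));"

missing = free_taxa(msa_taxa, constraint)
print("comprehensive" if not missing else f"placed freely: {sorted(missing)}")
```

In this example taxon `E` is absent from the constraint, so it would be among the taxa placed freely during the search.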

State encoding & order

Data type	Order
DNA	`A C G T`
PROTEIN	`A R N D C Q E G H I L K M F P S T W Y V`
MULTISTATE	`0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ! " # $ % & ' ( ) * + , / : ; < = > @ [ \ ] ^ _ { | } ~`
GENOTYPE (diploid unphased)	`A C G T M R W S Y K` (Meaning: `A/A C/C G/G T/T A/C A/G A/T C/G C/T G/T`)
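
Since user-defined per-state values must follow the order in this table, it can help to build such modifiers from a keyed mapping rather than by hand. The Python sketch below formats a `+FU{...}` modifier for DNA in `A C G T` order and maps the genotype codes to their allele pairs; the helper name `fu_modifier` and the example frequencies are hypothetical.

```python
DNA_ORDER = ["A", "C", "G", "T"]

# Genotype codes and their meaning, as listed in the table above
GENOTYPE_ORDER = "A C G T M R W S Y K".split()
GENOTYPE_MEANING = dict(zip(GENOTYPE_ORDER,
                            "A/A C/C G/G T/T A/C A/G A/T C/G C/T G/T".split()))

def fu_modifier(freqs, order=DNA_ORDER):
    """Format user-defined frequencies as '+FU{f1/f2/../fn}' in state order."""
    return "+FU{" + "/".join(str(freqs[state]) for state in order) + "}"

# The dictionary order does not matter; the output follows DNA_ORDER
freqs = {"T": 0.3, "A": 0.2, "G": 0.4, "C": 0.1}
print("GTR" + fu_modifier(freqs))
print(GENOTYPE_MEANING["M"])
```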
