CODEML

🔧 `CODEML`

We are still working on a more interactive tutorial to navigate the settings and usage of the PAML program CODEML. In the meantime, you can consult the PAML documentation in PDF format for details on the settings you can enable in the control file to run the program. In addition, you may want to consult various resources and tutorials that provide users with guidelines and practical examples to run CODEML -- we highly recommend you check them out!

Estimating non-synonymous to synonymous rate ratio of protein-coding genes

RESOURCES AND CITATIONS

A Beginners guide to estimating the non-synonymous to synonymous rate ratio of all protein-coding genes in a genome
Jeffares DC, Tomiczek B, Sojo V, dos Reis M (2015). A Beginners Guide to Estimating the Non-synonymous to Synonymous Rate Ratio of all Protein-Coding Genes in a Genome. In: Peacock, C. (eds) Parasite Genomics Protocols. Methods in Molecular Biology, vol 1201. Humana Press, New York, NY.

The protocol paper above covers the theoretical and practical details you need to estimate the value of $\omega$ for all protein-coding genes in a genome. As shown in their Fig. 1, the protocol guides users through a possible workflow of data preparation: gathering sequences, ortholog assignment, alignment, optional post-alignment filtering, and tree construction. Please note that, depending on the type of data you are analysing, you may want to follow another workflow and/or use other programs that have become available since the publication of Jeffares et al. (2015). The protocol then illustrates how to run CODEML to estimate the value of $\omega$ (as well as $d_{N}$ and $d_{S}$) and how to carry out likelihood ratio tests (LRTs) for positive selection. The authors also show how to test for adaptive selection in their supplementary material.
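The LRT step described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the protocol: the function names `lrt_statistic` and `chi2_sf` and the log-likelihood values are hypothetical, and the sketch assumes you have already read the log-likelihoods of two nested models from the CODEML output. The test statistic is $2\Delta\ell = 2(\ell_{1} - \ell_{0})$, which is compared against a $\chi^{2}$ distribution whose degrees of freedom equal the difference in the number of free parameters between the two models.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2 * (lnL_alt - lnL_null), floored at zero.

    A negative difference can only arise from optimisation problems,
    because the alternative model nests the null model.
    """
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square distribution.

    Only df = 1 and df = 2 are supported here because both have simple
    closed forms; other values need a statistics library.
    """
    if x < 0:
        raise ValueError("the test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


# Hypothetical log-likelihoods for a null and an alternative model that
# differ by two free parameters (these are NOT real CODEML results).
lnl_null = -1000.0
lnl_alt = -995.0

statistic = lrt_statistic(lnl_null, lnl_alt)
p_value = chi2_sf(statistic, df=2)
print(f"2*delta_lnL = {statistic:.3f}, p = {p_value:.4f}")
```

Some tests of positive selection use a different null distribution from the plain $\chi^{2}$ shown here, so check the cited protocol for the comparison that applies to your pair of models.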

Detecting positive selection

RESOURCES AND CITATIONS

Beginner's guide on the use of PAML to detect positive selection
Álvarez-Carretero S, Kapli P, Yang Z (2023). Beginner's guide on the use of PAML to detect positive selection, Mol Biol Evol, 40(4):msad041.

Important

Remember to read the supplementary material, where we discuss (i) analyses and checks you should carry out before running tests of positive selection with CODEML, (ii) gene trees versus species trees, and (iii) the usage of rooted and unrooted trees.

If you are looking for step-by-step guidelines that guide you through the usage of CODEML to test for positive selection, this is the protocol you have been looking for! You will specifically learn how to run the following models:

Homogeneous model: all alignment sites and taxa have evolved under the same evolutionary pressure. This model, also known as the M0 model, assumes that $\omega$ is constant across all sites and lineages.
Site models assume that different (amino acid or codon) sites are under different selective pressures and have different $\omega$ values. Positive selection is detected when a subset of sites in the protein-coding gene have $\omega > 1$.
Branch models assume that $\omega$ varies among branches of the phylogeny and positive selection is detected along specific lineages if $\omega$ for the branches is $> 1$.
Branch-site models assume that $\omega$ varies among branches of the phylogeny and across sites of the gene, and positive selection is detected if a subset of sites for specific branches of the phylogeny have $\omega > 1$.
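As a rough orientation, the four model classes above correspond to different values of the `model` and `NSsites` variables in the CODEML control file. The Python sketch below writes a partial control file for each class. It is an illustration under stated assumptions, not the protocol's own scripts: the helper `render_ctl`, the dictionary `MODEL_SETTINGS`, and the file names are hypothetical, and the control file shown is incomplete, so consult the PAML documentation and the cited protocol for the remaining variables. Branch and branch-site analyses additionally require foreground branches to be labelled in the tree file.

```python
# Hypothetical mapping from the four model classes to the codeml control
# variables `model` and `NSsites`. Several NSsites values may be listed so
# that multiple site models are fitted in one run.
MODEL_SETTINGS = {
    "homogeneous": {"model": 0, "NSsites": "0"},          # M0: one omega
    "site":        {"model": 0, "NSsites": "0 1 2 7 8"},  # site models
    "branch":      {"model": 2, "NSsites": "0"},          # omega varies by branch
    "branch-site": {"model": 2, "NSsites": "2"},          # branch-site model A
}


def render_ctl(model_class, seqfile, treefile, outfile):
    """Return the text of a partial codeml control file.

    Only a handful of variables are set; a real analysis needs more
    (see the PAML documentation).
    """
    settings = {
        "seqfile": seqfile,
        "treefile": treefile,
        "outfile": outfile,
        "seqtype": 1,    # 1 = codon sequences
        "fix_omega": 0,  # 0 = estimate omega
        "omega": 0.5,    # starting value for omega
    }
    settings.update(MODEL_SETTINGS[model_class])
    lines = [f"{key:>10} = {value}" for key, value in settings.items()]
    return "\n".join(lines) + "\n"


# File names below are placeholders, not files shipped with PAML.
print(render_ctl("branch-site", "alignment.phy", "tree.nwk", "out_branchsite.txt"))
```

Keeping the class-to-settings mapping in one dictionary makes it easy to generate one control file per model and then compare nested models with LRTs.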

You can navigate the positive-selection GitHub repository to follow a step-by-step tutorial from data collection and filtering to the usage of CODEML to detect positive selection under the four models mentioned above. We suggest you first try to run CODEML with the examples in the GitHub repository while going through the paper, which may help better integrate the workflow of this type of analysis with CODEML.

The software package is provided "as is" without warranty of any kind. In no event shall the author or their employer be held responsible for any damage resulting from the use of this software, including but not limited to the frustration that you may experience in using the package. The program package, including source code, example data sets, executables, and this documentation, is maintained by Ziheng Yang and distributed under the GNU GPL v3.

Ziheng Yang
Department of Genetics, Evolution, and Environment
University College London
Gower Street
WC1E 6BT, London, United Kingdom
