2. Format your data
BayesCode requires a tree file and an alignment file to run.
Your alignment file must follow the PHYLIP format.
In addition, the number of bases in each sequence should be a multiple of 3, because the sequences are interpreted as codons.
For example, the following file would be a valid alignment for BayesCode (nodemutsel or mutselomega). The first line gives the number of sequences (8) and the number of bases per sequence (6):
8 6
S0 TCCTGA
S1 AATAGT
S2 GGATTT
S3 AATTCA
S4 CGAAGG
S5 AACGCT
S6 ACGAGT
S7 AATATT
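The constraints above (a header consistent with the content, and a number of bases divisible by 3) can be checked before launching a run. The following is a minimal sketch and not part of BayesCode: `check_phylip` is a hypothetical helper, and it assumes one sequence per line with whitespace separating the name from the sequence.

```python
def check_phylip(text):
    """Return (names, sequences), or raise ValueError if the alignment is malformed."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty alignment")
    # Header: number of sequences, then number of bases per sequence.
    n_taxa, n_sites = (int(x) for x in lines[0].split())
    if n_sites % 3 != 0:
        raise ValueError(f"{n_sites} bases is not a multiple of 3")
    rows = [line.split() for line in lines[1:]]
    if len(rows) != n_taxa:
        raise ValueError(f"header announces {n_taxa} sequences, found {len(rows)}")
    names, seqs = [], []
    for row in rows:
        if len(row) != 2:
            raise ValueError(f"expected 'name sequence', got: {' '.join(row)}")
        name, seq = row
        if len(seq) != n_sites:
            raise ValueError(f"{name} has {len(seq)} bases, expected {n_sites}")
        names.append(name)
        seqs.append(seq)
    return names, seqs


example = """8 6
S0 TCCTGA
S1 AATAGT
S2 GGATTT
S3 AATTCA
S4 CGAAGG
S5 AACGCT
S6 ACGAGT
S7 AATATT
"""
names, seqs = check_phylip(example)
print(len(names), len(seqs[0]))  # 8 6
```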
A Python 3 script to convert FASTA to PHYLIP is available:
python3 fasta_to_ali.py --input ENSG00000000457_SCYL3_NT.fasta --output ENSG00000000457_SCYL3_NT.phy
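To illustrate what such a conversion involves, here is a minimal sketch. It is not the code of `fasta_to_ali.py`, whose internals are not described here; `fasta_to_phylip` is a hypothetical function, and it assumes that the FASTA sequences are already aligned and that the first word of each header line is the sequence name.

```python
def fasta_to_phylip(fasta_text):
    """Convert aligned FASTA text to a PHYLIP string with one sequence per line."""
    records = {}  # name -> list of sequence chunks, in order of appearance
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("empty FASTA header")
            name = header[0]
            if name in records:
                raise ValueError(f"duplicate sequence name: {name}")
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before the first header")
        else:
            records[name].append(line)
    seqs = {n: "".join(chunks) for n, chunks in records.items()}
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_sites = lengths.pop()
    # PHYLIP header: number of sequences, then number of bases per sequence.
    out = [f"{len(seqs)} {n_sites}"]
    out += [f"{n} {s}" for n, s in seqs.items()]
    return "\n".join(out) + "\n"


fasta = ">S0\nTCC\nTGA\n>S1\nAATAGT\n"
phylip = fasta_to_phylip(fasta)
print(phylip, end="")
```

Sequences split over several FASTA lines are joined, and sequences of unequal length are rejected rather than padded.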
Your tree file must follow the Newick format. The tree does not need to have branch lengths. In addition, the leaves of the tree should have the same names as the sequences in your alignment file. For example, the following file would be a valid tree file for BayesCode. Note that its leaves are named S0 to S15, so it goes with an alignment containing sixteen sequences with those names, not with the eight-sequence alignment above:
((((((((S0,S1),(S2,S3)),(S4,S5),(S6,S7))),(S8,S9),(S10,S11)),(S12,S13),(S14,S15))))
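Whether the leaf names of a tree agree with the sequence names of an alignment can be checked with a few lines of code. The sketch below is an assumption-laden illustration, not a BayesCode utility: `newick_leaves` and `compare_names` are hypothetical helpers, the leaf extraction relies on a regular expression rather than a full Newick parser, and quoted names or names containing whitespace are not handled. The small tree and name list are made up for the example.

```python
import re


def newick_leaves(newick):
    """Return leaf names: tokens that directly follow '(' or ','."""
    # Internal node labels follow ')' and branch lengths follow ':',
    # so neither is captured by this pattern.
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def compare_names(newick, alignment_names):
    """Return (names only in the tree, names only in the alignment)."""
    leaves = set(newick_leaves(newick))
    names = set(alignment_names)
    return sorted(leaves - names), sorted(names - leaves)


tree = "((S0,S1),(S2,S3));"
only_in_tree, only_in_alignment = compare_names(tree, ["S0", "S1", "S2", "S4"])
print(only_in_tree)       # ['S3']
print(only_in_alignment)  # ['S4']
```

Two empty lists mean that the tree and the alignment use exactly the same set of names.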
The data folder in the BayesCode root folder contains examples of data files usable with BayesCode.
The whole folder can be downloaded here: github.com/ThibaultLatrille/bayescode/releases/download/v1.1.6/data.zip.