2023-03-02 lecture 9
Probability model P(x)
Assumptions:
1. Symmetric around the mean?
2. Not symmetric around the mean?
3. How variable? Concentrated around the mean or not?
Using the Poisson model, we only have one parameter (lambda).
The accuracy of the probability depends mainly on the function we assume and on the input parameter.
Even if it is a good model, it will never be perfect: only one parameter controls both the mean and the shape of the distribution.
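A small sketch to check the point above that a single parameter fixes both the center and the spread of a Poisson model. The function names and the truncation point k_max are my own choices for illustration, not from the lecture.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam (lam > 0).

    Computed in log space so large k does not overflow.
    """
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_mean_var(lam, k_max=100):
    """Approximate the mean and variance by summing the pmf up to k_max.

    k_max is an illustrative truncation; the tail beyond it is
    negligible for small lam.
    """
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

if __name__ == "__main__":
    for lam in (0.5, 2.0, 5.0):
        mean, var = poisson_mean_var(lam)
        print(f"lambda={lam}: mean={mean:.4f}, variance={var:.4f}")
```

Both numbers come out equal to lambda, so the model cannot describe data whose variance is very different from its mean.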
"All models are wrong but some models are useful." --- George Box
How useful a model is will depend on the assumptions of the model and the quality of the input data.
We need a probability model for the evolution of sequences along a phylogenetic tree.
- we need to assume the mutation process is the same on every branch of the tree --- so we only focus on the mutation process between two sequences
- we assume all the sites evolve independently --- so we can focus on the mutation process between two sites
- all sites evolve the same way --- so we can choose any site to model the mutation process
Even if we observe A - A, it doesn't mean that no mutation happened: the site can change and later change back (e.g. A -> G -> A).
The number of mutations in time t is assumed to follow a Poisson distribution.
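A rough sketch of how much "hidden" change an A - A observation can hide. It assumes the equal-rate Jukes-Cantor model (JC69) as the substitution process, with d the expected number of substitutions per site. Under that assumption the number of substitution events is Poisson with mean d. The function names are mine.

```python
import math

def prob_no_mutation(d):
    """P(zero substitution events) when the count is Poisson with mean d."""
    return math.exp(-d)

def prob_same_base_jc69(d):
    """P(same base at both ends of the branch) under JC69.

    d is the expected number of substitutions per site (assumed scaling).
    """
    return 0.25 + 0.75 * math.exp(-4.0 * d / 3.0)

def prob_hidden_change(d):
    """P(bases look identical although at least one substitution occurred)."""
    return prob_same_base_jc69(d) - prob_no_mutation(d)

if __name__ == "__main__":
    for d in (0.1, 0.5, 1.0, 2.0):
        print(f"d={d}: same={prob_same_base_jc69(d):.4f}, "
              f"no mutation={prob_no_mutation(d):.4f}, "
              f"hidden={prob_hidden_change(d):.4f}")
```

The gap between "looks the same" and "nothing happened" grows with d, which is why counting observed differences underestimates the true number of mutations.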
Jukes-Cantor (JC69): assumes A, G, C, T have the same frequency.
Felsenstein model (F81): a more realistic model because it does not set A, G, C, T to the same frequency; the frequencies are based on the observed sequences.
General Time Reversible (GTR): widely used in recent work; it is flexible.
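To make the difference between JC69 and F81 concrete, here is a minimal sketch of their rate matrices in plain Python. The base order ACGT, the scaling constant mu, and the helper names are assumptions for illustration. GTR is not coded here; it additionally gives each pair of bases its own exchange rate.

```python
BASES = "ACGT"  # assumed row/column order for the matrices

def f81_rate_matrix(freqs, mu=1.0):
    """F81 rate matrix: the rate from base i to base j is mu * freqs[j].

    Diagonal entries are set so that every row sums to zero.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("base frequencies must sum to 1")
    n = len(freqs)
    q = [[mu * freqs[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])
    return q

def jc69_rate_matrix(mu=1.0):
    """JC69 is the special case of F81 with all frequencies equal to 1/4."""
    return f81_rate_matrix([0.25] * 4, mu)

def base_frequencies(seqs):
    """Observed frequencies of A, C, G, T in the sequences (other symbols ignored)."""
    counts = {b: 0 for b in BASES}
    for s in seqs:
        for ch in s.upper():
            if ch in counts:
                counts[ch] += 1
    total = sum(counts.values())
    return [counts[b] / total for b in BASES]

if __name__ == "__main__":
    freqs = base_frequencies(["AACG", "AATT"])  # toy sequences, not real data
    print("observed frequencies:", freqs)
    for row in f81_rate_matrix(freqs):
        print(["%.3f" % x for x in row])
```

With equal frequencies every off-diagonal entry is the same, which is the JC69 assumption; with observed frequencies, changes toward common bases get higher rates.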
We will cover model selection in a future class. We will choose the model that best fits our data. It is not necessary to use a complicated model: the simplest model that fits the data adequately is the best choice.
Building a phylogenetic inference
Step 1: choose the criterion to use: distances, parsimony, or likelihood.
Step 2: search the space of trees until you find the optimum.
For the likelihood criterion:
1. Choose a substitution model.
2. For a given tree, calculate the likelihood given the data and the substitution model.
3. Search the space of trees using tree moves until you find the maximum likelihood tree.
Calculating the likelihood for this tree depends on parameters: the Q matrix.
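A minimal sketch of the likelihood calculation for the simplest possible "tree": two sequences joined by one branch of length d. It assumes JC69, whose Q matrix gives closed-form transition probabilities. The toy sequences, the grid of branch lengths, and the function names are all illustrative assumptions. A real tree search would repeat this kind of calculation for many trees.

```python
import math

def jc69_transition_prob(x, y, d):
    """P(base y at one end | base x at the other) under JC69.

    d is the expected number of substitutions per site on the branch.
    """
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e

def pairwise_log_likelihood(seq1, seq2, d):
    """Log-likelihood of two aligned sequences separated by branch length d.

    Sites are treated as independent, so per-site log terms are summed.
    The factor 0.25 is the JC69 equilibrium frequency of the first base.
    Requires d > 0 when the sequences differ.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(math.log(0.25 * jc69_transition_prob(x, y, d))
               for x, y in zip(seq1, seq2))

def best_branch_length(seq1, seq2, step=0.001, d_max=2.0):
    """Grid search for the d that maximizes the likelihood (illustrative only)."""
    grid = [i * step for i in range(1, int(d_max / step) + 1)]
    return max(grid, key=lambda d: pairwise_log_likelihood(seq1, seq2, d))

if __name__ == "__main__":
    s1 = "ACGTACGTAC"  # toy data: 1 difference in 10 sites
    s2 = "ACGTACGTAT"
    d_hat = best_branch_length(s1, s2)
    p = sum(a != b for a, b in zip(s1, s2)) / len(s1)
    d_formula = -0.75 * math.log(1.0 - 4.0 * p / 3.0)  # JC69 distance formula
    print(f"grid search: {d_hat:.3f}, closed form: {d_formula:.3f}")
```

The grid search and the JC69 distance formula agree, and both give a d slightly larger than the observed proportion of differences because the model allows for hidden changes.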