Lecture material

2023-03-02 lecture 9

Probability model P(x)
Questions to ask when choosing a model:
1. Is the distribution symmetric around the mean?
2. Or is it not symmetric around the mean?
3. How variable is it? Is it concentrated around the mean or not?
With a Poisson model, there is only one parameter (lambda).
The accuracy of the probabilities depends mainly on the function we assume and on the input parameter.
Even a good model will never be perfect. In the Poisson model, a single parameter controls both the mean and the shape of the distribution.
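
As a rough illustration of the single-parameter point, the sketch below evaluates the Poisson probability mass function for a few values of lambda and checks numerically that the mean and the variance both equal lambda. The function names and the chosen lambda values are made up for this example.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when the expected count is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def poisson_mean_and_variance(lam, k_max=100):
    """Approximate the mean and variance by summing over k = 0..k_max."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

# Illustrative lambda values (not from the lecture).
for lam in (0.5, 2.0, 10.0):
    mean, var = poisson_mean_and_variance(lam)
    print(f"lambda={lam}: mean={mean:.3f}, variance={var:.3f}")
```

Because lambda sets both the location and the spread, the model cannot describe data whose variance is very different from its mean.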

"All models are wrong but some models are useful." --- George Box

How useful a model is depends on its assumptions and on the quality of the input data.
We need a probability model for the evolution of sequences along a phylogenetic tree.

  1. We assume the mutation process is the same on every branch of the tree, so we only need to focus on the mutation process between two sequences.
  2. We assume all sites evolve independently, so we can focus on the mutation process one site at a time.
  3. We assume all sites evolve in the same way, so we can choose any site to model the mutation process.
    Even if we observe A - A at a site, that does not mean no mutation occurred along the way. The number of mutations in time t is assumed to follow a Poisson distribution.
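
The A - A remark can be made concrete with a small calculation. The sketch below compares the probability that no mutation happened with the probability that the start and end bases simply look the same. It assumes, purely for illustration, that each mutation replaces the current base with one of the other three bases chosen uniformly; the function names and the expected number of mutations are hypothetical.

```python
import math

def prob_k_mutations(k, expected):
    """Poisson probability of exactly k mutations, given the expected count."""
    return math.exp(-expected) * expected**k / math.factorial(k)

def prob_same_after_k(k):
    """Probability the base equals the starting base after k mutations.

    Assumption (for illustration only): every mutation moves to one of
    the 3 other bases with equal probability.
    """
    return 0.25 + 0.75 * (-1.0 / 3.0) ** k

def prob_observe_same(expected, k_max=60):
    """Sum over all possible mutation counts, including hidden ones."""
    return sum(
        prob_k_mutations(k, expected) * prob_same_after_k(k)
        for k in range(k_max + 1)
    )

expected = 1.0  # hypothetical: one expected mutation over the time span
print("P(no mutation)        =", round(prob_k_mutations(0, expected), 4))
print("P(observe same base)  =", round(prob_observe_same(expected), 4))
```

The second probability is larger than the first because paths such as A to G and back to A also end in A - A.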

Jukes-Cantor (JC69): assumes A, G, C, and T all have the same frequency.
Felsenstein model (F81): more realistic because it does not force A, G, C, and T to have the same frequency; the frequencies are based on the observed sequences.
General Time Reversible (GTR): widely used in recent work because it is flexible.
We will cover model selection in a future class. We want to choose the model that best fits our data. It is not necessary to use a complicated model; the simplest model that fits the data adequately is usually the best choice.
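
One way to see how JC69, F81, and GTR relate is to build their rate matrices from the same recipe. The sketch below uses the common parameterization in which the rate from base i to base j is an exchangeability times the frequency of j. The helper names, the base frequencies, and the exchangeability values are all hypothetical, and the matrices are not rescaled to one expected substitution per unit time.

```python
BASES = "ACGT"

def gtr_rate_matrix(freqs, exch):
    """Build Q with Q[i][j] = exch[{i, j}] * freqs[j] for i != j.

    The diagonal is set so that every row sums to zero.
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, a in enumerate(BASES):
        for j, b in enumerate(BASES):
            if i != j:
                q[i][j] = exch[frozenset((a, b))] * freqs[b]
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 at this point
    return q

def equal_exchangeabilities():
    """All six base pairs exchange at the same relative rate."""
    return {frozenset((a, b)): 1.0 for a in BASES for b in BASES if a != b}

def jc69():
    """JC69: equal frequencies and equal exchangeabilities."""
    return gtr_rate_matrix({b: 0.25 for b in BASES}, equal_exchangeabilities())

def f81(freqs):
    """F81: unequal frequencies, equal exchangeabilities."""
    return gtr_rate_matrix(freqs, equal_exchangeabilities())

# Hypothetical values, chosen only to make the matrices differ.
example_freqs = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}
example_exch = {
    frozenset("AC"): 1.0, frozenset("AG"): 4.0, frozenset("AT"): 0.5,
    frozenset("CG"): 0.8, frozenset("CT"): 3.5, frozenset("GT"): 1.0,
}

for name, q in [("JC69", jc69()),
                ("F81", f81(example_freqs)),
                ("GTR", gtr_rate_matrix(example_freqs, example_exch))]:
    print(name)
    for row in q:
        print("  ", [round(x, 3) for x in row])
```

In this view JC69 has no free parameters in the matrix, F81 adds the base frequencies, and GTR adds the exchangeabilities on top, which is why GTR is the most flexible of the three.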

2023-03-09

Maximum likelihood

Building a phylogenetic inference:
step 1: choose the criterion to use: distances, parsimony, or likelihood.
step 2: search the space of trees until you find the optimum.
With likelihood as the criterion:
1. Choose a substitution model.
2. For a given tree, calculate the likelihood given the data and the substitution model.
3. Search the space of trees using tree moves until you find the maximum likelihood tree.
Calculating the likelihood of a tree depends on the model parameters, i.e. the Q matrix.
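
The likelihood steps can be sketched on the smallest possible case: two aligned sequences joined by a single branch, under JC69. Here the "search" is only a grid search over the branch length, not a search over tree shapes. The sequences, the grid, and the function names are hypothetical, and a real tree with more tips would need a full likelihood algorithm plus tree moves.

```python
import math

def jc69_transition_prob(same, d):
    """JC69 probability of ending in a given base after distance d.

    d is the expected number of substitutions per site.
    """
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e if same else 0.25 - 0.25 * e

def log_likelihood(seq1, seq2, d):
    """Log-likelihood of two aligned sequences separated by distance d.

    Sites are treated as independent, so their log-probabilities add up.
    """
    total = 0.0
    for a, b in zip(seq1, seq2):
        # 0.25 is the JC69 frequency of the base at the first sequence.
        total += math.log(0.25 * jc69_transition_prob(a == b, d))
    return total

def best_distance(seq1, seq2, grid):
    """Return the grid value with the highest log-likelihood."""
    return max(grid, key=lambda d: log_likelihood(seq1, seq2, d))

# Hypothetical toy alignment: 2 differences out of 10 sites.
seq1 = "ACGTACGTAC"
seq2 = "ACGTTCGTAA"
grid = [i / 1000 for i in range(1, 2001)]  # distances 0.001 .. 2.000

d_hat = best_distance(seq1, seq2, grid)
print("ML distance on the grid:", d_hat)
print("log-likelihood at optimum:", round(log_likelihood(seq1, seq2, d_hat), 4))
```

For JC69 the maximum can also be written in closed form from the proportion of differing sites, so the grid result can be checked against it.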