-
If we could also incorporate variance into the smooshed coalescent time estimates, and also uncertainty in the node times, we would have a way of integrating over dating uncertainty and, to some extent, topological uncertainty too (because of the polytomies).
-
I'm definitely sympathetic to trying to improve estimates where there are polytomies. However, I'm not too keen on the idea of complexifying the current coalescence rate API. The idea here is foremost to provide the raw ingredients for coalescence rates via You could imagine much better estimators could come from taking the raw weights and doing smoothing somehow (like fitting a weighted KDE and evaluating the survival function) but these of course will involve modelling decisions. And, this could be done by downstream functions using the output of So my opinion is that the focus of the tskit
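To make the "weighted KDE plus survival function" idea concrete, here is a rough sketch of what such a downstream smoother might look like. It is only an illustration: `times`, `weights` and `bandwidth` are hypothetical inputs (coalescence times, their raw pair weights, and a kernel width), the Gaussian kernel is one modelling choice among many, and nothing here is part of the tskit API. The rate is taken as the hazard, i.e. the smoothed density divided by the smoothed survival function.

```python
import math


def kde_coalescence_rate(times, weights, t, bandwidth):
    """Hazard at time t from a weighted Gaussian KDE of coalescence times.

    times, weights: hypothetical raw ingredients (one weight per time).
    bandwidth: kernel standard deviation; choosing it is a modelling decision.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total = sum(weights)
    pdf = 0.0
    survival = 0.0
    for x, w in zip(times, weights):
        z = (t - x) / bandwidth
        # Gaussian kernel density contribution at t
        pdf += w * math.exp(-0.5 * z * z) / (bandwidth * math.sqrt(2 * math.pi))
        # Gaussian kernel mass lying above t, i.e. 1 - CDF
        survival += w * 0.5 * math.erfc(z / math.sqrt(2))
    if survival == 0.0:
        # No smoothed mass remains above t, so the hazard is unbounded
        return math.inf
    # Normalising by the total weight cancels in the ratio but is kept for clarity
    return (pdf / total) / (survival / total)
```

A Gaussian kernel on raw time can leak mass below time zero, so smoothing on log time may be a better choice in practice; that is again a modelling decision.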
-
@nspope developed the nice routines that calculate pair coalescence rates over time. However, when run on an inferred tree sequence (especially when dated with match_segregating_sites=False), every polytomy in a local tree places all of its coalescences at the time of the oldest node in the polytomy, and hence the rates do not fit expectations from genetic diversity, etc.
On a tree-by-tree basis, I wonder if it would be possible to "smooth" the coalescence rates in the presence of polytomies such that the rate at the polytomy is smooshed towards more recent time, using the coalescent as a model. More specifically, take the shortest edge under a polytomy, and distribute the rate exponentially from the child to the parent time, rather than putting all the weight at the parent.
E.g. in the following tree, instead of putting all the coalescences for node 13 at time 2, the average coalescence rate would be distributed between times 1 and 2 as if there were a 4-tip coalescent process between those times.
If we are only worried about rates, and not which lineages are actually coalescing at any one point, this seems like a reasonable thing to do. I wonder whether the adjustment to the rate calculator is an easy one to make. I'm not sure what the algorithm would be to do this over the entire tree sequence: I can only picture it tree by tree at the moment, so it could be rather slow.
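For a single polytomy, the smoothing described above could be sketched roughly as follows. This is a simplification under stated assumptions, not a worked-out method: each of the k-choose-2 pairs under the polytomy gets a coalescence time drawn from an exponential truncated to the interval between the child and parent times, which is cruder than the full k-tip coalescent conditioned on finishing by the parent time. The function name, the `breaks` time windows and the `rate` parameter (the pairwise coalescence rate in whatever time units the tree uses) are all hypothetical.

```python
import math


def smear_polytomy_pairs(num_children, child_time, parent_time, breaks, rate=1.0):
    """Spread a polytomy's pair coalescences over time windows.

    Instead of putting all C(num_children, 2) pair coalescences at parent_time,
    distribute them over [child_time, parent_time] following a truncated
    exponential with the given (assumed) pairwise rate. Returns one weight per
    window defined by consecutive values in `breaks`.
    """
    if parent_time <= child_time:
        raise ValueError("parent_time must be older than child_time")
    if rate <= 0:
        raise ValueError("rate must be positive")
    num_pairs = math.comb(num_children, 2)
    # Probability that an untruncated exponential falls inside the interval
    norm = -math.expm1(-rate * (parent_time - child_time))
    weights = []
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        # Clip the window to the interval spanned by the polytomy's edges
        a = min(max(lo, child_time), parent_time)
        b = min(max(hi, child_time), parent_time)
        mass = (
            math.exp(-rate * (a - child_time)) - math.exp(-rate * (b - child_time))
        ) / norm
        weights.append(num_pairs * mass)
    return weights


# Node 13 in the example: 4 children, child time 1, parent time 2
weights = smear_polytomy_pairs(4, 1.0, 2.0, breaks=[0.0, 1.0, 1.5, 2.0, 3.0])
```

The weights sum to the 6 pairs under the polytomy, with more of the mass in the window nearest the child time, so the total amount of coalescence is unchanged and only its placement in time moves. Doing this efficiently across a whole tree sequence, rather than tree by tree, remains the open question.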